Hi,
I am working on a natural language processing problem involving information retrieval from a number of similar PDF documents. After some research, I have decided to use UIMA for all the NLP work.
The problem is as follows:
I have a set of high court judgements in the following text format:
"
IN THE HIGH COURT OF JUDICATURE AT THELMES
EXTRA ORIGINAL CIVIL JURISDICTION
MAP PETITION NO. 5643 OF 2009
Dr. Jose Costellas
... Petitioner.
V/s.
Union of Thelmes and Ors.
... Respondents.
WITH
MAP PETITION (L) NO. 5628 OF 2009
Dr. Dee Dee Smith
... Petitioner.
V/s.
Union of Surrey.
... Respondent.
WITH
MAP PETITION (L) NO. 5421 OF 2010
St. Williams Education Society & Association
... Petitioners.
V/s.
All Thelmes Council For Education.
... Respondents.
Mr. V.M. Smith A/d. Ms. Perry V. Tine for the Petitioners.
Mr. Dennis Trudy a/d. Gina Mason A/d. V.P. Gill i/b. Scholm Mason LLP for Respondent 1.
Mr. A.B.. Borris for Respondent 2.
Mr. E.P. Mccotter, Senior Associate A/d. Nancy Parizek i/b. Kay, Windsor & Cohen LLP for Respondent 3.
CORAM : Hon. D.Y. Meier &
A.A. Copola
27 DECEMBER 2011.
"
I need the following to be extracted from the above:
High court: (Thelmes)
Jurisdiction : (Extra original civil)
Petitioner(s): (Dr. Jose Costellas, Dr. Dee Dee Smith, St. Williams Education Society & Association)
Respondent(s): (Union of Thelmes and Ors., Union of Surrey., All Thelmes Council For Education.)
Attorneys for petitioner: (Mr. V.M. Smith A/d. Ms. Perry V. Tine)
Attorneys for Respondents: (Dennis Trudy, Gina Mason, V.P. Gill; A.B.. Borris; E.P. Mccotter, Nancy Parizek)
Law firms involved: Scholm Mason LLP, Kay, Windsor & Cohen LLP
Judges: D.Y. Meier, A.A. Copola
Judgement date: 27 DECEMBER 2011
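To make the petitioner/respondent part of the expected output concrete, here is a minimal sketch in plain Java, without regex. It assumes each party name sits on the single line directly above its "... Petitioner." or "... Respondent." marker, as in the sample; names that wrap over several lines would need more work. The class name PartyExtractor and the map keys are hypothetical, and the same logic could presumably live inside a UIMA annotator's process method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: collects party names by looking at the line that
// precedes each "... Petitioner" / "... Respondent" marker line.
final class PartyExtractor {

    static Map<String, List<String>> extract(List<String> lines) {
        Map<String, List<String>> parties = new LinkedHashMap<>();
        parties.put("petitioners", new ArrayList<>());
        parties.put("respondents", new ArrayList<>());

        String previous = ""; // last non-empty line seen so far
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("...")) {
                // Anchor line: the text after the dots names the role.
                String role = line.substring(3).trim().toLowerCase(Locale.ROOT);
                if (role.startsWith("petitioner") && !previous.isEmpty()) {
                    parties.get("petitioners").add(previous);
                } else if (role.startsWith("respondent") && !previous.isEmpty()) {
                    parties.get("respondents").add(previous);
                }
            }
            previous = line;
        }
        return parties;
    }
}
```

Calling PartyExtractor.extract on the lines of the sample header should give three petitioners and three respondents, in document order.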
The number of cases lumped into one judgement can vary greatly. Also, the date format may differ slightly from document to document.
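One regex-free way to cope with varying date formats is to try a short list of java.time patterns in turn. The sketch below is only an illustration: the pattern list is my assumption about what the documents might contain (only "27 DECEMBER 2011." appears in the sample), and the class name JudgementDateParser is hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: tries several candidate date layouts on one line.
final class JudgementDateParser {

    // Assumed layouts; extend once more documents have been inspected.
    private static final String[] PATTERNS = {
        "d MMMM uuuu", "d MMM uuuu", "MMMM d, uuuu",
        "d/M/uuuu", "d-M-uuuu", "d.M.uuuu"
    };

    private static final List<DateTimeFormatter> FORMATTERS = buildFormatters();

    private static List<DateTimeFormatter> buildFormatters() {
        List<DateTimeFormatter> formatters = new ArrayList<>();
        for (String pattern : PATTERNS) {
            formatters.add(new DateTimeFormatterBuilder()
                .parseCaseInsensitive() // accepts "DECEMBER" as well as "December"
                .appendPattern(pattern)
                .toFormatter(Locale.ENGLISH));
        }
        return formatters;
    }

    // Returns the parsed date, or empty if the line matches no known layout.
    static Optional<LocalDate> parse(String line) {
        String text = line.trim();
        while (text.endsWith(".")) { // drop trailing full stops
            text = text.substring(0, text.length() - 1).trim();
        }
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return Optional.of(LocalDate.parse(text, formatter));
            } catch (DateTimeParseException ignored) {
                // Not this layout; try the next one.
            }
        }
        return Optional.empty();
    }
}
```

Lines that are not dates (for example the CORAM line) simply yield an empty Optional, so the parser can be run over every line near the end of the header.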
I need the deliverable, with complete source code, as a UIMA annotator. NO REGEX, except perhaps a little to locate certain anchor words.
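To show what locating anchor words without regex could look like, here is a small sketch that splits one counsel line using plain indexOf searches. It assumes, based on the expected output above, that "A/d." (in any letter case) separates co-counsel, "i/b." introduces the law firm, and the last " for " introduces the side represented. The class CounselLineParser and its fields are hypothetical names; titles such as "Mr." and designations such as "Senior Associate" are left in place and would need a further cleanup step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: splits a counsel line on the anchors "a/d.", "i/b." and "for".
final class CounselLineParser {

    static final class Counsel {
        final List<String> attorneys = new ArrayList<>();
        String firm = "";       // stays empty when the line has no "i/b." anchor
        String represents = ""; // e.g. "Respondent 1" or "Petitioners"
    }

    static Counsel parse(String line) {
        Counsel counsel = new Counsel();
        String text = line.trim();

        int forAt = text.lastIndexOf(" for ");
        if (forAt < 0) {
            return counsel; // no "for" anchor: not treated as a counsel line
        }
        String side = text.substring(forAt + 5).trim();
        if (side.startsWith("the ")) {
            side = side.substring(4);
        }
        if (side.endsWith(".")) {
            side = side.substring(0, side.length() - 1);
        }
        counsel.represents = side;

        String names = text.substring(0, forAt);
        int firmAt = names.toLowerCase(Locale.ROOT).indexOf(" i/b. ");
        if (firmAt >= 0) {
            counsel.firm = names.substring(firmAt + 6).trim();
            names = names.substring(0, firmAt);
        }

        // Walk through the " a/d. " separators, ignoring letter case.
        String lower = names.toLowerCase(Locale.ROOT);
        int start = 0;
        while (true) {
            int at = lower.indexOf(" a/d. ", start);
            String name = (at < 0 ? names.substring(start) : names.substring(start, at)).trim();
            if (!name.isEmpty()) {
                counsel.attorneys.add(name);
            }
            if (at < 0) {
                break;
            }
            start = at + 6;
        }
        return counsel;
    }
}
```

For the sample line ending in "for Respondent 1." this should yield three attorneys, the firm "Scholm Mason LLP", and "Respondent 1" as the side represented.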
I should be able to use the UIMA annotator as part of an aggregate analysis engine.
I am a Java programmer with knowledge of artificial intelligence and algorithm design. I have done many proofs of concept (POCs) in my earlier work. I think I can perform this task with a high degree of proficiency.