Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. This repository covers three NLP tasks, Named Entity Recognition, Relation Extraction, and Coreference Resolution, each worked through in a Jupyter notebook.
- Definition:
  - Named-entity recognition is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages. (Wikipedia, n.d.)
- Entities recognised:
  - PERSON
  - LOCATION
  - ORGANIZATION
  - etc.
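
As a rough illustration of what "locate and classify" means in practice, the sketch below tags a sentence using a small hand-written lookup table. It is not the approach used in the notebooks: the gazetteer, the `tag_entities` helper, and the example sentence are all hypothetical, and a trained model would handle names it has never seen, which a lookup table cannot.

```python
import re

# Hypothetical gazetteer: surface forms mapped to the entity labels listed above.
GAZETTEER = {
    "Bill": "PERSON",
    "Maybank": "ORGANIZATION",
    "Kuala Lumpur": "LOCATION",
}


def tag_entities(text):
    """Return (surface form, label, start offset) for each gazetteer match."""
    # Try longer names first so multi-word entities win over shorter overlaps.
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return [
        (m.group(1), GAZETTEER[m.group(1)], m.start())
        for m in pattern.finditer(text)
    ]


entities = tag_entities("Bill works for Maybank in Kuala Lumpur.")
for surface, label, start in entities:
    print(f"{start:>3}  {label:<12}  {surface}")
```

Each result carries the character offset as well as the label, because later steps such as relation extraction usually need to know where in the text an entity was found.
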
- Definition:
  - Relationship extraction is the task of extracting semantic relationships from a text. Extracted relationships usually occur between two or more entities of a certain type (e.g. Person, Organisation, Location) and fall into a number of semantic categories (e.g. married to, employed by, lives in).
- Relations recognised:
  - per_employee_of
  - per_member_of
  - per_origin
  - etc.
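
To make the idea of a typed relation between two entities concrete, here is a minimal pattern-based sketch that emits `(head, relation, tail)` triples. It is only an illustration under stated assumptions: the trigger phrases are guesses at what each label means, the `extract_relations` helper and the example sentences are hypothetical, and the notebooks may use a quite different, model-based method.

```python
import re

# Hypothetical trigger phrases for the relation labels listed above.
# Capitalised single words stand in for entities; a real pipeline would
# take entity spans from the NER step instead.
PATTERNS = {
    "per_employee_of": r"(?P<head>[A-Z]\w+) (?:works for|is employed by) (?P<tail>[A-Z]\w+)",
    "per_member_of": r"(?P<head>[A-Z]\w+) is a member of (?P<tail>[A-Z]\w+)",
    "per_origin": r"(?P<head>[A-Z]\w+) (?:comes from|was born in) (?P<tail>[A-Z]\w+)",
}


def extract_relations(text):
    """Return (head, relation, tail) triples for every pattern match."""
    triples = []
    for label, pattern in PATTERNS.items():
        for m in re.finditer(pattern, text):
            triples.append((m.group("head"), label, m.group("tail")))
    return triples


triples = extract_relations("Bill works for Maybank. Bill comes from Malaysia.")
for head, label, tail in triples:
    print(f"{head} --{label}--> {tail}")
```

Hand-written patterns like these are precise but brittle, since any phrasing not anticipated in `PATTERNS` is silently missed.
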
- Definition:
  - In linguistics, coreference, sometimes written co-reference, occurs when two or more expressions in a text refer to the same person or thing; they have the same referent. For example, in "Bill said he would come", the proper noun "Bill" and the pronoun "he" refer to the same person, namely Bill. (Wikipedia, n.d.)
- Examples:
  - Cluster 1 = [Maybank, the bank, it]
  - Cluster 2 = [Bill, He, his, him]
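
The clusters above can be read as a simple data structure: each cluster is a list of mentions that share one referent. The sketch below builds a lookup from any mention to a representative mention of its cluster. It assumes, purely for illustration, that the first mention in each list is the representative and that mentions are matched by their text; the `build_lookup` helper is hypothetical. A real resolver tracks mentions by their position in the text, since a word such as "it" can belong to different clusters in different sentences.

```python
# The two example clusters from the list above.
clusters = [
    ["Maybank", "the bank", "it"],
    ["Bill", "He", "his", "him"],
]


def build_lookup(clusters):
    """Map each mention (lower-cased) to the first mention of its cluster."""
    lookup = {}
    for cluster in clusters:
        representative = cluster[0]  # assumption: first mention names the referent
        for mention in cluster:
            lookup[mention.lower()] = representative
    return lookup


lookup = build_lookup(clusters)
for mention in ["he", "the bank", "him"]:
    print(f"{mention!r} refers to {lookup[mention]!r}")
```
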