Some NER tools do not mark multi-word NEs #1234

@reckart

Description

Some NER tools, such as the CoreNlpNamedEntityRecognizer, mark every token individually as an NE instead of creating a single multi-token NE.

[Screenshot: 2018-05-23_11-43-44]

IMHO the default behavior should be that adjacent NEs with the same label are joined, unless the model uses a BIO-like encoding, in which case the BIO markers should be respected.
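
To make the proposed behavior concrete, here is a minimal sketch in Python. It is language-neutral pseudologic rather than code from any of the affected components: `merge_entities`, the `Span` tuple, and the `"O"` outside label are names assumed for illustration only. It assumes the recognizer emits one label per token, and it treats `B-`/`I-` prefixes as BIO markers when they are present.

```python
from typing import List, Optional, Tuple

# (start token index, end token index exclusive, label) -- hypothetical shape
Span = Tuple[int, int, str]


def merge_entities(labels: List[str], outside: str = "O") -> List[Span]:
    """Group per-token labels into multi-token entity spans.

    Adjacent tokens with the same label are joined. If a label carries a
    BIO prefix, a "B-" always starts a new span, even when the previous
    token has the same entity type.
    """
    spans: List[Span] = []
    start: Optional[int] = None    # start index of the open span, if any
    current: Optional[str] = None  # label of the open span, if any

    def close(end: int) -> None:
        nonlocal start, current
        if start is not None:
            spans.append((start, end, current))
        start, current = None, None

    for i, raw in enumerate(labels):
        if raw == outside:
            close(i)
            continue
        prefix, sep, rest = raw.partition("-")
        if sep and prefix in ("B", "I"):
            # BIO-like encoding: respect the marker
            label, begins = rest, prefix == "B"
        else:
            # Plain label: no explicit boundary information
            label, begins = raw, False
        if begins or label != current:
            close(i)
            start, current = i, label
    close(len(labels))
    return spans


# Tokens: John Smith lives in New York .
plain = ["PERSON", "PERSON", "O", "O", "LOCATION", "LOCATION", "O"]
print(merge_entities(plain))

# With BIO markers, two directly adjacent persons stay separate.
bio = ["B-PER", "I-PER", "B-PER", "O"]
print(merge_entities(bio))
```

Note the trade-off this sketch makes visible: without BIO markers, two distinct entities of the same type that happen to be directly adjacent are merged into one, which is why the markers should take precedence whenever the model provides them.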

Also, the unit tests for the NER tools should be changed to include a multi-word NE, e.g. by changing John in the current unit tests to John Smith.

  • CoGrOO Named Entity Recognizer
  • CoreNLP Named Entity Recognizer (old API)
  • CoreNLP Named Entity Recognizer
  • Illinois CCG Named Entity Recognizer
  • LingPipe Named Entity Recognizer
  • NLP4J Named Entity Recognizer
  • OpenNLP Named Entity Recognizer
