pos mismatch breaks similarity #55

@aponty

Love the tool! Super helpful. However, it bugs out if you try to run the maxsim disambiguation on a sentence where the wn.synsets pos doesn't match the NLTK-tagged pos.

Try running

from pywsd import disambiguate
from pywsd.similarity import max_similarity as maxsim
sen = 'these potato chips are great'
disambiguate(sen, algorithm=maxsim)

and you get an index out of range error. In max_similarity in similarity.py, result is [] because wn.synsets(ambiguous_word, pos=pos) returns nothing: NLTK has (incorrectly) decided that the part of speech of 'potato' is an adjective, and there's no synset for that.

A very simple fix: change line 114 from

for i in wn.synsets(ambiguous_word, pos=pos):

to

for i in wn.synsets(ambiguous_word, pos=pos) or wn.synsets(ambiguous_word):

to provide a fallback option.
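
For anyone who wants to see the pattern in isolation, here is a minimal sketch of why the empty list blows up and why the `or` fallback avoids it. It does not use the real library: `TOY_SENSES`, `toy_synsets` and `best_sense` are hypothetical stand-ins for wn.synsets and the selection step in max_similarity, and the sense names are made up for illustration.

```python
# Toy stand-in for wn.synsets: maps (word, pos) to candidate sense names.
# The entries are illustrative, not real WordNet data.
TOY_SENSES = {
    ("potato", "n"): ["potato.n.01", "potato.n.02"],
}


def toy_synsets(word, pos=None):
    """Return senses for word, filtered by pos when one is given."""
    if pos is None:
        return [s for (w, _), senses in TOY_SENSES.items() if w == word for s in senses]
    return list(TOY_SENSES.get((word, pos), []))


def best_sense(word, pos, fallback):
    """Pick the first candidate; optionally retry without the POS filter."""
    candidates = toy_synsets(word, pos=pos)
    if fallback:
        # An empty list is falsy, so `or` falls through to the unfiltered lookup.
        candidates = candidates or toy_synsets(word)
    return candidates[0]  # raises IndexError when candidates is empty
```

With `fallback=False`, asking for 'potato' as an adjective ('a') hits the IndexError described above; with `fallback=True` the lookup is retried across all parts of speech. Note that if the word has no senses at all, the fallback still yields an empty list, so a real fix may also want to guard that case.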
