Description
There is a small set of words that currently cannot be scored in isolation, because they also appear too often as words in other languages.
For example, the quotative particle "to" clashes with both "to" and "too" in English, which means I can't even include it in the dictionary.
It would be more apt if the score of these words depended on the words near them. For example, in "i talk to him", the neighbors of "to" are "talk", which matches only alphabetically, and "him", which does not match at all; this would be grounds to score "to" at zero.
However, there are some complications: "to" in particular benefits less from this, because it often comes at the end of the sentence and so has no following neighbor. One fix would be to pull the three nearest neighbors, if there are any, and treat any missing neighbors as zeroes.
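A rough sketch of that idea in Python, not tied to the project's actual code. The names `nearest_neighbors`, `context_score`, and `base_score` are hypothetical, and so are two choices the description leaves open: that "nearest" means smallest distance on either side, and that the neighbor scores are combined by averaging.

```python
def nearest_neighbors(tokens, index, count=3):
    """Return up to `count` tokens closest to tokens[index], nearest first.

    Assumption: "nearest" means smallest distance on either side,
    checking the left neighbor before the right one at each distance.
    """
    found = []
    distance = 1
    # Keep widening the window while at least one side is still in range.
    while len(found) < count and (
        index - distance >= 0 or index + distance < len(tokens)
    ):
        for pos in (index - distance, index + distance):
            if 0 <= pos < len(tokens) and len(found) < count:
                found.append(tokens[pos])
        distance += 1
    return found


def context_score(tokens, index, base_score, count=3):
    """Score an ambiguous word from its neighbors' scores.

    `base_score` is a hypothetical callable mapping a word to its
    context-free score. Missing neighbors are counted as zeroes, and
    averaging is an assumption made for this sketch.
    """
    scores = [base_score(t) for t in nearest_neighbors(tokens, index, count)]
    scores += [0.0] * (count - len(scores))  # pad missing neighbors with zeroes
    return sum(scores) / count
```

With a sentence-final "to", only preceding words are available, so the window fills from the left and any remaining slots are padded with zeroes. Whether the result should then gate the word to zero below some threshold, as in the "i talk to him" example, is left open here.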