This is a fun project to explore how NLP and ML can be used to play popular word-based games such as NYT Connections:
https://www.nytimes.com/games/connections
Games are specified in a JSON file where the keys are the words/phrases and each value is an integer identifying the group that word/phrase belongs to.
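As an illustration of that layout, here is a minimal sketch that parses a tiny game and inverts it into groups. The words, the 0-based group ids, and the `load_groups` helper are made up for this example and are not taken from nyt_connections.py.

```python
import json
from collections import defaultdict

# Hypothetical game file contents: word/phrase -> integer group id.
example_game = '{"apple": 0, "pear": 0, "red": 1, "blue": 1}'


def load_groups(text):
    """Parse a game and invert it into group id -> sorted list of words."""
    game = json.loads(text)
    groups = defaultdict(list)
    for word, group_id in game.items():
        groups[group_id].append(word)
    return {group_id: sorted(words) for group_id, words in groups.items()}


groups = load_groups(example_game)
print(groups)
```

A real NYT Connections game would have 16 keys spread across 4 group ids.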
A text embedding is created for each word using a language model. Then every possible guess is enumerated (using itertools.combinations). For each possible guess, a "spread" is calculated by taking the mean of its embedding vectors and summing the cosine similarity of each vector to that mean. All possible guesses are then sorted from lowest to highest spread. The idea is to first guess the groups of words that are most tightly clustered in meaning, based on their text embeddings.
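The ranking step can be sketched roughly as follows. This is not the code from nyt_connections.py: the 2-D vectors are toy stand-ins for real language-model embeddings, the function names are hypothetical, and the sketch assumes the spread sums cosine distance (1 minus cosine similarity) so that a low spread corresponds to a tight cluster.

```python
import math
from itertools import combinations

# Toy 2-D vectors standing in for real embeddings (made-up values).
embeddings = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "car": [0.1, 1.0],
    "bus": [0.2, 0.9],
}


def cosine_distance(a, b):
    """1 - cosine similarity; an assumption, chosen so that low means close."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def spread(words):
    """Sum of each word vector's cosine distance to the mean vector of the guess."""
    vectors = [embeddings[w] for w in words]
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    return sum(cosine_distance(v, mean) for v in vectors)


group_size = 2  # 4 in a real NYT Connections game
# Enumerate every possible guess and rank from tightest to loosest.
ranked = sorted(combinations(sorted(embeddings), group_size), key=spread)
for guess in ranked:
    print(guess, round(spread(guess), 4))
```

With these toy vectors the two animal words and the two vehicle words form the tightest pairs, so they are ranked ahead of the mixed pairs.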
If a guess is one word away from being correct, call it a "one-away". Subsequent guesses are prioritized if they closely match a one-away, meaning all words but one are the same.
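One way to express that prioritization is sketched below. The helper names and the single-letter words are hypothetical, and the actual script may weigh one-aways differently; here candidates that match a one-away simply move to the front while keeping their existing (spread-based) order.

```python
def matches_one_away(guess, one_aways):
    """True if the guess shares all but one word with any recorded one-away."""
    guess_words = set(guess)
    return any(
        len(guess_words & set(previous)) == len(guess_words) - 1
        for previous in one_aways
    )


def prioritize(candidates, one_aways):
    """Move one-away matches to the front; sorted() is stable, so the
    original order is preserved within each tier."""
    return sorted(candidates, key=lambda g: not matches_one_away(g, one_aways))


# Candidates already ordered by spread (placeholder words).
candidates = [("a", "b", "c", "d"), ("e", "f", "g", "h"), ("a", "b", "c", "x")]
one_aways = [("a", "b", "c", "z")]  # an earlier guess that was one word off

ordered = prioritize(candidates, one_aways)
print(ordered)
```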
NOTE: The code in nyt_connections.py is generalized to handle games of n groups with n words in each group, even though NYT Connections itself always uses 4 groups of 4 words.
python nyt_connections.py example_nyt_connections.json 4
The number is the maximum number of incorrect guesses allowed before losing. If unspecified, the default of 4 wrong guesses is used.
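For readers curious how such a command line could be parsed, here is a small sketch using argparse. The argument names `game_file` and `max_wrong` and the `parse_args` helper are assumptions for illustration; the real script may read its arguments differently.

```python
import argparse


def parse_args(argv):
    """Parse a game file path and an optional wrong-guess limit (default 4)."""
    parser = argparse.ArgumentParser(description="Play a Connections-style game.")
    parser.add_argument("game_file", help="path to the JSON game file")
    parser.add_argument(
        "max_wrong",
        nargs="?",  # optional positional argument
        type=int,
        default=4,
        help="incorrect guesses allowed before losing",
    )
    return parser.parse_args(argv)


args = parse_args(["example_nyt_connections.json"])
print(args.game_file, args.max_wrong)
```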