Replacing words with common synonyms
In NLP, especially in frequency analysis and text indexing, it helps to compress the vocabulary
without losing meaning, because a smaller vocabulary uses less memory. To achieve this, we define a
mapping from each word to its common synonym. In the example below, we create a class named
word_syn_replacer that replaces words with their common synonyms.
Example
First, import the necessary packages: re for regular expressions and wordnet from the NLTK corpus module.
import re
from nltk.corpus import wordnet
Next, create the class that takes a word replacement mapping −
class word_syn_replacer(object):
    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        # Return the mapped synonym, or the word itself if it has no mapping
        return self.word_map.get(word, word)
Save this Python program (say, as replacesyn.py). Then, from the Python prompt or another script,
import the word_syn_replacer class whenever you want to replace words with common synonyms. Let us see
how.
from replacesyn import word_syn_replacer
rep_syn = word_syn_replacer({'bday': 'birthday'})
rep_syn.replace('bday')
Complete implementation example
replacesyn.py
import re
from nltk.corpus import wordnet

class word_syn_replacer(object):
    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        return self.word_map.get(word, word)
Once you have saved the above program, you can import the class and use it as follows:
Main.py
from replacesyn import word_syn_replacer
rep_syn = word_syn_replacer({'bday': 'birthday', 'atm': 'at the moment'})
print(rep_syn.replace('atm'))
print(rep_syn.replace('bday'))
Output
at the moment
birthday
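To make the memory argument concrete, the sketch below counts token frequencies before and after replacement. The token list and the mapping are made-up sample data, and the class is restated inline only so that the snippet runs on its own; treat it as an illustration rather than part of replacesyn.py.

```python
from collections import Counter


class word_syn_replacer(object):
    """Dictionary-backed replacer, repeated here so the snippet is standalone."""

    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        # Fall back to the word itself when no synonym is mapped.
        return self.word_map.get(word, word)


# Hypothetical sample data, chosen only for illustration.
tokens = ['happy', 'bday', 'and', 'happy', 'birthday', 'bday']
rep_syn = word_syn_replacer({'bday': 'birthday'})

before = Counter(tokens)
after = Counter(rep_syn.replace(t) for t in tokens)

# Number of distinct words before and after replacement.
print(len(before), len(after))
# The counts for 'bday' and 'birthday' are merged into one entry.
print(after['birthday'])
```

After replacement, 'bday' and 'birthday' share a single vocabulary entry, so a frequency table or index built from the tokens has one key fewer to store.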
The disadvantage of the above method is that the synonyms must be hardcoded in a Python dictionary.
Two better alternatives are a CSV file and a YAML file. We can save our synonym vocabulary in either
kind of file and construct the word_map dictionary from it. Let us understand the concept with the
help of examples.
Using CSV file
To use a CSV file for this purpose, the file should have two columns: the first column contains the
word and the second column contains the synonym meant to replace it. Let us save this file as
syn.csv. In the example below, we create a class named CSVword_syn_replacer that extends
word_syn_replacer in the replacesyn.py file and constructs the word_map dictionary from the
syn.csv file.
Example
First, import the necessary packages.
import csv
Next, create the class that builds the word replacement mapping from a CSV file:
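class CSVword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        word_map = {}
        # Each row of the file is expected to hold exactly two values: word, synonym
        with open(fname, newline='') as f:
            for line in csv.reader(f):
                word, syn = line
                word_map[word] = syn
        super(CSVword_syn_replacer, self).__init__(word_map)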
After saving it, import the CSVword_syn_replacer class when you want to replace words with common
synonyms. Let us see how.
from replacesyn import CSVword_syn_replacer
rep_syn = CSVword_syn_replacer('syn.csv')
rep_syn.replace('bday')
Output
'birthday'
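The loader above unpacks every row into exactly two values, so a blank or malformed row raises ValueError. A slightly more defensive sketch follows, under the assumption that such rows should simply be skipped. The helper name load_word_map and the sample rows are hypothetical, and the file is written to a temporary directory so that the snippet runs on its own.

```python
import csv
import os
import tempfile


def load_word_map(fname):
    """Build a {word: synonym} dict from a two-column CSV file."""
    word_map = {}
    # newline='' is what the csv module documentation recommends for file objects.
    with open(fname, newline='', encoding='utf-8') as f:
        for row in csv.reader(f):
            if len(row) != 2:
                # Skip blank or malformed rows instead of raising ValueError.
                continue
            word, syn = (cell.strip() for cell in row)
            word_map[word] = syn
    return word_map


# Write a small sample file (the rows are hypothetical sample data).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'syn.csv')
with open(path, 'w', newline='', encoding='utf-8') as f:
    csv.writer(f).writerows([['bday', 'birthday'], ['atm', 'at the moment']])
    f.write('\n')  # a trailing blank line, to show that it is ignored

word_map = load_word_map(path)
print(word_map.get('bday', 'bday'))
print(word_map.get('lol', 'lol'))
```

The resulting dictionary can be passed straight to word_syn_replacer, which keeps file parsing separate from the replacement logic.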
Complete implementation example
replacesyn.py
import re
import csv
from nltk.corpus import wordnet

class word_syn_replacer(object):
    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        return self.word_map.get(word, word)

class CSVword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        word_map = {}
        # Each row of the file is expected to hold exactly two values: word, synonym
        with open(fname, newline='') as f:
            for line in csv.reader(f):
                word, syn = line
                word_map[word] = syn
        super(CSVword_syn_replacer, self).__init__(word_map)
Once you have saved the above program, you can import the class and use it as follows:
Main.py
from replacesyn import CSVword_syn_replacer
rep_syn = CSVword_syn_replacer('syn.csv')
print(rep_syn.replace('bday'))
Output
birthday
syn.csv file content
bday,birthday
Using YAML file
Just as we used a CSV file, we can also use a YAML file for this purpose (PyYAML must be
installed). Let us save the file as syn.yaml. In the example below, we create a class named
YAMLword_syn_replacer that extends word_syn_replacer in the replacesyn.py file and constructs the
word_map dictionary from the syn.yaml file.
Example
First, import the necessary packages.
import yaml
Next, create the class that builds the word replacement mapping from a YAML file:
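class YAMLword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        # safe_load parses plain YAML data; recent PyYAML releases reject yaml.load without a Loader
        with open(fname) as f:
            word_map = yaml.safe_load(f)
        super(YAMLword_syn_replacer, self).__init__(word_map)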
After saving it, import the YAMLword_syn_replacer class when you want to replace words with common
synonyms. Let us see how.
from replacesyn import YAMLword_syn_replacer
rep_syn = YAMLword_syn_replacer('syn.yaml')
rep_syn.replace('bday')
Output
'birthday'
Complete implementation example
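import yaml

# Add this to replacesyn.py, after the word_syn_replacer class
class YAMLword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        # safe_load parses plain YAML data; recent PyYAML releases reject yaml.load without a Loader
        with open(fname) as f:
            word_map = yaml.safe_load(f)
        super(YAMLword_syn_replacer, self).__init__(word_map)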
Once you have saved the above program, you can import the class and use it as follows:
from replacesyn import YAMLword_syn_replacer
rep_syn = YAMLword_syn_replacer('syn.yaml')
rep_syn.replace('bday')
Output
'birthday'
Antonym replacement
An antonym is a word with the opposite meaning of another word, and antonym replacement is the
opposite of synonym replacement. In this section, we deal with antonym replacement, i.e., replacing
words with unambiguous antonyms by using WordNet. In the example below, we create a class named
word_antonym_replacer that has two methods: one for replacing a word and the other for removing
negations.
Example
First, import the necessary packages.
from nltk.corpus import wordnet
Next, create the class named word_antonym_replacer −
class word_antonym_replacer(object):
    def replace(self, word, pos=None):
        # Collect every antonym of every sense of the word
        antonyms = set()
        for syn in wordnet.synsets(word, pos=pos):
            for lemma in syn.lemmas():
                for antonym in lemma.antonyms():
                    antonyms.add(antonym.name())
        # Replace only when the antonym is unambiguous
        if len(antonyms) == 1:
            return antonyms.pop()
        else:
            return None

    def replace_negations(self, sent):
        i, l = 0, len(sent)
        words = []
        while i < l:
            word = sent[i]
            if word == 'not' and i+1 < l:
                ant = self.replace(sent[i+1])
                if ant:
                    # Collapse 'not' plus the next word into the antonym
                    words.append(ant)
                    i += 2
                    continue
            words.append(word)
            i += 1
        return words
Save this Python program (say, as replacerantonym.py). Then import the word_antonym_replacer class
when you want to replace words with their unambiguous antonyms. Let us see how.
from replacerantonym import word_antonym_replacer
rep_antonym = word_antonym_replacer()
rep_antonym.replace('uglify')
Output
'beautify'
sentence = ["Let us", 'not', 'uglify', 'our', 'country']
rep_antonym.replace_negations(sentence)
Output
["Let us", 'beautify', 'our', 'country']
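The negation-collapsing loop can be tried without downloading the WordNet corpus. The sketch below swaps the WordNet lookup for a plain dictionary; the class name dict_antonym_replacer and the antonym data are hypothetical and chosen only to show the three cases: an unambiguous antonym, an ambiguous one, and a trailing 'not'.

```python
class dict_antonym_replacer(object):
    """Hypothetical stand-in: antonyms come from a dict instead of WordNet."""

    def __init__(self, antonym_map):
        # Maps a word to the collection of antonyms known for it.
        self.antonym_map = antonym_map

    def replace(self, word, pos=None):
        # pos is accepted for signature compatibility but unused in this stand-in.
        antonyms = set(self.antonym_map.get(word, ()))
        # Only an unambiguous (single) antonym is accepted.
        if len(antonyms) == 1:
            return antonyms.pop()
        return None

    def replace_negations(self, sent):
        i, n = 0, len(sent)
        words = []
        while i < n:
            word = sent[i]
            if word == 'not' and i + 1 < n:
                ant = self.replace(sent[i + 1])
                if ant:
                    # Collapse 'not X' into the antonym and skip both tokens.
                    words.append(ant)
                    i += 2
                    continue
            words.append(word)
            i += 1
        return words


# Made-up antonym data: 'good' is ambiguous here, so it is left alone.
rep = dict_antonym_replacer({'uglify': ['beautify'], 'good': ['bad', 'evil']})
result_1 = rep.replace_negations(['do', 'not', 'uglify', 'it'])
result_2 = rep.replace_negations(['this', 'is', 'not', 'good'])
result_3 = rep.replace_negations(['why', 'not'])
print(result_1)
print(result_2)
print(result_3)
```

Only the first sentence changes. In the second, the lookup yields two candidate antonyms, so replace returns None and the original tokens are kept. In the third, 'not' is the last token and has nothing to negate.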
Complete implementation example
replacerantonym.py
from nltk.corpus import wordnet

class word_antonym_replacer(object):
    def replace(self, word, pos=None):
        antonyms = set()
        for syn in wordnet.synsets(word, pos=pos):
            for lemma in syn.lemmas():
                for antonym in lemma.antonyms():
                    antonyms.add(antonym.name())
        if len(antonyms) == 1:
            return antonyms.pop()
        else:
            return None

    def replace_negations(self, sent):
        i, l = 0, len(sent)
        words = []
        while i < l:
            word = sent[i]
            if word == 'not' and i+1 < l:
                ant = self.replace(sent[i+1])
                if ant:
                    words.append(ant)
                    i += 2
                    continue
            words.append(word)
            i += 1
        return words
Once you have saved the above program, you can import the class and use it as follows:
from replacerantonym import word_antonym_replacer
rep_antonym = word_antonym_replacer()
rep_antonym.replace('uglify')
sentence = ["Let us", 'not', 'uglify', 'our', 'country']
print(rep_antonym.replace_negations(sentence))
Output
['Let us', 'beautify', 'our', 'country']