Metaphors

main.py

The file main.py runs the process of finding metaphors in a text, in three steps:

1. Text segmentation
2. Finding candidates for the metaphors: a candidate is a pair of words, either adjective-noun or verb-noun
3. Labeling the candidates: is a candidate metaphorical or literal?

Command Line

Arguments:

-ml or --labeler followed by an ID: choose the metaphor labeling method:
- darkthoughts
- cluster
- kmeans
- Default value: darkthoughts
-cf or --finder followed by an ID: choose the candidate finding method:
- adjNoun
- verbNoun
- Default value: adjNoun
-v or --verbose
- Print the different steps of the process
- Default value: False
-f or --file followed by a file name: look for metaphors in a text file
-s or --string followed by a string: look for metaphors in a specified string
-cg or --cgenerator:
- Useful when combined with an Excel or CSV file: use the word pairs in the file as candidates instead of looking for candidates in the annotated text
- Default value: False

If no string or text file is specified in the command line then a default text is used.
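
The flags above could be declared with argparse roughly as follows. This is a minimal sketch, not the code in main.py: the helper name build_parser, the help strings, and the use of choices to restrict the IDs are assumptions; only the flag names, IDs, and default values come from the list above.

```python
import argparse

# Method IDs as listed in this README.
LABELERS = ["darkthoughts", "cluster", "kmeans"]
FINDERS = ["adjNoun", "verbNoun"]


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Find metaphors in a text")
    parser.add_argument("-ml", "--labeler", choices=LABELERS,
                        default="darkthoughts", help="metaphor labeling method")
    parser.add_argument("-cf", "--finder", choices=FINDERS,
                        default="adjNoun", help="candidate finding method")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print the different steps of the process")
    parser.add_argument("-f", "--file", help="text file to search for metaphors")
    parser.add_argument("-s", "--string", help="string to search for metaphors")
    parser.add_argument("-cg", "--cgenerator", action="store_true",
                        help="use the word pairs in the file as candidates")
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["-cf", "verbNoun", "-s", "time devours everything"])
print(args.finder, args.labeler, args.verbose)
```

Flags that are not passed keep their documented defaults, so the example above selects the verbNoun finder while leaving the labeler at darkthoughts.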

The Execution

Initialization

Parsing the command line
Creating a hash-table
1. Adding the candidate finder functions
2. Adding the metaphor labeler functions
Initializing the text either from:
1. Default text - defined in modules/utils.py
2. A string - written in the command line
3. A file - path in the command line
Creating the object MetaphorIdentification
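
The choice between the three text sources can be sketched as below. The helper initial_text and the placeholder DEFAULT_TEXT are hypothetical, and the precedence (string first, then file, then default) is an assumption, since this README does not say which source wins when both a string and a file are given.

```python
from pathlib import Path

# Placeholder only: the real default text is defined in modules/utils.py.
DEFAULT_TEXT = "placeholder default text"


def initial_text(string=None, file=None, default=DEFAULT_TEXT):
    """Return the raw text to analyze (hypothetical helper).

    Assumed precedence: an explicit string, then a file path,
    then the default text when neither is given.
    """
    if string is not None:
        return string
    if file is not None:
        return Path(file).read_text(encoding="utf-8")
    return default


print(initial_text(string="The old man was a sly fox."))
```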

Step 1: Text Segmentation

The AnnotatedText is created from the raw text using the nltk.word_tokenize() function. The part of speech and the lemma of each word are also determined with NLTK functions: nltk.pos_tag and nltk.WordNetLemmatizer.
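
A rough sketch of how these three NLTK calls can fit together is shown below. It is not the project's AnnotatedText code: the functions annotate and penn_to_wordnet are hypothetical, and the tag mapping is an assumption added because nltk.pos_tag returns Penn Treebank tags while WordNetLemmatizer.lemmatize expects a WordNet part-of-speech letter. Running annotate requires NLTK and its tokenizer, tagger, and WordNet data packages.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet part-of-speech letter."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"  # noun, which is also the lemmatizer's default


def annotate(text):
    """Return a list of (word, tag, lemma) triples for the given text."""
    # Imported here so that penn_to_wordnet stays usable without NLTK installed.
    import nltk

    lemmatizer = nltk.WordNetLemmatizer()
    tokens = nltk.word_tokenize(text)
    return [
        (word, tag, lemmatizer.lemmatize(word.lower(), pos=penn_to_wordnet(tag)))
        for word, tag in nltk.pos_tag(tokens)
    ]
```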

Step 2: Finding Candidates

Call the procedure MetaphorIdentification.findCandidates()

Step 3: Labeling Metaphors

Call the procedure MetaphorIdentification.labelMetaphors()

The Registry Class

Defined in /new_structure/modules/datastructs/registry.py.

Identifying metaphors in a text takes at least two steps: candidate identification and labeling. Each step can be done in several ways, and each method must be registered in the metaphorRegistry defined in /sample/modules/registry.py
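
One way to picture such a registry is a pair of hash tables that map a method ID to a function. The class below is a hypothetical sketch: its method names (add_finder, add_labeler, get_finder, get_labeler) and the stand-in functions are assumptions, not the API of registry.py.

```python
class Registry:
    """Hypothetical registry mapping method IDs to functions for each step."""

    def __init__(self):
        self.finders = {}   # candidate finder ID -> function
        self.labelers = {}  # metaphor labeler ID -> function

    def add_finder(self, name, function):
        self.finders[name] = function

    def add_labeler(self, name, function):
        self.labelers[name] = function

    def get_finder(self, name):
        return self.finders[name]

    def get_labeler(self, name):
        return self.labelers[name]


# Stand-in functions, used only to show registration and lookup.
def find_adj_noun(annotated_text):
    return []


def label_dark_thoughts(candidates, cand_type, verbose):
    return []


registry = Registry()
registry.add_finder("adjNoun", find_adj_noun)
registry.add_labeler("darkthoughts", label_dark_thoughts)

# The IDs given on the command line select the functions to run.
finder = registry.get_finder("adjNoun")
labeler = registry.get_labeler("darkthoughts")
```

Looking up an ID that was never registered raises KeyError in this sketch, which makes a mistyped method ID fail early.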

The MetaphorIdentification Class

Defined in /new_structure/modules/datastructs/MetaphorIdentification.py.

It has four fields:

rawText: string
annotatedText: class AnnotatedText from modules/datastructs/annotated_text.py
candidates: class CandidateGroup from modules/datastructs/candidate_group.py
metaphors: class MetaphorGroup from modules/datastructs/labeled_metaphor_list.py
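
The four fields and the two procedures named above can be tied together as in the sketch below. Only the field names, findCandidates, and labelMetaphors come from this README; passing the finder and labeler as arguments, the annotateText method, and the list-based stand-ins are assumptions made to keep the example self-contained.

```python
class MetaphorIdentification:
    """Sketch of the pipeline object; the real class may differ."""

    def __init__(self, rawText):
        self.rawText = rawText
        self.annotatedText = None  # AnnotatedText in the real project
        self.candidates = None     # CandidateGroup in the real project
        self.metaphors = None      # MetaphorGroup in the real project

    def annotateText(self, annotator):  # hypothetical method name
        self.annotatedText = annotator(self.rawText)

    def findCandidates(self, finder):
        self.candidates = finder(self.annotatedText)

    def labelMetaphors(self, labeler, cand_type, verbose=False):
        self.metaphors = labeler(self.candidates, cand_type, verbose)


# Stand-ins: plain lists replace the project's data structures.
def split_words(text):
    return text.split()


def adjacent_pairs(words):
    return list(zip(words, words[1:]))


def label_all_literal(candidates, cand_type, verbose):
    return [(pair, False) for pair in candidates]


identification = MetaphorIdentification("sweet dreams fade")
identification.annotateText(split_words)
identification.findCandidates(adjacent_pairs)
identification.labelMetaphors(label_all_literal, "adjNoun")
print(identification.metaphors)
```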

How to Add a New Metaphor-Labeling Function

Your function must be defined in a new file in the modules folder.

Input

The input of the function must be:

candidates:
- Type: Object of class CandidateGroup
cand_type:
- Type: string
- Value: "adjNoun" or "verbNoun"
- Usage: Corresponds to a database
verbose:
- Type: Boolean
- Usage: Display some information if its value is True

Output

The output of the function must be an object of class MetaphorGroup
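
A skeleton that respects this input and output contract might look as follows. The function always_literal and its fixed 0.5 confidence are placeholders, and the two classes are minimal stand-ins for the project's Metaphor and MetaphorGroup so that the example runs on its own; a real labeling function would import the project's classes instead.

```python
class Metaphor:
    """Minimal stand-in: a candidate with a verdict and a confidence."""

    def __init__(self, candidate, result, confidence):
        self.candidate = candidate
        self.result = result
        self.confidence = confidence


class MetaphorGroup:
    """Minimal stand-in exposing addMetaphor() and size."""

    def __init__(self):
        self.metaphors = []
        self.size = 0

    def addMetaphor(self, metaphor):
        self.metaphors.append(metaphor)
        self.size += 1


def always_literal(candidates, cand_type, verbose):
    """Hypothetical labeler that marks every candidate as literal."""
    if cand_type not in ("adjNoun", "verbNoun"):
        raise ValueError("unknown candidate type: " + cand_type)
    results = MetaphorGroup()
    for candidate in candidates:  # a CandidateGroup can be iterated over
        results.addMetaphor(Metaphor(candidate, False, 0.5))
        if verbose:
            print(f"{candidate}: literal")
    return results


# A plain list of word pairs stands in for a CandidateGroup here.
labeled = always_literal([("sweet", "dreams"), ("cold", "welcome")], "adjNoun", False)
print(labeled.size)
```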

Useful Classes

CandidateGroup

Variables
- candidates: list of objects of class Candidate
- size: number of elements in the list above
Methods
- addCandidate(candidate): Add the element candidate to the list candidates and increment the variable size by 1
- getCandidate(index): Return the candidate at position index in the list candidates
- __iter__()
- __str__()
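
The description above corresponds to a small container class along these lines. This is a sketch under assumptions: the one-candidate-per-line format of __str__ is a guess, and strings are used in place of Candidate objects.

```python
class CandidateGroup:
    """Sketch of the container described above."""

    def __init__(self):
        self.candidates = []
        self.size = 0

    def addCandidate(self, candidate):
        self.candidates.append(candidate)
        self.size += 1

    def getCandidate(self, index):
        return self.candidates[index]

    def __iter__(self):
        # Lets callers write "for candidate in group".
        return iter(self.candidates)

    def __str__(self):
        # Assumed format: one candidate per line.
        return "\n".join(str(candidate) for candidate in self.candidates)


group = CandidateGroup()
group.addCandidate("sweet dreams")
group.addCandidate("cold welcome")
for candidate in group:
    print(candidate)
```

MetaphorGroup, described next, follows the same pattern with addMetaphor and getMetaphor.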

MetaphorGroup

Variables
- metaphors: list of objects of class Metaphor
- size: number of elements in the list above
Methods
- addMetaphor(metaphor): Add the element metaphor to the list metaphors and increment the variable size by 1
- getMetaphor(index): Return the metaphor at position index in the list metaphors
- writeToCSV()
- __iter__()
- __str__()

Candidate

Variables
- annotatedText: object of class AnnotatedText
- sourceIndex: index of the source in the annotatedText
- sourceSpan: 2-tuple = (index of the first word in the source, index of the last word in the source)
- targetIndex: index of the target in the annotatedText
- targetSpan: 2-tuple = (index of the first word in the target, index of the last word in the target)
Methods
- getSource(): return the first word of the source
- getTarget(): return the first word of the target
- getFullSource()
- getFullTarget()
- __stringAdder(): helper used by getFullSource() and getFullTarget()
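
The relation between the spans and the get methods can be illustrated with the sketch below. It assumes that spans are inclusive at both ends, that the full source or target is the words of the span joined by spaces, and that a plain list of words can stand in for AnnotatedText; none of this is confirmed by the project code.

```python
class Candidate:
    """Sketch of a candidate; a list of words stands in for AnnotatedText."""

    def __init__(self, annotatedText, sourceIndex, sourceSpan, targetIndex, targetSpan):
        self.annotatedText = annotatedText
        self.sourceIndex = sourceIndex  # stored but unused in this sketch
        self.sourceSpan = sourceSpan
        self.targetIndex = targetIndex  # stored but unused in this sketch
        self.targetSpan = targetSpan

    def __stringAdder(self, span):
        # Join every word from the first to the last index of the span.
        first, last = span
        return " ".join(self.annotatedText[first:last + 1])

    def getSource(self):
        return self.annotatedText[self.sourceSpan[0]]

    def getTarget(self):
        return self.annotatedText[self.targetSpan[0]]

    def getFullSource(self):
        return self.__stringAdder(self.sourceSpan)

    def getFullTarget(self):
        return self.__stringAdder(self.targetSpan)


words = ["a", "bitterly", "cold", "welcome"]
candidate = Candidate(words, 1, (1, 2), 3, (3, 3))
print(candidate.getSource(), "|", candidate.getFullSource(), "|", candidate.getFullTarget())
```

In this example getSource() returns only "bitterly", while getFullSource() returns the whole span "bitterly cold".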

Metaphor

Variables
- candidate: object of class Candidate
- result: boolean
- confidence: number between 0 and 1
Methods
- getSource(): return candidate.getFullSource()
- getTarget(): return candidate.getFullTarget()
- getResult()
- getConfidence()
- __str__()

vivancoson/metaphor-identification
