This is an interface to pre-trained formality, informativeness, and implicature classifiers trained on the SQUINKY! corpus. The SQUINKY! corpus is a collection of 7,032 sentences manually annotated for formality, informativeness, and implicature. For details on the corpus and the annotation process, please see Lahiri (2015):
Lahiri, S. (2015). SQUINKY! A Corpus of Sentence-level Formality, Informativeness, and Implicature. arXiv preprint arXiv:1506.02306. https://arxiv.org/pdf/1506.02306.pdf
Each of the three annotations has its own logistic regression classifier trained on various syntactic features of natural language, borrowed from the (unrelated) thesis by Vincze (2015). Please consult the thesis for details on feature selection and generation:
Vincze, V. (2015). Uncertainty detection in natural language texts (Doctoral dissertation, szte).
This interface outputs probabilities of the positive and negative classes, for each of the three annotations, for a given sentence. For example, given the sentence "A BIG THANKYOU GOES TO holli!", the output will be:
({'formal': 0.0041114378047021338, 'informal': 0.99588856219529787},
{'informative': 0.011593792054814324, 'ambiguous': 0.98840620794518563},
{'implicative': 0.95996335945188804, 'verbose': 0.040036640548111957})
It is strongly recommended that you read Lahiri (2015) before attempting to interpret the results -- informativeness and implicature are complicated concepts and their meaning should not be assumed.
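As a minimal sketch of working with this output, the snippet below picks the most probable label for each annotation. It assumes the result is a tuple of three dictionaries shaped exactly like the example above; `top_label` is a hypothetical helper written for illustration and is not part of squinky.

```python
# Output copied from the example above. The tuple-of-dicts shape is taken from
# that example; `top_label` is a hypothetical helper, not part of squinky.
result = (
    {'formal': 0.0041114378047021338, 'informal': 0.99588856219529787},
    {'informative': 0.011593792054814324, 'ambiguous': 0.98840620794518563},
    {'implicative': 0.95996335945188804, 'verbose': 0.040036640548111957},
)


def top_label(probabilities):
    """Return the (label, probability) pair with the highest probability."""
    return max(probabilities.items(), key=lambda item: item[1])


labels = [top_label(scores)[0] for scores in result]
print(labels)  # ['informal', 'ambiguous', 'implicative']
```

The two probabilities for each annotation sum to 1, so the winning label is simply the one above 0.5.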
sudo pip3 install squinky
# Train the classifiers using the provided training data.
squinky train /path/to/data.csv
# Validate the precision, recall, and f1-score for the provided training data
# using a 25% train/test split.
squinky validate --split=0.25 /path/to/data.csv
# Predict the Formality, Informativeness, and Implicature of the given sentence.
squinky predict "This is a test sentence."

The formality, informativeness, and implicature classifiers have the following precision, recall, and f1-scores:
| Classifier | Precision | Recall | F1-Score |
|---|---|---|---|
| Formality | 0.82 | 0.82 | 0.82 |
| Informativeness | 0.84 | 0.84 | 0.84 |
| Implicature | 0.60 | 0.60 | 0.60 |
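For readers unfamiliar with the metric, the sketch below computes F1 as the harmonic mean of precision and recall, using the values reported above. This is the standard definition; it is an assumption that `squinky validate` reports this same quantity, and how scores are averaged across classes is not stated here. `f1_score` is a hypothetical helper for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the table above.
reported = {
    'Formality': (0.82, 0.82),
    'Informativeness': (0.84, 0.84),
    'Implicature': (0.60, 0.60),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: {f1_score(precision, recall):.2f}")
```

When precision and recall are equal, as in every row above, F1 equals that shared value.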
Lahiri (2015) provided a set of sample sentences with formality, informativeness, and implicature annotations. These classifiers have been validated against those examples. Examples [3] and [6] fail for informativeness and implicature, respectively.
| No. | Example from Lahiri (2015) | Expected FORM | Expected INFO | Expected IMPL | Predicted FORM | Predicted INFO | Predicted IMPL |
|---|---|---|---|---|---|---|---|
| [1] | A BIG THANKYOU GOES TO holli! | Low | Low | - | Low | Low | High |
| [2] | As Maoists menace continued to be unabated, the government is all set to launch the much-awaited full-fledged anti-Naxal operations at three different areas, considered trijunctions of worst Naxal-affected states. | High | High | - | High | High | Low |
| [3] | 4) 'We find no clear relation between income inequality and class-based voting.' | High | Low | - | High | High | High |
| [4] | 2) Just wipe the Mac OS X partition when u install the dapper. | Low | High | - | Low | High | Low |
| [5] | alright, well, i guess i just made a newbie mistake. | Low | - | High | Low | Low | High |
| [6] | All seven aboard the Coast Guard plane are stationed at the Coast Guard Air Station in Sacramento, Calif., where their aircraft was based. | High | - | High | High | High | Low |
| [7] | Maoists sabotaged Essar's 166-mile underground pipeline, which transfers slurry from one of India's most coveted iron ore deposits to the Bay of Bengal. | High | - | Low | High | High | Low |
| [8] | Wait. | Low | - | Low | Low | Low | Low |
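The comparison in the table can be checked mechanically. The sketch below transcribes the expected and predicted labels by hand and lists every example where a prediction disagrees with an available expected label. The data structure and the name `mismatches` are illustrative assumptions, not part of squinky.

```python
# Rows transcribed from the table above: (example, expected, predicted),
# each in (FORM, INFO, IMPL) order. None stands for a "-" cell, i.e. no
# expected label is given for that annotation.
rows = [
    (1, ('Low', 'Low', None), ('Low', 'Low', 'High')),
    (2, ('High', 'High', None), ('High', 'High', 'Low')),
    (3, ('High', 'Low', None), ('High', 'High', 'High')),
    (4, ('Low', 'High', None), ('Low', 'High', 'Low')),
    (5, ('Low', None, 'High'), ('Low', 'Low', 'High')),
    (6, ('High', None, 'High'), ('High', 'High', 'Low')),
    (7, ('High', None, 'Low'), ('High', 'High', 'Low')),
    (8, ('Low', None, 'Low'), ('Low', 'Low', 'Low')),
]
names = ('FORM', 'INFO', 'IMPL')

# Collect (example, annotation) pairs where an expected label exists and
# the prediction differs from it.
mismatches = [
    (example, name)
    for example, expected, predicted in rows
    for name, want, got in zip(names, expected, predicted)
    if want is not None and want != got
]
print(mismatches)  # [(3, 'INFO'), (6, 'IMPL')]
```

The result matches the two failures noted above: example [3] for informativeness and example [6] for implicature.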