Benjamin S. Meyers < [email protected] >
SPLAT is a command-line application designed to make it easy for linguists (both computer-oriented and non-computer-oriented) to use the Natural Language Toolkit (NLTK) to analyze virtually any text file.
SPLAT is designed to help you gather linguistic features from text files, and it assumes that most input files are not already annotated. For SPLAT to function properly, make sure that the input files you provide do not contain any annotations. Because there are so many variations of linguistic annotation schemes, it would be impossible to account for all of them in the initial parsing of input files; it is easier for you to remove any existing annotations than it is for me to do so.
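Since removing annotations is left to you, here is a minimal Python sketch of one way to do it. It assumes a hypothetical scheme in which every annotation is a tag in square or angle brackets, such as `[laugh]` or `<noise>`; SPLAT does not define this scheme, and the function name `strip_annotations` is made up for illustration. Adjust the pattern to whatever scheme your files actually use.

```python
import re

# Assumption: annotations are bracketed tags such as [laugh] or <noise>.
# This pattern is illustrative only and is not part of SPLAT.
ANNOTATION = re.compile(r"\[[^\]]*\]|<[^>]*>")


def strip_annotations(text):
    """Remove bracketed tags and tidy the whitespace they leave behind."""
    cleaned = ANNOTATION.sub(" ", text)
    # Collapse runs of spaces and tabs per line so line breaks are preserved.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in cleaned.splitlines()]
    return "\n".join(lines)
```

Write the cleaned text to a new file and pass that file to SPLAT, so the annotated original stays untouched.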
SPLAT is being developed and tested on 64-bit Ubuntu 15.10 with Python 3.4.3. Minimum requirements include:
- Python 3.4 or Later
- NLTK 3.1 or Later
- Java (for the Berkeley Parser)
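
The requirements above can be checked before installing. The following is a rough sketch, not part of SPLAT: the helper names `version_tuple` and `check_requirements` are hypothetical, and it assumes that Java counts as available when a `java` executable is found on your `PATH`.

```python
import re
import shutil
import sys


def version_tuple(version):
    """Turn a version string such as '3.8.1' into (major, minor) integers."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:2])


def check_requirements():
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 4):
        problems.append("Python 3.4 or later is required")
    try:
        import nltk
    except ImportError:
        problems.append("NLTK is not installed")
    else:
        if version_tuple(nltk.__version__) < (3, 1):
            problems.append("NLTK 3.1 or later is required")
    # Assumption: a 'java' executable on PATH is enough for the Berkeley Parser.
    if shutil.which("java") is None:
        problems.append("Java was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

If the script prints nothing, all three requirements appear to be met.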
- Ensure that Python 3.4 (or newer) is installed on your machine.
- Run the following in a command line:
pip3 install SPLAT-library
# Recommended, but not required.
echo 'alias splat="splat-cli"' >> ~/.bashrc
echo 'alias splat="splat-cli"' >> ~/.bash_profile
source ~/.bashrc

To uninstall, run the following in a command line:
pip3 uninstall SPLAT-library

splat --commands # List all available commands
splat --help # Provide helpful information
splat --info # Display version and copyright information
splat --usage # Display basic command line structure
splat bubble filename # Display the raw text from the file

splat tokens filename # List all Tokens
splat types filename # List all Types
splat ttr filename # Calculate Type-Token Ratio
splat wc filename # Word Count (Token Count)
splat uwc filename # Unique Word Count (Type Count)

splat pos filename # List Tokens with their Parts-Of-Speech
splat poscounts filename # List Part-Of-Speech Tags with their Frequencies

splat cdensity filename # Calculate Content-Density
splat idensity filename # Calculate Idea Density
splat flesch filename # Calculate Flesch Readability Ease
splat kincaid filename # Calculate Flesch-Kincaid Grade Level
splat yngve filename # Calculate Yngve-Score
splat frazier filename # Calculate Frazier-Score

splat function filename # List all Function Words
splat content filename # List all Content Words
splat ufunction filename # Unique Function Words
splat ucontent filename # Unique Content Words
splat cfr filename # Calculate Content-Function Ratio

splat utts filename # List all Utterances
splat sents filename # List all Sentences
splat alu filename # Average Utterance Length
splat als filename # Average Sentence Length
splat uttcount filename # Utterance Count
splat sentcount filename # Sentence Count
splat syllables filename # Display Number of Syllables
splat wpu filename # List the Number of Words in each Utterance
splat wps filename # List the Number of Words in each Sentence

splat mostfreq filename x # List the x Most Frequent Words
splat leastfreq filename x # List the x Least Frequent Words
splat plotfreq filename x # Draw and Display a Frequency Graph

splat disfluencies filename # Calculate various Disfluency Counts
splat dpa filename # List the Number of Disfluencies per each Dialog Act
splat dpu filename # List the Number of Disfluencies in each Utterance
splat dps filename # List the Number of Disfluencies in each Sentence

splat trees filename # List Parse-Tree Strings for each Utterance
splat maxdepth filename # Calculate Max Tree Depth
splat drawtrees filename # Draw Parse Trees

splat unigrams filename # List all Unigrams
splat bigrams filename # List all Bigrams
splat trigrams filename # List all Trigrams
splat ngrams filename n # List all n-grams

splat annotate filename # Semi-Automatically annotate the Utterances

I would like to thank Emily Prud'hommeaux and Cissi Ovesdotter-Alm for their guidance during my initial development process. I would also like to thank Bryan Meyers, my brother, for letting me bounce ideas off of him and for giving me wake-up calls when I was doing something in a less-than-intelligent (stupid) way.
| Name | Email | Website | GitHub |
|---|---|---|---|
| Emily Prud'hommeaux | [email protected] | CLaSP | |
| Cissi O. Alm | [email protected] | CLaSP | |
| Bryan T. Meyers | [email protected] | DataDrake | GitHub |
See LICENSE.md.