NLP-Modern-Neural-Networks-Meet-Linguistic-Theory

This is the group project for the course Natural Language Processing.

Description

This repository has two main folders: datasets and Experiments.

The datasets folder contains code for creating the German, Dutch and English corpora.
The Experiments folder contains the experiment code, the parameter.yaml file and replicable output.

1. Preparing the datasets

You can prepare all datasets by running sh ./createALLdatasets.sh from within the datasets folder. For English, German and Dutch, a list of all available lemmas with their wordforms is saved to [english|dutch|german]_bylemma_[orth|phon].txt.
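As a quick sanity check after running the script, the filename pattern above can be expanded into the six names it describes. The sketch below is only an illustration: the helper names expected_lemma_files and missing_lemma_files are hypothetical, and whether the files are written directly into the datasets folder is an assumption.

```python
from itertools import product
from pathlib import Path

LANGUAGES = ("english", "dutch", "german")
FEATURES = ("orth", "phon")


def expected_lemma_files():
    """Expand [english|dutch|german]_bylemma_[orth|phon].txt into six filenames."""
    return [f"{lang}_bylemma_{feat}.txt" for lang, feat in product(LANGUAGES, FEATURES)]


def missing_lemma_files(folder):
    """Return the expected filenames that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in expected_lemma_files() if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: the lemma lists are written directly into the datasets folder.
    for name in missing_lemma_files("datasets"):
        print("missing:", name)
```

If the lists end up in a subfolder instead, pass that path to missing_lemma_files.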

After that, 6100 wordforms are chosen and saved to [src|tgt]_[train|test|valid].txt. You can use these data files directly to run the experiments.
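To inspect one of these splits outside the notebooks, a minimal loader might look like the sketch below. It assumes that the src and tgt files are UTF-8 text with one example per line and that line i of the src file corresponds to line i of the tgt file; the README does not state the file format, so check this against your generated files. The function name load_split is hypothetical.

```python
from pathlib import Path

SPLITS = ("train", "valid", "test")


def load_split(folder, split):
    """Read src_<split>.txt and tgt_<split>.txt from `folder` as (source, target) pairs.

    Assumption: both files are line-aligned, one example per line.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}, expected one of {SPLITS}")
    folder = Path(folder)
    src_lines = (folder / f"src_{split}.txt").read_text(encoding="utf-8").splitlines()
    tgt_lines = (folder / f"tgt_{split}.txt").read_text(encoding="utf-8").splitlines()
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"src and tgt differ in length: {len(src_lines)} vs {len(tgt_lines)}"
        )
    return list(zip(src_lines, tgt_lines))
```

A mismatch in line counts raises an error early, which is usually easier to debug than a silent misalignment during training.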

2. Running the experiments

The Experiments folder contains experiments that replicate the two experiments in Kirov & Cotterell (2018), as well as experiments on newly generated English, German and Dutch data. Run the experiments from the Experiments folder. For experiments on phoneme features, start from experiment_phon.ipynb, which uses the data in the folders [english|dutch|german]_phon. For experiments on orthographic (grapheme) features, start from experiment_orth.ipynb, which uses the data in the folders [english|dutch|german]_orth.
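The pairing of notebooks and data folders described above can be written down as a small lookup. This is only an illustrative sketch; experiment_target is a hypothetical helper, not part of the repository.

```python
LANGUAGES = ("english", "dutch", "german")
NOTEBOOKS = {
    "phon": "experiment_phon.ipynb",  # phoneme features
    "orth": "experiment_orth.ipynb",  # orthographic (grapheme) features
}


def experiment_target(language, feature):
    """Return (notebook name, data folder name) for a language and feature type."""
    if language not in LANGUAGES:
        raise ValueError(f"unknown language {language!r}, expected one of {LANGUAGES}")
    if feature not in NOTEBOOKS:
        raise ValueError(f"unknown feature {feature!r}, expected 'phon' or 'orth'")
    return NOTEBOOKS[feature], f"{language}_{feature}"


if __name__ == "__main__":
    for language in LANGUAGES:
        for feature in NOTEBOOKS:
            notebook, data_folder = experiment_target(language, feature)
            print(f"{language} / {feature}: open {notebook}, data in {data_folder}")
```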

Repository: JingyanChen22/IK-NLP-Project-4