CEFR-SP provides 17k English sentences annotated with CEFR levels assigned by English-education professionals. For details of the corpus creation process and our CEFR-level assessment model, please refer to our paper.
The CEFR-SP corpus is in the /CEFR-SP directory, and our code for the CEFR-level assessment model is in the /src directory.
Please refer to the README in each directory for details.
Please cite the following paper if you use the above resources for your research.
Yuki Arase, Satoru Uchida, and Tomoyuki Kajiwara. 2022. CEFR-Based Sentence-Difficulty Annotation and Assessment.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), December 2022.
@inproceedings{arase:emnlp2022,
title = "{CEFR}-Based Sentence-Difficulty Annotation and Assessment",
author = "Arase, Yuki and Uchida, Satoru and Kajiwara, Tomoyuki",
booktitle = "Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)",
month = dec,
year = "2022",
}
Yuki Arase (arase [at] ist.osaka-u.ac.jp) -- please replace " [at] " with an "@" symbol.