Stars
A large, free database of audio samples (10M pronounced words), intended as a test bed for voice activity detection algorithms and for single-syllable word recognition
A malware classifier dataset built from the header field values of Portable Executable files
A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference
NERO-nlp is a PyPI package for biomedical Named Entity (Recognition) Ontology
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
A Python library designed for scraping data from the SCP wiki.
Comprehensive evaluation framework for Open Information Extraction.
Reading the data from OPIEC - an Open Information Extraction corpus
WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000+ "why" question-answer-rationale triplets.
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)