Stars
PressMint: Interoperable Corpora of Historical Newspapers
Compute various size metrics for a Git repository, flagging those that might cause problems
A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic…
katjameden / siParl
Forked from DARIAH-SI/siParl
Slovenian parliamentary corpus
Lexicons for the Multilingual UCREL Semantic Analysis System
OASIS Lexicographic Infrastructure Data Model and API (LEXIDMA) TC: A repository designed for use in development of TC chartered work products and test suites. https://github.com/oasis-tcs/lexidma
Repo for ParlaMint showcase
EveOut: Reproducible Event Dataset for Studying and Analyzing the Complex Event-Outlet Relationship
A tool for text normalisation via character-level machine translation
TomazErjavec / Stylesheets
Forked from TEIC/Stylesheets
TEI XSL Stylesheets
Benchmarking NLP tools on Slovene, Croatian and Serbian
clarinsi / classla
Forked from stanfordnlp/stanza
CLASSLA Fork of the Official Stanford NLP Python Library for Many Human Languages
Schema for modelling parliamentary debates
Java library for CLARIN's CMDI curation
TEI Lex0: dictionary samples and tools (inactive, of historical value mostly)
This research seeks to examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital projects.
Workspace for a coherent proposal for inline attributes of <w> in TEI XML
Work continues on INCEpTION: https://github.com/inception-project/inception
ufal / lindat-kontext
Forked from czcorpus/kontext
An alternative web front-end for the Manatee corpus search engine
An advanced, extensible web front-end for the Manatee-open corpus search engine