TomazErjavec
🎯 Focusing

Organizations

@UniversalDependencies


PressMint: Interoperable Corpora of Historical Newspapers

XSLT 2 9 Updated Nov 29, 2025

ParlaMint: Comparable Parliamentary Corpora

XSLT 72 55 Updated Nov 2, 2025

TEI XSL Stylesheets

XSLT 258 130 Updated Dec 11, 2025

Compute various size metrics for a Git repository, flagging those that might cause problems

Go 3,941 179 Updated Dec 1, 2025

A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic…

Jupyter Notebook 2 Updated May 6, 2024

Slovenian parliamentary corpus

HTML 1 Updated Feb 6, 2025

NLP dataset of the Slovenian Biography

XSLT 1 Updated Jun 18, 2022

Lexicons for the Multilingual UCREL Semantic Analysis System

Python 47 17 Updated Dec 15, 2025

OASIS Lexicographic Infrastructure Data Model and API (LEXIDMA) TC: A repository designed for use in development of TC chartered work products and test suites. https://github.com/oasis-tcs/lexidma

XSLT 8 8 Updated May 15, 2025

Repo for ParlaMint showcase

Jupyter Notebook 4 1 Updated Jun 22, 2021

EveOut: Reproducible Event Dataset for Studying and Analyzing the Complex Event-Outlet Relationship

Python 2 Updated Jan 10, 2021

A tool for text normalisation via character-level machine translation

Python 13 6 Updated Jun 12, 2020

TEI XSL Stylesheets

XSLT 1 Updated Apr 27, 2018

Slovenian parliamentary corpus

HTML 5 3 Updated Sep 5, 2024
Perl 1 Updated May 25, 2023

Benchmarking NLP tools on Slovene, Croatian and Serbian

Python 7 3 Updated Dec 7, 2023

CLASSLA Fork of the Official Stanford NLP Python Library for Many Human Languages

Python 46 22 Updated May 6, 2025

Schema for modelling parliamentary debates

Makefile 21 6 Updated May 23, 2022

Java library for CLARIN's CMDI curation

Java 4 Updated Dec 16, 2025
XSLT 2 3 Updated Feb 10, 2023

The Text Encoding Initiative Guidelines

HTML 317 96 Updated Dec 10, 2025

TEI Lex0: dictionary samples and tools (inactive, of historical value mostly)

9 7 Updated Mar 23, 2023

TCF XML Schema specification

Java 5 2 Updated Oct 12, 2020

This research seeks to examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital projects.

Python 56 34 Updated Dec 17, 2025

The Open Multilingual Wordnet

HTML 66 10 Updated May 6, 2024

work space for a coherent proposal for inline attributes of <w> in TEI XML

1 1 Updated Mar 21, 2017

🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the end of the line. -- 🚀 To migrate, export your annotation proje…

Java 249 95 Updated Feb 22, 2023
HTML 1 Updated Feb 13, 2018

An alternative web front-end for the Manatee corpus search engine

TypeScript 5 1 Updated Sep 20, 2021

An advanced, extensible web front-end for the Manatee-open corpus search engine

TypeScript 77 24 Updated Dec 9, 2025