shigashiyama

Organizations: @naist-nlp

Data for OpenCHJ

2 Updated Nov 12, 2025

Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.

Python 119 25 Updated Oct 13, 2025

BLEURT is a metric for Natural Language Generation based on transfer learning.

Python 769 92 Updated Aug 4, 2023

BLEURT implementation in PyTorch

Python 36 5 Updated Jan 19, 2023
Python 4 1 Updated Oct 6, 2025

A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner.

Python 134 24 Updated Oct 17, 2024

Transcription texts created on Minna de Honkoku (みんなで翻刻, https://honkoku.org), a crowdsourced transcription platform for historical Japanese documents.

17 3 Updated Apr 14, 2025

Repository includes scripts for MQM error analysis and annotation results from English to Chinese.

Python 2 Updated Sep 27, 2020

🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved benchmark datasets and supports 7+ retrieval techn…

Python 519 39 Updated Oct 23, 2025

100+ Fine-tuning Tutorial Notebooks on Google Colab, Kaggle and more.

Jupyter Notebook 3,818 542 Updated Nov 10, 2025

Official inference framework for 1-bit LLMs

Python 24,404 1,894 Updated Jun 3, 2025
Python 5 3 Updated Oct 1, 2024

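Materials and source code for the NLP2025 tutorial "Geographic Information and Language Processing: A Practical Introduction" (地理情報と言語処理 実践入門).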

Jupyter Notebook 17 1 Updated Nov 12, 2025

Morphological information data for The Tale of Genji (源氏物語).

2 Updated Mar 7, 2025
Shell 29 5 Updated Dec 2, 2024

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)

Python 158 6 Updated May 31, 2024

NLP2024 Tutorial 3: Learning by Building a Japanese Large Language Model (作って学ぶ日本語大規模言語モデル). Environment setup instructions and source code.

112 Updated Apr 2, 2024

PyTorch implementation of the EntQA paper

Python 65 13 Updated May 21, 2022

CLI for loading Wikidata subsets (or all of it) into Elasticsearch

Python 70 7 Updated Feb 3, 2022

ReFinED is an efficient and accurate entity linking (EL) system.

Python 223 49 Updated Dec 13, 2024

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Jupyter Notebook 1,828 70 Updated May 13, 2024
TypeScript 89 4 Updated Aug 3, 2025
6 2 Updated Jan 10, 2025

SikuBERT: a pre-trained language model for the Siku Quanshu (四库全书)

146 15 Updated Jul 30, 2023

VnDT: A Vietnamese Dependency Treebank

23 1 Updated Nov 6, 2021

MultiLexNorm 2021 competition system from ÚFAL

Python 15 4 Updated Dec 30, 2021
Ruby 2 Updated Nov 3, 2025

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 816 107 Updated Feb 15, 2025