A framework for training large language models. Supports LoRA, full-parameter fine-tuning, etc.; define a YAML file to start training or fine-tuning with your own models, data, and methods. Easy define and easy s…

Python 31 5 Updated Sep 19, 2024

xinyi-code / SimCSE-Pytorch

Implementation of SimCSE + ESimCSE on Chinese datasets

Python 193 32 Updated May 21, 2022

khs0415p / training-LLM2VEC-PyTorch

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

Python 2 Updated Oct 16, 2024

trestad / mitigating-reversal-curse

Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'

Python 14 Updated Aug 2, 2024

HqWu-HITCS / Awesome-Chinese-LLM

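A curated collection of open-source Chinese large language models, focusing on models that are smaller in scale, privately deployable, and cheaper to train. Covers base models, vertical-domain fine-tuning and applications, datasets, tutorials, and more.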

21,932 2,085 Updated May 19, 2025

pluto-junzeng / CNSD

Chinese natural language inference dataset (A large-scale Chinese Nature language inference and Semantic similarity calculation Dataset)

436 45 Updated Feb 10, 2020

fdschmidt93 / trident-nllb-llm2vec

Repository for "Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages"

Python 15 Updated Oct 4, 2024

Iambestfeed / llm2vec

Python 5 1 Updated May 5, 2024

KwangKa / SIMCSE_unsup

Unsupervised SimCSE for Chinese, implemented in PyTorch

Python 135 31 Updated Jul 8, 2021

princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,627 534 Updated Oct 16, 2024

yangjianxin1 / SimCSE

Reproduction of supervised and unsupervised SimCSE experiments

Python 152 26 Updated Feb 22, 2024

trueto / medbert

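This project open-sources the models from the master's thesis "Exploration and Research on the Application of BERT Models in Chinese Clinical Natural Language Processing"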

Python 133 14 Updated Apr 28, 2021

ContextualAI / gritlm

Generative Representational Instruction Tuning

Jupyter Notebook 680 49 Updated Jun 25, 2025

McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,631 135 Updated Dec 4, 2025

david-xinyuwei / david-share

Jupyter Notebook 383 74 Updated Dec 21, 2025

Acruxos / GLAME

[EMNLP 2024] Knowledge Graph Enhanced Large Language Model Editing

Python 13 2 Updated Jun 30, 2025

ruiyang-medinfo / KG-Rank

KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques

Python 48 8 Updated Dec 9, 2024

chroma-core / chroma

Open-source search and retrieval database for AI applications.

Rust 25,067 1,976 Updated Dec 21, 2025

BinNong / meet-libai

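Li Bai 👤, an outstanding poet of the Tang dynasty, holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways of popularizing traditional culture have also faced innovation and change. Although research on Li Bai's poetry in China and abroad is already quite thorough, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent that, as a generative conversational application, promotes the popularization of Li Bai's culture.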

Python 1,851 231 Updated Jul 12, 2025

yuntianhe2014 / Easy-RAG

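A RAG (retrieval-augmented generation) system suited to learning, practical use, and independent extension! Can go online to do AI search.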

Python 522 50 Updated Sep 4, 2024

awslabs / decode-answer-logical-form

Python 32 4 Updated Apr 14, 2023

amazon-science / tree-of-traversals

Python 15 2 Updated Jul 19, 2024

Zeyi-Lin / LLM-Finetune

Large language model fine-tuning: instruction fine-tuning for Qwen2VL, Qwen2, and GLM4

Jupyter Notebook 579 75 Updated May 26, 2025

UnaXff
