View UnaXff's full-sized avatar


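🌞 CareGPT (关怀GPT) is a medical large language model that also brings together dozens of publicly available medical fine-tuning datasets and openly available medical LLMs, covering LLM training, evaluation, and deployment, to accelerate the development of medical LLMs. Medical LLM, Open Source Driven for a Healthy Future.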

Python 993 125 Updated May 9, 2024

[CBLUE1] Chinese medical information processing benchmark CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark

Python 824 136 Updated May 3, 2023

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 2 Updated Jul 24, 2024

[Medical_NLP ➟ Awesome-AI4Med] medical-related LLMs, Multimodal systems, Datasets, Benchmarks, and more.

2,454 433 Updated Nov 29, 2025

Large Scale Chinese Corpus for NLP

9,839 1,558 Updated Sep 8, 2025

Large-scale Pre-training Corpus for Chinese (100G Chinese pre-training corpus)

993 82 Updated Oct 17, 2022

Chinese Wikipedia corpus generation

Python 10 Updated Sep 11, 2024

A framework for training large language models. It supports LoRA, full-parameter fine-tuning, etc.; define a YAML file to start training or fine-tuning with your own models, data, and methods. Easy to define and easy s…

Python 31 5 Updated Sep 19, 2024

Implementation of SimCSE + ESimCSE on Chinese datasets

Python 193 32 Updated May 21, 2022

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

Python 2 Updated Oct 16, 2024

Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'

Python 14 Updated Aug 2, 2024

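A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.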

21,932 2,085 Updated May 19, 2025

Chinese natural language inference dataset (A large-scale Chinese Natural Language Inference and Semantic Similarity Calculation Dataset)

436 45 Updated Feb 10, 2020

Repository for "Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages"

Python 15 Updated Oct 4, 2024
Python 5 1 Updated May 5, 2024

PyTorch implementation of unsupervised SimCSE for Chinese

Python 135 31 Updated Jul 8, 2021

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,627 534 Updated Oct 16, 2024

Reproduction of supervised and unsupervised SimCSE experiments

Python 152 26 Updated Feb 22, 2024

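This project open-sources the models from the master's thesis "Exploration and Research on the Application of BERT Models in Chinese Clinical Natural Language Processing"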

Python 133 14 Updated Apr 28, 2021

Generative Representational Instruction Tuning

Jupyter Notebook 680 49 Updated Jun 25, 2025

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,631 135 Updated Dec 4, 2025
Jupyter Notebook 383 74 Updated Dec 21, 2025

[EMNLP 2024] Knowledge Graph Enhanced Large Language Model Editing

Python 13 2 Updated Jun 30, 2025

KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques

Python 48 8 Updated Dec 9, 2024

Open-source search and retrieval database for AI applications.

Rust 25,067 1,976 Updated Dec 21, 2025

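Li Bai 👤, an outstanding poet of the Tang dynasty, holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also faced innovation and change. Although research on Li Bai's poetry in China and abroad is already quite thorough, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent that, in the form of a generative conversational application, promotes the popularization of Li Bai's culture.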

Python 1,851 231 Updated Jul 12, 2025

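A RAG (retrieval-augmented generation) system suited for learning, practical use, and independent extension! It can go online to perform AI search.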

Python 522 50 Updated Sep 4, 2024

Large language model fine-tuning: instruction fine-tuning for Qwen2VL, Qwen2, and GLM4

Jupyter Notebook 579 75 Updated May 26, 2025