
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 1,182 75 Updated Oct 8, 2025

Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and GPT-4 API, with support for top-ranked, commonly used large models such as gpt | deepseek | claude | gemini | grok.

Python 33,996 2,421 Updated Oct 10, 2025

A library for minimum Bayes risk (MBR) decoding

Python 49 6 Updated Nov 2, 2025
Python 23 4 Updated Jul 7, 2025
Python 22 2 Updated Oct 24, 2025

Repository of Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning

10 Updated Jun 18, 2025

✨✨Latest Advances on Multimodal Large Language Models

16,616 1,072 Updated Nov 4, 2025

[NeurIPS 2024] How do Large Language Models Handle Multilingualism?

Python 44 9 Updated Nov 8, 2024

Prototype of a multimodal LLM that takes text and audio as input and outputs text.

Jupyter Notebook 9 2 Updated Jul 31, 2024

Build your own visual reasoning model

Jupyter Notebook 414 27 Updated Oct 7, 2025

Open neural machine translation models and web services

Python 739 79 Updated Jun 17, 2025

s1: Simple test-time scaling

Python 6,592 764 Updated Jun 25, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 247 10 Updated Apr 15, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,859 942 Updated Nov 4, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,663 440 Updated Nov 4, 2025
Python 2 Updated Oct 10, 2024

Fully open reproduction of DeepSeek-R1

Python 25,605 2,400 Updated Sep 8, 2025

Multilingual Generative Pretrained Model

Jupyter Notebook 207 22 Updated May 13, 2024

BLEURT implementation in PyTorch

Python 36 5 Updated Jan 19, 2023

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,331 277 Updated Jul 17, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,619 256 Updated Oct 28, 2025

Paper reproduction of Google SCoRE (Training Language Models to Self-Correct via Reinforcement Learning)

Jupyter Notebook 142 23 Updated Sep 21, 2024
Python 115 19 Updated Dec 12, 2024

STACL simultaneous translation model with PaddlePaddle

JavaScript 9 1 Updated Aug 1, 2022

LLama3 Chinese personal edition

40 2 Updated Apr 26, 2024

Open Multilingual Chatbot for Everyone

1,274 73 Updated Jun 8, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,244 1,758 Updated Oct 13, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,855 3,910 Updated Nov 4, 2025

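Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting fine-tuned and modified weights, plus tutorial videos and documentation for training, inference, evaluation, and deployment.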

Python 4,166 338 Updated May 7, 2025

Best practice for training LLaMA models in Megatron-LM

Python 659 56 Updated Jan 2, 2024