Tags: microsoft/MInference

v0.1.6

add SCBench

v0.1.5.post1

V0.1.5.post1: Support LLaMA-3-70B, Multi-gpu, fix kernel / sqrt(dk)

v0.1.5

V0.1.5: Support LLaMA 3.1

v0.1.4.post4

V0.1.4.post4: Hotfix vLLM >= 0.4.1

v0.1.4.post3

V0.1.4.post3: remove flash_attn dependency

v0.1.4.post2

V0.1.4.post2: support multi-gpu, remove pycuda

v0.1.4.post1

V0.1.4.post1: support other vllm version

v0.1.4

V0.1.4: hotfix config in pip

v0.1.3

V0.1.3: add bdist cache

v0.1.2

V0.1.2: Hotfix pip setup