Practical Qwen2 inference implemented in a single Java file.
This project is the successor of llama2.java, which is based on llama2.c by Andrej Karpathy and his excellent educational videos.
Besides the educational value, this project will be used to test and tune compiler optimizations and features on the JVM, particularly for the Graal compiler.
- Single file, no dependencies
- GGUF format parser
- Qwen 2 tokenizer based on minbpe
- Qwen 2 inference with Grouped-Query Attention
- Support for Q8_0 and Q4_0 quantizations
- Simple CLI with `--chat` and `--instruct` modes
- Compatible with GraalVM's `native-image`
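To give a feel for the GGUF parsing involved, here is a minimal, illustrative sketch (not the project's actual parser; class and record names are hypothetical) that decodes the fixed GGUF header, which per the GGUF spec consists of a 4-byte magic `"GGUF"`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count, all little-endian:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch (not the project's actual code): decode the fixed
// GGUF header -- 4-byte magic "GGUF", uint32 version, uint64 tensor count,
// uint64 metadata key/value count, all little-endian.
public class GgufHeader {
    static final int GGUF_MAGIC = 0x46554747; // "GGUF" read as a little-endian uint32

    public record Header(int version, long tensorCount, long metadataKvCount) {}

    public static Header parse(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != GGUF_MAGIC) {
            throw new IllegalArgumentException("not a GGUF file");
        }
        return new Header(buf.getInt(), buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        // Synthetic 24-byte header: version 3, 2 tensors, 5 metadata entries.
        ByteBuffer b = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(GGUF_MAGIC).putInt(3).putLong(2).putLong(5);
        System.out.println(parse(b.array()));
    }
}
```

After the header come the metadata key/value pairs and tensor descriptors, which is where the real parser does its work.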
Download pure Q4_0 and/or Q8_0 quantized .gguf files from:
https://huggingface.co/collections/mukel/qwen2-666644562f3762a838f035de
Please be gentle with huggingface.co servers:

```shell
# Download the 1.5B parameter Q8_0 quantized model
curl -L -O https://huggingface.co/mukel/Qwen2-1.5B-Instruct-GGUF/resolve/main/Qwen2-1.5B-Instruct-Q8_0.gguf
```
In the wild, Q8_0 quantizations are fine, but Q4_0 quantizations are rarely pure, e.g. the `output.weights` tensor is quantized with Q6_K instead of Q4_0.
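For context, a Q4_0 block packs 32 weights into 18 bytes: a float16 scale followed by 16 bytes of packed 4-bit quants. A minimal, illustrative sketch of dequantizing one such block (class and method names are hypothetical; the layout follows ggml's `block_q4_0`):

```java
// Illustrative sketch (not the project's actual code): dequantize one
// ggml Q4_0 block. Layout: 2-byte little-endian float16 scale, then
// 16 bytes holding 32 packed 4-bit quants. Requires Java 20+ for
// Float.float16ToFloat.
public class Q40Block {
    public static final int BLOCK_SIZE = 32;   // weights per block
    public static final int BYTES = 2 + 16;    // 18 bytes per block

    public static float[] dequantize(byte[] block) {
        // Decode the float16 scale d.
        short bits = (short) ((block[0] & 0xFF) | ((block[1] & 0xFF) << 8));
        float d = Float.float16ToFloat(bits);
        float[] out = new float[BLOCK_SIZE];
        for (int j = 0; j < 16; j++) {
            int b = block[2 + j] & 0xFF;
            out[j]      = d * ((b & 0x0F) - 8); // low nibble -> first half
            out[j + 16] = d * ((b >>> 4) - 8);  // high nibble -> second half
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] block = new byte[BYTES];
        block[0] = 0x00; block[1] = 0x3C; // float16 1.0
        block[2] = 0x09;                  // low nibble 9 -> +1.0, high nibble 0 -> -8.0
        float[] w = dequantize(block);
        System.out.println(w[0] + " " + w[16]); // 1.0 -8.0
    }
}
```

Q6_K uses a larger, more elaborate super-block layout, which is why a mixed-type file complicates a decoder that only handles pure Q4_0/Q8_0.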
A pure Q4_0 quantization can be generated from a high-precision (F32, F16, BFLOAT16) .gguf source
with the quantize utility from llama.cpp as follows:

```shell
./llama-quantize --pure Qwen2-1.5B-Instruct-F32.gguf Qwen2-1.5B-Instruct-Q4_0.gguf Q4_0
```

jbang is a perfect fit for this use case, just:
```shell
jbang Qwen2.java --help
```

Or execute directly, also via jbang:

```shell
chmod +x Qwen2.java
./Qwen2.java --help
```

Or run from source with plain `java`:

```shell
java Qwen2.java --model Qwen2-1.5B-Instruct-Q8_0.gguf --chat
```

A simple Makefile is provided; run `make` to produce `qwen2.jar`, or manually:
```shell
javac -g -d target/classes Qwen2.java
jar -cvfe qwen2.jar com.llama4j.Qwen2 LICENSE -C target/classes .
```

Run the resulting `qwen2.jar` as follows:
```shell
java -jar qwen2.jar --help
java -jar qwen2.jar --model Qwen2-1.5B-Instruct-Q8_0.gguf --chat
```
Build a native image:

```shell
native-image -jar qwen2.jar -o qwen2
```

Run:

```shell
./qwen2 --help
```

For example:

```shell
./qwen2 --model Qwen2-1.5B-Instruct-Q8_0.gguf --chat
```

License: MIT