Pre-release version: this build can run the various models supported by the main branch of Ollama. Only tested on Windows.
Test environment:
Intel(R) Core(TM) Ultra 9 285H
Windows 11 24H2
OLLAMA_INTEL_GPU: true = use the Intel GPU; false = behave like stock Ollama.
OLLAMA_INTEL_IF_TYPE: SYCL = use the SYCL backend of the ggml library to discover the GPU device and automatically determine its memory size; ONEAPI = use the Level Zero library to discover the GPU device (memory size cannot be determined).
OLLAMA_NUM_GPU: when OLLAMA_INTEL_IF_TYPE=ONEAPI, this is the maximum number of layers to offload to the GPU; default = 64. When OLLAMA_INTEL_IF_TYPE=SYCL, the number of layers to offload is calculated automatically.
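For example, a minimal configuration that uses the Level Zero (ONEAPI) discovery path instead of SYCL might look like the following; the layer count is only illustrative and should match what your GPU memory can hold:
# example: discover the Intel GPU via Level Zero instead of SYCL
set OLLAMA_INTEL_GPU=true
set OLLAMA_INTEL_IF_TYPE=ONEAPI
# ONEAPI cannot read the memory size, so cap the offloaded layers manually
set OLLAMA_NUM_GPU=32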
.\ollama-intel-gpu.bat
.\ollama.exe run gemma3:12b --verbose
This document records the process of merging ggml-sycl from llama.cpp to support Intel GPUs.
Only tested on Windows with an Intel integrated graphics card.
A portable package is available at https://github.com/chnxq/ollama/releases
Install Intel oneAPI from the oneAPI Base Toolkit (OneApiBaseToolkit).
Other references: see the SYCL documentation for details.
(The default install path, C:\Program Files (x86)\Intel\oneAPI, is used in the following examples.)
cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'
Build the libraries (Windows), in the ollama root directory:
cmake -B build -G "Ninja" -DGGML_SYCL=ON -DGGML_SYCL_TARGET=INTEL -DGGML_CPU_ALL_VARIANTS=ON -DGGML_BACKEND_DL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icx -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
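If the build succeeds, the backend libraries (including the SYCL backend) should be placed under build/lib/ollama, the same path the run scripts below use for OLLAMA_LIBRARY_PATH; assuming that layout, a quick check is:
# list the built backend libraries (path assumed from OLLAMA_LIBRARY_PATH below)
dir .\build\lib\ollama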
source /opt/intel/oneapi/setvars.sh
Build the libraries (Linux), in the ollama root directory:
cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=INTEL -DGGML_CPU_ALL_VARIANTS=ON -DGGML_BACKEND_DL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j -v
Build the Go source:
go build -o ollama.exe
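To confirm the resulting binary works, you can print its version (the server itself is started in the next step):
# sanity check of the freshly built binary
.\ollama.exe --version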
In the ollama root directory:
cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'
set OLLAMA_INTEL_GPU=true
set OLLAMA_INTEL_IF_TYPE=SYCL
set OLLAMA_NUM_GPU=999
set SYCL_CACHE_PERSISTENT=1
set OLLAMA_LIBRARY_PATH=./build/lib/ollama
# run ollama server
.\ollama.exe serve
Or use the batch script ollama-intel-gpu.bat:
set OLLAMA_INTEL_GPU=true
set OLLAMA_INTEL_IF_TYPE=SYCL
set OLLAMA_NUM_GPU=64
set SYCL_CACHE_PERSISTENT=1
set OLLAMA_LIBRARY_PATH=./build/lib/ollama
set ONEAPI_ROOT=C:\Program Files (x86)\Intel\oneAPI
set PATH=%PATH%;%ONEAPI_ROOT%\2025.1\bin;./build/lib/ollama;
.\ollama.exe serve
Notes:
- set OLLAMA_NUM_GPU=xxx: xxx must be set manually, according to the number of model layers that the video memory can hold. For example, my T140 has 16 GB of shared video memory, so I set it to 64.
- The following two environment variables are required when using SYCL to discover the Intel GPU: set OLLAMA_INTEL_GPU=true and set OLLAMA_INTEL_IF_TYPE=SYCL (OLLAMA_INTEL_IF_TYPE is used in both the Go and C code of ollama and llama.cpp, and has the same name as the build parameter).
- Due to a known bug, the ggml_sycl library must be temporarily removed when running pure CPU inference (see the sketch below).
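A minimal sketch of that workaround, assuming the backend file is named ggml-sycl.dll and lives in the build output directory (both the name and the path are assumptions; check your own build):
# temporarily move the SYCL backend out of the library path for pure CPU runs
move .\build\lib\ollama\ggml-sycl.dll .\build\lib\ollama\ggml-sycl.dll.bak
# move it back when GPU inference is needed again
move .\build\lib\ollama\ggml-sycl.dll.bak .\build\lib\ollama\ggml-sycl.dll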
# run ollama test client 1
.\ollama.exe run deepseek-r1:1.5b --verbose
# run ollama test client 2
.\ollama.exe run qwen3:4b-fp16 --verbose
# run ollama test client 3
.\ollama.exe run gemma3:12b --verbose