Codestin Search App

UniCoMo: A Unified Learning-Based Cost Model for Tensorized Program Tuning

Environment Setup

git clone --recursive https://github.com/ZhW-loop/UniCoMo.git
cd UniCoMo
conda env create --file conda/build-environment.yaml
conda activate tvm-build
mkdir build && cd build
cp ../cmake/config.cmake . # set USE_LLVM ON, set USE_CUDA ON
cmake ..
make -j32
pip3 install numpy decorator attrs
pip3 install tornado
pip3 install tornado psutil 'xgboost>=1.1.0' cloudpickle
export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:$PYTHONPATH

TensorizeSet

Configuration common.py.

TensorizeSet_dir = '/path/to/your/dataset'
TRASH_PRINT = '/path/to/your/tmp_dir'

Dump neural network and subgraph.

# For tensor core with float16
python dump_relay_models.py && python dump_relay2task.py
# For avx512 or neon with int8
python dump_op2task.py

Dump tensorized programs in search space. Tensorized programs on the same semantics share the same schedules (but can also be different), while those on different semantics have different schedules (and cannot be the same).

python dump_programs.py && python programs_avx2vnni.py && python programs_neon2sdot.py

Dump tensorized programs with latency label.

python dump_measured.py

Sort the dataset files.

cd '/path/to/your/dataset'
mv ./measured_records/*tensorcore161616nn* tensorcore161616nn_dataset/measured_records
mv ./measured_records/*tensorcore161616nt* tensorcore161616nt_dataset/measured_records
mv ./measured_records/*tensorcore32816nn* tensorcore32816nn_dataset/measured_records
mv ./measured_records/*tensorcore32816nt* tensorcore32816nt_dataset/measured_records
mv ./measured_records/*tensorcore83216nn* tensorcore83216nn_dataset/measured_records
mv ./measured_records/*tensorcore83216nt* tensorcore83216nt_dataset/measured_records
mv ./measured_records/*avx512* avx512_dataset/measured_records
mv ./measured_records/*vnni* vnni_dataset/measured_records
mv ./measured_records/*neon* neon_dataset/measured_records
mv ./measured_records/*sdot* sdot_dataset/measured_records

AST Feature Mining with Attention

Configuration common.py.

TensorizeSet_dir = '/path/to/your/dataset'
TRASH_PRINT = '/path/to/your/tmp_dir'

Make TLP Embedding Table.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
python embedding_table.py --intrin_name tensorcore161616nn

Make TLP DataSet. Encoding scheduling primitives as NLP tasks.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
python tlp_make_dataset.py --intrin_name tensorcore161616nn

Transfer TLP to tensorized programs and train it on TensorizeSet, called TDTLP.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
python tlp_train.py --intrin_name tensorcore161616nn

Test TDTLP with top-k score.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
# --model_id 0-n_epoch
python tlp_eval.py --intrin_name tensorcore161616nn --model_id 37

Draw Schedule Attention Matrix.

# build cal_attention
g++ -O3 -Wall -shared -std=c++11 -fPIC `python3 -m pybind11 --includes` cal_attention.cpp -o cal_attention`python3-config --extension-suffix`

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
# --model_id 0-n_epoch
python get_attention_matrix.py --intrin_name tensorcore161616nn --model_id 37
python draw_ScheduleAttention.py --intrin_name tensorcore161616nn

Unified Feature Representation

In the implementation, unified feature representation is called tensorize notation (TN). Feature extraction and encoding based on the unified feature representation are implemented in

src/meta_schedule/feature_extractor/per_block_feature.cc

Configuration common.py.

TensorizeSet_dir = '/path/to/your/dataset'
TRASH_PRINT = '/path/to/your/tmp_dir'

Make TN DataSet with unified feature representation.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
python TN_make_dataset.py --intrin_name tensorcore161616nn

Train UniCoMo.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
python TN_train.py --intrin_name tensorcore161616nn

Test UniCoMo with top-k score.

# --intrin_name tensorcorexxx, avx512, vnni, neon, sdot
# --model_id 0-n_epoch
python TN_eval.py --intrin_name tensorcore161616nn --model_id 37

Life Long Learning

Generate a task dataset through catDataLoaderLL.py to simulate the cross-semantic scenario. The following configuration can generate the dataset of Task 7 in the paper. Following a similar configuration, generate tasks 9, 0, 1, 2, and 4 in the paper.

new_task_id = 7
LL_task_save_name = f"task{new_task_id}_train.pkl.DataLoader"
pkl_dataloader_paths = ['tensorcore161616nn_train_and_val.pkl.DataLoader',
                        'avx512_train_and_val.pkl.DataLoader',
                        'sdot_train_and_val.pkl.DataLoader']

Configuration common.py.

TensorizeSet_dir = '/path/to/your/dataset'
TRASH_PRINT = '/path/to/your/tmp_dir'

Train UniCoMo with life long learning.

python LL_train.py --last_task_id -1 --last_task_model_id -1 --new_task_id 9 --train_type standard
python LL_train.py --last_task_id 9 --last_task_model_id 37 --new_task_id 0 --train_type LL
python LL_train.py --last_task_id 0 --last_task_model_id 37 --new_task_id 1 --train_type LL
python LL_train.py --last_task_id 1 --last_task_model_id 37 --new_task_id 2 --train_type LL
python LL_train.py --last_task_id 2 --last_task_model_id 37 --new_task_id 4 --train_type LL

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.github		.github
3rdparty		3rdparty
LLlearning		LLlearning
TensorizeSet		TensorizeSet
apps		apps
ci		ci
cmake		cmake
conda		conda
configs		configs
docker		docker
docs		docs
gallery		gallery
golang		golang
include/tvm		include/tvm
jvm		jvm
licenses		licenses
python		python
rust		rust
src		src
tensorize_notation		tensorize_notation
tests		tests
tlp_attention		tlp_attention
vta		vta
web		web
.asf.yaml		.asf.yaml
.clang-format		.clang-format
.gitattributes		.gitattributes
.gitignore		.gitignore
.gitmodules		.gitmodules
.pre-commit-config.yaml		.pre-commit-config.yaml
CMakeLists.txt		CMakeLists.txt
CONTRIBUTORS.md		CONTRIBUTORS.md
KEYS		KEYS
LICENSE		LICENSE
Makefile		Makefile
NEWS.md		NEWS.md
NOTICE		NOTICE
README.md		README.md
conftest.py		conftest.py
mypy.ini		mypy.ini
pyproject.toml		pyproject.toml
version.py		version.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

UniCoMo: A Unified Learning-Based Cost Model for Tensorized Program Tuning

Environment Setup

TensorizeSet

AST Feature Mining with Attention

Unified Feature Representation

Life Long Learning

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

ZhW-loop/UniCoMo

Folders and files

Latest commit

History

Repository files navigation

UniCoMo: A Unified Learning-Based Cost Model for Tensorized Program Tuning

Environment Setup

TensorizeSet

AST Feature Mining with Attention

Unified Feature Representation

Life Long Learning

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages