
UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training

[arXiv], [project page], [online demo], [Hugging Face paper page]

Jiawei Qin1, Xucong Zhang2, Yusuke Sugano1

1The University of Tokyo, 2Delft University of Technology


Overview

This repository contains the official PyTorch implementation of both the MAE pre-training and the UniGaze gaze estimation training.

Todo:

  • ✅ Release pre-trained MAE checkpoints (B, L, H) and gaze estimation training code.
  • ✅ Release UniGaze models for inference.
  • ✅ Release the code for predicting gaze from videos.
  • ✅ (2025 June 08 updated) Release the MAE pre-training code.
  • ✅ (2025 August 25 updated) Online demo is available.

Installation

To install the required dependencies, run:

pip install -r requirements.txt
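
If you plan to train or run inference on a GPU, the quick check below (a minimal sketch, assuming PyTorch is installed through requirements.txt) confirms that the installed PyTorch build can see your CUDA device:

import torch
print(torch.__version__)              # installed PyTorch version
print(torch.cuda.is_available())      # True if a CUDA-capable GPU is visible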

Pre-training (MAE)

Please refer to MAE Pre-Training.

Training (Gaze Estimation)

For detailed training instructions, please refer to UniGaze Training.


Usage of UniGaze

Available Models

We provide the following trained models:

Filename                       Backbone    Training Data    Checkpoint
unigaze_b16_joint.pth.tar      UniGaze-B   Joint Datasets   Download (Google Drive)
unigaze_L16_joint.pth.tar      UniGaze-L   Joint Datasets   Download (Google Drive)
unigaze_h14_joint.pth.tar      UniGaze-H   Joint Datasets   Download (Google Drive)
unigaze_h14_cross_X.pth.tar    UniGaze-H   ETH-XGaze        Download (Google Drive)

Loading Pretrained Models

  • You can refer to load_gaze_model.ipynb for instructions on loading the model and integrating it into your own codebase.
    • To load only the MAE backbone, use the custom_pretrained_path argument.
    • To load a full UniGaze model (MAE backbone + gaze_fc head), load the checkpoint directly with load_state_dict:
import torch  # MAE_Gaze is the model class provided in this repository (see load_gaze_model.ipynb)

## Loading the MAE backbone only - this will not load the gaze_fc head
mae_h14 = MAE_Gaze(model_type='vit_h_14', custom_pretrained_path='checkpoints/mae_h14/mae_h14_checkpoint-299.pth')

## Loading UniGaze (MAE backbone + gaze_fc head)
unigaze_h14_crossX = MAE_Gaze(model_type='vit_h_14')  ## custom_pretrained_path does not matter here; it will be overwritten by the UniGaze weights
weight = torch.load('logs/unigaze_h14_cross_X.pth.tar', map_location='cpu')['model_state']
unigaze_h14_crossX.load_state_dict(weight, strict=True)
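
For reference, below is a minimal inference sketch; the preprocessing (a 224x224 RGB face crop with ImageNet normalization) and the output format are assumptions here, so please check load_gaze_model.ipynb and the data pipeline for the exact interface.

import numpy as np
import torch

unigaze_h14_crossX.eval()

# Hypothetical input: a 224x224 RGB face crop scaled to [0, 1] (a random array stands in for a real face patch).
face = np.random.rand(224, 224, 3).astype(np.float32)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)   # assumed ImageNet normalization
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
x = torch.from_numpy((face - mean) / std).permute(2, 0, 1).unsqueeze(0)   # shape (1, 3, 224, 224)

with torch.no_grad():
    gaze = unigaze_h14_crossX(x)   # assumed output: per-image gaze angles (e.g., pitch/yaw)
print(gaze.shape)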

Predicting Gaze from Videos

To predict gaze direction from videos, use the following script:

projdir=<...>/UniGaze/unigaze
cd ${projdir}
python predict_gaze_video.py \
    --model_cfg_path configs/model/mae_b_16_gaze.yaml  \
    -i ./input_video \
    --ckpt_resume logs/unigaze_b16_joint.pth.tar

Citation

If you find our work useful for your research, please consider citing:

@article{qin2025unigaze,
  title={UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training},
  author={Qin, Jiawei and Zhang, Xucong and Sugano, Yusuke},
  journal={arXiv preprint arXiv:2502.02307},
  year={2025}
}

We also acknowledge the excellent work on MAE.


License

This model is licensed under the ModelGo Attribution-NonCommercial-ResponsibleAI License, Version 2.0 (MG-NC-RAI-2.0); you may use this model only in compliance with the License. You may obtain a copy of the License at

https://github.com/Xtra-Computing/ModelGo/blob/main/MGL/V2/MG-BY-NC-RAI/LICENSE

A comprehensive introduction to the ModelGo license can be found here: https://www.modelgo.li/


Beyond human eye gaze estimation

Our method also works for different "faces":

(figure: example gaze predictions on different kinds of faces)


Contact

If you have any questions, feel free to contact Jiawei Qin at [email protected].
