Unity: Fully Self-Supervised Pretraining with Transformers for Recommendation

This repository contains the official implementation of the paper:

Unity: Fully Self-Supervised Pretraining with Transformers for Recommendation
Shuang Yang, Yang Yang, Tao Liu, Feng Qi, Kaushik Rangadurai, Luke Simon, Sandeep Pandey

We present Unity, a fully self-supervised learning framework for recommendation. Unity integrates three key components: 1) Unity tokenization, an event-level tokenizer that converts heterogeneous engagement features into a single sequence of compact latent tokens; 2) Pollen, a Transformer architecture designed to program arbitrary interactions in the exponential-polynomial family; and 3) a masked-language model-style self-supervised learning paradigm. We evaluated the framework extensively in both production and public settings and report promising results including multiple production launches that have landed significant topline business metric gains in both organic and ads recommendation. Our early scaling experiments also demonstrate 2.3x-4.9x higher scaling efficiency compared to previously established scaling laws.

Overview

Unity is a plug-in self-supervised pretraining module for recommendation models. It attaches a Masked Autoencoder (MAE) to the intermediate representations of any backbone model, adding a self-supervised reconstruction objective alongside the standard task loss. The framework includes:

Unity MAE: A 1D masked autoencoder that patchifies intermediate representations, masks a subset, and reconstructs them through an encoder-decoder Transformer.
Pollen: A geometric mean attention mechanism that replaces standard softmax attention, enabling interactions in the exponential-polynomial family.
Backbone integration: Demonstrated with two backbone models (WuKong and DCNv2) on public CTR prediction benchmarks.
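
The patchify-and-mask step described for Unity MAE can be illustrated with a minimal, framework-free sketch. It is not the code in common/unity_mae.py: the helper names `patchify` and `split_visible_masked`, the zero-padding of the tail, and the rule `num_keep = int(num_patches * (1 - mask_ratio))` (borrowed from common MAE practice) are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def patchify(vector: Sequence[float], patch_size: int) -> List[List[float]]:
    """Split a flat representation into consecutive patches of patch_size values."""
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    padded = list(vector)
    remainder = len(padded) % patch_size
    if remainder:
        # Assumption: pad the tail with zeros so every patch has the same length.
        padded.extend([0.0] * (patch_size - remainder))
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]


def split_visible_masked(
    num_patches: int, mask_ratio: float, rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Randomly pick which patch indices stay visible and which are masked."""
    order = list(range(num_patches))
    rng.shuffle(order)
    # Assumption: keep the floor of the unmasked fraction, as in common MAE code.
    num_keep = int(num_patches * (1.0 - mask_ratio))
    visible = sorted(order[:num_keep])
    masked = sorted(order[num_keep:])
    return visible, masked


# A 128-dimensional intermediate representation with patch_size=32 gives 4 patches;
# mask_ratio=0.5 leaves 2 visible for the encoder and 2 to be reconstructed.
representation = [float(i) for i in range(128)]
patches = patchify(representation, patch_size=32)
visible, masked = split_visible_masked(len(patches), mask_ratio=0.5, rng=random.Random(0))
print(len(patches), len(visible), len(masked))  # prints: 4 2 2
```

In an encoder-decoder setup of this kind, only the visible patches would be fed to the encoder, and the decoder would be trained to reconstruct the masked ones.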

Repository Structure

.
├── run_expid.py                  # Main training entry point
├── common/
│   ├── base_model.py             # Base model class with training loop
│   └── unity_mae.py              # Unity MAE, Pollen, and Transformer blocks
├── WuKong/
│   ├── config/
│   │   ├── model_config.yaml     # WuKong model hyperparameters
│   │   └── dataset_config.yaml   # Dataset paths and feature definitions
│   └── src/
│       └── WuKong.py             # WuKong backbone model
└── DCNv2/
    ├── config/
    │   ├── model_config.yaml     # DCNv2 model hyperparameters
    │   └── dataset_config.yaml   # Dataset paths and feature definitions
    └── src/
        └── DCNv2.py              # DCNv2 backbone model

Requirements

Python 3.8+
PyTorch
FuxiCTR (for data loading, feature processing, and metrics)
NumPy
tqdm

Install dependencies:

pip install torch numpy tqdm
pip install fuxictr

Data Preparation

This codebase supports datasets in both CSV and NPZ formats. Download the datasets and place them under ~/datasets/ so the directory structure looks like:

~/datasets/
├── KuaiVideo_x1/
│   ├── train.csv
│   ├── test.csv
│   └── item_visual_emb_dim64.h5
├── TaobaoAd_x1/
│   ├── train.csv
│   └── test.csv
└── AmazonElectronics_x1/
    ├── train.csv
    └── test.csv
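
Before launching a run, it can help to confirm that the files are where the layout above expects them. The sketch below is a hypothetical helper (`find_missing_files` is not part of this repository); it only checks for the file names shown in the tree.

```python
import os
from pathlib import Path
from typing import Dict, List

# File names copied from the directory layout shown in this README.
EXPECTED_FILES: Dict[str, List[str]] = {
    "KuaiVideo_x1": ["train.csv", "test.csv", "item_visual_emb_dim64.h5"],
    "TaobaoAd_x1": ["train.csv", "test.csv"],
    "AmazonElectronics_x1": ["train.csv", "test.csv"],
}


def find_missing_files(root: Path) -> List[str]:
    """Return the expected files (relative to root) that do not exist yet."""
    missing = []
    for dataset, filenames in EXPECTED_FILES.items():
        for filename in filenames:
            candidate = root / dataset / filename
            if not candidate.is_file():
                missing.append(str(candidate.relative_to(root)))
    return missing


if __name__ == "__main__":
    datasets_root = Path(os.path.expanduser("~/datasets"))
    for relative_path in find_missing_files(datasets_root):
        print("missing:", relative_path)
```

If you only use one of the datasets, trim `EXPECTED_FILES` accordingly.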

The config files use /home/USER/datasets/ as a placeholder. At runtime, USER is automatically replaced with your system username ($USER), so no manual path editing is needed as long as datasets are placed under ~/datasets/.
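
The substitution described above can be pictured roughly as follows. This is an illustrative sketch, not the repository's actual code: the function name `resolve_user_placeholder` is hypothetical, and the real replacement logic may differ in detail.

```python
import os


def resolve_user_placeholder(path: str, username: str) -> str:
    """Replace a leading /home/USER/ with /home/<username>/; leave other paths alone."""
    placeholder = "/home/USER/"
    if path.startswith(placeholder):
        return "/home/" + username + "/" + path[len(placeholder):]
    return path


configured = "/home/USER/datasets/KuaiVideo_x1/train.csv"
# Assumption: the username comes from the USER environment variable.
current_user = os.environ.get("USER", "unknown")
print(resolve_user_placeholder(configured, current_user))
```

Paths that do not start with the placeholder are returned unchanged, so absolute paths to other locations keep working in this sketch.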

Public benchmark datasets used in the paper (matching the directory layout above): KuaiVideo_x1, TaobaoAd_x1, and AmazonElectronics_x1.

Quick Start

Running WuKong with Unity

PYTHONPATH=.:$PYTHONPATH python run_expid.py \
    --config WuKong/config \
    --src WuKong.src \
    --expid WuKong_test \
    --gpu 0

Running DCNv2 with Unity

PYTHONPATH=.:$PYTHONPATH python run_expid.py \
    --config DCNv2/config \
    --src DCNv2.src \
    --expid DCNv2_test \
    --gpu 0
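
To launch both backbones from a script rather than the shell, the two commands above can be assembled programmatically. The sketch below only reuses the flags shown in this section; the helper name `build_run` is hypothetical, and the actual launch call is left as a comment so the snippet has no side effects.

```python
import os
import sys
from typing import Dict, List, Tuple


def build_run(backbone: str, gpu: int = 0) -> Tuple[List[str], Dict[str, str]]:
    """Build the run_expid.py command and environment for one backbone."""
    command = [
        sys.executable, "run_expid.py",
        "--config", f"{backbone}/config",
        "--src", f"{backbone}.src",
        "--expid", f"{backbone}_test",
        "--gpu", str(gpu),
    ]
    env = dict(os.environ)
    existing = env.get("PYTHONPATH", "")
    # Mirror PYTHONPATH=.:$PYTHONPATH so the repository root is importable.
    env["PYTHONPATH"] = "." + (os.pathsep + existing if existing else "")
    return command, env


for name in ("WuKong", "DCNv2"):
    command, env = build_run(name)
    print(" ".join(command[1:]))
    # To launch from the repository root:
    # subprocess.run(command, env=env, check=True)
```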

Configuration

Model hyperparameters are defined in model_config.yaml. Key Unity-specific parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| `enabled` | Enable the Unity MAE module | `True` |
| `output_mode` | How encoder output is combined with input (0=input, 1=encoder, 2=concat, 3=add) | `1` |
| `patch_size` | Size of each patch for the 1D patchification | `32` |
| `embed_dim` | Embedding dimension of the MAE encoder | `16` |
| `depth` | Number of encoder Transformer blocks | `2` |
| `num_heads` | Number of attention heads in the encoder | `2` |
| `decoder_embed_dim` | Embedding dimension of the MAE decoder | `128` |
| `decoder_depth` | Number of decoder Transformer blocks | `2` |
| `decoder_num_heads` | Number of attention heads in the decoder | `4` |
| `mask_ratio` | Fraction of patches to mask during training | `0.5` |
| `loss_weight` | Weight of the MAE reconstruction loss | `0.05` |
| `pollen_attn_type` | Set to `"pollen"` to use Pollen attention (null for standard MHA) | `"pollen"` |
| `pollen_use_value_sign` | Enable value sign in geometric mean attention | `True` |
| `pollen_use_rmsnorm` | Use RMSNorm instead of LayerNorm in Pollen blocks | `False` |
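
The four `output_mode` values can be read as the following dispatch. This is a sketch on plain Python lists rather than tensors; the function name `combine_output`, the concatenation order (input first), and the equal-length requirement for mode 3 are assumptions, not details confirmed by the implementation.

```python
from typing import List


def combine_output(output_mode: int, inputs: List[float], encoded: List[float]) -> List[float]:
    """Combine the backbone input and the MAE encoder output according to output_mode."""
    if output_mode == 0:  # pass the input through unchanged
        return list(inputs)
    if output_mode == 1:  # use the encoder output only
        return list(encoded)
    if output_mode == 2:  # concatenate (assumed order: input, then encoder output)
        return list(inputs) + list(encoded)
    if output_mode == 3:  # element-wise addition
        if len(inputs) != len(encoded):
            raise ValueError("add mode needs input and encoder output of equal length")
        return [a + b for a, b in zip(inputs, encoded)]
    raise ValueError(f"unknown output_mode: {output_mode}")


inputs = [1.0, 2.0]
encoded = [0.5, -1.0]
for mode in range(4):
    print(mode, combine_output(mode, inputs, encoded))
```

Note that mode 2 changes the width of the representation passed to downstream layers, so any layer consuming it would need to be sized accordingly.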

Integrating Unity with Your Own Model

Unity is designed as a drop-in module. To add it to a new backbone:

1. Have your model inherit from BaseModel in common/base_model.py.
2. Create a UnityMAEConfig and instantiate UnityMAE in your model's __init__.
3. In forward(), pass intermediate representations through the Unity module:

from common.unity_mae import UnityMAE, UnityMAEConfig

# In __init__:
self.unity_config = UnityMAEConfig(
    raw_input_size=feature_dim,
    input_size=feature_dim,
    output_mode=1,
    patch_size=32,
    embed_dim=16,
    depth=2,
    num_heads=2,
    decoder_embed_dim=128,
    decoder_depth=2,
    decoder_num_heads=4,
    mlp_ratio=4.0,
    mask_ratio=0.5,
    pollen_attn_type="pollen",
    pollen_use_value_sign=True,
)
self.unity = UnityMAE(self.unity_config)

# In forward():
final_out, unity_loss = self.unity(final_out)
unity_loss = self.aggregate_unity_loss(self.unity_config, unity_loss)

The unity_loss is automatically combined with the task loss in BaseModel.train_step().
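
One plausible reading of that combination, given the `loss_weight` parameter, is a weighted sum. The sketch below is an assumption about the arithmetic only; the function name `total_loss` is hypothetical and the actual logic lives in BaseModel.

```python
def total_loss(task_loss: float, reconstruction_loss: float, loss_weight: float = 0.05) -> float:
    """Assumed combination: task loss plus the weighted MAE reconstruction loss."""
    return task_loss + loss_weight * reconstruction_loss


# With the default loss_weight of 0.05, a reconstruction loss of 2.0
# contributes 0.1 on top of a task loss of 0.4.
print(round(total_loss(task_loss=0.4, reconstruction_loss=2.0), 6))
```

Under this reading, a small `loss_weight` keeps the supervised task loss dominant while the reconstruction objective acts as an auxiliary signal.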

Citation

@article{yang2025unity,
  title={Unity: Fully Self-Supervised Pretraining with Transformers for Recommendation},
  author={Yang, Shuang and Yang, Yang and Liu, Tao and Qi, Feng and Rangadurai, Kaushik and Simon, Luke and Pandey, Sandeep},
  year={2025}
}

License

This project is licensed under the Apache License 2.0. See individual source files for details.
