This repository contains the source code for the ICML 2025 paper "UltraTWD: Optimizing Ultrametric Trees for Tree-Wasserstein Distance".
We propose UltraTWD, a novel framework for accurate tree-Wasserstein distance computation using optimized ultrametric trees, significantly improving performance in text-based applications.
This repository provides a demo for tree-Wasserstein distance computation, as detailed in Section 4 of the paper.
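To make the quantity concrete: on a tree with weighted edges, the tree-Wasserstein distance between two distributions has a closed form, namely the sum over edges of the edge weight times the absolute difference in mass carried by the subtree below that edge. The following is a minimal, self-contained sketch of that formula. It is not the code in `experiment.py`; the function name `tree_wasserstein`, the parent-array encoding, and the toy tree are assumptions made for illustration.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """Closed-form tree-Wasserstein distance (illustrative sketch).

    parent[v] is the parent of node v (-1 for the root), and weight[v] is
    the length of the edge from v to parent[v]. mu and nu are mass vectors
    over the nodes with equal total mass.
    """
    n = len(parent)
    # Depth of every node, so that children are processed before parents.
    depth = [0] * n
    for v in range(n):
        u = v
        while parent[u] != -1:
            u = parent[u]
            depth[v] += 1
    order = sorted(range(n), key=lambda v: -depth[v])

    # diff[v] accumulates (mu - nu) over the subtree rooted at v.
    diff = [mu[v] - nu[v] for v in range(n)]
    total = 0.0
    for v in order:
        if parent[v] == -1:
            continue  # the root has no edge above it
        total += weight[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


# Hypothetical toy tree: root 0, internal node 1, leaves 2, 3 and 4.
# Every leaf sits at distance 2 from the root, so the tree is ultrametric.
parent = [-1, 0, 1, 1, 0]
weight = [0.0, 1.0, 1.0, 1.0, 2.0]

mu = [0, 0, 1, 0, 0]  # all mass on leaf 2
nu = [0, 0, 0, 0, 1]  # all mass on leaf 4
print(tree_wasserstein(parent, weight, mu, nu))  # 4.0 (path 2 -> 1 -> 0 -> 4)
```

Because the cost reduces to one pass over the edges, evaluating the distance is linear in the number of nodes once the tree is fixed; the quality of the approximation to the 1-Wasserstein distance then depends on how the tree and its edge weights are chosen.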
The DistancePack folder contains implementations of various Wasserstein distance computation methods. The methods are described in Appendix A.4 of the paper, and their hyperparameters are listed in Appendix B.2. This package builds on the earlier weight-optimized method and the UltraTree method.
Note: The copyright for the original code belongs to the respective authors mentioned in the linked repository.
./ - Top-level directory.
./README.md - This README file.
./experiment.py - Demo script for tree-Wasserstein distance computation.
| Dataset/ - Text dataset.
./reuters.pkl - Reuters dataset.
| DistancePack/ - Implementations for tree-Wasserstein distance methods.
./distance_pack.py - All Wasserstein distance methods.
# 1-Wasserstein Distance
./method_wasserstein.py - 1-Wasserstein distance (ground truth).
# Entropy-based Method
./method_sinkhorn.py - Sinkhorn distance.
# Tree-construction Method
./method_quadtree.py - QuadTree distance.
./method_clustertree.py - ClusterTree distance.
# Weight-optimized Method
./method_qtwd.py - qTWD distance (weight-optimized QuadTree).
./method_ctwd.py - cTWD distance (weight-optimized ClusterTree).
./class_weightOptimized.py - Class of weight-optimized methods.
# Supervised Method
./method_ultratree.py - UltraTree distance.
./class_ultratree.py - Class of UltraTree method.
# Our Methods
./method_ultratwd_mst.py - UltraTWD-MST distance. (see Section 3.3 for more details)
./method_ultratwd_ip.py - UltraTWD-IP distance.
./method_ultratwd_gd.py - UltraTWD-GD distance.
./class_ultratwd_ip.py - Class of UltraTWD-IP method. (see Section 3.4 for more details)
./class_ultratwd_gd.py - Class of UltraTWD-GD method. (see Section 3.5 for more details)
| ExpToolKit/ - Evaluation tools.
./config.py - Configuration for folder paths.
./utils_data.py - Data loading and result saving utilities.
./utils_evaluate.py - Performance evaluation functions.
./ultrametric.cpp - C++ implementation of iterative projection algorithm.
./mst.c - C implementation of minimum spanning tree algorithm.
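The listing above names an MST-based variant and a minimum spanning tree routine. As background only, the sketch below shows one classical link between minimum spanning trees and ultrametrics: the subdominant ultrametric, in which the distance between two points is the largest edge on their MST path (equivalently, the single-linkage merge height). This is an illustration under stated assumptions, not the repository's `mst.c` and not necessarily the UltraTWD-MST algorithm of Section 3.3; the function names `subdominant_ultrametric` and `is_ultrametric` are hypothetical.

```python
def subdominant_ultrametric(D):
    """Subdominant ultrametric of a symmetric distance matrix D.

    Edges are processed in increasing order, Kruskal style. When an edge of
    length w joins two components, every cross pair gets distance w, which
    is the largest edge on the MST path between those two points.
    """
    n = len(D)
    edges = sorted((D[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    comp = list(range(n))                 # point -> component id
    members = {i: [i] for i in range(n)}  # component id -> list of points
    U = [[0.0] * n for _ in range(n)]
    for w, i, j in edges:
        a, b = comp[i], comp[j]
        if a == b:
            continue  # this edge would close a cycle, so it is not in the MST
        for p in members[a]:
            for q in members[b]:
                U[p][q] = U[q][p] = w
        for q in members[b]:
            comp[q] = a
        members[a].extend(members.pop(b))
    return U


def is_ultrametric(U, tol=1e-12):
    """Check the strong triangle inequality U[i][j] <= max(U[i][k], U[k][j])."""
    n = len(U)
    return all(
        U[i][j] <= max(U[i][k], U[k][j]) + tol
        for i in range(n) for j in range(n) for k in range(n)
    )


# Hypothetical 3-point metric that violates the strong triangle inequality.
D = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]
U = subdominant_ultrametric(D)
print(U)                  # the entry for the pair (0, 2) drops from 4 to 2
print(is_ultrametric(U))  # True
```

The resulting matrix never exceeds the input entrywise, which is why this construction is a natural starting point when searching for an ultrametric close to a given metric.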
If you find this code useful for your research, please cite the paper using the following BibTeX entry.
@inproceedings{yu2025ultratwd,
title={UltraTWD: Optimizing Ultrametric Trees for Tree-Wasserstein Distance},
author={Yu, Fangchen and Chen, Yanzhen and Wei, Jiaxing and Mao, Jianfeng and Li, Wenye and Sun, Qiang},
booktitle={International Conference on Machine Learning},
year={2025}
}
If you have any problems or questions, please contact the author: Fangchen Yu (email: [email protected])