- This repository provides code for building an MRPG and for running a distance-based outlier detection (DOD) algorithm on an MRPG.
- Our algorithm supports any metric space.
- By default, the code implements the L2 (Euclidean), L1 (Manhattan), Jaccard, edit, angular, and L4 distances.
- You are free to add other distance functions.
- For details of the algorithm, see our SIGMOD 2021 paper, "Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach".
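As a minimal sketch of what an additional distance function might look like, the block below implements the Chebyshev (L-infinity) distance, which is a metric. The repository's own function signatures are not shown here, so the name `chebyshev_distance` and the `std::vector<float>` object representation are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the repository): Chebyshev (L-infinity)
// distance, i.e. the largest absolute coordinate difference between two
// equal-length vectors. It satisfies the metric axioms, including the
// triangle inequality.
inline float chebyshev_distance(const std::vector<float>& a,
                                const std::vector<float>& b) {
    assert(a.size() == b.size());  // both objects must have the same dimensionality
    float d = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        d = std::max(d, std::fabs(a[i] - b[i]));
    }
    return d;
}
```

How such a function is wired into the existing code depends on how the repository selects its distance, which should be checked in the source before adapting this sketch.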
- Linux OS (Ubuntu).
- Other operating systems have not been tested.
- g++ 7.4.0 (or higher) and OpenMP.
- Before running our DOD algorithm, build an MRPG.
- Parameters are configured via the txt files in the `parameter` directory.
- Data files must be placed in the `dataset` directory.
- You can implement data input however you like in the input_data() function in data.hpp.
- The `dataset` directory currently contains only a dummy file.
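Since data input is left to the user, here is one possible loader as a rough sketch. It assumes a plain-text format with one object per line and whitespace-separated coordinates. The name `read_dataset` and its signature are hypothetical, and the actual input_data() function in data.hpp may be declared differently.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical loader (the real input_data() in data.hpp may differ).
// Assumed format: one object per line, coordinates separated by whitespace.
inline std::vector<std::vector<float>> read_dataset(std::istream& in) {
    assert(in.good());  // caller must pass an open, readable stream
    std::vector<std::vector<float>> dataset;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        std::vector<float> object;
        float value;
        while (row >> value) {
            object.push_back(value);
        }
        if (!object.empty()) {
            dataset.push_back(object);  // blank lines are skipped
        }
    }
    return dataset;
}
```

Taking a `std::istream&` rather than a file name keeps the sketch easy to test; a `std::ifstream` opened on a file in the dataset directory can be passed in the same way.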
- To build an MRPG, first create the `result/graph` directory.
- Compile: `g++ -O3 -o mrpg.out main.cpp -std=c++11 -fopenmp`.
- Run: `./mrpg.out`.
- To run outlier detection, first create the `result` directory.
- Compile: `g++ -O3 -o greedy-pivot.out main.cpp -std=c++11 -fopenmp`.
- Run: `./greedy-pivot.out`.
- If you test a low-dimensional dataset, you may enable VP-tree based verification by setting `mode = 1` in main.cpp.
- By default, verification is done by a linear scan.
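To illustrate what verification by a linear scan can look like, the sketch below checks a single object against the whole dataset. It is not the repository's code. It assumes the usual distance-based definition, in which an object is an outlier when fewer than k other objects lie within distance r of it, and it uses L2 distance. The names `Object`, `l2_distance`, and `is_outlier_linear_scan` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Object = std::vector<float>;

// Euclidean (L2) distance between two equal-length vectors.
inline float l2_distance(const Object& a, const Object& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Hypothetical linear-scan check: returns true if object `idx` has fewer than
// k other objects within distance r. The scan stops as soon as k neighbors
// are found, because the object is then known to be an inlier.
inline bool is_outlier_linear_scan(const std::vector<Object>& data,
                                   std::size_t idx, float r, std::size_t k) {
    if (k == 0) return false;  // no object can have fewer than zero neighbors
    std::size_t neighbors = 0;
    for (std::size_t j = 0; j < data.size(); ++j) {
        if (j == idx) continue;  // assumption: an object is not its own neighbor
        if (l2_distance(data[idx], data[j]) <= r) {
            if (++neighbors >= k) return false;
        }
    }
    return true;
}
```

The early exit means inliers in dense regions are usually resolved after only a few distance computations, while true outliers always cost a full pass over the data.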
If you use our implementation, please cite the following paper.
@inproceedings{amagata2021dod,
title={Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach},
author={Amagata, Daichi and Onizuka, Makoto and Hara, Takahiro},
booktitle={SIGMOD},
pages={36--48},
year={2021}
}
Copyright (c) 2020 Daichi Amagata
This software is released under the MIT license.