This is the repository accompanying the paper "IFCA: Index-Free Commnunity-Aware Reachability Processing Over Large Dynamic Graphs."
Note: SFMT-src-1.5.1 is used to choose source and destination vertices uniformly at random during query generation, which is open-source and originally downloaded here.
To compile the query generator:
$ g++ -O2 -std=c++11 genQueryBatch.cpp SFMT-src-1.5.1/SFMT.c -o genQueryBatchTo run the query generator:
$ ./genQueryBatch dataset_path dataset_name num_partitions num_total_queryFor example, if you have a dataset in the directory /home/user/, the name of the dataset is some_dataset, and you would like to partition its time span into 20 intervals and generate 1,000,000 queries in total, then run:
$ ./genQueryBatch /home/user/ some_dataset 20 1000000To compile the test driver:
$ g++ -O2 -std=c++11 testBatch.cpp SFMT-src-1.5.1/SFMT.c -o testBatchTo run the test driver:
$ ./testBatch dataset_path dataset_name num_partitions modemode indicates the tested methods (as reported in Section 6.2).
- 0: BiBFS.
- 1: Base.
- 2: Contract.
- 3: IFCA.
One of the real datasets, Enron, is included in this repository (enron.tar.bz2) as an example.
If you would like to run testBatch on a custom dataset, please convert it to the following format:
-
Initial snapshot: should be named as
graph.[dataset_name]. For example, if the name of the dataset ismy_dataset, then the initial snapshot file should be named asgraph.my_dataset. Each line represents a directed edge in the initial snapshot. For example,1 2represents a directed edge from the vertex indexed as 1 to the vertex indexed as 2. (The vertex indexes should start from 0 and increment as consecutive integers.)
-
Update files: the i-th update file should be named as
update.[dataset_name]i. There can be multiple update files, whose indexes start from 1 and increment as consecutive integers. The update files with larger indexes come later in time. Each line represents an edge insertion or deletion event. For example,I 1 3 1represents the insertion of a directed edge from the vertex indexed as 1 to the vertex indexed as 3 at the timestamp 1; and
D 1 3 2represents the deletion of the directed edge from the vertex indexed as 1 to the vertex indexed as 3 at the timestamp 2.
These files should be placed under the same directory, the path of which will be the dataset_path for both genQueryBatch and testBatch; num_partitions is equal to (i + 1).
You may use genQueryBatch to generate queries, or generate your own queries. The i-th query file corresponding to the i-th snapshot should be named as query.[dataset_name]i. Note that the indexes of query files start from 0; the 0-th query file corresponds to the initial snapshot, and the i-th (i > 0) corresponds to the snapshot obtained after applying all the updates in the 1-th, 2-th, ..., and i-th update files to the initial snapshot. Each line represents a query, in the form of:
Q source destination timestamp timestamp ground_truth
ground_truth >= 0 indicates that it is a positive query; ground_truth == -1 indicates that it is a negative query. Note that the timestamp is treated as a dummy in the current implementation of testBatch; but you could circumvent that if you implement your own driver program for query processing.