Bloom is a server that keeps a Bloom filter (a probabilistic data structure) in memory, provides access to it over HTTP, and persists the data to disk by means of atomic, consistent snapshots.
As alternatives to building from source, consider the prebuilt Docker image, the prebuilt binaries on the Releases page, or installation from the Snap Store.
On Debian or Ubuntu, run these commands in the source directory:
sudo apt-get install build-essential libevent-dev
make
On yum-based distributions such as CentOS or RHEL, run these commands in the source directory:
sudo yum install gcc libevent2-devel make
make
Run make static instead of make to build a static binary.
On Mac OS X, assuming you are using Homebrew:
brew install libevent
make
A static build for Mac OS X is not currently available.
On FreeBSD, siege benchmarks show that building with GCC gives better performance for this application than BSD cc. If you want to use BSD cc, change the CC variable in the Makefile; the application builds with either compiler.
pkg install gcc libevent2
make
Run make static instead of make to build a static binary.
On Solaris, you have to build libevent2 first:
sudo pkg install gcc
wget https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz
tar xf libevent-2.0.22-stable.tar.gz
cd libevent-2.0.22-stable
./configure
make
sudo make install
You may also need to add /usr/local/lib to the library search path:
sudo crle
# The current settings are printed here. Check the output and append /usr/local/lib to the library path, separated by a colon
sudo crle -l /lib:/usr/lib:/usr/local/lib
After that, build Bloom from its directory:
make
A static build for Solaris is not currently available.
To run the prebuilt Docker image:
docker volume create bloom
docker run -dit \
    -v bloom:/var/lib/bloom \
    -p 8889:8889 \
    --restart unless-stopped \
    --name bloom \
    yarmak/bloom \
    /var/lib/bloom/bloom.dat
Help:
docker run -it \
    yarmak/bloom \
    -h
Run make install to install the dynamic binary.
To install from the Snap Store:
sudo snap install bloom
To start the server, run bloom <filename_for_snapshot>, or
./bloom.static <filename_for_snapshot> if you prefer the statically linked version.
Command line options:
$ bloom -h
Usage: bloom [options] SNAPSHOT_FILE
Options:
  -H BIND_ADDRESS  HTTP interface bind address. Default: 0.0.0.0
  -P BIND_PORT     HTTP interface bind port. Default: 8889
  -h               Print this help message
  -m M             Number of bits in bloom filter. Default: 2^33
  -k K             Number of hash functions. Default: 10
  -t SECONDS       Dump bloom filter snapshot to file every SECONDS
                   seconds. You can set this value to 0 if you wish
                   to disable this feature - snapshots are taken on USR1
                   signal and at exit in any case.
The default settings are suitable for holding 500,000,000 elements with a false positive probability of 0.1%. See also Utilities for a parameter calculator.
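To see how the defaults relate to that capacity, the textbook Bloom filter approximation p ≈ (1 − e^(−kn/m))^k can be evaluated directly. The sketch below is not part of Bloom and is not the code of utils/bf_calc.py; the function names are hypothetical and it only applies the standard formulas, which assume ideal, independent hash functions.

```python
import math
from typing import Tuple


def false_positive_rate(m_bits: int, k_hashes: int, n_items: int) -> float:
    """Textbook approximation of the false positive probability."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def suggest_parameters(n_items: int, p_target: float) -> Tuple[int, int]:
    """Textbook sizing: number of bits m and hash count k for n items at rate p."""
    m_bits = math.ceil(-n_items * math.log(p_target) / (math.log(2) ** 2))
    k_hashes = max(1, round(m_bits / n_items * math.log(2)))
    return m_bits, k_hashes


# Defaults from the help output: -m 2^33, -k 10; 500,000,000 elements.
m, k, n = 2 ** 33, 10, 500_000_000
print("filter size in MiB:", m // 8 // 2 ** 20)
print("estimated false positive rate:", false_positive_rate(m, k, n))
print("textbook (m, k) for a 0.1% target:", suggest_parameters(n, 0.001))
```

Under this approximation the default filter occupies 1 GiB of memory, and the estimated rate at 500,000,000 elements stays below the quoted 0.1%.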
Test whether an element is a member of a set:
$ curl http://127.0.0.1:8889/check?e=sdfdsafdsafsadf
MISSING
Add an element to the set:
$ curl http://127.0.0.1:8889/add?e=sdfdsafdsafsadf
ADDED
Check and then add in a single request:
$ curl http://127.0.0.1:8889/checkthenadd?e=aaaaaabbb
MISSING
$ curl http://127.0.0.1:8889/checkthenadd?e=aaaaaabbb
PRESENT
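The same requests can be issued from a program. The following is a minimal client sketch using only the Python standard library. It assumes the server listens on 127.0.0.1:8889 and returns the plain-text bodies shown in the curl examples; the helper names build_url and call are hypothetical and not part of Bloom.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8889"  # assumption: local server on the default port


def build_url(endpoint: str, element: str, base_url: str = BASE_URL) -> str:
    """Build a request URL, percent-encoding the element value."""
    return f"{base_url}/{endpoint}?e={quote(element, safe='')}"


def call(endpoint: str, element: str, base_url: str = BASE_URL) -> str:
    """Send a GET request and return the response body as stripped text."""
    with urlopen(build_url(endpoint, element, base_url), timeout=5) as resp:
        return resp.read().decode("utf-8").strip()


# Example usage (needs a running server, so it is left commented out):
# print(call("checkthenadd", "aaaaaabbb"))
# print(call("check", "aaaaaabbb"))
```

Percent-encoding the element matters when keys contain characters such as & or spaces, which would otherwise be misread as part of the query string.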
The server saves data to the snapshot file in the following cases:
- Server exit (on receiving SIGTERM or SIGINT).
- Timer event. By default the server dumps a snapshot to disk every 5 minutes. See also the help for option -t.
- On the SIGUSR1 signal. This way you can control the dump process yourself by sending the signal to the daemon.
The snapshot dumping process does not block request serving and uses a copy-on-write method, so the dumped data is always consistent.
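A dump can also be requested from a script, for example before a backup. The sketch below sends SIGUSR1 with the Python standard library on a POSIX system; the function name request_snapshot is hypothetical, and obtaining the server's process ID (for example with pgrep) is left to the caller.

```python
import os
import signal


def request_snapshot(pid: int) -> None:
    """Send SIGUSR1 to the process with the given PID.

    Raises ProcessLookupError if no such process exists and
    PermissionError if the caller may not signal it.
    """
    os.kill(pid, signal.SIGUSR1)
```

The call returns as soon as the signal is sent; it does not wait for the snapshot file to be written.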
- utils/collision_meter.py - Checks structure occupancy by measuring the false positive probability on completely random requests.
- utils/bf_calc.py - Calculates Bloom filter parameters for a given number of elements and false positive probability.
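The idea of measuring occupancy with random requests can be sketched as follows. This is an illustration of the general technique, not the code of utils/collision_meter.py; the names random_key and estimate_false_positive_rate are hypothetical, and is_present stands for any function that asks the filter whether a key is present.

```python
import random
import string
from typing import Callable

ALPHABET = string.ascii_letters + string.digits


def random_key(length: int = 32) -> str:
    """Generate a random key that is very unlikely to have been added before."""
    return "".join(random.choice(ALPHABET) for _ in range(length))


def estimate_false_positive_rate(
    is_present: Callable[[str], bool], trials: int = 10_000
) -> float:
    """Return the fraction of random keys that the filter reports as present."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    hits = sum(1 for _ in range(trials) if is_present(random_key()))
    return hits / trials
```

Against a running server, is_present could be a function that queries /check and compares the response to PRESENT. Because random keys were almost certainly never added, every positive answer counts as a false positive, and a rising rate indicates that the filter is filling up.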