BOOM is an alert broker. What sets it apart from other alert brokers is that it is written to be modular, scalable, and performant. Essentially, the pipeline is composed of multiple types of workers, each with a specific task:
- The `Kafka` consumer(s), reading alerts from astronomical surveys' `Kafka` topics and transferring them to `Redis`/`Valkey` in-memory queues.
- The Alert Ingestion workers, reading alerts from the `Redis`/`Valkey` queues, responsible for formatting them as BSON documents and enriching them with crossmatches from archival astronomical catalogs and other surveys before writing the formatted alert packets to a `MongoDB` database.
- The Enrichment workers, running alerts through a series of enrichment classifiers and writing the results back to the `MongoDB` database.
- The Filter workers, running user-defined filters on the alerts and sending the results to Kafka topics for other services to consume.
Workers are managed by a Scheduler that can spawn or kill workers of each type. Currently, the number of workers is static, but we are working on dynamically scaling the number of workers based on the load of the system.
BOOM also comes with an HTTP API, under development, which will allow users to query the MongoDB database, to define their own filters, and to have those filters run on alerts in real-time.
BOOM runs on macOS and Linux. You'll need:
- `Docker` and `docker compose`: used to run the database, cache/task queue, and `Kafka`.
- `Rust` (a systems programming language) >= 1.55.0.
- `tar`: used to extract archived alerts for testing purposes.
- `libssl`, `libsasl2`: required for some Rust crates that depend on native libraries for secure connections and authentication.
- If you're on Windows, you must use WSL2 (Windows Subsystem for Linux) and install a Linux distribution like Ubuntu 24.04.
- Docker: On macOS we recommend using Docker Desktop to install docker. You can download it from the website and follow the installation instructions. The website will ask you to "choose a plan", but you just need to create an account and stick with the free tier, which offers all of the features you will ever need. Once installed, you can verify the installation by running `docker --version` in your terminal, and `docker compose version` to check that docker compose is installed as well.
- Rust: You can either use rustup to install Rust, or you can use Homebrew to install it. If you choose the latter, you can run `brew install rust` in your terminal. We recommend using rustup, as it allows you to easily switch between different versions of Rust and to keep your Rust installation up to date. Once installed, you can verify the installation by running `rustc --version` in your terminal. You also want to make sure that cargo, the Rust package manager, is installed. You can verify this by running `cargo --version` in your terminal.
- System packages: these are essential for compiling and linking some Rust crates. All those used by BOOM should come with macOS by default, but if you get any errors when compiling you can try to install them again with Homebrew: `brew install openssl@3 cyrus-sasl gnu-tar`.
- Docker: You can either install Docker Desktop (same instructions as for macOS), or you can just install Docker Engine. The latter is more lightweight. You can follow the official installation instructions for your specific Linux distribution. If you only installed Docker Engine, you'll want to also install docker compose. Once installed, you can verify the installation by running `docker --version` in your terminal, and `docker compose version` to check that docker compose is installed as well.
- Rust: You can use rustup to install Rust. Once installed, you can verify the installation by running `rustc --version` in your terminal. You also want to make sure that cargo, the Rust package manager, is installed. You can verify this by running `cargo --version` in your terminal.
- `wget` and `tar`: Most Linux distributions come with `wget` and `tar` pre-installed. If not, you can install them with your package manager.
- System packages: these are essential for compiling and linking some Rust crates. On Linux, you can install them with your package manager. For example, with `apt` on Ubuntu or Debian-based systems, you can run:

  ```bash
  sudo apt update
  sudo apt install build-essential pkg-config libssl-dev libsasl2-dev -y
  ```
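Whichever platform you're on, a quick way to confirm that all of the required tools are installed and on your `PATH` is to print their versions (these are the same checks mentioned above, collected in one place):

```bash
docker --version
docker compose version
rustc --version
cargo --version
tar --version
```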
BOOM uses environment variables for sensitive configuration like passwords and API keys.
For local development, you can use the defaults in `.env.example` by copying it to `.env`:

```bash
cp .env.example .env
```

Note: Do not commit `.env` to Git or use the example values in production.
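The authoritative variable names and defaults live in `.env.example`; the snippet below is only a hypothetical illustration of the kind of MongoDB credentials the file configures, not its actual contents:

```bash
# Hypothetical example only -- check .env.example for the real variable names and defaults
MONGODB_USERNAME=mongoadmin
MONGODB_PASSWORD=change_me_for_production
```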
- Launch `Valkey`, `MongoDB`, and `Kafka` using Docker, with the provided `docker-compose.yaml` file:

  ```bash
  docker compose up -d
  ```

  This may take a couple of minutes the first time you run it, as it needs to download the docker image for each service. To check if the containers are running and healthy, run `docker ps`.

  Note: Docker Compose will automatically use the environment variables from your `.env` file to configure the MongoDB container with your specified credentials.

- Last but not least, build the Rust binaries. You can do this with or without the `--release` flag, but we recommend using it for better performance:

  ```bash
  cargo build --release
  ```
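As a quick sanity check before moving on, you can confirm that the services are healthy and that the pipeline binaries were built (this assumes cargo's default `target/release` output directory):

```bash
# List the compose services and their health status
docker compose ps

# The main pipeline binaries should be present after the build
ls target/release/ | grep -E 'kafka_producer|kafka_consumer|scheduler'
```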
To run the API server in development mode,
first ensure `cargo-watch` is installed (`cargo install cargo-watch`),
then call:
```bash
make api-dev
```

BOOM is meant to be run in production, reading from a real-time Kafka stream of astronomical alerts. That said, we made it possible to process ZTF alerts from the ZTF alerts public archive. This is a great way to test BOOM on real data at scale, and not just using the unit tests. To start a Kafka producer, you can run the following command:
```bash
cargo run --release --bin kafka_producer <SURVEY> [DATE] [PROGRAMID]
```

To see the list of all parameters, documentation, and examples, run the following command:
```bash
cargo run --release --bin kafka_producer -- --help
```

As an example, let's say you want to produce public ZTF alerts that were observed on 20240617 UTC. You can run the following command:
```bash
cargo run --release --bin kafka_producer ztf 20240617 public
```

You can leave that running in the background, and start the rest of the pipeline in another terminal.
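If you'd rather keep everything in a single terminal, one way to leave the producer running in the background is to redirect its output to a file and background the process (`producer.log` is just an arbitrary file name):

```bash
# Run the producer in the background, logging its output to producer.log
cargo run --release --bin kafka_producer ztf 20240617 public > producer.log 2>&1 &
```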
If you'd like to clear the Kafka topic before starting the producer, you can run the following command:
```bash
docker exec -it broker /opt/kafka/bin/kafka-topics.sh --bootstrap-server broker:9092 --delete --topic ztf_YYYYMMDD_programid1
```
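To double-check which topics currently exist (for example, to confirm the deletion went through), the same `kafka-topics.sh` script can list them:

```bash
docker exec -it broker /opt/kafka/bin/kafka-topics.sh --bootstrap-server broker:9092 --list
```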
Next, you can start the Kafka consumer with:

```bash
cargo run --release --bin kafka_consumer <SURVEY> [DATE] [PROGRAMID]
```

This will start a Kafka consumer, which will read the alerts from a given Kafka topic and transfer them to the Redis/Valkey in-memory queue that the processing pipeline reads from.
To continue with the previous example, you can run:
```bash
cargo run --release --bin kafka_consumer ztf 20240617 public
```

Now that alerts have been queued for processing, let's start the workers that will process them. Instead of starting each worker manually, we provide the scheduler binary. You can run it with:
```bash
cargo run --release --bin scheduler <SURVEY> [CONFIG_PATH]
```

Where `<SURVEY>` is the name of the stream you want to process.
For example, to process ZTF alerts, you can run:
```bash
cargo run --release --bin scheduler ztf
```
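If you want the scheduler to read a specific configuration file instead of the default, you can also pass the optional `[CONFIG_PATH]` argument; the path below is purely illustrative:

```bash
# Hypothetical config path -- point this at your actual configuration file
cargo run --release --bin scheduler ztf ./config.yaml
```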
The scheduler prints a variety of messages to your terminal, e.g.:

- At the start you should see a bunch of `Processed alert with candid: <alert_candid>, queueing for classification` messages, which means that the alert worker is picking up the alerts, processing them, and queueing them for classification.
- You should then see some `received alerts len: <nb_alerts>` messages, which means that the enrichment worker is processing the alerts successfully.
- You should not see anything related to the filter worker. This is normal, as we did not define any filters yet! The next version of the README will include instructions on how to upload a dummy filter to the system for testing purposes.
- What you should definitely see is a lot of `heart beat (MAIN)` messages, which means that the scheduler is running and managing the workers correctly.
Metrics are available in the Prometheus instance at http://localhost:9090. Here are some links to the Prometheus UI with useful queries already entered:
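If you prefer the command line over the web UI, you can also query the standard Prometheus HTTP API directly; for example, the built-in `up` metric shows which scrape targets Prometheus can currently reach:

```bash
curl 'http://localhost:9090/api/v1/query?query=up'
```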
To stop BOOM, you can simply stop the Kafka consumer with CTRL+C, and then stop the scheduler with CTRL+C as well.
You can also stop the docker containers with:
```bash
docker compose down
```

When you stop the scheduler, it will attempt to gracefully stop all the workers by sending them interrupt signals. This is still a work in progress, so you might see some error handling taking place in the logs.
In the next version of the README, we'll provide the user with example scripts to read the output of BOOM (i.e. the alerts that passed the filters) from Kafka topics. For now, alerts are sent back to Redis/Valkey if they pass any filters.
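Until those example scripts land, a rough way to peek at what is sitting in Redis/Valkey is to open a CLI inside the cache container; the service name `valkey` here is an assumption, so substitute whatever name your `docker-compose.yaml` uses:

```bash
# Assumes the compose service running the cache is called "valkey"
docker compose exec valkey valkey-cli KEYS '*'
```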
The logging level is configured using the RUST_LOG environment variable, which can be set to one or more directives described in the tracing_subscriber docs.
The simplest directives are "trace", "debug", "info", "warn", "error", and "off", though more advanced directives can be used to set the level for particular crates.
An example of this is boom's default directive---what boom uses when RUST_LOG is not set---which is "info,ort=error".
This directive means boom will log at the INFO level, with events from the ort crate specifically limited to ERROR.
Setting RUST_LOG overwrites the default directive. For instance, RUST_LOG=debug will show all DEBUG events from all crates (including ort).
If you need to change the general level while keeping ort events limited to ERROR, then you'll have to specify that explicitly, e.g., RUST_LOG=debug,ort=error.
If you find the filtering on ort too restrictive, but you don't want to open it up to INFO, you can set RUST_LOG=info,ort=warn.
There's nothing special about ort here; directives can be used to control events from any crate.
It's just that ort tends to be significantly "noisier" than all of our other dependencies, so it's a useful example.
Span events can be added to the log by setting the BOOM_SPAN_EVENTS environment variable to one or more of the following span lifecycle options: "new", "enter", "exit", "close", "active", "full", or "none", where multiple values are separated by a comma.
For example, to see events for when spans open and close, set BOOM_SPAN_EVENTS=new,close.
"close" is notable because it creates events with execution time information, which may be useful for profiling.
As a more complete example, the following sets the logging level to DEBUG, with ort specifically set to WARN, and enables "new" and "close" span events while running the scheduler:
```bash
RUST_LOG=debug,ort=warn BOOM_SPAN_EVENTS=new,close cargo run --bin scheduler -- ztf
```

We welcome contributions! Please read the CONTRIBUTING.md file for more information. We rely on GitHub issues to track bugs and feature requests.