Jetstream is a modern C++ application framework that provides a robust set of features for building scalable, high-performance applications. It offers support for Kafka, HTTP, ElasticSearch, TypeSense, Loggly/Logz.io, and more.
- Features
- Roadmap
- Performance
- Installation
- Building with Docker
- Running Jetstream
- Debugging
- Development
- License
- Issue Tracking
## Features

- Persistent Kafka Producer/Consumer: Acts as a reliable Kafka producer and consumer.
- HTTP to Kafka: Transforms HTTP requests into Kafka messages.
- Kafka to HTTP with Batching: Processes Kafka messages and sends them to HTTP endpoints with support for batching.
- ElasticSearch Sink: Writes data to ElasticSearch clusters.
- TypeSense Sink: Integrates with TypeSense for search capabilities.
- Loggly/Logz.io Sink: Sends logs to Loggly or Logz.io for centralized logging.
- HTTP API: Provides an HTTP API for interaction and control.
- HTTP Web Server: Serves web content over HTTP.
- HTTP Client: Makes HTTP requests to other services.
- Prometheus Exporter: Exposes metrics for monitoring with Prometheus.
- Parallelized Parsing: Parses XML/JSON/CSV data using a thread pool for high performance.
- Data Augmentation: Enhances data with additional information.
- PostgreSQL Client: Connects and interacts with PostgreSQL databases.
- TypeSense Client: Interfaces with TypeSense for advanced search features.
- Federated Search and Queries: Combines search and relational database queries into a single HTTP call (GraphQL-like functionality).
- Event/Message Router: Routes Kafka events to multiple HTTP endpoints with batching support.
Jetstream optionally works with Logport, which monitors log files and sends changes to Kafka (one line per message). Logport enables your applications to easily ship logs to Kafka as they are written and log-rotated.
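To see what Logport is shipping, you can tail the destination topic with the standard Kafka console consumer. This is only a sketch: the broker address assumes the default client port 9092, and the topic name matches the LOGPORT_TOPIC used in the example later in this README.

```bash
# Example only: point this at one of your own brokers.
kafka-console-consumer.sh --bootstrap-server 192.168.1.91:9092 \
    --topic my_logs_logger --from-beginning
```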
## Roadmap

- WebSocket Support: Enable real-time communication using WebSockets.
- RocksDB Support: Integrate RocksDB for local storage solutions.
- S3 Sink and Reader/Writer: Add support for Amazon S3 as a data sink and source.
- HTML Templates with CrowCPP: Incorporate CrowCPP for serving HTML templates.
- Web Crawler Implementation: Develop a web crawler for data collection.
- LLM Client Support: Integrate with OpenAI's Large Language Models.
## Performance

- Message Processing: Processes approximately 10,000 messages per second per partition.
- Memory Usage: Utilizes around 50 MB of runtime memory.
- CPU Usage: Typically consumes below 3% CPU, depending on the workload.
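As a rough illustration, at that rate a topic with twelve partitions consumed in parallel works out to roughly 120,000 messages per second, subject to network and downstream sink limits.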
## Installation

For detailed installation instructions, please refer to the guides below.
## Building with Docker

You can build Jetstream using Docker to containerize the application:
```bash
# Optional git aliases (the "subup" alias is used below to update submodules)
git config --global alias.st status
git config --global alias.subup "submodule update --init --recursive"

git clone --recursive https://github.com/homer6/jetstream.git
cd jetstream
git subup

docker build -t jetstream:latest -t homer6/jetstream:latest .
```

To build and push the Docker image:

```bash
make
docker build -t jetstream:latest -t homer6/jetstream:latest .
docker push homer6/jetstream:latest
```
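If you also publish versioned images, the same two commands work with an explicit tag; the version number below is purely illustrative:

```bash
# Illustrative version tag; use your own versioning scheme.
docker build -t homer6/jetstream:1.0.0 .
docker push homer6/jetstream:1.0.0
```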
## Running Jetstream

Example for sending logs to Loggly:

```bash
docker run -d \
--restart unless-stopped \
\
--env LOGPORT_BROKERS=192.168.1.91,192.168.1.92,192.168.1.93 \
--env LOGPORT_TOPIC=my_logs_logger \
--env LOGPORT_PRODUCT_CODE=prd4096 \
--env LOGPORT_HOSTNAME=my.hostname.com \
\
--env JETSTREAM_BROKERS=192.168.1.91,192.168.1.92,192.168.1.93 \
--env JETSTREAM_CONSUMER_GROUP=prd4096_mylogs \
--env JETSTREAM_TOPIC=my_logs \
--env JETSTREAM_PRODUCT_CODE=prd4096 \
--env JETSTREAM_HOSTNAME=my.hostname.com \
\
--env JETSTREAM_DESTINATION_TOKEN=my_loggly_token \
\
homer6/jetstream:latest loggly
```

Notes:
- `LOGPORT_TOPIC` and `JETSTREAM_TOPIC` must be different; otherwise Jetstream's own shipped logs would be fed back into the topic it consumes, creating a feedback loop.
- Logport is used here to ship Jetstream's own logs to Kafka.
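For example, the two topics can be created ahead of time with the standard Kafka tooling; the broker address, partition count, and replication factor below are placeholders for your own cluster settings:

```bash
# Placeholder values; adjust broker address, partitions, and replication factor to your cluster.
kafka-topics.sh --bootstrap-server 192.168.1.91:9092 --create \
    --topic my_logs --partitions 6 --replication-factor 3

kafka-topics.sh --bootstrap-server 192.168.1.91:9092 --create \
    --topic my_logs_logger --partitions 6 --replication-factor 3
```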
To run Jetstream and sink data to ElasticSearch:
```bash
./jetstream elasticsearch -t my_logs 192.168.1.91:9200
```
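To confirm that documents are reaching the cluster, you can query ElasticSearch directly; the index name written by the sink depends on your configuration, so listing all indices is the simplest check:

```bash
# Lists all indices on the target cluster; look for recently updated document counts.
curl 'http://192.168.1.91:9200/_cat/indices?v'
```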
## Debugging

To debug Jetstream using GDB within Docker:

1. Modify the Dockerfile:
   - Change the `CMD` or `ENTRYPOINT` to `/bin/bash`.
   - Optionally, switch the base image to Ubuntu.
   - Install GDB by adding `RUN apt-get update && apt-get install -y gdb` to the Dockerfile.

2. Build the Docker image:

   ```bash
   docker build -t jetstream_debug:latest .
   ```

3. Run the Docker container with debugging capabilities:

   ```bash
   docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
       --env JETSTREAM_TOPIC=my_logs jetstream_debug:latest
   ```

4. Inside the container, start GDB:

   ```bash
   gdb --args ./jetstream elasticsearch 127.0.0.1:9200
   ```
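If you prefer a non-interactive run, GDB's batch mode can be used with the same arguments (shown here purely as an example):

```bash
# Runs Jetstream under GDB and prints a backtrace if it stops on a signal or crashes.
gdb -batch -ex run -ex backtrace --args ./jetstream elasticsearch 127.0.0.1:9200
```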
## Development

To run a data job and output samples:
```bash
mkdir analysis
make -j"$(nproc)" && time ./jetstream data-job-1 /archive/data > analysis/samples.txt
```

To run the API server locally:
```bash
make -j"$(nproc)" && ./jetstream api-server
```

## License

Jetstream is released under the MIT License.
## Issue Tracking

If you encounter any issues or have suggestions for improvement, please open an issue on the project's GitHub repository.