While the bulk of this code is mature and running in production in Walmart and outside, the documentation is very much a work in progress. (Ideally there'd be a nice summary of the various projection patterns, but also much broader information discussing the tradeoffs implied in an event-centric system as a whole.)
If you're looking for a good discussion forum on these kinds of topics, look no further than the DDD-CQRS-ES Discord's #equinox channel (invite link).
The components within this repository are delivered as multi-targeted NuGet packages targeting `net6.0`:
- `Propulsion` Implements core functionality in a channel-independent fashion, including `ParallelProjector` and `StreamsProjector`. Depends on `MathNet.Numerics`, `Serilog`
  - `Streams.Prometheus`: Exposes per-scheduler metrics.
- `Propulsion.MemoryStore` Provides bindings to `Equinox.MemoryStore`. Depends on `Equinox.MemoryStore` v `4.0.0`, `FsCodec.Box`, `Propulsion`
  - `MemoryStoreSource`: Forwards from an `Equinox.MemoryStore` into a `Propulsion.Sink`, in order to enable maximum speed integration testing.
  - `Monitor.AwaitCompletion`: Enables efficient deterministic waits for Reaction processing within an integration test.
  - `ReaderCheckpoint`: ephemeral checkpoint storage for `Propulsion.DynamoStore`/`Feed`/`EventStoreDb`/`SqlStreamStore` in test contexts.
- `Propulsion.CosmosStore` Provides bindings to Azure CosmosDB. Depends on `Equinox.CosmosStore` v `4.0.0`
  - `CosmosStoreSource`: reading from CosmosDB's ChangeFeed using `Microsoft.Azure.Cosmos`
  - `CosmosStoreSink`: writing to `Equinox.CosmosStore` v `4.0.0`
  - `CosmosStorePruner`: pruning from `Equinox.CosmosStore` v `4.0.0`
  - `ReaderCheckpoint`: checkpoint storage for `Propulsion.EventStoreDb`/`DynamoStore`/`Feed`/`SqlStreamStore` using `Equinox.CosmosStore` v `4.0.0`

  (Reading and position metrics are exposed via `Propulsion.CosmosStore.Prometheus`)
- `Propulsion.DynamoStore` Provides bindings to `Equinox.DynamoStore`. Depends on `Equinox.DynamoStore` v `4.0.0`
  - `AppendsIndex`/`AppendsEpoch`: `Equinox.DynamoStore` aggregates that together form the Index Event Store
  - `DynamoStoreIndexer`: writes to `AppendsIndex`/`AppendsEpoch` (used by `Propulsion.DynamoStore.Indexer`)
  - `DynamoStoreSource`: reads from `AppendsIndex`/`AppendsEpoch` (which is populated by `Propulsion.DynamoStore.Indexer` via `DynamoStoreIndexer`)
  - `ReaderCheckpoint`: checkpoint storage for `Propulsion.DynamoStore`/`Feed`/`EventStoreDb`/`SqlStreamStore` using `Equinox.DynamoStore` v `4.0.0`
  - `Monitor.AwaitCompletion`: See `Propulsion.Feed`

  (Reading and position metrics are exposed via `Propulsion.Feed.Prometheus`)
- `Propulsion.DynamoStore.Indexer` AWS Lambda to index appends into an Index Table. Depends on `Propulsion.DynamoStore`, `Amazon.Lambda.Core`, `Amazon.Lambda.DynamoDBEvents`, `Amazon.Lambda.Serialization.SystemTextJson`
  - `Handler`: parses DynamoDB Streams Trigger input, feeds into `Propulsion.DynamoStore.DynamoStoreIndexer`
  - `Connector`: Store / environment variables wiring to connect `DynamoStreamsLambda` to the `Equinox.DynamoStore` Index Event Store
  - `Function`: AWS Lambda Function that can be fed via a DynamoDB Streams Trigger, which it passes to `Handler`

  (Diagnostics are exposed via Console to CloudWatch)
- `Propulsion.DynamoStore.Notifier` AWS Lambda to report new events indexed by the Indexer to an SNS Topic, in order to enable triggering AWS Lambdas to service Reactions without requiring a long-lived host application. Depends on `Amazon.Lambda.Core`, `Amazon.Lambda.DynamoDBEvents`, `Amazon.Lambda.Serialization.SystemTextJson`, `AWSSDK.SimpleNotificationService`
  - `Handler`: parses DynamoDB Streams Trigger input, generates a message per updated Tranche in the batch
  - `Function`: AWS Lambda Function that can be fed via a DynamoDB Streams Trigger, which passes to `Handler`

  (Diagnostics are exposed via Console to CloudWatch)
- `Propulsion.DynamoStore.Constructs` AWS Lambda CDK deploy logic. Depends on `Amazon.CDK.Lib` (and, indirectly, on the binary assets included as content in `Propulsion.DynamoStore.Indexer`/`Propulsion.DynamoStore.Notifier`)
  - `DynamoStoreIndexerLambda`: CDK wiring for `Propulsion.DynamoStore.Indexer`
  - `DynamoStoreNotifierLambda`: CDK wiring for `Propulsion.DynamoStore.Notifier`
  - `DynamoStoreReactorLambda`: CDK wiring for a Reactor that's triggered based on messages supplied by `Propulsion.DynamoStore.Notifier`
- `Propulsion.DynamoStore.Lambda` Helpers for implementing Lambda Reactors. Depends on `Amazon.Lambda.SQSEvents`, `Propulsion.Feed`
  - `SqsNotificationBatch.parse`: parses a batch of notification events (queued by the `Notifier`) in an `Amazon.Lambda.SQSEvents.SQSEvent`
  - `SqsNotificationBatch.batchResponseWithFailuresForPositionsNotReached`: Correlates the updated checkpoints with the input `SQSEvent`, generating an `SQSBatchResponse` that will requeue any notifications that have not yet been serviced.

  (Used by the `eqxShipping` template)
- `Propulsion.EventStoreDb` Provides bindings to EventStoreDB, writing via `Propulsion.EventStoreDb.EventStoreSink`. Depends on `Equinox.EventStoreDb` v `4.0.0`, `Serilog`
  - `EventStoreSource`: reading from an EventStoreDB >= `20.10` `$all` stream into a `Propulsion.Sink` using the gRPC interface. Provides throughput metrics via `Propulsion.Feed.Prometheus`
  - `EventStoreSink`: writing to `Equinox.EventStoreDb` v `4.0.0`
  - `Monitor.AwaitCompletion`: See `Propulsion.Feed`

  (Reading and position metrics are exposed via `Propulsion.Feed.Prometheus`)
- `Propulsion.Feed` Provides helpers for stream-wise consumption of a feed of information with an arbitrary interface (e.g. a third-party Feed API), including maintenance of checkpoints within such a feed. Depends on `Propulsion` and an `IFeedCheckpointStore` implementation (from e.g. `Propulsion.Cosmos` or `Propulsion.CosmosStore`)
  - `FeedSource`: Handles continual reading and checkpointing of events from a set of feeds ('tranches' of a 'source') that collectively represent a change data capture source for a custom system (roughly analogous to how a CosmosDB Container presents a changefeed). A `readTranches` function is expected to yield a `TrancheId` list; the Feed Source operates a logical reader thread per such tranche. Each individual Tranche is required to be able to represent its content as an incrementally retrievable change feed, with a monotonically increasing `Index` per `FsCodec.ITimelineEvent` read from a given tranche.
  - `PeriodicSource`: Handles regular crawling of an external datasource (such as a SQL database) where there is no way to isolate the changes since a given checkpoint (based on either the intrinsic properties of the data, or of the store itself). The source is expected to present its content as an `AsyncSeq` of `FsCodec.StreamName * FsCodec.IEventData * context`. Checkpointing occurs only when all events have been completely ingested by the Sink.
  - `Prometheus`: Exposes reading statistics to Prometheus (including metrics from `SqlStreamStore.SqlStreamStoreSource` and `EventStoreDb.EventStoreSource`)
  - `Monitor.AwaitCompletion`: Enables efficient waiting for completion of reaction processing within an integration test
- `Propulsion.Kafka` Provides bindings for producing and consuming, both stream-wise and in parallel. Includes a standard codec for use with stream-wise projection and consumption, `Propulsion.Kafka.Codec.NewtonsoftJson.RenderedSpan`. Depends on `FsKafka` v `1.7.0`-`1.9.99`, `Serilog`
- `Propulsion.SqlStreamStore` Provides bindings to SqlStreamStore, maintaining checkpoints in a SQL table using Dapper. Depends on `Propulsion.Feed`, `SqlStreamStore`, `Dapper` v `2.0`, `Microsoft.Data.SqlClient` v `1.1.3`, `Serilog`
  - `SqlStreamStoreSource`: reading from a SqlStreamStore `$all` stream into a `Propulsion.Sink`
  - `ReaderCheckpoint`: checkpoint storage for `Propulsion.Feed`/`SqlStreamStore`/`EventStoreDb` using `Dapper`, `Microsoft.Data.SqlClient`
  - `Monitor.AwaitCompletion`: See `Propulsion.Feed`

  (Reading and position metrics are exposed via `Propulsion.Feed.Prometheus`)
The ubiquitous `Serilog` dependency is solely on the core module, not on any sinks; i.e. your application wires up whichever sink(s) it chooses.
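For example, a minimal wiring sketch (assuming the application itself references a sink package such as `Serilog.Sinks.Console`; Propulsion mandates none):

```fsharp
open Serilog

// The application, not Propulsion, decides where log events go: reference
// whichever Serilog sink packages you want and wire them up when building the logger
let log =
    LoggerConfiguration()
        .MinimumLevel.Information()
        .WriteTo.Console() // supplied by the app-level Serilog.Sinks.Console reference
        .CreateLogger()
```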
- `Propulsion.Tool`: Tool used to initialize a Change Feed Processor `aux` container for `Propulsion.Cosmos` and demonstrate basic projection, including to Kafka. See quickstart.
  - CosmosDB: Initialize `-aux` Container for ChangeFeedProcessor
  - CosmosDB/DynamoStore/EventStoreDB/Feed/SqlStreamStore: adjust checkpoints
  - CosmosDB/DynamoStore/EventStoreDB: walk change feeds/indexes and/or project to Kafka
  - DynamoStore: validate and/or reindex DynamoStore Index
Propulsion supports recent versions of Equinox and other Store Clients within reason - the aim is to provide a clean way to manage phased updates from older clients to current ones by means of adjusting package references while retaining source compatibility to the maximum degree possible.
- `Propulsion.Cosmos` Provides bindings to Azure CosmosDB. Depends on `Equinox.Cosmos`, `Microsoft.Azure.DocumentDB.ChangeFeedProcessor`, `Serilog`. Deprecated, as `Equinox.CosmosStore` has superseded `Equinox.Cosmos`
  - `CosmosSource`: reading from CosmosDB's ChangeFeed by wrapping the `dotnet-changefeedprocessor` library
  - `CosmosSink`: writing to `Equinox.Cosmos` v `2.6.0`
  - `CosmosPruner`: pruning from `Equinox.Cosmos` v `2.6.0`
  - `ReaderCheckpoint`: checkpoint storage for `Propulsion.DynamoStore`/`Feed`/`EventStoreDb`/`SqlStreamStore` using `Equinox.Cosmos` v `2.6.0`

  (Reading and position metrics are exposed via `Propulsion.Cosmos.Prometheus`)
- `Propulsion.CosmosStore3` Provides bindings to Azure CosmosDB. Depends on `Equinox.CosmosStore` v `3.0.7`, `Microsoft.Azure.Cosmos` v `3.27.0`. Deprecated; only intended for use in migration from `Propulsion.Cosmos` and/or `Equinox.Cosmos`
  - `CosmosStoreSource`: reading from CosmosDB's ChangeFeed using `Microsoft.Azure.Cosmos` (relies on explicit checkpointing that entered GA in `3.21.0`)
  - `CosmosStoreSink`: writing to `Equinox.CosmosStore` v `3.0.7`
  - `CosmosStorePruner`: pruning from `Equinox.CosmosStore` v `3.0.7`
  - `ReaderCheckpoint`: checkpoint storage for `Propulsion.EventStoreDb`/`DynamoStore`/`Feed`/`SqlStreamStore` using `Equinox.CosmosStore` v `3.0.7`

  (Reading and position metrics are exposed via `Propulsion.CosmosStore.Prometheus`)
- `Propulsion.EventStore` Provides bindings to EventStore, writing via `Propulsion.EventStore.EventStoreSink`. Depends on `Equinox.EventStore` v `4.0.0`, `Serilog`. Deprecated, as reading (and writing) relies on the legacy EventStoreDB TCP interface
  - Contains an ultra-high throughput striped reader implementation
  - Presently used by the `proSync` template

  (Reading and position metrics are emitted to Console / Serilog; no Prometheus support)
- See the Equinox QuickStart for examples of using this library to project to Kafka from `Equinox.Cosmos` and/or `Equinox.EventStore`.
- See the `dotnet new` templates repo for examples using the packages herein:
  - Propulsion-specific templates:
    - `proProjector` template, for `CosmosStoreSource`+`StreamsProjector` logic consuming from a CosmosDB `ChangeFeedProcessor`
    - `proProjector` template (in `--kafka` mode), for producer logic using `StreamsProducerSink` or `ParallelProducerSink`
    - `proConsumer` template, for example consumer logic using `ParallelConsumer` and `StreamsConsumer` etc.
    - `proReactor` template, which includes multiple sources and multiple processing modes
    - `proCosmosReactor`: a more legible version of the `proReactor` template; currently only supports `Propulsion.CosmosStore`
    - `summaryConsumer` template: consumes from the output of a `proReactor --kafka`, saving the summaries in an `Equinox.CosmosStore` store
    - `trackingConsumer` template: consumes from Kafka, feeding into example Ingester logic in an `Equinox.CosmosStore` store
    - `proSync` template: a fully fledged store <-> store synchronization tool, syncing from a `CosmosStoreSource` or `EventStoreSource` to a `CosmosStoreSink` or `EventStoreSink`
    - `feedConsumer`, `feedSource`: templates illustrating usage of `Propulsion.Feed.FeedSource`
    - `periodicIngester`: template illustrating usage of `Propulsion.Feed.PeriodicSource`
    - `proArchiver`, `proPruner`: templates illustrating usage of hot/cold support and support for secondary fallback in `Equinox.CosmosStore`
- See the `FsKafka` repo for `BatchedProducer` and `BatchedConsumer` implementations (together with the `KafkaConsumerConfig` and `KafkaProducerConfig` used in the Parallel and Streams wrappers in `Propulsion.Kafka`)
Propulsion and Equinox have a yin-and-yang relationship; their use cases naturally interlock and overlap.
It can be relevant to peruse the Equinox Documentation's Overview Diagrams for the perspective from the other side (TL;DR it's largely the same topology, with elements that are de-emphasized here being central over there, and vice versa).
C4 Context diagram
While Equinox focuses on the Consistent Processing element of building an event-sourced decision processing system, offering tailored components that interact with a specific Consistent Event Store, Propulsion elements support the building of complementary facilities as part of an overall Application:
- Ingesters: read stuff from outside the Bounded Context of the System. This kind of service covers aspects such as feeding reference data into Read Models and ingesting changes into a consistent model via Consistent Processing. These services are not acting in reaction to events emanating from the Consistent Event Store, as opposed to...
- Publishers: react to events as they arrive from the Consistent Event Store by filtering, rendering and producing to feeds for downstream consumers. While these services may in some cases rely on synchronous queries via Consistent Processing, they never transact or drive follow-on work; which brings us to...
- Reactors: drive reactive actions triggered by either upstream feeds, or events observed in the Consistent Event Store. These services handle anything beyond the duties of Ingesters or Publishers, and will often drive follow-on processing via Process Managers and/or transacting via Consistent Processing. In some cases, a reactor app's function may be to progressively compose a notification for a Publisher to eventually publish.
The overall territory is laid out here in this C4 System Context Diagram:
See Overview section in DOCUMENTATION.md for further drill down
dotnet tool uninstall Propulsion.Tool -g
dotnet tool install Propulsion.Tool -g
propulsion init -ru 400 cosmos # generates a -aux container for the ChangeFeedProcessor to maintain consumer group progress within
# -V for verbose ChangeFeedProcessor logging
# `-g projector1` represents the consumer group - >=1 are allowed, allowing multiple independent projections to run concurrently
# stats specifies one only wants stats regarding items (other options include `kafka` to project to Kafka)
# cosmos specifies source overrides (using defaults in step 1 in this instance)
propulsion -V project -g projector1 stats cosmos
# load events with 2 parallel readers, detailed store logging and a read timeout of 20s
propulsion -VS project -g projector1 stats dynamo -rt 20 -d 2

2. Use `propulsion` tool to run a CosmosDb ChangeFeedProcessor or DynamoStoreSource projector, emitting to a Kafka topic
$env:PROPULSION_KAFKA_BROKER="instance.kafka.mysite.com:9092" # or use -b
# `-V` for verbose logging
# `-g projector3` represents the consumer group; >=1 are allowed, allowing multiple independent projections to run concurrently
# `-l 5` to report ChangeFeed lags every 5 minutes
# `kafka` specifies one wants to emit to Kafka
# `temp-topic` is the topic to emit to
# `cosmos` specifies source overrides (using defaults in step 1 in this instance)
propulsion -V project -g projector3 -l 5 kafka temp-topic cosmos

Summarize current state of the index being prepared by `Propulsion.DynamoStore.Lambda`
propulsion index dynamo -t equinox-test
Example output:
19:15:50 I Current Tranches / Active Epochs [[0, 354], [2, 15], [3, 13], [4, 13], [5, 13], [6, 64], [7, 53], [8, 53], [9, 60]]
19:15:50 I Inspect Index Tranches list events 👉 eqx -C dump '$AppendsIndex-0' dynamo -t equinox-test-index
19:15:50 I Inspect Batches in Epoch 2 of Index Tranche 0 👉 eqx -C dump '$AppendsEpoch-0_2' -B dynamo -t equinox-test-index
Validate Propulsion.DynamoStore.Lambda has not missed any events (normally you guarantee this by having alerting on Lambda failures)
propulsion index -t 0 dynamo -t equinox-test
In addition to being able to validate the index (see preceding step), the tool facilitates ingestion of missing events from a complete DynamoDB JSON Export. Steps are as follows:
- Enable Point in Time Restores in DynamoDB
- Export data to S3, then download and extract the JSON from the `.json.gz` files
- Run the ingestion job:
propulsion index -t 0 $HOME/Downloads/DynamoDbS3Export/*.json dynamo -t equinox-test
See CONTRIBUTING.md
The best place to start, sample-wise is with the QuickStart, which walks you through sample code, tuned for approachability, from dotnet new templates stored in a dedicated repo.
Please note the QuickStart is probably the best way to gain an overview, and the templates are the best way to see how to consume it; these instructions are intended mainly for people looking to make changes.
NB The Propulsion.Kafka.Integration tests are reliant on a TEST_KAFKA_BROKER environment variable pointing to a Broker that has been configured to auto-create ephemeral Kafka Topics as required by the tests (each test run blindly writes to a guid-named topic and trusts the broker will accept the write without any initialization step)
dotnet build build.proj -v n

why do you employ Kafka as an additional layer, when downstream processes could simply subscribe directly and individually to the relevant CosmosDB change feed(s)? Is it to accommodate other messages besides those emitted from events and snapshot updates? 🙏 @Roland Andrag
Well, Kafka is definitely not a critical component or a panacea.
You're correct that the bulk of things that can be achieved using Kafka can be accomplished via usage of the ChangeFeed. One thing to point out is that, in the context of enterprise systems, having a well maintained Kafka cluster has less incremental cost than it might if you're building a smaller system from nothing.
Some of the negatives of consuming from the CF direct:
- each CFP reader imposes RU charges (it's a set of continuous queries against each and every physical range of which the Cosmos store is composed)
- you can't apply a server-side filter, so you pay for everything you see
- you're very prone to falling into coupling issues
- (as you alluded to), if there's some logic or work involved in the production of events you'd emit to Kafka, each consumer would need to duplicate that
While many of these concerns can be alleviated to varying degrees by splitting the storage up into multiple containers in order that each consumer will intrinsically be interested in a large proportion of the data it will observe (potentially using database level RU allocations), the write amplification effects of having multiple consumers will always be more significant when reading directly than when using Kafka, the design of which is well suited to running lots of concurrent readers.
Splitting event categories into containers solely to optimize these effects can also make the management of the transactional workload more complex; the ideal for any given container is to balance the concerns of:
- ensuring that datasets for which you want to ringfence availability / RU allocations don't share containers/databases with those you're prepared to run hot (i.e. with potentially significant levels of rate limiting, but overall high throughput in aggregate as a result of using a high percentage of the allocated capacity)
- avoiding prematurely splitting data prior to it being required by the constraints of CosmosDB (i.e. you want to let splitting primarily be driven by reaching the [10GB] physical partition range)
- not having logical partition hotspots that lead to a small number of physical partitions having significantly above average RU consumption
- having relatively consistent document sizes
- economies of scale - if each container (or database, if you provision at that level) needs to be individually managed (with a degree of headroom to ensure availability for load spikes etc.), you'll tend to require a higher aggregate RU assignment for a given overall workload, based on a topology that has more containers
any tips for testing Propulsion (projection) in an integration/end-to-end fashion? 🙏 @James Booth
I know for unit testing, I can just test the obvious parts. Or if end to end testing is even required
Depends what you want to achieve. One important technique for doing end-to-end scenarios, especially where some reaction is supposed to feed back into Equinox is to use Equinox.MemoryStore as the store, and then wire Propulsion to consume from that using MemoryStoreProjector.
That technique has been internally validated, and that code from dotnet-templates is on the road to becoming Propulsion.MemoryStore, a la jet#64.
Other techniques I've seen/heard are:
- rig things to use ephemeral ESDB or Cosmos databases (the emulator is Windows-only, but works; you can use serverless or database-level RU-allocated DBs) to run your system against an ephemeral store. For such cases, you would tend to spin up all your projector apps (maybe in docker-compose etc.) and then check for externally visible effects.
In general I'd be looking to use MemoryStoreProjector as a default technique, but there are no particularly deep examples to work off beyond the one adjacent to the impl (which is not a typical straightforward projection scenario)
To answer more completely, I'd say given a scenario involving Propulsion and Equinox, you'll typically have the following ingredients:
- writing to the store - you can either assume that's well-tested infra or say you want to know you wired it up properly
- serialization/deserialization - you can either have unit tests and/or property tests to validate roundtripping (see the sketch after this list), or say it's critical to know it really works with real data
- reading from the feed, propagating to the handler - that's harder to configure and has the biggest variability in a test scenario, so either a) you want to take it out of the equation, or b) you want to know it's wired properly
- does handler work complete - yes, you can and should unit test that, but maybe you want to know it really does work in a broader context with more real stuff
- does it trigger follow-on work, i.e. a cascade of reactions - you can either triangulate and call it proven if you observe the trigger for the next bit, or you may want to prove it end to end
- does the overall thing really work - sometimes you want to be able to validate workflows rather than having to pay the tax of going in the front door for all the things
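For the serialization/deserialization ingredient above, a minimal roundtrip property sketch (assuming `FsCheck.Xunit` and `Newtonsoft.Json` package references; the `Event` record here is illustrative, not a Propulsion type):

```fsharp
open Newtonsoft.Json
open FsCheck.Xunit

// Illustrative event shape; a real test would roundtrip the system's actual codec-encoded types
type Event = { Id: System.Guid; Kind: string; Value: int }

[<Property>]
let ``events survive a JSON roundtrip`` (original: Event) =
    let json = JsonConvert.SerializeObject original
    let decoded = JsonConvert.DeserializeObject<Event> json
    decoded = original
```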
Any reason you didn’t use one of the different subscription models available in ESDB? 🙏 @James Booth
While the implementation and patterns in Propulsion happen to overlap to a degree with the use cases of ESDB's subscription mechanisms, the primary reason they are not used directly stems from the needs and constraints that Propulsion evolved to cover.
One thing that should be clear is that Propulsion is definitely not solving for the need of being the simplest conceivable projection library with a low concept count that's easy to get started with. If you're looking to build such a library, you'll likely give yourself some important guiding non-goals, e.g., if we need to add 3 more concepts to get a 20% improvement in throughput, then we'd prefer to retain the simplicity.
For Propulsion, almost literally, job one was to be able to shift 1TB of ordered events in streams to/from ESDB/Cosmos/Kafka in well under 24h - a general naive thing reading and writing in small batches takes more like 24d to do the same thing. A secondary goal is to keep them in sync continually after that point (it's definitely more than a one time bulk ingestion system).
While Propulsion scales down to running simple subscriptions (and I've built systems using it for exactly that), it's got quite a few additional concepts compared to something built literally for that job, because that use case was almost literally an afterthought.
That's not to say that all those concepts overall make for a more complex system when all is said and done; there are lots of scenarios where you avoid having to do concurrent/async tricks one might otherwise do more explicitly in a more basic subscription system.
_When looking at the vast majority of typical projections/reactions/denormalizers one runs in an event-sourced system, it should come as no surprise that EventStoreDB's subscription features offer lots of excellent ways of achieving those common goals with a good balance of:
- time to implement
- ease of operation
- good enough performance_

That's literally the company's goal: enabling rapidly building systems to solve business problems.
The potential upsides that Propulsion can offer when used as a Projection system can definitely be valuable when actually needed, but on average, they'll frequently simply be massive overkill.
OK, with that context set, some key things that are arguably upsides of using Propulsion for Projectors rather than building a minimal thing without it:
- similar APIs regardless of whether events arrive via CosmosDB, EventStoreDB or Kafka
- consistent dashboards across all those sources
- generally excellent performance for high throughput scenarios (it was built for that)
- good handling for processing of workloads that don't have uniform (and low) cost per handler invocation, i.e., rate-limited writes of events to `Equinox.Cosmos` versus feeding stuff to Redis
- orthogonality to Equinox features, while still offering a degree of commonality of concepts and terminology
- provide a degree of isolation from the low level drivers, e.g.:
- moving from Cosmos CFP V2 to the Azure.Cosmos V4 SDK will be a matter of changing package references and fixing some minimal compilation errors, as opposed to learning a whole new API set
- moving from EventStore's TCP API / EventStore.Client as per V5 to the gRPC based >= v20 clients also becomes a package switch (massive TODO though: actually port it!)
- migrating a workload from EventStoreDB to CosmosDB or vice versa can be accomplished more cleanly if you're only changing the wiring of your projector host while making no changes to the handler implementations
- SqlStreamStore fits logically in as well; using it gives a cleaner mix and match / onramp to/from ESDB (Note however that migrating SSS <-> ESDB is a relatively trivial operation vs migrating from raw EventStore usage to Equinox.Cosmos, i.e. "we're using Propulsion to isolate us from deciding between SSS or ESDB" is not a good enough reason on its own)
- Specifically when consuming from Cosmos, being able to do that over a longer wire by feeding to Kafka to limit RU consumption from projections is a minor change. (Having to do reorganize like that for performance reasons is much more rarely a concern for EventStoreDB)
The order in which the need for various components arose (as a side effect of building out Equinox; solving specific needs in terms of feeding events into and out of EventStoreDB, CosmosDB and Kafka) was also an influence on the abstractions within and general facilities of Propulsion.
- `Propulsion.Cosmos`'s `Source` was the first bit done; it's a light wrapper over the CFP V2 client. Key implications from that are:
  - order of events in a logical partition can be maintained
  - global ordering of events across all logical streams is not achievable, due to how CosmosDB works (the only ordering guarantees are at logical partition level; the data is sharded into nodes, which can split as data grows)
- `Propulsion.Kafka`'s `Sink` was next; the central goal here is to be able to replicate events being read from CosmosDB onto a Kafka Topic while maintaining the ordering guarantees. Implications:
  - There are two high-level ways of achieving ordering guarantees in Kafka:
    - only ever have a single event in flight; only when you've got the ack for a write do you send the next one. However, literally doing that compromises throughput massively.
    - use Kafka's transaction facilities (not implemented in `Confluent.Kafka` at the time)
  - => The approach used is to continuously emit messages concurrently in order to maintain throughput, but guarantee never to emit messages for the same key at the same time (see the per-key sender sketch after this list)
- `Propulsion.Cosmos`'s `Sink` was next up. It writes to CosmosDB using `Equinox.Cosmos`. Key implications:
  - because rate-limiting is at the physical partition level, it's crucial for throughput that you keep other partitions busy while wait/retry loops are triggered on hotspots (and you absolutely don't want to exacerbate this effect by competing with yourself)
  - you ideally want to batch the writing of multiple events/documents to minimize round-trips (write RU costs are effectively O(log N), despite high-level guidance characterizing them as O(N))
  - you can only touch one logical partition for any given write
  - when you hit a hotspot and need to retry, ideally you'd pack events queued up behind you into the retry too
  - there is no one-size-fits-all batch size (yet) that balances a) not overloading the source with b) maintaining throughput
  - => You'll often need a small batch size, which implies larger per-event checkpointing overhead unless you make the checkpointing asynchronous
  - => The implementation thus:
    - manages reading asynchronously from writing, in order to maintain throughput (you define a batch size and a number of batches to read ahead)
    - schedules write attempts at stream level (the reader concurrently ingests successor events, making all buffered events available when retrying)
    - writes checkpoints asynchronously, when all the items involved complete within the (stream-level) processing (see the checkpoint tracker sketch after this list)
- At the point where `Propulsion.EventStore`'s `Source` and `Sink` were being implemented (within weeks of the `Cosmos` equivalents; largely overlapping), the implications of realizing the goals of providing good throughput, while avoiding adding new concepts where possible, were:
  - The cheapest (lowest impact in terms of triggering scattered reads across disks on an ESDB server, with associated latency implications) and most general API set for reading events is to read the `$all` stream
  - Maintaining checkpoints in an EventStoreDB that you're also monitoring is prone to feedback events, so using the async checkpointing strategy used for `.Cosmos`, but saving the checkpoints in an external store (such as an `Equinox.Cosmos` one), makes sense
  - If handlers and/or sinks don't have uniform processing time per message and/or are subject to rate limiting, most of the constraints of the `CosmosSink` apply too; you don't want to sit around retrying the last request out of a batch of 100 while tens of thousands of provisioned RUs are sitting idle in Cosmos and throughput is stalled
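To make the "emit concurrently, but never two in-flight messages for the same key" rule above concrete, here's a minimal conceptual sketch (illustrative only; `Propulsion.Kafka`'s actual scheduler is considerably more involved). One mailbox per key serializes sends for that key, while distinct keys proceed concurrently:

```fsharp
open System.Collections.Concurrent

/// Sends for a given key are strictly ordered (the next message for a key is only
/// dequeued once the previous send is acked); distinct keys run concurrently.
type KeyedSender(send: string -> string -> Async<unit>) =
    let agents = ConcurrentDictionary<string, MailboxProcessor<string>>()
    let mkAgent key =
        MailboxProcessor.Start(fun inbox -> async {
            while true do
                let! value = inbox.Receive()
                do! send key value }) // await the ack before taking the next message
    member _.Enqueue(key: string, value: string) =
        agents.GetOrAdd(key, fun k -> mkAgent k).Post value
```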
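Similarly, a minimal sketch of the asynchronous checkpointing idea (again illustrative, not Propulsion's actual implementation): work completes out of order, but the checkpoint may only advance to the highest contiguous completed position, so a restart never skips unprocessed items:

```fsharp
open System.Collections.Generic

/// Tracks out-of-order completions; Checkpoint only advances across a
/// contiguous run of completed positions.
type CheckpointTracker(startPos: int64) =
    let completed = SortedSet<int64>()
    let mutable checkpoint = startPos
    member _.MarkCompleted(pos: int64) =
        completed.Add pos |> ignore
        while completed.Contains(checkpoint + 1L) do
            completed.Remove(checkpoint + 1L) |> ignore
            checkpoint <- checkpoint + 1L
    member _.Checkpoint = checkpoint
```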
The things Propulsion in general accomplishes in the projections space:
- Uniform dashboards for throughput, successes vs failures, and latency distributions over CosmosDB, EventStoreDB, Kafka and generic Feeds
- Metrics to support trustworthy alerting and detailed analysis of busy, failing and stuck projections
- make reading, checkpointing, parsing and running independent asynchronous activities (all big perf boosts with Cosmos, less relevant for EventStoreDB)
- allow handlers to process a backlog of accumulated items for a stream as a batch, if desired
- concurrency across streams
- (for Cosmos, but could be achieved for EventStoreDB) provides for running multiple instances of consumers leasing physical partitions roughly how Kafka does it (aka the ChangeFeedProcessor lease management - Propulsion just wraps that and does not seek to impose any significant semantics on top of that)
- provide good instrumentation as to latency, errors and throughput in a pluggable way, akin to how Equinox does it (e.g. it has Prometheus support)
- good stories for isolating from specific drivers: there's `Propulsion.Cosmos` (using the V2 SDK) and a `Propulsion.CosmosStore` (for the V3 SDK) with close-to-identical interfaces (at some point there'll be a `Propulsion.EventStoreDb` using the gRPC-based SDKs to go with the V5-SDK-based `Propulsion.EventStore`)
- handlers/reactors/projections can be ported to `.Cosmos` by swapping driver modules, similar to how `Equinox.Cosmos` vs `Equinox.EventStore` provides a common programming model despite the underpinnings being fundamentally quite different in nature
- Kafka reading and writing generally fits within the same patterns - i.e. if you want to push CosmosDB CFP output to Kafka and consume over that as a 'longer wire' thing without placing extra load on the source when you e.g. have 50 consumers, that's just an extra 250-line `dotnet new proProjector` app, plus a tweak to ~30 lines of consumer app wire-up to connect to Kafka instead of Cosmos
Things ESDB's subscriptions can do that are not covered in Propulsion (highlights, by no means a complete list):
- `$et-`, `$ec-` streams
- honoring the full `$all` order
- stacks more things - EventStoreDB is a well designed, purpose-built solution used by thousands of systems with a massive mix of throughput and complexity constraints
This repo is derived from FsKafka; the history has been edited to focus only on edits to the Propulsion libraries.
- Please feel free to log question-issues; they'll get answered here