Kinesis Stream Producer + Spark Streaming Consumer with local dev infra using localstack

sunshah/kinesis-local-dev

Kinesis Producer-Consumer Local Dev and Testing

This app lets you test code that produces to and consumes from a Kinesis stream by setting up local instances of Kinesis, DynamoDB and CloudWatch using localstack. It also sets up an nginx reverse proxy, which is required to redirect outgoing HTTP requests bound for AWS.

The KinesisInputDStream builder does not give you the option to point to a local instance of DynamoDB; the shard reader is hard-coded to update AWS's production instance. To get around this problem, we've set up a reverse proxy that captures outgoing calls to the production server and redirects them to a local DynamoDB instance. We do the same for CloudWatch.

Kinesis Producer

The Kinesis producer uses the KCL to create a new stream if it does not exist and publishes messages to Kinesis in the following format: <messageId>@@@<totalParts>@@@<currentPart>@@@<payload>
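To make the message format concrete, here is a minimal Python sketch of packing and unpacking one message part. It is an illustration only, not the repository's producer code: the names `Message`, `encode_message` and `decode_message` are hypothetical, and it assumes that `totalParts` and `currentPart` are decimal integers and that the payload is text. Whether parts are counted from 0 or 1 is not stated in this README.

```python
from typing import NamedTuple

SEPARATOR = "@@@"  # field separator taken from the format described above


class Message(NamedTuple):
    message_id: str
    total_parts: int
    current_part: int
    payload: str


def encode_message(message_id: str, total_parts: int, current_part: int, payload: str) -> str:
    """Join the four fields into <messageId>@@@<totalParts>@@@<currentPart>@@@<payload>."""
    return SEPARATOR.join([message_id, str(total_parts), str(current_part), payload])


def decode_message(raw: str) -> Message:
    """Split a raw record back into its fields.

    maxsplit=3 means a payload that itself contains "@@@" is kept intact.
    """
    message_id, total_parts, current_part, payload = raw.split(SEPARATOR, 3)
    return Message(message_id, int(total_parts), int(current_part), payload)
```

Splitting at most three times is a deliberate choice in this sketch: only the first three separators delimit header fields, so everything after them is treated as payload.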

Kinesis Consumer

The Kinesis consumer sets up shard consumers using Spark Streaming (via KinesisInputDStream) and prints records to the console

Local Development

Update your hosts file to redirect outgoing AWS requests to the loopback interface on which localstack is set up

  • Add the following entries to your /etc/hosts file
127.0.0.1 dynamodb.us-east-1.amazonaws.com
127.0.0.1 monitoring.us-east-1.amazonaws.com
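A quick way to confirm the hosts entries are in place is to parse the file and look for the two hostnames. The sketch below is a hypothetical helper, not part of this repository; `missing_loopback_hosts` and `REQUIRED_HOSTS` are names introduced here, and it assumes the usual hosts-file layout of an address followed by one or more hostnames, with `#` starting a comment.

```python
REQUIRED_HOSTS = (
    "dynamodb.us-east-1.amazonaws.com",
    "monitoring.us-east-1.amazonaws.com",
)


def missing_loopback_hosts(hosts_text, required=REQUIRED_HOSTS):
    """Return the required hostnames that are not mapped to 127.0.0.1."""
    mapped = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and hostnames.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == "127.0.0.1":
            mapped.update(entry[1:])
    return [host for host in required if host not in mapped]
```

Passing in the contents of /etc/hosts (for example `missing_loopback_hosts(open("/etc/hosts").read())`) should return an empty list once both entries have been added.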

Start up localstack and nginx containers

  • docker-compose up -d

Both the producer and the consumer use the Default Credential Provider Chain. Ensure that at least one of its credential sources is provided. This is required by the Kinesis Client Library and the Kinesis Producer Library, even though localstack does not perform any authentication.
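One credential source the chain checks is the pair of environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. Since localstack does not validate them, placeholder values should be enough for local runs. The sketch below builds such an environment for a child process; the helper name and the `local-dev` placeholder value are assumptions made for illustration, and the command used to launch the producer or consumer is not specified in this README.

```python
import os

CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def env_with_placeholder_credentials(base_env=None, placeholder="local-dev"):
    """Return a copy of the environment that always carries an AWS key pair.

    If both variables are already set they are left untouched; otherwise both
    are replaced with the placeholder so the pair stays consistent.
    """
    env = dict(os.environ if base_env is None else base_env)
    if not all(env.get(name) for name in CREDENTIAL_VARS):
        for name in CREDENTIAL_VARS:
            env[name] = placeholder
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the producer or consumer locally. These placeholder values are only suitable for localstack and should never be used against real AWS endpoints.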

Appendix

Debugging commands

Kinesis

  • aws --region= --endpoint-url=https://localhost:4568 kinesis list-streams --no-verify-ssl
  • aws --region= --endpoint-url=https://localhost:4568 kinesis describe-stream --stream-name iterable-ds-events-stream --no-verify-ssl
  • aws --region= --endpoint-url=https://localhost:4568 kinesis get-shard-iterator --shard-id <shardId-> --shard-iterator-type TRIM_HORIZON --stream-name iterable-ds-events-stream --query 'ShardIterator' --no-verify-ssl
  • aws --region= --endpoint-url=https://localhost:4568 kinesis get-records --shard-iterator <iterator_from_previous_command> --no-verify-ssl
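The `Data` field of each record returned by `get-records` is base64-encoded in the CLI's JSON output, so the messages are not readable as printed. A minimal sketch for decoding them follows; `decode_records` is a hypothetical helper, and it assumes the payloads are UTF-8 text, which this README does not confirm.

```python
import base64
import json


def decode_records(cli_output: str) -> list:
    """Decode the base64 'Data' field of every record in get-records JSON output."""
    response = json.loads(cli_output)
    return [
        base64.b64decode(record["Data"]).decode("utf-8")
        for record in response.get("Records", [])
    ]
```

The function takes the full JSON text printed by the get-records command and returns the decoded payloads in order, which makes it easy to check that records follow the producer's message format.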

DynamoDB

  • aws dynamodb list-tables --endpoint-url=https://localhost:4569 --no-verify-ssl --debug

Generating self-signed certs

openssl req -subj '/CN=localhost' -x509 -newkey rsa:4096 -nodes -keyout key.pem -out cert.pem -days 365

Redirecting outbound HTTP requests

Modify /etc/hosts

Add an entry 127.0.0.1 <outgoing_url> to redirect traffic destined for AWS DynamoDB and CloudWatch

Challenges

  • The DynamoDB URL cannot be configured, so we need to set up a reverse proxy using nginx and redirect traffic destined for AWS to the local loopback address 127.0.0.1 (see: Kinesis Stream Issue)
