refactor SNS async publishing & ASF data models #7267
Conversation
force-pushed from a563f3a to 9b8f18d
force-pushed from 4940492 to 0415a22
This looks awesome! Finally this is getting the overhaul it needed and deserved. I have a few comments, nothing major.
This is a fantastic set of changes, really well done with the class hierarchy and the async delivery logic! The balance between encapsulating logic in classes vs. functions is also very nice.
In terms of architecture I only have one comment, regarding the try/except pattern in a contract, which I feel is debatable in this case (see inline).
Two things that I think should be addressed:
- More documentation: it would be great if we could add some pydoc all around, but especially for the top-level classes/interfaces.
- Logging: proper logging of message delivery can have a huge impact on DevX for this service, and I think we can make some improvements there.
localstack/services/sns/publisher.py
Outdated
    for subscriber in subscriptions:
        if self._should_publish(ctx.store, ctx.message, subscriber):
            notifier = self.topic_notifiers[subscriber["Protocol"]]
            LOG.debug("Submitting task to the executor for notifier %s", notifier)
nit: will this produce a useful output? It seems `__repr__` and `__str__` aren't implemented for whatever `notifier` is.
Generally, good logging can be very useful here, so maybe we could improve it to show a concise log message about what's going on: source X publishing to topic Y via protocol Z, maybe?
This would apply to all logging statements of similar significance.
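For instance, a log line along these lines could work (a rough sketch; `TopicArn`, `Endpoint`, and `Protocol` are standard SNS subscription attributes, but the exact wording is illustrative, not PR code):

```python
# sketch of a more descriptive log statement; message wording is illustrative
LOG.debug(
    "Topic '%s' publishing to endpoint '%s' via protocol '%s'",
    subscriber["TopicArn"],
    subscriber["Endpoint"],
    subscriber["Protocol"],
)
```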
Agreed! Will work on this! 👍
localstack/services/sns/publisher.py
Outdated
    def publish(self, context: SnsPublishContext, subscriber: SnsSubscription):
        try:
            self._publish(context=context, subscriber=subscriber)
        except Exception:
            LOG.exception(
                "An internal error occurred while trying to send the SNS message %s",
                context.message,
            )
            return
I'm not sure about swallowing exceptions at this level of abstraction (the base class/contract). The try/except pattern is generally something I feel the caller should be taking care of, not the callee (which is what we're doing here). But I understand we're only invoking this method asynchronously via an executor, and this is the most obvious/convenient way of making sure the exception at least gets logged.
Maybe it makes more sense to wrap the `executor.submit(publisher.publish)` with some inline method that does the exception wrapping, which is a bit more inconvenient for the caller, but keeps the publisher free from this logic.
But debatable 🤷
I'm not quite sure I fully understand the caller/callee distinction in this case. The only way the caller could manage this would be by checking the `Future` result, or wrapping like you said.
I guess the `PublishDispatcher` could theoretically provide a wrapper method, something like `_wrap_and_log()` or similar, and dispatch it that way: `executor.submit(self._wrap_and_log(publisher.publish))`, is that what you meant?
We could almost provide a helper method and use it as a decorator around the `publish` methods? I'll wait for your answer on this one, as I'm not sure I fully understood what you meant! But thanks, I agree that the logging is a bit weird in the publisher that way.
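For illustration, a minimal sketch of what such a wrapper/decorator could look like (the name `_wrap_and_log` is just the placeholder from the comment above, not actual PR code):

```python
import functools
import logging

LOG = logging.getLogger(__name__)


def _wrap_and_log(fn):
    """Log (and swallow) any exception raised by fn, so that tasks
    submitted to the executor never fail silently."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            LOG.exception("An internal error occurred in %s", fn.__name__)

    return wrapper


# the dispatcher would then submit the wrapped callable, e.g.:
# executor.submit(_wrap_and_log(publisher.publish), context=ctx, subscriber=subscriber)
```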
Update: we discussed this and will tackle it in a future PR; this will do for now. I will add a comment and a `todo` to clean this up.
    request_headers: Dict[str, str]


    class TopicPublisher(abc.ABC):
If this is the interface/contract, a bit of doc could help :-)
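A sketch of what such a docstring could look like (the wording is illustrative, not the final text):

```python
import abc


class TopicPublisher(abc.ABC):
    """
    Publishes an SNS message to a subscriber over one concrete protocol.

    Subclasses implement `_publish` for their protocol (lambda, sqs, http, ...);
    `publish` is the entry point the dispatcher submits to the executor.
    """
```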
localstack/services/sns/publisher.py
Outdated
    class LambdaTopicPublisher(TopicPublisher):
        def _publish(self, context: SnsPublishContext, subscriber: SnsSubscription):
It would be great if we could add a bit of pydoc to every publisher and a link to the relevant AWS docs for the integration.
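For example, something along these lines (illustrative wording; the link is the official AWS page for Lambda subscriptions):

```python
class LambdaTopicPublisher(TopicPublisher):
    """
    Publishes an SNS message to a subscribed Lambda function by invoking it
    with the SNS event payload.

    See: https://docs.aws.amazon.com/sns/latest/dg/sns-lambda-as-subscriber.html
    """
```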
LGTM, with one small nit :)
Also I feel that the added comments now sufficiently describe the classes and that the `EmailJsonTopic` naming is ok now.
force-pushed from 4677778 to 4a999af
force-pushed from 4a999af to 1cd0b80
looks great now, thanks for addressing all the comments and adding docs. logging is really good as well! tested it locally.
one more thing is to update the docs with the new config variable!
I'm not sure I want to update the docs about the config variable when the feature was not widely advertised and mostly blocking. Maybe I could comment on the feature request that this is now behind the flag?
This PR refactors SNS publishing and the underlying data model. The ASF provider was still making use of the data model used by the `ProxyListener` (using query string parameters).

This PR splits the publishing into different classes depending on the target: `Lambda`, `SQS`, `Firehose`, `SMS`, `Platform Endpoint` (mobile push notification system), `email` & `email-json`, and `http`/`https`.

It makes use of a `ThreadPoolExecutor`, in the same way as S3 notifications. Previously, when using `publish_batch`, messages were sent sequentially and synchronously (maybe in order to keep the FIFO ordering right). This is now all asynchronous.

Also, when batch publishing to an SQS queue, the new publisher will use SQS `SendMessageBatch` instead of making individual calls (the batching size and limitations are the same, so it can be used under the hood).

Different validations were added along the way, for example for `SetSubscriptionAttributes` and for different parameters of `Publish`, with corresponding AWS-validated tests.

Also, I've put a feature allowing us to directly publish to the real GCM/FCM mobile notification platform behind a feature flag, for now called `LEGACY_SNS_GCM_PUBLISHING` but maybe in need of a new name. This feature was not advertised, but was blocking some users, as they would need real-world GCM credentials to be able to use LocalStack.

Limitations: while testing, I've realised that messages sent to FIFO topics would not properly be directed to the DLQ in case of issues, most probably because we don't pass down the `MessageGroupId` and other necessary parameters. This will be treated in the next iteration; an `xfail` test is already in place and I know how to fix the issue, but the PR was getting a bit out of scope.

note: CI run failed on a fluke, should be green
fixes #6863
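As a rough end-to-end sketch of the new batching behavior against a locally running LocalStack (the endpoint URL and resource names are assumptions for illustration):

```python
import boto3

endpoint = "http://localhost:4566"  # assumed default LocalStack edge port
sns = boto3.client("sns", endpoint_url=endpoint, region_name="us-east-1")
sqs = boto3.client("sqs", endpoint_url=endpoint, region_name="us-east-1")

# create a topic with an SQS subscription
topic_arn = sns.create_topic(Name="my-topic")["TopicArn"]
queue_url = sqs.create_queue(QueueName="my-queue")["QueueUrl"]
queue_arn = sqs.get_queue_attributes(
    QueueUrl=queue_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

# PublishBatch accepts at most 10 entries per call, the same limit as SQS
# SendMessageBatch, which is what lets the publisher forward a whole batch
# in a single SQS call.
entries = [{"Id": str(i), "Message": f"message-{i}"} for i in range(10)]
sns.publish_batch(TopicArn=topic_arn, PublishBatchRequestEntries=entries)

# delivery is now asynchronous, so poll for the delivered messages
# (a single receive call may return fewer than all 10)
messages = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=5
).get("Messages", [])
print(f"received {len(messages)} messages")
```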