Use static channels for node syncing. #1371
Conversation
…ribing 5 channels for every connection
Hi 2fours. Can this be considered a viable replacement for the current redis store? Have you deployed this on a large-scale app or done any scale testing against it? Just curious, because your solution looks promising and I wanted to get a little more information about it.
I load tested it to 2,000 concurrent connections and didn't see issues. I've also been testing it in production for a week and haven't noticed any problems, but I don't get that much load (< 100 concurrent) in production yet. I think it has promise, but I would definitely do some testing of your own.
Sounds very promising. I'll test it on my end as well. Thanks for your work on this.
You're welcome! I would be interested in hearing the outcome of your tests.
Could it solve #1097?
We are also looking to test this fork (and/or an earlier fork you merged, #1260) in production soon (after the holidays). We have ~2,000 concurrent sockets for ~12 hours each day in prod (500-1,000 concurrent in off-hours), with each client emitting ~1 message to the server every 2-3 seconds - so pretty busy per socket; we do something like 15 GB of data transmission in/out from/to our server per day in JUST socket messages/packets. We have tested ~300 concurrent users in our load testing environment without issue (for technical reasons on our side, we can't currently test many more than that). I'll let you know how it goes! (Also looking forward to socket.io 1.0.0, which seems like it's getting closer! It looks like 1.0.0 gets rid of RedisStore entirely and implements its own thing? YAY!)
Update/tl;dr: this fork works for us in prod. After running tests with other, older/more established forks, the RedisStore leak situation improved DRASTICALLY (see my comments in #1260 for more), but some leaks remained. So we tried this fork - first in staging, then in prod. The short version is that we have been live in prod for about 24 hours with a peak of ~3,500 concurrent socket connections (sustained for several hours) with NO LEAKS AT ALL.

For a more detailed explanation, I am attaching two images from our experiments in our staging environment. First, a display of ~36 hours of this fork running: you'll notice that CPU usage is totally flat (other than at the times of our own nightly scripts) - from start to finish, no change. Second, for those curious about the results of other branches/forks/pull requests, here is the same test run on 3 different forks of socket.io 0.9.x on the same AWS instance. (Note that this is a staging environment, but it essentially matches what we've seen in prod.) Results:

- LearnBoost/socket.io#0.9 - we all know RedisStore leaks like crazy here. It seems to be related to connects/disconnects. In our load testing environment, the leak maxes out our server in ~2 hours. We used this period of time as a benchmark by which to compare various fixes.
- ifsnow/socket.io#improve_redis_store - good results, but not perfect; there is still a leak. In our load testing environment, our CPUs get pegged (and actually crash in this case) after ~20 hours. So, roughly 10x better at handling connects/disconnects vs. the primary socket.io branch.
- surespot/socket.io#channelfix - works perfectly with NO LEAKS for the first ~24 hours in prod, and for an entire ~36 hours of constant load in our load testing environment (~500,000 unique sockets disconnecting and reconnecting in a constant stream).

We feel pretty damn good about this pull request, but will report back if any issues arise. THANK YOU SURESPOT AUTHORS. :)
Thanks for reporting back on this - great news! You are welcome! Things are looking pretty stable over here too.
Guys, how are you? I'm new to GitHub and I'm wondering how you are testing this. Do you pull… Best regards.
Toblerpwn: excellent, detailed feedback. Thanks for the update! filipedeschamps: it only affects three files. I personally just went to the "files changed" tab and downloaded and replaced those files.
@filipedeschamps - Assuming you're using NPM, here is how I manage it: in your package.json, remove your existing socket.io line (probably something like `"socket.io": "0.9.x"`) and point the dependency at the Git URL of this branch instead. You may also use other Git URL formats as dependencies: https://npmjs.org/doc/json.html#Git-URLs-as-Dependencies
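For illustration only, a sketch of what that dependency entry might look like, assuming you want the surespot channelfix branch discussed in this thread (pinning to a specific commit SHA would be safer than tracking the branch):

```json
{
  "dependencies": {
    "socket.io": "git://github.com/surespot/socket.io.git#channelfix"
  }
}
```

After editing, run `npm install` so the dependency is fetched from the Git URL rather than the npm registry.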
Two notes worth calling out:

1. Be VERY CAREFUL about re-running installs or updates against a Git-URL dependency: it tracks the branch, so the code can change underneath you.
2. An alternative and perhaps safer solution would be to reproduce this code (as @cendrizzi suggested) in your own fork.

I personally chose to point to surespot's branch because I am watching this issue like a hawk every day and living dangerously. 😄
Guys, that's awesome! I run a Brazilian stock market website that is heavily built on top of socket.io. I'm not a skilled programmer, but socket.io makes things so easy. I can't help with socket.io coding currently, but I will try to help. Best regards.
We have two machines in production, with about 4,000 connections in all. Memory for a single worker process after 24 hours: 176 MB. I noticed io.sockets.clients().length still grows - io.sockets.clients() returns 7,983, and there are 81,134 sockets in SocketNamespace. I have uploaded a heapdump file for download.
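For anyone who wants to watch for this kind of growth, here is a minimal sketch (my own illustration, not code from this patch), assuming a socket.io 0.9.x server object named `io`:

```javascript
// Log the number of tracked clients and the process's resident memory
// once a minute; steady growth under flat load suggests a leak.
setInterval(function () {
  console.log('clients:', io.sockets.clients().length,
              'rss (MB):', (process.memoryUsage().rss / 1048576).toFixed(1));
}, 60 * 1000);
```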
@2fours and @toblerpwn thanks for your incredible work and sleuthing here! I was about to scrap our hand-rolled implementation as our servers were getting crushed... without even the load you have!
@cthomaschase excellent news, thanks! @freeman983 I tested your memory leak fix; it doesn't change anything in our case, but we are only using the websocket transport, so maybe there are issues with the other transports.
Hi. Thought I would follow up. I've been using this for some time and the results look very encouraging. This is certainly an important fix, and at this point I think we really should get it pulled into the 0.9 series. I know work is now being done in earnest on 1.0, but the reality for me is that I'm stuck on 0.9 for the time being, as I have some C++ code hitting and using the 0.9 protocol. So anything that gives 0.9 legs is important to me. Thanks again, 2fours, for your work.
0.9.17 is out. This is a great performance improvement - thanks!
Every time a connection is made, socket.io subscribes to 5 additional channels in redis to keep state in sync across multiple nodes (these are unsubscribed on disconnect). This causes problems with high numbers of concurrent connections, e.g. 10,000 concurrent connections = 50,000 redis subscriptions, which causes redis to use orders of magnitude more CPU.

This patch uses 5 static channels to convey the sync information instead of subscribing and unsubscribing with every connection. The pub/sub count in redis now stays at 13 whether 10, 10,000, or 100,000 users are connected.
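To make the idea concrete, here is a minimal sketch of the static-channel pattern (my own illustration, not the patch itself - the channel name, socket registry, and handler are hypothetical), using the classic node_redis API. Instead of one set of channels per socket, every node subscribes once to a fixed channel and routes messages by the socket id carried in the payload:

```javascript
var redis = require('redis');

var pub = redis.createClient();
var sub = redis.createClient();

// One fixed channel shared by all connections, instead of
// subscribing/unsubscribing per socket. ('sync:dispatch' is made up.)
sub.subscribe('sync:dispatch');

// In-process map of connected sockets (stand-in for socket.io's own registry).
var sockets = {};

sub.on('message', function (channel, message) {
  var packet = JSON.parse(message);
  var socket = sockets[packet.id];
  if (socket) socket.handle(packet.data); // route by the id in the payload
});

// Publishing side: the target socket id travels in the message body,
// not in the channel name, so the redis subscription count stays constant
// no matter how many sockets are connected.
function dispatch(socketId, data) {
  pub.publish('sync:dispatch', JSON.stringify({ id: socketId, data: data }));
}
```

This is why the subscription count no longer scales with connections: the channel set is fixed at startup, and fan-out to individual sockets happens in process memory rather than in redis.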