Comments:
I don't understand the point of a cache on top of Redis. Won't every request require us to update the cache to get the latest count from the Redis store? A cache would make sense if it were read-heavy, I think.
You don't talk about costs in any of these videos. Cost is a very important aspect which can redefine the entire solution. But good video, you bring new dimensions to my thought process.
Hi @jordanhasnolife5163, I needed some clarity on the Redis part. The linked list you describe will be stored in Redis, the rate-limiting algorithm will live on the rate-limiting server, and every time we remove, insert, or read data from the linked list, the rate-limiting server will call Redis, which is deployed on a separate server. Is my understanding correct, or am I missing anything here?
This is a beginner question, but how does sharding with single-leader replication work? Does each shard (range) of the database have its own leader?
Jordan, thank you for the great content. Could you please share the slides used for all of the sys design 2.0 videos?
No Flink? wtf. Who are you and where's our boy Jordan??
Hi, I was wondering if rate-limit rules would come into the discussion (e.g. for this particular endpoint, this many requests are remaining for this IP). These configurations need to be stored somewhere, right? Probably a DB?
Also, correct me if I am wrong: Redis stores the running rate-limit counters, right? Like, 5 requests have been exhausted.
Also, how are we refreshing the limits here? Say a minute has passed, I need to reset the limits, right?
Nice video Jordan..
One small question though:
in the final picture, where are we going to keep the sliding-window logic that kicks out the expired counts? Is it inside the LB, or will we create a new rate-limiting service which uses Redis?
9K views damn dude you are blowing up
Which app do you use for the whiteboarding in your videos?
Big thanks from another Googler 🙏
Hi Jordan. Great video as usual! I had one question: can't we use leaderless replication with partitioning to send a particular userId/IP to the same node every time?
golden
Shouldn't the rate limiter be part of the API gateway?
Great video. Could the linked list be a binary search tree instead? Inserting an element is slower, but removing from it takes O(log n) instead of O(n); you would have to rebalance the binary search tree once in a while in O(n), but not always.
Good discussions. Very helpful in preparing for upcoming interviews.
A few things I don’t understand about the solution:
1) With a distributed rate limiter sharded on userId/IP address like you've proposed here, I can't see the need for read replicas. Every operation is probably a write operation. That's under the assumption that the vast majority of requests don't get throttled and thus put a timestamp in the queue. Read replicas would only be helpful when the request is throttled.
So I think if we want 100% accuracy, we can't do leaderless replication. But actually I would argue that there are a lot of cases where we would prefer speed over accuracy. And if our scale is high enough that requests come so often that a single machine doesn't have the I/O to handle them (or the blocking data structures are under too much pressure), then to support higher throughput we would need to compromise accuracy.
By allowing writes to go to multiple machines, we can scale much higher in write throughput. The loss is accuracy. We also then need to share that information between nodes. Perhaps with a leader, perhaps not. We can also use gossip style dissemination.
2) I can’t understand how the cache would work on the LB. This would I presume be for throttled requests only. I suppose the rate limiter could return the earliest time that the next request would succeed and the cache would be valid only until then. Is that the idea?
Another good thing to talk about would be UDP vs. TCP here, which again falls into the accuracy vs. speed tradeoff.
Overall great discussion here in the video, maybe some of these points could further someone else’s thinking on the problem space while preparing.
Amazing, Jordan!
I remember seeing Redis sorted sets used for the sliding window, while you use linked lists.
Your solution does have better time complexity,
but maybe they use sorted sets because requests can reach the rate limiter "out of order"?
Not sure why they would overcomplicate the solution otherwise.
We don't necessarily need our rate limiter to be available all the time?
- We have the cache at the load balancer.
- The rate limiting as a service is not on the load balancer, so if it is down for some time it won't affect the service as a whole.
But you can have a multi-master setup too, with a follower for each master. You just need to ensure that the data for one user always goes to a particular shard. Would that solve user/IP-level rate limiting?
When are you gonna open a Discord so we can design a feet pic recommendation platform
Thank you for the video! Good learning as always.
One doubt: when you say a multi-leader replication setup,
1. Do you mean it in the context of the rate-limiting services? Like every rate-limiting service node will keep track of its own count (CRDTs) and then broadcast it to the other nodes. (If yes, then why do we need Redis?)
Or,
2. Do you mean it in the context of Redis?
Great video. However, this implementation assumes that all 20 services will be rate limited at the same spot? So this design wouldn't work if each service needed to be rate limited independently?
Great video! In the final design, where is the rate-limiting logic implemented? Is it in between the load balancer and the Redis database? Or is it implemented on the backend services? Or is it just for us to decide based on the constraints?
Rate limiting is done at the per-service level, right? So in the estimation, why did you multiply by 20 services? We'd use 20 different rate limiters, each with the 12 GB limit.
I've been binge-watching your channel and all I hear is Kafka and Flink even though those terms didn't come up in this video. Send help.
Can you please share the Google Drive link for all of the above notes?
Considering how sarcastic he can be, lame me thought he was referring to the other "edging"... ahem ahem..
For atomicity we can use Lua scripts in Redis.
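For illustration, a minimal sketch of that idea, assuming the redis-py client; the key format, limit, and window are placeholders rather than anything from the video:

```python
import time
import redis  # assumes the redis-py client is installed

# Fixed-window counter done atomically in a single round trip.
# KEYS[1] = counter key, ARGV[1] = request limit, ARGV[2] = window length in seconds.
FIXED_WINDOW_LUA = """
local current = redis.call('INCR', KEYS[1])
if current == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[2])
end
return current <= tonumber(ARGV[1]) and 1 or 0
"""

r = redis.Redis()
fixed_window = r.register_script(FIXED_WINDOW_LUA)

def allow(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    # One key per user per window; the whole script runs atomically on the
    # single-threaded Redis server, so there is no read-modify-write race.
    key = f"rl:{user_id}:{int(time.time()) // window_s}"
    return fixed_window(keys=[key], args=[limit, window_s]) == 1
```

The same trick works for a sliding-window log: move the trim, count, and insert into one script so no other client can interleave between them.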
Dumb question: why don't we put it in front of the LB? :|
Thank you for your video.
Thank you Jordan!
Jordan, I saw it in the comments too, but wouldn't Redis' native support for sorted sets be very powerful for rate limiting? Specifically the ZREMRANGEBYSCORE command for removing expired keys in a sliding window?
Also, with Redis, isn't concurrency a non-issue since it is single-threaded?
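For reference, a rough sketch of that sorted-set approach, assuming redis-py; the key naming and limits are made up for illustration:

```python
import time
import uuid
import redis  # assumes the redis-py client is installed

r = redis.Redis()

def allow(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    key = f"rl:zset:{user_id}"
    now = time.time()
    member = f"{now}:{uuid.uuid4()}"  # unique member so concurrent requests don't collide
    pipe = r.pipeline()
    pipe.zremrangebyscore(key, 0, now - window_s)  # drop timestamps that fell out of the window
    pipe.zadd(key, {member: now})                  # record this request, scored by its timestamp
    pipe.zcard(key)                                # how many requests remain in the window
    pipe.expire(key, window_s)                     # let keys for idle users age out
    _, _, count, _ = pipe.execute()
    return count <= limit
```

One caveat: the four commands go out as a pipeline (one round trip) rather than one atomic unit, and a rejected request still leaves a member behind; wrapping the same commands in a Lua script, as mentioned above, closes both gaps.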
Jordan, to avoid locking in the sliding-window algorithm, we could split the time window into multiple buckets and store them as a circular array; for example, 1 second can be broken into 10 buckets of 100 ms. The total hits at any point is the sum of the hits across all buckets. We can use atomic variables to maintain the hits in each bucket and the total hits.
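A rough, in-process sketch of that bucketed idea (single node, plain Python; in a real multi-threaded limiter each bucket would be an atomic counter, e.g. Java's AtomicLongArray, rather than a plain int):

```python
import time

class BucketedWindow:
    """Approximate sliding window: the window is split into fixed buckets in a
    circular array, and stale buckets are reset lazily instead of scanning a log."""

    def __init__(self, limit, window_s=1.0, num_buckets=10):
        self.limit = limit
        self.bucket_len = window_s / num_buckets
        self.counts = [0] * num_buckets    # circular array of hit counts
        self.starts = [0.0] * num_buckets  # start time of the interval each slot currently holds

    def allow(self, now=None):
        now = time.time() if now is None else now
        slot = int(now / self.bucket_len) % len(self.counts)
        bucket_start = int(now / self.bucket_len) * self.bucket_len
        if self.starts[slot] != bucket_start:  # slot still holds an expired interval: reset it
            self.starts[slot] = bucket_start
            self.counts[slot] = 0
        window = self.bucket_len * len(self.counts)
        total = sum(c for c, s in zip(self.counts, self.starts) if now - s < window)
        if total >= self.limit:
            return False
        self.counts[slot] += 1
        return True
```

The tradeoff is granularity: each request is counted against its whole bucket, so the window edge is only accurate to one bucket length.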
With user ID and IP you can also add in user system data for a more precise system: MAC address (random MAC on new phones is a troublesome thing), phone make, model, device ID, screen resolution, processor, storage, device specs, etc. A combination of these three can make it more precise, especially on CGNAT networks (the same IP for a whole neighbourhood, common with ISPs in Asia).
Thank you!
I think Redis operations (like INCR etc.) are atomic, since by default it runs single-threaded and uses atomic hardware ops (like compare-and-swap). And based on what I have read so far, even single-threaded Redis seems to be pretty performant. Any reason why going multi-threaded with locking might be more performant?
NOTE: I recently discovered your channel and absolutely love your content. Keep it coming 🙌🏽.
talk is cheap, show me your code.
I watched this video several times (like the rest of the problems) and practiced by recording myself while designing the rate limiter, notification system, Twitter, etc. Luckily, my interviewer from Amazon picked the rate limiter, and he didn't like it. He didn't tell me what I did wrong, and he had no feedback or clues for me. I explained my choices based on his numbers and requirements (eventual consistency, exponential lock on IPs, partitioning, replicas, the choice of Redis, and why). I still don't understand what I missed.
W vid. Very clear.
I love your videos Jordan... just amazing... my teacher... I'll wish you a happy Teachers' Day on Teachers' Day...😊😊
Shouldn't we use RabbitMQ here, as we are not concerned with order?
I'd like to dig into the memory requirements of the system some more, with the added distinction between a fixed-window and a sliding-window algorithm. I worry that to support a sliding-window rate limit, our memory requirements would balloon so much as to invalidate the architecture.
Fixed window
—-
I think your initial estimate holds up as long as we only need a single interval/process (or maybe a handful) tracked in memory that is responsible for updating all of the token buckets.
Sliding window
—-
If every active IP/user needs a linked list, then I think we have a problem. The LL needs as many nodes as there can be requests within a window. That might be, say, 1000 if our rate limit is something like 1000 per minute or per 24 hours (which seems like a reasonable thing to support). Each node in the LL needs a 4-byte pointer to the next node and 4 bytes for the timestamp. 8 bytes * 1000 * 1 billion alone puts us at 8 TB.
At 8 TB+, are we getting outside the realm of a manageable number of rate-limit shards? 🤔
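Spelling that arithmetic out (these are the comment's assumptions, not figures from the video):

```python
users            = 1_000_000_000  # active user/IP entries being tracked
requests_per_win = 1_000          # linked-list nodes per entry at the rate limit
bytes_per_node   = 8              # 4-byte next pointer + 4-byte timestamp

total_bytes = users * requests_per_win * bytes_per_node
print(total_bytes / 1e12, "TB")   # -> 8.0 TB, before multiplying by the number of services
```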
This was a really dense, concise and simple video. Good going!
Are you able to add the Thanks button? Got an L5 at AWS due in part to your stuff. First baby arriving in two weeks, can finally afford a parental DNA test. Thanks Jordan!!
Isn't 1 billion * 20 * 8 * 4 = 640 billion, not 240?