Comments:
I don't understand the point of a cache on top of Redis. Won't every request require us to update the cache to get the latest count from the Redis store? A cache would make sense if it were read-heavy, I think.
You don't talk about costs in any of these videos. Cost is a very important aspect which can redefine the entire solution. But good video, you bring new dimensions to my thought process.
Hi @jordanhasnolife5163, I needed some clarity on the Redis part. The linked list you describe will be stored in Redis, the rate-limiting algorithm will live on the rate-limiting server, and every time we remove, insert, or read data from the linked list, the rate-limiting server will call Redis, which is deployed on a separate server. Is my understanding correct, or am I missing anything here?
This is a beginner question, but how does sharding with single-leader replication work? Does each shard (range) of the database have its own leader?
Jordan, thank you for the great content. Could you please share the slides used for all of the sys design 2.0 videos?
No Flink? wtf. Who are you and where's our boy Jordan??
Hi, I was wondering if rate-limit rules would come into the discussion (e.g. for this particular endpoint, this many requests are remaining for this IP). These configurations need to be stored somewhere, right? Probably a DB?
Also, correct me if I am wrong: Redis stores the running rate-limit counters, right? Like, 5 requests have been exhausted.
Also, how are we refreshing the limits here? Say a minute has passed, I need to reset the limits, right?
Nice video Jordan..
One small question though:
in the final picture, where are we going to keep the sliding-window logic that kicks out the expired counts? Is it inside the LB, or will we create a new rate-limiting service which uses Redis?
9K views damn dude you are blowing up
Which app do you use for the whiteboarding in your videos?
Big thanks from another Googler 🙏
Hi Jordan. Great video as usual! I had one question: can't we use leaderless replication with partitioning to send a particular userId/IP to the same node every time?
golden
Shouldn't the rate limiter be part of the API gateway?
Great video. Could the linked list be a binary search tree instead? Inserting an element is slower, but removing from it takes O(log n) instead of O(n); you would have to rebalance the binary search tree once in a while in O(n), but not always.
Good discussions. Very helpful in preparing for upcoming interviews.
A few things I don’t understand about the solution:
1) With a distributed rate limiter sharded on userId/IP address like you've proposed here, I can't see the need for read replicas. Every operation is probably a write operation. That's under the assumption that the vast majority of requests don't get throttled and thus put a timestamp in the queue. Read replicas would only be helpful when the request is throttled.
So I think if we want 100% accuracy, we can't do leaderless replication. But actually I would argue that there are a lot of cases where we would prefer speed over accuracy. And if our scale is high enough that requests come so often that a single machine doesn't have the I/O to handle them (or the blocking data structures are under too much pressure), then to support higher throughput we would need to compromise accuracy.
By allowing writes to go to multiple machines, we can scale much higher in write throughput. The loss is accuracy. We also then need to share that information between nodes. Perhaps with a leader, perhaps not. We can also use gossip style dissemination.
2) I can’t understand how the cache would work on the LB. This would I presume be for throttled requests only. I suppose the rate limiter could return the earliest time that the next request would succeed and the cache would be valid only until then. Is that the idea?
Another good thing to talk about would be UDP vs. TCP here, which again falls into the accuracy vs. speed tradeoff.
Overall great discussion here in the video, maybe some of these points could further someone else’s thinking on the problem space while preparing.
Amazing, Jordan!
I remember seeing Redis sorted sets used for the sliding window, while you use linked lists.
Your solution does have better time complexity,
but maybe they use sorted sets because requests can reach the rate limiter "out of order"?
Not sure why they would overcomplicate the solution otherwise.
We don't necessarily need our rate limiter to be available all the time?
- We have the cache at the load balancer.
- The rate limiting as a service is not on the load balancer, so if it is down for some time it won't affect the service as a whole.
But you can have a multi-master setup too, with a follower for each master. You just need to ensure that the data for one user always goes to a particular shard. Would that solve user/IP-level rate limiting?
When are you gonna open a Discord so we can design a feet pic recommendation platform
Thank you for the video! Good learning as always.
One doubt: when you say a multi-leader replication setup,
1. Do you mean it in the context of the rate-limiting services? Like every rate-limiting service node will keep track of its own count (CRDTs) and then broadcast it to the other nodes. (If yes, then why do we need Redis?)
Or,
2. Do you mean it in the context of Redis?
Great video. However, this implementation assumes that all 20 services will be rate limited at the same spot? So this design wouldn't work if each service needed to be rate limited independently?
Great video! In the final design, where is the rate-limiting logic implemented? Is it in between the load balancer and the Redis database? Or is it implemented on the backend services? Or is it just for us to decide based on the constraints?
Rate limiting is done at the per-service level, right? So in the estimation, why did you multiply by 20 services? We'd use 20 different rate limiters, each with the 12 GB limit.
I've been binge-watching your channel and all I hear is Kafka and Flink even though those terms didn't come up in this video. Send help.
Can you please share the Google Drive link for all of the above notes?
Considering how sarcastic he can be, lame me thought he was referring to the other "edging"... ahem ahem..
For atomicity we can use Lua scripts in Redis.
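For illustration, a minimal sketch of that idea, assuming the redis-py client; the key format, limit, and window are placeholders rather than anything from the video:

```python
import time
import redis  # assumes the redis-py client is installed

# Fixed-window counter done atomically in a single round trip.
# KEYS[1] = counter key, ARGV[1] = request limit, ARGV[2] = window length in seconds.
FIXED_WINDOW_LUA = """
local current = redis.call('INCR', KEYS[1])
if current == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[2])
end
return current <= tonumber(ARGV[1]) and 1 or 0
"""

r = redis.Redis()
fixed_window = r.register_script(FIXED_WINDOW_LUA)

def allow(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    # One key per user per window; the whole script runs atomically on the
    # single-threaded Redis server, so there is no read-modify-write race.
    key = f"rl:{user_id}:{int(time.time()) // window_s}"
    return fixed_window(keys=[key], args=[limit, window_s]) == 1
```

The same trick works for a sliding-window log: move the trim, count, and insert into one script so no other client can interleave between them.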
Dumb question: why don't we put it in front of the LB? :|
Thank you for your video.
Thank you Jordan!
Jordan, I saw it in the comments too, but wouldn't Redis' native support for sorted sets be very powerful for rate limiting? Specifically the ZREMRANGEBYSCORE command for removing expired keys in a sliding window?
Also, with Redis, isn't concurrency a non-issue since it is single-threaded?
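For reference, a rough sketch of that sorted-set approach, assuming redis-py; the key naming and limits are made up for illustration:

```python
import time
import uuid
import redis  # assumes the redis-py client is installed

r = redis.Redis()

def allow(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    key = f"rl:zset:{user_id}"
    now = time.time()
    member = f"{now}:{uuid.uuid4()}"  # unique member so concurrent requests don't collide
    pipe = r.pipeline()
    pipe.zremrangebyscore(key, 0, now - window_s)  # drop timestamps that fell out of the window
    pipe.zadd(key, {member: now})                  # record this request, scored by its timestamp
    pipe.zcard(key)                                # how many requests remain in the window
    pipe.expire(key, window_s)                     # let keys for idle users age out
    _, _, count, _ = pipe.execute()
    return count <= limit
```

One caveat: the four commands go out as a pipeline (one round trip) rather than one atomic unit, and a rejected request still leaves a member behind; wrapping the same commands in a Lua script, as mentioned above, closes both gaps.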
Jordan, to avoid locking in the sliding-window algorithm, we could split the time window into multiple buckets and store them as a circular array; for example, 1 second can be broken into 10 buckets of 100 ms. The total hits at any point is the sum of the hits across all buckets. We can use atomic variables to maintain the hits in each bucket and the total hits.
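A rough, in-process sketch of that bucketed idea (single node, plain Python; in a real multi-threaded limiter each bucket would be an atomic counter, e.g. Java's AtomicLongArray, rather than a plain int):

```python
import time

class BucketedWindow:
    """Approximate sliding window: the window is split into fixed buckets in a
    circular array, and stale buckets are reset lazily instead of scanning a log."""

    def __init__(self, limit, window_s=1.0, num_buckets=10):
        self.limit = limit
        self.bucket_len = window_s / num_buckets
        self.counts = [0] * num_buckets    # circular array of hit counts
        self.starts = [0.0] * num_buckets  # start time of the interval each slot currently holds

    def allow(self, now=None):
        now = time.time() if now is None else now
        slot = int(now / self.bucket_len) % len(self.counts)
        bucket_start = int(now / self.bucket_len) * self.bucket_len
        if self.starts[slot] != bucket_start:  # slot still holds an expired interval: reset it
            self.starts[slot] = bucket_start
            self.counts[slot] = 0
        window = self.bucket_len * len(self.counts)
        total = sum(c for c, s in zip(self.counts, self.starts) if now - s < window)
        if total >= self.limit:
            return False
        self.counts[slot] += 1
        return True
```

The tradeoff is granularity: each request is counted against its whole bucket, so the window edge is only accurate to one bucket length.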
With user ID and IP you can also add in user system data for a more precise system: MAC address (random MAC on new phones is a troublesome thing), phone make, model, device ID, screen resolution, processor, storage, device specs, etc. A combination of these three can make it more precise, especially on CGNAT networks (the same IP for a whole neighbourhood, common with ISPs in Asia).
Thank you!
I think Redis operations (like INCR etc.) are atomic, since by default it runs single-threaded and uses atomic hardware ops (like compare-and-swap). And based on what I have read so far, even single-threaded Redis seems to be pretty performant. Any reason why going multi-threaded with locking might be more performant?
NOTE: I recently discovered your channel and absolutely love your content. Keep it coming 🙌🏽.
talk is cheap, show me your code.
I watched this video several times (like the rest of the problems) and practiced by recording myself while designing the rate limiter, notification system, Twitter, etc. Luckily, my interviewer from Amazon picked the rate limiter, and he didn't like it. He didn't tell me what I did wrong, and he had no feedback or clues for me. I explained my choices based on his numbers and requirements (eventual consistency, exponential lock on IPs, partitioning, replicas, the choice of Redis, and why). I still don't understand what I missed.
W vid. Very clear.
I love your videos Jordan... just amazing... my teacher... I'll wish you a happy Teachers' Day on Teachers' Day...😊😊
Shouldn't we use RabbitMQ here, as we are not concerned with order?
I'd like to dig into the memory requirements of the system some more, with the added distinction between a fixed-window and a sliding-window algorithm. I worry that to support a sliding-window rate limit, our memory requirements would balloon so much as to invalidate the architecture.
Fixed window
—-
I think your initial estimate holds up as long as we only need a single interval/process (or maybe a handful) tracked in memory that is responsible for updating all of the token buckets.
Sliding window
—-
If every active IP/user needs a linked list, then I think we have a problem. The LL needs as many nodes as there can be requests within a window. That might be, say, 1000 if our rate limit is something like 1000 per minute or per 24 hours (which seems like a reasonable thing to support). Each node in the LL needs a 4-byte pointer to the next node and 4 bytes for the timestamp. 8 bytes * 1000 * 1 billion alone puts us at 8 TB.
At 8 TB+, are we getting outside the realm of a manageable number of rate-limit shards? 🤔
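Spelling that arithmetic out (these are the comment's assumptions, not figures from the video):

```python
users            = 1_000_000_000  # active user/IP entries being tracked
requests_per_win = 1_000          # linked-list nodes per entry at the rate limit
bytes_per_node   = 8              # 4-byte next pointer + 4-byte timestamp

total_bytes = users * requests_per_win * bytes_per_node
print(total_bytes / 1e12, "TB")   # -> 8.0 TB, before multiplying by the number of services
```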
This was a really dense, concise and simple video. Good going!
Are you able to add the Thanks button? Got an L5 at AWS due in part to your stuff. First baby arriving in two weeks, can finally afford a parental DNA test. Thanks Jordan!!
Isn't 1 billion * 20 * 8 * 4 = 640 billion, not 240?