Question

I've been studying some systems design materials in preparation for an upcoming interview. My current job doesn't really involve much back-of-the-envelope estimation. I've found an example design problem, specifically how to design Instagram. For the most part it makes sense, but I'm stuck on the size estimate for the data that maps followers to the people they are following.

The reference material states:

UserFollow: Each row in the UserFollow table will consist of 8 bytes. If we have 500 million users and on average each user follows 500 users. We would need 1.82TB of storage for the UserFollow table:

500 million users * 500 followers * 8 bytes ~= 1.82TB

For the life of me, I keep calculating this out to be 20GB.

First off, number of rows of data: 5,000,000 x 500 = 2,500,000,000 rows

Next, each row is 8 bytes, so 2,500,000,000 rows x 8 bytes = 20,000,000,000 bytes => 20GB

I feel kind of like an idiot because this seems to be a simple math problem, but my number is way off. What could I be missing here?


Solution

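5,000,000 x 500 is "5 million" users, each following 500 others. With 500 million users it should be 500,000,000 x 500 = 250,000,000,000 rows, which at 8 bytes per row comes to 2,000,000,000,000 bytes (2000 GB).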

In addition to that, my guess is that there's a difference of GiB (gibibytes, 1024^3 bytes) vs GB (gigabytes, 1000^3 bytes). Your 20 GB figure becomes 2000 GB with the correct user count above, which is about 1862.6 GiB, or 1.819 TiB (dividing by 1024^4 bytes per TiB). That matches their calculated 1.82 figure.

I suspect the reference material's mistake is in the units: the result is written as TB where TiB is meant.
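As a quick sanity check, the estimate can be redone in a few lines of Python. This is only a sketch: the constants are taken from the reference material quoted in the question, and the variable names are made up for illustration.

```python
# Inputs as quoted from the reference material (assumed, not measured).
USERS = 500_000_000          # 500 million users
FOLLOWS_PER_USER = 500       # average number of accounts each user follows
BYTES_PER_ROW = 8            # size of one UserFollow row

total_bytes = USERS * FOLLOWS_PER_USER * BYTES_PER_ROW

# Decimal (SI) units use powers of 1000; binary (IEC) units use powers of 1024.
tb = total_bytes / 1000**4
gib = total_bytes / 1024**3
tib = total_bytes / 1024**4

print(f"{total_bytes:,} bytes")  # 2,000,000,000,000 bytes
print(f"{tb:.2f} TB")            # 2.00 TB
print(f"{gib:.1f} GiB")          # 1862.6 GiB
print(f"{tib:.2f} TiB")          # 1.82 TiB
```

The same byte count reads as 2.00 TB in decimal units and 1.82 TiB in binary units, which is where the 1.82 in the reference appears to come from.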

License: CC-BY-SA with attribution
Not affiliated with softwareengineering.stackexchange