Back-of-the-envelope Calculation: Must-Know Numbers


Calculation

Define the numbers to estimate:

  • #users
  • #operations/events
  • data size per op/event

General formulas:

Number of operations per day

[# of operations / day] = [# of DAU] * [# of operations per user per day]
unit: #(ops)/day | e.g.: 1M(ops)/day

Amount of data generated/transferred per day

[data size / day] = [# of ops / day] * [data size per op]
unit: storage_unit/day | e.g.: 1 MB/day

Day to second conversion

1 day = 86400 s

QPS estimate

[# of query ops / second] = [# of query ops / day] / 86400
unit: #(ops)/s | e.g.: 1M(queries)/s

Throughput estimate

[data size / second] = [data size / day] / 86400
unit: storage_unit/s | e.g.: 1 MB/s
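
The formulas above can be chained into one small calculation. The following is a minimal sketch in Python; the function name `estimate` and the input values (10M DAU, 5 operations per user, 1 KB per operation) are hypothetical and chosen only for illustration. Note that dividing by 86400 gives an average over the day, not a peak.

```python
SECONDS_PER_DAY = 86_400  # 1 day = 86400 s


def estimate(dau, ops_per_user, bytes_per_op):
    """Apply the general formulas: ops/day, data/day, QPS, throughput."""
    ops_per_day = dau * ops_per_user
    bytes_per_day = ops_per_day * bytes_per_op
    return {
        "ops_per_day": ops_per_day,
        "bytes_per_day": bytes_per_day,
        "qps": ops_per_day / SECONDS_PER_DAY,
        "bytes_per_second": bytes_per_day / SECONDS_PER_DAY,
    }


# Hypothetical inputs, for illustration only.
result = estimate(dau=10_000_000, ops_per_user=5, bytes_per_op=1024)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

With these assumed inputs the average load comes out to roughly 579 operations per second.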

Must Know Terms:

DAU (Daily Active Users)

  • total number of users that engage in some way with a web or mobile product on a given day

Bandwidth

  • maximum amount of data that can travel through a ‘channel’ per unit of time

Throughput

  • amount of data that actually travels through the ‘channel’ successfully per unit of time.
  • number of actions executed or results produced per unit of time.
  • e.g. actions/second

Latency

  • time required to perform some action or to produce some result.
  • measured in units of time – hours, minutes, seconds, nanoseconds or clock periods.
  • e.g. seconds/action
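
Throughput (actions/second) and latency (seconds/action) have reciprocal units. As a rough sketch, assuming a single worker that handles one action at a time with no overlap, throughput is simply the inverse of latency. The function name `serial_throughput` is hypothetical, and real systems with concurrency or pipelining will not follow this simple relation.

```python
def serial_throughput(latency_seconds):
    """Actions per second for one worker processing actions one at a time."""
    return 1.0 / latency_seconds


# Assumed example: each action takes 10 ms.
actions_per_second = serial_throughput(0.010)
print(f"{actions_per_second:.0f} actions/second")
```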

QPS

  • Queries per second
  • number of query operations / second

Product Numbers

Engineering Numbers

1 word = 2 bytes = 16 bits (x86 convention; word size varies by architecture)
1 byte = 8 bits

1 int (java) = 4 bytes => -2,147,483,648 to 2,147,483,647
1 long (java) = 8 bytes => -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
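
The ranges above follow from two's-complement representation: an n-bit signed integer spans -2^(n-1) to 2^(n-1) - 1. A small Python sketch that reproduces them (the helper name `signed_range` is hypothetical):

```python
def signed_range(num_bytes):
    """Return (min, max) of a two's-complement signed integer of this size."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


int_min, int_max = signed_range(4)    # same size as a Java int
long_min, long_max = signed_range(8)  # same size as a Java long
print(f"4 bytes: {int_min:,} to {int_max:,}")
print(f"8 bytes: {long_min:,} to {long_max:,}")
```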

1 KB = 1024 B
1 MB = 1024 KB
1 GB = 1024 MB
1 TB = 1024 GB
1 PB = 1024 TB
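
When an estimate produces a large byte count, dividing repeatedly by 1024 turns it into a readable unit. This is a minimal sketch using the conversion factors above; the helper name `humanize` and the sample value are assumptions for illustration.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def humanize(num_bytes):
    """Format a byte count using the 1024-based units listed above."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Hypothetical daily data volume: 51.2 billion bytes.
print(humanize(51_200_000_000))
```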

Powers of two table

Power  Exact Value        Approx Value  Bytes
0      1                                1 B
7      128
8      256
10     1,024              1 thousand    1 KB
16     65,536                           64 KB
20     1,048,576          1 million     1 MB
30     1,073,741,824      1 billion     1 GB
32     4,294,967,296                    4 GB
40     1,099,511,627,776  1 trillion    1 TB
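
The "Approx Value" column relies on 2^10 being close to 10^3. The sketch below, written as an illustration rather than as part of the original table, checks how far the approximation drifts as the power grows.

```python
rows = []
for power in (10, 20, 30, 40):
    exact = 2 ** power
    approx = 10 ** (3 * power // 10)  # 2^10 ~ 10^3, 2^20 ~ 10^6, ...
    error_pct = (exact - approx) / approx * 100
    rows.append((power, exact, approx, error_pct))
    print(f"2^{power} = {exact:,} ~ {approx:,} (off by {error_pct:.1f}%)")
```

The gap stays under 10% up to 2^40, which is usually acceptable for a back-of-the-envelope estimate.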

Latency numbers every programmer should know

Latency Comparison Numbers
L1 cache reference                          0.5 ns
Branch mispredict                             5 ns
L2 cache reference                            7 ns                      14x L1 cache
Mutex lock/unlock                            25 ns
Main memory reference                       100 ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             10,000 ns       10 us
Send 1 KB over 1 Gbps network            10,000 ns       10 us
Read 4 KB randomly from SSD*            150,000 ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory      250,000 ns      250 us
Round trip within same datacenter       500,000 ns      500 us
Read 1 MB sequentially from SSD*      1,000,000 ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
Disk seek                            10,000,000 ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from 1 Gbps   10,000,000 ns   10,000 us   10 ms  40x memory, 10X SSD
Read 1 MB sequentially from disk     30,000,000 ns   30,000 us   30 ms  120x memory, 30X SSD
Send packet CA->Netherlands->CA     150,000,000 ns  150,000 us  150 ms

Notes
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns

Handy metrics based on numbers above:

  • Read sequentially from disk at 30 MB/s
  • Read sequentially from 1 Gbps Ethernet at 100 MB/s
  • Read sequentially from SSD at 1 GB/s
  • Read sequentially from main memory at 4 GB/s
  • 6-7 world-wide round trips per second
  • 2,000 round trips per second within a data center
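
These metrics can be derived directly from the latency table: if one operation takes t nanoseconds, then 10^9 / t operations fit in one second. The sketch below repeats that arithmetic; the dictionary keys and the helper `per_second` are hypothetical names, and the latency values are copied from the table above.

```python
NS_PER_SECOND = 1_000_000_000  # 1 ns = 10^-9 seconds

# Latencies in nanoseconds, taken from the table above.
latency_ns = {
    "read_1mb_memory": 250_000,
    "read_1mb_ssd": 1_000_000,
    "read_1mb_1gbps": 10_000_000,
    "read_1mb_disk": 30_000_000,
    "datacenter_round_trip": 500_000,
    "ca_netherlands_round_trip": 150_000_000,
}


def per_second(ns_per_operation):
    """How many operations of this latency fit into one second."""
    return NS_PER_SECOND / ns_per_operation


for name, ns in latency_ns.items():
    # For the read_1mb_* entries, operations/second equals MB/second.
    print(f"{name}: {per_second(ns):,.1f} per second")
```

The disk figure works out to about 33 MB/s, which the list above rounds down to 30 MB/s.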


Author: Zijun Zhou
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: Zijun Zhou.