This writeup describes the results of my experiment comparing the measured latencies of "encoding" = "base64" vs "encoding" = "base64+zstd" for Solana's getAccountInfo RPC call.
Key Takeaways
- For smaller accounts (up to ~3KB): base64's mean latency is lower than base64+zstd's
- For larger accounts (~5KB and above): base64+zstd's mean latency is lower than base64's
- A crossover point exists between ~3KB and ~5KB account sizes where base64+zstd begins to outperform base64
- The 95th percentile latencies follow the same pattern as the means: base64 wins for smaller accounts, while base64+zstd wins for larger ones
=== SUMMARY ANALYSIS (MEAN) - getAccountInfo===
| Size (B) | Base64 (ms) | Zstd (ms) | Diff (ms) | P-value | Winner |
|---|---|---|---|---|---|
| 904 | 20.67 | 24.40 | 3.73 | HIGHLY SIGNIFICANT (p < 0.01) | Base64 |
| 3232 | 25.43 | 30.95 | 5.52 | HIGHLY SIGNIFICANT (p < 0.01) | Base64 |
| 5400 | 35.03 | 25.18 | 9.85 | HIGHLY SIGNIFICANT (p < 0.01) | Zstd |
| 7448 | 25.36 | 20.89 | 4.47 | HIGHLY SIGNIFICANT (p < 0.01) | Zstd |
| 10136 | 23.58 | 21.41 | 2.17 | HIGHLY SIGNIFICANT (p < 0.01) | Zstd |
=== SUMMARY ANALYSIS (P95) - getAccountInfo===
| Size (B) | Base64 (ms) | Zstd (ms) | Diff (ms) | Winner |
|---|---|---|---|---|
| 904 | 27.71 | 32.81 | 5.09 | Base64 |
| 3232 | 34.34 | 36.81 | 2.47 | Base64 |
| 5400 | 43.93 | 33.82 | 10.11 | Zstd |
| 7448 | 33.84 | 27.53 | 6.31 | Zstd |
| 10136 | 30.54 | 28.70 | 1.84 | Zstd |
Methodology
- RPC Method Tested: getAccountInfo (single account call)
- RPC Provider: Helius paid developer plan
- Sample Size: 1,000 RPC requests per account size, for each encoding type
- For each request, I measured end-to-end latency: the time from sending the RPC request to receiving the complete response and decoding it (plus decompressing it for base64+zstd)
- I collected the mean, median, standard deviation, min, max, and P95/P99 values for each account size and encoding
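The collection step can be sketched in Python as follows. This is a minimal sketch, not the script used to produce the results above: the helper names (`time_call_ms`, `percentile`, `summarize`) are hypothetical, and the nearest-rank percentile is an assumption, since the method used for P95/P99 is not stated here.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def time_call_ms(call: Callable[[], object]) -> float:
    """Run `call` once and return its wall-clock duration in milliseconds."""
    start = time.perf_counter()
    call()
    return (time.perf_counter() - start) * 1000.0


def percentile(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile of an already sorted list (assumed method)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100.0))
    return sorted_values[rank - 1]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Compute the statistics listed above for one account size and encoding."""
    ordered = sorted(latencies_ms)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "max": ordered[-1],
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```

In the experiment's terms, `time_call_ms` would wrap one request together with its decoding step, and `summarize` would run once per account size and encoding over the 1,000 collected samples.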
Test Accounts
| Account Size | Address |
|---|---|
| 904 bytes | Po57HwPXoG22SajaQCBn7Zg8hPp15pUTMcTwcr4GhDu |
| 3,232 bytes | 2uLdBfEax2Gzk3KRGTiWoYQPmXWgSG6g1UDax1T9J6MM |
| 5,400 bytes | 4yRzzxU6brZsY9CTZ2Jp9QWhnRftYP6RyZ8FGugVEXN8 |
| 7,448 bytes | CXh2s3cJVwJxZBa7q2ur3paKsWrrznZfApmvjQCPQJpc |
| 10,136 bytes | Hij4zmmrdmQ49bfgy63C8VF12EYysjF6WGFeZJpfipCk |
Example Result Output
============================================================
Testing GetAccountInfo with 3232 byte account, address: 2uLdBfEax2Gzk3KRGTiWoYQPmXWgSG6g1UDax1T9J6MM
============================================================
[base64] .................................................. Completed
[base64+zstd] .................................................. Completed
GetAccountInfo call, base64 - 3232 bytes:
Mean: 25.43 ms
Median: 24.00 ms
Std Dev: 5.79 ms
Min: 19.43 ms
Max: 80.75 ms
P95: 34.34 ms
P99: 50.08 ms
Successful calls: 1000
GetAccountInfo call, base64+zstd - 3232 bytes:
Mean: 30.95 ms
Median: 29.94 ms
Std Dev: 5.02 ms
Min: 25.10 ms
Max: 98.74 ms
P95: 36.81 ms
P99: 46.61 ms
Successful calls: 1000
Understanding getAccountInfo Encoding Types
What is base64 Encoding?
When you request account data via getAccountInfo from a Solana RPC node, the raw binary account data needs to be transmitted over HTTP/JSON. Since JSON cannot directly represent binary data, Solana uses base64 encoding to convert binary data into ASCII text.
The process of encoding data using base64 increases the data size by ~33%, but it is a necessary tradeoff for reliable data transmission over JSON-RPC.
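The ~33% figure follows from base64 mapping every 3 input bytes to 4 output characters (with padding). A small sketch that checks this against the account sizes used in this experiment; `base64_len` is a hypothetical helper name, and zero-filled buffers stand in for real account data since only the length matters here:

```python
import base64


def base64_len(n_bytes: int) -> int:
    """Length of the padded base64 encoding of n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


for size in (904, 3232, 5400, 7448, 10136):
    encoded = base64.b64encode(bytes(size))  # zero-filled stand-in data
    assert len(encoded) == base64_len(size)
    overhead_pct = (len(encoded) / size - 1) * 100
    print(f"{size} bytes -> {len(encoded)} base64 chars (+{overhead_pct:.1f}%)")
```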
What is base64+zstd?
Zstandard (Zstd) is a modern compression algorithm developed by Facebook that offers fast compression and decompression speeds and good compression ratios.
In the context of Solana, zstd compresses the account data before base64 encoding, reducing the amount of data transmitted over the network. The general steps for handling base64+zstd encoding in a Solana RPC call are:
Server-side (RPC node):
- Fetch raw account data from storage
- Compress the binary data using zstd
- Encode the compressed data with base64
- Send JSON response with compressed+encoded data
Client-side:
- Receive JSON response
- Use base64 to decode to get compressed binary
- Decompress using zstd to get original account data
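The client-side steps can be sketched in Python as follows. This is a minimal sketch under a few assumptions: the function names are hypothetical, the `zstd_decompress` parameter is an illustrative hook (not part of any Solana API) that lets the flow run without a zstd library installed, and the default path assumes the third-party `zstandard` package is available. The request follows the getAccountInfo JSON-RPC shape, and the account's `data` field in the response is a `[payload, encoding]` pair.

```python
import base64
from typing import Callable, Optional, Sequence


def build_get_account_info_request(address: str, encoding: str) -> dict:
    """JSON-RPC body for getAccountInfo with the chosen encoding."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "getAccountInfo",
        "params": [address, {"encoding": encoding}],
    }


def decode_account_data(
    data_field: Sequence[str],
    zstd_decompress: Optional[Callable[[bytes], bytes]] = None,
) -> bytes:
    """Turn the response's [payload, encoding] pair into raw account bytes."""
    payload, encoding = data_field
    raw = base64.b64decode(payload)  # both encodings are base64 on the wire
    if encoding == "base64":
        return raw
    if encoding == "base64+zstd":
        if zstd_decompress is None:
            import zstandard  # third-party package, assumed to be installed

            def zstd_decompress(blob: bytes) -> bytes:
                # Streaming object: does not need the frame's content size.
                return zstandard.ZstdDecompressor().decompressobj().decompress(blob)

        return zstd_decompress(raw)
    raise ValueError(f"unsupported encoding: {encoding}")
```

The JSON body would be sent with any HTTP client, and `decode_account_data` would then be applied to `result.value.data` from the response.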
With base64+zstd, the smaller payload reduces transfer time, but compression on the server and decompression on the client add CPU time.
The question is: at what data size does the benefit of compression outweigh the CPU overhead it adds?
Interpretation of Results
The data from the experiment described in the Methodology section supports the following takeaways:
- For smaller accounts (up to ~3KB): base64's mean latency is lower than base64+zstd's
- For larger accounts (~5KB and above): base64+zstd's mean latency is lower than base64's
These results are in line with my expectations, as they follow the tradeoff principle between bandwidth savings and the additional cost of compression and decompression.
Every RPC call has three latency components:
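- Network Latency: round-trip delay that depends on the quality of the client's and server's respective networks, independent of payload size
- Data Transfer: time to transmit the response payload, which grows with payload size and shrinks with available bandwidth
- CPU Processing: time spent encoding and decoding the data, plus compressing and decompressing it for base64+zstd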
At smaller data sizes, the extra CPU time spent on compression and decompression exceeds the transfer time saved by the smaller payload.
At larger data sizes, the bandwidth savings exceed the overhead from the additional decompression. A crossover point exists between ~3KB and ~5KB account sizes where base64+zstd begins to outperform base64.
Small Data Accounts:
base64 costs: Network + Small Data transfer + CPU for decoding
base64+zstd costs: Network + Tiny Data transfer + CPU for decoding and decompressing
Larger Data Accounts:
base64 costs: Network + Larger Data transfer + CPU for decoding
base64+zstd costs: Network + Medium Data transfer + CPU for decoding and decompressing
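The cost breakdown above can be turned into a rough model. The sketch below is illustrative only: the function names are hypothetical, the model assumes a fixed CPU cost per call and a constant compression ratio, and every numeric value in the usage example is a placeholder rather than a measurement from this experiment.

```python
import math


def base64_size(n_bytes: int) -> int:
    """Padded base64 length for n_bytes of binary input."""
    return 4 * ((n_bytes + 2) // 3)


def estimated_latency_ms(
    account_bytes: int,
    network_ms: float,
    bandwidth_bytes_per_ms: float,
    cpu_ms: float,
    compression_ratio: float = 1.0,  # compressed / original; 1.0 = no zstd
) -> float:
    """Network + data transfer + CPU, following the three components above."""
    wire_bytes = base64_size(math.ceil(account_bytes * compression_ratio))
    return network_ms + wire_bytes / bandwidth_bytes_per_ms + cpu_ms


def break_even_bytes(
    extra_cpu_ms: float,
    bandwidth_bytes_per_ms: float,
    compression_ratio: float,
) -> float:
    """Account size at which transfer savings equal the extra CPU cost."""
    saved_ms_per_byte = (4 / 3) * (1 - compression_ratio) / bandwidth_bytes_per_ms
    return extra_cpu_ms / saved_ms_per_byte


if __name__ == "__main__":
    # Placeholder inputs: 1 ms extra CPU, 1000 bytes/ms, data shrinks to 25%.
    print(break_even_bytes(1.0, 1000.0, 0.25))
```

Below the break-even size the model favors base64, and above it base64+zstd. The real crossover depends on how compressible the account data is and on the actual network path.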