“The biggest improvement I’ve seen in a decade.”
That’s Twitter engineer Yao Yue’s assessment of how effectively the Intel Ethernet 800 Series Network Adapters with Application Device Queues (ADQ) technology reduce the tail latency of Remote Procedure Call (RPC) requests across a broad range of payload sizes and connection counts.
Tail latency is a serious challenge. Twitter aims to serve 99.9% of all incoming requests in less than 5 milliseconds.
Tail latency typically rises dramatically when both the packet rate and TCP connection count are high. Twitter has a lot of both.
So when the company learned about the Intel Ethernet 800 Series Network Adapter with ADQ, it decided to put the technology to the test.
The Intel Ethernet 800 Series Network Adapters are designed for both speed and flexibility. The Intel Ethernet Network Adapter E810, for example, supports link speeds of up to 100 Gbps.
ADQ lets users open express lanes for mission-critical applications by steering their traffic onto dedicated sets of hardware queues. The result is better data throughput without the need to add servers to meet a workload requirement.
ADQ also helps data-center managers improve latency predictability. That’s important. Experienced networking engineers will tell you that predictable latency matters more than raw throughput: throughput measures your fastest components, but what causes SLA-wrecking delays is the components that run slowest.
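To see why the slowest requests, not the average, determine SLA compliance, consider a toy percentile calculation. The latency values below are invented for illustration (they are not Twitter’s measurements); a single slow outlier barely moves the median but completely defines the tail:

```python
import math

# Hypothetical latency samples in milliseconds -- illustrative values only.
# One slow outlier dominates the tail.
latencies_ms = [1.2, 1.3, 1.1, 1.4, 1.2, 1.3, 1.1, 1.2, 1.3, 48.0]

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample that is >= fraction p of all samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]

mean = sum(latencies_ms) / len(latencies_ms)
print(f"mean  = {mean:.2f} ms")                        # pulled up by the outlier
print(f"p50   = {percentile(latencies_ms, 0.50)} ms")  # the typical request is fast
print(f"p99.9 = {percentile(latencies_ms, 0.999)} ms") # the tail an SLO must cover
```

Here the median request looks healthy at 1.2 ms, yet the 99.9th percentile sits at 48 ms, which is why a target like Twitter’s (99.9% of requests under 5 milliseconds) is expressed as a tail percentile rather than an average.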
Twitter was also attracted by ADQ’s ease of use. Even sophisticated networking engineers appreciate products that can be implemented with speed and ease.
As Yue explains, Twitter was looking to accelerate its Pelikan Cache, a modular caching framework. The company also knew that serving data from an in-memory cache should be the fastest way to fulfill a request.
Pelikan Cache separates performance-sensitive processing from less performance-sensitive processing. The framework also separates different types of performance-sensitive processing from each other.
Cache is nothing new to Twitter. The company’s data environment already contains more than 400 cache clusters running on thousands of hosts, according to Yue. These caches must be fast and scalable. They also need to be operationally stable and flexible.
To evaluate ADQ, Twitter created a test plan that simulated its production environment. Cache instances were stacked, 24 per host, to mimic a containerized environment. Each cache back-end handled a large number of connections — up to 10,000 per instance — and payloads spanned a wide range of sizes. Altogether, that comes to 240,000 active connections.
The results included a 10x reduction in tail latencies with some clients. “The Intel Ethernet 800 Series Network Adapter with ADQ technology,” Yue writes in her blog post, “did an outstanding job in reducing tail latency of Remote Procedure Call (RPC) requests over a broad range of sizes and connection counts.”
“The consistent reduction in tail latencies,” she adds, “is the biggest improvement I’ve seen in a decade, to the point that I think we should upgrade our cache SLO [service level objective] to match.”
Twitter also found that ADQ was easy to set up. That’s important, Yue writes, because “a technology that can be readily deployed is one that can have actual impact.”
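On Linux, ADQ queue sets are typically configured with standard `ethtool` and `tc` commands. The sketch below is illustrative only: the interface name and TCP port are placeholders, and the exact queue layout and steps depend on the kernel version, the `ice` driver, and Intel’s ADQ configuration guide.

```shell
# Hypothetical interface and application port -- adjust for your environment.
iface=eth0

# Enable hardware traffic-class offload on the E810 interface.
ethtool -K "$iface" hw-tc-offload on

# Define two traffic classes: TC0 (default) and TC1 (the "express lane"),
# giving TC1 its own dedicated set of hardware queues.
tc qdisc add dev "$iface" root mqprio num_tc 2 map 0 1 queues 2@0 8@2 hw 1 mode channel

# Steer the cache application's TCP traffic (port 12321 here, a placeholder)
# into TC1 so it bypasses queues shared with other workloads.
tc qdisc add dev "$iface" clsact
tc filter add dev "$iface" protocol ip ingress prio 1 flower \
    dst_port 12321 ip_proto tcp skip_sw hw_tc 1
```

Because these are ordinary kernel tooling commands rather than a proprietary stack, the setup fits into existing deployment scripts — consistent with Yue’s point that a readily deployable technology is one that can have actual impact.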
Are your data-center customers looking for an easy and affordable way to gain latency predictability? Tell them about Intel Ethernet 800 Series Network Adapters with ADQ technology.