HashCache
A key-value store datastructure with high speed and hard guarantees
HashCache combines a key-value datastructure with a cache-like implicit replacement policy, a combination that is needed frequently especially in network processing. It is a proprietary algorithm that is designed to be particularly well suited for implementation in hardware and provides excellent lookup, update and insertion performance. With its speed, it solves a fundamental issue in stateful network appliances: how to retrieve and store new state information at sufficient speed such that an attacker cannot overwhelm the device with state‐creating requests.
We provide HashCache as an IP-core but also use it in our own designs.
Background
Dictionaries
Key-value stores are a very common concept of a datastructure in computer science and also go by the name of dictionaries. A key-value store contains a collection of records that are stored under and retrieved by a key. They key in this case is not some form of encryption material, but rather a freely chosen identifier of the record that acts as a bookmark.
This form of associative memory is often used in network processing where the stored records can be the context and state of each network connection and the key is derived from the 5-tuple in the packets’ headers. Upon receiving a network packet, a network appliance can derive the connection’s identifying key from the packet and look up the information stored for the connection that the packet belongs to.
Caches
HashCache combines this dictionary datastructure with a cache-like implicit replacement policy. Like every real world datastructure, HashCache has a limited capacity bounded by its underlying storage medium. Failure to add new entries, e.g. because the capacity is exhausted, can be extremely undesireable. In the context of network appliances, this would mean the refusal to accept new connections, thus opening a severe vulnerability for DoS attacks.
Like a cache, HashCache automatically manages storage lifetime and locates and deletes unused entries when space for new ones is needed.
Why HashCache
While HashCache is a general purpose datastructure, we developed it out of a particular need in stateful network appliances and DPUs: how to let the hardware accelerated “data path” handle not just exisiting connections but also new connections, something usually handled by a slow, DoS vulnerable software fallback “control path”. This ability to add new connections or entries to the datastructure from the hardware accelerated “data path” is sometimes called “self learning” in the network domain. It requires the insertion mechanism and implicit replacement to not just be accessible from the data path, but also to operate at similar speeds.
Modern and future network appliances are expected to handle 100-400 GBit/s of traffic. This translates to 10s-100s of millions of packets per second, which a key-value store must be able to look up in real time. In order to be resilient to DoS attacks or connection-heavy machine2machine communication, these implementations must also support inserts at similar rates. Finally, if active sessions are to be preserved even under heavy load or attack, the key-value store needs to also have a large storage capacity, as well as resilient replacement policies to decide which sessions to drop once the capacity is reached.
Requirements for network appliances:
- High Lookup Rate: 10s - 100s of millions/sec
- Resilient Policy: Guaranteed state retention time by only replacing unused records.
- High Insert Rate: 10s - 100s of millions/sec
- High Capacity: 100s - 1000s of millions
- Thread Scalability: The ability to trade area (more workers) for more performance
When exploring the state of the art, we found that none of the existing approaches offered these ideal properties for high speed, internet facing network appliances.
Properties & Experiments
HashCache is designed to fulfill these requirements. It is build around a hash table that provide high lookup rates. But it also achieves high insert rates through an implicit replacement policy that can locate to-be-replaced records in constant time while guaranteeing to not replace recently used records.
Even with DDR4‐SDRAM, HashCache is capable of processing 100+ million packets per second and per memory channel, enabling stateful processing of steps that in the past had to be handled statelessly with all the associated downsides and problems.
Further Reading
For further information, see: