Core Data Structures — Concepts, Usage, and Trade-offs¶
TL;DR — What Problem This Solves¶
- Pick the right container for performance and clarity: ring buffers, deques, heaps, tries, sparse sets, etc.
- Each section gives a plain‑language “Use for / Pros / Cons” and a tiny code snapshot to copy.
- When in doubt, prefer simple .NET types; use these when you hit performance or ergonomics limits.
This guide covers several foundational data structures used across the library and when to use them.
Cyclic Buffer (Ring Buffer)¶
- What it is: Fixed-capacity circular array with head/tail indices that wrap.
- Use for: Streaming data, fixed-size queues, audio/network buffers.
- Operations: enqueue/dequeue in O(1); overwriting old data optional.
- Pros: Constant-time, cache-friendly, no reallocations at steady size.
- Cons: Fixed capacity unless resized; must handle wrap-around.
When to use vs .NET queues

- Prefer `CyclicBuffer<T>` over `Queue<T>` when you want bounded memory with O(1) push/pop at both ends and predictable behavior under backpressure (drop/overwrite the oldest, or pop proactively).
- Use `Queue<T>` when you need unbounded growth without wrap semantics.
API snapshot
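A minimal usage sketch. Member names below (`Add`, `TryPopFront`, the `capacity` constructor argument) follow the operations described in this section; exact signatures may differ:

```csharp
// Keep only the most recent 4 samples; Add overwrites the oldest once full.
var buffer = new CyclicBuffer<int>(capacity: 4);
for (int i = 0; i < 6; i++)
    buffer.Add(i);                      // buffer now holds 2, 3, 4, 5

if (buffer.TryPopFront(out int oldest))
    Console.WriteLine(oldest);          // pops the oldest surviving element

foreach (int sample in buffer)          // enumerates logical order, head first
    Console.WriteLine(sample);
```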
Tips and pitfalls
- Know your overflow policy: `Add` wraps and overwrites the oldest element only once capacity is reached; call `TryPopFront` periodically to keep the buffer from evicting data you still need.
- Iteration enumerates logical order starting at the head, not the underlying storage order.
- `Remove`/`RemoveAll` are O(n); keep hot paths to `Add`/`TryPop*` when possible.
Deque (Double-Ended Queue)¶
- What it is: Queue with efficient push/pop at both ends.
- Use for: Sliding windows, BFS frontiers, task schedulers.
- Operations: push_front/push_back/pop_front/pop_back in amortized O(1).
- Pros: Flexible ends; generalizes queue and stack behavior.
- Cons: Implementation complexity for block-based layouts.
When to use vs Queue<T> / Stack<T>
- Prefer `Deque<T>` when you need both `push_front` and `push_back` in amortized O(1).
- Use `Queue<T>` for simple FIFO; `Stack<T>` for LIFO only.
API snapshot
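A usage sketch; `PushFront`/`PushBack`/`PopFront`/`PopBack` are assumed C# spellings of the push_front/pop_back operations described above, and may differ from the actual API:

```csharp
var deque = new Deque<int>();
deque.PushBack(2);
deque.PushFront(1);            // front: 1, back: 2
deque.PushBack(3);             // front: 1, back: 3

int front = deque.PopFront();  // removes and returns 1
int back  = deque.PopBack();   // removes and returns 3

deque.TrimExcess();            // return spare capacity after a spike
```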
Tips
- Capacity grows geometrically as needed; call
TrimExcess()after spikes to return memory. - Indexer is in logical order (0 is front, Count-1 is back).
Binary Heap (Priority Queue)¶
- What it is: Array-backed binary tree maintaining heap-order (min/max).
- Use for: Priority queues, Dijkstra/A*, event simulation.
- Operations: push/pop in O(log n); peek O(1); build-heap O(n).
- Pros: Simple; great constant factors; contiguous memory.
- Cons: Not ideal for decrease-key unless augmented.
When to use vs SortedSet<T>
- Prefer `Heap<T>`/`PriorityQueue<T>` for frequent push/pop of the top element in O(log n) with low overhead.
- Use `SortedSet<T>` for ordered iteration and fast removal of arbitrary items (by key), at higher constant factors.
API snapshot (Heap)
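A sketch of typical heap usage; `Push`/`Pop`/`Peek`/`Count` are assumed member names matching the operations listed above:

```csharp
// Min-heap by default: Push/Pop in O(log n), Peek in O(1).
var heap = new Heap<int>();
heap.Push(5);
heap.Push(1);
heap.Push(3);

int min = heap.Peek();              // smallest element without removing it

while (heap.Count > 0)
    Console.WriteLine(heap.Pop());  // popping yields ascending order
```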
API snapshot (PriorityQueue)
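`CreateMax()` is documented in the Tips for this section; the `Push`/`Pop` member names are assumptions:

```csharp
// CreateMax() flips ordering without writing a custom comparer.
var pq = PriorityQueue<int>.CreateMax();
pq.Push(5);
pq.Push(9);
pq.Push(1);

Console.WriteLine(pq.Pop());   // largest element first

// No efficient decrease-key: reinsert the updated item instead.
pq.Push(42);
```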
Tips
- Use `PriorityQueue<T>.CreateMax()` to flip ordering without writing a custom comparer.
- Heaps don't support efficient decrease-key out of the box; reinsert updated items instead.
Disjoint Set (Union-Find)¶
- What it is: Structure tracking partition of elements into sets.
- Use for: Connectivity, Kruskal’s MST, percolation, clustering.
- Operations: union/find in amortized near O(1) with path compression + union by rank.
- Pros: Extremely fast for bulk unions/finds; minimal memory.
- Cons: Not suited for deletions or enumerating members without extra indexes.
When to use
- Batch connectivity queries where the graph mutates only via unions (no deletions): MST (Kruskal), island labeling, clustering, grouping by equivalence.
API snapshot (int-based)
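A sketch of the int-based variant; the type name `DisjointSet` and `Union`/`Find` member names are assumed from the operations described above:

```csharp
// 10 elements, each initially in its own set.
var dsu = new DisjointSet(10);
dsu.Union(1, 2);
dsu.Union(2, 3);

// Two elements are connected iff they share a representative.
bool connected = dsu.Find(1) == dsu.Find(3);   // true after the unions above
```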
API snapshot (generic)
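The generic variant works with domain objects and maps them to indices internally (per the Tips below); the constructor and member names here are illustrative assumptions:

```csharp
var groups = new DisjointSet<string>(new[] { "a", "b", "c" });
groups.Union("a", "b");

bool sameGroup = groups.Find("a") == groups.Find("b");   // true
bool separate  = groups.Find("a") == groups.Find("c");   // false
```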
Tips
- Use the generic variant to work with domain objects; internally it maps to indices.
- No deletions: rebuild if you need dynamic splits.
Sparse Set¶
- What it is: Two arrays (sparse and dense) enabling O(1) membership checks and iteration over active items.
- Use for: ECS entity sets, fast presence checks with dense iteration.
- Operations: insert/remove/contains in O(1); iterate dense in O(n_active).
- Pros: Very fast, cache-friendly on dense array; stable indices optional.
- Cons: Requires ID space for indices; sparse array sized by max ID.
When to use vs HashSet<T>
- Prefer `SparseSet` when your IDs are small integers (0..N) and you need O(1) contains with dense, cache-friendly iteration over active items.
- Use `HashSet<T>` for arbitrary keys, very large/unbounded key spaces, or when memory for the sparse array cannot scale to the max ID.
API snapshot (int IDs)
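A sketch assuming `Add`/`Remove`/`Contains` member names for the insert/remove/contains operations listed above:

```csharp
// Universe of IDs 0..99; O(1) Add/Remove/Contains.
var active = new SparseSet(capacity: 100);
active.Add(5);
active.Add(42);
active.Remove(5);              // swap-with-last: dense order is not stable

if (active.Contains(42))
    foreach (int id in active) // iterates only active IDs, cache-friendly
        Console.WriteLine(id);
```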
API snapshot (generic values)
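A sketch of the value-carrying variant; the `Add(id, value)`/`TryGet` shape is a hypothetical illustration of associating data with small-integer keys:

```csharp
// Values keyed by small-integer IDs (e.g., ECS components by entity ID).
var healthById = new SparseSet<float>(capacity: 100);
healthById.Add(7, 100f);

if (healthById.TryGet(7, out float hp))
    healthById.Add(7, hp - 25f);   // overwrite to update in place
```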
Tips
- Capacity equals the key universe size: it must cover your maximum possible ID (max ID + 1), so avoid sparse sets when that maximum is huge.
- Deletions swap-with-last in dense array; dense order is not stable.
Trie (Prefix Tree)¶
- What it is: Tree keyed by characters for efficient prefix-based lookup.
- Use for: Autocomplete, spell-checking, dictionary prefix queries.
- Operations: insert/search O(m) where m is key length.
- Pros: Predictable per-character traversal; supports prefix enumeration.
- Cons: Memory overhead vs hash tables; compact with radix/compressed tries.
When to use vs dictionaries
- Prefer `Trie` for heavy prefix querying and auto-complete, where per-character traversal beats repeated hashing.
- Use `Dictionary<string, T>` when you rarely do prefix scans and primarily need exact lookup.
API snapshot
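A sketch reflecting the build-once design noted in the Tips (full vocabulary at construction, no public insert); `Contains` and `StartsWith` are assumed member names:

```csharp
// Built once from the full vocabulary; immutable afterwards.
var trie = new Trie(new[] { "car", "card", "care", "dog" });

bool exact = trie.Contains("card");          // O(m) in the key length

foreach (string word in trie.StartsWith("car"))
    Console.WriteLine(word);                 // enumerates car, card, care
```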
Tips
- Build once with full vocabulary; Tries here are immutable post-construction (no public insert) to stay compact.
- Memory scales with total characters; very large alphabets or long keys benefit from compressed/radix tries (not included here).
Bitset¶
- What it is: Packed array of bits for boolean sets and flags.
- Use for: Fast membership bitmaps, masks, filters, small Bloom filters.
- Operations: set/clear/test O(1); bitwise ops on words are vectorizable.
- Pros: Extremely compact; very fast bitwise operations.
- Cons: Fixed maximum size unless dynamically extended; needs index mapping.
When to use vs bool[] / HashSet<int>
- Prefer `BitSet` for dense boolean sets with fast bitwise ops (masks, layers, filters) and compact storage.
- Use `bool[]` for tiny, fixed schemas you manipulate rarely; use `HashSet<int>` for sparse, very large universes.
API snapshot
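A sketch assuming `Set`/`Clear`/`Test` member names for the set/clear/test operations above, plus an in-place `And` for masking:

```csharp
var layers = new BitSet(64);
layers.Set(3);
layers.Set(10);
layers.Clear(3);
bool on = layers.Test(10);   // true

// Bitwise ops run a word at a time; shifts zero-fill at the edges.
var mask = new BitSet(64);
mask.Set(10);
layers.And(mask);            // keep only bits present in the mask
```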
Tips
- Capacity grows automatically when setting beyond bounds; prefer sizing appropriately upfront for fewer resizes.
- Left/Right shift drop/zero-fill at the edges; use with care if capacity is small.
Quick Selection Guide¶
- Need O(1) membership and dense iteration: Sparse Set
- Need priority scheduling: Binary Heap
- Need two-ended queueing: Deque
- Need circular fixed-capacity buffer: Cyclic Buffer
- Need prefix search: Trie
- Need compact boolean set: Bitset
- Need dynamic connectivity: Disjoint Set
- Need auto-evicting key-value store: Cache
Common pitfalls
- Sparse Set capacity equals the max key + 1; allocating for huge key spaces is memory-heavy.
- Heaps don’t give you sorted iteration; popping yields order, but enumerating the heap array is not sorted.
- Cyclic Buffer `Remove`/`RemoveAll` are O(n); keep hot paths to `TryPop*`/`Add`.
Complexity Summary¶
- Cyclic Buffer: enqueue/dequeue O(1)
- Deque: push/pop ends amortized O(1)
- Heap: push/pop O(log n), peek O(1)
- Disjoint Set: union/find ~O(1) amortized (with heuristics)
- Sparse Set: insert/remove/contains O(1), iterate dense O(n_active)
- Trie: insert/search O(m)
- Bitset: set/test O(1), bitwise ops O(n/word_size)
- Cache: get/set/remove O(1), expiration scan O(n)
Notes on constants
- All structures are allocation-aware (enumerators avoid boxing; internal buffers reuse pools where applicable). Real-world throughput is often more important than asymptotic notation; these implementations are tuned for Unity/IL2CPP.
Cache (LRU/LFU/SLRU/FIFO/Random)¶
- What it is: High-performance key-value cache with configurable eviction policies and time-based expiration.
- Use for: Memoization, asset lookups, network response caching, session data.
- Operations: get/set/remove in O(1); supports weighted entries, auto-loading, and statistics.
- Pros: Multiple eviction strategies; fluent builder API; jitter for thundering herd prevention.
- Cons: Memory overhead for tracking access patterns; requires configuration for optimal performance.
```mermaid
flowchart LR
  subgraph Cache Operations
    Get[Get] --> Hit{Hit?}
    Hit -->|Yes| Return[Return Value]
    Hit -->|No| Load[Load/Miss]
    Set[Set] --> Evict{At Capacity?}
    Evict -->|Yes| Policy[Apply Eviction Policy]
    Evict -->|No| Store[Store Entry]
    Policy --> Store
  end
```

When to use¶
- Use `Cache<TKey, TValue>` when you need automatic eviction, expiration, or access tracking.
- Use `Dictionary<TKey, TValue>` for simple lookups without size limits or expiration.
- Use `ConcurrentDictionary<TKey, TValue>` for thread-safe access without cache semantics.
Eviction Policies¶
| Policy | Description | Best For |
|---|---|---|
| LRU | Evicts least recently used entry | General purpose, most common |
| SLRU | Segmented LRU with probation/protected segments | High-throughput with scan resistance |
| LFU | Evicts least frequently used entry | Stable access patterns |
| FIFO | First in, first out eviction | Render caches, predictable eviction |
| Random | Random eviction | Low overhead, uniform access |
API snapshot (Basic usage)¶
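A basic sketch using the members from the table below (`Set`, `TryGet`, `TryRemove`); the constructor's `maximumSize` argument is an assumption:

```csharp
var cache = new Cache<string, string>(maximumSize: 256);
cache.Set("config", "{ \"volume\": 0.8 }");

if (cache.TryGet("config", out string json))   // false once expired or evicted
    Console.WriteLine(json);

cache.TryRemove("config");
```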
API snapshot (Loading cache with auto-compute)¶
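A sketch of miss-handling with `GetOrAdd` (documented below); `LoadFromDisk` is a placeholder for your own loader:

```csharp
var cache = new Cache<int, string>(maximumSize: 128);

// The factory runs only on a miss; with a configured loader it is optional.
string value = cache.GetOrAdd(42, key => LoadFromDisk(key));  // computed once
string again = cache.GetOrAdd(42, key => LoadFromDisk(key));  // served from cache
```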
API snapshot (Advanced configuration)¶
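A hypothetical builder shape, since the library advertises a fluent builder API; the exact method names here are assumptions apart from the documented `OnEviction` callback:

```csharp
var cache = Cache<string, byte[]>.Builder()
    .MaximumSize(1024)
    .ExpireAfterWrite(TimeSpan.FromMinutes(5))
    .WithJitter()                 // spreads expirations to avoid thundering herd
    .RecordStatistics()           // adds overhead; enable only when debugging
    .OnEviction((key, value) => Console.WriteLine($"evicted {key}"))
    .Build();
```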
API snapshot (Weighted caching)¶
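A weighted-eviction sketch using the `MaximumWeight` + `Weigher` options named in the Tips; the builder shape is an assumption:

```csharp
// Size-based eviction: entries are charged by payload size, not count.
var cache = Cache<string, byte[]>.Builder()
    .MaximumWeight(64 * 1024 * 1024)          // 64 MB budget
    .Weigher((key, bytes) => bytes.Length)    // weight = payload size in bytes
    .Build();
```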
CachePresets (Ready-to-use configurations)¶
Use CachePresets for common scenarios:
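An illustrative sketch; the preset names below are hypothetical, so check `CachePresets` for the actual list:

```csharp
// Ready-made configurations for common scenarios (names are illustrative).
var memo    = CachePresets.Memoization<string, int>();
var network = CachePresets.NetworkResponses<string, byte[]>();
```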
Cache Properties and Methods¶
| Member | Description |
|---|---|
| Count | Current number of entries |
| Size | Total weight (weighted caching) or count |
| MaximumSize | Configured maximum entries |
| Capacity | Current internal capacity (may be < MaximumSize) |
| Keys | Enumerable of all keys (allocates state machine) |
| TryGet(key, out value) | Returns true if key exists and is not expired |
| Set(key, value) | Adds or updates an entry |
| GetOrAdd(key, factory) | Gets existing or computes and caches a new value (factory optional with loader) |
| TryRemove(key) | Removes entry if present, returns bool |
| TryRemove(key, out value) | Removes entry and returns the removed value |
| ContainsKey(key) | Checks if key exists |
| Clear() | Removes all entries |
| CleanUp() | Forces an expiration scan |
| Compact(ratio) | Evicts a percentage of entries |
| Resize(newSize) | Changes maximum size |
| GetStatistics() | Returns hit/miss/eviction stats (if enabled) |
| GetAll(keys, dict) | Batch get into a provided dictionary |
| SetAll(items) | Batch set from a collection |
| GetKeys(list) | Zero-allocation key enumeration into a provided list |
Tips and pitfalls¶
- Choose the right preset: `CachePresets` provides optimized defaults for common scenarios.
- Enable statistics sparingly: recording stats adds overhead; enable only when debugging.
- Use jitter for network caches: it prevents a thundering herd when many entries expire together.
- Consider SLRU for high throughput: better scan resistance than plain LRU.
- Watch `InitialCapacity`: the cache clamps initial capacity to prevent `OutOfMemoryException`; don't set it larger than needed.
- Weighted caches: use `MaximumWeight` + `Weigher` for size-based eviction (e.g., texture bytes).
- Callbacks are synchronous: `OnEviction`, `OnGet`, and `OnSet` run on the calling thread.