diff --git a/README.md b/README.md
index ce7cd0f5a62ff015e66886d4c9a9a60febfa33ab..ff05bdb8ffa9c9893e942a39db32b6d2ead86ae7 100644
--- a/README.md
+++ b/README.md
@@ -1,83 +1,103 @@
-# FB+-tree
-In essence, FB+-tree is identical to a main-memory B+-tree except for the layout of inner node and leaf node.
-Similar to employing bit or byte for branch operation in tries, FB+-tree progressively considers several bytes
-following the common prefix on each level of inner nodes, referred to as features. By incorporating features,
-FB+-tree blurs the lines between B+-trees and tries, allowing FB+-tree to benefit from prefix skewness. In the
-best case, FB+-tree almost becomes a trie, whereas in the worst case, it continues to function as a B+-tree.
-In most cases, branch operations can be achieved by feature comparison thus enhancing its cache consciousness.
-Furthermore, feature comparison is implemented with SIMD instructions in a simple for loop, which mitigates
-dependences between instructions in comparison to binary search, allowing FB+-tree to leverage computational
-(instruction-level parallelism and data-level parallelism) and memory-level parallelism, such as, dynamic hardware
-scheduling, super-scalar, dynamic branch prediction, speculative execution and a series of out-of-order execution techniques.
-
-In concurrent environments, with feature comparison, FB+-tree effectively alleviates small, random and irregular
-memory access generated by binary search, thus significantly improving the utilization of memory bandwidth and
-Ultra Path interconnect (UPI) bandwidth. Eventually, FB+-tree are more multi-core scalable than typical B+-tree even
-on read-only workloads. Unlike typical B+-trees that copy anchor keys into inner nodes, FB+-tree stores the actual
-contents of anchor keys in leaf nodes (i.e., high_key, the upper bound), while inner nodes only maintain pointers
-to high_key, which makes FB+-tree more space-efficient. Since high_key only represents the upper bound of a leaf
-node, it can be constructed using discriminative prefixes to improve performance and space consumption.
-
-# Synchronization Protocol
-FB+-tree employs a highly optimized optimistic synchronization protocol for concurrent index access.
-It highlights:
-* latch-free index traversal, without the comparison overhead with high_key in most cases
-* **highly scalable latch-free update (subtle atomic operations coordinated with optimistic lock)**
-* concurrent linearizable range scan on linked list of leaf nodes, and lazy rearrangement
-
-# Index Structures
-There is an example in each index directory.
-* [ARTOLC](https://github.com/wangziqi2016/index-microbench.git)
-* BLinkTree: lock-based B-link-tree, implemented based on the paper by YAO. et al. just a demo, may have some bugs
-* [B+-treeOLC](https://github.com/wangziqi2016/index-microbench.git)
-* [FAST](https://github.com/RyanMarcus/fast64.git): a simple implementation of FAST in Rust, only support bulk_load.
-* FB+-tree
-* [GoogleBtree](https://code.google.com/archive/p/cpp-btree/)
-* [HOT](https://github.com/speedskater/hot.git)
-* [Masstree](https://github.com/kohler/masstree-beta.git)
-* [ARTOptiQL](https://github.com/sfu-dis/optiql)
-* [STX B+-tree](https://github.com/tlx/tlx.git)
-* [Wormhole](https://github.com/wuxb45/wormhole.git)
-
-# Requirements
-* x86-64 CPU supporting SSE2 or AVX2 or AVX512 instruction set
-* intel Threading Building Blocks (TBB) (`apt install libtbb-dev`)
-* jemalloc (`apt install libjemalloc-dev`)
-* google-glog (`apt install libgoogle-glog-dev`)
-* a C++17 compliant compiler
-
-# API
-```
-KVPair* lookup(KeyType key)
+# Index Research
+
+This project contains various index structures and their implementations for research purposes.
+
+## Supported Index Structures
+
+- **ART (Adaptive Radix Tree)**
+- **BLink Tree**
+- **BTreeOLC (Optimistic Lock Coupling B-Tree)**
+- **FAST (Fast Architecture Sensitive Tree)**
+- **FBTree (Feature B-Tree)**
+- **HOT (Highly Optimized Trie)**
+- **MassTree**
+- **Google B-Tree**
+- **STX B-Tree**
+- **Wormhole (High-performance hash table)**
+
+## Requirements
+
+- C++17 compatible compiler
+- CMake (3.14 or higher)
+- For some components:
+  - Rust (for FAST)
+  - Python (for benchmarks)
 
-KVPair* update(KVPair* kv)
+## API Documentation
 
-KVPair* upsert(KVPair* kv)
+Each index structure has its own implementation with varying APIs. Please refer to the respective header files for detailed API documentation:
 
-KVPair* remove(KeyType key)
+- ART: `ARTOLC/Tree.h`
+- BLink Tree: `BLinkTree/b_link_tree.h`
+- BTreeOLC: `BTreeOLC/BTreeOLC.h`
+- FAST: `FAST/fast64/fast64.h`
+- FBTree: `FBTree/fbtree.h`
+- HOT: `HOT/include/hot/rowex/HOTRowex.hpp`
+- MassTree: `MassTree/masstree.hh`
+- Google B-Tree: `GoogleBTree/btree_map.h`
+- STX B-Tree: `STX/tlx/tlx/container/btree_map.hpp`
+- Wormhole: `wormhole/wh.h`
 
-iterator begin()
+## Getting Started
 
-iterator lower_bound(KeyType key)
+### Building the Project
 
-iterator upper_bound(KeyType key)
+```bash
+mkdir build
+cd build
+cmake ..
+make -j$(nproc)
 ```
 
-# Get Started
-1. Clone this repository and initialize the submodules
+### Running Examples
+
+Each index structure has example code demonstrating basic usage:
+
+- ART: `ARTOLC/example.cpp`
+- BLink Tree: `BLinkTree/example.cpp`
+- BTreeOLC: `BTreeOLC/example.cpp`
+- FAST: `FAST/example.cpp`
+- FBTree: `FBTree/example.cpp`
+- HOT: `HOT/example.cpp`
+- MassTree: `MassTree/example.cpp`
+- Google B-Tree: `GoogleBTree/example.cpp`
+- STX B-Tree: `STX/example.cpp`
+- Wormhole: `wormhole/easydemo.c`
+
+Compile and run examples:
+
+```bash
+# For C++ examples
+./example_binary
+
+# For Wormhole demo
+./wormhole/easydemo
 ```
-git clone
-cd
-git submodule init
-git submodule update
+
+## Benchmarks and Performance Testing
+
+The `test` directory contains performance testing frameworks:
+
+- `test/cache_miss.cpp` - Cache miss testing
+- `test/ycsb_test.cpp` - YCSB-style workload testing
+
+To run benchmarks:
+
+```bash
+# Generate test data
+./ycsb_build
+
+# Run tests
+./ycsb_test
 ```
-2. Create a new directory *build* `mkdir build && cd build`
-3. Build the project `cmake -DCMAKE_BUILD_TYPE=Release .. && make -j`
-4. Run the example `./FBTree/FBTreeExample 10000000 1 1`
-
-# Notes
-* Currently, we do not implement a single-threaded version. We will later implement a single-threaded version with
-  more optimizations, such as embedding key-value into leaf nodes, larger leaf nodes (128), a more rational split tactic.
-* To evaluate the performance/scalability of concurrent remove, disable `free` interface to mitigate cross-thread
-  memory release overhead (for example, acquire a lock on an arena in jemalloc)
-* previous implementation during development: https://gitee.com/spearNeil/blinktree.git and https://gitee.com/spearNeil/tree-research.git
\ No newline at end of file
+
+## Notes
+
+1. The implementation includes both original and modified versions of various index structures.
+2. Performance characteristics will vary based on workload and hardware architecture.
+3. Some implementations are optimized for specific scenarios (e.g., cache-optimized, persistent memory, etc.).
+4. For production use, carefully evaluate the suitability of each index structure for your specific workload.
+
+## License
+
+Please check individual license files for each component as licensing varies across the different index implementations.
\ No newline at end of file