-
Notifications
You must be signed in to change notification settings - Fork 15
Home
SDSL provides a variety of well engineered succinct data structure libraries in a consistent C++ framework. The library includes extensive documentation and is well tested. The library contains implementations for the following data structures:
- Compressed bit vector representations
- Rank and Select data structures for bit vectors
- Uncompressed and compressed representation for general sequences
- Rank and Select data structures for general sequences (e.g. wavelet trees)
- Support data structure for balanced parentheses sequences
- Succinct range minimum query (RMQ) data structures
- Compressed suffix arrays (CSA)
- Compressed longest common prefix (LCP) array representations
- Compressed suffix trees (CST), FM-Indexes
We have successfully applied the library to handle inputs in the gigabyte range. We have for instance build FM-Indexes and compressed suffix trees for inputs up to 64 GB. Our FM-Index implementations use space close to the best state-of-the-art compressors and outperform older implementations. Here are the benchmark results.
The following steps describe how to download install libsdsl.
-
Clone the current stable version of libsdsl using git:
git clone https://github.com/simongog/sdsl.git
-
Change into the sdsl directory and run the install script
cd sdsl ./install.sh [EXISTING_DIR_FOR_HEADER_AND_LIBFILES_AND_NOT_THE_CURRENT_DIR]
-
Run all tests to make sure the library was installed correctly: cd test make && make test
-
You find a Makefile to compile your programs in the examples directory.
-
Compile some of the Examples and Tutorials available in this wiki.
This wiki contains a wide range of Examples and Tutorials.
The code itself is documented and the auto-generated documentation can be accessed at http://simongog.github.com/sdsl/
- The library is tested under Linux and Mac OS. There are only a few operation depended pieces in the code (like the place of temporary files). So it should be easy also run it on Microsoft Windows with the mingw environment.
- The number of elements in all data structure is limited by (2^{61}).
- The indexing part of the library can process text with an alphabet size of 255 (the reason for that: the 0-byte is not allowed to occur in the text is appended to the text as special character.)