Skip to content

An enhanced suffix tree than can operate as a sliding window

Notifications You must be signed in to change notification settings

qdx/EnhancedSuffixTree

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

78 Commits
 
 
 
 
 
 

Repository files navigation

Enhanced Suffix Tree

An enhanced suffix tree than can operate as a sliding window.

The classic Ukkonen's algorithm is implemented, which is maintained in the classic_suffixtree branch. The implementation is primarily based on this Stack Overflow post. http://stackoverflow.com/questions/9452701/ukkonens-suffix-tree-algorithm-in-plain-english In the implementation and comment, I tried to abide to the same notation and names with this post. This implementation has O(1) insert time, so the build time is O(n). It took O(n) space too.

In addition, an enhanced suffix tree is proposed and implemented. Here is a comparison between the classic suffix tree and the enhanced suffix tree.

Classic suffix tree Enhanced suffix tree
singly linked doubly linked
keeps the sequence of input keeps the sequence of input and reference to leaf nodes
O(n) time to build, can build incrementally O(n) time to build, can build incrementally
Need a special ending character to do pattern match correctly Use recursive tree structure to eliminate the restriction of ending character
Cannot remove inserted input Can remove from the head of the input. Current implementation need O(n) time to remove from head, but in theory it can be O(log(n))

Also, a simple regular expression engine that support "+", "*", "?", ".", "|", "()", "" operators. The regular expression can do pattern match on suffix tree in about O(s) time, where s is number of matches found in the target corpus.

Currently, the data structure has been tested and works correctly. But the code still need heavy refactoring and better documentation. Be prepared to face a lot of dirty code and scattered comment if you decided to use this.

About

An enhanced suffix tree than can operate as a sliding window

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages