With WT constructed via Huffman tree it can be used as an alternative way of storing Huffman encoded files, as a bonus WT allows extraction of arbitrary substring [l, r] in O(r-L) i.e. only scanning related positions.
Please note the construction and serialization/deserialization of SDSL
https://github.com/simongog/sdsl-lite/blob/master/include/sdsl/wt_pc.hpp
And AI review of the process
https://app.kilo.ai/s/605b1323-63b1-430e-be01-69dda2718f71
It is good present the result in a form of binary utility that work in two modes:
- Compression: given the input file construct WT with Huffman shape and serialize it.
- Substring extraction: Extract content from [l, r] of the initial file from a WT archieve.
- Lines extraction: Extract content of lines [l, r] of the initial file from a WT archieve. This is done the same way as previous with addition that we need additional two selects of '\n'
As of now Pixie doesn't have any CLI tools, usage of argparse is encouraged via CMake FetchContent.
With WT constructed via Huffman tree it can be used as an alternative way of storing Huffman encoded files, as a bonus WT allows extraction of arbitrary substring [l, r] in O(r-L) i.e. only scanning related positions.
Please note the construction and serialization/deserialization of SDSL
https://github.com/simongog/sdsl-lite/blob/master/include/sdsl/wt_pc.hpp
And AI review of the process
https://app.kilo.ai/s/605b1323-63b1-430e-be01-69dda2718f71
It is good present the result in a form of binary utility that work in two modes:
As of now Pixie doesn't have any CLI tools, usage of argparse is encouraged via CMake FetchContent.