Conversation
Code Review SummaryStatus: 2 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
SUGGESTION
Files Reviewed (3 files)
Reviewed by gpt-5.2-codex-20260114 · 297,555 tokens |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #40 +/- ##
==========================================
+ Coverage 86.17% 87.05% +0.88%
==========================================
Files 11 13 +2
Lines 2850 3068 +218
Branches 561 604 +43
==========================================
+ Hits 2456 2671 +215
+ Misses 257 254 -3
- Partials 137 143 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| size_t end, | ||
| std::span<uint64_t> data, | ||
| node_index_t parent, | ||
| const std::function<size_t(node_index_t)>& get_middle = [](auto) { |
There was a problem hiding this comment.
SUGGESTION: Avoid std::function in the hot build path
std::function introduces type-erasure overhead per recursive call. If you can make BuildNode a template on the callable (or pass a small function object by value), you can keep the same behavior while eliminating indirect calls and potential allocations during construction.
|
|
||
| namespace pixie { | ||
|
|
||
| enum WaveletTreeBuildType { Standard, Huffman }; |
There was a problem hiding this comment.
It is better to use enum class. In current setup it will conflict with other enums using the same values, with enum class it will not be an issue, you'd need to specify as WaveletTreeBuildType::Standard though. Also please use all uppercase for enums, i.e. STANDARD, HUFFMAN
|
|
||
| enum WaveletTreeBuildType { Standard, Huffman }; | ||
|
|
||
| class WaveletTree { |
There was a problem hiding this comment.
Docstring description is required for class and public methods
| std::vector<node_index_t> leaves_; | ||
| std::vector<size_t> permutation_, inverse_permutation_; | ||
|
|
||
| node_index_t BuildNode( |
There was a problem hiding this comment.
snake_case for methods, i.e. build_node
| class WaveletTree { | ||
| using node_index_t = size_t; | ||
| struct WaveletNode { | ||
| static constexpr node_index_t kNil = |
There was a problem hiding this comment.
please use npos for consistency
| node_index_t BuildNode( | ||
| size_t begin, | ||
| size_t end, | ||
| std::span<uint64_t> data, |
There was a problem hiding this comment.
If the node is not suppose to change data than it should have type std::span<const uint64_t>
| gtest_main | ||
| ${PIXIE_DIAGNOSTICS_LIBS}) | ||
|
|
||
| add_executable(wavelet_tree_tests |
There was a problem hiding this comment.
Also add in the converage_report.sh and build-test workflow
| } | ||
|
|
||
| nodes_.emplace_back(middle, std::move(bit_vector), data.size()); | ||
| auto cut = std::stable_partition( |
There was a problem hiding this comment.
Please check how SDSL constructs WT: it does it in two passes over the data without intermediate buffers. All bitvectors are concotenated (optional improvement)
https://app.kilo.ai/s/605b1323-63b1-430e-be01-69dda2718f71
No description provided.