Conversation
|
@Matthijsy this should address at least one of the issues you raised in #38. Let me know if this improves things for you. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #39 +/- ##
=====================================
Coverage 0.00% 0.00%
=====================================
Files 151 150 -1
Lines 4163 4186 +23
=====================================
- Misses 4163 4186 +23
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Yes the |
There was a problem hiding this comment.
Pull request overview
This PR refactors the ESE database cursor implementation to be more similar to dissect.apfs, replacing the BTree class with a new RawCursor class that provides better control over B+Tree navigation. It also enhances NTDS query capabilities to handle wildcard patterns (including prefix wildcards like *value and complex patterns like h*d*r) and gracefully handles missing indexes by falling back to table scans instead of raising errors.
Changes:
- Replaced
BTreeclass withRawCursorclass for more flexible B+Tree navigation - Enhanced NTDS query support for complex wildcard patterns and missing indexes
- Improved error messages and debugging information in page representations
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
dissect/database/ese/btree.py |
Removed entire BTree class implementation |
dissect/database/ese/cursor.py |
Added new RawCursor class with stack-based B+Tree navigation; updated Cursor to use RawCursor |
dissect/database/ese/table.py |
Migrated from BTree to RawCursor for long value retrieval |
dissect/database/ese/page.py |
Enhanced repr with more debug info; improved error messages |
dissect/database/ese/ntds/query.py |
Added support for complex wildcards and missing indexes; improved wildcard matching with regex |
tests/ese/ntds/test_query.py |
Updated tests for new wildcard patterns and missing index behavior |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
This PR now also incorporates an important fix where if an object had exactly one child, it would not return that when calling |
twiggler
left a comment
There was a problem hiding this comment.
First wave, perhaps more comments later today or monday.
| break | ||
|
|
||
|
|
||
| class RawCursor: |
There was a problem hiding this comment.
Consider moving to rawcursor.py
There was a problem hiding this comment.
I considered it but I think for the time being I'd like them to be together.
dissect/database/ese/cursor.py
Outdated
| exact: Whether to only return successfully on an exact match. | ||
|
|
||
| Returns: | ||
| The node number of the first node that's greater than or equal to the key. |
There was a problem hiding this comment.
When exact=False and a key which is strictly greater than every node, we return lo, which is strictly less.
Not a bug, but an inaccuracy in the docstring
There was a problem hiding this comment.
Could you elaborate a little more?
There was a problem hiding this comment.
Thanks, that clears it up!
There was a problem hiding this comment.
The issue is just a tiny mismatch in the docstring for one specific boundary condition.
Here is the scenario where the description doesn't quite match the output:
The Edge Case: Searching for an Oversized Key
(ai-generated example)
Imagine we have a page with four node keys: [10, 20, 30, 40].
Now, let's say the search key is 50, and exact=False.
The Loop: The while lo < hi: loop runs. Because 50 is greater than every mid value, the code repeatedly executes lo = mid + 1.
The Exit: This continues until lo reaches the very last index of the page (index 3, where the key is 40). At this point, lo == hi, and the loop exits.
The Return: The code does its final comparison. Since exact is False, the function simply returns lo (which is 3).
The Disconnect
The function returned the node at index 3 (key 40).
The docstring states the function returns: "The node number of the first node that's greater than or equal to the key."
However, 40 is strictly less than the search key of 50.
The logic does exactly what it needs to do (landing on the highest possible bound to route the search down the correct right-most branch). It's a perfectly valid fallback, but the docstring currently doesn't mention this right-side boundary outcome.
Suggested Docstring Tweak
Updating the Returns: section to something like this would clear it right up:
The node number of the first node that is greater than or equal to the key. If the search key is greater than all nodes on the page, it returns the index of the last node.
Note that this causes a regression in ADAM parsing (infinite loop) but that's because the way we resolve the root domain is shoddy. #43 fixes this, I'll move the child fix to that PR. |
This reverts commit b4a2421.
Model the cursor implementation a little bit more like the one in
dissect.apfs. Should give a little bit more flexibility and hopefully make the implementation a little bit easier to understand.Also improve NTDS querying to allow for wildcard prefixes and missing indexes.