Skip to content

fix(tree): std::path::Component has infallible conversion to &[u8]#2458

Merged
Sebastian Thiel (Byron) merged 3 commits intoGitoxideLabs:mainfrom
datdenkikniet:direct-u8
Mar 22, 2026
Merged

fix(tree): std::path::Component has infallible conversion to &[u8]#2458
Sebastian Thiel (Byron) merged 3 commits intoGitoxideLabs:mainfrom
datdenkikniet:direct-u8

Conversation

@datdenkikniet
Copy link
Copy Markdown
Contributor

@datdenkikniet datdenkikniet commented Mar 6, 2026

The checks/conversions done in the _by_path methods in Tree and TreeRefIter are unnecessary, so this PR removes them in favor of using OsStr::as_encoded_bytes().

Since as_os_str().as_encoded_bytes() takes &self and (as far as I know) no platforms use internal mutability on OsStr, the underlying value must already have the form we want. The only thing that os_str_into_bstr does for us is validate that the byte string contains valid UTF-8. However, the only thing these BStrs are used for is comparison to other BStrs. Whether either of them is valid UTF-8 does not impact the function in any way (as string conversion errors are silently ignored), so we can simply skip the whole validation step and get ever-so-slightly-cheaper lookups by path.

Please let me know if this change is useful/welcome or not, I can imagine that there may be some case I have not thought of that breaks from this change.

Instead of relying on a fallible path and using a default value,
we can convert directly between a `Component` and a `&[u8]`.
Copy link
Copy Markdown
Member

@Byron Sebastian Thiel (Byron) left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, this is a great catch!

The previous implementation clearly worked around the fallibility of the conversion call, and the new version is faster with a similar failure mode: If this lookup is made on a platform with different-than-Git path encodings, and Git may try to turn paths to UTF-8, the equality comparison might not pan out.

But getting this right is hard as:

  • it doesn't care about the case-sensivity settings of the repository
  • it will happily compare different encodings and fail for that reason, with the only way to know how to do that by knowing which Git (for Windows) client was used to create the tree

So this seems to be an improvement, and for everything else let's wait and see.

- fix lint

The previous implementation clearly worked around the fallibility of the conversion call, and the new version is faster with a similar failure mode: If this lookup is made on a platform with different-than-Git path encodings, and Git may try to turn paths to UTF-8, the equality comparison might not pan out.

But getting this right is hard as:

* it doesn't care about the case-sensivity settings of the repository
* it will happily compare different encodings and fail for that reason, with the only way to know how to do that by knowing which Git (for Windows)  client was used to create the tree

So this seems to be an improvement, and for everything else let's wait and see.
@Byron Sebastian Thiel (Byron) merged commit 2ce6dde into GitoxideLabs:main Mar 22, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants