
fix: accept any IO[bytes] object in convert_to_bytes() #4241

Closed
bittoby wants to merge 11 commits into Unstructured-IO:main from bittoby:fix/zipextfile-convert-to-bytes

Conversation

bittoby (Contributor) commented Feb 16, 2026

Closes: #4097

Problem

When a user opens a file from inside a zip archive and passes it directly into partition(),
the library crashes with ValueError: Invalid file-like object type.

Uploading a text file directly works fine. Uploading a zip containing that same text file
fails every time, even though the file inside the zip is perfectly readable.

Root Cause

The convert_to_bytes() function only accepted a fixed list of known file types.
Anything not on that list was immediately rejected with an error — no attempt to read it,
no fallback, just a crash.

The file object returned when opening a file from a zip archive is not on that list,
so it was always rejected. This was a flaw in the design of the function: it checked
what the file was instead of checking what it could do.
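
The failure mode described above can be sketched as follows. This is an illustration, not the library's actual code: `strict_convert_to_bytes` is a hypothetical stand-in, and the list of accepted types in it is an assumption. It shows that the object returned by `ZipFile.open()` is a `zipfile.ZipExtFile`, which is readable but fails a fixed `isinstance` list.

```python
import io
import zipfile
from tempfile import SpooledTemporaryFile


def strict_convert_to_bytes(file):
    """Hypothetical stand-in: accepts only a fixed list of known types."""
    if isinstance(file, bytes):
        return file
    if isinstance(file, io.BytesIO):
        return file.getvalue()
    if isinstance(file, SpooledTemporaryFile):
        file.seek(0)
        data = file.read()
        file.seek(0)
        return data
    # Anything else is rejected without any attempt to read it.
    raise ValueError("Invalid file-like object type")


# Build a small zip archive in memory with one text member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("note.txt", "hello from inside a zip")

archive = zipfile.ZipFile(io.BytesIO(buf.getvalue()))
member = archive.open("note.txt")

print(type(member).__name__)            # ZipExtFile
print(isinstance(member, io.BytesIO))   # False
try:
    strict_convert_to_bytes(member)
except ValueError as exc:
    print(exc)                          # Invalid file-like object type

# The object itself is perfectly readable.
print(member.read())                    # b'hello from inside a zip'
```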

Fix

Replaced the rigid type-checking approach with a simple capability check.
Before giving up, the function now asks: does this object support reading?
If yes, read it. This makes the function behave correctly for zip archive files
and any other standard readable file object that was previously unrecognised,
without changing how any of the existing accepted types are handled.
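
A minimal sketch of what such a capability check might look like, assuming the fallback simply calls `read()` on any object that has one. The name `lenient_convert_to_bytes` and the exact branch order are hypothetical; the PR's actual diff is not shown in this conversation.

```python
import io
import zipfile


def lenient_convert_to_bytes(file):
    """Hypothetical sketch: fall back to read() for any readable object."""
    # Existing fast paths stay as they were (assumed subset shown here).
    if isinstance(file, bytes):
        return file
    if isinstance(file, io.BytesIO):
        return file.getvalue()

    # Capability check: ask what the object can do, not what it is.
    read = getattr(file, "read", None)
    if callable(read):
        data = read()
        if isinstance(data, bytes):
            return data
        raise ValueError("file-like object must be opened in binary mode")

    raise ValueError("Invalid file-like object type")


# Build a small zip archive in memory with one text member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("note.txt", "hello from inside a zip")

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as archive:
    with archive.open("note.txt") as member:
        print(lenient_convert_to_bytes(member))  # b'hello from inside a zip'
```

Note that this sketch consumes the stream and does not restore the read position afterwards; whether the real change rewinds the object is not stated in the description above.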

bittoby (Contributor, Author) commented Feb 16, 2026

@badGarnet Could you please review this PR? Thank you.

bittoby (Contributor, Author) commented Feb 25, 2026

@badGarnet Any feedback would be appreciated. Thanks!

bittoby (Contributor, Author) commented Feb 27, 2026

@cragwolfe Could you please review this PR? Thank you.

bittoby (Contributor, Author) commented Mar 2, 2026

These test failures are not related to my changes. Please review my PR, @badGarnet @qued.

bittoby closed this by deleting the head repository on Mar 3, 2026
Development

Successfully merging this pull request may close these issues.

bug/a text file cannot be loaded from a ZipExtFile