Pull requests: Unstructured-IO/unstructured
Enable vertical text detection for rotated images
#4328 opened Apr 8, 2026 by vladimir-kivi-ds
feat: add AG2 multi-agent document processing example
#4326 opened Apr 7, 2026 by faridun-ag2
7 tasks done
feat: infer hierarchical heading levels (H1-H6) for PDFs (#4204)
#4325 opened Apr 7, 2026 by statxc
2 tasks done
fix: avoid mutating shared numpy views in tesseract OCR
#4323 opened Apr 5, 2026 by PastelStorm
fix: preserve CSV semantics for single-column files
#4322 opened Apr 5, 2026 by PastelStorm
refactor: chunk PDF rendering for OCR and extraction
#4321 opened Apr 5, 2026 by PastelStorm
fix: enable huge_tree for HTMLParser to handle large documents
#4306 opened Mar 28, 2026 by joaquinhuigomez
fix: restore double-newline row boundaries in Table.text (#4235)
#4299 opened Mar 25, 2026 by alvinttang
3 tasks done
refactor: don't import unstructured-inference via partition.pdf
#4284 opened Mar 16, 2026 by artdent
fix: improve multi-column layout sorting for academic papers (#4104)
#4283 opened Mar 16, 2026 by Gopesh111
refactor: replace deprecated decorators in partition_image with apply_metadata
#4271 opened Mar 2, 2026 by HemantSudarshan
fix: add 'el' and 'gr' as Greek language code aliases for Tesseract OCR
#4270 opened Feb 27, 2026 by s0wa48
fix: handle list output from group_bullet_paragraph in element apply()
#4253 opened Feb 21, 2026 by s0wa48
feat: add XLSM (Excel Macro-Enabled Workbook) parsing support
#4227 opened Feb 8, 2026 by longway-code
docs: fix redundant whitespace in pyenv command in README
#4224 opened Feb 3, 2026 by longway-code
Fix FutureWarning: Add test to verify bytes are wrapped in BytesIO for read_excel
#4213 opened Jan 27, 2026 by Achieve3318
⚡️ Speed up function merge_out_layout_with_ocr_layout by 30%
#4212 opened Jan 27, 2026 by aseembits93
feat: chunking by character and title now isolates tables
#4197 opened Jan 15, 2026 by badGarnet
fix: NameError: LayoutElements not defined in paddle_ocr.py
#4195 opened Jan 15, 2026 by mohansinghi