A Python library for converting HWP documents to Markdown.
HWP is a document format used by Hancom Office, the most widely used word processor in South Korea — commonly found in government, legal, and academic documents.
docpler uses a high-performance Rust core to parse HWP 5.0 files and produce clean Markdown output, including tables, equations, and text boxes.
| Format | Read | Output |
|---|---|---|
| HWP 5.0 | ✅ | Markdown |
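Install from PyPI:

    pip install docpler

Convert a file:

    from docpler.hwp import convert

    markdown = convert("document.hwp")
    print(markdown)

To use it as a MarkItDown plugin, install `markitdown-hwp`:

    pip install markitdown-hwp

Then convert through MarkItDown with plugins enabled:

    from markitdown import MarkItDown

    md = MarkItDown(enable_plugins=True)
    result = md.convert("document.hwp")
    print(result.text_content)

A Python package that converts HWP (Hangul Word Processor) documents to Markdown. Its Rust core provides fast and accurate parsing.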
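For converting many files at once, the single-file call shown in the usage above can be wrapped in a small loop. The following is a minimal sketch, not part of docpler itself: the helper name `convert_directory` and the directory names are hypothetical, and it assumes only what the usage example shows, namely that `convert` takes a file path and returns the Markdown as a string. The converter is passed in as an argument so the helper can be exercised without docpler installed.

```python
from pathlib import Path
from typing import Callable, List


def convert_directory(
    src_dir: str,
    out_dir: str,
    convert: Callable[[str], str],
) -> List[Path]:
    """Convert each .hwp file directly inside src_dir to a .md file in out_dir.

    `convert` is any callable that takes a file path and returns Markdown text
    (assumed to match the behavior of docpler.hwp.convert).
    """
    out_root = Path(out_dir)
    out_root.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    # Sorted so the output order is deterministic across platforms.
    for hwp_path in sorted(Path(src_dir).glob("*.hwp")):
        markdown = convert(str(hwp_path))
        target = out_root / (hwp_path.stem + ".md")
        target.write_text(markdown, encoding="utf-8")
        written.append(target)
    return written


# Usage with docpler (requires `pip install docpler`):
#   from docpler.hwp import convert
#   convert_directory("hwp_files", "markdown_out", convert)
```

Writing the output as UTF-8 explicitly matters here, since HWP documents typically contain Korean text and the platform default encoding may not handle it.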
Business Source License 1.1 (BSL 1.1)
- Free to use for any purpose, including production use
- Cannot be provided to others as a managed service
- Converts to Apache License 2.0 on 2031-04-05
- Rust core engine: distributed as a compiled binary; its source code is private
This product was developed with reference to the HWP document file (.hwp) specification published by Hancom.