Wei Lin edited this page Mar 5, 2026 · 9 revisions

MiniPdf Wiki

A minimal, zero-dependency .NET library for converting Excel (.xlsx) and Word (.docx) files to PDF — including chart rendering.


Pages

| Page | Description |
| --- | --- |
| Benchmark | Self-evolution benchmark pipeline — 150 test cases, 97.1% avg score, 100 cases ≥ 99% |
| Benchmark Optimization Session 2026-03-05 | Detailed session log: 64→100 passing cases, all code changes and insights |
| PDF Coding Standards | Coding conventions and standards for the MiniPdf library |
| AI CI Reviewer Setup | Copilot code review and Azure AI security scan |
| NuGet Package | NuGet publishing workflow and configuration |
| DOCX-to-PDF Implementation Session 2026-03-05 | DOCX conversion: 30 test cases, 98.7% avg score, all cases Excellent |

Project Overview

MiniPdf converts Excel .xlsx and Word .docx files to paginated PDF without any external dependencies — only built-in .NET APIs. Charts embedded in Excel worksheets are rendered as native PDF vector graphics.

Key Features

  • Excel-to-PDF — Automatic column layout, multi-page, text wrapping, font colors, cell alignment (left / center / right), vertical alignment (top / center / bottom)
  • Word-to-PDF — Paragraphs, headings, tables, images, bullet/numbered lists, line spacing, page breaks (98.7% benchmark avg)
  • Chart rendering — Bar/column, line, area, pie/doughnut, scatter, radar charts as native PDF vector graphics
  • Cell styling — Font sizes, bold/italic, cell borders (thin/medium/thick/dashed), fill colors, pattern fills, merged cells
  • Number formatting — General, currency, percentage, scientific notation, date/time formats with auto-fit to column width
  • Unicode support — WinAnsiEncoding for Latin text + CID font (Arial) for CJK and non-Latin characters
  • Excel row/column sizing — Reads explicit row heights, column widths, and default dimensions from .xlsx
  • Zero dependencies — No external NuGet packages; uses System.IO.Compression + System.Xml
  • Valid PDF 1.4 output compatible with all PDF readers
  • Streaming API — Accepts a file path or a Stream as input; writes to a file or returns a byte[]
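
To make the zero-dependency claim concrete, here is a minimal sketch of reading an .xlsx shared-string table with only System.IO.Compression and System.Xml. It is an illustration, not MiniPdf's actual ExcelReader code: the helper name ReadSharedStrings is hypothetical, and the sample package is built in memory so the snippet runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml;

// Hypothetical helper: returns the shared-string table of an .xlsx package, in index order.
List<string> ReadSharedStrings(Stream xlsx)
{
    var strings = new List<string>();
    using var archive = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true);
    var part = archive.GetEntry("xl/sharedStrings.xml");
    if (part == null) return strings;   // workbooks without text cells may omit this part

    var doc = new XmlDocument();
    using (Stream partStream = part.Open()) doc.Load(partStream);

    foreach (XmlElement si in doc.GetElementsByTagName("si"))
    {
        // A string item holds either one <t> or several rich-text runs, each with its own <t>.
        // Simplification: phonetic runs (<rPh>) are not filtered out here.
        var text = new StringBuilder();
        foreach (XmlElement t in si.GetElementsByTagName("t")) text.Append(t.InnerText);
        strings.Add(text.ToString());
    }
    return strings;
}

// Build a tiny in-memory package that contains only the shared-strings part.
var package = new MemoryStream();
using (var builder = new ZipArchive(package, ZipArchiveMode.Create, leaveOpen: true))
using (var writer = new StreamWriter(builder.CreateEntry("xl/sharedStrings.xml").Open()))
{
    writer.Write("<sst xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">" +
                 "<si><t>Region</t></si>" +
                 "<si><r><t>Q1 </t></r><r><t>Sales</t></r></si>" +
                 "</sst>");
}
package.Position = 0;

List<string> shared = ReadSharedStrings(package);
Console.WriteLine(string.Join(" | ", shared));   // prints "Region | Q1 Sales"
```

Cells in the worksheet XML refer to these strings by index, which is why the table has to be read before cell text can be resolved.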

Chart Support

Charts are rendered as native PDF vector graphics (rectangles + line segments), not bitmap images. Data is extracted from cell references in chart XML via a two-pass sheet reading approach.
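
As a rough illustration of what "rectangles + line segments" means at the PDF level, the sketch below emits content-stream operators for a two-bar column chart. The operators (rg, re, f, RG, w, m, l, S) are standard PDF 1.4 path and color operators; the helper names BuildBarOperators and Num, the color values, and the layout numbers are assumptions made for this example and are not taken from MiniPdf's PdfWriter.

```csharp
using System;
using System.Globalization;
using System.Text;

// Formats a coordinate with '.' as the decimal separator regardless of OS culture.
string Num(double v) => v.ToString("0.##", CultureInfo.InvariantCulture);

// Hypothetical helper: builds content-stream operators for a simple column chart.
string BuildBarOperators(double[] values, double left, double bottom,
                         double plotHeight, double barWidth, double gap)
{
    double max = 0;
    foreach (double v in values) max = Math.Max(max, v);

    var ops = new StringBuilder();
    ops.Append("0.27 0.45 0.77 rg\n");   // fill color, RGB components in 0..1
    for (int i = 0; i < values.Length; i++)
    {
        double x = left + i * (barWidth + gap);
        double h = max > 0 ? values[i] / max * plotHeight : 0;
        // "x y w h re" appends a rectangle to the path; "f" fills it.
        ops.Append($"{Num(x)} {Num(bottom)} {Num(barWidth)} {Num(h)} re f\n");
    }

    double right = left + values.Length * (barWidth + gap);
    ops.Append("0 0 0 RG 0.5 w\n");      // black stroke color, 0.5 pt line width
    // "m" starts a path at a point, "l" adds a line segment, "S" strokes the path.
    ops.Append($"{Num(left)} {Num(bottom)} m {Num(right)} {Num(bottom)} l S\n");
    return ops.ToString();
}

string chartOps = BuildBarOperators(new double[] { 10, 20 }, 50, 100, 80, 30, 10);
Console.Write(chartOps);
```

Because the bars are path operators rather than an embedded image, they stay sharp at any zoom level and add only a few bytes per shape.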

| Chart Type | Status |
| --- | --- |
| Bar / Column (clustered, stacked, % stacked) | ✅ Supported |
| Horizontal Bar | ✅ Supported |
| Line (with markers) | ✅ Supported |
| Area (filled, stacked, % stacked) | ✅ Supported |
| Pie / Doughnut | ✅ Supported (data labels, legend) |
| Scatter (XY) | ✅ Supported (dedicated renderer) |
| Radar | ✅ Supported (spoke labels) |
| Stock / Bubble / Combo | ⚠️ Fallback (basic bar) |
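
The "% stacked" variants rescale each category so that its series add up to 100%. The sketch below shows that normalization under the assumption that all values are non-negative; ToPercentStacked is a hypothetical helper written for this page, not a MiniPdf API.

```csharp
using System;

// Hypothetical helper. series[s][c] is the value of series s in category c.
// Returns the same shape with every category rescaled to sum to 100.
double[][] ToPercentStacked(double[][] series)
{
    int categories = series.Length == 0 ? 0 : series[0].Length;
    var result = new double[series.Length][];
    for (int s = 0; s < series.Length; s++) result[s] = new double[categories];

    for (int c = 0; c < categories; c++)
    {
        double total = 0;
        for (int s = 0; s < series.Length; s++) total += series[s][c];

        for (int s = 0; s < series.Length; s++)
            result[s][c] = total == 0 ? 0 : series[s][c] * 100.0 / total;   // avoid dividing by zero
    }
    return result;
}

// Two series, two categories: (30, 70) and (10, 30).
double[][] pct = ToPercentStacked(new[]
{
    new double[] { 30, 10 },
    new double[] { 70, 30 },
});
Console.WriteLine($"{pct[0][0]} {pct[1][0]} | {pct[0][1]} {pct[1][1]}");   // prints "30 70 | 25 75"
```

After this step the segments can be stacked exactly like an ordinary stacked chart, with the value axis fixed at 0 to 100.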

Additional chart features:

  • NiceAxisScale — round-number axis bounds (0 / 5,000 / 10,000 …) matching LibreOffice
  • Axis format codes — reads numFmt from chart XML (e.g., #,##0 for thousand separators)
  • Percent-stacked axes — Y-axis shows 0% – 100% for percent-stacked bar/area charts
  • Color palette — LibreOffice-compatible 8-color chart palette
  • Chart title clipping — titles clipped to chart width using FittingChars
  • Pie data labels — percentage labels when showPercent is enabled in chart XML
  • Overflow pages — right-anchored charts automatically spill to a second page
  • Legend — rendered below the chart area (reversed order for stacked series)
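
For readers unfamiliar with round-number axis scaling, the sketch below uses the common "nice numbers" approach: pick a step of 1, 2, or 5 times a power of ten, then round the maximum up to a multiple of that step. It assumes a non-negative axis that starts at zero. NiceTicks is a hypothetical helper, and MiniPdf's NiceAxisScale (or LibreOffice's own logic) may choose steps differently.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: returns tick values from 0 up to a rounded maximum that covers dataMax.
double[] NiceTicks(double dataMax, int targetIntervals)
{
    if (dataMax <= 0 || targetIntervals < 1) return new[] { 0.0, 1.0 };

    double rawStep = dataMax / targetIntervals;
    double magnitude = Math.Pow(10, Math.Floor(Math.Log10(rawStep)));
    double fraction = rawStep / magnitude;   // roughly in [1, 10)

    // Snap the step to 1, 2, 5, or 10 times the power of ten.
    double niceFraction = fraction <= 1 ? 1 : fraction <= 2 ? 2 : fraction <= 5 ? 5 : 10;
    double step = niceFraction * magnitude;

    int intervals = (int)Math.Ceiling(dataMax / step);
    var result = new double[intervals + 1];
    for (int i = 0; i <= intervals; i++) result[i] = i * step;
    return result;
}

double[] ticks = NiceTicks(9200, 2);
string labels = string.Join(" / ",
    Array.ConvertAll(ticks, t => t.ToString("#,##0", CultureInfo.InvariantCulture)));
Console.WriteLine(labels);   // prints "0 / 5,000 / 10,000"
```

The "#,##0" format string in the usage lines is the same thousands-separator code mentioned in the axis format bullet, applied here through .NET's own number formatting.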

Repository Structure

MiniPdf.sln
├── src/MiniPdf/                     # Library (zero-dependency)
│   ├── MiniPdf.cs                   # Public API entry point
│   ├── ExcelToPdfConverter.cs       # Core converter (column grouping, wrapping, chart rendering)
│   ├── ExcelReader.cs               # .xlsx parser (sparse rows, shared strings, chart data)
│   ├── PdfDocument.cs               # PDF document model
│   ├── PdfPage.cs                   # Page with text & line drawing
│   ├── PdfWriter.cs                 # Low-level PDF 1.4 binary writer (text, rectangles, lines)
│   ├── PdfTextBlock.cs              # Text block data
│   └── PdfColor.cs                  # RGB color support
└── tests/
    ├── MiniPdf.Tests/               # xUnit unit tests (94 tests)
    │   ├── ClassicExcelToPdfTests.cs    # 120 classic scenario tests
    │   └── ExcelToPdfConverterTests.cs  # Converter unit tests
    ├── MiniPdf.Benchmark/           # Self-evolution benchmark pipeline
    │   ├── run_benchmark.py         # End-to-end pipeline controller
    │   ├── compare_pdfs.py          # MiniPdf vs LibreOffice comparison engine
    │   ├── generate_reference_pdfs.py  # LibreOffice reference PDF generator
    │   └── reports/                 # HTML / JSON / Markdown reports
    └── MiniPdf.Scripts/             # Excel generators and converters
        ├── generate_classic_xlsx.py # 120 classic .xlsx test file generator
        └── convert_xlsx_to_pdf.cs   # dotnet-script converter runner

Quick Start

Install the package from NuGet:

dotnet add package MiniPdf

Then call MiniPdf.ConvertToPdf from C#:

using System.IO;
using MiniSoftware;

// File → File
MiniPdf.ConvertToPdf("report.xlsx", "report.pdf");

// File → byte[]
byte[] pdfFromFile = MiniPdf.ConvertToPdf("report.xlsx");

// Stream → byte[]
using var stream = File.OpenRead("report.xlsx");
byte[] pdfFromStream = MiniPdf.ConvertToPdf(stream);

Benchmark Summary (as of 2026-03-04)

| Metric | Value |
| --- | --- |
| Test cases | 120 |
| Average score | 96.4% |
| Cases ≥ 99% | 68 / 120 |
| 🟢 Excellent (≥ 90%) | 105+ |
| C# unit tests passing | 94 / 94 |
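
To show how the three score metrics relate, here is a small sketch that derives them from a list of per-case scores. The scores below are made-up sample values, not real benchmark results, and the benchmark's own reporting code (run_benchmark.py) may compute its summary differently; only the 99% and 90% thresholds come from the table.

```csharp
using System;
using System.Linq;

// Made-up per-case scores in percent. NOT real MiniPdf benchmark data.
double[] scores = { 99.5, 100.0, 92.3, 88.0, 99.0 };

double average = scores.Average();
int nearPerfect = scores.Count(s => s >= 99.0);   // "Cases ≥ 99%"
int excellent = scores.Count(s => s >= 90.0);     // "Excellent (≥ 90%)", includes the ≥ 99% cases

Console.WriteLine($"Average score: {average:F1}%");
Console.WriteLine($"Cases >= 99%: {nearPerfect} / {scores.Length}");
Console.WriteLine($"Excellent (>= 90%): {excellent} / {scores.Length}");
```

Every case counted under "Cases ≥ 99%" is also counted under "Excellent", so the second number can never be smaller than the first.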

See Benchmark for the full pipeline and score table.

CI / Security

Every PR to main runs:

  1. dotnet build + dotnet test (94 tests)
  2. Copilot Code Review — automatic PR review comments
  3. Azure AI Security Scan — GPT-4.1 scans .cs diff, fails CI on security issues

See AI CI Reviewer Setup for details.
