Add ability to import PDF by albertski · Pull Request #408 · basecamp/writebook

albertski · 2026-03-22T18:42:20Z

Users can now import a PDF into a book directly from the table of contents toolbar. Each page of the PDF is imported as a separate textcpage, using the first line as the title.

Implementation:

PdfImporter model wrapping the pdf-reader gem
Books::ImportsController (new/create, editor-only)
Import button in the create buttons toolbar
Unit tests for PdfImporter and controller integration tests

Totally understand if this is something you don’t want to support.

Copilot

Pull request overview

Adds a PDF import workflow for books so editors can upload a PDF from the TOC toolbar and have each PDF page imported as a separate text Page/Leaf.

Changes:

Introduces PdfImporter to parse PDFs (via pdf-reader) and create leaves/pages.
Adds Books::ImportsController + route + books/imports/new form to upload a PDF (editor-only).
Adds an “Import PDF” button to the existing create-buttons toolbar; adds unit/integration tests and PDF fixtures.

Reviewed changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
`app/models/pdf_importer.rb`	New importer that reads PDF pages and creates `Leaf`/`Page` records.
`app/controllers/books/imports_controller.rb`	New editor-only endpoints to render the form and run the import.
`app/views/books/imports/new.html.erb`	Upload form UI for selecting and importing a PDF.
`app/views/books/_create_buttons.html.erb`	Adds a new toolbar button to access the import flow.
`config/routes.rb`	Adds nested `book import` routes (`new/create`).
`test/models/pdf_importer_test.rb`	Unit tests for importer behavior (titles, body content, skipping blank pages, malformed PDFs).
`test/controllers/books/imports_controller_test.rb`	Integration tests for import endpoints + authorization.
`test/fixtures/files/sample.pdf`	PDF fixture with extractable text across 2 pages.
`test/fixtures/files/blank.pdf`	PDF fixture intended to have no extractable text.
`Gemfile` / `Gemfile.lock`	Adds `pdf-reader` and its transitive dependencies.

[!TIP]
If you aren't ready for review, convert to a draft PR.
Click "Convert to draft" or run gh pr ready --undo.
Click "Ready for review" or run gh pr ready to reengage.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-22T18:47:17Z

app/views/books/_create_buttons.html.erb

+<%= link_to new_book_import_path(book), class: "btn txt-medium", title: "Import PDF", aria: { label: "Import PDF" } do %>
+  <svg viewBox="0 0 20 24" xmlns="http://www.w3.org/2000/svg" fill="var(--color-ink)">
+    <path d="m15.8 21.7c0 .3-.2.4-.4.4h-13.5-.2v-16.9c0-.3.2-.4.4-.4h6.3c0-.6.1-1.2.3-1.8h-6.9c-1 0-1.8.8-1.8 1.8v17.5c0 1 .8 1.8 1.8 1.8h14c.9 0 1.6-.6 1.8-1.4v-11.6c-.5.2-1.1.4-1.8.5v10.3z"/>
+    <path fill="var(--color-positive)" d="m15 0c-2.8 0-5 2.2-5 5s2.2 5 5 5 5-2.2 5-5-2.2-5-5-5zm.7 7.5-2 2-2-2 1-1 .4.4v-2.9h1.1v2.9l.5-.4z"/>
+    <path d="m4.5 14h9v1.5h-9zm0 3h6v1.5h-6z"/>
+  </svg>
+  <span class="for-screen-reader">Import PDF</span>
+<% end %>


The new "Import PDF" toolbar button is missing the disable-when-arranging / disable-when-deleting classes used by the other create buttons in this toolbar, so it will remain clickable during rearrange/delete modes while other actions are intentionally disabled. Consider reusing the same button classes/disable hooks for consistent behavior and styling.

Copilot · 2026-03-22T18:47:17Z

app/controllers/books/imports_controller.rb

+    pages = PdfImporter.new(@book, params[:pdf].tempfile).import
+
+    if pages.any?
+      redirect_to book_slug_url(@book), notice: "Successfully imported #{pages.size} #{"page".pluralize(pages.size)} from PDF."


PdfImporter#import returns Leaf records (as also asserted in PdfImporterTest), but this action stores the result in a variable named pages. Renaming this to something like imported_leaves (and adjusting the notice string accordingly) would make the code clearer and reduce confusion between Page (leafable) vs Leaf (wrapper).

Suggested change

pages = PdfImporter.new(@book, params[:pdf].tempfile).import

if pages.any?

redirect_to book_slug_url(@book), notice: "Successfully imported #{pages.size} #{"page".pluralize(pages.size)} from PDF."

imported_leaves = PdfImporter.new(@book, params[:pdf].tempfile).import

if imported_leaves.any?

redirect_to book_slug_url(@book), notice: "Successfully imported #{imported_leaves.size} #{"leaf".pluralize(imported_leaves.size)} from PDF."

Copilot · 2026-03-22T18:47:17Z

app/controllers/books/imports_controller.rb

+      render :new, status: :unprocessable_entity
+    end
+  rescue PdfImporter::InvalidPdfError => e
+    flash.now[:alert] = "Could not read PDF: #{e.message}"


The user-facing alert message includes the raw exception message from pdf-reader (Could not read PDF: #{e.message}). These messages can be noisy/inconsistent and may reveal implementation details; consider logging the exception details and showing a stable, user-friendly message instead (optionally with guidance like “Please verify the file is a valid, unencrypted PDF”).

Suggested change

flash.now[:alert] = "Could not read PDF: #{e.message}"

Rails.logger.warn("PDF import failed for book #{ @book&.id || 'unknown' }: #{e.class}: #{e.message}")

flash.now[:alert] = "Could not read PDF. Please verify the file is a valid, unencrypted PDF."

Copilot · 2026-03-22T18:47:17Z

app/models/pdf_importer.rb

+    def extract_title(text, page_number)
+      first_line = text.lines.first&.strip
+      if first_line.present? && first_line.length.between?(3, 100)
+        first_line
+      else
+        "Page #{page_number}"
+      end


extract_title uses hard-coded title length limits (between?(3, 100)). To make the intent clearer and future changes safer, consider extracting these numbers into named constants (e.g., MIN_TITLE_LENGTH, MAX_TITLE_LENGTH) or documenting why those bounds were chosen.

Copilot · 2026-03-22T18:47:18Z

app/models/pdf_importer.rb

+  def import
+    imported = []
+
+    @reader.pages.each_with_index do |page, index|
+      text = page.text.strip
+      next if text.blank?
+
+      title = extract_title(text, index + 1)
+      body  = normalize_body(text)
+
+      imported << @book.press(Page.new(body: body), title: title)
+    end
+
+    imported
+  end


import creates multiple records (one per PDF page) but does not run inside a transaction. If an exception occurs midway (e.g., validation/database error), the book can end up partially imported. Wrapping the per-page creation loop in a single transaction (or otherwise ensuring all-or-nothing behavior) would keep imports consistent.

Add ability to import PDF

2b534da

Copilot AI review requested due to automatic review settings March 22, 2026 18:42

Copilot started reviewing on behalf of albertski March 22, 2026 18:42 View session

Copilot AI reviewed Mar 22, 2026

View reviewed changes

albertski marked this pull request as draft March 23, 2026 14:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ability to import PDF#408

Add ability to import PDF#408
albertski wants to merge 1 commit intobasecamp:mainfrom
albertski:add-pdf-import

albertski commented Mar 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	flash.now[:alert] = "Could not read PDF: #{e.message}"
	Rails.logger.warn("PDF import failed for book #{ @book&.id \|\| 'unknown' }: #{e.class}: #{e.message}")
	flash.now[:alert] = "Could not read PDF. Please verify the file is a valid, unencrypted PDF."

Conversation

albertski commented Mar 22, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants