Interlinearization model #12

Open
alex-rawlings-yyc wants to merge 9 commits into main from interlinearization-model

alex-rawlings-yyc commented Feb 19, 2026

Summary

This branch adds the Interlinearization data model, refactors Paratext 9 (PT9) parsing into a dedicated converter with hash-based text versions, extends the interlinearizer WebView with multiple viewing modes and an analyses view, and improves accessibility and async behavior.

Major changes

Interlinearization model and types

  • New Interlinearization type in src/types/interlinearizer.d.ts: structured model for interlinear data (id, books, segments, occurrences, analyses, etc.) consumed by the WebView and converter.
  • InterlinearizationProject (and related types): pairs source- and target-language interlinearizations with alignment links for future alignment/translation workflows.
  • New enums and shared types in src/types/interlinearizer-enums.ts and expanded .d.ts for books, segments, analyses, and alignment.

Paratext 9 parser and converter

  • Parser renamed and reorganized: interlinearXmlParser → paratext9Parser; PT9-specific code lives under src/parsers/paratext-9/ (parser, converter, types, pt9-xml.md).
  • New paratext9Converter: converts PT9 parser output into the Interlinearization shape (book-level data with verse entries, assignment status, and a book-level text version for change detection).
  • SHA-256 text version:
    • Book text version is the SHA-256 hex of sorted, concatenated verse hashes so any verse change changes the book version.
    • WebView: uses Web Crypto crypto.subtle.digest('SHA-256', …) so the converter runs in the WebView without Node.
    • Node: optional hashSha256Hex in converter options so the host (e.g. paranext-core) can supply a matching hasher (e.g. generateHashFromBuffer('sha256', 'hex', …)) for consistency.
  • PT9 types: internal PT9 types moved to paratext-9-types.ts; deprecated ScrTextName removed (deprecated in PT9 since 2020); prop casing aligned with PT9.
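
The book-level versioning described above can be sketched as follows. This is a minimal illustration, not the code in this branch: the names sha256HexWebCrypto, computeBookTextVersion, and hashSha256Hex come from this PR, but the signatures, the `HashSha256Hex` type alias, and the choice to join hashes with no separator are assumptions.

```typescript
/** Assumed shape of the injectable hasher (the PR only names the option `hashSha256Hex`). */
type HashSha256Hex = (text: string) => Promise<string>;

/** SHA-256 of a UTF-8 string via Web Crypto, returned as lowercase hex. */
async function sha256HexWebCrypto(text: string): Promise<string> {
  const bytes = new TextEncoder().encode(text);
  const digest = await globalThis.crypto.subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, '0'))
    .join('');
}

/**
 * Book version = hash of the sorted, concatenated verse hashes.
 * Sorting a copy makes the result independent of the order in which verses
 * were read; joining without a separator is an assumption of this sketch.
 */
async function computeBookTextVersion(
  verseHashes: readonly string[],
  hashSha256Hex: HashSha256Hex = sha256HexWebCrypto,
): Promise<string> {
  const joined = [...verseHashes].sort().join('');
  return hashSha256Hex(joined);
}
```

A host running in Node could pass its own hasher as the second argument, provided it returns the same lowercase hex encoding as the Web Crypto default, so both environments produce identical versions.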

WebView: viewing modes and analyses

  • Viewing modes: WebView can switch between InterlinearData (raw/legacy) and Interlinearization (new model). JSON view uses a small mapping for mode buttons and labels.
  • Analyses view: New JSON view mode that shows analyses derived from the interlinear data (createAnalyses builds analysis maps). Button added to switch between interlinear and analyses views.
  • Async conversion: Conversion to Interlinearization is async; WebView uses useEffect for loading and state so the UI stays correct when data arrives late.
  • Accessibility: JSON view mode controls use radiogroup / radio with aria-checked instead of group, which gives screen readers clearer single-selection semantics.
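
For illustration, the attributes on each mode button might be derived by a small pure helper like the one below. The mode names, the `RadioProps` shape, and the helper `radioPropsFor` are hypothetical; the roving tabIndex follows the common WAI-ARIA radio group pattern and is not confirmed by this PR.

```typescript
// Hypothetical mode names; the PR exports a JsonViewMode type but does not list its values here.
type JsonViewMode = 'interlinearData' | 'interlinearization' | 'analyses';

interface RadioProps {
  role: 'radio';
  'aria-checked': boolean;
  tabIndex: 0 | -1;
}

/** Builds the ARIA attributes for one button inside a role="radiogroup" container. */
function radioPropsFor(mode: JsonViewMode, selected: JsonViewMode): RadioProps {
  const checked = mode === selected;
  // Only the checked radio is in the tab order (roving tabindex, an assumption of this sketch).
  return { role: 'radio', 'aria-checked': checked, tabIndex: checked ? 0 : -1 };
}
```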

Summary by CodeRabbit

  • New Features

    • Interactive JSON view modes with keyboard-accessible controls and async conversion state; Paratext‑9 parsing and end-to-end PT9→interlinearization conversion with lexicon-based gloss lookup.
  • Tests

    • Large expansion of unit/integration tests covering parser, converter, lexicon, views, keyboard navigation, async flows, and new test helpers/mocks.
  • Documentation

    • README updated with project structure, mock/test-data guidance, import path alias suggestions, and Node.js >= 18 requirement.
  • Types/API

    • New runtime enums and a richer interlinear model; PT9-specific data shapes and public type names standardized to camelCase.

- Introduce `Interlinearization` data model.
- Move internal-only PT9 types to dedicated file and change case of props.
- Remove `ScrTextName` prop, which has been deprecated in PT9 since 2020.
- Enhance interlinearizer WebView to support switching between viewing modes: InterlinearData and Interlinearization.
- Update Jest configuration to include path aliases for types and parsers.
- Modify README to clarify the structure of the `src/types/` and `src/parsers/` directories.
- Rename `interlinearXmlParser` and related tests to `paratext9Parser`.
- Add new words to cspell configuration for improved spell checking.
- Add support for a new JSON view mode displaying analyses derived from parsed data.
- Implement functions to describe and label the new view mode.
- Update the WebView component to include a button for switching to the analyses view.
- Modify tests to cover the new analyses functionality and ensure proper rendering.
- Refactor the `createAnalyses` function to generate analysis maps from interlinear data.
- Change role from 'group' to 'radiogroup' for JSON view mode buttons to improve accessibility.
- Update button roles to 'radio' and aria attributes to 'aria-checked' for better semantic meaning.
- Modify tests to reflect the updated roles and ensure proper functionality of the JSON view mode switch.
- Introduce SHA-256 hashing for consistent book-level text version generation across Node and WebView environments.
- Add Web Crypto-based sha256HexWebCrypto for WebView-safe hashing; support injectable hashSha256Hex in converter options for Node (e.g. paranext-core generateHashFromBuffer).
- Compute book text version from sorted, concatenated verse hashes via computeBookTextVersion.
- Update paratext9Converter and tests to align with hash-generation behavior and remove obsolete code.
- Refactor interlinearizer WebView to use useEffect for async conversion and improve JSON view mode buttons.
- Update documentation for data structures and types.
alex-rawlings-yyc self-assigned this Feb 19, 2026

…tionality

- Add Node.js version requirement (>=18) to package.json and package-lock.json.
- Improve interlinearizer WebView by implementing keyboard navigation for JSON view modes, allowing users to switch between modes using arrow keys.
- Refactor related tests to ensure proper functionality of the new keyboard navigation feature.
- Update README to reflect the new Node.js requirement and clarify usage of test data paths.
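
A rough sketch of arrow-key switching between view modes, written as a pure function so it can be tested without rendering. The mode list, the wrap-around behavior, and the name `nextModeForKey` are assumptions for illustration, not the handler in this branch.

```typescript
// Hypothetical ordered list of view modes.
const VIEW_MODES = ['interlinearData', 'interlinearization', 'analyses'] as const;
type ViewMode = (typeof VIEW_MODES)[number];

/** Returns the mode to select after a key press; unrelated keys leave the mode unchanged. */
function nextModeForKey(current: ViewMode, key: string): ViewMode {
  const index = VIEW_MODES.indexOf(current);
  const count = VIEW_MODES.length;
  switch (key) {
    case 'ArrowRight':
    case 'ArrowDown':
      return VIEW_MODES[(index + 1) % count]; // wrap from last to first
    case 'ArrowLeft':
    case 'ArrowUp':
      return VIEW_MODES[(index - 1 + count) % count]; // wrap from first to last
    default:
      return current;
  }
}
```

A keydown handler on the radiogroup could call this with `event.key` and move focus to the returned mode's button.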

…avigation improvements

- Export `JsonViewMode` type and add a sentinel for conversion status to indicate when interlinearization is in progress.
- Implement `formatJsonPreContent` function to display "Converting..." during the conversion process.
- Refactor keyboard navigation handling for JSON view modes, improving the separation of concerns and testability.
- Update tests to verify the new conversion status display and ensure keyboard navigation functionality works as expected.
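
One way the conversion-status sentinel and `formatJsonPreContent` could fit together is sketched below. Only the function name and the "Converting..." text come from this PR; representing the sentinel as a unique symbol named `CONVERTING`, and rendering `undefined` as an empty string, are assumptions.

```typescript
/** Hypothetical sentinel meaning "conversion to Interlinearization is still running". */
const CONVERTING = Symbol('converting');

/** Text for the JSON <pre> block: a status message while converting, pretty JSON otherwise. */
function formatJsonPreContent(value: unknown): string {
  if (value === CONVERTING) return 'Converting...';
  // JSON.stringify(undefined) yields undefined at runtime, so handle it explicitly.
  if (value === undefined) return '';
  return JSON.stringify(value, null, 2);
}
```

Using a dedicated sentinel rather than `undefined` lets the view distinguish "not converted yet" from "conversion produced no data".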

alex-rawlings-yyc left a comment

@alex-rawlings-yyc resolved 1 discussion.
Reviewable status: 0 of 17 files reviewed, all discussions resolved (waiting on alex-rawlings-yyc).

… gloss lookup functionality

- Enhance the interlinearizer WebView to utilize a Lexicon XML file for gloss text lookup, allowing for more accurate display of glosses in analyses.
- Modify the `createAnalyses` function to accept an optional gloss lookup, improving the generation of analysis objects with real gloss text.
- Update Jest tests to cover new gloss lookup scenarios, ensuring proper handling of gloss text retrieval and fallback mechanisms.
- Revise README to include details about Lexicon XML structure and its integration with the interlinearizer.
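
The optional gloss lookup could work roughly as in this sketch. The real `createAnalyses` signature and analysis shape are not shown in this conversation, so `createAnalysesSketch`, `GlossLookup`, `AnalysisSketch`, and the fallback to the sense ID are all hypothetical.

```typescript
/** Hypothetical lookup from a Lexicon sense ID to its gloss text. */
type GlossLookup = (senseId: string) => string | undefined;

interface AnalysisSketch {
  id: string;
  glossText: string;
}

/** Builds a map of analyses, using real gloss text when a lookup is supplied and finds a match. */
function createAnalysesSketch(
  senseIds: readonly string[],
  glossLookup?: GlossLookup,
): Map<string, AnalysisSketch> {
  const analyses = new Map<string, AnalysisSketch>();
  for (const senseId of senseIds) {
    // Assumed fallback: show the raw sense ID when no gloss is available.
    const glossText = glossLookup?.(senseId) ?? senseId;
    analyses.set(senseId, { id: senseId, glossText });
  }
  return analyses;
}
```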

- Update the interlinearizer WebView to return `undefined` instead of `result` for better clarity in async handling.
- Simplify gloss lookup handling in `createAnalyses` by removing unnecessary nullish coalescing.
- Add a new test to ensure deterministic sorting of items with the same index by length and kind in the paratext9 converter.
- Improve error handling tests in the paratext9 parser for missing attributes in XML data.
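
The deterministic ordering mentioned above (index, then length, then kind) can be expressed as a single comparator. The `SpanItem` shape, the ascending direction of each key, and the plain string comparison of `kind` are assumptions of this sketch rather than the converter's actual rules.

```typescript
/** Hypothetical item shape: a position in the verse text, a length, and a kind label. */
interface SpanItem {
  index: number;
  length: number;
  kind: string;
}

/** Orders by index, then length, then kind, so ties at the same index never depend on input order. */
function compareSpanItems(a: SpanItem, b: SpanItem): number {
  if (a.index !== b.index) return a.index - b.index;
  if (a.length !== b.length) return a.length - b.length;
  if (a.kind < b.kind) return -1;
  if (a.kind > b.kind) return 1;
  return 0;
}
```

Sorting with `items.slice().sort(compareSpanItems)` then gives the same result regardless of how the parser emitted the items.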

alex-rawlings-yyc left a comment

@alex-rawlings-yyc resolved 3 discussions.
Reviewable status: 0 of 25 files reviewed, all discussions resolved (waiting on alex-rawlings-yyc).
