Skip to content

Comments

Add interlinearizer feature with XML parsing and web view integration#6

Merged
alex-rawlings-yyc merged 25 commits intomainfrom
xml-parser
Feb 18, 2026
Merged

Add interlinearizer feature with XML parsing and web view integration#6
alex-rawlings-yyc merged 25 commits intomainfrom
xml-parser

Conversation

@alex-rawlings-yyc
Copy link
Contributor

@alex-rawlings-yyc alex-rawlings-yyc commented Feb 2, 2026

  • Introduced interlinearizer.web-view.tsx and interlinearizer.web-view.scss for rendering interlinear data.
  • Implemented InterlinearXmlParser for parsing interlinear XML data.
  • Updated package.json and package-lock.json to include fast-xml-parser dependency.
  • Added new types for interlinear data in interlinearizer.d.ts.
  • Included sample interlinear XML data in test-data/Interlinear_en_MAT.xml.
  • Enhanced main.ts to register the new web view provider for the interlinearizer.

This change is Reviewable

Summary by CodeRabbit

  • New Features

    • Interlinear XML parsing and a web view to render parsed interlinear data with typed interlinear models.
  • Chores

    • Added testing infrastructure: Jest config, setup, many unit tests, and numerous test mocks.
    • CI workflow to run tests, test fixtures (sample interlinear XML), TypeScript path aliases, and runtime XML parser dependency added.
    • Added spelling dictionary entry for "interlinearization" and updated ignore files.
  • Documentation

    • README expanded to document parser, tests, and project layout.
  • Style

    • Enabled Tailwind usage in styles.

- Introduced `interlinearizer.web-view.tsx` and `interlinearizer.web-view.scss` for rendering interlinear data.
- Implemented `InterlinearXmlParser` for parsing interlinear XML data.
- Updated `package.json` and `package-lock.json` to include `fast-xml-parser` dependency.
- Added new types for interlinear data in `paranext-extension-template.d.ts`.
- Included sample interlinear XML data in `test-data/Interlinear_en_MAT.xml`.
- Enhanced `main.ts` to register the new web view provider for the interlinearizer.
- Updated `cspell.json` with new terms related to interlinearization.
@coderabbitai
Copy link

coderabbitai bot commented Feb 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a new InterlinearXmlParser and types, a React WebView component and Tailwind-enabled SCSS, extensive Jest test infrastructure with mocks and test data, modifies activation/provider lifecycle in main.ts, adds fast-xml-parser runtime dependency and test tooling, and introduces a GitHub Actions test workflow.

Changes

Cohort / File(s) Summary
Parser & Types
src/parsers/interlinearXmlParser.ts, src/types/interlinearizer.d.ts
New exported InterlinearXmlParser.parse(xml: string): InterlinearData and Interlinear data interfaces. Implements XML-to-structured data mapping, validation, duplicate-verse detection, ID construction, and error messages. Review parsing edge cases and numeric/attribute parsing config.
WebView Component & Styles
src/interlinearizer.web-view.tsx, src/interlinearizer.web-view.scss
Adds InterlinearWebView (exposed via globalThis.webViewComponent) that bundles test XML and uses InterlinearXmlParser in useMemo; renders parsed JSON or errors. SCSS imports Tailwind via @use 'tailwind';. Check global exposure and style pipeline.
Extension Activation & Provider
src/main.ts
Reworks activation to activate(context: ExecutionActivationContext) and deactivate(): Promise<boolean>; registers a WebView provider, stores disposables in context, opens WebView with error handling and logging. Verify provider type checks and lifecycle disposable management.
Tests, Mocks & Helpers
jest.config.ts, jest.setup.ts, src/__tests__/*, src/__tests__/test-helpers.ts, __mocks__/*
Adds comprehensive Jest config, setup, many unit tests (parser, main, web-view), and numerous CommonJS mocks (@papi, assets, inline styles, react frontend hooks) plus a test activation context helper. Review mocks for runtime parity and CommonJS/ESM boundaries.
Test Data & Docs
test-data/Interlinear_en_MAT.xml, README.md
Adds sample interlinear XML used by tests and updates README to document new parser, test-data, and test/mocking approach.
Package, Tooling & Config
package.json, tsconfig.json, .eslintrc.js, .eslintignore, .prettierignore, .stylelintignore, .gitignore, cspell.json
Adds fast-xml-parser dependency, test scripts and devDeps (Jest, testing-library, ts-jest, eslint-plugin-jest), TS path mappings, ESLint jest overrides, ignores for test artifacts, and adds interlinearization to cspell. Verify CI and local dev install expectations.
CI Workflow
.github/workflows/test.yml
New GitHub Actions workflow that checks out repo and paranext-core, installs dependencies, and runs tests on push/PR for main/release-prep/hotfix-* branches. Validate checkout paths and caching steps.

Sequence Diagram(s)

sequenceDiagram
    participant Ext as Extension (main.ts)
    participant PAPI as `@papi/backend`
    participant Provider as WebViewProvider
    participant WebViewSys as WebView System

    Note over Ext,PAPI: Activation & provider registration
    Ext->>PAPI: activate(context)
    Ext->>Provider: construct provider (getWebView)
    Ext->>PAPI: registerWebViewProvider(mainWebViewType, Provider)
    PAPI-->>Ext: registration disposable
    Ext->>Ext: store disposable in context
    Ext->>PAPI: openWebView(mainWebViewType)
    PAPI->>Provider: getWebView(mainWebViewType)
    Provider->>WebViewSys: return WebViewDefinition (title, content, styles)
    WebViewSys->>WebViewSys: initialize/render content
    Ext->>PAPI: logger.info('finished activating')
    Note over Ext,PAPI: Deactivation
    Ext->>Ext: deactivate()
    Ext->>Ext: run unsubscribers / dispose
    Ext-->>PAPI: return boolean
Loading
sequenceDiagram
    participant Component as InterlinearWebView
    participant Parser as InterlinearXmlParser
    participant Renderer as ReactRenderer

    Component->>Component: useMemo create parser
    Component->>Parser: parse(bundled XML)
    alt Success
        Parser->>Component: return InterlinearData
        Component->>Renderer: render parsed JSON view
    else Failure
        Parser->>Component: throw Error
        Component->>Renderer: render error block
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through XML and tests so bright,
I parsed each cluster late at night,
Mocks in a burrow, CI on the hill,
Jest runs quick and green tests thrill,
Hooray — the interlinear's just right! 🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately describes the main objective of the PR: adding an interlinearizer feature with XML parsing and web view integration. It aligns with the substantial changes across multiple files including parsers, web views, types, tests, and configuration.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch xml-parser

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@package.json`:
- Line 38: Update the vulnerable dependency declaration for fast-xml-parser by
changing the version string in package.json from "fast-xml-parser": "^5.3.3" to
"fast-xml-parser": "^5.3.4" (reference: the dependency key fast-xml-parser in
package.json), then regenerate your lockfile (npm install or yarn install) to
update package-lock.json / yarn.lock and run the test suite/build to ensure no
regressions.

In `@src/main.ts`:
- Line 38: The call to papi.webViews.openWebView(mainWebViewType, undefined, {
existingId: '?' }) returns a Promise that is currently unhandled; modify the
call in main.ts so the returned Promise is either awaited inside an async
context (await papi.webViews.openWebView(...)) or attach a .catch(...) handler
to handle errors, ensuring any thrown error is logged or handled; locate the
invocation by the identifier openWebView and the mainWebViewType/existingId
argument to update it.
🧹 Nitpick comments (3)
src/parsers/interlinearXmlParser.ts (2)

118-140: Consider consistent error handling between clusters and punctuations.

This function silently skips punctuations with missing Range elements (lines 122-127), while extractClustersFromVerse throws an error for the same condition. If this difference is intentional (punctuations being optional/non-critical), consider adding a brief comment explaining why punctuations are filtered rather than validated.


169-170: Edge case: Empty lexemesId produces IDs starting with /.

When a cluster has no lexemes, lexemesId is an empty string, producing IDs like "/5-10". If these IDs are used as keys or for lookups, the leading slash may cause unexpected behavior.

Consider whether clusters without lexemes are valid, and if so, whether the ID format should handle this case differently.

src/interlinearizer.web-view.tsx (1)

46-53: Minor: Redundant !parseError check.

Given the discriminated union ParseResult, when parsed is truthy, parseError is always undefined. The !parseError check is technically redundant but harmless—it may help TypeScript narrowing or serve as defensive coding.

- Bump `fast-xml-parser` version to `5.3.4` in `package.json` and `package-lock.json`.
- Upgrade `lockfileVersion` in `package-lock.json` to `3`.
- Improve README to clarify the role of `package.json` and `package-lock.json`, and add details about the new `test-data/` directory for sample interlinear XML.
- Add comments in `src/main.ts` and `src/interlinearizer.web-view.tsx` for better code documentation.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/interlinearizer.web-view.tsx`:
- Around line 18-19: Add a TypeScript global augmentation so the assignment to
globalThis.webViewComponent is typed: create a .d.ts file and declare a global
block that adds a var webViewComponent typed as a function returning a
JSX.Element (or the actual component return type), e.g. declare global { var
webViewComponent: () => JSX.Element; } and ensure the .d.ts is included in the
TS build (or referenced) so the compiler recognizes the global.

In `@src/main.ts`:
- Around line 52-61: The WebView is opened before its provider registration
completes, causing a race; await the registration promise before calling
papi.webViews.openWebView and before adding the provider to
context.registrations. Concretely, wait for mainWebViewProviderPromise (result
of papi.webViewProviders.registerWebViewProvider with
mainWebViewType/mainWebViewProvider), add the awaited registration to
context.registrations, then call papi.webViews.openWebView(mainWebViewType, ...)
and keep the existing error handling.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec made 3 comments.
Reviewable status: 0 of 10 files reviewed, 3 unresolved discussions (waiting on @alex-rawlings-yyc and @jasonleenaylor).


src/parsers/interlinearXmlParser.ts line 9 at r3 (raw file):

  PunctuationData,
  StringRange,
} from 'paranext-extension-template';

This file has several references to paranext-extension-template that need updating.


src/main.ts line 15 at r3 (raw file):

 * opening the WebView from the platform.
 */
const mainWebViewType = 'paranextExtensionTemplate.interlinearizer';

'paranextExtensionTemplate.interlinearizer' -> 'interlinearizer.<something like mainWebView>'


src/interlinearizer.web-view.tsx line 2 at r3 (raw file):

import { useMemo } from 'react';
import type { InterlinearData } from 'paranext-extension-template';

Need to update 'paranext-extension-template' to 'interlinearizer'.

- Changed import source for `InterlinearData` and other types from `paranext-extension-template` to `interlinearizer` in `src/interlinearizer.web-view.tsx` and `src/parsers/interlinearXmlParser.ts`.
- Updated the `mainWebViewType` constant in `src/main.ts` to reflect the new naming convention for the interlinearizer web view.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/parsers/interlinearXmlParser.ts`:
- Around line 122-143: In extractPunctuationsFromVerse, the current guard only
checks for undefined on range fields and may pass non-numeric values into the
returned TextRange; update the filter to validate/coerce the Range values
(el.Range['@_Index'] and el.Range['@_Length']) are finite numbers (e.g., use
Number(...) or parseInt and Number.isFinite / !Number.isNaN checks) and skip
entries that fail validation, then in the map emit Index and Length as numeric
values (not raw strings) so TextRange satisfies the expected numeric StringRange
shape.
- Around line 155-187: In extractClustersFromVerse, rangeElement['@_Index'] and
['@_Length'] are only checked for undefined but may be non-numeric; coerce them
to numbers (e.g., Number(...) or parseInt(..., 10)), validate they are finite
integers (not NaN/Infinity) and throw a clear Error if invalid before
constructing the StringRange and Id; ensure the Id uses the validated numeric
index/length values (and update any uses of index/length in the returned object
to the coerced numeric variables).
- Around line 1-9: The import from 'interlinearizer' is unresolved causing
lint/build errors; either update project module resolution (add tsconfig paths
and configure eslint-import-resolver-typescript) or replace the bare module
import with a correct relative import that points to the module exporting
LexemeData, PunctuationData, ClusterData, StringRange, InterlinearData, and
VerseData used in interlinearXmlParser.ts; locate the import line that currently
reads "import { X } from 'interlinearizer'" and change it to the proper relative
path to the file that defines those exports (or fix tsconfig/eslint resolver) so
the symbols resolve.

Copy link
Contributor

@jasonleenaylor jasonleenaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start, we should probably add unit tests in to verify the parsing behavior on this test data. Start off with good coverage ;)

@jasonleenaylor partially reviewed 8 files and made 3 comments.
Reviewable status: 4 of 10 files reviewed, 3 unresolved discussions (waiting on @alex-rawlings-yyc and @imnasnainaec).


src/types/interlinearizer.d.ts line 34 at r5 (raw file):

    Lexemes: LexemeData[];
    /** When true, this cluster instance is excluded from the interlinear display. */
    Excluded: boolean;

I'm curious about this bit of data. What do you see as the use case for it?


src/types/interlinearizer.d.ts line 70 at r5 (raw file):

    BookId: string;
    /** Verse data keyed by verse reference (e.g. "RUT 3:1"). */
    Verses: Record<string, VerseData>;

This makes a concrete claim that there can only be one set of verse data per reference, we would need to have code guarding that in the parsing.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec reviewed all commit messages, made 2 comments, and resolved 3 discussions.
Reviewable status: 4 of 10 files reviewed, 2 unresolved discussions (waiting on @alex-rawlings-yyc and @jasonleenaylor).


a discussion (no related file):
When I run Platform.Bible with this extension, I get:

Interlinearizer

Raw JSON of the model parsed from test-data/Interlinear_en_MAT.xml.

Parse error

Invalid XML: Range missing required Index or Length attributes

I'm glad for the error handling but suspect the default/demo behavior isn't supposed to trigger it.


package-lock.json line 0 at r5 (raw file):
This changes us from the template's "lockfileVersion": 2, to "lockfileVersion": 3,. I think I support that, but want to make sure it's a conscious choice. And once done, we probably want to avoid anybody taking us back to v2.

Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I've been out of the real world for long enough that I forgot about those entirely. I'll get them done

@alex-rawlings-yyc made 6 comments.
Reviewable status: 4 of 10 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jasonleenaylor).


a discussion (no related file):

Previously, imnasnainaec (D. Ror.) wrote…

When I run Platform.Bible with this extension, I get:

Interlinearizer

Raw JSON of the model parsed from test-data/Interlinear_en_MAT.xml.

Parse error

Invalid XML: Range missing required Index or Length attributes

I'm glad for the error handling but suspect the default/demo behavior isn't supposed to trigger it.

That was a rushed change I made. Should've tested it better before pushing. I'll add it to the list


package-lock.json line at r5 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

This changes us from the template's "lockfileVersion": 2, to "lockfileVersion": 3,. I think I support that, but want to make sure it's a conscious choice. And once done, we probably want to avoid anybody taking us back to v2.

Not a conscious decision. Still not super familiar with npm to I might've done that by accident


src/interlinearizer.web-view.tsx line 2 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Need to update 'paranext-extension-template' to 'interlinearizer'.

Done.


src/main.ts line 15 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

'paranextExtensionTemplate.interlinearizer' -> 'interlinearizer.<something like mainWebView>'

Done.


src/parsers/interlinearXmlParser.ts line 9 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

This file has several references to paranext-extension-template that need updating.

Done.

Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-rawlings-yyc made 1 comment.
Reviewable status: 4 of 10 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jasonleenaylor).


a discussion (no related file):

Previously, alex-rawlings-yyc wrote…

That was a rushed change I made. Should've tested it better before pushing. I'll add it to the list

Forgot to negate some conditions. Fixed in the latest commit

Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-rawlings-yyc resolved 1 discussion.
Reviewable status: 3 of 30 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jasonleenaylor).

- Introduced Jest configuration in `jest.config.ts` and setup file in `jest.setup.ts` for testing.
- Added unit tests for the `interlinearizer.web-view` component , `InterlinearXmlParser`, and `main` in `src/__tests__/`.
- Created mocks for static assets and dependencies to facilitate testing.
- Updated ESLint and Prettier configurations to ignore Jest-related files and directories.
- Enhanced README to document the new testing structure and usage.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@jest.config.ts`:
- Around line 65-76: The comment above the coverageThreshold is inconsistent: it
says "Branches at 98% to allow err instanceof Error fallback" while the
coverageThreshold.branches is set to 100; update the comment to reflect 100% (or
change coverageThreshold.branches to 98 if that was intended) by editing the
coverageThreshold block and its preceding comment so they match (refer to the
coverageThreshold object and the branches property in the jest.config.ts
snippet).
🧹 Nitpick comments (6)
__mocks__/web-view-inline.js (1)

5-7: Consider returning null for React component semantics.

If this mock is ever rendered as a React component, returning null is the idiomatic way to render nothing, whereas undefined would cause a React warning in strict mode. If this mock is purely for import resolution and never rendered, the current implementation is fine.

♻️ Optional fix
 function MockWebViewComponent() {
-  return undefined;
+  return null;
 }
__mocks__/papi-backend.js (1)

1-33: Potential duplication with __mocks__/@papi/backend.js.

This file is nearly identical to __mocks__/@papi/backend.js. The moduleNameMapper in jest.config.ts explicitly maps @papi/backend to this file, making the scoped directory mock potentially redundant. Consider consolidating to a single mock file and updating either the moduleNameMapper or removing the duplicate to avoid maintenance overhead.

__mocks__/@papi/backend.js (1)

1-33: Implementation is correct, but see duplication note above.

This is the proper Jest auto-mock location for the scoped @papi/backend package. However, since jest.config.ts uses moduleNameMapper to explicitly resolve this module, this file may not be used during test runs. Consider verifying which mock is active and consolidating.

src/__tests__/interlinearizer.web-view.test.tsx (1)

34-38: Unconventional but functional component access pattern.

Using require() and globalThis.webViewComponent works but couples tests to an internal export mechanism. If the web-view module's export pattern changes, tests will break silently at runtime rather than at compile time.

Consider whether the component could be exported directly for testability, or document this pattern if it's a framework constraint.

src/__tests__/main.test.ts (1)

39-44: Simplify the type guard logic.

Line 40 has confusing and partially redundant conditions:

  • x === undefined is fine
  • typeof x === 'object' && !(x instanceof Object) is convoluted — this attempts to catch null (which has typeof === 'object' but isn't an instance of Object), but the next line typeof x !== 'object' would return false for null anyway
♻️ Suggested simplification
 function isIWebViewProvider(x: unknown): x is IWebViewProvider {
-  if (x === undefined || (typeof x === 'object' && !(x instanceof Object))) return false;
-  if (typeof x !== 'object') return false;
+  if (x == null || typeof x !== 'object') return false;
   if (!('getWebView' in x)) return false;
   return typeof x.getWebView === 'function';
 }
src/parsers/interlinearXmlParser.ts (1)

123-141: Use coerced numeric values in the map step for consistency.

The filter at lines 127-129 correctly validates with Number() and Number.isFinite(), but the map at lines 135-136 uses the raw rangeElement['@_Index'] and rangeElement['@_Length'] values instead of the coerced numbers. While parseAttributeValue: true should ensure these are already numbers, this is inconsistent with extractClustersFromVerse which explicitly uses the coerced values.

♻️ Proposed fix for consistency
   return elements
     .filter((el): el is ParsedPunctuation & { Range: ParsedRange } => {
       const rangeElement = el.Range;
       if (!rangeElement) return false;
-      const indexRaw = Number(rangeElement['@_Index']);
-      const lengthRaw = Number(rangeElement['@_Length']);
-      return Number.isFinite(indexRaw) && Number.isFinite(lengthRaw);
+      const index = Number(rangeElement['@_Index']);
+      const length = Number(rangeElement['@_Length']);
+      return Number.isFinite(index) && Number.isFinite(length);
     })
     .map((el) => {
       const rangeElement = el.Range;
       return {
         TextRange: {
-          Index: rangeElement['@_Index'],
-          Length: rangeElement['@_Length'],
+          Index: Number(rangeElement['@_Index']),
+          Length: Number(rangeElement['@_Length']),
         },
         BeforeText: el.BeforeText ?? '',
         AfterText: el.AfterText ?? '',
       };
     });

@alex-rawlings-yyc
Copy link
Contributor Author

alex-rawlings-yyc commented Feb 5, 2026

Re: src/__tests__/interlinearizer.web-view.test.tsx: Loading the web-view module assigns the component to globalThis.webViewComponent. This pattern is required by the Platform.Bible web-view framework: the web-view entry is built with a ?inline query and consumed by main.ts, so the component is not a normal export. Tests that need to render the component must require() the module and read globalThis. If the web-view export mechanism changes, update this test accordingly.

Re: src/__tests__/main.ts: we are using eslint plugin no-null

Rest of nitpicks are valid and taken into account

Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-rawlings-yyc resolved 1 discussion.
Reviewable status: 3 of 30 files reviewed, 1 unresolved discussion (waiting on @imnasnainaec and @jasonleenaylor).

- Updated README to specify the single mock for `@papi/backend` and its usage in Jest tests.
- Modified `web-view-inline.js` mock to return `null` instead of `undefined` for better React compliance.
- Removed redundant `@papi/backend` mock file to streamline mock management.
- Clarified comments in `interlinearizer.web-view.test.tsx` regarding the web-view module loading pattern.
- Improved type handling in `interlinearXmlParser.ts` for index and length attributes.
- Added @file annotations to the comments in various Jest mock files to standardize documentation and improve clarity on their purpose.
- Updated unit test files to reflect the same @file annotation format for consistency in documentation style.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec partially reviewed 17 files and all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @alex-rawlings-yyc).

…cumentation

- Reintroduced an optional 'Excluded' boolean flag to the ClusterData structure, allowing for location-specific exclusion of phrases in the interlinear display.
- Updated README to document the 'Excluded' attribute and its behavior.
- Enhanced unit tests to validate parsing of the 'Excluded' flag in XML input, ensuring correct handling of both true and false values.
Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec reviewed 5 files and all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @alex-rawlings-yyc).

Copy link
Contributor

@myieye myieye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm both highly impressed and intimidated by the test and mock setup 😆. There's a lot going on there!

@myieye made 6 comments.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @alex-rawlings-yyc).


src/parsers/interlinearXmlParser.ts line 202 at r18 (raw file):

      parseAttributeValue: true,
      isArray: (_tagName, jPath) => {
        return [

⛏️ Hopefully, we'll be importing loads and loads of this XML 😉, so creating the collection outside the callback and using a Set feel like they're worthwhile optimizations to me:

const arrayPaths = new Set([
  'InterlinearData.Verses.item',
  'InterlinearData.Verses.item.VerseData.Cluster',
  'InterlinearData.Verses.item.VerseData.Punctuation',
  'InterlinearData.Verses.item.VerseData.Cluster.Lexeme',
]);

isArray: (_tagName, jPath) => arrayPaths.has(jPath),

.github/workflows/test.yml line 5 at r18 (raw file):

on:
  push:
    branches: ['main', 'release-prep', 'hotfix-*']

Where did 'release-prep' and 'hotfix-*' come from?
Do we anticipate using those?


README.md line 119 at r18 (raw file):

> See the [Extension Anatomy wiki page](https://github.com/paranext/paranext-extension-template/wiki/Extension-Anatomy) for more information about the various files that comprise an extension and their relationships to each other.

## Interlinear XML schema

I feel like this documentation is out of place in the readme, which is intended to be a brief intro to the project/repo.
Should it be a sibling of the actual parser?

It also sounds like it's "the" XML schema that our extension uses, when it's actually "just" for importing PT9 data.
Should the docs and/or the folder/file/class naming reflect that this is a PT9 schema?


README.md line 204 at r18 (raw file):

        </Cluster>
        <Cluster>
          <Range Index="4" Length="3" />

⛏️ Should probably be Length="7" 🙃


src/types/interlinearizer.d.ts line 6 at r18 (raw file):

 */
/**
 * Interlinear types (InterlinearData, VerseData, ClusterData, etc.) are the public API for

🤔 I thought these types were just for us to represent parsed PT9 data in TS and that our actual API/interlinear data model was yet to be determined. Am I behind? 😬

Copy link
Contributor

@myieye myieye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review noticed two more things, so I (1) added another comment regarding whitespace and (2) pushed a test and fix for numeric hashes.

@myieye made 2 comments.
Reviewable status: 33 of 35 files reviewed, 2 unresolved discussions (waiting on @alex-rawlings-yyc and @imnasnainaec).


src/__tests__/parsers/interlinearXmlParser.test.ts line 298 at r18 (raw file):

        {
          TextRange: { Index: 34, Length: 2 },
          BeforeText: '?',

Are we sure that the whitespace in the punctuation text elements is not meaningful?

@myieye
Copy link
Contributor

myieye commented Feb 13, 2026

I also took the liberty of refactoring the isArray callback per my suggestion.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec reviewed all commit messages, made 2 comments, and resolved 1 discussion.
Reviewable status: 33 of 35 files reviewed, 1 unresolved discussion (waiting on @alex-rawlings-yyc).


.github/workflows/test.yml line 5 at r18 (raw file):

Previously, myieye (Tim Haasdyk) wrote…

Where did 'release-prep' and 'hotfix-*' come from?
Do we anticipate using those?

That matches the branches in the other workflow files we inherited from the template.


src/types/interlinearizer.d.ts line 6 at r18 (raw file):

Previously, myieye (Tim Haasdyk) wrote…

🤔 I thought these types were just for us to represent parsed PT9 data in TS and that our actual API/interlinear data model was yet to be determined. Am I behind? 😬

You're correct. Jason expects to have a model draft for us to review soon. I figure this will be updated accordingly when we start implementing that.

- Modified README to include detailed documentation on the Paratext 9 XML schema and its parsed output.
- Adjusted comments in interlinearXmlParser.ts and interlinearXmlParser.test.ts to reference the new documentation for the 'Excluded' flag.
- Changed parser configuration to ensure tag text is not trimmed during parsing.
- Updated several package dependencies to their latest versions in package-lock.json.
Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-rawlings-yyc partially reviewed 2 files, made 2 comments, and resolved 1 discussion.
Reviewable status: 30 of 36 files reviewed, all discussions resolved (waiting on @imnasnainaec).


README.md line 119 at r18 (raw file):

Previously, myieye (Tim Haasdyk) wrote…

I feel like this documentation is out of place in the readme, which is intended to be a brief intro to the project/repo.
Should it be a sibling of the actual parser?

It also sounds like it's "the" XML schema that our extension uses, when it's actually "just" for importing PT9 data.
Should the docs and/or the folder/file/class naming reflect that this is a PT9 schema?

Good point. This was only ever going to be a temporary home, but I suppose given that it's a public repo it would be better-stored elsewhere from the beginning. Naming comments are very valid. I'll move this to a separate markdown


src/__tests__/parsers/interlinearXmlParser.test.ts line 298 at r18 (raw file):

Previously, myieye (Tim Haasdyk) wrote…

Are we sure that the whitespace in the punctuation text elements is not meaningful?

Now that I think about it, I think using the fast-xml-parser option trimValues: true isn't a great idea in general, since spaces need to be put in manually by a use in the PT9 side of things. Best to keep as much user-generated data as possible. I'll set it to false in the next push

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-rawlings-yyc reviewed 7 files and all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on alex-rawlings-yyc).

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec reviewed 5 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on alex-rawlings-yyc).


src/parsers/interlinearXmlParser.ts line 163 at r22 (raw file):

    /** Cluster Id: LexemesId/Index-Length when lexemes present; Index-Length when none. */
    const id = lexemesId ? `${lexemesId}/${index}-${length}` : `${index}-${length}`;
    const excluded = el.Excluded === 'true' || false;

❓ Why || false? Can === result in null or undefined?

@alex-rawlings-yyc
Copy link
Contributor Author

src/parsers/interlinearXmlParser.ts line 163 at r22 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

❓ Why || false? Can === result in null or undefined?

Good point.

Copy link
Contributor Author

@alex-rawlings-yyc alex-rawlings-yyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-rawlings-yyc reviewed 2 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on imnasnainaec).

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imnasnainaec reviewed 2 files and all commit messages, and resolved 1 discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on alex-rawlings-yyc).

@alex-rawlings-yyc alex-rawlings-yyc merged commit c02d8fc into main Feb 18, 2026
8 checks passed
@alex-rawlings-yyc alex-rawlings-yyc deleted the xml-parser branch February 18, 2026 23:01
alex-rawlings-yyc added a commit that referenced this pull request Feb 19, 2026
…#6)

Add interlinear XML parser, minimal WebView integration, and full test coverage

- Parse PT9 interlinear XML (`fast-xml-parser`) into typed `InterlinearData`
- Public API in `interlinearizer.d.ts` (`StringRange`, `ClusterData`, `VerseData`, etc.)
- Register interlinearizer WebView provider and open React WebView on activation
- Document PT9 schema and parser behavior in `src/parsers/pt9-xml.md`
- Jest unit tests with 100% coverage on parsers, `main`, and web-view
- GitHub Actions workflow for test and lint on push/PR to main
- Handle `Excluded` flag and numeric/hash validation in parser
- Add test-data sample XML and PAPI mocks for tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants