-
Notifications
You must be signed in to change notification settings - Fork 41
refactor: Remove Xerces/xml-apis dependencies and modernize SAX readers #223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Remove Xerces/xml-apis dependencies and modernize SAX readers #223
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR successfully removes the Xerces and xml-apis dependencies from the codebase, replacing them with a custom AbstractXMLReader base class that provides standard JDK XMLReader functionality. This modernization eliminates classloader conflicts, reduces security risks, and decreases artifact size by leveraging the built-in Java XML parser.
Key changes:
- Introduced
AbstractXMLReaderto centralize XMLReader interface boilerplate and provide strict SAX2 compliance - Refactored 4 custom reader classes (DelimitedReader, EDIReader, ER7Reader, NCPDPReader) to extend
AbstractXMLReaderinstead of Xerces SAXParser - Removed xercesImpl and xml-apis JARs from both client and server dependencies
Reviewed changes
Copilot reviewed 8 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java | New base class providing XMLReader interface implementation with handler management and SAX2 feature compliance |
| server/src/com/mirth/connect/plugins/datatypes/ncpdp/NCPDPReader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, updated license header |
| server/src/com/mirth/connect/plugins/datatypes/hl7v2/ER7Reader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, updated license header |
| server/src/com/mirth/connect/plugins/datatypes/edi/EDIReader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, updated license header |
| server/src/com/mirth/connect/plugins/datatypes/delimited/DelimitedReader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, removed unused ContentHandler import, updated license header |
| server/docs/thirdparty/THIRD-PARTY-README.txt | Removed documentation for xml-apis and Xerces dependencies |
| server/.classpath | Removed xercesImpl-2.12.2.jar and xml-apis-1.4.01.jar entries |
| client/.classpath | Removed xercesImpl-2.12.2.jar and xml-apis-1.4.01.jar entries |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java
Outdated
Show resolved
Hide resolved
server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java
Outdated
Show resolved
Hide resolved
server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java
Outdated
Show resolved
Hide resolved
mgaffigan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll approve after I can kick the tires a bit - nothing obvious apart from the bool comparisons copilot already called out.
I assume this is a breaking change for any data type plugin. Should be called out in release notes.
Yes, it would be a breaking change for anything that specifically imports from packages below I am going to have follow-up PRs which are dependent on this one to remove other outdated xml dependencies, but this one had to go first since the standalone xerces release only supports JAXP version 1.4, which is what was bundled with Java 6. The internal implementation has been at JAXP 1.6 since Java 8. With this jar on the classpath, standard To test this I created channels where the source inbound type was the target datatype and the outbound type was XML. The destination reversed this and went from xml to the target type. I created a sample message and verified the transformations to and from xml so the final output was the same as the original. I did this for delimited, hl7v2, and edi. I did not know where to find a sample message to test NCPDP, but I assume it's fine since the others all worked. The bool comparison thing I went back and forth on before I opened the PR. I decided to keep it with the equality comparison since the parameter represents a value to be set, and is not meant for flow control, so it made more sense to me to do the explicit comparison as I would do for any value in a key/val pair that wasn't a boolean. I don't have a strong preference either way, though. |
mgaffigan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested by creating type to xml channels on Mirth 4.4.0 and this branch. Compared results given same source messages.
Here's a NCPDP Telecom Sample File. Note: File includes non-printable characters.
kayyagari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Built with Java 17 and tested using a channel with both inbound and outbound datatypes set to a combination of Hl7v2.x and XML.
NicoPiel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the motivation includes “correctly report namespace support to downstream transformers,” can we add a regression test that runs a reader through the actual transformation path (SAXSource/Transformer) to confirm no additional feature/property probes break execution?
server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java
Show resolved
Hide resolved
server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java
Show resolved
Hide resolved
5403e8f
5403e8f to
bf0392d
Compare
bf0392d to
3a291f7
Compare
"correctly report namespace support to downstream transformers" is alluding to the fact that the XMLReader.setFeature method states:
These features affect the argument values used when the The main purpose of this PR was to remove the imports of
I pushed some additional updates that now include tests for the new class and some additional changes after reading more of the interface javadocs to make sure the plugins follow the contract (mainly passing an empty When merging I plan to squash the first 4 commits and keep the last one separate. |
How to say this ... Does Mirth do anything stupiud or non-standard when it renders XML with Xerces that we have to either explicitly deprecate, explicitly test, or explicitly document as no longer supported? IOW it's not doing anything hacky with Xerces. I searched for code references and talked to the AI a bit. It does not seem to do anything custom with Xerces. |
|
I'm arguing with JARs to render I have to put it down for the night but it looks like I have to get a different CLI tool to view it. EDIT - figured it out Code coverage on these classes isn't good enough for my strategy to provide any useful insight HOWTO: From repo root run: code-coverage-reports/jacoco.exec \
--html code-coverage-reports \
--classfiles classes``` |
Removes the dependency on `xercesImpl` and `xml-apis` to resolve
classloader conflicts and reduce security risks in Java 17+
environments. Refactors custom Reader classes to use the standard JDK
`XMLReader` interface instead of extending the Apache Xerces
`SAXParser`.
Changes:
- **Dependency Removal:** Removed `xercesImpl` and `xml-apis` from the
build configuration.
- **New Base Class:** Introduced `AbstractXMLReader` to centralize
common `XMLReader` logic (handler management, feature/property
compliance) and reduce code duplication across plugins.
- **New Tests:** Added tests for AbstractXMLReader.
- **Reader Refactoring:** Updated `DelimitedReader`, `EDIReader`,
`ER7Reader`, and `NCPDPReader`:
- Replaced inheritance (`extends SAXParser`) with `extends AbstractXMLReader`.
- Added standard `parse(String)` overload via the base class.
- Implemented strict SAX2 compliance for `getFeature`/`setFeature`
to correctly report namespace support to downstream transformers.
- **Code Cleanup:** Removed unused imports of
`org.apache.xerces.parsers.SAXParser`.
- **Documentation Cleanup:** Removed references to removed libraries in
`THIRD-PARTY-README.txt`
This change ensures the application uses the built-in JDK XML parser,
reducing the artifact size and aligning with modern Java best practices.
Signed-off-by: Tony Germano <tony@germano.name>
According to the javadoc for ContentHandler.startElement, if there are no attributes, then an empty Attributes object should be passed rather than null. For both startElement and endElement if qualified names are not available then the qname parameter should be an empty string. It was already being used this way in most instances, but a few were passing null instead. Signed-off-by: Tony Germano <tony@germano.name>
041f912 to
d97c1f1
Compare
Removes the dependency on
xercesImplandxml-apisto resolve classloader conflicts and reduce security risks in Java 17+ environments. Refactors custom Reader classes to use the standard JDKXMLReaderinterface instead of extending the Apache XercesSAXParser.Changes:
xercesImplandxml-apisfrom the build configuration.AbstractXMLReaderto centralize commonXMLReaderlogic (handler management, feature/property compliance) and reduce code duplication across plugins.DelimitedReader,EDIReader,ER7Reader, andNCPDPReader:extends SAXParser) withextends AbstractXMLReader.parse(String)overload via the base class.getFeature/setFeatureto correctly report namespace support to downstream transformers.org.apache.xerces.parsers.SAXParser.THIRD-PARTY-README.txtThis change ensures the application uses the built-in JDK XML parser, reducing the artifact size and aligning with modern Java best practices.