Fix: Add missing SCTE-20 marker bit syntax check (resolves TODO) #2130

Open
pranavshar223 wants to merge 1 commit into CCExtractor:master from pranavshar223:fix-scte20-syntax-check

@pranavshar223
Contributor

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

Description

This PR resolves the outstanding TODO to add a syntax check within the SCTE-20 parsing loop, making the parser more resilient to corrupted or malformed video streams.

According to the SCTE-20 broadcast specification, each caption block is encoded as a 26-bit structure containing priority, field number, line offset, two bytes of caption data, and a final marker bit. The specification requires that final bit to be 1; it acts as a synchronization and marker bit.
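As a rough illustration of that layout, the sketch below unpacks one block from the low 26 bits of an integer and rejects it when the marker bit is 0. The field widths (2 + 2 + 5 + 8 + 8 + 1 bits) are an assumption chosen to add up to the 26 bits described above, and `Scte20Block` and `unpack_block` are hypothetical names, not identifiers from the CCExtractor codebase.

```rust
/// One SCTE-20 caption block, unpacked from its 26-bit form.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub struct Scte20Block {
    pub priority: u8,     // assumed 2 bits
    pub field_number: u8, // assumed 2 bits
    pub line_offset: u8,  // assumed 5 bits
    pub cc_data1: u8,     // 8 bits
    pub cc_data2: u8,     // 8 bits
}

/// Unpack a block held in the low 26 bits of `raw`, first field in the
/// most significant position. Returns `None` when the trailing marker
/// bit is 0, which signals a syntax error in the block.
pub fn unpack_block(raw: u32) -> Option<Scte20Block> {
    let marker_bit = raw & 0x1;
    if marker_bit == 0 {
        return None;
    }
    Some(Scte20Block {
        priority: ((raw >> 24) & 0x03) as u8,
        field_number: ((raw >> 22) & 0x03) as u8,
        line_offset: ((raw >> 17) & 0x1F) as u8,
        cc_data1: ((raw >> 9) & 0xFF) as u8,
        cc_data2: ((raw >> 1) & 0xFF) as u8,
    })
}
```

The real parser reads these fields one at a time from a bitstream rather than from a packed integer; the packed form is used here only to make the bit positions visible.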

Previously, the parser read this trailing bit to advance the stream but discarded it without evaluation. If corruption or misalignment caused the marker bit to read as 0, the parser would still accept the corrupted cc_data1 and cc_data2 payloads, potentially sending garbage characters to the subtitle decoder.

Changes Made

I have synchronized this fix across both the original C codebase and the Rust port to maintain parity.

  • Captured the Marker Bit: Instead of discarding the final bit of the SCTE-20 block, the parser now stores and evaluates it explicitly.
  • Added Syntax Validation: If marker_bit == 0, the block is treated as locally corrupted. The parser now logs a verbose debug warning (user_data: SCTE-20 syntax error - marker bit is 0).
  • Safe Error Recovery: When a 0 is detected, the parser issues a continue to skip processing of the corrupted block.
  • Zero-State Safety: Because the cc_data array is initialized with zeros at the start of the function ([0u8; 3 * 31 + 1]), skipping a block leaves its three bytes as 0x00 0x00 0x00. The caption decoding engine recognizes 0x00 as an invalid/empty payload and ignores it without crashing or rendering garbage text.
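The skip-and-leave-zeros behavior described in the list above can be sketched as follows. This is a simplified model, not the code in this PR: `RawBlock` and `collect_cc_data` are hypothetical names, the blocks arrive pre-parsed instead of being read from a bitstream, and the first byte of each triplet is written as the plain field number rather than whatever encoding the real parser uses.

```rust
/// Upper bound on blocks per user-data packet, matching the
/// `3 * 31 + 1` buffer size mentioned above.
const MAX_BLOCKS: usize = 31;

/// Already-parsed fields of one SCTE-20 block (hypothetical type).
pub struct RawBlock {
    pub field_number: u8,
    pub cc_data1: u8,
    pub cc_data2: u8,
    pub marker_bit: u8,
}

/// Copy valid blocks into a zero-initialized buffer, three bytes per
/// block. A block whose marker bit is 0 is skipped, so its slot keeps
/// the initial 0x00 0x00 0x00. Returns the buffer and the skip count.
pub fn collect_cc_data(blocks: &[RawBlock]) -> ([u8; 3 * MAX_BLOCKS + 1], usize) {
    let mut cc_data = [0u8; 3 * MAX_BLOCKS + 1];
    let mut skipped = 0;

    for (j, block) in blocks.iter().take(MAX_BLOCKS).enumerate() {
        if block.marker_bit == 0 {
            // Syntax error: leave this slot zeroed and move on.
            skipped += 1;
            continue;
        }
        // Simplified triplet layout (an assumption for this sketch).
        cc_data[j * 3] = block.field_number;
        cc_data[j * 3 + 1] = block.cc_data1;
        cc_data[j * 3 + 2] = block.cc_data2;
    }

    (cc_data, skipped)
}
```

Note that the slot index `j` still advances for a skipped block, so later blocks keep their positions. Whether the downstream decoder ignores an all-zero triplet is the claim made in the list above and is not demonstrated by this sketch.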

Files Modified

  • C Core: src/lib_ccx/es_userdata.c
  • Rust Port: src/rust/src/es/userdata.rs

Testing

  • Rust: cargo check passes successfully with no warnings.
  • C: Compiled cleanly without warnings or errors using the Linux ./build script.
  • Confirmed that standard stream parsing continues normally and only malformed blocks trigger the skip condition.

@pranavshar223 force-pushed the fix-scte20-syntax-check branch from 8277745 to 381489f on February 20, 2026 10:04
@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 6f7ce27...:
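| Report Name | Tests Passed |
|-------------|--------------|
| Broken      | 13/13        |
| CEA-708     | 14/14        |
| DVB         | 6/7          |
| DVD         | 3/3          |
| DVR-MS      | 2/2          |
| General     | 25/27        |
| Hardsubx    | 1/1          |
| Hauppage    | 3/3          |
| MP4         | 3/3          |
| NoCC        | 10/10        |
| Options     | 81/86        |
| Teletext    | 21/21        |
| WTV         | 13/13        |
| XDS         | 34/34        |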

Your PR breaks these cases:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

Congratulations: Merging this PR would fix the following tests:


It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 6f7ce27...:
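| Report Name | Tests Passed |
|-------------|--------------|
| Broken      | 13/13        |
| CEA-708     | 14/14        |
| DVB         | 6/7          |
| DVD         | 3/3          |
| DVR-MS      | 2/2          |
| General     | 25/27        |
| Hardsubx    | 1/1          |
| Hauppage    | 3/3          |
| MP4         | 3/3          |
| NoCC        | 10/10        |
| Options     | 81/86        |
| Teletext    | 21/21        |
| WTV         | 13/13        |
| XDS         | 34/34        |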

Your PR breaks these cases:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

Congratulations: Merging this PR would fix the following tests:


It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.
