Skip to content

Conversation

@vatsalkeshav
Copy link
Contributor

@vatsalkeshav vatsalkeshav commented Dec 24, 2025

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

This pull request migrates ccx_decoders_xds.c and ccx_decoders_xds.h to /src/rust/xds rust module.

  • mod.rs : exposes the Rust implementation to the C codebase
  • types.rs : has types and structures for decoding extended data
  • handlers.rs : has handler functions for processing extended data packets

Tested on stream : https://sampleplatform.ccextractor.org/sample/b22260d065ab537899baaf34e78a5184671f4bcb2df0414d05e6345adfd7812f

@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 000b397...:
Report Name Tests Passed
Broken 12/13
CEA-708 14/14
DVB 7/7
DVD 3/3
DVR-MS 2/2
General 24/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 81/86
Teletext 21/21
WTV 13/13
XDS 16/34

Your PR breaks these cases:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b...
  • ccextractor --autoprogram --out=ttxt --latin1 1974a299f0...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 725a49f871...
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla d037c7509e...
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla e274a73653...
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla 85058ad37e...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds b22260d065...
  • ccextractor --autoprogram --out=ttxt --latin1 --xds --ucla c813e713a0...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 27fab4dbb6...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds bbd5bb52fc...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds b992e0cccb...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds d0291cdcf6...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds c8dc039a88...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 53339f3455...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 83b03036a2...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7d3f25c32c...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds f41d4c29a1...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 88cd42b89a...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7f41299cc7...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 0069dffd21...

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@cfsmp3
Copy link
Contributor

cfsmp3 commented Dec 24, 2025

Comprehensive Code Review: XDS Module Rust Port

Thank you for working on porting the XDS decoder to Rust! This is a substantial effort. I've done a thorough comparison between the C code in ccx_decoders_xds.c and your Rust implementation to identify differences that could cause CI failures and output parity issues.


🔴 Critical Issues (Must Fix)

1. Windows Build Failure - c_ulong Type Mismatch

File: handlers.rs:76

let new_data = unsafe { realloc(sub.data as *mut c_void, new_size as c_ulong) as *mut eia608_screen };

Problem: c_ulong is u32 on Windows but the Windows realloc expects c_ulonglong (u64). This causes the compilation error:

error[E0308]: mismatched types
   --> src\xds\handlers.rs:76:51
    |
 76 |         unsafe { realloc(sub.data as *mut c_void, new_size as c_ulong) as *mut eia608_screen };
    |                                                   ^^^^^^^^^^^^^^^^^^^ expected `u64`, found `u32`

Fix: Per PR #1873, this project has standardized on removing platform-dependent types. Use u64 instead of c_ulong:

let new_data = unsafe { realloc(sub.data as *mut c_void, new_size as u64) as *mut eia608_screen };

2. Memory Management UB - freep Function

File: handlers.rs:113-121

pub fn freep<T>(ptr: &mut *mut T) {
    unsafe {
        if !ptr.is_null() {
            let _ = Box::from_raw(*ptr);  // ❌ WRONG!
            *ptr = null_mut();
        }
    }
}

Problem: This function uses Box::from_raw to deallocate memory that was allocated by C's realloc. This is undefined behavior because:

  • C's realloc allocates memory using the C allocator
  • Rust's Box::from_raw expects memory from Rust's allocator

Fix: Either:

  1. Call C's free() directly: libc::free(*ptr as *mut c_void)
  2. Or don't define this function and use the C freep from bindings

3. CString/free() Memory Mismatch

File: handlers.rs:80-82

let c_str = CString::new(p).map_err(|_| XDSError::Err)?;
let c_str_ptr = c_str.into_raw();
data_element.xds_str = c_str_ptr;

Problem: The C code allocates xds_str with malloc and expects it to be freed with free(). But Rust's CString::into_raw() creates memory that should be freed with CString::from_raw(). These allocators are NOT guaranteed to be compatible!

C code expectation:

data->xds_str = p; // p is malloc'd memory
/* Note: Don't free(p) here - the pointer is stored in data->xds_str
   and will be freed by the encoder or decoder cleanup code */

Fix: Allocate the string using C's malloc to match the C memory model:

use libc::{malloc, strcpy};
let c_str = CString::new(p).map_err(|_| XDSError::Err)?;
let len = c_str.as_bytes_with_nul().len();
let ptr = unsafe { malloc(len) as *mut i8 };
if ptr.is_null() { return Err(XDSError::Err); }
unsafe { std::ptr::copy_nonoverlapping(c_str.as_ptr(), ptr, len); }
data_element.xds_str = ptr;

🟠 Output Parity Issues (Will Cause CI Failures)

4. Content Advisory - Empty Content Not Filtered

File: handlers.rs - xds_do_content_advisory

C code:

if (content[0]) // Only output content if not empty
    xdsprint(sub, ctx, content);

Rust code:

let _ = xdsprint(sub, ctx, state.content.clone()); // Always outputs, even if empty!

Fix: Add the empty check:

if !state.content.is_empty() {
    let _ = xdsprint(sub, ctx, state.content.clone());
}

5. Canadian French Rating - Character Mismatch

C code:

const char *ratingtext[8] = {"Exempt?es", "G?n?ral", "G?n?ral - D?conseill? aux jeunes enfants", ...

(Note: The C code uses literal ? characters where French accented characters would normally appear)

Rust code:

const CANADIAN_FRENCH_RATING_TEXT: [&str; 8] = [
    "Exemptées",
    "Général",
    ...
];

Problem: The Rust version uses proper UTF-8 French characters, but the C code uses literal ? characters. This WILL produce different output if this code path is triggered.

Fix: Match the C code exactly:

const CANADIAN_FRENCH_RATING_TEXT: [&str; 8] = [
    "Exempt?es",
    "G?n?ral",
    "G?n?ral - D?conseill? aux jeunes enfants",
    ...
];

Or fix BOTH C and Rust to use proper UTF-8 in a separate PR.


6. ts_start_of_xds Global Not Synchronized

Problem: The C code has a global LLONG ts_start_of_xds = -1 that gets set in various places. The Rust code uses AtomicI64 but I don't see where it's being updated to match the C behavior.

Fix: Ensure the Rust TS_START_OF_XDS atomic is updated in the same places as the C global.


🟡 Code Style Issues (CI Failures)

7. C Formatting - Tabs vs Spaces

File: ccx_decoders_xds.c:6-15

extern void ccxr_process_xds_bytes(
	struct ccx_decoders_xds_context *ctx,  // ← Uses tabs
	unsigned char hi,
	int lo);

Fix: Use spaces (4-space indent) per clang-format:

extern void ccxr_process_xds_bytes(
    struct ccx_decoders_xds_context *ctx,
    unsigned char hi,
    int lo);

8. Rust Clippy Warning - Unnecessary Parentheses

File: handlers.rs:136

if (hi >= 0x01 && hi <= 0x0f) {

Fix:

if hi >= 0x01 && hi <= 0x0f {

9. Unused Imports

warning: unused import: `c_int`
warning: unused import: `ccx_decoders_xds_context`

Fix: Remove unused imports.


🔵 Logic Differences to Verify

10. XDS Buffer Index Bounds Check

C code:

if (ctx->xds_buffers[ctx->cur_xds_buffer_idx].used_bytes <= 32)

This directly accesses the array without checking if cur_xds_buffer_idx >= 0.

Rust code:

if self.cur_xds_buffer_idx >= 0 {
    let idx = self.cur_xds_buffer_idx as usize;
    if idx < NUM_XDS_BUFFERS && (self.xds_buffers[idx].used_bytes as usize) <= 32 {

This is actually safer in Rust, but verify the C code works despite this (it might have UB when cur_xds_buffer_idx == -1).


11. xds_do_misc Signature Difference

C code:

int xds_do_misc(struct ccx_decoders_xds_context *ctx)  // No `sub` parameter

Verify: The Rust version's signature matches and doesn't pass sub where C doesn't.


📋 Pre-Push Checklist

Before pushing fixes, run:

# Fix Rust formatting
cargo fmt --manifest-path src/rust/Cargo.toml

# Fix Rust clippy warnings  
cargo clippy --manifest-path src/rust/Cargo.toml

# Fix C formatting
find src/lib_ccx -name "*.c" -o -name "*.h" | xargs clang-format -i

Summary

Category Count
🔴 Critical (UB/Build Failure) 3
🟠 Output Parity Issues 3
🟡 Code Style Issues 3
🔵 Logic Verification 2

The most urgent issues are:

  1. Windows type mismatch (prevents build)
  2. Memory management UB (will cause crashes or memory corruption)
  3. CString/free mismatch (will cause memory corruption)

Once these are fixed, focus on the output parity issues to match C behavior exactly.


Let me know if you have questions about any of these points!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants