Skip to content

Conversation

@Rahul-2k4
Copy link
Contributor

@Rahul-2k4 Rahul-2k4 commented Dec 26, 2025

In raising this pull request, I confirm the following (please check boxes):

I have read and understood the contributors guide
.

  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog

.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

Summary

This PR completes verification of Issue #447 and applies a few small but necessary fixes identified during review. The --split-dvb-subs feature is confirmed to work correctly with real DVB broadcast samples.

Key points

Verified multi-stream DVB subtitle extraction end-to-end using a real broadcast TS.

Applied 3 minor code-review fixes:

  • Fix escaped newline in DVB debug logging.
  • Remove hardcoded debug PID values.
  • Improve language-code validation to accept uppercase letters.
  • Confirmed legacy (non-split) behavior remains unaffected.

Test sample clarification

  • The originally referenced file (arte_multiaudio.ts) was not suitable for validation:
  • DVB subtitle streams were advertised in the PMT
  • No actual DVB subtitle bitmap packets were present
  • Testing was therefore redone using a proper broadcast capture containing real DVB subtitles:
    https://tsduck.io/streams/france-dttv/tnt-uhf30-546MHz-2019-01-22.ts

Expected / observed behavior:

  • Separate output files are created per DVB subtitle stream.
  • Only streams that actually broadcast subtitle packets produce non-empty output.
  • Streams advertised in PMT but carrying no subtitle data result in empty files.
  • This matches normal DVB broadcast behavior and is not a bug.

Results:

  • Valid SRT output extracted ✔
  • Empty streams handled safely ✔
  • Invalid flag combinations rejected ✔
  • No crashes or regressions ✔

Related issue

Fixes and verification for Issue #447.

- Fixed NULL pointer dereference in dvb_subtitle_decoder.c (sub->prev check).
- Corrected logic in dvbsub_handle_display_segment to prevent dropped subtitles.
- Implemented robust encoder context swapping in general_loop.c for DVB streams.
- Added regression test: tests/regression/dvb_split.txt.
- Verified 100% completion in split mode and correct Teletext/DVB routing.
@Rahul-2k4 Rahul-2k4 changed the title Final Automatic extraction of multiple DVB subtitle streams (--split-dvb-subs) fixes#447 #1864 Dec 27, 2025
… (kept both split_dvb_subs and scc_framerate)
- Fix escaped newline in debug print (dvb_subtitle_decoder.c:1861)
- Replace hardcoded PID 0x106 with 0 in debug calls (lines 1822, 1835)
- Accept uppercase letters in language code validation (ts_tables.c:396)
Copy link
Contributor

@cfsmp3 cfsmp3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this feature! I tested it with actual DVB samples:

Sample 1: 04e47919de5908edfa1fddc522a811d56bc67a1d4020f8b3972709e25b15966c.ts (3 DVB subtitle streams: CHI, ENG, CHS)

  • Legacy mode (no flag): ✅ Works
  • --split-dvb-subs: ❌ Crashes with "PES data packet larger than remaining buffer" error at ~63%

Sample 2: 36d5eca53c56ac18e727badec449ac0f10096369f8a7eda5f7108f7170c5cc8c.mpg (2 DVB subtitle streams)

  • Legacy mode: ✅ Works
  • --split-dvb-subs: Runs but outputs both streams to a single _unk.srt file instead of separate files

Please investigate:

  1. The crash on the 3-stream sample
  2. Why streams aren't being split into separate files with language tags (the log shows "lang=unk" for both PIDs)

Also, please update CHANGES.TXT for this new feature.

@Rahul-2k4
Copy link
Contributor Author

Thanks for the detailed review and for testing this. I’ll proceed with investigating the reported issues on my side and follow up with an update.

…actor#447

- Replace spin-lock with proper mutex (CRITICAL_SECTION/pthread_mutex)
- Add per-pipeline OCR contexts for thread safety
- Include PID in output filenames to handle duplicate languages
- Add dvbsub_get_context_size() and dvbsub_copy_context() for state management
- Improve language code validation (ISO 639-2 compliant)
- Change fatal error to warning for oversized PES packets
- Better language lookup from potential_streams before cinfo fallback
- Reset potential_stream data in demuxer cleanup
Fixes segmentation fault at 99% when PAT changes occur during DVB
subtitle processing. The crash happened because decoder context
private_data was freed but still accessed.

Changes:
- Add NULL check in process_data() before dvbsub_decode call
- Add defensive NULL check at start of dvbsub_decode()
- Add defensive NULL check at start of write_dvb_sub()
- Deep copy DVB bitmap data in copy_subtitle() to avoid aliasing
- Safe DVBSubContext copy that doesn't alias linked list pointers
- Clean up pipeline decoder refs in dinit_cap() after PAT change
- Direct FTS calculation for DVB-only streams

Tested with 11GB TS file with 23 PAT changes - no crash.
…tion

After PAT changes, the pipeline's decoder was NULLed out to prevent
crashes, but this caused all subsequent DVB data to be skipped.

Now the decoder is reinitialized when detected as NULL, allowing
subtitle extraction to continue across PAT changes.
Resolves conflicts while preserving Issue CCExtractor#447 fix for DVB multi-stream handling:
- Kept DVB metadata update logic in ts_tables.c for split mode
- Adapted to upstream's single-param dvbsub_init_decoder signature
- Updated lib_ccx.c and general_loop.c to match new API
- Run clang-format on all source files to fix CI formatting check
- Add Issue CCExtractor#447 DVB multi-stream feature to CHANGES.TXT
@Rahul-2k4 Rahul-2k4 requested a review from cfsmp3 December 31, 2025 06:43
…uption

The start_credits_text and end_credits_text pointers were being copied
directly from the encoder config options, but free_encoder_context()
would later free them. This caused memory corruption when the pointers
referred to memory owned by ccx_options.

Now these strings are deep-copied in init_encoder() so each encoder
context owns its own copy, fixing the --startcreditstext regression.
@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 64ce4ac...:
Report Name Tests Passed
Broken 13/13
CEA-708 14/14
DVB 7/7
DVD 3/3
DVR-MS 2/2
General 25/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 81/86
Teletext 21/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 64ce4ac...:
Report Name Tests Passed
Broken 13/13
CEA-708 14/14
DVB 7/7
DVD 3/3
DVR-MS 2/2
General 27/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 86/86
Teletext 21/21
WTV 13/13
XDS 34/34

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b..., Last passed: Never
  • ccextractor --out=spupng c83f765c66..., Last passed: Never
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never

All tests passed completely.

Check the result page for more info.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants