From: Soft Works <softworkz@hotmail.com> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes Date: Sun, 6 Feb 2022 01:08:35 +0000 Message-ID: <DM8P223MB0365E41B6A7FDD11627629EFBA2B9@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM> (raw) In-Reply-To: <Yf7zRpmfuVAS7c99@oneric.de> > -----Original Message----- > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of > Oneric > Sent: Saturday, February 5, 2022 11:00 PM > To: FFmpeg development discussions and patches <ffmpeg- > devel@ffmpeg.org> > Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix > handling of backslashes > > On Sat, Feb 05, 2022 at 02:08:48 +0000, Soft Works wrote: > > Let's try to approach this from a different side. Which case is > > your [1/2] commit actually supposed to fix? > > Escape backslashes when converting from WebVTT to not accidentally > introduce active ASS sequences and replace the wrong backslash-escape > in ff_ass_bprint_text_event with an escape that actually works. > > > How did you test your patch? > > Can we please go over an example? > > Take a look at the attached WebVTT file. Thanks a lot for the test file! > We expect the second event to be rendered like this, > as from WebVTT’s point of view it’s just normal text: > > our final \h approach \N into \ Coruscant. > > What we currently get after conversion to ASS is like this though > (pay attention to the number of spaces): > > our final approach > into \ Coruscant. Yup, no doubt that this is wrong. > If instead the word-joiner is appended as in my patch, the > visual output matches the expectation (mail does not contain U+2060): > > our final \h approach \N into \ Coruscant. I can also confirm that your patch would "work" with regards to ass output when trying with both "old" libass and new libass. I haven't tried with other implementations yet, but when this would turn out to be working with all usual implementations, I might even be OK with the word-joiner. But this is where the agreement ends. - If at all, the word-joiner insertion would need to be limited to ASS output ONLY - it would need to be controllable through an option in the ASS encoder - The word joiners should not be used in internal processing and only be (optionally) added when encoding to ass - Unfortunately, the FATE tests for the other subtitle formats do not include these sequences in the test source files, and that means, before making such change that might potentially affect all other text subtitle encoders, those sequences would need to be added to make sure these conversion won't be affected - Generally, those changes (also the BIDI mark insertion) should be evaluated with regards to all text subtitle encoders, making sure there's no side effect. You said: > I’m not interested in reworking ffmpeg’s internal subtitle handling. > The proposed patch is a clear improvement over the status quo which > is plain incorrect. Within reasonable effort and sound arguments for > it adjustments to the patch can be made; reworking ffmpeg internals is > imo not “reasonable” effort to correct an uncontestedly wrong escape. And that is a problem. The changes you are proposing are making changes to ffmpeg’s internal subtitle handling, so you need to decide whether you want to work on it or not. > You have two options: [..] > Or go ahead and create your own patch. I will do this, but "only" on top of my subtitle filtering patchset because that's my current focus area and just two weeks ago I had to add a temporary hack for a related case which is about ASS dialog lines like: Dialogue: 0,0:00:00.00,0:00:05.00,Default,,0,0,0,,{comment text: \....} Currently, ffmpeg does not recognize this as override code, even though it's valid in ASS that a backslash with the actual code doesn't appear immediately after the opening curly brace. What comes on top of this is that other subtitle decoders do NOT escape the curly braces like WebVTTdec and ass_bprint_text_event(). When you look at SubRip_capability_tester.srt from the FATE suite, you can see that it contains ASS codes that are expected to be recognized and applied, but when there's normal text in curly braces without a backslash, it should be treated at normal text. This is quite a mess that needs to be cleaned up with a plan and it's all related. Like I said already: A central point to this is the escaping and what's needed is a solution that can cover all of those things. I had put this subject aside as I've been unsure about how to do it, but this discussion has been very helpful to see the issues more clearly. How about separating the BIDI part from your patch? I'd see only two things remaining: - Go through all text subtitle encoders and think/research whether those marks would be acceptable in those formats or whether they would need to be removed (like now) - Think about whether the Unicode bidi marks should be replaced back to the html-style codes It's not these wouldn't work, but it's again about visibility and I tend to think that it would be preferable to have them visible in the output softworkz _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2022-02-06 1:08 UTC|newest] Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-01-16 18:16 Oneric 2022-01-16 18:16 ` [FFmpeg-devel] [PATCH 2/2] avcodec/webvttdec: honour bidi marks Oneric 2022-02-01 17:38 ` [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes Oneric 2022-02-01 19:44 ` Soft Works 2022-02-01 20:06 ` Oneric 2022-02-01 20:41 ` Soft Works 2022-02-01 23:25 ` Oneric 2022-02-02 4:44 ` Soft Works 2022-02-02 17:03 ` Oneric 2022-02-02 22:18 ` Soft Works 2022-02-02 22:44 ` Soft Works 2022-02-03 2:11 ` Oneric 2022-02-03 20:51 ` Soft Works 2022-02-04 1:01 ` Oneric 2022-02-04 1:30 ` Andreas Rheinhardt 2022-02-04 21:52 ` Oneric 2022-02-04 23:24 ` Soft Works 2022-02-05 1:20 ` Oneric 2022-02-05 2:08 ` Soft Works 2022-02-05 21:59 ` Oneric 2022-02-06 1:08 ` Soft Works [this message] 2022-02-06 1:37 ` Soft Works 2022-02-04 1:57 ` Soft Works 2022-02-04 5:34 ` Soft Works 2022-02-04 5:59 ` Soft Works 2022-02-04 6:48 ` Soft Works 2022-02-04 21:19 ` Oneric 2022-02-04 22:23 ` Soft Works
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=DM8P223MB0365E41B6A7FDD11627629EFBA2B9@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM \ --to=softworkz@hotmail.com \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git