Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: Soft Works <softworkz@hotmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes
Date: Sun, 6 Feb 2022 01:08:35 +0000
Message-ID: <DM8P223MB0365E41B6A7FDD11627629EFBA2B9@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <Yf7zRpmfuVAS7c99@oneric.de>



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Oneric
> Sent: Saturday, February 5, 2022 11:00 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix
> handling of backslashes
> 
> On Sat, Feb 05, 2022 at 02:08:48 +0000, Soft Works wrote:
> > Let's try to approach this from a different side. Which case is
> > your [1/2] commit actually supposed to fix?
> 
> Escape backslashes when converting from WebVTT to not accidentally
> introduce active ASS sequences and replace the wrong backslash-escape
> in ff_ass_bprint_text_event with an escape that actually works.
> 
> > How did you test your patch?
> > Can we please go over an example?
> 
> Take a look at the attached WebVTT file.

Thanks a lot for the test file!


> We expect the second event to be rendered like this,
> as from WebVTT’s point of view it’s just normal text:
> 
>   our final \h approach \N into \ Coruscant.
> 
> What we currently get after conversion to ASS is like this though
> (pay attention to the number of spaces):
> 
>   our final   approach
>   into \ Coruscant.

Yup, no doubt that this is wrong. 


> If instead the word-joiner is appended as in my patch, the
> visual output matches the expectation (mail does not contain U+2060):
> 
>   our final \h approach \N into \ Coruscant.

I can also confirm that your patch would "work" with regards
to ass output when trying with both "old" libass and new libass.
I haven't tried with other implementations yet, but when this 
would turn out to be working with all usual implementations,
I might even be OK with the word-joiner.

But this is where the agreement ends. 

- If at all, the word-joiner insertion would need to be 
  limited to ASS output ONLY
- it would need to be controllable through an option in the ASS 
  encoder
- The word joiners should not be used in internal processing and 
  only be (optionally) added when encoding to ass
- Unfortunately, the FATE tests for the other subtitle formats
  do not include these sequences in the test source files, and
  that means, before making such change that might potentially 
  affect all other text subtitle encoders, those sequences would
  need to be added to make sure these conversion won't be affected
- Generally, those changes (also the BIDI mark insertion) should 
  be evaluated with regards to all text subtitle encoders,
  making sure there's no side effect.

You said:

> I’m not interested in reworking ffmpeg’s internal subtitle handling.
> The proposed patch is a clear improvement over the status quo which
> is plain incorrect. Within reasonable effort and sound arguments for
> it adjustments to the patch can be made; reworking ffmpeg internals is
> imo not “reasonable” effort to correct an uncontestedly wrong escape.

And that is a problem. The changes you are proposing are making
changes to ffmpeg’s internal subtitle handling, so you need to decide 
whether you want to work on it or not.


> You have two options:
[..]
> Or go ahead and create your own patch.

I will do this, but "only" on top of my subtitle filtering patchset
because that's my current focus area and just two weeks ago I had to
add a temporary hack for a related case which is about ASS dialog lines
like:

Dialogue: 0,0:00:00.00,0:00:05.00,Default,,0,0,0,,{comment text: \....}

Currently, ffmpeg does not recognize this as override code, even though
it's valid in ASS that a backslash with the actual code doesn't appear
immediately after the opening curly brace.
What comes on top of this is that other subtitle decoders do NOT escape
the curly braces like WebVTTdec and ass_bprint_text_event().
When you look at SubRip_capability_tester.srt from the FATE suite, you
can see that it contains ASS codes that are expected to be recognized
and applied, but when there's normal text in curly braces without 
a backslash, it should be treated at normal text.

This is quite a mess that needs to be cleaned up with a plan and it's
all related. Like I said already: A central point to this is the escaping
and what's needed is a solution that can cover all of those things.

I had put this subject aside as I've been unsure about how to do it,
but this discussion has been very helpful to see the issues more clearly.

How about separating the BIDI part from your patch? I'd see only two
things remaining:

- Go through all text subtitle encoders and think/research whether 
  those marks would be acceptable in those formats or whether they
  would need to be removed (like now)
- Think about whether the Unicode bidi marks should be replaced back
  to the html-style codes
  It's not these wouldn't work, but it's again about visibility and 
  I tend to think that it would be preferable to have them visible 
  in the output

softworkz







_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

  reply	other threads:[~2022-02-06  1:08 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-16 18:16 Oneric
2022-01-16 18:16 ` [FFmpeg-devel] [PATCH 2/2] avcodec/webvttdec: honour bidi marks Oneric
2022-02-01 17:38 ` [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes Oneric
2022-02-01 19:44   ` Soft Works
2022-02-01 20:06     ` Oneric
2022-02-01 20:41       ` Soft Works
2022-02-01 23:25         ` Oneric
2022-02-02  4:44           ` Soft Works
2022-02-02 17:03             ` Oneric
2022-02-02 22:18               ` Soft Works
2022-02-02 22:44                 ` Soft Works
2022-02-03  2:11                   ` Oneric
2022-02-03 20:51                     ` Soft Works
2022-02-04  1:01                       ` Oneric
2022-02-04  1:30                         ` Andreas Rheinhardt
2022-02-04 21:52                           ` Oneric
2022-02-04 23:24                             ` Soft Works
2022-02-05  1:20                               ` Oneric
2022-02-05  2:08                                 ` Soft Works
2022-02-05 21:59                                   ` Oneric
2022-02-06  1:08                                     ` Soft Works [this message]
2022-02-06  1:37                                       ` Soft Works
2022-02-04  1:57                         ` Soft Works
2022-02-04  5:34                           ` Soft Works
2022-02-04  5:59                             ` Soft Works
2022-02-04  6:48                             ` Soft Works
2022-02-04 21:19                               ` Oneric
2022-02-04 22:23                                 ` Soft Works

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM8P223MB0365E41B6A7FDD11627629EFBA2B9@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM \
    --to=softworkz@hotmail.com \
    --cc=ffmpeg-devel@ffmpeg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git