Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Soft Works <softworkz@hotmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes
Date: Sat, 5 Feb 2022 02:08:48 +0000
Message-ID: <DM8P223MB0365DCA87C8B9FD873CF5B66BA2A9@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <Yf3Q0JOl3dEsY0Li@oneric.de>



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Oneric
> Sent: Saturday, February 5, 2022 2:20 AM
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes
> 
> On Fri, Feb 04, 2022 at 23:24:58 +0000, Soft Works wrote:
> > You want to "pollute" gazillions of subtitle streams in the
> > world from multiple subtitle formats with invisible
> > characters in order to solve an escaping problem in ffmpeg?
> 
> I do not consider using characters whose use is explicitly recommended
> by Unicode to be “polluting”. Further consider that, as mentioned,
> invisible characters in ASS are not uncommon already, and conversions
> from ASS to something else are rare because they are generally lossy.
> Lossy with regard to typesetting, that is; removing breaking hints in
> the form of plain Unicode characters would be a new form of lossiness.
> 
> > [From the other mail:]
> > I'm not into changing ffmpeg's ass output, it's all
> > about the internally used ass format and the escaping is
> > a central problem there.
> 
> I’m not interested in reworking ffmpeg’s internal subtitle handling.
> The proposed patch is a clear improvement over the status quo which
> is plain incorrect. Within reasonable effort and sound arguments for
> it adjustments to the patch can be made; reworking ffmpeg internals is
> imo not “reasonable” effort to correct an uncontestedly wrong escape.
> 
> You have two options:
> Either finally tell me what I asked about: where (as in which file and
> function) removing wordjoiners should even happen, and where possible
> lingering “\\ → \” conversions presumably are. If it’s simple enough,
> I can add a removal accompanied by a comment pointing out that this
> can go wrong.
> Or go ahead and create your own patch.
> 
> ~~~~~~
> 
> > > > I'm not sure whether all ffmpeg text-sub encoders can handle
> > > > those chars - which could be verified of course.
> > >
> > > Since it's in the BMP and ffmpeg already seems happy to assume some
> > > UTF-8 support by converting everything to it, I'm not worried about
> > > this until proven wrong.
> >
> > Proven wrong: https://github.com/libass/libass/issues/507
> 
> This issue is not at all wordjoiner-specific, despite the name.
> As far as I recall, this never led to wrong rendering.
> With HarfBuzz, the only fully featured shaping backend of libass,
> control characters were and are handled by HarfBuzz.
> And even with FriBiDi, U+2060 has been ignored since 2012, long before
> the linked issue was opened.
> 
> What that issue really is about is a combination of two more general
> issues. First, libass currently does not cache failures to look up a
> glyph, leading to multiple messages and at worst a perf degradation if
> no font in the font pool contains a glyph for a particular character.
> Second, there is the realisation that libass’ font-fallback strategy
> is not ideal for prefix-type control characters, characters which
> visibly affect both neighbours and a few others.
> The word-joiner is only highlighted here because, due to its usage as
> a backslash escape, it is commonly passed to libass, and a high enough
> percentage of fonts don’t contain it to create reports about it.
> 
> 
> For further reference: U+2060 was added in Unicode 3.2, released in 2002.
> If you want to strip it because it might not render correctly, you
> should also strip most emoji, the uppercase eszett ẞ and several
> actively used writing systems in their entirety.


Let's try to approach this from a different angle. Which case is
your [1/2] commit actually supposed to fix?
How did you test your patch?
Can we please go over an example?

Thanks,
sw
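
Editorial illustration (not part of the original message, and not the
patch's actual code): the technique debated in this thread can be sketched
roughly as below. In ASS dialogue text a backslash starts sequences such as
\N (hard line break), so a literal backslash is kept literal by placing the
invisible U+2060 WORD JOINER right after it. The function names are
hypothetical, and the reverse step is the lossy "removal" mentioned above.

```python
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER, invisible, added in Unicode 3.2


def escape_ass_backslashes(text: str) -> str:
    """Hypothetical helper: append U+2060 after every backslash so an ASS
    renderer does not read the backslash as the start of \\N, \\n or \\h."""
    out = []
    for ch in text:
        out.append(ch)
        if ch == "\\":
            out.append(WORD_JOINER)
    return "".join(out)


def strip_backslash_word_joiners(text: str) -> str:
    """Hypothetical reverse step for converting ASS text to another format.

    This can go wrong: a word joiner that the subtitle author deliberately
    placed after a backslash is indistinguishable from one added as an
    escape, so it is removed as well."""
    return text.replace("\\" + WORD_JOINER, "\\")


if __name__ == "__main__":
    original = "C:\\New folder"          # contains a literal backslash + N
    escaped = escape_ass_backslashes(original)
    print(escaped.encode("unicode_escape").decode("ascii"))
    print(strip_backslash_word_joiners(escaped) == original)
```

The sketch only covers the backslash itself; how other ASS-special
characters are treated is outside what this message discusses.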




