Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
* [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles
@ 2023-12-10 16:37 Oneric
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 1/3] avcodec/webvttdec: honour bidi marks Oneric
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Oneric @ 2023-12-10 16:37 UTC (permalink / raw)
  To: ffmpeg-devel

Changes from v1:
 - ff_ass_bprint_text_event now only inserts a word-joiner
   if there isn’t already one anyway
 - added a third commit improving the handling of
   curly brackets for standard ASS renderers

Heads up:
As everytime FATE’s ASS reference samples are touched the commits
contain CRLF endings because the reference files do.
Unless it has finally been fixed, patchwork doesn’t like this and will
mangle and falsely fail to apply the patches and this particular mailing list
has also in the past been eager to rewrite the mail body, mangling the encoding
and destroying CRLF line endings in the process.
I took the usual measures to protect against the latter (explicity using base64
transfer encoding and including non-ASCII characters in the body), but if this
now also doesn't work anymore, I’ll just reply to this mail with a link to
the patches uploaded somewhere else.
(And send patches to get rid of CRLF line endings)


Oneric (3):
  avcodec/webvttdec: honour bidi marks
  avcodec/{ass,webvttdec}: fix handling of backslashes
  avcodec/{ass,webvttdec}: more portable curly brace escapes

 libavcodec/ass.c           | 19 +++++++++++++++----
 libavcodec/webvttdec.c     |  4 ++--
 tests/ref/fate/sub-webvtt  |  2 +-
 tests/ref/fate/sub-webvtt2 |  2 +-
 4 files changed, 19 insertions(+), 8 deletions(-)

-- 
2.39.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [FFmpeg-devel] [PATCH v2 1/3] avcodec/webvttdec: honour bidi marks
  2023-12-10 16:37 [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
@ 2023-12-10 16:37 ` Oneric
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 2/3] avcodec/{ass, webvttdec}: fix handling of backslashes Oneric
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Oneric @ 2023-12-10 16:37 UTC (permalink / raw)
  To: ffmpeg-devel

---
“”
---
 libavcodec/webvttdec.c     | 2 +-
 tests/ref/fate/sub-webvtt2 | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
index 690f00dc47..990d150f16 100644
--- a/libavcodec/webvttdec.c
+++ b/libavcodec/webvttdec.c
@@ -39,7 +39,7 @@ static const struct {
     {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
     {"{", "\\{"}, {"}", "\\}"}, // escape to avoid ASS markup conflicts
     {"&gt;", ">"}, {"&lt;", "<"},
-    {"&lrm;", ""}, {"&rlm;", ""}, // FIXME: properly honor bidi marks
+    {"&lrm;", "\xe2\x80\x8e"}, {"&rlm;", "\xe2\x80\x8f"},
     {"&amp;", "&"}, {"&nbsp;", "\\h"},
 };
 
diff --git a/tests/ref/fate/sub-webvtt2 b/tests/ref/fate/sub-webvtt2
index 1d236eabdc..31fb5f83a7 100644
--- a/tests/ref/fate/sub-webvtt2
+++ b/tests/ref/fate/sub-webvtt2
@@ -21,6 +21,6 @@ Dialogue: 0,0:00:12.50,0:00:32.50,Default,,0,0,0,,OK, let’s go.
 Dialogue: 0,0:00:38.00,0:00:43.00,Default,,0,0,0,,I want to 愛あい love you\NThat's not proper English!
 Dialogue: 0,0:00:43.00,0:00:46.00,Default,,0,0,0,,{\i1}キツネ{\i0}じゃない キツネじゃない\N乙女おとめは
 Dialogue: 0,0:00:50.00,0:00:55.00,Default,,0,0,0,,Some time ago in a rather distant place....
-Dialogue: 0,0:00:55.00,0:01:00.00,Default,,0,0,0,,Descending: 123456\NAscending: 123456
+Dialogue: 0,0:00:55.00,0:01:00.00,Default,,0,0,0,,Descending: ‏123456‎\NAscending: 123456
 Dialogue: 0,0:01:00.00,0:01:05.00,Default,,0,0,0,,>> Never gonna give you up Never gonna let you down\NNever\hgonna\hrun\haround & desert\hyou
 Dialogue: 0,0:55:00.00,1:00:00.00,Default,,0,0,0,,Transcrit par Célestes™
-- 
2.39.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [FFmpeg-devel] [PATCH v2 2/3] avcodec/{ass, webvttdec}: fix handling of backslashes
  2023-12-10 16:37 [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 1/3] avcodec/webvttdec: honour bidi marks Oneric
@ 2023-12-10 16:37 ` Oneric
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 3/3] avcodec/{ass, webvttdec}: more portable curly brace escapes Oneric
  2023-12-28  0:55 ` [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
  3 siblings, 0 replies; 6+ messages in thread
From: Oneric @ 2023-12-10 16:37 UTC (permalink / raw)
  To: ffmpeg-devel

Backslashes cannot be escaped by a backslash in any ASS renderer,
but unless followed by specific characters it is just printed out.
Insert a word-joiner character after a backslash to break up
active sequences without changing the visual output.
---
“”
---
 libavcodec/ass.c       | 9 ++++++++-
 libavcodec/webvttdec.c | 2 +-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/libavcodec/ass.c b/libavcodec/ass.c
index 5058dc8337..a68d3568b4 100644
--- a/libavcodec/ass.c
+++ b/libavcodec/ass.c
@@ -183,9 +183,16 @@ void ff_ass_bprint_text_event(AVBPrint *buf, const char *p, int size,
 
         /* standard ASS escaping so random characters don't get mis-interpreted
          * as ASS */
-        } else if (!keep_ass_markup && strchr("{}\\", *p)) {
+        } else if (!keep_ass_markup && strchr("{}", *p)) {
             av_bprintf(buf, "\\%c", *p);
 
+        /* append word-joiner U+2060 as UTF-8 to break up sequences like \N */
+        } else if (!keep_ass_markup && *p == '\\') {
+            if (p_end - p <= 3 || strncmp(p + 1, "\xe2\x81\xa0", 3))
+                av_bprintf(buf, "\\\xe2\x81\xa0");
+            else
+                av_bprintf(buf, "\\");
+
         /* some packets might end abruptly (no \0 at the end, like for example
          * in some cases of demuxing from a classic video container), some
          * might be terminated with \n or \r\n which we have to remove (for
diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
index 990d150f16..6e55bc5499 100644
--- a/libavcodec/webvttdec.c
+++ b/libavcodec/webvttdec.c
@@ -37,7 +37,7 @@ static const struct {
     {"<i>", "{\\i1}"}, {"</i>", "{\\i0}"},
     {"<b>", "{\\b1}"}, {"</b>", "{\\b0}"},
     {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
-    {"{", "\\{"}, {"}", "\\}"}, // escape to avoid ASS markup conflicts
+    {"{", "\\{"}, {"}", "\\}"}, {"\\", "\\\xe2\x81\xa0"}, // escape to avoid ASS markup conflicts
     {"&gt;", ">"}, {"&lt;", "<"},
     {"&lrm;", "\xe2\x80\x8e"}, {"&rlm;", "\xe2\x80\x8f"},
     {"&amp;", "&"}, {"&nbsp;", "\\h"},
-- 
2.39.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [FFmpeg-devel] [PATCH v2 3/3] avcodec/{ass, webvttdec}: more portable curly brace escapes
  2023-12-10 16:37 [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 1/3] avcodec/webvttdec: honour bidi marks Oneric
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 2/3] avcodec/{ass, webvttdec}: fix handling of backslashes Oneric
@ 2023-12-10 16:37 ` Oneric
  2023-12-28  0:55 ` [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
  3 siblings, 0 replies; 6+ messages in thread
From: Oneric @ 2023-12-10 16:37 UTC (permalink / raw)
  To: ffmpeg-devel

Unlike what the old comment suggested, standard ASS has no character
escape mechanism, but a closing curly bracket doesn't even need one.

For manual authored sub files using a full-width variant of an apropiate
font and with scaling and psacing modifiers is a common workaround.
This is not an option here, but we can still make things much less bad.
Now the desired opening bracket still shows up in libass and
standard renders will merely display a backslash in its place
instead of stripping the following text like before.
---
“”
---
 libavcodec/ass.c          | 12 ++++++++----
 libavcodec/webvttdec.c    |  2 +-
 tests/ref/fate/sub-webvtt |  2 +-
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/libavcodec/ass.c b/libavcodec/ass.c
index a68d3568b4..e7a1ac0eb5 100644
--- a/libavcodec/ass.c
+++ b/libavcodec/ass.c
@@ -181,10 +181,14 @@ void ff_ass_bprint_text_event(AVBPrint *buf, const char *p, int size,
         if (linebreaks && strchr(linebreaks, *p)) {
             av_bprintf(buf, "\\N");
 
-        /* standard ASS escaping so random characters don't get mis-interpreted
-         * as ASS */
-        } else if (!keep_ass_markup && strchr("{}", *p)) {
-            av_bprintf(buf, "\\%c", *p);
+        /* cancel curly brackets to avoid bogus override tag blocks
+         * hiding text. Standard ASS has no character escapes,
+         * though (only) libass provides \{ and \}.
+         * Unpaired closing brackets don't need escaping at all though and
+         * to make the situation less bad in standard ASS insert an empty block
+         */
+        } else if (!keep_ass_markup && *p == '{') {
+            av_bprintf(buf, "\\{{}");
 
         /* append word-joiner U+2060 as UTF-8 to break up sequences like \N */
         } else if (!keep_ass_markup && *p == '\\') {
diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
index 6e55bc5499..35bdbe805d 100644
--- a/libavcodec/webvttdec.c
+++ b/libavcodec/webvttdec.c
@@ -37,7 +37,7 @@ static const struct {
     {"<i>", "{\\i1}"}, {"</i>", "{\\i0}"},
     {"<b>", "{\\b1}"}, {"</b>", "{\\b0}"},
     {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
-    {"{", "\\{"}, {"}", "\\}"}, {"\\", "\\\xe2\x81\xa0"}, // escape to avoid ASS markup conflicts
+    {"{", "\\{{}"}, {"\\", "\\\xe2\x81\xa0"}, // escape to avoid ASS markup conflicts
     {"&gt;", ">"}, {"&lt;", "<"},
     {"&lrm;", "\xe2\x80\x8e"}, {"&rlm;", "\xe2\x80\x8f"},
     {"&amp;", "&"}, {"&nbsp;", "\\h"},
diff --git a/tests/ref/fate/sub-webvtt b/tests/ref/fate/sub-webvtt
index 2317c7d5a0..7f0a306361 100644
--- a/tests/ref/fate/sub-webvtt
+++ b/tests/ref/fate/sub-webvtt
@@ -21,7 +21,7 @@ Dialogue: 0,0:00:22.00,0:00:24.00,Default,,0,0,0,,at the AMNH.
 Dialogue: 0,0:00:24.00,0:00:26.00,Default,,0,0,0,,Thank you for walking down here.
 Dialogue: 0,0:00:27.00,0:00:30.00,Default,,0,0,0,,And I want to do a follow-up on the last conversation we did.\Nmultiple lines\Nagain
 Dialogue: 0,0:00:30.00,0:00:31.50,Default,,0,0,0,,When we e-mailed—
-Dialogue: 0,0:00:30.50,0:00:32.50,Default,,0,0,0,,Didn't we {\b1}talk {\i1}about\N{\i0} enough{\b0} in that conversation? \{I'm not an ASS comment\}
+Dialogue: 0,0:00:30.50,0:00:32.50,Default,,0,0,0,,Didn't we {\b1}talk {\i1}about\N{\i0} enough{\b0} in that conversation? \{{}I'm not an ASS comment}
 Dialogue: 0,0:00:32.00,0:00:35.50,Default,,0,0,0,,No! No no no no; 'cos 'cos obviously 'cos
 Dialogue: 0,0:00:32.50,0:00:33.50,Default,,0,0,0,,{\i1}Laughs{\i0}
 Dialogue: 0,0:00:35.50,0:00:38.00,Default,,0,0,0,,You know I'm so excited my glasses are falling off here.
-- 
2.39.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles
  2023-12-10 16:37 [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
                   ` (2 preceding siblings ...)
  2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 3/3] avcodec/{ass, webvttdec}: more portable curly brace escapes Oneric
@ 2023-12-28  0:55 ` Oneric
  2024-01-10  0:25   ` Oneric
  3 siblings, 1 reply; 6+ messages in thread
From: Oneric @ 2023-12-28  0:55 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Sun, Dec 10, 2023 at 17:37:12 +0100, Oneric wrote:
> Changes from v1:
>  - ff_ass_bprint_text_event now only inserts a word-joiner
>    if there isn’t already one anyway
>  - added a third commit improving the handling of
>    curly brackets for standard ASS renderers
> 
> Heads up:
> [...]

ping

(applying from the received mails works fine,
 but do not use Patchwork’s mbox/diff/series as Patchwork mangled them)
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles
  2023-12-28  0:55 ` [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
@ 2024-01-10  0:25   ` Oneric
  0 siblings, 0 replies; 6+ messages in thread
From: Oneric @ 2024-01-10  0:25 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Thu, Dec 28, 2023 at 01:55:02 +0100, Oneric wrote:
> On Sun, Dec 10, 2023 at 17:37:12 +0100, Oneric wrote:
> > Changes from v1:
> >  - ff_ass_bprint_text_event now only inserts a word-joiner
> >    if there isn’t already one anyway
> >  - added a third commit improving the handling of
> >    curly brackets for standard ASS renderers
> > 
> 
> ping

ping #2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-01-10  0:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-10 16:37 [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 1/3] avcodec/webvttdec: honour bidi marks Oneric
2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 2/3] avcodec/{ass, webvttdec}: fix handling of backslashes Oneric
2023-12-10 16:37 ` [FFmpeg-devel] [PATCH v2 3/3] avcodec/{ass, webvttdec}: more portable curly brace escapes Oneric
2023-12-28  0:55 ` [FFmpeg-devel] [PATCH v2 0/3] Fix some active sequences in subtitles Oneric
2024-01-10  0:25   ` Oneric

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git