Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* RE: [PATCH 1/2] avutil/ass_split: add parsing of hard-space tags (\h)
       [not found] ` <6fdace57a3aaf3be439774e44875f641e2c4e4b0.1639870936.git.ffmpegagent@gmail.com>
@ 2021-12-18 23:46   ` Soft Works
  0 siblings, 0 replies; only message in thread
From: Soft Works @ 2021-12-18 23:46 UTC (permalink / raw)
  To: ffmpegdev



> -----Original Message-----
> From: ffmpegagent <ffmpegagent@gmail.com>
> Sent: Sunday, December 19, 2021 12:42 AM
> To: softworkz@hotmail.com
> Cc: softworkz <softworkz@hotmail.com>; softworkz <softworkz@hotmail.com>
> Subject: [PATCH 1/2] avutil/ass_split: add parsing of hard-space tags (\h)
> 
> From: softworkz <softworkz@hotmail.com>
> 
> The \h tag in ASS/SSA indicates a non-breaking space. See
> https://github.com/Aegisub/aegisite/blob/master/source/docs/3.2/ASS_Tags.html.md
> 
> The ass_split implementation is used by almost all text subtitle
> encoders, and it did not handle this tag. Several tests exercise \h
> parsing but had incorrect reference data.
> 
> The \h tag is specific to ASS and doesn't have any meaning outside of ASS.
> Still, the reference data for ttmlenc, textenc and webvttenc were full of
> \h tags even though this tag doesn't have a meaning there.
> 
> Signed-off-by: softworkz <softworkz@hotmail.com>
> ---
>  libavcodec/ass_split.c           |  7 +++++++
>  tests/ref/fate/.gitattributes    |  3 +++
>  tests/ref/fate/mov-mp4-ttml-dfxp |  8 ++++----
>  tests/ref/fate/mov-mp4-ttml-stpp |  8 ++++----
>  tests/ref/fate/sub-textenc       | 10 +++++-----
>  tests/ref/fate/sub-ttmlenc       |  8 ++++----
>  tests/ref/fate/sub-webvttenc     | 10 +++++-----
>  7 files changed, 32 insertions(+), 22 deletions(-)
>  create mode 100644 tests/ref/fate/.gitattributes
> 
> diff --git a/libavcodec/ass_split.c b/libavcodec/ass_split.c
> index 05c5453e53..4155592954 100644
> --- a/libavcodec/ass_split.c
> +++ b/libavcodec/ass_split.c
> @@ -484,6 +484,7 @@ int ff_ass_split_override_codes(const ASSCodesCallbacks *callbacks, void *priv,
>      while (buf && *buf) {
>          if (text && callbacks->text &&
>              (sscanf(buf, "\\%1[nN]", new_line) == 1 ||
> +             sscanf(buf, "\\%1[hH]", new_line) == 1 ||
>               !strncmp(buf, "{\\", 2))) {
>              callbacks->text(priv, text, text_len);
>              text = NULL;
> @@ -492,6 +493,12 @@ int ff_ass_split_override_codes(const ASSCodesCallbacks *callbacks, void *priv,
>              if (callbacks->new_line)
>                  callbacks->new_line(priv, new_line[0] == 'N');
>              buf += 2;
> +        } else if (sscanf(buf, "\\%1[hH]", new_line) == 1) {
> +            if (callbacks->hard_space)
> +                callbacks->hard_space(priv);
> +            else if (callbacks->text)
> +                callbacks->text(priv, " ", 1);
> +            buf += 2;

This is bad.
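
For reference, the dispatch added by the hunk above can be sketched as a
standalone program. This is only a minimal illustration under stated
assumptions: SketchCallbacks, sketch_split, Sink and render are hypothetical
names made up for this example, not FFmpeg API, and the sketch ignores the
"{\...}" override-block handling entirely. It only shows that \h is routed to
an optional hard_space callback, with a plain space passed to the text
callback as the fallback.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback table, loosely modeled on the patch.
 * This is NOT FFmpeg's ASSCodesCallbacks. */
typedef struct SketchCallbacks {
    void (*text)(void *priv, const char *text, int len);
    void (*new_line)(void *priv, int forced);
    void (*hard_space)(void *priv);      /* optional, may be NULL */
} SketchCallbacks;

/* True if buf starts with a backslash followed by one of the chars in set. */
static int is_escape(const char *buf, const char *set)
{
    return buf[0] == '\\' && buf[1] && strchr(set, buf[1]) != NULL;
}

/* Splits buf into text runs, line breaks (\n, \N) and hard spaces (\h, \H). */
static void sketch_split(const SketchCallbacks *cb, void *priv, const char *buf)
{
    const char *text = NULL;
    int text_len = 0;

    while (*buf) {
        int brk = is_escape(buf, "nN");
        int hs  = is_escape(buf, "hH");

        /* Flush the pending text run before emitting a break or hard space. */
        if (text && (brk || hs)) {
            if (cb->text)
                cb->text(priv, text, text_len);
            text = NULL;
        }
        if (brk) {
            if (cb->new_line)
                cb->new_line(priv, buf[1] == 'N');
            buf += 2;
        } else if (hs) {
            if (cb->hard_space)
                cb->hard_space(priv);
            else if (cb->text)
                cb->text(priv, " ", 1);  /* fallback: ordinary space */
            buf += 2;
        } else {
            if (!text) {
                text = buf;
                text_len = 1;
            } else {
                text_len++;
            }
            buf++;
        }
    }
    if (text && cb->text)
        cb->text(priv, text, text_len);
}

/* Demo sink: appends all output to a fixed buffer so it can be inspected. */
typedef struct Sink {
    char out[256];
    size_t len;
} Sink;

static void sink_text(void *priv, const char *text, int len)
{
    Sink *s = priv;
    assert(len >= 0);
    if (s->len + (size_t)len < sizeof(s->out)) {
        memcpy(s->out + s->len, text, (size_t)len);
        s->len += (size_t)len;
        s->out[s->len] = '\0';
    }
}

static void sink_new_line(void *priv, int forced)
{
    (void)forced;
    sink_text(priv, "\n", 1);
}

/* Emits U+00A0 (UTF-8), as a format with a real no-break space might. */
static void sink_nbsp(void *priv)
{
    sink_text(priv, "\xC2\xA0", 2);
}

/* Runs the splitter; with_hard_space selects whether hard_space is set. */
static const char *render(Sink *s, const char *in, int with_hard_space)
{
    SketchCallbacks cb = { sink_text, sink_new_line,
                           with_hard_space ? sink_nbsp : NULL };
    s->len = 0;
    s->out[0] = '\0';
    sketch_split(&cb, s, in);
    return s->out;
}
```

With these helpers, render(&s, "A\\hB", 0) produces "A B", while passing 1 as
the last argument produces "A", U+00A0, "B". An encoder that does not set the
optional callback therefore never sees a literal \h in its output.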

>          } else if (!strncmp(buf, "{\\", 2)) {
>              buf++;
>              while (*buf == '\\') {
> diff --git a/tests/ref/fate/.gitattributes b/tests/ref/fate/.gitattributes
> new file mode 100644
> index 0000000000..19be64d085
> --- /dev/null
> +++ b/tests/ref/fate/.gitattributes
> @@ -0,0 +1,3 @@
> +sub-textenc -diff
> +sub-ttmlenc -diff
> +sub-webvttenc -diff
> diff --git a/tests/ref/fate/mov-mp4-ttml-dfxp b/tests/ref/fate/mov-mp4-ttml-dfxp
> index e24b5d618b..e565ffa1f6 100644
> --- a/tests/ref/fate/mov-mp4-ttml-dfxp
> +++ b/tests/ref/fate/mov-mp4-ttml-dfxp
> @@ -1,9 +1,9 @@
> -2e7e01c821c111466e7a2844826b7f6d *tests/data/fate/mov-mp4-ttml-dfxp.mp4
> -8519 tests/data/fate/mov-mp4-ttml-dfxp.mp4
> +658884e1b789e75c454b25bdf71283c9 *tests/data/fate/mov-mp4-ttml-dfxp.mp4
> +8486 tests/data/fate/mov-mp4-ttml-dfxp.mp4
>  #tb 0: 1/1000
>  #media_type 0: data
>  #codec_id 0: none
> -0,          0,          0,    68500,     7866, 0x456c36b7
> +0,          0,          0,    68500,     7833, 0x31b22193
>  {

And this is catastrophically bad. Totally unacceptable...

>      "packets": [
>          {
> @@ -15,7 +15,7 @@
>              "dts_time": "0.000000",
>              "duration": 68500,
>              "duration_time": "68.500000",
> -            "size": "7866",
> +            "size": "7833",
>              "pos": "44",
>              "flags": "K_"
>          }
> diff --git a/tests/ref/fate/mov-mp4-ttml-stpp b/tests/ref/fate/mov-mp4-ttml-stpp
> index 77bd23b7bf..f25b5b2d28 100644
> --- a/tests/ref/fate/mov-mp4-ttml-stpp
> +++ b/tests/ref/fate/mov-mp4-ttml-stpp
> @@ -1,9 +1,9 @@
> -cbd2c7ff864a663b0d893deac5a0caec *tests/data/fate/mov-mp4-ttml-stpp.mp4
> -8547 tests/data/fate/mov-mp4-ttml-stpp.mp4
> +c9570de0ccebc858b0c662a7e449582c *tests/data/fate/mov-mp4-ttml-stpp.mp4
> +8514 tests/data/fate/mov-mp4-ttml-stpp.mp4
>  #tb 0: 1/1000
>  #media_type 0: data
>  #codec_id 0: none
> -0,          0,          0,    68500,     7866, 0x456c36b7
> +0,          0,          0,    68500,     7833, 0x31b22193
>  {
>      "packets": [
>          {
> @@ -15,7 +15,7 @@ cbd2c7ff864a663b0d893deac5a0caec *tests/data/fate/mov-mp4-ttml-stpp.mp4
>              "dts_time": "0.000000",
>              "duration": 68500,
>              "duration_time": "68.500000",
> -            "size": "7866",
> +            "size": "7833",
>              "pos": "44",
>              "flags": "K_"
>          }
> diff --git a/tests/ref/fate/sub-textenc b/tests/ref/fate/sub-textenc
> index 3ea56b38f0..910ca3d6e3 100644
> --- a/tests/ref/fate/sub-textenc
> +++ b/tests/ref/fate/sub-textenc
> @@ -160,18 +160,18 @@ but show this: {normal text}
>  \ N is a forced line break
>  \ h is a hard space
>  Normal spaces at the start and at the end of the line are trimmed while hard spaces are not trimmed.
> -The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\hafter\ha\hhard\hspace.\h:-D
> +The line will never break automatically right before or after a hard space. :-D
> 
>  31
>  00:00:54,501 --> 00:00:56,500
> 
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> +     A (05 hard spaces followed by a letter)
>  A (Normal  spaces followed by a letter)
>  A (No hard spaces followed by a letter)
> 
>  32
>  00:00:56,501 --> 00:00:58,500
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> +     A (05 hard spaces followed by a letter)
>  A (Normal  spaces followed by a letter)
>  A (No hard spaces followed by a letter)
>  Show this: \TEST and this: \-)
> @@ -179,10 +179,10 @@ Show this: \TEST and this: \-)
>  33
>  00:00:58,501 --> 00:01:00,500
> 
> -A letter followed by 05 hard spaces: A\h\h\h\h\h
> +A letter followed by 05 hard spaces: A
>  A letter followed by normal  spaces: A
>  A letter followed by no hard spaces: A
> -05 hard  spaces between letters: A\h\h\h\h\hA
> +05 hard  spaces between letters: A     A
>  5 normal spaces between letters: A     A
> 
>  ^--Forced line break
> diff --git a/tests/ref/fate/sub-ttmlenc b/tests/ref/fate/sub-ttmlenc
> index 4df8f8796f..aea09bb31e 100644
> --- a/tests/ref/fate/sub-ttmlenc
> +++ b/tests/ref/fate/sub-ttmlenc
> @@ -109,16 +109,16 @@
>          end="00:00:54.500"><span region="Default">Hide these tags:<br/>also
> hide these tags:<br/>but show this: {normal text}</span></p>
>        <p
>          begin="00:00:54.501"
> -        end="00:01:00.500"><span region="Default"><br/>\ N is a forced line
> break<br/>\ h is a hard space<br/>Normal spaces at the start and at the end
> of the line are trimmed while hard spaces are not
> trimmed.<br/>The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\
> hafter\ha\hhard\hspace.\h:-D</span></p>
> +        end="00:01:00.500"><span region="Default"><br/>\ N is a forced line
> break<br/>\ h is a hard space<br/>Normal spaces at the start and at the end
> of the line are trimmed while hard spaces are not trimmed.<br/>The line will
> never break automatically right before or after a hard space. :-D</span></p>
>        <p
>          begin="00:00:54.501"
> -        end="00:00:56.500"><span region="Default"><br/>\h\h\h\h\hA (05 hard
> spaces followed by a letter)<br/>A (Normal  spaces followed by a
> letter)<br/>A (No hard spaces followed by a letter)</span></p>
> +        end="00:00:56.500"><span region="Default"><br/>     A (05 hard
> spaces followed by a letter)<br/>A (Normal  spaces followed by a
> letter)<br/>A (No hard spaces followed by a letter)</span></p>
>        <p
>          begin="00:00:56.501"
> -        end="00:00:58.500"><span region="Default">\h\h\h\h\hA (05 hard
> spaces followed by a letter)<br/>A (Normal  spaces followed by a
> letter)<br/>A (No hard spaces followed by a letter)<br/>Show this: \TEST and
> this: \-)</span></p>
> +        end="00:00:58.500"><span region="Default">     A (05 hard spaces
> followed by a letter)<br/>A (Normal  spaces followed by a letter)<br/>A (No
> hard spaces followed by a letter)<br/>Show this: \TEST and this: \-
> )</span></p>
>        <p
>          begin="00:00:58.501"
> -        end="00:01:00.500"><span region="Default"><br/>A letter followed by
> 05 hard spaces: A\h\h\h\h\h<br/>A letter followed by normal  spaces: A<br/>A
> letter followed by no hard spaces: A<br/>05 hard  spaces between letters:
> A\h\h\h\h\hA<br/>5 normal spaces between letters: A     A<br/><br/>^--Forced
> line break</span></p>
> +        end="00:01:00.500"><span region="Default"><br/>A letter followed by
> 05 hard spaces: A     <br/>A letter followed by normal  spaces: A<br/>A
> letter followed by no hard spaces: A<br/>05 hard  spaces between letters: A
> A<br/>5 normal spaces between letters: A     A<br/><br/>^--Forced line
> break</span></p>
>        <p
>          begin="00:01:00.501"
>          end="00:01:02.500"><span region="Default">Both line should be
> strikethrough,<br/>yes.<br/>Correctly closed tags<br/>should be
> hidden.</span></p>
> diff --git a/tests/ref/fate/sub-webvttenc b/tests/ref/fate/sub-webvttenc
> index 45ae0b6131..f4172dcc84 100644
> --- a/tests/ref/fate/sub-webvttenc
> +++ b/tests/ref/fate/sub-webvttenc
> @@ -132,26 +132,26 @@ but show this: {normal text}
>  \ N is a forced line break
>  \ h is a hard space
>  Normal spaces at the start and at the end of the line are trimmed while hard spaces are not trimmed.
> -The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\hafter\ha\hhard\hspace.\h:-D
> +The line will never break automatically right before or after a hard space. :-D
> 
>  00:54.501 --> 00:56.500
> 
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> +     A (05 hard spaces followed by a letter)
>  A (Normal  spaces followed by a letter)
>  A (No hard spaces followed by a letter)
> 
>  00:56.501 --> 00:58.500
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> +     A (05 hard spaces followed by a letter)
>  A (Normal  spaces followed by a letter)
>  A (No hard spaces followed by a letter)
>  Show this: \TEST and this: \-)
> 
>  00:58.501 --> 01:00.500
> 
> -A letter followed by 05 hard spaces: A\h\h\h\h\h
> +A letter followed by 05 hard spaces: A
>  A letter followed by normal  spaces: A
>  A letter followed by no hard spaces: A
> -05 hard  spaces between letters: A\h\h\h\h\hA
> +05 hard  spaces between letters: A     A
>  5 normal spaces between letters: A     A
> 
>  ^--Forced line break
> --
> gitgitgadget


This is a text at the very end.

Thanks,
sw
