From: Soft Works <softworkz@hotmail.com>
To: "ffmpegdev@gitmailbox.com" <ffmpegdev@gitmailbox.com>
Subject: RE: [PATCH 1/2] avutil/ass_split: add parsing of hard-space tags (\h)
Date: Sat, 18 Dec 2021 23:46:54 +0000
Message-ID: <BN0P223MB03582BD2D32853E56DD1A8DABA799@BN0P223MB0358.NAMP223.PROD.OUTLOOK.COM>
In-Reply-To: <6fdace57a3aaf3be439774e44875f641e2c4e4b0.1639870936.git.ffmpegagent@gmail.com>

> -----Original Message-----
> From: ffmpegagent <ffmpegagent@gmail.com>
> Sent: Sunday, December 19, 2021 12:42 AM
> To: softworkz@hotmail.com
> Cc: softworkz <softworkz@hotmail.com>; softworkz <softworkz@hotmail.com>
> Subject: [PATCH 1/2] avutil/ass_split: add parsing of hard-space tags (\h)
>
> From: softworkz <softworkz@hotmail.com>
>
> The \h tag in ASS/SSA is indicating a non-breaking space. See
> https://github.com/Aegisub/aegisite/blob/master/source/docs/3.2/ASS_Tags.html
> .md
>
> The ass_split implementation is used by almost all text subtitle
> encoders and it didn't handle this tag. Interestingly, several tests
> are testing for \h parsing and had incorrect reference data for those tests.
>
> The \h tag is specific to ASS and doesn't have any meaning outside of ASS.
> Still, the reference data for ttmlenc, textenc and webvttenc were full of
> \h tags even though this tag doesn't have a meaning there.
>
> Signed-off-by: softworkz <softworkz@hotmail.com>
> ---
>  libavcodec/ass_split.c           |  7 +++++++
>  tests/ref/fate/.gitattributes    |  3 +++
>  tests/ref/fate/mov-mp4-ttml-dfxp |  8 ++++----
>  tests/ref/fate/mov-mp4-ttml-stpp |  8 ++++----
>  tests/ref/fate/sub-textenc       | 10 +++++-----
>  tests/ref/fate/sub-ttmlenc       |  8 ++++----
>  tests/ref/fate/sub-webvttenc     | 10 +++++-----
>  7 files changed, 32 insertions(+), 22 deletions(-)
>  create mode 100644 tests/ref/fate/.gitattributes
>
> diff --git a/libavcodec/ass_split.c b/libavcodec/ass_split.c
> index 05c5453e53..4155592954 100644
> --- a/libavcodec/ass_split.c
> +++ b/libavcodec/ass_split.c
> @@ -484,6 +484,7 @@ int ff_ass_split_override_codes(const ASSCodesCallbacks
> *callbacks, void *priv,
>      while (buf && *buf) {
>          if (text && callbacks->text &&
>              (sscanf(buf, "\\%1[nN]", new_line) == 1 ||
> +             sscanf(buf, "\\%1[hH]", new_line) == 1 ||
>               !strncmp(buf, "{\\", 2))) {
>              callbacks->text(priv, text, text_len);
>              text = NULL;
> @@ -492,6 +493,12 @@ int ff_ass_split_override_codes(const ASSCodesCallbacks
> *callbacks, void *priv,
>              if (callbacks->new_line)
>                  callbacks->new_line(priv, new_line[0] == 'N');
>              buf += 2;
> +        } else if (sscanf(buf, "\\%1[hH]", new_line) == 1) {
> +            if (callbacks->hard_space)
> +                callbacks->hard_space(priv);
> +            else if (callbacks->text)
> +                callbacks->text(priv, " ", 1);
> +            buf += 2;

This is bad.

>          } else if (!strncmp(buf, "{\\", 2)) {
>              buf++;
>              while (*buf == '\\') {
> diff --git a/tests/ref/fate/.gitattributes b/tests/ref/fate/.gitattributes
> new file mode 100644
> index 0000000000..19be64d085
> --- /dev/null
> +++ b/tests/ref/fate/.gitattributes
> @@ -0,0 +1,3 @@
> +sub-textenc -diff
> +sub-ttmlenc -diff
> +sub-webvttenc -diff
> diff --git a/tests/ref/fate/mov-mp4-ttml-dfxp b/tests/ref/fate/mov-mp4-ttml-
> dfxp
> index e24b5d618b..e565ffa1f6 100644
> --- a/tests/ref/fate/mov-mp4-ttml-dfxp
> +++ b/tests/ref/fate/mov-mp4-ttml-dfxp
> @@ -1,9 +1,9 @@
> -2e7e01c821c111466e7a2844826b7f6d *tests/data/fate/mov-mp4-ttml-dfxp.mp4
> -8519 tests/data/fate/mov-mp4-ttml-dfxp.mp4
> +658884e1b789e75c454b25bdf71283c9 *tests/data/fate/mov-mp4-ttml-dfxp.mp4
> +8486 tests/data/fate/mov-mp4-ttml-dfxp.mp4
>  #tb 0: 1/1000
>  #media_type 0: data
>  #codec_id 0: none
> -0, 0, 0, 68500, 7866, 0x456c36b7
> +0, 0, 0, 68500, 7833, 0x31b22193
>  {

And this is catastrophically bad. Totally unacceptable...

>      "packets": [
>          {
> @@ -15,7 +15,7 @@
>              "dts_time": "0.000000",
>              "duration": 68500,
>              "duration_time": "68.500000",
> -            "size": "7866",
> +            "size": "7833",
>              "pos": "44",
>              "flags": "K_"
>          }
> diff --git a/tests/ref/fate/mov-mp4-ttml-stpp b/tests/ref/fate/mov-mp4-ttml-
> stpp
> index 77bd23b7bf..f25b5b2d28 100644
> --- a/tests/ref/fate/mov-mp4-ttml-stpp
> +++ b/tests/ref/fate/mov-mp4-ttml-stpp
> @@ -1,9 +1,9 @@
> -cbd2c7ff864a663b0d893deac5a0caec *tests/data/fate/mov-mp4-ttml-stpp.mp4
> -8547 tests/data/fate/mov-mp4-ttml-stpp.mp4
> +c9570de0ccebc858b0c662a7e449582c *tests/data/fate/mov-mp4-ttml-stpp.mp4
> +8514 tests/data/fate/mov-mp4-ttml-stpp.mp4
>  #tb 0: 1/1000
>  #media_type 0: data
>  #codec_id 0: none
> -0, 0, 0, 68500, 7866, 0x456c36b7
> +0, 0, 0, 68500, 7833, 0x31b22193
>  {
>      "packets": [
>          {
> @@ -15,7 +15,7 @@ cbd2c7ff864a663b0d893deac5a0caec *tests/data/fate/mov-mp4-
> ttml-stpp.mp4
>              "dts_time": "0.000000",
>              "duration": 68500,
>              "duration_time": "68.500000",
> -            "size": "7866",
> +            "size": "7833",
>              "pos": "44",
>              "flags": "K_"
>          }
> diff --git a/tests/ref/fate/sub-textenc b/tests/ref/fate/sub-textenc
> index 3ea56b38f0..910ca3d6e3 100644
> --- a/tests/ref/fate/sub-textenc
> +++ b/tests/ref/fate/sub-textenc
> @@ -160,18 +160,18 @@ but show this: {normal text}
>  \ N is a forced line break
>  \ h is a hard space
>  Normal spaces at the start and at the end of the line are trimmed while hard
> spaces are not trimmed.
> -
> The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\hafter\ha\hha
> rd\hspace.\h:-D
> +The line will never break automatically right before or after a hard space.
> :-D
>
>  31
>  00:00:54,501 --> 00:00:56,500
>
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> + A (05 hard spaces followed by a letter)
>  A (Normal spaces followed by a letter)
>  A (No hard spaces followed by a letter)
>
>  32
>  00:00:56,501 --> 00:00:58,500
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> + A (05 hard spaces followed by a letter)
>  A (Normal spaces followed by a letter)
>  A (No hard spaces followed by a letter)
>  Show this: \TEST and this: \-)
> @@ -179,10 +179,10 @@ Show this: \TEST and this: \-)
>  33
>  00:00:58,501 --> 00:01:00,500
>
> -A letter followed by 05 hard spaces: A\h\h\h\h\h
> +A letter followed by 05 hard spaces: A
>  A letter followed by normal spaces: A
>  A letter followed by no hard spaces: A
> -05 hard spaces between letters: A\h\h\h\h\hA
> +05 hard spaces between letters: A A
>  5 normal spaces between letters: A A
>
>  ^--Forced line break
> diff --git a/tests/ref/fate/sub-ttmlenc b/tests/ref/fate/sub-ttmlenc
> index 4df8f8796f..aea09bb31e 100644
> --- a/tests/ref/fate/sub-ttmlenc
> +++ b/tests/ref/fate/sub-ttmlenc
> @@ -109,16 +109,16 @@
>          end="00:00:54.500"><span region="Default">Hide these tags:<br/>also
> hide these tags:<br/>but show this: {normal text}</span></p>
>        <p
>          begin="00:00:54.501"
> -        end="00:01:00.500"><span region="Default"><br/>\ N is a forced line
> break<br/>\ h is a hard space<br/>Normal spaces at the start and at the end
> of the line are trimmed while hard spaces are not
> trimmed.<br/>The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\
> hafter\ha\hhard\hspace.\h:-D</span></p>
> +        end="00:01:00.500"><span region="Default"><br/>\ N is a forced line
> break<br/>\ h is a hard space<br/>Normal spaces at the start and at the end
> of the line are trimmed while hard spaces are not trimmed.<br/>The line will
> never break automatically right before or after a hard space. :-D</span></p>
>        <p
>          begin="00:00:54.501"
> -        end="00:00:56.500"><span region="Default"><br/>\h\h\h\h\hA (05 hard
> spaces followed by a letter)<br/>A (Normal spaces followed by a
> letter)<br/>A (No hard spaces followed by a letter)</span></p>
> +        end="00:00:56.500"><span region="Default"><br/> A (05 hard
> spaces followed by a letter)<br/>A (Normal spaces followed by a
> letter)<br/>A (No hard spaces followed by a letter)</span></p>
>        <p
>          begin="00:00:56.501"
> -        end="00:00:58.500"><span region="Default">\h\h\h\h\hA (05 hard
> spaces followed by a letter)<br/>A (Normal spaces followed by a
> letter)<br/>A (No hard spaces followed by a letter)<br/>Show this: \TEST and
> this: \-)</span></p>
> +        end="00:00:58.500"><span region="Default"> A (05 hard spaces
> followed by a letter)<br/>A (Normal spaces followed by a letter)<br/>A (No
> hard spaces followed by a letter)<br/>Show this: \TEST and this: \-
> )</span></p>
>        <p
>          begin="00:00:58.501"
> -        end="00:01:00.500"><span region="Default"><br/>A letter followed by
> 05 hard spaces: A\h\h\h\h\h<br/>A letter followed by normal spaces: A<br/>A
> letter followed by no hard spaces: A<br/>05 hard spaces between letters:
> A\h\h\h\h\hA<br/>5 normal spaces between letters: A A<br/><br/>^--Forced
> line break</span></p>
> +        end="00:01:00.500"><span region="Default"><br/>A letter followed by
> 05 hard spaces: A <br/>A letter followed by normal spaces: A<br/>A
> letter followed by no hard spaces: A<br/>05 hard spaces between letters: A
> A<br/>5 normal spaces between letters: A A<br/><br/>^--Forced line
> break</span></p>
>        <p
>          begin="00:01:00.501"
>          end="00:01:02.500"><span region="Default">Both line should be
> strikethrough,<br/>yes.<br/>Correctly closed tags<br/>should be
> hidden.</span></p>
> diff --git a/tests/ref/fate/sub-webvttenc b/tests/ref/fate/sub-webvttenc
> index 45ae0b6131..f4172dcc84 100644
> --- a/tests/ref/fate/sub-webvttenc
> +++ b/tests/ref/fate/sub-webvttenc
> @@ -132,26 +132,26 @@ but show this: {normal text}
>  \ N is a forced line break
>  \ h is a hard space
>  Normal spaces at the start and at the end of the line are trimmed while hard
> spaces are not trimmed.
> -
> The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\hafter\ha\hha
> rd\hspace.\h:-D
> +The line will never break automatically right before or after a hard space.
> :-D
>
>  00:54.501 --> 00:56.500
>
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> + A (05 hard spaces followed by a letter)
>  A (Normal spaces followed by a letter)
>  A (No hard spaces followed by a letter)
>
>  00:56.501 --> 00:58.500
> -\h\h\h\h\hA (05 hard spaces followed by a letter)
> + A (05 hard spaces followed by a letter)
>  A (Normal spaces followed by a letter)
>  A (No hard spaces followed by a letter)
>  Show this: \TEST and this: \-)
>
>  00:58.501 --> 01:00.500
>
> -A letter followed by 05 hard spaces: A\h\h\h\h\h
> +A letter followed by 05 hard spaces: A
>  A letter followed by normal spaces: A
>  A letter followed by no hard spaces: A
> -05 hard spaces between letters: A\h\h\h\h\hA
> +05 hard spaces between letters: A A
>  5 normal spaces between letters: A A
>
>  ^--Forced line break
> --
> gitgitgadget

This is a text at the very end.

Thanks,
sw
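
For readers who want to see the fallback behaviour of the quoted hunk in isolation, here is a minimal standalone sketch in C. It is not FFmpeg code: the function name ass_hard_space_to_text and its buffer-based interface are hypothetical and chosen only for illustration. It mirrors just the branch where no hard_space callback is set and each \h (or \H, matching the [hH] set in the patch) is emitted as a single plain space; all other sequences, including \N, are copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of FFmpeg): copy src into dst, replacing
 * every ASS hard-space tag (\h or \H) with one plain space. Everything
 * else, including \N and \n, is copied verbatim. The output is always
 * NUL-terminated when dst_size > 0 and is truncated if dst is too small.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
static size_t ass_hard_space_to_text(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0)
        return 0;

    while (*src && n + 1 < dst_size) {
        if (src[0] == '\\' && (src[1] == 'h' || src[1] == 'H')) {
            dst[n++] = ' ';   /* same fallback as callbacks->text(priv, " ", 1) */
            src += 2;         /* skip the two-character tag, like buf += 2 */
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

With this sketch, the input A\h\h\h\h\hA from the test sample would become an A, five plain spaces, and another A. An encoder whose target format has a real non-breaking space could presumably emit that instead, which is what the separate hard_space callback in the patch leaves room for.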