Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity
@ 2023-01-18 14:31 Tim Angus
  2023-01-18 14:31 ` [FFmpeg-devel] [PATCH] avformat/assenc: fix incorrect copy of null terminator Tim Angus
  2023-01-21 19:50 ` [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity Tim Angus
  0 siblings, 2 replies; 6+ messages in thread
From: Tim Angus @ 2023-01-18 14:31 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Tim Angus

Some Matroska files embed ASS format subtitles. Such files store the header
for the subtitle stream in the "codec private data" section. It appears to be
optional whether or not the last byte of this data is 0, i.e. a null
terminator for the string data. When ffmpeg is used to extract subtitles from
such a file, this header is copied directly to the output file, including the
null terminator, if it was present. The result is a file with a null byte
after the header but preceding the actual content of the subtitle file, which
is not correct.

As a data point, of the ~600 mkvs I have locally, 22 have ASS subtitles, and
20 of those include the null terminator, so it doesn't appear to be a rare
phenomenon.

As another data point, the tool mkvextract from mkvtoolnix avoids the ambiguity
by first assuming that the source buffer is *not* null terminated, and then
manually adding a (possibly second) null terminator. The buffer is then
interpreted as a null terminated string and processed that way.
(https://gitlab.com/mbunkus/mkvtoolnix/-/blob/main/src/extract/xtr_textsubs.cpp#L117)
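
The copy-and-terminate idea described above can be sketched in C as follows.
This is a minimal illustration only, not code from mkvtoolnix or FFmpeg; the
helper name private_data_to_cstring is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not mkvtoolnix or FFmpeg code): copy `size` bytes of
 * codec private data and append one NUL, so the result is a valid C string
 * whether or not the source already ended in NUL.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *private_data_to_cstring(const uint8_t *data, size_t size)
{
    char *str;

    assert(data != NULL || size == 0);

    if (size == SIZE_MAX)       /* size + 1 would overflow */
        return NULL;

    str = malloc(size + 1);
    if (!str)
        return NULL;

    if (size)
        memcpy(str, data, size);
    str[size] = '\0';           /* possibly a second terminator */

    return str;
}
```

Because string functions stop at the first NUL, strlen() on the result gives
the same length for a source that ended in NUL and for one that did not.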

My change here simply avoids copying the trailing null terminator(s) present.
FATE succeeds as there are no mkvs in the suite that have ASS subtitles
embedded.

Tim Angus (1):
  avformat/assenc: fix incorrect copy of null terminator

 libavformat/assenc.c | 6 ++++++
 1 file changed, 6 insertions(+)

-- 
2.25.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* [FFmpeg-devel] [PATCH] avformat/assenc: fix incorrect copy of null terminator
  2023-01-18 14:31 [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity Tim Angus
@ 2023-01-18 14:31 ` Tim Angus
  2023-01-31 12:37   ` "zhilizhao(赵志立)"
  2023-01-21 19:50 ` [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity Tim Angus
  1 sibling, 1 reply; 6+ messages in thread
From: Tim Angus @ 2023-01-18 14:31 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Tim Angus

When writing an SSA/ASS subtitle file, the
AVCodecParameters::extradata buffer is written directly to the output,
potentially including a null terminating character, which is sometimes
present. The result is a null character in the middle of the output;
this is addressed here by not copying it.

Signed-off-by: Tim Angus <tim@ngus.net>
---
 libavformat/assenc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/libavformat/assenc.c b/libavformat/assenc.c
index 1600f0a02b..5e74b84575 100644
--- a/libavformat/assenc.c
+++ b/libavformat/assenc.c
@@ -69,6 +69,11 @@ static int write_header(AVFormatContext *s)
                 ass->trailer = trailer;
         }
 
+        /* extradata may or may not be null terminated; in the case where
+         * it is, avoid copying a null into the middle of the buffer */
+        while (header_size > 0 && par->extradata[header_size - 1] == '\0')
+            header_size--;
+
         avio_write(s->pb, par->extradata, header_size);
         if (par->extradata[header_size - 1] != '\n')
             avio_write(s->pb, "\r\n", 2);
-- 
2.25.1
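
The trimming loop in the hunk above can be exercised in isolation with a
small standalone sketch. The helper names trimmed_size and needs_line_ending
are hypothetical and are not part of assenc.c. The second helper also covers
a case the hunk as posted does not appear to guard: if the buffer consists
only of NUL bytes, the trimmed size reaches zero and an unguarded
buf[size - 1] check would read outside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size of `buf` with trailing NUL bytes dropped. */
static size_t trimmed_size(const uint8_t *buf, size_t size)
{
    assert(buf != NULL || size == 0);

    while (size > 0 && buf[size - 1] == '\0')
        size--;
    return size;
}

/* Hypothetical helper: returns 1 if "\r\n" should follow the `size` bytes
 * just written. An empty header is treated as needing nothing, which also
 * avoids indexing buf[size - 1] when size is zero. */
static int needs_line_ending(const uint8_t *buf, size_t size)
{
    return size > 0 && buf[size - 1] != '\n';
}
```

Whether an empty header should still get a line ending is a design choice;
the sketch assumes it should not.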

* Re: [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity
  2023-01-18 14:31 [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity Tim Angus
  2023-01-18 14:31 ` [FFmpeg-devel] [PATCH] avformat/assenc: fix incorrect copy of null terminator Tim Angus
@ 2023-01-21 19:50 ` Tim Angus
  1 sibling, 0 replies; 6+ messages in thread
From: Tim Angus @ 2023-01-21 19:50 UTC (permalink / raw)
  To: ffmpeg-devel

On 18/01/2023 14:31, Tim Angus wrote:
> Some Matroska files embed ASS format subtitles. Such files store the header
> for the subtitle stream in the "codec private data" section. It appears to be
> optional whether or not the last byte of this data is 0, i.e. a null
> terminator for the string data. When ffmpeg is used to extract subtitles from
> such a file, this header is copied directly to the output file, including the
> null terminator, if it was present. The result is a file with a null byte
> after the header but preceding the actual content of the subtitle file, which
> is not correct.
>
> As a data point, of the ~600 mkvs I have locally, 22 of them have ASS
> subtitles, and of them 20 include the null terminator, so it doesn't appear to
> be a rare phenomenon.
>
> As another data point, the tool mkvextract from mkvtoolnix avoids the ambiguity
> by first assuming that the source buffer is *not* null terminated, and then
> manually adding a (possibly second) null terminator. The buffer is then
> interpreted as a null terminated string and processed that way.
> (https://gitlab.com/mbunkus/mkvtoolnix/-/blob/main/src/extract/xtr_textsubs.cpp#L117)
>
> My change here simply avoids copying the trailing null terminator(s) present.
> FATE succeeds as there are no mkvs in the suite that have ASS subtitles
> embedded.
>
> Tim Angus (1):
>    avformat/assenc: fix incorrect copy of null terminator
>
>   libavformat/assenc.c | 6 ++++++
>   1 file changed, 6 insertions(+)

Why is there a gap in the submitted patches on patchwork? Have I missed
something?

https://patchwork.ffmpeg.org/project/ffmpeg/list/

* Re: [FFmpeg-devel] [PATCH] avformat/assenc: fix incorrect copy of null terminator
  2023-01-18 14:31 ` [FFmpeg-devel] [PATCH] avformat/assenc: fix incorrect copy of null terminator Tim Angus
@ 2023-01-31 12:37   ` "zhilizhao(赵志立)"
  2023-01-31 13:37     ` Tim Angus
  0 siblings, 1 reply; 6+ messages in thread
From: "zhilizhao(赵志立)" @ 2023-01-31 12:37 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Tim Angus



> On Jan 18, 2023, at 22:31, Tim Angus <tim@ngus.net> wrote:
> 
> When writing an SSA/ASS subtitle file, the
> AVCodecParameters::extradata buffer is written directly to the output,
> potentially including a null terminating character, which is sometimes
> present. The result is a null character in the middle of the output;
> this is addressed here by not copying it.
> 
> Signed-off-by: Tim Angus <tim@ngus.net>
> ---
> libavformat/assenc.c | 5 +++++
> 1 file changed, 5 insertions(+)
> 
> diff --git a/libavformat/assenc.c b/libavformat/assenc.c
> index 1600f0a02b..5e74b84575 100644
> --- a/libavformat/assenc.c
> +++ b/libavformat/assenc.c
> @@ -69,6 +69,11 @@ static int write_header(AVFormatContext *s)
>                 ass->trailer = trailer;
>         }
> 
> +        /* extradata may or may not be null terminated; in the case where
> +         * it is, avoid copying a null into the middle of the buffer */
> +        while (header_size > 0 && par->extradata[header_size - 1] == '\0')
> +            header_size--;
> +

The comment is misleading. extradata is always null terminated, although
those paddings don’t count in extradata_size.
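
The distinction between payload and padding can be illustrated with a
standalone sketch. It assumes the FFmpeg convention that extradata is
allocated with zeroed padding beyond extradata_size; the struct, the helper
names and SKETCH_PADDING_SIZE are hypothetical stand-ins, not FFmpeg API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for FFmpeg's AV_INPUT_BUFFER_PADDING_SIZE; value is illustrative. */
#define SKETCH_PADDING_SIZE 64

typedef struct SketchExtradata {
    uint8_t *data;  /* payload followed by zeroed padding */
    size_t   size;  /* payload bytes only; padding is not counted */
} SketchExtradata;

/* Allocate payload plus zeroed padding and copy the payload in.
 * Returns 0 on success, -1 on allocation failure. */
static int sketch_extradata_set(SketchExtradata *ed,
                                const uint8_t *src, size_t size)
{
    assert(ed != NULL);
    assert(src != NULL || size == 0);

    ed->data = calloc(1, size + SKETCH_PADDING_SIZE);
    if (!ed->data) {
        ed->size = 0;
        return -1;
    }
    if (size)
        memcpy(ed->data, src, size);
    ed->size = size;
    return 0;
}

/* Returns 1 when the payload itself (not the padding) ends with NUL. */
static int sketch_payload_ends_with_nul(const SketchExtradata *ed)
{
    return ed->size > 0 && ed->data[ed->size - 1] == '\0';
}
```

In this model data[size] is always zero, so the buffer can be read as a C
string, while the last counted byte may or may not be NUL. Only that last
counted byte ends up in the output when size bytes are written.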

>         avio_write(s->pb, par->extradata, header_size);
>         if (par->extradata[header_size - 1] != '\n')
>             avio_write(s->pb, "\r\n", 2);
> -- 
> 2.25.1
> 

* Re: [FFmpeg-devel] [PATCH] avformat/assenc: fix incorrect copy of null terminator
  2023-01-31 12:37   ` "zhilizhao(赵志立)"
@ 2023-01-31 13:37     ` Tim Angus
  0 siblings, 0 replies; 6+ messages in thread
From: Tim Angus @ 2023-01-31 13:37 UTC (permalink / raw)
  To: ffmpeg-devel

On 31/01/2023 12:37, "zhilizhao(赵志立)" wrote:
>> +        /* extradata may or may not be null terminated; in the case where
>> +         * it is, avoid copying a null into the middle of the buffer */
>> +        while (header_size > 0 && par->extradata[header_size - 1] == '\0')
>> +            header_size--;
>> +
> The comment is misleading. extradata is always null terminated, although
> those paddings don’t count in extradata_size.

That's a bit pedantic, but I take your point. "The contents of extradata 
may or may..." would be better. Following some discussion on IRC, I've 
actually submitted another patch that solves the problem in a different 
way, but I don't think anyone has looked at it yet...

http://ffmpeg.org/pipermail/ffmpeg-devel/2023-January/306017.html

* [FFmpeg-devel] [PATCH 0/1] Handle ASS format subtitle encoding ambiguity
@ 2023-01-27 17:20 Tim Angus
  0 siblings, 0 replies; 6+ messages in thread
From: Tim Angus @ 2023-01-27 17:20 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Tim Angus

Some Matroska files embed ASS format subtitles. Such files store the
header for the subtitle stream in the "codec private data" section. It's
not clear whether the last byte of this data is supposed to be 0, i.e. a
null terminator for the string data. Among other tools, older versions
of Handbrake do include the null terminator, and there are many files in
the wild created using Handbrake. When ffmpeg is used to extract
subtitles from such a file, this header is copied directly to the output
file, including the null terminator, if it was present. The result is a
file with a null byte after the header but preceding the actual content
of the subtitle file, which is not correct.

As a data point, of the ~600 mkvs I have locally, 22 have ASS subtitles, and
20 of those include the null terminator, so it doesn't appear to be a rare
phenomenon.

As another data point, the tool mkvextract from mkvtoolnix avoids the ambiguity
by first assuming that the source buffer is *not* null terminated, and then
manually adding a (possibly second) null terminator. The buffer is then
interpreted as a null terminated string and processed that way.
(https://gitlab.com/mbunkus/mkvtoolnix/-/blob/main/src/extract/xtr_textsubs.cpp#L117)

My change here refactors the way the output file is created by treating
the source buffer as a string. I posted an earlier iteration of this
patch to the list that dealt with the extra null explicitly, but
following some #ffmpeg-devel discussion I proposed reworking it to
behave more like mkvtoolnix. This was positively received, so here we
are.
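
As a rough illustration of what "treating the source buffer as a string" can
mean, the sketch below computes how many bytes of a header would be written
under string semantics. It is not the submitted patch, and the helper name
header_string_length is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the submitted patch): number of bytes before the
 * first NUL in `buf`, capped at `size` when no NUL is found. */
static size_t header_string_length(const uint8_t *buf, size_t size)
{
    const uint8_t *nul;

    assert(buf != NULL || size == 0);

    nul = size ? memchr(buf, '\0', size) : NULL;
    return nul ? (size_t)(nul - buf) : size;
}
```

Unlike trimming trailing NULs, string semantics stop at the first NUL, so a
NUL embedded earlier in the buffer would also cut the header short.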

FATE succeeds as there are no mkvs in the suite that have ASS subtitles
embedded.

A small test file that exhibits the problem may be found here:
https://0x0.st/oFTT.mkv

Tim Angus (1):
  avformat/assenc: avoid incorrect copy of null terminator

 libavformat/assenc.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

-- 
2.25.1


