From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTPS id 8420C4CFD4
	for <ffmpegdev@gitmailbox.com>; Fri, 14 Feb 2025 12:30:20 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4959C68C0D4;
	Fri, 14 Feb 2025 14:30:18 +0200 (EET)
Received: from mail-ej1-f47.google.com (mail-ej1-f47.google.com
 [209.85.218.47])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 31C4D68B510
 for <ffmpeg-devel@ffmpeg.org>; Fri, 14 Feb 2025 14:30:12 +0200 (EET)
Received: by mail-ej1-f47.google.com with SMTP id
 a640c23a62f3a-ab7e1286126so354855266b.0
 for <ffmpeg-devel@ffmpeg.org>; Fri, 14 Feb 2025 04:30:12 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1739536211; x=1740141011; darn=ffmpeg.org;
 h=to:subject:message-id:date:from:in-reply-to:references:mime-version
 :from:to:cc:subject:date:message-id:reply-to;
 bh=43FPZ3e7p7a4usCU3/qF2XBToAj2i06CnI4ouFJnIMI=;
 b=Nkfeycwyi+K97jqUw27OyWJQsoavazErPpSKvXKxKPbb1nlNTj1R8uQ1fv5sEZCv3F
 DkpSjm8BF3ygz73NX1TAhMHuGw4UvcuMVNNLOXzg59oOTy73siNACzPgDO8dxEeQhUwi
 D54rBRdNJhoba3kkznnfE3ROauzxhhm1ovnHfQekCMCLkNVjvpwbO9Od4tIvpjdtCCas
 YaQw60vAILGpmybWSewdbJGHhn1bNbacWAQzNaI4qBtcMzdlsC6tX0AZE8sdQMzJWITS
 hgr952opvspNR7dRk11mxA3gcSdh62zl25Fd8REgDhAw2mJ+NPuTvfOG23pw00c0VJQt
 0qNg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1739536211; x=1740141011;
 h=to:subject:message-id:date:from:in-reply-to:references:mime-version
 :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=43FPZ3e7p7a4usCU3/qF2XBToAj2i06CnI4ouFJnIMI=;
 b=nZWg15ybgi5d4QAxOLUdEjpaul7vrtZhZTCCOlLm0aRc1OHoLSdUjLtSpKAHBfmgJL
 TkbRIDylavhvHOoGhOIYXNbxgWaSUF4LRK3bhypiIJGBV5v7DP9k1TFV2q13y1gfq4op
 Q6xgeAz/XcWmzW/Y/YEAwcJkAjysB3twS+iuJrNcW9o+ESQkZANEN7tGYYnS0mLbAC4B
 2v5sV1NQ9LEb7ETXHXl5OTeBN12iST0tPBzVLxXeFXTaQqnz1LRLuBKTU7IphnT5FFW+
 lDDnCUiirGa3fnPTw9FKWk+/iV8L1eMjfvMLVziBPZx3GNMObdwOlQsAThJnQgg8fNv+
 S4jQ==
X-Gm-Message-State: AOJu0YwxyudumDsdDbOAAsaPq5BdSr8ZnHvHY7eK4ALEdS5HfYiI0fWP
 KIFy/pmCknbQN7RXN3vQ4Lr6832JhLaOS8DYMw6ZwrlrvKeC4ozigzkvQXGjs9ZpFeZvJUrJmED
 Qq+1PatDbArufniy1X8+c2fB/W6eBdQ==
X-Gm-Gg: ASbGncurW1Nsz4qylqUk4v2fMmJh82zdX3De9KG7wWUSCGZ658wjNV+CqP95C+eV5LP
 NIgrAj2eWpkrBpY7Ni0RnZf9OVDF4b4yA3xOva/+voibu6YV2IhclhmFmoM0Uxjs4uZTBaFp5Lt
 Sm3gI403VWRbW0XA3oMYMNxnDqDg7ShWc=
X-Google-Smtp-Source: AGHT+IHYKFRaZ2fbYwvZ1Q++3/A7yyXpBPpdmMq8P9p7LvLqKCZcWo4mhLA8zoVmxnaHCs0zfAGosF75pKJolM8gEqY=
X-Received: by 2002:a17:907:2d22:b0:aba:f6ff:d38a with SMTP id
 a640c23a62f3a-abaf70f1f1dmr63637066b.29.1739536211286; Fri, 14 Feb 2025
 04:30:11 -0800 (PST)
MIME-Version: 1.0
References: <20250213212208.29414-1-pkoshevoy@gmail.com>
 <AS8P250MB07444AE485058304292FE77D8FFE2@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM>
In-Reply-To: <AS8P250MB07444AE485058304292FE77D8FFE2@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM>
From: Pavel Koshevoy <pkoshevoy@gmail.com>
Date: Fri, 14 Feb 2025 05:30:03 -0700
X-Gm-Features: AWEUYZmc03yZeCsinbsL8-CDgbIlzaEff_1iB_W_CKWqnH0Moyevw_sstKsSeKM
Message-ID: <CAJgjuoyrjHH6SE0YXLnUM1O19B7yQ2eAeV5qz+imOHLP0C-9vg@mail.gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
Subject: Re: [FFmpeg-devel] [PATCH] avformat/mov: (v4) fix get_eia608_packet
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/CAJgjuoyrjHH6SE0YXLnUM1O19B7yQ2eAeV5qz+imOHLP0C-9vg@mail.gmail.com/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

On Thu, Feb 13, 2025, 22:04 Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:

> Pavel Koshevoy:
> > The problem is reproducible with "Test for Quicktime 608 CC file.mov"
> > from https://samples.ffmpeg.org/MPEG2/subcc/
> >
> > ffmpeg -i "Test for Quicktime 608 CC file.mov" -map 0 -c copy -y remuxed.mov
> >
> > Prior to the fix QuickTime Player playback of remuxed.mov would
> > render garbage text for "English CC" subtitles.
>
> Is remuxing necessary for there being garbage?
>

The original file displays correct English CC text in QuickTime Player, and
the remuxed file (prior to the fix) does not.


> > ---
> >  libavformat/mov.c | 70 +++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 59 insertions(+), 11 deletions(-)
> >
> > diff --git a/libavformat/mov.c b/libavformat/mov.c
> > index 85aef33b19..5a91ef5b8c 100644
> > --- a/libavformat/mov.c
> > +++ b/libavformat/mov.c
> > @@ -10788,25 +10788,73 @@ static int mov_change_extradata(AVStream *st, AVPacket *pkt)
> >      return 0;
> >  }
> >
> > -static int get_eia608_packet(AVIOContext *pb, AVPacket *pkt, int size)
> > +static int get_eia608_packet(AVIOContext *pb, AVPacket *pkt, int src_size)
> >  {
> > -    int new_size, ret;
> > +    /* We can't make assumptions about the structure of the payload,
> > +       because it may include multiple cdat and cdt2 samples. */
> > +    const uint32_t cdat = AV_RB32("cdat");
> > +    const uint32_t cdt2 = AV_RB32("cdt2");
>
> I don't think that using (non-variable) variables for these improves
> clarity (e.g. it means that the definition of the actual values used for
> the comparisons below is now further away from its use). Why not simply
> use MKBETAG('c','d','a','t') below?
>


That is a matter of personal preference.  I personally find "cdat" more
readable (and searchable) than any MKBETAG.
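
For reference, the two spellings yield the same 32-bit value, so the choice
is purely about readability. A minimal sketch, using simplified local
stand-ins for AV_RB32() and MKBETAG() (the sketch_ / SKETCH_ names are
hypothetical and exist only so the snippet compiles on its own; they are not
the FFmpeg definitions):

```c
#include <stdint.h>

/* Hypothetical stand-in for AV_RB32(): read a big-endian 32-bit value. */
static uint32_t sketch_rb32(const void *p)
{
    const uint8_t *b = p;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] <<  8) |  (uint32_t)b[3];
}

/* Hypothetical stand-in for MKBETAG(): pack four bytes, first byte most
 * significant. */
#define SKETCH_MKBETAG(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) <<  8) |  (uint32_t)(d))

/* Returns 1 when both spellings yield the same 4cc value. */
static int sketch_tags_match(void)
{
    return sketch_rb32("cdat") == SKETCH_MKBETAG('c', 'd', 'a', 't') &&
           sketch_rb32("cdt2") == SKETCH_MKBETAG('c', 'd', 't', '2');
}
```

Either form can be compared directly against a value returned by
avio_rb32(), as the patch does with atom_type.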


> > +    int ret, out_size = 0;
> >
> > -    if (size <= 8)
> > +    /* a valid payload must have size, 4cc, and at least 1 byte pair: */
> > +    if (src_size < 10)
> >          return AVERROR_INVALIDDATA;
> > -    new_size = ((size - 8) / 2) * 3;
> > -    ret = av_new_packet(pkt, new_size);
> > +
> > +    /* avoid an int overflow: */
> > +    if ((src_size - 8) / 2 >= INT_MAX / 3)
> > +        return AVERROR_INVALIDDATA;
> > +
> > +    ret = av_new_packet(pkt, ((src_size - 8) / 2) * 3);
> >      if (ret < 0)
> >          return ret;
> >
> > -    avio_skip(pb, 8);
> > -    for (int j = 0; j < new_size; j += 3) {
> > -        pkt->data[j] = 0xFC;
> > -        pkt->data[j+1] = avio_r8(pb);
> > -        pkt->data[j+2] = avio_r8(pb);
> > +    /* parse and re-format the c608 payload in one pass. */
> > +    while (src_size >= 10) {
> > +        const uint32_t atom_size = avio_rb32(pb);
> > +        const uint32_t atom_type = avio_rb32(pb);
> > +        const uint32_t data_size = atom_size - 8;
>
> This may wrap around (if atom_size is < 8). If int is 32 bits, then the
> data_size > src_size check will catch this, but in case of 64 bit ints
> it may not. Relying on (unsigned, defined) integer wraparound should be
> avoided unless it is advantageous to use it; in this case, this is just
> not true: Just compare atom_size to 10 below.
>

I fully expect the size of uint32_t to be 32 bits on any platform.  It
should be a compile-time assertion, but that is outside the scope of this
fix.  The name of the data type says it is 32 bits long, so it must be so.
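
To make the arithmetic under discussion concrete, here is a small standalone
sketch.  The helper names are hypothetical and not part of the patch.  The
first function shows what "atom_size - 8" yields in uint32_t arithmetic; the
second is one possible way to validate the atom size without depending on
wraparound, assuming "remaining" is the number of payload bytes left after
the 8-byte atom header has been consumed:

```c
#include <stdint.h>

/* What "atom_size - 8" evaluates to when stored in a uint32_t:
 * the result wraps modulo 2^32 when atom_size < 8. */
static uint32_t sketch_wrapped_data_size(uint32_t atom_size)
{
    return atom_size - 8;
}

/* Hypothetical helper: return the atom's data size, or -1 if the atom is
 * too small (header plus at least one byte pair is 10 bytes) or does not
 * fit in the remaining bytes.  The comparison is done in int64_t, so it does
 * not depend on the width of int. */
static int64_t sketch_atom_data_size(uint32_t atom_size, int remaining)
{
    if (atom_size < 10)
        return -1;
    if (remaining < 0 || (int64_t)atom_size - 8 > (int64_t)remaining)
        return -1;
    return (int64_t)atom_size - 8;
}
```

With atom_size == 4 the first helper returns 0xFFFFFFFC, while the second
rejects the atom before any subtraction is needed.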


> > +        const uint8_t cc_field =
> > +            atom_type == cdat ? 1 :
> > +            atom_type == cdt2 ? 2 :
> > +            0;
> > +
> > +        /* account for bytes consumed for atom size and type. */
> > +        src_size -= 8;
> > +
> > +        /* make sure the data size stays within the buffer boundaries. */
> > +        if (data_size < 2 || data_size > src_size) {
> > +            ret = AVERROR_INVALIDDATA;
> > +            break;
> > +        }
> > +
> > +        /* make sure the data size is consistent with N byte pairs. */
> > +        if (data_size % 2 != 0) {
>
> We typically try to avoid redundant "!= 0".
>

Again, this is a matter of personal preference.  If you would prefer to
tweak the patch to suit your personal preference before merging -- you are
free to do so, but I don't think it's a valid reason to delay a fix for a
parser that has been mis-parsing well-formed files for the past 5 years.


> > +            ret = AVERROR_INVALIDDATA;
> > +            break;
> > +        }
> > +
> > +        if (!cc_field) {
> > +            /* neither cdat nor cdt2 ... skip it */
> > +            avio_skip(pb, data_size);
> > +            src_size -= data_size;
> > +            continue;
> > +        }
> > +
> > +        for (int32_t i = 0; i < data_size; i += 2) {
>
> int32_t? Why signed? (And why use a separate loop counter at all? Simply
> decrement data_size by 2 in each iteration.)
>

Please feel free to make additional improvements to whatever fix you decide
to merge.


> > +            pkt->data[out_size] = (0x1F << 3) | (1 << 2) | (cc_field - 1);
> > +            pkt->data[out_size + 1] = avio_r8(pb);
> > +            pkt->data[out_size + 2] = avio_r8(pb);
> > +            out_size += 3;
> > +            src_size -= 2;
> > +        }
> >      }
> >
> > -    return 0;
> > +    if (src_size > 0)
> > +        /* skip any remaining unread portion of the input payload */
> > +        avio_skip(pb, src_size);
> > +
> > +    av_shrink_packet(pkt, out_size);
> > +    return ret;
> >  }
> >
> >  static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *sample,
>
> Generally, I believe that reading the input into pkt->data[size / 2]
> would be advantageous: It would make it simple to check for EOF and I/O
> errors (notice that the avio_r* reads above are unchecked) and would
> read the data in one go, avoiding all the avio_skip().
>
> - Andreas
>
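
As a rough sketch of what parsing from memory could look like, assuming the
whole sample payload has already been read into a buffer (so a short read or
I/O error is detected once, before parsing).  This is not the patch as
posted: the function and parameter names are hypothetical, and it writes to a
separate output buffer rather than reformatting inside pkt->data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a big-endian 32-bit value from memory. */
static uint32_t sketch_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Convert a c608 payload (a sequence of atoms, each with a 4-byte size and
 * a 4-byte type) into 3-byte cc triplets.  Atoms other than cdat/cdt2 are
 * skipped.  Returns 0 on success and stores the byte count in *out_size;
 * returns -1 on malformed input or insufficient output space. */
static int sketch_c608_to_cc(const uint8_t *src, size_t src_size,
                             uint8_t *dst, size_t dst_cap, size_t *out_size)
{
    size_t pos = 0, out = 0;

    while (src_size - pos >= 10) {
        const uint32_t atom_size = sketch_be32(src + pos);
        const uint32_t atom_type = sketch_be32(src + pos + 4);
        const int cc_field =
            atom_type == 0x63646174 ? 1 :   /* "cdat" */
            atom_type == 0x63647432 ? 2 :   /* "cdt2" */
            0;

        /* 8-byte header plus a non-zero, even number of data bytes that
         * must fit in what is left of the buffer */
        if (atom_size < 10 || atom_size % 2 || atom_size > src_size - pos)
            return -1;

        if (cc_field) {
            for (size_t i = pos + 8; i < pos + atom_size; i += 2) {
                if (dst_cap - out < 3)
                    return -1;
                /* 0xFC for field 1 (cdat), 0xFD for field 2 (cdt2) */
                dst[out++] = (0x1F << 3) | (1 << 2) | (cc_field - 1);
                dst[out++] = src[i];
                dst[out++] = src[i + 1];
            }
        }
        pos += atom_size;
    }

    /* fewer than 10 trailing bytes cannot hold an atom and are ignored */
    *out_size = out;
    return 0;
}
```

An output capacity of ((src_size - 8) / 2) * 3 bytes, as allocated in the
patch, is enough for any input this function accepts.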
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>