From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTP id 4037540584
	for <ffmpegdev@gitmailbox.com>; Fri, 23 Sep 2022 14:27:44 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 066FE68BBB8;
	Fri, 23 Sep 2022 17:27:42 +0300 (EEST)
Received: from mail-oi1-f175.google.com (mail-oi1-f175.google.com
 [209.85.167.175])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 33AA568BA76
 for <ffmpeg-devel@ffmpeg.org>; Fri, 23 Sep 2022 17:27:36 +0300 (EEST)
Received: by mail-oi1-f175.google.com with SMTP id u131so34797oie.5
 for <ffmpeg-devel@ffmpeg.org>; Fri, 23 Sep 2022 07:27:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112;
 h=content-transfer-encoding:in-reply-to:from:content-language
 :references:to:subject:user-agent:mime-version:date:message-id:from
 :to:cc:subject:date;
 bh=zngBq0uQdIWcmoTo2xMYpzEOALP6V+qVMb//mJTkQP4=;
 b=OFVZ6i9n3OlOKMQQEqJ2G2AIi7vVzS6gl+P9ZbjfmUZro4Nai0OVktGPQMXLpr+S2l
 xyYPKxZKfFBgoAfzsiMaz/4bd4MvDCHPTXry12roIEkUjF3L0GEtCwRkhcGLqIRNSZwG
 dAjE67alWHfPDemoqvJlyh5u5O9/8QzFHBkn2rJyQ3IzXoTb06kCnH9VT4PhykSDmrd7
 pDIYixH2zim9S+qIx9HKh8MpRiHWJcoGXLUBjbDHxGp6jAlx2tBGWxC5BkdAbAJEJX4q
 6q2fxryIgW9fES3wq+UTsvLvorCKW/ZskpuSso1q8nZmqnd0ArfsVCf79BFrkWUAbcJ7
 G7LA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:in-reply-to:from:content-language
 :references:to:subject:user-agent:mime-version:date:message-id
 :x-gm-message-state:from:to:cc:subject:date;
 bh=zngBq0uQdIWcmoTo2xMYpzEOALP6V+qVMb//mJTkQP4=;
 b=uOIlU1XWSnJQfa/GmEEJy4paQeDAuU4iwxhT2m2IfDNrmMaF0uUI1daLtp1+5GHgWE
 T6fo0RE+J93bbhseNGqmI8b+45MNmfeYAGWYP/uPQB3l6SSbvckE7Tko6sNTYtaN41bA
 TOSu+QF1Bx4NNcDZnhRnRLJelykFm3P2U60xWecBSHYIHuGehlUR/jp02LjII3/SWpQ0
 F1voAgAaS3jnzwxnEL9Y/dEhQATshTyW7oTeXqk40lr4mYaPL7As2CM4sLFC3CD26Exm
 N5eOpNCi4s04n/HddGMJHY6HovHl97ZpYjk5QgRuaUGIS+SpPqy/jC4cndTRL4Ya/YRy
 dvvg==
X-Gm-Message-State: ACrzQf0Uu5ytkycjGPVNjvaDY4gTxhO+fn3ERhGgBTHyzI+O5X8OtzAq
 G4g1vWAU0E3TrlMpwNZstIOfocQPbPc=
X-Google-Smtp-Source: AMsMyM4PL2KL51OGGkMJD1LlEdf5Zm4Pr75Tz6uVoNDU7qds8sldzSfZVmLfgpPqJvIdn3waQi2//A==
X-Received: by 2002:a05:6808:f91:b0:350:ce5d:fe68 with SMTP id
 o17-20020a0568080f9100b00350ce5dfe68mr4056929oiw.166.1663943253950; 
 Fri, 23 Sep 2022 07:27:33 -0700 (PDT)
Received: from [192.168.0.13] ([191.97.187.183])
 by smtp.gmail.com with ESMTPSA id
 f47-20020a4a8932000000b0044b491ccf97sm3289450ooi.25.2022.09.23.07.27.32
 for <ffmpeg-devel@ffmpeg.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Fri, 23 Sep 2022 07:27:33 -0700 (PDT)
Message-ID: <011a0421-e70a-37b3-b9f8-e9015f3622e7@gmail.com>
Date: Fri, 23 Sep 2022 11:27:32 -0300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
 Thunderbird/102.3.0
To: ffmpeg-devel@ffmpeg.org
References: <20220830014653.2477-1-jamrial@gmail.com>
 <20220905010955.173-1-jamrial@gmail.com>
 <166393837176.26932.2149159917577069121@lain.khirnov.net>
Content-Language: en-US
From: James Almer <jamrial@gmail.com>
In-Reply-To: <166393837176.26932.2149159917577069121@lain.khirnov.net>
Subject: Re: [FFmpeg-devel] [PATCH] avcodec: add a bsf to reorder DTS into
 PTS
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/011a0421-e70a-37b3-b9f8-e9015f3622e7@gmail.com/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

On 9/23/2022 10:06 AM, Anton Khirnov wrote:
> Quoting James Almer (2022-09-05 03:09:55)
>> +static int h264_init(AVBSFContext *ctx)
>> +{
>> +    DTS2PTSContext *s = ctx->priv_data;
>> +    DTS2PTSH264Context *h264 = &s->u.h264;
>> +
>> +    s->cbc->decompose_unit_types    = h264_decompose_unit_types;
>> +    s->cbc->nb_decompose_unit_types = FF_ARRAY_ELEMS(h264_decompose_unit_types);
>> +
>> +    s->nb_frame = -(ctx->par_in->video_delay << 1);
>> +    h264->last_poc = h264->highest_poc = INT_MIN;
> 
> just call h264_flush()?

Ok. I avoided doing that since the memsets in there would be run for no 
gain. But this is init() after all, so called only once.

> 
>> +    return 0;
>> +}
>> +
>> +static int get_mmco_reset(const H264RawSliceHeader *header)
>> +{
>> +    if (header->nal_unit_header.nal_ref_idc == 0 ||
>> +        !header->adaptive_ref_pic_marking_mode_flag)
>> +        return 0;
>> +
>> +    for (int i = 0; i < H264_MAX_MMCO_COUNT; i++) {
>> +        if (header->mmco[i].memory_management_control_operation == 0)
>> +            return 0;
>> +        else if (header->mmco[i].memory_management_control_operation == 5)
>> +            return 1;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int h264_queue_frame(AVBSFContext *ctx, AVPacket *pkt, int poc, int *queued)
>> +{
>> +    DTS2PTSContext *s = ctx->priv_data;
>> +    DTS2PTSH264Context *h264 = &s->u.h264;
>> +    DTS2PTSFrame frame;
>> +    int poc_diff, ret;
>> +
>> +    poc_diff = (h264->picture_structure == 3) + 1;
>> +    if (h264->sps.frame_mbs_only_flag && h264->poc_diff)
>> +        poc_diff = FFMIN(poc_diff, h264->poc_diff);
>> +    if (poc < 0) {
>> +        av_tree_enumerate(s->root, &poc_diff, NULL, dec_poc);
>> +        s->nb_frame -= poc_diff;
>> +    }
>> +    // Check if there was a POC reset (Like an IDR slice)
> 
> I don't think this is enough. You should bump the sequence counter on
> - IDR
> - MMCO type 5
> - SPS change
> - H264_NAL_END_SEQUENCE/STREAM

This is handled when parsing the packet. ff_h264_init_poc() will return 
the actual POC value for the frame as coded in the bistream, which 
should have been reset in all those cases.

> Then every sequence should get a separate tree and you sort POCs in each
> tree.

When POC is reset, for example on an IDR, the last few frames of the 
previous GOP are meant to have as PTS the DTS from the first few frames 
in the new GOP. Is using separate trees really simplifying things if i 
still need to cross tree boundaries to fetch timestamps?

> 
>> +    if (s->nb_frame > h264->highest_poc) {
>> +        s->nb_frame = 0;
>> +        s->gop = (s->gop + 1) % s->fifo_size;
>> +        h264->highest_poc = h264->last_poc;
>> +    }
>> +
>> +    ret = alloc_and_insert_node(ctx, pkt->dts, pkt->duration, s->nb_frame, poc_diff, s->gop);
>> +    if (ret < 0)
>> +        return ret;
>> +    av_log(ctx, AV_LOG_DEBUG, "Queueing frame with POC %d, GOP %d, dts %"PRId64"\n",
>> +           poc, s->gop, pkt->dts);
>> +    s->nb_frame += poc_diff;
>> +
>> +    // Add frame to output FIFO only once
>> +    if (*queued)
>> +        return 0;
>> +
>> +    frame = (DTS2PTSFrame) { pkt, poc, poc_diff, s->gop };
>> +    ret = av_fifo_write(s->fifo, &frame, 1);
>> +    av_assert2(ret >= 0);
>> +    *queued = 1;
>> +
>> +    return 0;
>> +}
>> +
>> +static int h264_filter(AVBSFContext *ctx)
>> +{
>> +    DTS2PTSContext *s = ctx->priv_data;
>> +    DTS2PTSH264Context *h264 = &s->u.h264;
>> +    CodedBitstreamFragment *au = &s->au;
>> +    AVPacket *in;
>> +    int output_picture_number = INT_MIN;
>> +    int field_poc[2];
>> +    int queued = 0, ret;
>> +
>> +    ret = ff_bsf_get_packet(ctx, &in);
>> +    if (ret < 0)
>> +        return ret;
>> +
>> +    ret = ff_cbs_read_packet(s->cbc, au, in);
>> +    if (ret < 0) {
>> +        av_log(ctx, AV_LOG_WARNING, "Failed to parse access unit.\n");
>> +        goto fail;
>> +    }
>> +
>> +    for (int i = 0; i < au->nb_units; i++) {
>> +        CodedBitstreamUnit *unit = &au->units[i];
>> +
>> +        switch (unit->type) {
>> +        case H264_NAL_IDR_SLICE:
>> +            h264->poc.prev_frame_num        = 0;
>> +            h264->poc.prev_frame_num_offset = 0;
>> +            h264->poc.prev_poc_msb          =
>> +            h264->poc.prev_poc_lsb          = 0;
>> +        // fall-through
>> +        case H264_NAL_SLICE: {
>> +            const H264RawSlice *slice = unit->content;
>> +            const H264RawSliceHeader *header = &slice->header;
>> +            const CodedBitstreamH264Context *cbs_h264 = s->cbc->priv_data;
>> +            const H264RawSPS *sps = cbs_h264->active_sps;
>> +            int got_reset;
>> +
>> +            if (!sps) {
>> +                av_log(ctx, AV_LOG_ERROR, "No active SPS for a slice\n");
>> +                goto fail;
>> +            }
>> +            // Initialize the SPS struct with the fields ff_h264_init_poc() cares about
>> +            h264->sps.frame_mbs_only_flag            = sps->frame_mbs_only_flag;
>> +            h264->sps.log2_max_frame_num             = sps->log2_max_frame_num_minus4 + 4;
>> +            h264->sps.poc_type                       = sps->pic_order_cnt_type;
>> +            h264->sps.log2_max_poc_lsb               = sps->log2_max_pic_order_cnt_lsb_minus4 + 4;
>> +            h264->sps.offset_for_non_ref_pic         = sps->offset_for_non_ref_pic;
>> +            h264->sps.offset_for_top_to_bottom_field = sps->offset_for_top_to_bottom_field;
>> +            h264->sps.poc_cycle_length               = sps->num_ref_frames_in_pic_order_cnt_cycle;
>> +            for (int i = 0; i < h264->sps.poc_cycle_length; i++)
> 
> moderately evil shadowing

Yikes, will change.

> 
>> +                h264->sps.offset_for_ref_frame[i] = sps->offset_for_ref_frame[i];
>> +
>> +            h264->picture_structure = sps->frame_mbs_only_flag ? 3 :
>> +                                      (header->field_pic_flag ?
>> +                                       header->field_pic_flag + header->bottom_field_flag : 3);
>> +
>> +            h264->poc.frame_num = header->frame_num;
>> +            h264->poc.poc_lsb = header->pic_order_cnt_lsb;
>> +            h264->poc.delta_poc_bottom = header->delta_pic_order_cnt_bottom;
>> +            h264->poc.delta_poc[0] = header->delta_pic_order_cnt[0];
>> +            h264->poc.delta_poc[1] = header->delta_pic_order_cnt[1];
>> +
>> +            field_poc[0] = field_poc[1] = INT_MAX;
>> +            ret = ff_h264_init_poc(field_poc, &output_picture_number, &h264->sps,
>> +                                   &h264->poc, h264->picture_structure,
>> +                                   header->nal_unit_header.nal_ref_idc);
>> +            if (ret < 0) {
>> +                av_log(ctx, AV_LOG_ERROR, "ff_h264_init_poc() failure\n");
>> +                goto fail;
>> +            }
>> +
>> +            got_reset = get_mmco_reset(header);
>> +            h264->poc.prev_frame_num        = got_reset ? 0 : h264->poc.frame_num;
>> +            h264->poc.prev_frame_num_offset = got_reset ? 0 : h264->poc.frame_num_offset;
>> +            if (header->nal_unit_header.nal_ref_idc != 0) {
>> +                h264->poc.prev_poc_msb      = got_reset ? 0 : h264->poc.poc_msb;
>> +                if (got_reset)
>> +                    h264->poc.prev_poc_lsb = h264->picture_structure == 2 ? 0 : field_poc[0];
>> +                else
>> +                    h264->poc.prev_poc_lsb = h264->poc.poc_lsb;
>> +            }
>> +
>> +            if (output_picture_number != h264->last_poc) {
> 
> Why this condition?

A single frame/field could be split into several slice NALs, as signaled 
by first_mb_in_slice. The POC value (and pretty much everything else in 
the slice NAL header) is duplicated in all of them.
This is to add the frame/slice only once to the tree and fifo.

> 
> Also the entire block below looks extremely magical
> and could use some detailed explanation.

So each frame can be coded either as a complete frame or as field pairs. 
POC is most cases increases by 1 per field, even in coded frames. But in 
some rare cases it doesn't, like this one sample in FATE where 
frame_mbs_only_flag is 1 (meaning the entire sequence is coded frames 
and it's not signaled in a per slice basis) and POC increases by 1 per 
coded frame instead of 2 like in the vast majority of samples.

This code below is an heuristic to find this difference between POC 
values and build the tree accordingly, because there doesn't seem to be 
anything in the bitstream to signal this at all.

> 
>> +                if (h264->last_poc != INT_MIN) {
>> +                    int diff = FFABS(h264->last_poc - output_picture_number);
>> +
>> +                    if ((output_picture_number < 0) && !h264->last_poc)
>> +                        h264->poc_diff = 0;
>> +                    else if (FFABS(output_picture_number) < h264->poc_diff) {
>> +                        diff = FFABS(output_picture_number);
>> +                        h264->poc_diff = 0;
>> +                    }
>> +                    if (!h264->poc_diff || (h264->poc_diff > diff)) {
>> +                        h264->poc_diff = diff;
>> +                        if (h264->poc_diff == 1 && h264->sps.frame_mbs_only_flag) {
>> +                            av_tree_enumerate(s->root, &h264->poc_diff, NULL, dec_poc);
>> +                            s->nb_frame -= 2;
>> +                        }
>> +                    }
>> +                }
>> +                h264->last_poc = output_picture_number;
>> +                h264->highest_poc = FFMAX(h264->highest_poc, output_picture_number);
>> +
>> +                ret = h264_queue_frame(ctx, in, output_picture_number, &queued);
>> +                if (ret < 0)
>> +                    goto fail;
>> +            }
>> +            break;
>> +        }
>> +        default:
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (output_picture_number == INT_MIN) {
>> +        av_log(ctx, AV_LOG_ERROR, "No slices in access unit\n");
>> +        ret = AVERROR_INVALIDDATA;
>> +        goto fail;
>> +    }
>> +
>> +    ret = AVERROR(EAGAIN);
>> +fail:
>> +    ff_cbs_fragment_reset(au);
>> +    if (!queued)
>> +        av_packet_free(&in);
>> +
>> +    return ret;
>> +}
>> [...]
>> +static int dts2pts_filter(AVBSFContext *ctx, AVPacket *out)
>> +{
>> +    DTS2PTSContext *s = ctx->priv_data;
>> +    DTS2PTSNode *poc_node = NULL, *next[2] = { NULL, NULL };
>> +    DTS2PTSFrame frame;
>> +    int ret;
>> +
>> +    // Fill up the FIFO and POC tree
>> +    if (!s->eof && av_fifo_can_write(s->fifo)) {
> 
> More than one packet can be available, so this should probably be a
> loop.

What do you mean? AVBSFContext can contain at most one buffered packet. 
And in here i'm filling the FIFO before i start draining and replacing 
one packet in it at a time.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".