From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTP id B3C9B45BF2
	for <ffmpegdev@gitmailbox.com>; Sat, 25 Mar 2023 19:18:40 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 812E068CA2B;
	Sat, 25 Mar 2023 21:16:26 +0200 (EET)
Received: from mail0.khirnov.net (red.khirnov.net [176.97.15.12])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id BEAB668C9AD
 for <ffmpeg-devel@ffmpeg.org>; Sat, 25 Mar 2023 21:16:07 +0200 (EET)
Received: from localhost (localhost [IPv6:::1])
 by mail0.khirnov.net (Postfix) with ESMTP id 95BDD2405F9
 for <ffmpeg-devel@ffmpeg.org>; Sat, 25 Mar 2023 20:16:04 +0100 (CET)
Received: from mail0.khirnov.net ([IPv6:::1])
 by localhost (mail0.khirnov.net [IPv6:::1]) (amavisd-new, port 10024)
 with ESMTP id nL-uM_K6lja7 for <ffmpeg-devel@ffmpeg.org>;
 Sat, 25 Mar 2023 20:16:03 +0100 (CET)
Received: from libav.khirnov.net (libav.khirnov.net
 [IPv6:2a00:c500:561:201::7])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "libav.khirnov.net",
 Issuer "smtp.khirnov.net SMTP CA" (verified OK))
 by mail0.khirnov.net (Postfix) with ESMTPS id EEC6A2405B5
 for <ffmpeg-devel@ffmpeg.org>; Sat, 25 Mar 2023 20:16:00 +0100 (CET)
Received: from libav.khirnov.net (libav.khirnov.net [IPv6:::1])
 by libav.khirnov.net (Postfix) with ESMTP id A907C3A05C8
 for <ffmpeg-devel@ffmpeg.org>; Sat, 25 Mar 2023 20:15:54 +0100 (CET)
From: Anton Khirnov <anton@khirnov.net>
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 Mar 2023 20:15:14 +0100
Message-Id: <20230325191529.10578-8-anton@khirnov.net>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230325191529.10578-1-anton@khirnov.net>
References: <20230325191529.10578-1-anton@khirnov.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 08/23] fftools/ffmpeg: use sync queues for
 enforcing audio frame size
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/20230325191529.10578-8-anton@khirnov.net/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

The code currently uses lavfi for this, which creates a sort of
configuration dependency loop: the encoder should ideally be
initialized with information from the first audio frame, but to get
that frame one first needs to open the encoder to learn the frame size.
This necessitates an awkward workaround, which causes audio handling to
be different from video.

With this change, the sync queue placed before encoding enforces the
frame size, so audio encoders are initialized on the first frame, same
as video.
---
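For readers unfamiliar with the idea, here is a minimal standalone
sketch of what "enforcing audio frame size" amounts to: input frames of
arbitrary length are buffered and handed out again in chunks of exactly
frame_size samples, with a short final chunk allowed on flush. This is
an illustration only and not the sync queue implementation; Rechunker,
rechunker_init(), rechunker_send(), rechunker_receive() and MAX_BUFFERED
are hypothetical names that do not exist in fftools, and the fixed-size
mono 16-bit buffer is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BUFFERED 4096 /* arbitrary capacity chosen for this sketch */

/* Hypothetical helper, not part of fftools: collects input frames of any
 * length and hands out frames of exactly frame_size samples. */
typedef struct Rechunker {
    short  buf[MAX_BUFFERED];
    size_t nb_samples; /* samples currently buffered */
    size_t frame_size; /* fixed output frame size, must be > 0 */
} Rechunker;

static void rechunker_init(Rechunker *r, size_t frame_size)
{
    assert(frame_size > 0 && frame_size <= MAX_BUFFERED);
    r->nb_samples = 0;
    r->frame_size = frame_size;
}

/* Append one input frame of n samples.
 * Returns 0 on success, -1 if the buffer cannot hold it. */
static int rechunker_send(Rechunker *r, const short *samples, size_t n)
{
    if (n > MAX_BUFFERED - r->nb_samples)
        return -1;
    memcpy(r->buf + r->nb_samples, samples, n * sizeof(*samples));
    r->nb_samples += n;
    return 0;
}

/* Copy the next output frame into out (room for frame_size samples) and
 * return its length in samples. Returns 0 when more input is needed.
 * With flush != 0 the remaining samples are returned as a short frame. */
static size_t rechunker_receive(Rechunker *r, short *out, int flush)
{
    size_t n = r->frame_size;

    if (r->nb_samples < n) {
        if (!flush)
            return 0;
        n = r->nb_samples;
    }

    memcpy(out, r->buf, n * sizeof(*out));
    /* shift the leftover samples to the start of the buffer */
    memmove(r->buf, r->buf + n, (r->nb_samples - n) * sizeof(*out));
    r->nb_samples -= n;
    return n;
}
```

The point of the sketch is that the consumer only needs to know
frame_size at the moment it starts pulling output, not when the
producer is configured, which is why the encoder no longer has to be
opened before the first frame arrives.
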
 fftools/ffmpeg.c          | 58 ++++++++-------------------------------
 fftools/ffmpeg_filter.c   |  8 ------
 fftools/ffmpeg_mux_init.c | 19 +++++++++----
 3 files changed, 25 insertions(+), 60 deletions(-)

diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
index 3a205a3b01..f00b2d44e4 100644
--- a/fftools/ffmpeg.c
+++ b/fftools/ffmpeg.c
@@ -1028,6 +1028,8 @@ static void do_audio_out(OutputFile *of, OutputStream *ost,
     AVCodecContext *enc = ost->enc_ctx;
     int ret;
 
+    init_output_stream_wrapper(ost, frame, 1);
+
     if (frame->pts == AV_NOPTS_VALUE)
         frame->pts = ost->next_pts;
     else {
@@ -1378,18 +1380,6 @@ static int reap_filters(int flush)
             continue;
         filter = ost->filter->filter;
 
-        /*
-         * Unlike video, with audio the audio frame size matters.
-         * Currently we are fully reliant on the lavfi filter chain to
-         * do the buffering deed for us, and thus the frame size parameter
-         * needs to be set accordingly. Where does one get the required
-         * frame size? From the initialized AVCodecContext of an audio
-         * encoder. Thus, if we have gotten to an audio stream, initialize
-         * the encoder earlier than receiving the first AVFrame.
-         */
-        if (av_buffersink_get_type(filter) == AVMEDIA_TYPE_AUDIO)
-            init_output_stream_wrapper(ost, NULL, 1);
-
         filtered_frame = ost->filtered_frame;
 
         while (1) {
@@ -1432,6 +1422,7 @@ static int reap_filters(int flush)
                 break;
             case AVMEDIA_TYPE_AUDIO:
                 if (!(enc->codec->capabilities & AV_CODEC_CAP_PARAM_CHANGE) &&
+                    avcodec_is_open(enc) &&
                     enc->ch_layout.nb_channels != filtered_frame->ch_layout.nb_channels) {
                     av_log(NULL, AV_LOG_ERROR,
                            "Audio filter graph output is not normalized and encoder does not support parameter changes\n");
@@ -3238,10 +3229,13 @@ static int init_output_stream(OutputStream *ost, AVFrame *frame,
                     ost->file_index, ost->index);
             return ret;
         }
-        if (codec->type == AVMEDIA_TYPE_AUDIO &&
-            !(codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE))
-            av_buffersink_set_frame_size(ost->filter->filter,
-                                            ost->enc_ctx->frame_size);
+
+        if (ost->enc_ctx->frame_size) {
+            av_assert0(ost->sq_idx_encode >= 0);
+            sq_frame_samples(output_files[ost->file_index]->sq_encode,
+                             ost->sq_idx_encode, ost->enc_ctx->frame_size);
+        }
+
         assert_avoptions(ost->encoder_opts);
         if (ost->enc_ctx->bit_rate && ost->enc_ctx->bit_rate < 1000 &&
             ost->enc_ctx->codec_id != AV_CODEC_ID_CODEC2 /* don't complain about 700 bit/s modes */)
@@ -3331,12 +3325,8 @@ static int transcode_init(void)
 
     /*
      * initialize stream copy and subtitle/data streams.
-     * Encoded AVFrame based streams will get initialized as follows:
-     * - when the first AVFrame is received in do_video_out
-     * - just before the first AVFrame is received in either transcode_step
-     *   or reap_filters due to us requiring the filter chain buffer sink
-     *   to be configured with the correct audio frame size, which is only
-     *   known after the encoder is initialized.
+     * Encoded AVFrame based streams will get initialized when the first AVFrame
+     * is received in do_video_out or do_audio_out
      */
     for (OutputStream *ost = ost_iter(NULL); ost; ost = ost_iter(ost)) {
         if (ost->enc_ctx &&
@@ -3942,30 +3932,6 @@ static int transcode_step(void)
     }
 
     if (ost->filter && ost->filter->graph->graph) {
-        /*
-         * Similar case to the early audio initialization in reap_filters.
-         * Audio is special in ffmpeg.c currently as we depend on lavfi's
-         * audio frame buffering/creation to get the output audio frame size
-         * in samples correct. The audio frame size for the filter chain is
-         * configured during the output stream initialization.
-         *
-         * Apparently avfilter_graph_request_oldest (called in
-         * transcode_from_filter just down the line) peeks. Peeking already
-         * puts one frame "ready to be given out", which means that any
-         * update in filter buffer sink configuration afterwards will not
-         * help us. And yes, even if it would be utilized,
-         * av_buffersink_get_samples is affected, as it internally utilizes
-         * the same early exit for peeked frames.
-         *
-         * In other words, if avfilter_graph_request_oldest would not make
-         * further filter chain configuration or usage of
-         * av_buffersink_get_samples useless (by just causing the return
-         * of the peeked AVFrame as-is), we could get rid of this additional
-         * early encoder initialization.
-         */
-        if (av_buffersink_get_type(ost->filter->filter) == AVMEDIA_TYPE_AUDIO)
-            init_output_stream_wrapper(ost, NULL, 1);
-
         if ((ret = transcode_from_filter(ost->filter->graph, &ist)) < 0)
             return ret;
         if (!ist)
diff --git a/fftools/ffmpeg_filter.c b/fftools/ffmpeg_filter.c
index 314b89b585..c9fd65e902 100644
--- a/fftools/ffmpeg_filter.c
+++ b/fftools/ffmpeg_filter.c
@@ -1242,14 +1242,6 @@ int configure_filtergraph(FilterGraph *fg)
 
     fg->reconfiguration = 1;
 
-    for (i = 0; i < fg->nb_outputs; i++) {
-        OutputStream *ost = fg->outputs[i]->ost;
-        if (ost->enc_ctx->codec_type == AVMEDIA_TYPE_AUDIO &&
-            !(ost->enc_ctx->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE))
-            av_buffersink_set_frame_size(ost->filter->filter,
-                                         ost->enc_ctx->frame_size);
-    }
-
     for (i = 0; i < fg->nb_inputs; i++) {
         AVFrame *tmp;
         while (av_fifo_read(fg->inputs[i]->frame_queue, &tmp, 1) >= 0) {
diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c
index ebc17059f9..385312d4fe 100644
--- a/fftools/ffmpeg_mux_init.c
+++ b/fftools/ffmpeg_mux_init.c
@@ -1451,7 +1451,7 @@ static void create_streams(Muxer *mux, const OptionsContext *o)
 static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t buf_size_us)
 {
     OutputFile *of = &mux->of;
-    int nb_av_enc = 0, nb_interleaved = 0;
+    int nb_av_enc = 0, nb_audio_fs = 0, nb_interleaved = 0;
     int limit_frames = 0, limit_frames_av_enc = 0;
 
 #define IS_AV_ENC(ost, type)  \
@@ -1468,19 +1468,26 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t buf_size_u
 
         nb_interleaved += IS_INTERLEAVED(type);
         nb_av_enc      += IS_AV_ENC(ost, type);
+        nb_audio_fs    += (ost->enc_ctx && type == AVMEDIA_TYPE_AUDIO &&
+                           !(ost->enc_ctx->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE));
 
         limit_frames        |=  ms->max_frames < INT64_MAX;
         limit_frames_av_enc |= (ms->max_frames < INT64_MAX) && IS_AV_ENC(ost, type);
     }
 
     if (!((nb_interleaved > 1 && of->shortest) ||
-          (nb_interleaved > 0 && limit_frames)))
+          (nb_interleaved > 0 && limit_frames) ||
+          nb_audio_fs))
         return 0;
 
-    /* if we have more than one encoded audio/video streams, or at least
-     * one encoded audio/video stream is frame-limited, then we
-     * synchronize them before encoding */
-    if ((of->shortest && nb_av_enc > 1) || limit_frames_av_enc) {
+    /* we use a sync queue before encoding when:
+     * - 'shortest' is in effect and we have two or more encoded audio/video
+     *   streams
+     * - at least one encoded audio/video stream is frame-limited, since
+     *   that has similar semantics to 'shortest'
+     * - at least one audio encoder requires constant frame sizes
+     */
+    if ((of->shortest && nb_av_enc > 1) || limit_frames_av_enc || nb_audio_fs) {
         of->sq_encode = sq_alloc(SYNC_QUEUE_FRAMES, buf_size_us);
         if (!of->sq_encode)
             return AVERROR(ENOMEM);
-- 
2.39.1
