From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTP id 7392844D79
	for <ffmpegdev@gitmailbox.com>; Tue, 21 Nov 2023 21:15:50 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9150668CCC9;
	Tue, 21 Nov 2023 23:15:04 +0200 (EET)
Received: from mail-pf1-f171.google.com (mail-pf1-f171.google.com
 [209.85.210.171])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 48D6D68CCCB
 for <ffmpeg-devel@ffmpeg.org>; Tue, 21 Nov 2023 23:14:55 +0200 (EET)
Received: by mail-pf1-f171.google.com with SMTP id
 d2e1a72fcca58-6c34e87b571so4899873b3a.3
 for <ffmpeg-devel@ffmpeg.org>; Tue, 21 Nov 2023 13:14:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1700601293; x=1701206093; darn=ffmpeg.org;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:to:from:from:to:cc:subject:date:message-id
 :reply-to; bh=rG4A55hSdFxBIZyGibAEnfV6rp6mILn4vlBJfx9CEDw=;
 b=TD840FzW/KZ7gW38XBHy7QmsNdiMB7vgfza+QB5iJ+fbDX5VgrV2nEHkVUu+zdEBje
 AeTTPYrlU/rKP/qidNCyqjv9DHd6gY5ISH4c42r/It02Ep1y1VB8vmGY18YNvoF7GHDn
 qiOj/1oGL0TVXxQs5E31xnpLiEfUvU0Jjx0rj3e+k8RlZj7ECeJR2ofkpOt6K0VAV1Sb
 nt2zCmk/p03tehLKTUdFHP3a25WW8WkXPO0TU4w0bKEaI8HemgTMH6l1/DXTXZZfR2IA
 7qeRzZ8rHRwLaZrGBbQPOFRN0Kk5jufeKWwAoVjL859axKJVxrLYfFFBBHlzLYA2yd4S
 Fb9w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1700601293; x=1701206093;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:to:from:x-gm-message-state:from:to:cc
 :subject:date:message-id:reply-to;
 bh=rG4A55hSdFxBIZyGibAEnfV6rp6mILn4vlBJfx9CEDw=;
 b=lPkELlCit5hv1nX7L9UQ1nCWGD7kI9gWWvzNC+UgkPPcZZBGwEAIBypUivKEQ6Ro3T
 lbSyHlQ6yNjjHIpDbacfLMxL7/BJqFriAzTnYdXnD+p7FH/2EUG1lqQwkU2QwOc4z+1Q
 WVmVNl3PVW8tHuMn999LmiPlVkutHGnyi6caApFSdZ98iITS4abyKgqwh/KdCKRf/AYF
 Id1NiQ5CZbxLK91xHhnQUO506inyeEzKH2zDs1SqsQ+7SrY5yVeGBdNKsdlj+yll6O2Q
 0fZ71wlm1c07wZCn0t5OvAqMNEc2NEwR1XMidDvhXFz0UfqkUYzX1+X0IkoUUwbMtUab
 xcYQ==
X-Gm-Message-State: AOJu0YzVRb/hkui3LSATQo8fBpXYhflCEKVh1bUXdczT8ULjU9TJHS+W
 C4v3+nk/2Y0XsYNrZiDgzmBSE/yBUbg=
X-Google-Smtp-Source: AGHT+IFC0BgaF55CFFgj6w0Q7QJv6GL2LbsFlOUrH0VyYFTMtLQoaC5DTOqJA+dNnnEhlGYrTtTCQw==
X-Received: by 2002:a05:6a21:9706:b0:18a:d4e6:ce20 with SMTP id
 ub6-20020a056a21970600b0018ad4e6ce20mr239838pzb.23.1700601293098; 
 Tue, 21 Nov 2023 13:14:53 -0800 (PST)
Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar.
 [190.225.105.197]) by smtp.gmail.com with ESMTPSA id
 bn2-20020a056a00324200b0069ea08a2a99sm8412505pfb.211.2023.11.21.13.14.52
 for <ffmpeg-devel@ffmpeg.org>
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 21 Nov 2023 13:14:52 -0800 (PST)
From: James Almer <jamrial@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 21 Nov 2023 18:14:42 -0300
Message-ID: <20231121211442.8723-6-jamrial@gmail.com>
X-Mailer: git-send-email 2.42.1
In-Reply-To: <20231121211442.8723-1-jamrial@gmail.com>
References: <20231121211442.8723-1-jamrial@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 5/5] ffmpeg: add support for muxing
 AVStreamGroups
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/20231121211442.8723-6-jamrial@gmail.com/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

Signed-off-by: James Almer <jamrial@gmail.com>
---
Example command line, remuxing an existing iamf file using the avoptions
defined for iamf in libavformat. This creates two stream groups, one
Audio Element and one Mixing Presentation. The first defines two layers,
one stereo and one 5.1. The latter defines two submixes, one for standard
loudspeakers output with two layouts, also stereo and 5.1, and one for
Binaural output.

./ffmpeg -i iamf_test_000059.iamf \
-stream_group "type=iamf_audio_element:id=1:st=0:st=1:st=2:st=3:default_w=10,demixing=dmixp_mode=1:parameter_id=998,recon_gain=parameter_id=101,layer=ch_layout=stereo,layer=ch_layout=5.1:recon_gain_is_present=true" \
-stream_group "type=iamf_mix_presentation:id=3:stg=0:annotations=en-us=Mix_Presentation,submix=parameter_id=100:default_mix_gain=1.1|element=stg=0:parameter_id=100:headphones_rendering_mode=stereo:annotations=en-us=Standard_submix|layout=sound_system=stereo:integrated_loudness=1.0|layout=sound_system=5.1,submix=parameter_id=100|element=stg=0:parameter_id=100:headphones_rendering_mode=binaural:default_mix_gain=1.0:annotations=en-us=Binaural_submix|layout=layout_type=binaural" \
-c:a copy -map 0 -y test.iamf

 fftools/ffmpeg.h          |   2 +
 fftools/ffmpeg_mux_init.c | 327 ++++++++++++++++++++++++++++++++++++++
 fftools/ffmpeg_opt.c      |   2 +
 3 files changed, 331 insertions(+)

diff --git a/fftools/ffmpeg.h b/fftools/ffmpeg.h
index 41935d39d5..057535adbb 100644
--- a/fftools/ffmpeg.h
+++ b/fftools/ffmpeg.h
@@ -262,6 +262,8 @@ typedef struct OptionsContext {
     int        nb_disposition;
     SpecifierOpt *program;
     int        nb_program;
+    SpecifierOpt *stream_groups;
+    int        nb_stream_groups;
     SpecifierOpt *time_bases;
     int        nb_time_bases;
     SpecifierOpt *enc_time_bases;
diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c
index 63a25a350f..62e3e4aa86 100644
--- a/fftools/ffmpeg_mux_init.c
+++ b/fftools/ffmpeg_mux_init.c
@@ -27,6 +27,7 @@
 
 #include "libavformat/avformat.h"
 #include "libavformat/avio.h"
+#include "libavformat/iamf.h"
 
 #include "libavcodec/avcodec.h"
 
@@ -1943,6 +1944,328 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t buf_size_u
     return 0;
 }
 
+static int of_parse_iamf_audio_element_layers(Muxer *mux, AVStreamGroup *stg, char **ptr)
+{
+    AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element;
+    AVDictionary *dict = NULL;
+    const char *token;
+    int ret = 0;
+
+    audio_element->demixing_info =
+        avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_DEMIXING, NULL, 1, NULL, NULL);
+    audio_element->recon_gain_info =
+        avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, NULL, 1, NULL, NULL);
+
+    if (!audio_element->demixing_info ||
+        !audio_element->recon_gain_info)
+        return AVERROR(ENOMEM);
+
+    /* process manually set layers and parameters */
+    token = av_strtok(NULL, ",", ptr);
+    while (token) {
+        const AVDictionaryEntry *e;
+        int demixing = 0, recon_gain = 0;
+        int layer = 0;
+
+        if (av_strstart(token, "layer=", &token))
+            layer = 1;
+        else if (av_strstart(token, "demixing=", &token))
+            demixing = 1;
+        else if (av_strstart(token, "recon_gain=", &token))
+            recon_gain = 1;
+
+        av_dict_free(&dict);
+        ret = av_dict_parse_string(&dict, token, "=", ":", 0);
+        if (ret < 0) {
+            av_log(mux, AV_LOG_ERROR, "Error parsing audio element specification %s\n", token);
+            goto fail;
+        }
+
+        if (layer) {
+            ret = avformat_iamf_audio_element_add_layer(audio_element, &dict);
+            if (ret < 0) {
+                av_log(mux, AV_LOG_ERROR, "Error adding layer to stream group %d\n", stg->index);
+                goto fail;
+            }
+        } else if (demixing || recon_gain) {
+            AVIAMFParamDefinition *param = demixing ? audio_element->demixing_info
+                                                    : audio_element->recon_gain_info;
+            void *subblock = avformat_iamf_param_definition_get_subblock(param, 0);
+
+            av_opt_set_dict(param, &dict);
+            av_opt_set_dict(subblock, &dict);
+
+            /* Hardcode spec parameters */
+            param->param_definition_mode = 0;
+            param->parameter_rate = stg->streams[0]->codecpar->sample_rate;
+            param->duration =
+            param->constant_subblock_duration = stg->streams[0]->codecpar->frame_size;
+        }
+
+        // make sure that no entries are left in the dict
+        e = NULL;
+        if (e = av_dict_iterate(dict, e)) {
+            av_log(mux, AV_LOG_FATAL, "Unknown layer key %s.\n", e->key);
+            ret = AVERROR(EINVAL);
+            goto fail;
+        }
+        token = av_strtok(NULL, ",", ptr);
+    }
+
+fail:
+    av_dict_free(&dict);
+    if (!ret && !audio_element->num_layers) {
+        av_log(mux, AV_LOG_ERROR, "No layer in audio element specification\n");
+        ret = AVERROR(EINVAL);
+    }
+
+    return ret;
+}
+
+static int of_parse_iamf_submixes(Muxer *mux, AVStreamGroup *stg, char **ptr)
+{
+    AVFormatContext *oc = mux->fc;
+    AVIAMFMixPresentation *mix = stg->params.iamf_mix_presentation;
+    AVDictionary *dict = NULL;
+    const char *token;
+    char *submix_str = NULL;
+    int ret = 0;
+
+    /* process manually set submixes */
+    token = av_strtok(NULL, ",", ptr);
+    while (token) {
+        AVIAMFSubmix *submix = NULL;
+        const char *subtoken;
+        char *subptr = NULL;
+
+        if (!av_strstart(token, "submix=", &token)) {
+            av_log(mux, AV_LOG_ERROR, "No submix in mix presentation specification \"%s\"\n", token);
+            goto fail;
+        }
+
+        submix_str = av_strdup(token);
+        if (!submix_str)
+            goto fail;
+
+        ret = avformat_iamf_mix_presentation_add_submix(mix, NULL);
+        if (!ret) {
+            submix = mix->submixes[mix->num_submixes - 1];
+            submix->output_mix_config =
+                avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, 0, NULL, NULL);
+            if (!submix->output_mix_config)
+                ret = AVERROR(ENOMEM);
+        }
+        if (ret < 0) {
+            av_log(mux, AV_LOG_ERROR, "Error adding submix to stream group %d\n", stg->index);
+            goto fail;
+        }
+
+        submix->output_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate;
+
+        subptr = NULL;
+        subtoken = av_strtok(submix_str, "|", &subptr);
+        while (subtoken) {
+            const AVDictionaryEntry *e;
+            int element = 0, layout = 0;
+
+            if (av_strstart(subtoken, "element=", &subtoken))
+                element = 1;
+            else if (av_strstart(subtoken, "layout=", &subtoken))
+                layout = 1;
+
+            av_dict_free(&dict);
+            ret = av_dict_parse_string(&dict, subtoken, "=", ":", 0);
+            if (ret < 0) {
+                av_log(mux, AV_LOG_ERROR, "Error parsing submix specification \"%s\"\n", subtoken);
+                goto fail;
+            }
+
+            if (element) {
+                AVIAMFSubmixElement *submix_element;
+                int idx = -1;
+
+                if (e = av_dict_get(dict, "stg", NULL, 0))
+                    idx = strtol(e->value, NULL, 0);
+                av_dict_set(&dict, "stg", NULL, 0);
+                if (idx < 0 || idx >= oc->nb_stream_groups) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid or missing stream group index in "
+                                              "submix element specification \"%s\"\n", subtoken);
+                    ret = AVERROR(EINVAL);
+                    goto fail;
+                }
+                ret = avformat_iamf_submix_add_element(submix, NULL);
+                if (ret < 0)
+                    av_log(mux, AV_LOG_ERROR, "Error adding element to submix\n");
+
+                submix_element = submix->elements[submix->num_elements - 1];
+                submix_element->audio_element_id = oc->stream_groups[idx]->id;
+
+                submix_element->element_mix_config =
+                    avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, 0, NULL, NULL);
+                if (!submix_element->element_mix_config)
+                    ret = AVERROR(ENOMEM);
+                av_opt_set_dict2(submix_element, &dict, AV_OPT_SEARCH_CHILDREN);
+                submix_element->element_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate;
+            } else if (layout) {
+                ret = avformat_iamf_submix_add_layout(submix, &dict);
+                if (ret < 0)
+                    av_log(mux, AV_LOG_ERROR, "Error adding layout to submix\n");
+            } else
+                av_opt_set_dict2(submix, &dict, AV_OPT_SEARCH_CHILDREN);
+
+            if (ret < 0) {
+                goto fail;
+            }
+
+            // make sure that no entries are left in the dict
+            e = NULL;
+            while (e = av_dict_iterate(dict, e)) {
+                av_log(mux, AV_LOG_FATAL, "Unknown submix key %s.\n", e->key);
+                ret = AVERROR(EINVAL);
+                goto fail;
+            }
+            subtoken = av_strtok(NULL, "|", &subptr);
+        }
+        av_freep(&submix_str);
+
+        if (!submix->num_elements) {
+            av_log(mux, AV_LOG_ERROR, "No audio elements in submix specification \"%s\"\n", token);
+            ret = AVERROR(EINVAL);
+        }
+        token = av_strtok(NULL, ",", ptr);
+    }
+
+fail:
+    av_dict_free(&dict);
+    av_free(submix_str);
+
+    return ret;
+}
+
+static int of_add_groups(Muxer *mux, const OptionsContext *o)
+{
+    AVFormatContext *oc = mux->fc;
+    int ret;
+
+    /* process manually set groups */
+    for (int i = 0; i < o->nb_stream_groups; i++) {
+        AVDictionary *dict = NULL, *tmp = NULL;
+        const AVDictionaryEntry *e;
+        AVStreamGroup *stg = NULL;
+        int type;
+        const char *token;
+        char *str, *ptr = NULL;
+        const AVOption opts[] = {
+            { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT,
+                    { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" },
+                { "iamf_audio_element",    NULL, 0, AV_OPT_TYPE_CONST,
+                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT },    .unit = "type" },
+                { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
+                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
+            { NULL },
+        };
+         const AVClass class = {
+            .class_name = "StreamGroupType",
+            .item_name  = av_default_item_name,
+            .option     = opts,
+            .version    = LIBAVUTIL_VERSION_INT,
+        };
+        const AVClass *pclass = &class;
+
+        str = av_strdup(o->stream_groups[i].u.str);
+        if (!str)
+            goto end;
+
+        token = av_strtok(str, ",", &ptr);
+        if (token) {
+            ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY);
+            if (ret < 0) {
+                av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token);
+                goto end;
+            }
+
+            // "type" is not a user settable option in AVStreamGroup
+            e = av_dict_get(dict, "type", NULL, 0);
+            if (!e) {
+                av_log(mux, AV_LOG_ERROR, "No type define for Steam Group %d\n", i);
+                ret = AVERROR(EINVAL);
+                goto end;
+            }
+
+            ret = av_opt_eval_int(&pclass, opts, e->value, &type);
+            if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) {
+                av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
+                goto end;
+            }
+
+            av_dict_copy(&tmp, dict, 0);
+            stg = avformat_stream_group_create(oc, type, &tmp);
+            if (!stg) {
+                ret = AVERROR(ENOMEM);
+                goto end;
+            }
+            av_dict_set(&tmp, "type", NULL, 0);
+
+            e = NULL;
+            while (e = av_dict_get(dict, "st", e, 0)) {
+                unsigned int idx = strtol(e->value, NULL, 0);
+                if (idx >= oc->nb_streams) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx);
+                    ret = AVERROR(EINVAL);
+                    goto end;
+                }
+                avformat_stream_group_add_stream(stg, oc->streams[idx]);
+            }
+            while (e = av_dict_get(dict, "stg", e, 0)) {
+                unsigned int idx = strtol(e->value, NULL, 0);
+                if (idx >= oc->nb_stream_groups || idx == stg->index) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid stream group index %d\n", idx);
+                    ret = AVERROR(EINVAL);
+                    goto end;
+                }
+                for (int j = 0; j < oc->stream_groups[idx]->nb_streams; j++)
+                    avformat_stream_group_add_stream(stg, oc->stream_groups[idx]->streams[j]);
+            }
+
+            switch(type) {
+            case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+                ret = of_parse_iamf_audio_element_layers(mux, stg, &ptr);
+                break;
+            case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+                ret = of_parse_iamf_submixes(mux, stg, &ptr);
+                break;
+            default:
+                av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type);
+                ret = AVERROR(EINVAL);
+                break;
+            }
+
+            if (ret < 0)
+                goto end;
+
+            // make sure that nothing but "st" and "stg" entries are left in the dict
+            e = NULL;
+            while (e = av_dict_iterate(tmp, e)) {
+                if (!strcmp(e->key, "st") || !strcmp(e->key, "stg"))
+                    continue;
+
+                av_log(mux, AV_LOG_FATAL, "Unknown group key %s.\n", e->key);
+                ret = AVERROR(EINVAL);
+                goto end;
+            }
+        }
+
+end:
+        av_dict_free(&dict);
+        av_dict_free(&tmp);
+        av_free(str);
+        if (ret < 0)
+            return ret;
+    }
+
+    return 0;
+}
+
 static int of_add_programs(Muxer *mux, const OptionsContext *o)
 {
     AVFormatContext *oc = mux->fc;
@@ -2740,6 +3063,10 @@ int of_open(const OptionsContext *o, const char *filename)
     if (err < 0)
         return err;
 
+    err = of_add_groups(mux, o);
+    if (err < 0)
+        return err;
+
     err = of_add_programs(mux, o);
     if (err < 0)
         return err;
diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c
index 304471dd03..1144f64f89 100644
--- a/fftools/ffmpeg_opt.c
+++ b/fftools/ffmpeg_opt.c
@@ -1491,6 +1491,8 @@ const OptionDef options[] = {
         "add metadata", "string=string" },
     { "program",        HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(program) },
         "add program with specified streams", "title=string:st=number..." },
+    { "stream_group",        HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(stream_groups) },
+        "add stream group with specified streams and group type-specific arguments", "id=number:st=number..." },
     { "dframes",        HAS_ARG | OPT_PERFILE | OPT_EXPERT |
                         OPT_OUTPUT,                                  { .func_arg = opt_data_frames },
         "set the number of data frames to output", "number" },
-- 
2.42.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".