Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
* [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup
@ 2023-12-05 22:43 James Almer
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:43 UTC (permalink / raw)
  To: ffmpeg-devel

Addressed Anton's comments and added some documentation. Also split the
common code some more in order to facilitate using it from different
modules.
I'm withdrawing the MP4 code for now as i've noticed a bug in the spec
and reported it. Depending on what happens to that, i'll resubmit it.

James Almer (8):
  avutil: introduce an Immersive Audio Model and Formats API
  avformat: introduce AVStreamGroup
  ffmpeg: add support for muxing AVStreamGroups
  avcodec/packet: add IAMF Parameters side data types
  avcodec/get_bits: add get_leb()
  avformat/aviobuf: add ffio_read_leb() and ffio_write_leb()
  avformat: Immersive Audio Model and Formats demuxer
  avformat: Immersive Audio Model and Formats muxer

 doc/fftools-common-opts.texi    |   17 +-
 fftools/ffmpeg.h                |    2 +
 fftools/ffmpeg_mux_init.c       |  335 ++++++++++
 fftools/ffmpeg_opt.c            |    2 +
 libavcodec/avpacket.c           |    3 +
 libavcodec/bitstream.h          |    2 +
 libavcodec/bitstream_template.h |   23 +
 libavcodec/get_bits.h           |   24 +
 libavcodec/packet.h             |   24 +
 libavformat/Makefile            |    2 +
 libavformat/allformats.c        |    2 +
 libavformat/avformat.c          |  185 +++++-
 libavformat/avformat.h          |  169 +++++
 libavformat/avio_internal.h     |   10 +
 libavformat/aviobuf.c           |   33 +
 libavformat/dump.c              |  147 +++-
 libavformat/iamf.c              |  125 ++++
 libavformat/iamf.h              |  162 +++++
 libavformat/iamf_parse.c        | 1106 +++++++++++++++++++++++++++++++
 libavformat/iamf_parse.h        |   38 ++
 libavformat/iamf_writer.c       |  823 +++++++++++++++++++++++
 libavformat/iamf_writer.h       |   51 ++
 libavformat/iamfdec.c           |  495 ++++++++++++++
 libavformat/iamfenc.c           |  388 +++++++++++
 libavformat/internal.h          |   33 +
 libavformat/options.c           |  139 ++++
 libavutil/Makefile              |    2 +
 libavutil/iamf.c                |  564 ++++++++++++++++
 libavutil/iamf.h                |  573 ++++++++++++++++
 29 files changed, 5445 insertions(+), 34 deletions(-)
 create mode 100644 libavformat/iamf.c
 create mode 100644 libavformat/iamf.h
 create mode 100644 libavformat/iamf_parse.c
 create mode 100644 libavformat/iamf_parse.h
 create mode 100644 libavformat/iamf_writer.c
 create mode 100644 libavformat/iamf_writer.h
 create mode 100644 libavformat/iamfdec.c
 create mode 100644 libavformat/iamfenc.c
 create mode 100644 libavutil/iamf.c
 create mode 100644 libavutil/iamf.h

-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
@ 2023-12-05 22:43 ` James Almer
  2023-12-11 10:19   ` Anton Khirnov
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup James Almer
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: James Almer @ 2023-12-05 22:43 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavutil/Makefile |   2 +
 libavutil/iamf.c   | 564 ++++++++++++++++++++++++++++++++++++++++++++
 libavutil/iamf.h   | 573 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1139 insertions(+)
 create mode 100644 libavutil/iamf.c
 create mode 100644 libavutil/iamf.h

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 4711f8cde8..62cc1a1831 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -51,6 +51,7 @@ HEADERS = adler32.h                                                     \
           hwcontext_videotoolbox.h                                      \
           hwcontext_vdpau.h                                             \
           hwcontext_vulkan.h                                            \
+          iamf.h                                                        \
           imgutils.h                                                    \
           intfloat.h                                                    \
           intreadwrite.h                                                \
@@ -140,6 +141,7 @@ OBJS = adler32.o                                                        \
        hdr_dynamic_vivid_metadata.o                                     \
        hmac.o                                                           \
        hwcontext.o                                                      \
+       iamf.o                                                           \
        imgutils.o                                                       \
        integer.o                                                        \
        intmath.o                                                        \
diff --git a/libavutil/iamf.c b/libavutil/iamf.c
new file mode 100644
index 0000000000..5f646f2d65
--- /dev/null
+++ b/libavutil/iamf.c
@@ -0,0 +1,564 @@
+/*
+ * Immersive Audio Model and Formats helper functions and defines
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <limits.h>
+#include <stddef.h>
+#include <stdint.h>
+
+#include "avassert.h"
+#include "error.h"
+#include "iamf.h"
+#include "log.h"
+#include "mem.h"
+#include "opt.h"
+
+#define IAMF_ADD_FUNC_TEMPLATE(parent_type, parent_name, child_type, child_name, suffix)                   \
+child_type *av_iamf_ ## parent_name ## _add_ ## child_name(parent_type *parent_name)                       \
+{                                                                                                          \
+    child_type **child_name ## suffix, *child_name;                                                        \
+                                                                                                           \
+    if (parent_name->nb_## child_name ## suffix == UINT_MAX)                                               \
+        return NULL;                                                                                       \
+                                                                                                           \
+    child_name ## suffix = av_realloc_array(parent_name->child_name ## suffix,                             \
+                                            parent_name->nb_## child_name ## suffix + 1,                   \
+                                            sizeof(*parent_name->child_name ## suffix));                   \
+    if (!child_name ## suffix)                                                                             \
+        return NULL;                                                                                       \
+                                                                                                           \
+    parent_name->child_name ## suffix = child_name ## suffix;                                              \
+                                                                                                           \
+    child_name = parent_name->child_name ## suffix[parent_name->nb_## child_name ## suffix]                \
+               = av_mallocz(sizeof(*child_name));                                                          \
+    if (!child_name)                                                                                       \
+        return NULL;                                                                                       \
+                                                                                                           \
+    child_name->av_class = &child_name ## _class;                                                          \
+    av_opt_set_defaults(child_name);                                                                       \
+    parent_name->nb_## child_name ## suffix++;                                                             \
+                                                                                                           \
+    return child_name;                                                                                     \
+}
+
+#define FLAGS AV_OPT_FLAG_ENCODING_PARAM
+
+//
+// Param Definition
+//
+#define OFFSET(x) offsetof(AVIAMFMixGain, x)
+static const AVOption mix_gain_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { "animation_type", "set animation_type", OFFSET(animation_type), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 2, FLAGS },
+    { "start_point_value", "set start_point_value", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "end_point_value", "set end_point_value", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "control_point_value", "set control_point_value", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "control_point_relative_time", "set control_point_relative_time", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, 0.0, 1.0, FLAGS },
+    { NULL },
+};
+
+static const AVClass mix_gain_class = {
+    .class_name     = "AVIAMFSubmixElement",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = mix_gain_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFDemixingInfo, x)
+static const AVOption demixing_info_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { "dmixp_mode", "set dmixp_mode", OFFSET(dmixp_mode), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 6, FLAGS },
+    { NULL },
+};
+
+static const AVClass demixing_info_class = {
+    .class_name     = "AVIAMFDemixingInfo",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = demixing_info_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFReconGain, x)
+static const AVOption recon_gain_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { NULL },
+};
+
+static const AVClass recon_gain_class = {
+    .class_name     = "AVIAMFReconGain",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = recon_gain_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFParamDefinition, x)
+static const AVOption param_definition_options[] = {
+    { "parameter_id", "set parameter_id", OFFSET(parameter_id), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "parameter_rate", "set parameter_rate", OFFSET(parameter_rate), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "param_definition_mode", "set param_definition_mode", OFFSET(param_definition_mode), AV_OPT_TYPE_INT, {.i64 = 1 }, 0, 1, FLAGS },
+    { "duration", "set duration", OFFSET(duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "constant_subblock_duration", "set constant_subblock_duration", OFFSET(constant_subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { NULL },
+};
+
+static const AVClass *param_definition_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        ret = &mix_gain_class;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        ret = &demixing_info_class;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        ret = &recon_gain_class;
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass param_definition_class = {
+    .class_name          = "AVIAMFParamDefinition",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = param_definition_options,
+    .child_class_iterate = param_definition_child_iterate,
+};
+
+const AVClass *av_iamf_param_definition_get_class(void)
+{
+    return &param_definition_class;
+}
+
+AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type,
+                                                      unsigned int nb_subblocks, size_t *out_size)
+{
+
+    struct MixGainStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFMixGain m;
+    };
+    struct DemixStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFDemixingInfo d;
+    };
+    struct ReconGainStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFReconGain r;
+    };
+    size_t subblocks_offset, subblock_size;
+    size_t size;
+    AVIAMFParamDefinition *par;
+
+    switch (type) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        subblocks_offset = offsetof(struct MixGainStruct, m);
+        subblock_size = sizeof(AVIAMFMixGain);
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        subblocks_offset = offsetof(struct DemixStruct, d);
+        subblock_size = sizeof(AVIAMFDemixingInfo);
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        subblocks_offset = offsetof(struct ReconGainStruct, r);
+        subblock_size = sizeof(AVIAMFReconGain);
+        break;
+    default:
+        return NULL;
+    }
+
+    size = subblocks_offset;
+    if (nb_subblocks > (SIZE_MAX - size) / subblock_size)
+        return NULL;
+    size += subblock_size * nb_subblocks;
+
+    par = av_mallocz(size);
+    if (!par)
+        return NULL;
+
+    par->av_class = &param_definition_class;
+    av_opt_set_defaults(par);
+
+    par->param_definition_type = type;
+    par->nb_subblocks = nb_subblocks;
+    par->subblock_size = subblock_size;
+    par->subblocks_offset = subblocks_offset;
+
+    for (int i = 0; i < nb_subblocks; i++) {
+        void *subblock = av_iamf_param_definition_get_subblock(par, i);
+
+        switch (type) {
+        case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+            ((AVIAMFMixGain *)subblock)->av_class = &mix_gain_class;
+            break;
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+            ((AVIAMFDemixingInfo *)subblock)->av_class = &demixing_info_class;
+            break;
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+            ((AVIAMFReconGain *)subblock)->av_class = &recon_gain_class;
+            break;
+        default:
+            av_assert0(0);
+        }
+
+        av_opt_set_defaults(subblock);
+    }
+
+    if (out_size)
+        *out_size = size;
+
+    return par;
+}
+
+//
+// Audio Element
+//
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFLayer, x)
+static const AVOption layer_options[] = {
+    { "ch_layout", "set ch_layout", OFFSET(ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = NULL }, 0, 0, FLAGS },
+    { "flags", "set flags", OFFSET(flags), AV_OPT_TYPE_FLAGS,
+        {.i64 = 0 }, 0, AV_IAMF_LAYER_FLAG_RECON_GAIN, FLAGS, "flags" },
+            {"recon_gain",  "Recon gain is present", 0, AV_OPT_TYPE_CONST,
+                {.i64 = AV_IAMF_LAYER_FLAG_RECON_GAIN }, INT_MIN, INT_MAX, FLAGS, "flags"},
+    { "output_gain_flags", "set output_gain_flags", OFFSET(output_gain_flags), AV_OPT_TYPE_FLAGS,
+        {.i64 = 0 }, 0, (1 << 6) - 1, FLAGS, "output_gain_flags" },
+            {"FL",  "Left channel",            0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 5 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"FR",  "Right channel",           0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 4 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"BL",  "Left surround channel",   0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 3 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"BR",  "Right surround channel",  0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 2 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"TFL", "Left top front channel",  0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 1 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"TFR", "Right top front channel", 0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 0 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+    { "output_gain", "set output_gain", OFFSET(output_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "ambisonics_mode", "set ambisonics_mode", OFFSET(ambisonics_mode), AV_OPT_TYPE_INT,
+            { .i64 = AV_IAMF_AMBISONICS_MODE_MONO },
+            AV_IAMF_AMBISONICS_MODE_MONO, AV_IAMF_AMBISONICS_MODE_PROJECTION, FLAGS, "ambisonics_mode" },
+        { "mono",       NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AMBISONICS_MODE_MONO },       .unit = "ambisonics_mode" },
+        { "projection", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AMBISONICS_MODE_PROJECTION }, .unit = "ambisonics_mode" },
+    { NULL },
+};
+
+static const AVClass layer_class = {
+    .class_name     = "AVIAMFLayer",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = layer_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFAudioElement, x)
+static const AVOption audio_element_options[] = {
+    { "audio_element_type", "set audio_element_type", OFFSET(audio_element_type), AV_OPT_TYPE_INT,
+            {.i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL },
+            AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, FLAGS, "audio_element_type" },
+        { "channel", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, .unit = "audio_element_type" },
+        { "scene",   NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE },   .unit = "audio_element_type" },
+    { "default_w", "set default_w", OFFSET(default_w), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 10, FLAGS },
+    { NULL },
+};
+
+static const AVClass *audio_element_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &layer_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass audio_element_class = {
+    .class_name          = "AVIAMFAudioElement",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = audio_element_options,
+    .child_class_iterate = audio_element_child_iterate,
+};
+
+const AVClass *av_iamf_audio_element_get_class(void)
+{
+    return &audio_element_class;
+}
+
+AVIAMFAudioElement *av_iamf_audio_element_alloc(void)
+{
+    AVIAMFAudioElement *audio_element = av_mallocz(sizeof(*audio_element));
+
+    if (audio_element) {
+        audio_element->av_class = &audio_element_class;
+        av_opt_set_defaults(audio_element);
+    }
+
+    return audio_element;
+}
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFAudioElement, audio_element, AVIAMFLayer, layer, s)
+
+void av_iamf_audio_element_free(AVIAMFAudioElement **paudio_element)
+{
+    AVIAMFAudioElement *audio_element = *paudio_element;
+
+    if (!audio_element)
+        return;
+
+    for (int i = 0; i < audio_element->nb_layers; i++) {
+        AVIAMFLayer *layer = audio_element->layers[i];
+        av_opt_free(layer);
+        av_free(layer->demixing_matrix);
+        av_free(layer);
+    }
+    av_free(audio_element->layers);
+
+    av_free(audio_element->demixing_info);
+    av_free(audio_element->recon_gain_info);
+    av_freep(paudio_element);
+}
+
+//
+// Mix Presentation
+//
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmixElement, x)
+static const AVOption submix_element_options[] = {
+    { "headphones_rendering_mode", "Headphones rendering mode", OFFSET(headphones_rendering_mode), AV_OPT_TYPE_INT,
+            { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO },
+            AV_IAMF_HEADPHONES_MODE_STEREO, AV_IAMF_HEADPHONES_MODE_BINAURAL, FLAGS, "headphones_rendering_mode" },
+        { "stereo",   NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO },   .unit = "headphones_rendering_mode" },
+        { "binaural", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_HEADPHONES_MODE_BINAURAL }, .unit = "headphones_rendering_mode" },
+    { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "annotations", "Annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, { .str = NULL }, 0, 0, FLAGS },
+    { NULL },
+};
+
+static void *submix_element_child_next(void *obj, void *prev)
+{
+    AVIAMFSubmixElement *submix_element = obj;
+    if (!prev)
+        return submix_element->element_mix_config;
+
+    return NULL;
+}
+
+static const AVClass *submix_element_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &param_definition_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass element_class = {
+    .class_name          = "AVIAMFSubmixElement",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = submix_element_options,
+    .child_next          = submix_element_child_next,
+    .child_class_iterate = submix_element_child_iterate,
+};
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixElement, element, s)
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmixLayout, x)
+static const AVOption submix_layout_options[] = {
+    { "layout_type", "Layout type", OFFSET(layout_type), AV_OPT_TYPE_INT,
+            { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS },
+            AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL, FLAGS, "layout_type" },
+        { "loudspeakers", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, .unit = "layout_type" },
+        { "binaural",     NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL },     .unit = "layout_type" },
+    { "sound_system", "Sound System", OFFSET(sound_system), AV_OPT_TYPE_CHLAYOUT, { .str = NULL }, 0, 0, FLAGS },
+    { "integrated_loudness", "Integrated loudness", OFFSET(integrated_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "digital_peak", "Digital peak", OFFSET(digital_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "true_peak", "True peak", OFFSET(true_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "dialog_anchored_loudness", "Anchored loudness (Dialog)", OFFSET(dialogue_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "album_anchored_loudness", "Anchored loudness (Album)", OFFSET(album_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { NULL },
+};
+
+static const AVClass layout_class = {
+    .class_name     = "AVIAMFSubmixLayout",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = submix_layout_options,
+};
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixLayout, layout, s)
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmix, x)
+static const AVOption submix_presentation_options[] = {
+    { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { NULL },
+};
+
+static void *submix_presentation_child_next(void *obj, void *prev)
+{
+    AVIAMFSubmix *sub_mix = obj;
+    if (!prev)
+        return sub_mix->output_mix_config;
+
+    return NULL;
+}
+
+static const AVClass *submix_presentation_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case 0:
+        ret = &element_class;
+        break;
+    case 1:
+        ret = &layout_class;
+        break;
+    case 2:
+        ret = &param_definition_class;
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass submix_class = {
+    .class_name          = "AVIAMFSubmix",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = submix_presentation_options,
+    .child_next          = submix_presentation_child_next,
+    .child_class_iterate = submix_presentation_child_iterate,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFMixPresentation, x)
+static const AVOption mix_presentation_options[] = {
+    { "annotations", "set annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, {.str = NULL }, 0, 0, FLAGS },
+    { NULL },
+};
+
+#undef OFFSET
+#undef FLAGS
+
+static const AVClass *mix_presentation_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &submix_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass mix_presentation_class = {
+    .class_name          = "AVIAMFMixPresentation",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = mix_presentation_options,
+    .child_class_iterate = mix_presentation_child_iterate,
+};
+
+const AVClass *av_iamf_mix_presentation_get_class(void)
+{
+    return &mix_presentation_class;
+}
+
+AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void)
+{
+    AVIAMFMixPresentation *mix_presentation = av_mallocz(sizeof(*mix_presentation));
+
+    if (mix_presentation) {
+        mix_presentation->av_class = &mix_presentation_class;
+        av_opt_set_defaults(mix_presentation);
+    }
+
+    return mix_presentation;
+}
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFMixPresentation, mix_presentation, AVIAMFSubmix, submix, es)
+
+void av_iamf_mix_presentation_free(AVIAMFMixPresentation **pmix_presentation)
+{
+    AVIAMFMixPresentation *mix_presentation = *pmix_presentation;
+
+    if (!mix_presentation)
+        return;
+
+    for (int i = 0; i < mix_presentation->nb_submixes; i++) {
+        AVIAMFSubmix *sub_mix = mix_presentation->submixes[i];
+        for (int j = 0; j < sub_mix->nb_elements; j++) {
+            AVIAMFSubmixElement *submix_element = sub_mix->elements[j];
+            av_opt_free(submix_element);
+            av_free(submix_element->element_mix_config);
+            av_free(submix_element);
+        }
+        av_free(sub_mix->elements);
+        for (int j = 0; j < sub_mix->nb_layouts; j++) {
+            AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[j];
+            av_opt_free(submix_layout);
+            av_free(submix_layout);
+        }
+        av_free(sub_mix->layouts);
+        av_free(sub_mix->output_mix_config);
+        av_free(sub_mix);
+    }
+    av_opt_free(mix_presentation);
+    av_free(mix_presentation->submixes);
+
+    av_freep(pmix_presentation);
+}
diff --git a/libavutil/iamf.h b/libavutil/iamf.h
new file mode 100644
index 0000000000..bc0363153d
--- /dev/null
+++ b/libavutil/iamf.h
@@ -0,0 +1,573 @@
+/*
+ * Immersive Audio Model and Formats helper functions and defines
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_IAMF_H
+#define AVUTIL_IAMF_H
+
+/**
+ * @file
+ * Immersive Audio Model and Formats API header
+ * @see <a href="https://aomediacodec.github.io/iamf/">Immersive Audio Model and Formats</a>
+ */
+
+#include <stdint.h>
+#include <stddef.h>
+
+#include "attributes.h"
+#include "avassert.h"
+#include "channel_layout.h"
+#include "dict.h"
+#include "rational.h"
+
+/**
+ * @defgroup lavf_iamf_params Parameter Definition
+ * @{
+ * Parameters as defined in section 3.6.1 and 3.8 of IAMF.
+ * @}
+ * @defgroup lavf_iamf_audio Audio Element
+ * @{
+ * Audio Elements as defined in section 3.6 of IAMF.
+ * @}
+ * @defgroup lavf_iamf_mix Mix Presentation
+ * @{
+ * Mix Presentations as defined in section 3.7 of IAMF.
+ * @}
+ *
+ * @}
+ * @addtogroup lavf_iamf_params
+ * @{
+ */
+enum AVIAMFAnimationType {
+    AV_IAMF_ANIMATION_TYPE_STEP,
+    AV_IAMF_ANIMATION_TYPE_LINEAR,
+    AV_IAMF_ANIMATION_TYPE_BEZIER,
+};
+
+/**
+ * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF.
+ *
+ * Subblocks in AVIAMFParamDefinition use this struct when the value or
+ * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
+ * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+ */
+typedef struct AVIAMFMixGain {
+    const AVClass *av_class;
+
+    unsigned int subblock_duration;
+    /**
+     * The type of animation applied to the parameter values.
+     */
+    enum AVIAMFAnimationType animation_type;
+    /**
+     * Parameter value that is applied at the start of the subblock.
+     * Applies to all defined Animation Types.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational start_point_value;
+    /**
+     * Parameter value that is applied at the end of the subblock.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_LINEAR and
+     * AV_IAMF_ANIMATION_TYPE_BEZIER Animation Types.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational end_point_value;
+    /**
+     * Parameter value of the middle control point of a quadratic Bezier
+     * curve, i.e., its y-axis value.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational control_point_value;
+    /**
+     * Parameter value of the time of the middle control point of a
+     * quadratic Bezier curve, i.e., its x-axis value.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type.
+     *
+     * Valid range of values is 0.0 to 1.0
+     */
+    AVRational control_point_relative_time;
+} AVIAMFMixGain;
+
+/**
+ * Demixing Info Parameter Data as defined in section 3.8.2 of IAMF..
+ *
+ * Subblocks in AVIAMFParamDefinition use this struct when the value or
+ * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
+ * AV_IAMF_PARAMETER_DEFINITION_DEMIXING.
+ */
+typedef struct AVIAMFDemixingInfo {
+    const AVClass *av_class;
+
+    unsigned int subblock_duration;
+    /**
+     * Pre-defined combination of demixing parameters.
+     */
+    unsigned int dmixp_mode;
+} AVIAMFDemixingInfo;
+
+/**
+ * Recon Gain Info Parameter Data as defined in section 3.8.3 of IAMF.
+ *
+ * Subblocks in AVIAMFParamDefinition use this struct when the value or
+ * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
+ * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN.
+ */
+typedef struct AVIAMFReconGain {
+    const AVClass *av_class;
+
+    unsigned int subblock_duration;
+
+    /**
+     * Array of gain values to be applied to each channel for each layer
+     * defined in the Audio Element referencing the parent Parameter Definition.
+     * Values for layers where the AV_IAMF_LAYER_FLAG_RECON_GAIN flag is not set
+     * are undefined.
+     *
+     * Channel order is: FL, C, FR, SL, SR, TFL, TFR, BL, BR, TBL, TBR, LFE
+     */
+    uint8_t recon_gain[6][12];
+} AVIAMFReconGain;
+
+enum AVIAMFParamDefinitionType {
+    AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN,
+    AV_IAMF_PARAMETER_DEFINITION_DEMIXING,
+    AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN,
+};
+
+/**
+ * Parameters as defined in section 3.6.1 of IAMF.
+ */
+typedef struct AVIAMFParamDefinition {
+    const AVClass *av_class;
+
+    size_t subblocks_offset;
+    size_t subblock_size;
+
+    enum AVIAMFParamDefinitionType param_definition_type;
+    unsigned int nb_subblocks;
+
+    unsigned int parameter_id;
+    unsigned int parameter_rate;
+    unsigned int param_definition_mode;
+    unsigned int duration;
+    unsigned int constant_subblock_duration;
+} AVIAMFParamDefinition;
+
+const AVClass *av_iamf_param_definition_get_class(void);
+
+/**
+ * Allocates memory for AVIAMFParamDefinition, plus an array of {@code nb_subblocks}
+ * amount of subblocks of the given type and initializes the variables. Can be
+ * freed with a normal av_free() call.
+ *
+ * @param size if non-NULL, the size in bytes of the resulting data array is written here.
+ */
+AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType param_definition_type,
+                                                      unsigned int nb_subblocks, size_t *size);
+
+/**
+ * Get the subblock at the specified {@code idx}. Must be between 0 and nb_subblocks - 1.
+ *
+ * The @ref AVIAMFParamDefinition.param_definition_type "param definition type" defines
+ * the struct type of the returned pointer.
+ */
+static av_always_inline void*
+av_iamf_param_definition_get_subblock(const AVIAMFParamDefinition *par, unsigned int idx)
+{
+    av_assert0(idx < par->nb_subblocks);
+    return (void *)((uint8_t *)par + par->subblocks_offset + idx * par->subblock_size);
+}
+
+/**
+ * @}
+ * @addtogroup lavf_iamf_audio
+ * @{
+ */
+
+enum AVIAMFAmbisonicsMode {
+    AV_IAMF_AMBISONICS_MODE_MONO,
+    AV_IAMF_AMBISONICS_MODE_PROJECTION,
+};
+
+/**
+ * Recon gain information for the layer is present in AVIAMFReconGain
+ */
+#define AV_IAMF_LAYER_FLAG_RECON_GAIN (1 << 0)
+
+/**
+ * A layer defining a Channel Layout in the Audio Element.
+ *
+ * When @ref AVIAMFAudioElement.audio_element_type "the parent's Audio Element type"
+ * is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this corresponds to an Scalable Channel
+ * Layout layer as defined in section 3.6.2 of IAMF.
+ * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel
+ * layout as defined in section 3.6.3 of IAMF.
+ */
+typedef struct AVIAMFLayer {
+    const AVClass *av_class;
+
+    AVChannelLayout ch_layout;
+
+    /**
+     * A bitmask which may contain a combination of AV_IAMF_LAYER_FLAG_* flags.
+     */
+    unsigned int flags;
+    /**
+     * Output gain channel flags as defined in section 3.6.2 of IAMF.
+     *
+     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
+     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
+     * must be 0 otherwise.
+     */
+    unsigned int output_gain_flags;
+    /**
+     * Output gain as defined in section 3.6.2 of IAMF.
+     *
+     * Must be 0 if @ref output_gain_flags is 0.
+     */
+    AVRational output_gain;
+    /**
+     * Ambisonics mode as defined in section 3.6.3 of IAMF.
+     *
+     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
+     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE.
+     *
+     * If AV_IAMF_AMBISONICS_MODE_MONO, channel_mapping is defined implicitly
+     * (Ambisonic Order) or explicitly (Custom Order with ambi channels) in
+     * @ref ch_layout.
+     * If AV_IAMF_AMBISONICS_MODE_PROJECTION, @ref demixing_matrix must be set.
+     */
+    enum AVIAMFAmbisonicsMode ambisonics_mode;
+
+    /**
+     * Demixing matrix as defined in section 3.6.3 of IAMF.
+     *
+     * May be set only if @ref ambisonics_mode == AV_IAMF_AMBISONICS_MODE_PROJECTION,
+     * must be NULL otherwise.
+     */
+    AVRational *demixing_matrix;
+} AVIAMFLayer;
+
+
+enum AVIAMFAudioElementType {
+    AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
+    AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE,
+};
+
+typedef struct AVIAMFAudioElement {
+    const AVClass *av_class;
+
+    AVIAMFLayer **layers;
+    /**
+     * Number of layers, or channel groups, in the Audio Element.
+     * For @ref audio_element_type AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there
+     * may be exactly 1.
+     *
+     * Set by av_iamf_audio_element_add_layer(), must not be
+     * modified by any other code.
+     */
+    unsigned int nb_layers;
+
+    /**
+     * Demixing information used to reconstruct a scalable channel audio
+     * representation.
+     * The @ref AVIAMFParamDefinition.param_definition_type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_DEMIXING.
+     */
+    AVIAMFParamDefinition *demixing_info;
+    /**
+     * Recon gain information used to reconstruct a scalable channel audio
+     * representation.
+     * The @ref AVIAMFParamDefinition.param_definition_type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN.
+     */
+    AVIAMFParamDefinition *recon_gain_info;
+
+    /**
+     * Audio element type as defined in section 3.6 of IAMF.
+     */
+    enum AVIAMFAudioElementType audio_element_type;
+
+    /**
+     * Default weight value as defined in section 3.6 of IAMF.
+     */
+    unsigned int default_w;
+} AVIAMFAudioElement;
+
+const AVClass *av_iamf_audio_element_get_class(void);
+
+/**
+ * Allocates a AVIAMFAudioElement, and initializes its fields with default values.
+ * No layers are allocated. Must be freed with av_iamf_audio_element_free().
+ *
+ * @see av_iamf_audio_element_add_layer()
+ */
+AVIAMFAudioElement *av_iamf_audio_element_alloc(void);
+
+/**
+ * Allocate a layer and add it to a given AVIAMFAudioElement.
+ * It is freed by av_iamf_audio_element_free() alongside the rest of the parent
+ * AVIAMFAudioElement.
+ *
+ * @return a pointer to the allocated layer.
+ */
+AVIAMFLayer *av_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element);
+
+void av_iamf_audio_element_free(AVIAMFAudioElement **audio_element);
+
+/**
+ * @}
+ * @addtogroup lavf_iamf_mix
+ * @{
+ */
+
+enum AVIAMFHeadphonesMode {
+    /**
+     * The referenced Audio Element shall be rendered to stereo loudspeakers.
+     */
+    AV_IAMF_HEADPHONES_MODE_STEREO,
+    /**
+     * The referenced Audio Element shall be rendered with a binaural renderer.
+     */
+    AV_IAMF_HEADPHONES_MODE_BINAURAL,
+};
+
+typedef struct AVIAMFSubmixElement {
+    const AVClass *av_class;
+
+    /**
+     * The id of the Audio Element this submix element references.
+     */
+    unsigned int audio_element_id;
+
+    /**
+     * Information required required for applying any processing to the
+     * referenced and rendered Audio Element before being summed with other
+     * processed Audio Elements.
+     * The @ref AVIAMFParamDefinition.param_definition_type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+     */
+    AVIAMFParamDefinition *element_mix_config;
+
+    /**
+     * Default mix gain value to apply when there are no AVIAMFParamDefinition
+     * with @ref element_mix_config "element_mix_config's"
+     * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a
+     * given audio frame.
+     */
+    AVRational default_mix_gain;
+
+    /**
+     * A value that indicates whether the referenced channel-based Audio Element
+     * shall be rendered to stereo loudspeakers or spatialized with a binaural
+     * renderer when played back on headphones.
+     * If the Audio Element is not of @ref AVIAMFAudioElement.audio_element_type
+     * "type" AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, then this field is undefined.
+     */
+    enum AVIAMFHeadphonesMode headphones_rendering_mode;
+
+    /**
+     * A dictionary of strings describing the submix in different languages.
+     * Must have the same amount of entries as
+     * @ref AVIAMFMixPresentation.annotations "the mix's annotations", stored
+     * in the same order, and with the same key strings.
+     *
+     * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47 that
+     * specifies the language for the string stored in
+     * @ref AVDictionaryEntry.value "value".
+     */
+    AVDictionary *annotations;
+} AVIAMFSubmixElement;
+
+enum AVIAMFSubmixLayoutType {
+    /**
+     * The layout follows the loudspeaker sound system convention of ITU-2051-3.
+     */
+    AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS = 2,
+    /**
+     * The layout is binaural.
+     */
+    AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL = 3,
+};
+
+typedef struct AVIAMFSubmixLayout {
+    const AVClass *av_class;
+
+    enum AVIAMFSubmixLayoutType layout_type;
+
+    /**
+     * Channel layout matching one of Sound Systems A to J of ITU-2051-3, plus
+     * 7.1.2ch and 3.1.2ch
+     * If layout_type is not AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, this field
+     * is undefined.
+     */
+    AVChannelLayout sound_system;
+    /**
+     * The program integrated loudness information, as defined in
+     * ITU-1770-4.
+     */
+    AVRational integrated_loudness;
+    /**
+     * The digital (sampled) peak value of the audio signal, as defined
+     * in ITU-1770-4.
+     */
+    AVRational digital_peak;
+    /**
+     * The true peak of the audio signal, as defined in ITU-1770-4.
+     */
+    AVRational true_peak;
+    /**
+     * The Dialogue loudness information, as defined in ITU-1770-4.
+     */
+    AVRational dialogue_anchored_loudness;
+    /**
+     * The Album loudness information, as defined in ITU-1770-4.
+     */
+    AVRational album_anchored_loudness;
+} AVIAMFSubmixLayout;
+
+typedef struct AVIAMFSubmix {
+    const AVClass *av_class;
+
+    /**
+     * Array of submix elements.
+     *
+     * Set by av_iamf_submix_add_element(), must not be modified by any
+     * other code.
+     */
+    AVIAMFSubmixElement **elements;
+    /**
+     * Number of elements in the submix.
+     *
+     * Set by av_iamf_submix_add_element(), must not be modified by any
+     * other code.
+     */
+    unsigned int nb_elements;
+
+    /**
+     * Array of submix layouts.
+     *
+     * Set by av_iamf_submix_add_layout(), must not be modified by any
+     * other code.
+     */
+    AVIAMFSubmixLayout **layouts;
+    /**
+     * Number of layouts in the submix.
+     *
+     * Set by av_iamf_submix_add_layout(), must not be modified by any
+     * other code.
+     */
+    unsigned int nb_layouts;
+
+    /**
+     * Information required for post-processing the mixed audio signal to
+     * generate the audio signal for playback.
+     * The @ref AVIAMFParamDefinition.param_definition_type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+     */
+    AVIAMFParamDefinition *output_mix_config;
+
+    /**
+     * Default mix gain value to apply when there are no AVIAMFParamDefinition
+     * with @ref output_mix_config "output_mix_config's"
+     * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a
+     * given audio frame.
+     */
+    AVRational default_mix_gain;
+} AVIAMFSubmix;
+
+typedef struct AVIAMFMixPresentation {
+    const AVClass *av_class;
+
+    /**
+     * Array of submixes.
+     *
+     * Set by av_iamf_mix_presentation_add_submix(), must not be modified
+     * by any other code.
+     */
+    AVIAMFSubmix **submixes;
+    /**
+     * Number of submixes in the presentation.
+     *
+     * Set by av_iamf_mix_presentation_add_submix(), must not be modified
+     * by any other code.
+     */
+    unsigned int nb_submixes;
+
+    /**
+     * A dictionary of strings describing the mix in different languages.
+     * Must have the same amount of entries as every
+     * @ref AVIAMFSubmixElement.annotations "Submix element annotations",
+     * stored in the same order, and with the same key strings.
+     *
+     * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47
+     * that specifies the language for the string stored in
+     * @ref AVDictionaryEntry.value "value".
+     */
+    AVDictionary *annotations;
+} AVIAMFMixPresentation;
+
+const AVClass *av_iamf_mix_presentation_get_class(void);
+
+/**
+ * Allocates a AVIAMFMixPresentation, and initializes its fields with default
+ * values. No submixes are allocated.
+ * Must be freed with av_iamf_mix_presentation_free().
+ *
+ * @see av_iamf_mix_presentation_add_submix()
+ */
+AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void);
+
+/**
+ * Allocate a submix and add it to a given AVIAMFMixPresentation.
+ * It is freed by av_iamf_mix_presentation_free() alongside the rest of the
+ * parent AVIAMFMixPresentation.
+ *
+ * @return a pointer to the allocated submix.
+ */
+AVIAMFSubmix *av_iamf_mix_presentation_add_submix(AVIAMFMixPresentation *mix_presentation);
+
+/**
+ * Allocate a submix element and add it to a given AVIAMFSubmix.
+ * It is freed by av_iamf_mix_presentation_free() alongside the rest of the
+ * parent AVIAMFSubmix.
+ *
+ * @return a pointer to the allocated submix.
+ */
+AVIAMFSubmixElement *av_iamf_submix_add_element(AVIAMFSubmix *submix);
+
+/**
+ * Allocate a submix layout and add it to a given AVIAMFSubmix.
+ * It is freed by av_iamf_mix_presentation_free() alongside the rest of the
+ * parent AVIAMFSubmix.
+ *
+ * @return a pointer to the allocated submix.
+ */
+AVIAMFSubmixLayout *av_iamf_submix_add_layout(AVIAMFSubmix *submix);
+
+void av_iamf_mix_presentation_free(AVIAMFMixPresentation **mix_presentation);
+/**
+ * @}
+ */
+
+#endif /* AVUTIL_IAMF_H */
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
@ 2023-12-05 22:43 ` James Almer
  2023-12-11 10:59   ` Anton Khirnov
  2023-12-11 11:03   ` Anton Khirnov
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups James Almer
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:43 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 doc/fftools-common-opts.texi |  17 +++-
 libavformat/avformat.c       | 185 ++++++++++++++++++++++++++++++++++-
 libavformat/avformat.h       | 169 ++++++++++++++++++++++++++++++++
 libavformat/dump.c           | 147 +++++++++++++++++++++++-----
 libavformat/internal.h       |  33 +++++++
 libavformat/options.c        | 139 ++++++++++++++++++++++++++
 6 files changed, 656 insertions(+), 34 deletions(-)

diff --git a/doc/fftools-common-opts.texi b/doc/fftools-common-opts.texi
index d9145704d6..f459bfdc1d 100644
--- a/doc/fftools-common-opts.texi
+++ b/doc/fftools-common-opts.texi
@@ -37,9 +37,9 @@ Matches the stream with this index. E.g. @code{-threads:1 4} would set the
 thread count for the second stream to 4. If @var{stream_index} is used as an
 additional stream specifier (see below), then it selects stream number
 @var{stream_index} from the matching streams. Stream numbering is based on the
-order of the streams as detected by libavformat except when a program ID is
-also specified. In this case it is based on the ordering of the streams in the
-program.
+order of the streams as detected by libavformat except when a stream group
+specifier or program ID is also specified. In this case it is based on the
+ordering of the streams in the group or program.
 @item @var{stream_type}[:@var{additional_stream_specifier}]
 @var{stream_type} is one of following: 'v' or 'V' for video, 'a' for audio, 's'
 for subtitle, 'd' for data, and 't' for attachments. 'v' matches all video
@@ -48,6 +48,17 @@ thumbnails or cover arts. If @var{additional_stream_specifier} is used, then
 it matches streams which both have this type and match the
 @var{additional_stream_specifier}. Otherwise, it matches all streams of the
 specified type.
+@item g:@var{group_specifier}[:@var{additional_stream_specifier}]
+Matches streams which are in the group with the specifier @var{group_specifier}.
+if @var{additional_stream_specifier} is used, then it matches streams which both
+are part of the group and match the @var{additional_stream_specifier}.
+@var{group_specifier} may be one of the following:
+@table @option
+@item @var{group_index}
+Match the stream with this group index.
+@item #@var{group_id} or i:@var{group_id}
+Match the stream with this group id.
+@end table
 @item p:@var{program_id}[:@var{additional_stream_specifier}]
 Matches streams which are in the program with the id @var{program_id}. If
 @var{additional_stream_specifier} is used, then it matches streams which both
diff --git a/libavformat/avformat.c b/libavformat/avformat.c
index 5b8bb7879e..a02ec965dd 100644
--- a/libavformat/avformat.c
+++ b/libavformat/avformat.c
@@ -24,6 +24,7 @@
 #include "libavutil/avstring.h"
 #include "libavutil/channel_layout.h"
 #include "libavutil/frame.h"
+#include "libavutil/iamf.h"
 #include "libavutil/intreadwrite.h"
 #include "libavutil/mem.h"
 #include "libavutil/opt.h"
@@ -80,6 +81,32 @@ FF_ENABLE_DEPRECATION_WARNINGS
     av_freep(pst);
 }
 
+void ff_free_stream_group(AVStreamGroup **pstg)
+{
+    AVStreamGroup *stg = *pstg;
+
+    if (!stg)
+        return;
+
+    av_freep(&stg->streams);
+    av_dict_free(&stg->metadata);
+    av_freep(&stg->priv_data);
+    switch (stg->type) {
+    case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: {
+        av_iamf_audio_element_free(&stg->params.iamf_audio_element);
+        break;
+    }
+    case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: {
+        av_iamf_mix_presentation_free(&stg->params.iamf_mix_presentation);
+        break;
+    }
+    default:
+        break;
+    }
+
+    av_freep(pstg);
+}
+
 void ff_remove_stream(AVFormatContext *s, AVStream *st)
 {
     av_assert0(s->nb_streams>0);
@@ -88,6 +115,14 @@ void ff_remove_stream(AVFormatContext *s, AVStream *st)
     ff_free_stream(&s->streams[ --s->nb_streams ]);
 }
 
+void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg)
+{
+    av_assert0(s->nb_stream_groups > 0);
+    av_assert0(s->stream_groups[ s->nb_stream_groups - 1 ] == stg);
+
+    ff_free_stream_group(&s->stream_groups[ --s->nb_stream_groups ]);
+}
+
 /* XXX: suppress the packet queue */
 void ff_flush_packet_queue(AVFormatContext *s)
 {
@@ -118,6 +153,9 @@ void avformat_free_context(AVFormatContext *s)
 
     for (unsigned i = 0; i < s->nb_streams; i++)
         ff_free_stream(&s->streams[i]);
+    for (unsigned i = 0; i < s->nb_stream_groups; i++)
+        ff_free_stream_group(&s->stream_groups[i]);
+    s->nb_stream_groups = 0;
     s->nb_streams = 0;
 
     for (unsigned i = 0; i < s->nb_programs; i++) {
@@ -139,6 +177,7 @@ void avformat_free_context(AVFormatContext *s)
     av_packet_free(&si->pkt);
     av_packet_free(&si->parse_pkt);
     av_freep(&s->streams);
+    av_freep(&s->stream_groups);
     ff_flush_packet_queue(s);
     av_freep(&s->url);
     av_free(s);
@@ -464,7 +503,7 @@ int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type,
  */
 static int match_stream_specifier(const AVFormatContext *s, const AVStream *st,
                                   const char *spec, const char **indexptr,
-                                  const AVProgram **p)
+                                  const AVStreamGroup **g, const AVProgram **p)
 {
     int match = 1;                      /* Stores if the specifier matches so far. */
     while (*spec) {
@@ -493,6 +532,46 @@ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st,
                 match = 0;
             if (nopic && (st->disposition & AV_DISPOSITION_ATTACHED_PIC))
                 match = 0;
+        } else if (*spec == 'g' && *(spec + 1) == ':') {
+            int64_t group_idx = -1, group_id = -1;
+            int found = 0;
+            char *endptr;
+            spec += 2;
+            if (*spec == '#' || (*spec == 'i' && *(spec + 1) == ':')) {
+                spec += 1 + (*spec == 'i');
+                group_id = strtol(spec, &endptr, 0);
+                if (spec == endptr || (*endptr && *endptr++ != ':'))
+                    return AVERROR(EINVAL);
+                spec = endptr;
+            } else {
+                group_idx = strtol(spec, &endptr, 0);
+                /* Disallow empty id and make sure that if we are not at the end, then another specifier must follow. */
+                if (spec == endptr || (*endptr && *endptr++ != ':'))
+                    return AVERROR(EINVAL);
+                spec = endptr;
+            }
+            if (match) {
+                if (group_id > 0) {
+                    for (unsigned i = 0; i < s->nb_stream_groups; i++) {
+                        if (group_id == s->stream_groups[i]->id) {
+                            group_idx = i;
+                            break;
+                        }
+                    }
+                }
+                if (group_idx < 0 || group_idx > s->nb_stream_groups)
+                    return AVERROR(EINVAL);
+                for (unsigned j = 0; j < s->stream_groups[group_idx]->nb_streams; j++) {
+                    if (st->index == s->stream_groups[group_idx]->streams[j]->index) {
+                        found = 1;
+                        if (g)
+                            *g = s->stream_groups[group_idx];
+                        break;
+                    }
+                }
+            }
+            if (!found)
+                match = 0;
         } else if (*spec == 'p' && *(spec + 1) == ':') {
             int prog_id;
             int found = 0;
@@ -591,10 +670,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st,
     int ret, index;
     char *endptr;
     const char *indexptr = NULL;
+    const AVStreamGroup *g = NULL;
     const AVProgram *p = NULL;
     int nb_streams;
 
-    ret = match_stream_specifier(s, st, spec, &indexptr, &p);
+    ret = match_stream_specifier(s, st, spec, &indexptr, &g, &p);
     if (ret < 0)
         goto error;
 
@@ -612,10 +692,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st,
         return (index == st->index);
 
     /* If we requested a matching stream index, we have to ensure st is that. */
-    nb_streams = p ? p->nb_stream_indexes : s->nb_streams;
+    nb_streams = g ? g->nb_streams : (p ? p->nb_stream_indexes : s->nb_streams);
     for (int i = 0; i < nb_streams && index >= 0; i++) {
-        const AVStream *candidate = s->streams[p ? p->stream_index[i] : i];
-        ret = match_stream_specifier(s, candidate, spec, NULL, NULL);
+        unsigned idx = g ? g->streams[i]->index : (p ? p->stream_index[i] : i);
+        const AVStream *candidate = s->streams[idx];
+        ret = match_stream_specifier(s, candidate, spec, NULL, NULL, NULL);
         if (ret < 0)
             goto error;
         if (ret > 0 && index-- == 0 && st == candidate)
@@ -629,6 +710,100 @@ error:
     return ret;
 }
 
+/**
+ * Matches a stream specifier (but ignores requested index).
+ *
+ * @param indexptr set to point to the requested stream index if there is one
+ *
+ * @return <0 on error
+ *         0  if st is NOT a matching stream
+ *         >0 if st is a matching stream
+ */
+static int match_stream_group_specifier(const AVFormatContext *s, const AVStreamGroup *stg,
+                                        const char *spec, const char **indexptr)
+{
+    int match = 1;                      /* Stores if the specifier matches so far. */
+    while (*spec) {
+        if (*spec <= '9' && *spec >= '0') { /* opt:index */
+            if (indexptr)
+                *indexptr = spec;
+            return match;
+        } else if (*spec == 't' && *(spec + 1) == ':') {
+            int64_t group_type = -1;
+            int found = 0;
+            char *endptr;
+            spec += 2;
+            group_type = strtol(spec, &endptr, 0);
+            /* Disallow empty type and make sure that if we are not at the end, then another specifier must follow. */
+            if (spec == endptr || (*endptr && *endptr++ != ':'))
+                return AVERROR(EINVAL);
+            spec = endptr;
+            if (match && group_type > 0) {
+                for (unsigned i = 0; i < s->nb_stream_groups; i++) {
+                    if (group_type == s->stream_groups[i]->type) {
+                        found = 1;
+                        break;
+                    }
+                }
+            }
+            if (!found)
+                match = 0;
+        } else if (*spec == '#' ||
+                   (*spec == 'i' && *(spec + 1) == ':')) {
+            int group_id;
+            char *endptr;
+            spec += 1 + (*spec == 'i');
+            group_id = strtol(spec, &endptr, 0);
+            if (spec == endptr || *endptr)                /* Disallow empty id and make sure we are at the end. */
+                return AVERROR(EINVAL);
+            return match && (group_id == stg->id);
+        }
+    }
+
+    return match;
+}
+
+int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg,
+                                          const char *spec)
+{
+    int ret, index;
+    char *endptr;
+    const char *indexptr = NULL;
+
+    ret = match_stream_group_specifier(s, stg, spec, &indexptr);
+    if (ret < 0)
+        goto error;
+
+    if (!indexptr)
+        return ret;
+
+    index = strtol(indexptr, &endptr, 0);
+    if (*endptr) {                  /* We can't have anything after the requested index. */
+        ret = AVERROR(EINVAL);
+        goto error;
+    }
+
+    /* This is not really needed but saves us a loop for simple stream index specifiers. */
+    if (spec == indexptr)
+        return (index == stg->index);
+
+    /* If we requested a matching stream index, we have to ensure stg is that. */
+    for (int i = 0; i < s->nb_stream_groups && index >= 0; i++) {
+        const AVStreamGroup *candidate = s->stream_groups[i];
+        ret = match_stream_group_specifier(s, candidate, spec, NULL);
+        if (ret < 0)
+            goto error;
+        if (ret > 0 && index-- == 0 && stg == candidate)
+            return 1;
+    }
+    return 0;
+
+error:
+    if (ret == AVERROR(EINVAL))
+        av_log(s, AV_LOG_ERROR, "Invalid stream group specifier: %s.\n", spec);
+    return ret;
+}
+
 AVRational av_guess_sample_aspect_ratio(AVFormatContext *format, AVStream *stream, AVFrame *frame)
 {
     AVRational undef = {0, 1};
diff --git a/libavformat/avformat.h b/libavformat/avformat.h
index 9e7eca007e..9e428ee843 100644
--- a/libavformat/avformat.h
+++ b/libavformat/avformat.h
@@ -1018,6 +1018,83 @@ typedef struct AVStream {
     int pts_wrap_bits;
 } AVStream;
 
+enum AVStreamGroupParamsType {
+    AV_STREAM_GROUP_PARAMS_NONE,
+    AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT,
+    AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION,
+};
+
+struct AVIAMFAudioElement;
+struct AVIAMFMixPresentation;
+
+typedef struct AVStreamGroup {
+    /**
+     * A class for @ref avoptions. Set by avformat_stream_group_create().
+     */
+    const AVClass *av_class;
+
+    void *priv_data;
+
+    /**
+     * Group index in AVFormatContext.
+     */
+    unsigned int index;
+
+    /**
+     * Group type-specific group ID.
+     *
+     * decoding: set by libavformat
+     * encoding: may set by the user
+     */
+    int64_t id;
+
+    /**
+     * Group type
+     *
+     * decoding: set by libavformat on group creation
+     * encoding: set by avformat_stream_group_create()
+     */
+    enum AVStreamGroupParamsType type;
+
+    /**
+     * Group type-specific parameters
+     */
+    union {
+        struct AVIAMFAudioElement *iamf_audio_element;
+        struct AVIAMFMixPresentation *iamf_mix_presentation;
+    } params;
+
+    /**
+     * Metadata that applies to the whole group.
+     *
+     * - demuxing: set by libavformat on group creation
+     * - muxing: may be set by the caller before avformat_write_header()
+     *
+     * Freed by libavformat in avformat_free_context().
+     */
+    AVDictionary *metadata;
+
+    /**
+     * Number of elements in AVStreamGroup.streams.
+     *
+     * Set by avformat_stream_group_add_stream() must not be modified by any other code.
+     */
+    unsigned int nb_streams;
+
+    /**
+     * A list of streams in the group. New entries are created with
+     * avformat_stream_group_add_stream().
+     *
+     * - demuxing: entries are created by libavformat on group creation.
+     *             If AVFMTCTX_NOHEADER is set in ctx_flags, then new entries may also
+     *             appear in av_read_frame().
+     * - muxing: entries are created by the user before avformat_write_header().
+     *
+     * Freed by libavformat in avformat_free_context().
+     */
+    AVStream **streams;
+} AVStreamGroup;
+
 struct AVCodecParserContext *av_stream_get_parser(const AVStream *s);
 
 #if FF_API_GET_END_PTS
@@ -1726,6 +1803,26 @@ typedef struct AVFormatContext {
      * @return 0 on success, a negative AVERROR code on failure
      */
     int (*io_close2)(struct AVFormatContext *s, AVIOContext *pb);
+
+    /**
+     * Number of elements in AVFormatContext.stream_groups.
+     *
+     * Set by avformat_stream_group_create(), must not be modified by any other code.
+     */
+    unsigned int nb_stream_groups;
+
+    /**
+     * A list of all stream groups in the file. New groups are created with
+     * avformat_stream_group_create(), and filled with avformat_stream_group_add_stream().
+     *
+     * - demuxing: groups may be created by libavformat in avformat_open_input().
+     *             If AVFMTCTX_NOHEADER is set in ctx_flags, then new groups may also
+     *             appear in av_read_frame().
+     * - muxing: groups may be created by the user before avformat_write_header().
+     *
+     * Freed by libavformat in avformat_free_context().
+     */
+    AVStreamGroup **stream_groups;
 } AVFormatContext;
 
 /**
@@ -1844,6 +1941,37 @@ const AVClass *avformat_get_class(void);
  */
 const AVClass *av_stream_get_class(void);
 
+/**
+ * Get the AVClass for AVStreamGroup. It can be used in combination with
+ * AV_OPT_SEARCH_FAKE_OBJ for examining options.
+ *
+ * @see av_opt_find().
+ */
+const AVClass *av_stream_group_get_class(void);
+
+/**
+ * Add a new empty stream group to a media file.
+ *
+ * When demuxing, it may be called by the demuxer in read_header(). If the
+ * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also
+ * be called in read_packet().
+ *
+ * When muxing, may be called by the user before avformat_write_header().
+ *
+ * User is required to call avformat_free_context() to clean up the allocation
+ * by avformat_stream_group_create().
+ *
+ * New streams can be added to the group with avformat_stream_group_add_stream().
+ *
+ * @param s media file handle
+ *
+ * @return newly created group or NULL on error.
+ * @see avformat_new_stream, avformat_stream_group_add_stream.
+ */
+AVStreamGroup *avformat_stream_group_create(AVFormatContext *s,
+                                            enum AVStreamGroupParamsType type,
+                                            AVDictionary **options);
+
 /**
  * Add a new stream to a media file.
  *
@@ -1863,6 +1991,31 @@ const AVClass *av_stream_get_class(void);
  */
 AVStream *avformat_new_stream(AVFormatContext *s, const struct AVCodec *c);
 
+/**
+ * Add an already allocated stream to a stream group.
+ *
+ * When demuxing, it may be called by the demuxer in read_header(). If the
+ * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also
+ * be called in read_packet().
+ *
+ * When muxing, may be called by the user before avformat_write_header() after
+ * having allocated a new group with avformat_stream_group_create() and stream with
+ * avformat_new_stream().
+ *
+ * User is required to call avformat_free_context() to clean up the allocation
+ * by avformat_stream_group_add_stream().
+ *
+ * @param stg stream group belonging to a media file.
+ * @param st  stream in the media file to add to the group.
+ *
+ * @retval 0                 success
+ * @retval AVERROR(EEXIST)   the stream was already in the group
+ * @retval "another negative error code" legitimate errors
+ *
+ * @see avformat_new_stream, avformat_stream_group_create.
+ */
+int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st);
+
 #if FF_API_AVSTREAM_SIDE_DATA
 /**
  * Wrap an existing array as stream side data.
@@ -2819,6 +2972,22 @@ AVRational av_guess_frame_rate(AVFormatContext *ctx, AVStream *stream,
 int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st,
                                     const char *spec);
 
+/**
+ * Check if the group stg contained in s is matched by the stream group
+ * specifier spec.
+ *
+ * See the "stream group specifiers" chapter in the documentation for the
+ * syntax of spec.
+ *
+ * @return  >0 if stg is matched by spec;
+ *          0  if stg is not matched by spec;
+ *          AVERROR code if spec is invalid
+ *
+ * @note  A stream group specifier can match several groups in the format.
+ */
+int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg,
+                                          const char *spec);
+
 int avformat_queue_attached_pictures(AVFormatContext *s);
 
 enum AVTimebaseSource {
diff --git a/libavformat/dump.c b/libavformat/dump.c
index c0868a1bb3..cc179f284f 100644
--- a/libavformat/dump.c
+++ b/libavformat/dump.c
@@ -24,6 +24,7 @@
 
 #include "libavutil/channel_layout.h"
 #include "libavutil/display.h"
+#include "libavutil/iamf.h"
 #include "libavutil/intreadwrite.h"
 #include "libavutil/log.h"
 #include "libavutil/mastering_display_metadata.h"
@@ -134,28 +135,36 @@ static void print_fps(double d, const char *postfix)
         av_log(NULL, AV_LOG_INFO, "%1.0fk %s", d / 1000, postfix);
 }
 
-static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent)
+static void dump_dictionary(void *ctx, const AVDictionary *m,
+                            const char *name, const char *indent)
 {
-    if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) {
-        const AVDictionaryEntry *tag = NULL;
-
-        av_log(ctx, AV_LOG_INFO, "%sMetadata:\n", indent);
-        while ((tag = av_dict_iterate(m, tag)))
-            if (strcmp("language", tag->key)) {
-                const char *p = tag->value;
-                av_log(ctx, AV_LOG_INFO,
-                       "%s  %-16s: ", indent, tag->key);
-                while (*p) {
-                    size_t len = strcspn(p, "\x8\xa\xb\xc\xd");
-                    av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p);
-                    p += len;
-                    if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " ");
-                    if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s  %-16s: ", indent, "");
-                    if (*p) p++;
-                }
-                av_log(ctx, AV_LOG_INFO, "\n");
+    const AVDictionaryEntry *tag = NULL;
+
+    if (!m)
+        return;
+
+    av_log(ctx, AV_LOG_INFO, "%s%s:\n", indent, name);
+    while ((tag = av_dict_iterate(m, tag)))
+        if (strcmp("language", tag->key)) {
+            const char *p = tag->value;
+            av_log(ctx, AV_LOG_INFO,
+                   "%s  %-16s: ", indent, tag->key);
+            while (*p) {
+                size_t len = strcspn(p, "\x8\xa\xb\xc\xd");
+                av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p);
+                p += len;
+                if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " ");
+                if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s  %-16s: ", indent, "");
+                if (*p) p++;
             }
-    }
+            av_log(ctx, AV_LOG_INFO, "\n");
+        }
+}
+
+static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent)
+{
+    if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0)))
+        dump_dictionary(ctx, m, "Metadata", indent);
 }
 
 /* param change side data*/
@@ -509,7 +518,7 @@ static void dump_sidedata(void *ctx, const AVStream *st, const char *indent)
 
 /* "user interface" functions */
 static void dump_stream_format(const AVFormatContext *ic, int i,
-                               int index, int is_output)
+                               int group_index, int index, int is_output)
 {
     char buf[256];
     int flags = (is_output ? ic->oformat->flags : ic->iformat->flags);
@@ -517,6 +526,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i,
     const FFStream *const sti = cffstream(st);
     const AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL, 0);
     const char *separator = ic->dump_separator;
+    const char *group_indent = group_index >= 0 ? "    " : "";
+    const char *extra_indent = group_index >= 0 ? "        " : "      ";
     AVCodecContext *avctx;
     int ret;
 
@@ -543,7 +554,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i,
     avcodec_string(buf, sizeof(buf), avctx, is_output);
     avcodec_free_context(&avctx);
 
-    av_log(NULL, AV_LOG_INFO, "  Stream #%d:%d", index, i);
+    av_log(NULL, AV_LOG_INFO, "%s  Stream #%d", group_indent, index);
+    av_log(NULL, AV_LOG_INFO, ":%d", i);
 
     /* the pid is an important information, so we display it */
     /* XXX: add a generic system */
@@ -621,9 +633,89 @@ static void dump_stream_format(const AVFormatContext *ic, int i,
         av_log(NULL, AV_LOG_INFO, " (non-diegetic)");
     av_log(NULL, AV_LOG_INFO, "\n");
 
-    dump_metadata(NULL, st->metadata, "    ");
+    dump_metadata(NULL, st->metadata, extra_indent);
+
+    dump_sidedata(NULL, st, extra_indent);
+}
+
+static void dump_stream_group(const AVFormatContext *ic, uint8_t *printed,
+                              int i, int index, int is_output)
+{
+    const AVStreamGroup *stg = ic->stream_groups[i];
+    int flags = (is_output ? ic->oformat->flags : ic->iformat->flags);
+    char buf[512];
+    int ret;
 
-    dump_sidedata(NULL, st, "    ");
+    av_log(NULL, AV_LOG_INFO, "  Stream group #%d:%d", index, i);
+    if (flags & AVFMT_SHOW_IDS)
+        av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", stg->id);
+    av_log(NULL, AV_LOG_INFO, ":");
+
+    switch (stg->type) {
+    case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: {
+        const AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element;
+        av_log(NULL, AV_LOG_INFO, " IAMF Audio Element\n");
+        dump_metadata(NULL, stg->metadata, "    ");
+        for (int j = 0; j < audio_element->nb_layers; j++) {
+            const AVIAMFLayer *layer = audio_element->layers[j];
+            int channel_count = layer->ch_layout.nb_channels;
+            av_log(NULL, AV_LOG_INFO, "    Layer %d:", j);
+            ret = av_channel_layout_describe(&layer->ch_layout, buf, sizeof(buf));
+            if (ret >= 0)
+                av_log(NULL, AV_LOG_INFO, " %s", buf);
+            av_log(NULL, AV_LOG_INFO, "\n");
+            for (int k = 0; channel_count > 0 && k < stg->nb_streams; k++) {
+                AVStream *st = stg->streams[k];
+                dump_stream_format(ic, st->index, i, index, is_output);
+                printed[st->index] = 1;
+                channel_count -= st->codecpar->ch_layout.nb_channels;
+            }
+        }
+        break;
+    }
+    case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: {
+        const AVIAMFMixPresentation *mix_presentation = stg->params.iamf_mix_presentation;
+        av_log(NULL, AV_LOG_INFO, " IAMF Mix Presentation\n");
+        dump_metadata(NULL, stg->metadata, "    ");
+        dump_dictionary(NULL, mix_presentation->annotations, "Annotations", "    ");
+        for (int j = 0; j < mix_presentation->nb_submixes; j++) {
+            AVIAMFSubmix *sub_mix = mix_presentation->submixes[j];
+            av_log(NULL, AV_LOG_INFO, "    Submix %d:\n", j);
+            for (int k = 0; k < sub_mix->nb_elements; k++) {
+                const AVIAMFSubmixElement *submix_element = sub_mix->elements[k];
+                const AVStreamGroup *audio_element = NULL;
+                for (int l = 0; l < ic->nb_stream_groups; l++)
+                    if (ic->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT &&
+                        ic->stream_groups[l]->id   == submix_element->audio_element_id) {
+                        audio_element = ic->stream_groups[l];
+                        break;
+                    }
+                if (audio_element) {
+                    av_log(NULL, AV_LOG_INFO, "      IAMF Audio Element #%d:%d",
+                           index, audio_element->index);
+                    if (flags & AVFMT_SHOW_IDS)
+                        av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", audio_element->id);
+                    av_log(NULL, AV_LOG_INFO, "\n");
+                    dump_dictionary(NULL, submix_element->annotations, "Annotations", "        ");
+                }
+            }
+            for (int k = 0; k < sub_mix->nb_layouts; k++) {
+                const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[k];
+                av_log(NULL, AV_LOG_INFO, "      Layout #%d:", k);
+                if (submix_layout->layout_type == 2) {
+                    ret = av_channel_layout_describe(&submix_layout->sound_system, buf, sizeof(buf));
+                    if (ret >= 0)
+                        av_log(NULL, AV_LOG_INFO, " %s", buf);
+                } else if (submix_layout->layout_type == 3)
+                    av_log(NULL, AV_LOG_INFO, " Binaural");
+                av_log(NULL, AV_LOG_INFO, "\n");
+            }
+        }
+        break;
+    }
+    default:
+        break;
+    }
 }
 
 void av_dump_format(AVFormatContext *ic, int index,
@@ -699,7 +791,7 @@ void av_dump_format(AVFormatContext *ic, int index,
             dump_metadata(NULL, program->metadata, "    ");
             for (k = 0; k < program->nb_stream_indexes; k++) {
                 dump_stream_format(ic, program->stream_index[k],
-                                   index, is_output);
+                                   -1, index, is_output);
                 printed[program->stream_index[k]] = 1;
             }
             total += program->nb_stream_indexes;
@@ -708,9 +800,12 @@ void av_dump_format(AVFormatContext *ic, int index,
             av_log(NULL, AV_LOG_INFO, "  No Program\n");
     }
 
+    for (i = 0; i < ic->nb_stream_groups; i++)
+         dump_stream_group(ic, printed, i, index, is_output);
+
     for (i = 0; i < ic->nb_streams; i++)
         if (!printed[i])
-            dump_stream_format(ic, i, index, is_output);
+            dump_stream_format(ic, i, -1, index, is_output);
 
     av_free(printed);
 }
diff --git a/libavformat/internal.h b/libavformat/internal.h
index 7702986c9c..c6181683ef 100644
--- a/libavformat/internal.h
+++ b/libavformat/internal.h
@@ -202,6 +202,7 @@ typedef struct FFStream {
      */
     AVStream pub;
 
+    AVFormatContext *fmtctx;
     /**
      * Set to 1 if the codec allows reordering, so pts can be different
      * from dts.
@@ -427,6 +428,26 @@ static av_always_inline const FFStream *cffstream(const AVStream *st)
     return (const FFStream*)st;
 }
 
+typedef struct FFStreamGroup {
+    /**
+     * The public context.
+     */
+    AVStreamGroup pub;
+
+    AVFormatContext *fmtctx;
+} FFStreamGroup;
+
+
+static av_always_inline FFStreamGroup *ffstreamgroup(AVStreamGroup *stg)
+{
+    return (FFStreamGroup*)stg;
+}
+
+static av_always_inline const FFStreamGroup *cffstreamgroup(const AVStreamGroup *stg)
+{
+    return (const FFStreamGroup*)stg;
+}
+
 #ifdef __GNUC__
 #define dynarray_add(tab, nb_ptr, elem)\
 do {\
@@ -608,6 +629,18 @@ void ff_free_stream(AVStream **st);
  */
 void ff_remove_stream(AVFormatContext *s, AVStream *st);
 
+/**
+ * Frees a stream group without modifying the corresponding AVFormatContext.
+ * Must only be called if the latter doesn't matter or if the stream
+ * is not yet attached to an AVFormatContext.
+ */
+void ff_free_stream_group(AVStreamGroup **pstg);
+/**
+ * Remove a stream group from its AVFormatContext and free it.
+ * The group must be the last stream of the AVFormatContext.
+ */
+void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg);
+
 unsigned int ff_codec_get_tag(const AVCodecTag *tags, enum AVCodecID id);
 
 enum AVCodecID ff_codec_get_id(const AVCodecTag *tags, unsigned int tag);
diff --git a/libavformat/options.c b/libavformat/options.c
index 1d8c52246b..bf6113ca95 100644
--- a/libavformat/options.c
+++ b/libavformat/options.c
@@ -26,6 +26,7 @@
 #include "libavcodec/codec_par.h"
 
 #include "libavutil/avassert.h"
+#include "libavutil/iamf.h"
 #include "libavutil/internal.h"
 #include "libavutil/intmath.h"
 #include "libavutil/opt.h"
@@ -271,6 +272,7 @@ AVStream *avformat_new_stream(AVFormatContext *s, const AVCodec *c)
     if (!st->codecpar)
         goto fail;
 
+    sti->fmtctx = s;
     sti->avctx = avcodec_alloc_context3(NULL);
     if (!sti->avctx)
         goto fail;
@@ -325,6 +327,143 @@ fail:
     return NULL;
 }
 
+static void *stream_group_child_next(void *obj, void *prev)
+{
+    AVStreamGroup *stg = obj;
+    if (!prev) {
+        switch(stg->type) {
+        case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+            return stg->params.iamf_audio_element;
+        case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+            return stg->params.iamf_mix_presentation;
+        default:
+            break;
+        }
+    }
+    return NULL;
+}
+
+static const AVClass *stream_group_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+        ret = av_iamf_audio_element_get_class();
+        break;
+    case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+        ret = av_iamf_mix_presentation_get_class();
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVOption stream_group_options[] = {
+    {"id", "Set group id", offsetof(AVStreamGroup, id), AV_OPT_TYPE_INT64, {.i64 = 0}, 0, INT64_MAX, AV_OPT_FLAG_ENCODING_PARAM },
+    { NULL }
+};
+
+static const AVClass stream_group_class = {
+    .class_name     = "AVStreamGroup",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = stream_group_options,
+    .child_next     = stream_group_child_next,
+    .child_class_iterate = stream_group_child_iterate,
+};
+
+const AVClass *av_stream_group_get_class(void)
+{
+    return &stream_group_class;
+}
+
+AVStreamGroup *avformat_stream_group_create(AVFormatContext *s,
+                                            enum AVStreamGroupParamsType type,
+                                            AVDictionary **options)
+{
+    AVStreamGroup **stream_groups;
+    AVStreamGroup *stg;
+    FFStreamGroup *stgi;
+
+    stream_groups = av_realloc_array(s->stream_groups, s->nb_stream_groups + 1,
+                                     sizeof(*stream_groups));
+    if (!stream_groups)
+        return NULL;
+    s->stream_groups = stream_groups;
+
+    stgi = av_mallocz(sizeof(*stgi));
+    if (!stgi)
+        return NULL;
+    stg = &stgi->pub;
+
+    stg->av_class = &stream_group_class;
+    av_opt_set_defaults(stg);
+    stg->type = type;
+    switch (type) {
+    case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+        stg->params.iamf_audio_element = av_iamf_audio_element_alloc();
+        if (!stg->params.iamf_audio_element)
+            goto fail;
+        break;
+    case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+        stg->params.iamf_mix_presentation = av_iamf_mix_presentation_alloc();
+        if (!stg->params.iamf_mix_presentation)
+            goto fail;
+        break;
+    default:
+        goto fail;
+    }
+
+    if (options) {
+        if (av_opt_set_dict2(stg, options, AV_OPT_SEARCH_CHILDREN))
+            goto fail;
+    }
+
+    stgi->fmtctx = s;
+    stg->index   = s->nb_stream_groups;
+
+    s->stream_groups[s->nb_stream_groups++] = stg;
+
+    return stg;
+fail:
+    ff_free_stream_group(&stg);
+    return NULL;
+}
+
+static int stream_group_add_stream(AVStreamGroup *stg, AVStream *st)
+{
+    AVStream **streams = av_realloc_array(stg->streams, stg->nb_streams + 1,
+                                          sizeof(*stg->streams));
+    if (!streams)
+        return AVERROR(ENOMEM);
+
+    stg->streams = streams;
+    stg->streams[stg->nb_streams++] = st;
+
+    return 0;
+}
+
+int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st)
+{
+    const FFStreamGroup *stgi = cffstreamgroup(stg);
+    const FFStream *sti = cffstream(st);
+
+    if (stgi->fmtctx != sti->fmtctx)
+        return AVERROR(EINVAL);
+
+    for (int i = 0; i < stg->nb_streams; i++)
+        if (stg->streams[i]->index == st->index)
+            return AVERROR(EEXIST);
+
+    return stream_group_add_stream(stg, st);
+}
+
 static int option_is_disposition(const AVOption *opt)
 {
     return opt->type == AV_OPT_TYPE_CONST &&
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup James Almer
@ 2023-12-05 22:43 ` James Almer
  2023-12-11 11:48   ` Anton Khirnov
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 4/8] avcodec/packet: add IAMF Parameters side data types James Almer
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: James Almer @ 2023-12-05 22:43 UTC (permalink / raw)
  To: ffmpeg-devel

Starting with IAMF support.

Signed-off-by: James Almer <jamrial@gmail.com>
---
 fftools/ffmpeg.h          |   2 +
 fftools/ffmpeg_mux_init.c | 335 ++++++++++++++++++++++++++++++++++++++
 fftools/ffmpeg_opt.c      |   2 +
 3 files changed, 339 insertions(+)

diff --git a/fftools/ffmpeg.h b/fftools/ffmpeg.h
index 41935d39d5..057535adbb 100644
--- a/fftools/ffmpeg.h
+++ b/fftools/ffmpeg.h
@@ -262,6 +262,8 @@ typedef struct OptionsContext {
     int        nb_disposition;
     SpecifierOpt *program;
     int        nb_program;
+    SpecifierOpt *stream_groups;
+    int        nb_stream_groups;
     SpecifierOpt *time_bases;
     int        nb_time_bases;
     SpecifierOpt *enc_time_bases;
diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c
index 63a25a350f..7648f2a2f1 100644
--- a/fftools/ffmpeg_mux_init.c
+++ b/fftools/ffmpeg_mux_init.c
@@ -39,6 +39,7 @@
 #include "libavutil/dict.h"
 #include "libavutil/display.h"
 #include "libavutil/getenv_utf8.h"
+#include "libavutil/iamf.h"
 #include "libavutil/intreadwrite.h"
 #include "libavutil/log.h"
 #include "libavutil/mem.h"
@@ -1943,6 +1944,336 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t buf_size_u
     return 0;
 }
 
+static int of_parse_iamf_audio_element_layers(Muxer *mux, AVStreamGroup *stg, char **ptr)
+{
+    AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element;
+    AVDictionary *dict = NULL;
+    const char *token;
+    int ret = 0;
+
+    audio_element->demixing_info =
+        av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_DEMIXING, 1, NULL);
+    audio_element->recon_gain_info =
+        av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, 1, NULL);
+
+    if (!audio_element->demixing_info ||
+        !audio_element->recon_gain_info)
+        return AVERROR(ENOMEM);
+
+    /* process manually set layers and parameters */
+    token = av_strtok(NULL, ",", ptr);
+    while (token) {
+        const AVDictionaryEntry *e;
+        int demixing = 0, recon_gain = 0;
+        int layer = 0;
+
+        if (av_strstart(token, "layer=", &token))
+            layer = 1;
+        else if (av_strstart(token, "demixing=", &token))
+            demixing = 1;
+        else if (av_strstart(token, "recon_gain=", &token))
+            recon_gain = 1;
+
+        av_dict_free(&dict);
+        ret = av_dict_parse_string(&dict, token, "=", ":", 0);
+        if (ret < 0) {
+            av_log(mux, AV_LOG_ERROR, "Error parsing audio element specification %s\n", token);
+            goto fail;
+        }
+
+        if (layer) {
+            AVIAMFLayer *audio_layer = av_iamf_audio_element_add_layer(audio_element);
+            if (!audio_layer) {
+                av_log(mux, AV_LOG_ERROR, "Error adding layer to stream group %d\n", stg->index);
+                ret = AVERROR(ENOMEM);
+                goto fail;
+            }
+            av_opt_set_dict(audio_layer, &dict);
+        } else if (demixing || recon_gain) {
+            AVIAMFParamDefinition *param = demixing ? audio_element->demixing_info
+                                                    : audio_element->recon_gain_info;
+            void *subblock = av_iamf_param_definition_get_subblock(param, 0);
+
+            av_opt_set_dict(param, &dict);
+            av_opt_set_dict(subblock, &dict);
+
+            /* Hardcode spec parameters */
+            param->param_definition_mode = 0;
+            param->parameter_rate = stg->streams[0]->codecpar->sample_rate;
+            param->duration =
+            param->constant_subblock_duration = stg->streams[0]->codecpar->frame_size;
+        }
+
+        // make sure that no entries are left in the dict
+        e = NULL;
+        if (e = av_dict_iterate(dict, e)) {
+            av_log(mux, AV_LOG_FATAL, "Unknown layer key %s.\n", e->key);
+            ret = AVERROR(EINVAL);
+            goto fail;
+        }
+        token = av_strtok(NULL, ",", ptr);
+    }
+
+fail:
+    av_dict_free(&dict);
+    if (!ret && !audio_element->nb_layers) {
+        av_log(mux, AV_LOG_ERROR, "No layer in audio element specification\n");
+        ret = AVERROR(EINVAL);
+    }
+
+    return ret;
+}
+
+static int of_parse_iamf_submixes(Muxer *mux, AVStreamGroup *stg, char **ptr)
+{
+    AVFormatContext *oc = mux->fc;
+    AVIAMFMixPresentation *mix = stg->params.iamf_mix_presentation;
+    AVDictionary *dict = NULL;
+    const char *token;
+    char *submix_str = NULL;
+    int ret = 0;
+
+    /* process manually set submixes */
+    token = av_strtok(NULL, ",", ptr);
+    while (token) {
+        AVIAMFSubmix *submix = NULL;
+        const char *subtoken;
+        char *subptr = NULL;
+
+        if (!av_strstart(token, "submix=", &token)) {
+            av_log(mux, AV_LOG_ERROR, "No submix in mix presentation specification \"%s\"\n", token);
+            goto fail;
+        }
+
+        submix_str = av_strdup(token);
+        if (!submix_str)
+            goto fail;
+
+        submix = av_iamf_mix_presentation_add_submix(mix);
+        if (!submix) {
+            av_log(mux, AV_LOG_ERROR, "Error adding submix to stream group %d\n", stg->index);
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+        submix->output_mix_config =
+            av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL);
+        if (!submix->output_mix_config) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        submix->output_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate;
+
+        subptr = NULL;
+        subtoken = av_strtok(submix_str, "|", &subptr);
+        while (subtoken) {
+            const AVDictionaryEntry *e;
+            int element = 0, layout = 0;
+
+            if (av_strstart(subtoken, "element=", &subtoken))
+                element = 1;
+            else if (av_strstart(subtoken, "layout=", &subtoken))
+                layout = 1;
+
+            av_dict_free(&dict);
+            ret = av_dict_parse_string(&dict, subtoken, "=", ":", 0);
+            if (ret < 0) {
+                av_log(mux, AV_LOG_ERROR, "Error parsing submix specification \"%s\"\n", subtoken);
+                goto fail;
+            }
+
+            if (element) {
+                AVIAMFSubmixElement *submix_element;
+                int idx = -1;
+
+                if (e = av_dict_get(dict, "stg", NULL, 0))
+                    idx = strtol(e->value, NULL, 0);
+                av_dict_set(&dict, "stg", NULL, 0);
+                if (idx < 0 || idx >= oc->nb_stream_groups) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid or missing stream group index in "
+                                              "submix element specification \"%s\"\n", subtoken);
+                    ret = AVERROR(EINVAL);
+                    goto fail;
+                }
+                submix_element = av_iamf_submix_add_element(submix);
+                if (!submix_element) {
+                    av_log(mux, AV_LOG_ERROR, "Error adding element to submix\n");
+                    ret = AVERROR(ENOMEM);
+                    goto fail;
+                }
+
+                submix_element->audio_element_id = oc->stream_groups[idx]->id;
+
+                submix_element->element_mix_config =
+                    av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL);
+                if (!submix_element->element_mix_config)
+                    ret = AVERROR(ENOMEM);
+                av_opt_set_dict2(submix_element, &dict, AV_OPT_SEARCH_CHILDREN);
+                submix_element->element_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate;
+            } else if (layout) {
+                AVIAMFSubmixLayout *submix_layout = av_iamf_submix_add_layout(submix);
+                if (!submix_layout) {
+                    av_log(mux, AV_LOG_ERROR, "Error adding layout to submix\n");
+                    ret = AVERROR(ENOMEM);
+                    goto fail;
+                }
+                av_opt_set_dict(submix_layout, &dict);
+            } else
+                av_opt_set_dict2(submix, &dict, AV_OPT_SEARCH_CHILDREN);
+
+            if (ret < 0) {
+                goto fail;
+            }
+
+            // make sure that no entries are left in the dict
+            e = NULL;
+            while (e = av_dict_iterate(dict, e)) {
+                av_log(mux, AV_LOG_FATAL, "Unknown submix key %s.\n", e->key);
+                ret = AVERROR(EINVAL);
+                goto fail;
+            }
+            subtoken = av_strtok(NULL, "|", &subptr);
+        }
+        av_freep(&submix_str);
+
+        if (!submix->nb_elements) {
+            av_log(mux, AV_LOG_ERROR, "No audio elements in submix specification \"%s\"\n", token);
+            ret = AVERROR(EINVAL);
+        }
+        token = av_strtok(NULL, ",", ptr);
+    }
+
+fail:
+    av_dict_free(&dict);
+    av_free(submix_str);
+
+    return ret;
+}
+
+static int of_add_groups(Muxer *mux, const OptionsContext *o)
+{
+    AVFormatContext *oc = mux->fc;
+    int ret;
+
+    /* process manually set groups */
+    for (int i = 0; i < o->nb_stream_groups; i++) {
+        AVDictionary *dict = NULL, *tmp = NULL;
+        const AVDictionaryEntry *e;
+        AVStreamGroup *stg = NULL;
+        int type;
+        const char *token;
+        char *str, *ptr = NULL;
+        const AVOption opts[] = {
+            { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT,
+                    { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" },
+                { "iamf_audio_element",    NULL, 0, AV_OPT_TYPE_CONST,
+                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT },    .unit = "type" },
+                { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
+                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
+            { NULL },
+        };
+         const AVClass class = {
+            .class_name = "StreamGroupType",
+            .item_name  = av_default_item_name,
+            .option     = opts,
+            .version    = LIBAVUTIL_VERSION_INT,
+        };
+        const AVClass *pclass = &class;
+
+        str = av_strdup(o->stream_groups[i].u.str);
+        if (!str)
+            goto end;
+
+        token = av_strtok(str, ",", &ptr);
+        if (token) {
+            ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY);
+            if (ret < 0) {
+                av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token);
+                goto end;
+            }
+
+            // "type" is not a user settable option in AVStreamGroup
+            e = av_dict_get(dict, "type", NULL, 0);
+            if (!e) {
+                av_log(mux, AV_LOG_ERROR, "No type define for Steam Group %d\n", i);
+                ret = AVERROR(EINVAL);
+                goto end;
+            }
+
+            ret = av_opt_eval_int(&pclass, opts, e->value, &type);
+            if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) {
+                av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
+                goto end;
+            }
+
+            av_dict_copy(&tmp, dict, 0);
+            stg = avformat_stream_group_create(oc, type, &tmp);
+            if (!stg) {
+                ret = AVERROR(ENOMEM);
+                goto end;
+            }
+            av_dict_set(&tmp, "type", NULL, 0);
+
+            e = NULL;
+            while (e = av_dict_get(dict, "st", e, 0)) {
+                unsigned int idx = strtol(e->value, NULL, 0);
+                if (idx >= oc->nb_streams) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx);
+                    ret = AVERROR(EINVAL);
+                    goto end;
+                }
+                avformat_stream_group_add_stream(stg, oc->streams[idx]);
+            }
+            while (e = av_dict_get(dict, "stg", e, 0)) {
+                unsigned int idx = strtol(e->value, NULL, 0);
+                if (idx >= oc->nb_stream_groups || idx == stg->index) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid stream group index %d\n", idx);
+                    ret = AVERROR(EINVAL);
+                    goto end;
+                }
+                for (int j = 0; j < oc->stream_groups[idx]->nb_streams; j++)
+                    avformat_stream_group_add_stream(stg, oc->stream_groups[idx]->streams[j]);
+            }
+
+            switch(type) {
+            case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+                ret = of_parse_iamf_audio_element_layers(mux, stg, &ptr);
+                break;
+            case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+                ret = of_parse_iamf_submixes(mux, stg, &ptr);
+                break;
+            default:
+                av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type);
+                ret = AVERROR(EINVAL);
+                break;
+            }
+
+            if (ret < 0)
+                goto end;
+
+            // make sure that nothing but "st" and "stg" entries are left in the dict
+            e = NULL;
+            while (e = av_dict_iterate(tmp, e)) {
+                if (!strcmp(e->key, "st") || !strcmp(e->key, "stg"))
+                    continue;
+
+                av_log(mux, AV_LOG_FATAL, "Unknown group key %s.\n", e->key);
+                ret = AVERROR(EINVAL);
+                goto end;
+            }
+        }
+
+end:
+        av_dict_free(&dict);
+        av_dict_free(&tmp);
+        av_free(str);
+        if (ret < 0)
+            return ret;
+    }
+
+    return 0;
+}
+
 static int of_add_programs(Muxer *mux, const OptionsContext *o)
 {
     AVFormatContext *oc = mux->fc;
@@ -2740,6 +3071,10 @@ int of_open(const OptionsContext *o, const char *filename)
     if (err < 0)
         return err;
 
+    err = of_add_groups(mux, o);
+    if (err < 0)
+        return err;
+
     err = of_add_programs(mux, o);
     if (err < 0)
         return err;
diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c
index 304471dd03..1144f64f89 100644
--- a/fftools/ffmpeg_opt.c
+++ b/fftools/ffmpeg_opt.c
@@ -1491,6 +1491,8 @@ const OptionDef options[] = {
         "add metadata", "string=string" },
     { "program",        HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(program) },
         "add program with specified streams", "title=string:st=number..." },
+    { "stream_group",        HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(stream_groups) },
+        "add stream group with specified streams and group type-specific arguments", "id=number:st=number..." },
     { "dframes",        HAS_ARG | OPT_PERFILE | OPT_EXPERT |
                         OPT_OUTPUT,                                  { .func_arg = opt_data_frames },
         "set the number of data frames to output", "number" },
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 4/8] avcodec/packet: add IAMF Parameters side data types
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
                   ` (2 preceding siblings ...)
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups James Almer
@ 2023-12-05 22:43 ` James Almer
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 5/8] avcodec/get_bits: add get_leb() James Almer
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:43 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavcodec/avpacket.c |  3 +++
 libavcodec/packet.h   | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/libavcodec/avpacket.c b/libavcodec/avpacket.c
index e29725c2d2..0f8c9b77ae 100644
--- a/libavcodec/avpacket.c
+++ b/libavcodec/avpacket.c
@@ -301,6 +301,9 @@ const char *av_packet_side_data_name(enum AVPacketSideDataType type)
     case AV_PKT_DATA_DOVI_CONF:                  return "DOVI configuration record";
     case AV_PKT_DATA_S12M_TIMECODE:              return "SMPTE ST 12-1:2014 timecode";
     case AV_PKT_DATA_DYNAMIC_HDR10_PLUS:         return "HDR10+ Dynamic Metadata (SMPTE 2094-40)";
+    case AV_PKT_DATA_IAMF_MIX_GAIN_PARAM:        return "IAMF Mix Gain Parameter Data";
+    case AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM:   return "IAMF Demixing Info Parameter Data";
+    case AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM: return "IAMF Recon Gain Info Parameter Data";
     }
     return NULL;
 }
diff --git a/libavcodec/packet.h b/libavcodec/packet.h
index b19409b719..2c57d262c6 100644
--- a/libavcodec/packet.h
+++ b/libavcodec/packet.h
@@ -299,6 +299,30 @@ enum AVPacketSideDataType {
      */
     AV_PKT_DATA_DYNAMIC_HDR10_PLUS,
 
+    /**
+     * IAMF Mix Gain Parameter Data associated with the audio frame. This metadata
+     * is in the form of the AVIAMFParamDefinition struct and contains information
+     * defined in sections 3.6.1 and 3.8.1 of the Immersive Audio Model and
+     * Formats standard.
+     */
+    AV_PKT_DATA_IAMF_MIX_GAIN_PARAM,
+
+    /**
+     * IAMF Demixing Info Parameter Data associated with the audio frame. This
+     * metadata is in the form of the AVIAMFParamDefinition struct and contains
+     * information defined in sections 3.6.1 and 3.8.2 of the Immersive Audio Model
+     * and Formats standard.
+     */
+    AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM,
+
+    /**
+     * IAMF Recon Gain Info Parameter Data associated with the audio frame. This
+     * metadata is in the form of the AVIAMFParamDefinition struct and contains
+     * information defined in sections 3.6.1 and 3.8.3 of the Immersive Audio Model
+     * and Formats standard.
+     */
+    AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM,
+
     /**
      * The number of side data types.
      * This is not part of the public API/ABI in the sense that it may
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 5/8] avcodec/get_bits: add get_leb()
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
                   ` (3 preceding siblings ...)
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 4/8] avcodec/packet: add IAMF Parameters side data types James Almer
@ 2023-12-05 22:43 ` James Almer
  2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 6/8] avformat/aviobuf: add ffio_read_leb() and ffio_write_leb() James Almer
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:43 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavcodec/bitstream.h          |  2 ++
 libavcodec/bitstream_template.h | 23 +++++++++++++++++++++++
 libavcodec/get_bits.h           | 24 ++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/libavcodec/bitstream.h b/libavcodec/bitstream.h
index 35b7873b9c..17f8a5da83 100644
--- a/libavcodec/bitstream.h
+++ b/libavcodec/bitstream.h
@@ -103,6 +103,7 @@
 # define bits_apply_sign    bits_apply_sign_le
 # define bits_read_vlc      bits_read_vlc_le
 # define bits_read_vlc_multi bits_read_vlc_multi_le
+# define bits_read_leb      bits_read_leb_le
 
 #elif defined(BITS_DEFAULT_BE)
 
@@ -132,6 +133,7 @@
 # define bits_apply_sign    bits_apply_sign_be
 # define bits_read_vlc      bits_read_vlc_be
 # define bits_read_vlc_multi bits_read_vlc_multi_be
+# define bits_read_leb      bits_read_leb_be
 
 #endif
 
diff --git a/libavcodec/bitstream_template.h b/libavcodec/bitstream_template.h
index 4f3d07275f..4c7101632f 100644
--- a/libavcodec/bitstream_template.h
+++ b/libavcodec/bitstream_template.h
@@ -562,6 +562,29 @@ static inline int BS_FUNC(read_vlc_multi)(BSCTX *bc, uint8_t dst[8],
     return ret;
 }
 
+/**
+ * Read a unsigned integer coded as a variable number of up to eight
+ * little-endian bytes, where the MSB in a byte signals another byte
+ * must be read.
+ * Values > UINT_MAX are truncated, but all coded bits are read.
+ */
+static inline unsigned BS_FUNC(read_leb)(BSCTX *bc) {
+    int more, i = 0;
+    unsigned leb = 0;
+
+    do {
+        int byte = BS_FUNC(read)(bc, 8);
+        unsigned bits = byte & 0x7f;
+        more = byte & 0x80;
+        if (i <= 4)
+            leb |= bits << (i * 7);
+        if (++i == 8)
+            break;
+    } while (more);
+
+    return leb;
+}
+
 #undef BSCTX
 #undef BS_FUNC
 #undef BS_JOIN3
diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h
index cfcf97c021..9e19d2a439 100644
--- a/libavcodec/get_bits.h
+++ b/libavcodec/get_bits.h
@@ -94,6 +94,7 @@ typedef BitstreamContext GetBitContext;
 #define align_get_bits      bits_align
 #define get_vlc2            bits_read_vlc
 #define get_vlc_multi       bits_read_vlc_multi
+#define get_leb             bits_read_leb
 
 #define init_get_bits8_le(s, buffer, byte_size) bits_init8_le((BitstreamContextLE*)s, buffer, byte_size)
 #define get_bits_le(s, n)                       bits_read_le((BitstreamContextLE*)s, n)
@@ -710,6 +711,29 @@ static inline int skip_1stop_8data_bits(GetBitContext *gb)
     return 0;
 }
 
+/**
+ * Read a unsigned integer coded as a variable number of up to eight
+ * little-endian bytes, where the MSB in a byte signals another byte
+ * must be read.
+ * All coded bits are read, but values > UINT_MAX are truncated.
+ */
+static inline unsigned get_leb(GetBitContext *s) {
+    int more, i = 0;
+    unsigned leb = 0;
+
+    do {
+        int byte = get_bits(s, 8);
+        unsigned bits = byte & 0x7f;
+        more = byte & 0x80;
+        if (i <= 4)
+            leb |= bits << (i * 7);
+        if (++i == 8)
+            break;
+    } while (more);
+
+    return leb;
+}
+
 #endif // CACHED_BITSTREAM_READER
 
 #endif /* AVCODEC_GET_BITS_H */
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 6/8] avformat/aviobuf: add ffio_read_leb() and ffio_write_leb()
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
                   ` (4 preceding siblings ...)
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 5/8] avcodec/get_bits: add get_leb() James Almer
@ 2023-12-05 22:44 ` James Almer
  2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 7/8] avformat: Immersive Audio Model and Formats demuxer James Almer
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:44 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavformat/avio_internal.h | 10 ++++++++++
 libavformat/aviobuf.c       | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/libavformat/avio_internal.h b/libavformat/avio_internal.h
index bd58499b64..f2e4ff30cb 100644
--- a/libavformat/avio_internal.h
+++ b/libavformat/avio_internal.h
@@ -146,6 +146,16 @@ int ffio_rewind_with_probe_data(AVIOContext *s, unsigned char **buf, int buf_siz
 
 uint64_t ffio_read_varlen(AVIOContext *bc);
 
+/**
+ * Read a unsigned integer coded as a variable number of up to eight
+ * little-endian bytes, where the MSB in a byte signals another byte
+ * must be read.
+ * All coded bytes are read, but values > UINT_MAX are truncated.
+ */
+unsigned int ffio_read_leb(AVIOContext *s);
+
+void ffio_write_leb(AVIOContext *s, unsigned val);
+
 /**
  * Read size bytes from AVIOContext into buf.
  * Check that exactly size bytes have been read.
diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
index 2899c75521..5a329ce465 100644
--- a/libavformat/aviobuf.c
+++ b/libavformat/aviobuf.c
@@ -971,6 +971,39 @@ uint64_t ffio_read_varlen(AVIOContext *bc){
     return val;
 }
 
+unsigned int ffio_read_leb(AVIOContext *s) {
+    int more, i = 0;
+    unsigned leb = 0;
+
+    do {
+        int byte = avio_r8(s);
+        unsigned bits = byte & 0x7f;
+        more = byte & 0x80;
+        if (i <= 4)
+            leb |= bits << (i * 7);
+        if (++i == 8)
+            break;
+    } while (more);
+
+    return leb;
+}
+
+void ffio_write_leb(AVIOContext *s, unsigned val)
+{
+    int len;
+    uint8_t byte;
+
+    len = (av_log2(val) + 7) / 7;
+
+    for (int i = 0; i < len; i++) {
+        byte = val >> (7 * i) & 0x7f;
+        if (i < len - 1)
+            byte |= 0x80;
+
+        avio_w8(s, byte);
+    }
+}
+
 int ffio_fdopen(AVIOContext **s, URLContext *h)
 {
     uint8_t *buffer = NULL;
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 7/8] avformat: Immersive Audio Model and Formats demuxer
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
                   ` (5 preceding siblings ...)
  2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 6/8] avformat/aviobuf: add ffio_read_leb() and ffio_write_leb() James Almer
@ 2023-12-05 22:44 ` James Almer
  2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 8/8] avformat: Immersive Audio Model and Formats muxer James Almer
  2023-12-10 21:52 ` [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
  8 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:44 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavformat/Makefile     |    1 +
 libavformat/allformats.c |    1 +
 libavformat/iamf.c       |  125 +++++
 libavformat/iamf.h       |  162 ++++++
 libavformat/iamf_parse.c | 1106 ++++++++++++++++++++++++++++++++++++++
 libavformat/iamf_parse.h |   38 ++
 libavformat/iamfdec.c    |  495 +++++++++++++++++
 7 files changed, 1928 insertions(+)
 create mode 100644 libavformat/iamf.c
 create mode 100644 libavformat/iamf.h
 create mode 100644 libavformat/iamf_parse.c
 create mode 100644 libavformat/iamf_parse.h
 create mode 100644 libavformat/iamfdec.c

diff --git a/libavformat/Makefile b/libavformat/Makefile
index 2db83aff81..f23c22792b 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -258,6 +258,7 @@ OBJS-$(CONFIG_EVC_MUXER)                 += rawenc.o
 OBJS-$(CONFIG_HLS_DEMUXER)               += hls.o hls_sample_encryption.o
 OBJS-$(CONFIG_HLS_MUXER)                 += hlsenc.o hlsplaylist.o avc.o
 OBJS-$(CONFIG_HNM_DEMUXER)               += hnm.o
+OBJS-$(CONFIG_IAMF_DEMUXER)              += iamfdec.o iamf_parse.o iamf.o
 OBJS-$(CONFIG_ICO_DEMUXER)               += icodec.o
 OBJS-$(CONFIG_ICO_MUXER)                 += icoenc.o
 OBJS-$(CONFIG_IDCIN_DEMUXER)             += idcin.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index c8bb4e3866..6e520b78a6 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -212,6 +212,7 @@ extern const FFOutputFormat ff_hevc_muxer;
 extern const AVInputFormat  ff_hls_demuxer;
 extern const FFOutputFormat ff_hls_muxer;
 extern const AVInputFormat  ff_hnm_demuxer;
+extern const AVInputFormat  ff_iamf_demuxer;
 extern const AVInputFormat  ff_ico_demuxer;
 extern const FFOutputFormat ff_ico_muxer;
 extern const AVInputFormat  ff_idcin_demuxer;
diff --git a/libavformat/iamf.c b/libavformat/iamf.c
new file mode 100644
index 0000000000..5de70dc082
--- /dev/null
+++ b/libavformat/iamf.c
@@ -0,0 +1,125 @@
+/*
+ * Immersive Audio Model and Formats common helpers and structs
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/channel_layout.h"
+#include "libavutil/iamf.h"
+#include "libavutil/mem.h"
+#include "iamf.h"
+
+const AVChannelLayout ff_iamf_scalable_ch_layouts[10] = {
+    AV_CHANNEL_LAYOUT_MONO,
+    AV_CHANNEL_LAYOUT_STEREO,
+    // "Loudspeaker configuration for Sound System B"
+    AV_CHANNEL_LAYOUT_5POINT1_BACK,
+    // "Loudspeaker configuration for Sound System C"
+    AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK,
+    // "Loudspeaker configuration for Sound System D"
+    AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK,
+    // "Loudspeaker configuration for Sound System I"
+    AV_CHANNEL_LAYOUT_7POINT1,
+    // "Loudspeaker configuration for Sound System I" + Ltf + Rtf
+    AV_CHANNEL_LAYOUT_7POINT1POINT2,
+    // "Loudspeaker configuration for Sound System J"
+    AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK,
+    // Front subset of "Loudspeaker configuration for Sound System J"
+    AV_CHANNEL_LAYOUT_3POINT1POINT2,
+    // Binaural
+    AV_CHANNEL_LAYOUT_STEREO,
+};
+
+const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13] = {
+    { SOUND_SYSTEM_A_0_2_0, AV_CHANNEL_LAYOUT_STEREO },
+    { SOUND_SYSTEM_B_0_5_0, AV_CHANNEL_LAYOUT_5POINT1_BACK },
+    { SOUND_SYSTEM_C_2_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK },
+    { SOUND_SYSTEM_D_4_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK },
+    { SOUND_SYSTEM_E_4_5_1,
+        {
+            .nb_channels = 11,
+            .order       = AV_CHANNEL_ORDER_NATIVE,
+            .u.mask      = AV_CH_LAYOUT_5POINT1POINT4_BACK | AV_CH_BOTTOM_FRONT_CENTER,
+        },
+    },
+    { SOUND_SYSTEM_F_3_7_0,  AV_CHANNEL_LAYOUT_7POINT2POINT3 },
+    { SOUND_SYSTEM_G_4_9_0,  AV_CHANNEL_LAYOUT_9POINT1POINT4_BACK },
+    { SOUND_SYSTEM_H_9_10_3, AV_CHANNEL_LAYOUT_22POINT2 },
+    { SOUND_SYSTEM_I_0_7_0,  AV_CHANNEL_LAYOUT_7POINT1 },
+    { SOUND_SYSTEM_J_4_7_0,  AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK },
+    { SOUND_SYSTEM_10_2_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT2 },
+    { SOUND_SYSTEM_11_2_3_0, AV_CHANNEL_LAYOUT_3POINT1POINT2 },
+    { SOUND_SYSTEM_12_0_1_0, AV_CHANNEL_LAYOUT_MONO },
+};
+
+void ff_iamf_free_audio_element(IAMFAudioElement **paudio_element)
+{
+    IAMFAudioElement *audio_element = *paudio_element;
+
+    if (!audio_element)
+        return;
+
+    for (int i = 0; i < audio_element->nb_substreams; i++)
+        avcodec_parameters_free(&audio_element->substreams[i].codecpar);
+    av_free(audio_element->substreams);
+    av_free(audio_element->layers);
+    av_iamf_audio_element_free(&audio_element->element);
+    av_freep(paudio_element);
+}
+
+void ff_iamf_free_mix_presentation(IAMFMixPresentation **pmix_presentation)
+{
+    IAMFMixPresentation *mix_presentation = *pmix_presentation;
+
+    if (!mix_presentation)
+        return;
+
+    for (int i = 0; i < mix_presentation->count_label; i++)
+        av_free(mix_presentation->language_label[i]);
+    av_free(mix_presentation->language_label);
+    av_iamf_mix_presentation_free(&mix_presentation->mix);
+    av_freep(pmix_presentation);
+}
+
+void ff_iamf_uninit_context(IAMFContext *c)
+{
+    if (!c)
+        return;
+
+    for (int i = 0; i < c->nb_codec_configs; i++) {
+        av_free(c->codec_configs[i]->extradata);
+        av_free(c->codec_configs[i]);
+    }
+    av_freep(&c->codec_configs);
+    c->nb_codec_configs = 0;
+
+    for (int i = 0; i < c->nb_audio_elements; i++)
+        ff_iamf_free_audio_element(&c->audio_elements[i]);
+    av_freep(&c->audio_elements);
+    c->nb_audio_elements = 0;
+
+    for (int i = 0; i < c->nb_mix_presentations; i++)
+        ff_iamf_free_mix_presentation(&c->mix_presentations[i]);
+    av_freep(&c->mix_presentations);
+    c->nb_mix_presentations = 0;
+
+    for (int i = 0; i < c->nb_param_definitions; i++)
+        av_free(c->param_definitions[i]);
+    av_freep(&c->param_definitions);
+    c->nb_param_definitions = 0;
+}
diff --git a/libavformat/iamf.h b/libavformat/iamf.h
new file mode 100644
index 0000000000..efcd5dc4e2
--- /dev/null
+++ b/libavformat/iamf.h
@@ -0,0 +1,162 @@
+/*
+ * Immersive Audio Model and Formats common helpers and structs
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFORMAT_IAMF_H
+#define AVFORMAT_IAMF_H
+
+#include <stdint.h>
+
+#include "libavutil/channel_layout.h"
+#include "libavutil/iamf.h"
+#include "libavcodec/codec_id.h"
+#include "libavcodec/codec_par.h"
+#include "avformat.h"
+
+#define MAX_IAMF_OBU_HEADER_SIZE (1 + 8 * 3)
+
+// OBU types (section 3.2).
+enum IAMF_OBU_Type {
+    IAMF_OBU_IA_CODEC_CONFIG        = 0,
+    IAMF_OBU_IA_AUDIO_ELEMENT       = 1,
+    IAMF_OBU_IA_MIX_PRESENTATION    = 2,
+    IAMF_OBU_IA_PARAMETER_BLOCK     = 3,
+    IAMF_OBU_IA_TEMPORAL_DELIMITER  = 4,
+    IAMF_OBU_IA_AUDIO_FRAME         = 5,
+    IAMF_OBU_IA_AUDIO_FRAME_ID0     = 6,
+    IAMF_OBU_IA_AUDIO_FRAME_ID1     = 7,
+    IAMF_OBU_IA_AUDIO_FRAME_ID2     = 8,
+    IAMF_OBU_IA_AUDIO_FRAME_ID3     = 9,
+    IAMF_OBU_IA_AUDIO_FRAME_ID4     = 10,
+    IAMF_OBU_IA_AUDIO_FRAME_ID5     = 11,
+    IAMF_OBU_IA_AUDIO_FRAME_ID6     = 12,
+    IAMF_OBU_IA_AUDIO_FRAME_ID7     = 13,
+    IAMF_OBU_IA_AUDIO_FRAME_ID8     = 14,
+    IAMF_OBU_IA_AUDIO_FRAME_ID9     = 15,
+    IAMF_OBU_IA_AUDIO_FRAME_ID10    = 16,
+    IAMF_OBU_IA_AUDIO_FRAME_ID11    = 17,
+    IAMF_OBU_IA_AUDIO_FRAME_ID12    = 18,
+    IAMF_OBU_IA_AUDIO_FRAME_ID13    = 19,
+    IAMF_OBU_IA_AUDIO_FRAME_ID14    = 20,
+    IAMF_OBU_IA_AUDIO_FRAME_ID15    = 21,
+    IAMF_OBU_IA_AUDIO_FRAME_ID16    = 22,
+    IAMF_OBU_IA_AUDIO_FRAME_ID17    = 23,
+    // 24~30 reserved.
+    IAMF_OBU_IA_SEQUENCE_HEADER     = 31,
+};
+
+typedef struct IAMFCodecConfig {
+    unsigned codec_config_id;
+    enum AVCodecID codec_id;
+    uint32_t codec_tag;
+    unsigned nb_samples;
+    int seek_preroll;
+    int sample_rate;
+    int extradata_size;
+    uint8_t *extradata;
+} IAMFCodecConfig;
+
+typedef struct IAMFLayer {
+    unsigned int substream_count;
+    unsigned int coupled_substream_count;
+} IAMFLayer;
+
+typedef struct IAMFSubStream {
+    unsigned int audio_substream_id;
+
+    // demux
+    AVCodecParameters *codecpar;
+} IAMFSubStream;
+
+typedef struct IAMFAudioElement {
+    AVIAMFAudioElement *element;
+    unsigned int audio_element_id;
+
+    IAMFSubStream *substreams;
+    unsigned int nb_substreams;
+
+    unsigned int codec_config_id;
+
+    // mux
+    IAMFLayer *layers;
+    unsigned int nb_layers;
+} IAMFAudioElement;
+
+typedef struct IAMFMixPresentation {
+    AVIAMFMixPresentation *mix;
+    unsigned int mix_presentation_id;
+
+    // demux
+    unsigned int count_label;
+    char **language_label;
+} IAMFMixPresentation;
+
+typedef struct IAMFParamDefinition {
+    const AVIAMFAudioElement *audio_element;
+    AVIAMFParamDefinition *param;
+    size_t param_size;
+} IAMFParamDefinition;
+
+typedef struct IAMFContext {
+    IAMFCodecConfig **codec_configs;
+    int nb_codec_configs;
+    IAMFAudioElement **audio_elements;
+    int nb_audio_elements;
+    IAMFMixPresentation **mix_presentations;
+    int nb_mix_presentations;
+    IAMFParamDefinition **param_definitions;
+    int nb_param_definitions;
+} IAMFContext;
+
+enum IAMF_Anchor_Element {
+    IAMF_ANCHOR_ELEMENT_UNKNWONW,
+    IAMF_ANCHOR_ELEMENT_DIALOGUE,
+    IAMF_ANCHOR_ELEMENT_ALBUM,
+};
+
+enum IAMF_Sound_System {
+    SOUND_SYSTEM_A_0_2_0  = 0,  // "Loudspeaker configuration for Sound System A"
+    SOUND_SYSTEM_B_0_5_0  = 1,  // "Loudspeaker configuration for Sound System B"
+    SOUND_SYSTEM_C_2_5_0  = 2,  // "Loudspeaker configuration for Sound System C"
+    SOUND_SYSTEM_D_4_5_0  = 3,  // "Loudspeaker configuration for Sound System D"
+    SOUND_SYSTEM_E_4_5_1  = 4,  // "Loudspeaker configuration for Sound System E"
+    SOUND_SYSTEM_F_3_7_0  = 5,  // "Loudspeaker configuration for Sound System F"
+    SOUND_SYSTEM_G_4_9_0  = 6,  // "Loudspeaker configuration for Sound System G"
+    SOUND_SYSTEM_H_9_10_3 = 7,  // "Loudspeaker configuration for Sound System H"
+    SOUND_SYSTEM_I_0_7_0  = 8,  // "Loudspeaker configuration for Sound System I"
+    SOUND_SYSTEM_J_4_7_0  = 9, // "Loudspeaker configuration for Sound System J"
+    SOUND_SYSTEM_10_2_7_0 = 10, // "Loudspeaker configuration for Sound System I" + Ltf + Rtf
+    SOUND_SYSTEM_11_2_3_0 = 11, // Front subset of "Loudspeaker configuration for Sound System J"
+    SOUND_SYSTEM_12_0_1_0 = 12, // Mono
+};
+
+struct IAMFSoundSystemMap {
+    enum IAMF_Sound_System id;
+    AVChannelLayout layout;
+};
+
+extern const AVChannelLayout ff_iamf_scalable_ch_layouts[10];
+extern const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13];
+
+void ff_iamf_free_audio_element(IAMFAudioElement **paudio_element);
+void ff_iamf_free_mix_presentation(IAMFMixPresentation **pmix_presentation);
+void ff_iamf_uninit_context(IAMFContext *c);
+
+#endif /* AVFORMAT_IAMF_H */
diff --git a/libavformat/iamf_parse.c b/libavformat/iamf_parse.c
new file mode 100644
index 0000000000..62abb5fe9f
--- /dev/null
+++ b/libavformat/iamf_parse.c
@@ -0,0 +1,1106 @@
+/*
+ * Immersive Audio Model and Formats parsing
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avassert.h"
+#include "libavutil/common.h"
+#include "libavutil/iamf.h"
+#include "libavutil/intreadwrite.h"
+#include "libavutil/log.h"
+#include "libavcodec/get_bits.h"
+#include "libavcodec/flac.h"
+#include "libavcodec/mpeg4audio.h"
+#include "libavcodec/put_bits.h"
+#include "avio_internal.h"
+#include "iamf_parse.h"
+#include "isom.h"
+
+static int opus_decoder_config(IAMFCodecConfig *codec_config,
+                               AVIOContext *pb, int len)
+{
+    int left = len - avio_tell(pb);
+
+    if (left < 11)
+        return AVERROR_INVALIDDATA;
+
+    codec_config->extradata = av_malloc(left + 8);
+    if (!codec_config->extradata)
+        return AVERROR(ENOMEM);
+
+    AV_WB32(codec_config->extradata, MKBETAG('O','p','u','s'));
+    AV_WB32(codec_config->extradata + 4, MKBETAG('H','e','a','d'));
+    codec_config->extradata_size = avio_read(pb, codec_config->extradata + 8, left);
+    if (codec_config->extradata_size < left)
+        return AVERROR_INVALIDDATA;
+
+    codec_config->extradata_size += 8;
+    codec_config->sample_rate = 48000;
+
+    return 0;
+}
+
+static int aac_decoder_config(IAMFCodecConfig *codec_config,
+                              AVIOContext *pb, int len, void *logctx)
+{
+    MPEG4AudioConfig cfg = { 0 };
+    int object_type_id, codec_id, stream_type;
+    int ret, tag, left;
+
+    tag = avio_r8(pb);
+    if (tag != MP4DecConfigDescrTag)
+        return AVERROR_INVALIDDATA;
+
+    object_type_id = avio_r8(pb);
+    if (object_type_id != 0x40)
+        return AVERROR_INVALIDDATA;
+
+    stream_type = avio_r8(pb);
+    if (((stream_type >> 2) != 5) || ((stream_type >> 1) & 1))
+        return AVERROR_INVALIDDATA;
+
+    avio_skip(pb, 3); // buffer size db
+    avio_skip(pb, 4); // rc_max_rate
+    avio_skip(pb, 4); // avg bitrate
+
+    codec_id = ff_codec_get_id(ff_mp4_obj_type, object_type_id);
+    if (codec_id && codec_id != codec_config->codec_id)
+        return AVERROR_INVALIDDATA;
+
+    tag = avio_r8(pb);
+    if (tag != MP4DecSpecificDescrTag)
+        return AVERROR_INVALIDDATA;
+
+    left = len - avio_tell(pb);
+    if (left <= 0)
+        return AVERROR_INVALIDDATA;
+
+    codec_config->extradata = av_malloc(left);
+    if (!codec_config->extradata)
+        return AVERROR(ENOMEM);
+
+    codec_config->extradata_size = avio_read(pb, codec_config->extradata, left);
+    if (codec_config->extradata_size < left)
+        return AVERROR_INVALIDDATA;
+
+    ret = avpriv_mpeg4audio_get_config2(&cfg, codec_config->extradata,
+                                        codec_config->extradata_size, 1, logctx);
+    if (ret < 0)
+        return ret;
+
+    codec_config->sample_rate = cfg.sample_rate;
+
+    return 0;
+}
+
+static int flac_decoder_config(IAMFCodecConfig *codec_config,
+                               AVIOContext *pb, int len)
+{
+    int left;
+
+    avio_skip(pb, 4); // METADATA_BLOCK_HEADER
+
+    left = len - avio_tell(pb);
+    if (left < FLAC_STREAMINFO_SIZE)
+        return AVERROR_INVALIDDATA;
+
+    codec_config->extradata = av_malloc(left);
+    if (!codec_config->extradata)
+        return AVERROR(ENOMEM);
+
+    codec_config->extradata_size = avio_read(pb, codec_config->extradata, left);
+    if (codec_config->extradata_size < left)
+        return AVERROR_INVALIDDATA;
+
+    codec_config->sample_rate = AV_RB24(codec_config->extradata + 10) >> 4;
+
+    return 0;
+}
+
+static int ipcm_decoder_config(IAMFCodecConfig *codec_config,
+                               AVIOContext *pb, int len)
+{
+    static const enum AVSampleFormat sample_fmt[2][3] = {
+        { AV_CODEC_ID_PCM_S16BE, AV_CODEC_ID_PCM_S24BE, AV_CODEC_ID_PCM_S32BE },
+        { AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S32LE },
+    };
+    int sample_format = avio_r8(pb); // 0 = BE, 1 = LE
+    int sample_size = (avio_r8(pb) / 8 - 2); // 16, 24, 32
+    if (sample_format > 1 || sample_size > 2)
+        return AVERROR_INVALIDDATA;
+
+    codec_config->codec_id = sample_fmt[sample_format][sample_size];
+    codec_config->sample_rate = avio_rb32(pb);
+
+    if (len - avio_tell(pb))
+        return AVERROR_INVALIDDATA;
+
+    return 0;
+}
+
+static int codec_config_obu(void *s, IAMFContext *c, AVIOContext *pb, int len)
+{
+    IAMFCodecConfig **tmp, *codec_config = NULL;
+    FFIOContext b;
+    AVIOContext *pbc;
+    uint8_t *buf;
+    enum AVCodecID avcodec_id;
+    unsigned codec_config_id, nb_samples, codec_id;
+    int16_t seek_preroll;
+    int ret;
+
+    buf = av_malloc(len);
+    if (!buf)
+        return AVERROR(ENOMEM);
+
+    ret = avio_read(pb, buf, len);
+    if (ret != len) {
+        if (ret >= 0)
+            ret = AVERROR_INVALIDDATA;
+        goto fail;
+    }
+
+    ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL);
+    pbc = &b.pub;
+
+    codec_config_id = ffio_read_leb(pbc);
+    codec_id = avio_rb32(pbc);
+    nb_samples = ffio_read_leb(pbc);
+    seek_preroll = avio_rb16(pbc);
+
+    switch(codec_id) {
+    case MKBETAG('O','p','u','s'):
+        avcodec_id = AV_CODEC_ID_OPUS;
+        break;
+    case MKBETAG('m','p','4','a'):
+        avcodec_id = AV_CODEC_ID_AAC;
+        break;
+    case MKBETAG('f','L','a','C'):
+        avcodec_id = AV_CODEC_ID_FLAC;
+        break;
+    default:
+        avcodec_id = AV_CODEC_ID_NONE;
+        break;
+    }
+
+    for (int i = 0; i < c->nb_codec_configs; i++)
+        if (c->codec_configs[i]->codec_config_id == codec_config_id) {
+            ret = AVERROR_INVALIDDATA;
+            goto fail;
+        }
+
+    tmp = av_realloc_array(c->codec_configs, c->nb_codec_configs + 1, sizeof(*c->codec_configs));
+    if (!tmp) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+    c->codec_configs = tmp;
+
+    codec_config = av_mallocz(sizeof(*codec_config));
+    if (!codec_config) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    codec_config->codec_config_id = codec_config_id;
+    codec_config->codec_id = avcodec_id;
+    codec_config->nb_samples = nb_samples;
+    codec_config->seek_preroll = seek_preroll;
+
+    switch(codec_id) {
+    case MKBETAG('O','p','u','s'):
+        ret = opus_decoder_config(codec_config, pbc, len);
+        break;
+    case MKBETAG('m','p','4','a'):
+        ret = aac_decoder_config(codec_config, pbc, len, s);
+        break;
+    case MKBETAG('f','L','a','C'):
+        ret = flac_decoder_config(codec_config, pbc, len);
+        break;
+    case MKBETAG('i','p','c','m'):
+        ret = ipcm_decoder_config(codec_config, pbc, len);
+        break;
+    default:
+        break;
+    }
+    if (ret < 0)
+        goto fail;
+
+    c->codec_configs[c->nb_codec_configs++] = codec_config;
+
+    len -= avio_tell(pbc);
+    if (len)
+       av_log(s, AV_LOG_WARNING, "Underread in codec_config_obu. %d bytes left at the end\n", len);
+
+    ret = 0;
+fail:
+    av_free(buf);
+    if (ret < 0) {
+        if (codec_config)
+            av_free(codec_config->extradata);
+        av_free(codec_config);
+    }
+    return ret;
+}
+
+static int update_extradata(AVCodecParameters *codecpar)
+{
+    GetBitContext gb;
+    PutBitContext pb;
+    int ret;
+
+    switch(codecpar->codec_id) {
+    case AV_CODEC_ID_OPUS:
+        AV_WB8(codecpar->extradata + 9, codecpar->ch_layout.nb_channels);
+        break;
+    case AV_CODEC_ID_AAC: {
+        uint8_t buf[5];
+
+        init_put_bits(&pb, buf, sizeof(buf));
+        ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size);
+        if (ret < 0)
+            return ret;
+
+        ret = get_bits(&gb, 5);
+        put_bits(&pb, 5, ret);
+        if (ret == AOT_ESCAPE) // violates section 3.11.2, but better check for it
+            put_bits(&pb, 6, get_bits(&gb, 6));
+        ret = get_bits(&gb, 4);
+        put_bits(&pb, 4, ret);
+        if (ret == 0x0f)
+            put_bits(&pb, 24, get_bits(&gb, 24));
+
+        skip_bits(&gb, 4);
+        put_bits(&pb, 4, codecpar->ch_layout.nb_channels); // set channel config
+        ret = put_bits_left(&pb);
+        put_bits(&pb, ret, get_bits(&gb, ret));
+        flush_put_bits(&pb);
+
+        memcpy(codecpar->extradata, buf, sizeof(buf));
+        break;
+    }
+    case AV_CODEC_ID_FLAC: {
+        uint8_t buf[13];
+
+        init_put_bits(&pb, buf, sizeof(buf));
+        ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size);
+        if (ret < 0)
+            return ret;
+
+        put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize
+        put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize
+        put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate
+        skip_bits(&gb, 3);
+        put_bits(&pb, 3, codecpar->ch_layout.nb_channels - 1);
+        ret = put_bits_left(&pb);
+        put_bits(&pb, ret, get_bits(&gb, ret));
+        flush_put_bits(&pb);
+
+        memcpy(codecpar->extradata, buf, sizeof(buf));
+        break;
+    }
+    }
+
+    return 0;
+}
+
+static int scalable_channel_layout_config(void *s, AVIOContext *pb,
+                                          IAMFAudioElement *audio_element,
+                                          const IAMFCodecConfig *codec_config)
+{
+    int nb_layers, k = 0;
+
+    nb_layers = avio_r8(pb) >> 5; // get_bits(&gb, 3);
+    // skip_bits(&gb, 5); //reserved
+
+    if (nb_layers > 6)
+        return AVERROR_INVALIDDATA;
+
+    for (int i = 0; i < nb_layers; i++) {
+        AVIAMFLayer *layer;
+        int loudspeaker_layout, output_gain_is_present_flag;
+        int substream_count, coupled_substream_count;
+        int ret, byte = avio_r8(pb);
+
+        layer = av_iamf_audio_element_add_layer(audio_element->element);
+        if (!layer)
+            return AVERROR(ENOMEM);
+
+        loudspeaker_layout = byte >> 4; // get_bits(&gb, 4);
+        output_gain_is_present_flag = (byte >> 3) & 1; //get_bits1(&gb);
+        if ((byte >> 2) & 1)
+            layer->flags |= AV_IAMF_LAYER_FLAG_RECON_GAIN;
+        substream_count = avio_r8(pb);
+        coupled_substream_count = avio_r8(pb);
+
+        if (output_gain_is_present_flag) {
+            layer->output_gain_flags = avio_r8(pb) >> 2;  // get_bits(&gb, 6);
+            layer->output_gain = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8);
+        }
+
+        if (loudspeaker_layout < 10)
+            av_channel_layout_copy(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[loudspeaker_layout]);
+        else
+            layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_UNSPEC,
+                                                          .nb_channels = substream_count +
+                                                                         coupled_substream_count };
+
+        for (int j = 0; j < substream_count; j++) {
+            IAMFSubStream *substream = &audio_element->substreams[k++];
+
+            substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO :
+                                                                             (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO;
+
+            ret = update_extradata(substream->codecpar);
+            if (ret < 0)
+                return ret;
+        }
+
+    }
+
+    return 0;
+}
+
+static int ambisonics_config(void *s, AVIOContext *pb,
+                             IAMFAudioElement *audio_element,
+                             const IAMFCodecConfig *codec_config)
+{
+    AVIAMFLayer *layer;
+    unsigned ambisonics_mode;
+    int output_channel_count, substream_count, order;
+    int ret;
+
+    ambisonics_mode = ffio_read_leb(pb);
+    if (ambisonics_mode > 1)
+        return 0;
+
+    output_channel_count = avio_r8(pb);  // C
+    substream_count = avio_r8(pb);  // N
+    if (audio_element->nb_substreams != substream_count)
+        return AVERROR_INVALIDDATA;
+
+    order = floor(sqrt(output_channel_count - 1));
+    /* incomplete order - some harmonics are missing */
+    if ((order + 1) * (order + 1) != output_channel_count)
+        return AVERROR_INVALIDDATA;
+
+    layer = av_iamf_audio_element_add_layer(audio_element->element);
+    if (!layer)
+        return AVERROR(ENOMEM);
+
+    layer->ambisonics_mode = ambisonics_mode;
+    if (ambisonics_mode == 0) {
+        for (int i = 0; i < substream_count; i++) {
+            IAMFSubStream *substream = &audio_element->substreams[i];
+
+            substream->codecpar->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO;
+
+            ret = update_extradata(substream->codecpar);
+            if (ret < 0)
+                return ret;
+        }
+
+        layer->ch_layout.order = AV_CHANNEL_ORDER_CUSTOM;
+        layer->ch_layout.nb_channels = output_channel_count;
+        layer->ch_layout.u.map = av_calloc(output_channel_count, sizeof(*layer->ch_layout.u.map));
+        if (!layer->ch_layout.u.map)
+            return AVERROR(ENOMEM);
+
+        for (int i = 0; i < output_channel_count; i++)
+            layer->ch_layout.u.map[i].id = avio_r8(pb) + AV_CHAN_AMBISONIC_BASE;
+    } else {
+        int coupled_substream_count = avio_r8(pb);  // M
+        int nb_demixing_matrix = substream_count + coupled_substream_count;
+        int demixing_matrix_size = nb_demixing_matrix * output_channel_count;
+
+        layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_AMBISONIC, .nb_channels = output_channel_count };
+        layer->demixing_matrix = av_malloc_array(demixing_matrix_size, sizeof(*layer->demixing_matrix));
+        if (!layer->demixing_matrix)
+            return AVERROR(ENOMEM);
+
+        for (int i = 0; i < demixing_matrix_size; i++)
+            layer->demixing_matrix[i] = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8);
+
+        for (int i = 0; i < substream_count; i++) {
+            IAMFSubStream *substream = &audio_element->substreams[i];
+
+            substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO :
+                                                                             (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO;
+
+
+            ret = update_extradata(substream->codecpar);
+            if (ret < 0)
+                return ret;
+        }
+    }
+
+    return 0;
+}
+
+static int param_parse(void *s, IAMFContext *c, AVIOContext *pb,
+                       unsigned int param_definition_type,
+                       const AVIAMFAudioElement *audio_element,
+                       AVIAMFParamDefinition **out_param_definition)
+{
+    IAMFParamDefinition *param_definition = NULL;
+    AVIAMFParamDefinition *param;
+    unsigned int parameter_id, parameter_rate, param_definition_mode;
+    unsigned int duration = 0, constant_subblock_duration = 0, nb_subblocks = 0;
+    size_t param_size;
+
+    parameter_id = ffio_read_leb(pb);
+
+    for (int i = 0; i < c->nb_param_definitions; i++)
+        if (c->param_definitions[i]->param->parameter_id == parameter_id) {
+            param_definition = c->param_definitions[i];
+            break;
+        }
+
+    parameter_rate = ffio_read_leb(pb);
+    param_definition_mode = avio_r8(pb) >> 7;
+
+    if (param_definition_mode == 0) {
+        duration = ffio_read_leb(pb);
+        constant_subblock_duration = ffio_read_leb(pb);
+        if (constant_subblock_duration == 0)
+            nb_subblocks = ffio_read_leb(pb);
+        else
+            nb_subblocks = duration / constant_subblock_duration;
+    }
+
+    param = av_iamf_param_definition_alloc(param_definition_type, nb_subblocks, &param_size);
+    if (!param)
+        return AVERROR(ENOMEM);
+
+    for (int i = 0; i < nb_subblocks; i++) {
+        void *subblock = av_iamf_param_definition_get_subblock(param, i);
+        unsigned int subblock_duration = constant_subblock_duration;
+
+        if (constant_subblock_duration == 0)
+            subblock_duration = ffio_read_leb(pb);
+
+        switch (param_definition_type) {
+        case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: {
+            AVIAMFMixGain *mix = subblock;
+            mix->subblock_duration = subblock_duration;
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: {
+            AVIAMFDemixingInfo *demix = subblock;
+            demix->subblock_duration = subblock_duration;
+            // DemixingInfoParameterData
+            demix->dmixp_mode = avio_r8(pb) >> 5;
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: {
+            AVIAMFReconGain *recon = subblock;
+            recon->subblock_duration = subblock_duration;
+            break;
+        }
+        default:
+            av_free(param);
+            return AVERROR_INVALIDDATA;
+        }
+    }
+
+    param->parameter_id = parameter_id;
+    param->parameter_rate = parameter_rate;
+    param->param_definition_mode = param_definition_mode;
+    param->duration = duration;
+    param->constant_subblock_duration = constant_subblock_duration;
+    param->nb_subblocks = nb_subblocks;
+
+    if (param_definition) {
+        if (param_definition->param_size != param_size || memcmp(param_definition->param, param, param_size)) {
+            av_log(s, AV_LOG_ERROR, "Incosistent parameters for parameter_id %u\n", parameter_id);
+            av_free(param);
+            return AVERROR_INVALIDDATA;
+        }
+    } else {
+        IAMFParamDefinition **tmp = av_realloc_array(c->param_definitions, c->nb_param_definitions + 1,
+                                                     sizeof(*c->param_definitions));
+        if (!tmp) {
+            av_free(param);
+            return AVERROR(ENOMEM);
+        }
+        c->param_definitions = tmp;
+
+        param_definition = av_mallocz(sizeof(*param_definition));
+        if (!param_definition) {
+            av_free(param);
+            return AVERROR(ENOMEM);
+        }
+        param_definition->param = param;
+        param_definition->param_size = param_size;
+        param_definition->audio_element = audio_element;
+
+        c->param_definitions[c->nb_param_definitions++] = param_definition;
+    }
+
+    av_assert0(out_param_definition);
+    *out_param_definition = param;
+
+    return 0;
+}
+
+static IAMFCodecConfig *get_codec_config(IAMFContext *c, unsigned int codec_config_id)
+{
+    for (int i = 0; i < c->nb_codec_configs; i++) {
+        if (c->codec_configs[i]->codec_config_id == codec_config_id)
+            return c->codec_configs[i];
+    }
+
+    return NULL;
+}
+
+static int audio_element_obu(void *s, IAMFContext *c, AVIOContext *pb, int len)
+{
+    const IAMFCodecConfig *codec_config;
+    AVIAMFAudioElement *element;
+    IAMFAudioElement **tmp, *audio_element = NULL;
+    FFIOContext b;
+    AVIOContext *pbc;
+    uint8_t *buf;
+    unsigned audio_element_id, codec_config_id, num_parameters;
+    int audio_element_type, ret;
+
+    buf = av_malloc(len);
+    if (!buf)
+        return AVERROR(ENOMEM);
+
+    ret = avio_read(pb, buf, len);
+    if (ret != len) {
+        if (ret >= 0)
+            ret = AVERROR_INVALIDDATA;
+        goto fail;
+    }
+
+    ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL);
+    pbc = &b.pub;
+
+    audio_element_id = ffio_read_leb(pbc);
+
+    for (int i = 0; i < c->nb_audio_elements; i++)
+        if (c->audio_elements[i]->audio_element_id == audio_element_id) {
+            av_log(s, AV_LOG_ERROR, "Duplicate audio_element_id %d\n", audio_element_id);
+            ret = AVERROR_INVALIDDATA;
+            goto fail;
+        }
+
+    audio_element_type = avio_r8(pbc) >> 5;
+    codec_config_id = ffio_read_leb(pbc);
+
+    codec_config = get_codec_config(c, codec_config_id);
+    if (!codec_config) {
+        av_log(s, AV_LOG_ERROR, "Non existant codec config id %d referenced in an audio element\n", codec_config_id);
+        ret = AVERROR_INVALIDDATA;
+        goto fail;
+    }
+
+    if (codec_config->codec_id == AV_CODEC_ID_NONE) {
+        av_log(s, AV_LOG_DEBUG, "Unknown codec id referenced in an audio element. Ignoring\n");
+        ret = 0;
+        goto fail;
+    }
+
+    tmp = av_realloc_array(c->audio_elements, c->nb_audio_elements + 1, sizeof(*c->audio_elements));
+    if (!tmp) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+    c->audio_elements = tmp;
+
+    audio_element = av_mallocz(sizeof(*audio_element));
+    if (!audio_element) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    audio_element->nb_substreams = ffio_read_leb(pbc);
+    audio_element->codec_config_id = codec_config_id;
+    audio_element->audio_element_id = audio_element_id;
+    audio_element->substreams = av_calloc(audio_element->nb_substreams, sizeof(*audio_element->substreams));
+    if (!audio_element->substreams) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    element = audio_element->element = av_iamf_audio_element_alloc();
+    if (!element) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    element->audio_element_type = audio_element_type;
+
+    for (int i = 0; i < audio_element->nb_substreams; i++) {
+        IAMFSubStream *substream = &audio_element->substreams[i];
+
+        substream->codecpar = avcodec_parameters_alloc();
+        if (!substream->codecpar) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        substream->audio_substream_id = ffio_read_leb(pbc);
+
+        substream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
+        substream->codecpar->codec_id   = codec_config->codec_id;
+        substream->codecpar->frame_size = codec_config->nb_samples;
+        substream->codecpar->sample_rate = codec_config->sample_rate;
+        substream->codecpar->seek_preroll = codec_config->seek_preroll;
+
+        switch(substream->codecpar->codec_id) {
+        case AV_CODEC_ID_AAC:
+        case AV_CODEC_ID_FLAC:
+        case AV_CODEC_ID_OPUS:
+            substream->codecpar->extradata = av_malloc(codec_config->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE);
+            if (!substream->codecpar->extradata) {
+                ret = AVERROR(ENOMEM);
+                goto fail;
+            }
+            memcpy(substream->codecpar->extradata, codec_config->extradata, codec_config->extradata_size);
+            memset(substream->codecpar->extradata + codec_config->extradata_size, 0, AV_INPUT_BUFFER_PADDING_SIZE);
+            substream->codecpar->extradata_size = codec_config->extradata_size;
+            break;
+        }
+    }
+
+    num_parameters = ffio_read_leb(pbc);
+    if (num_parameters && audio_element_type != 0) {
+        av_log(s, AV_LOG_ERROR, "Audio Element parameter count %u is invalid"
+                                " for Scene representations\n", num_parameters);
+        ret = AVERROR_INVALIDDATA;
+        goto fail;
+    }
+
+    for (int i = 0; i < num_parameters; i++) {
+        unsigned param_definition_type;
+
+        param_definition_type = ffio_read_leb(pbc);
+        if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN) {
+            ret = AVERROR_INVALIDDATA;
+            goto fail;
+        } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_DEMIXING) {
+            ret = param_parse(s, c, pbc, param_definition_type, element, &element->demixing_info);
+            if (ret < 0)
+                goto fail;
+
+            element->default_w = avio_r8(pbc) >> 4;
+        } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) {
+            ret = param_parse(s, c, pbc, param_definition_type, element, &element->recon_gain_info);
+            if (ret < 0)
+                goto fail;
+        } else {
+            unsigned param_definition_size = ffio_read_leb(pbc);
+            avio_skip(pbc, param_definition_size);
+        }
+    }
+
+    if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) {
+        ret = scalable_channel_layout_config(s, pbc, audio_element, codec_config);
+        if (ret < 0)
+            goto fail;
+    } else if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) {
+        ret = ambisonics_config(s, pbc, audio_element, codec_config);
+        if (ret < 0)
+            goto fail;
+    } else {
+        unsigned audio_element_config_size = ffio_read_leb(pbc);
+        avio_skip(pbc, audio_element_config_size);
+    }
+
+    c->audio_elements[c->nb_audio_elements++] = audio_element;
+
+    len -= avio_tell(pbc);
+    if (len)
+       av_log(s, AV_LOG_WARNING, "Underread in audio_element_obu. %d bytes left at the end\n", len);
+
+    ret = 0;
+fail:
+    av_free(buf);
+    if (ret < 0)
+        ff_iamf_free_audio_element(&audio_element);
+    return ret;
+}
+
+static int label_string(AVIOContext *pb, char **label)
+{
+    uint8_t buf[128];
+
+    avio_get_str(pb, sizeof(buf), buf, sizeof(buf));
+
+    if (pb->error)
+        return pb->error;
+    if (pb->eof_reached)
+        return AVERROR_INVALIDDATA;
+    *label = av_strdup(buf);
+    if (!*label)
+        return AVERROR(ENOMEM);
+
+    return 0;
+}
+
+static int mix_presentation_obu(void *s, IAMFContext *c, AVIOContext *pb, int len)
+{
+    AVIAMFMixPresentation *mix;
+    IAMFMixPresentation **tmp, *mix_presentation = NULL;
+    FFIOContext b;
+    AVIOContext *pbc;
+    uint8_t *buf;
+    unsigned mix_presentation_id;
+    int ret;
+
+    buf = av_malloc(len);
+    if (!buf)
+        return AVERROR(ENOMEM);
+
+    ret = avio_read(pb, buf, len);
+    if (ret != len) {
+        if (ret >= 0)
+            ret = AVERROR_INVALIDDATA;
+        goto fail;
+    }
+
+    ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL);
+    pbc = &b.pub;
+
+    mix_presentation_id = ffio_read_leb(pbc);
+
+    for (int i = 0; i < c->nb_mix_presentations; i++)
+        if (c->mix_presentations[i]->mix_presentation_id == mix_presentation_id) {
+            av_log(s, AV_LOG_ERROR, "Duplicate mix_presentation_id %d\n", mix_presentation_id);
+            ret = AVERROR_INVALIDDATA;
+            goto fail;
+        }
+
+    tmp = av_realloc_array(c->mix_presentations, c->nb_mix_presentations + 1, sizeof(*c->mix_presentations));
+    if (!tmp) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+    c->mix_presentations = tmp;
+
+    mix_presentation = av_mallocz(sizeof(*mix_presentation));
+    if (!mix_presentation) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    mix_presentation->mix_presentation_id = mix_presentation_id;
+    mix = mix_presentation->mix = av_iamf_mix_presentation_alloc();
+    if (!mix) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    mix_presentation->count_label = ffio_read_leb(pbc);
+    mix_presentation->language_label = av_calloc(mix_presentation->count_label,
+                                                 sizeof(*mix_presentation->language_label));
+    if (!mix_presentation->language_label) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    for (int i = 0; i < mix_presentation->count_label; i++) {
+        ret = label_string(pbc, &mix_presentation->language_label[i]);
+        if (ret < 0)
+            goto fail;
+    }
+
+    for (int i = 0; i < mix_presentation->count_label; i++) {
+        char *annotation = NULL;
+        ret = label_string(pbc, &annotation);
+        if (ret < 0)
+            goto fail;
+        ret = av_dict_set(&mix->annotations, mix_presentation->language_label[i], annotation,
+                          AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE);
+        if (ret < 0)
+            goto fail;
+    }
+
+    mix->nb_submixes = ffio_read_leb(pbc);
+    mix->submixes = av_calloc(mix->nb_submixes, sizeof(*mix->submixes));
+    if (!mix->submixes) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    for (int i = 0; i < mix->nb_submixes; i++) {
+        AVIAMFSubmix *sub_mix;
+
+        sub_mix = mix->submixes[i] = av_mallocz(sizeof(*sub_mix));
+        if (!sub_mix) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        sub_mix->nb_elements = ffio_read_leb(pbc);
+        sub_mix->elements = av_calloc(sub_mix->nb_elements, sizeof(*sub_mix->elements));
+        if (!sub_mix->elements) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        for (int j = 0; j < sub_mix->nb_elements; j++) {
+            AVIAMFSubmixElement *submix_element;
+            IAMFAudioElement *audio_element = NULL;
+            unsigned int rendering_config_extension_size;
+
+            submix_element = sub_mix->elements[j] = av_mallocz(sizeof(*submix_element));
+            if (!submix_element) {
+                ret = AVERROR(ENOMEM);
+                goto fail;
+            }
+
+            submix_element->audio_element_id = ffio_read_leb(pbc);
+
+            for (int k = 0; k < c->nb_audio_elements; k++)
+                if (c->audio_elements[k]->audio_element_id == submix_element->audio_element_id) {
+                    audio_element = c->audio_elements[k];
+                    break;
+                }
+
+            if (!audio_element) {
+                av_log(s, AV_LOG_ERROR, "Invalid Audio Element with id %u referenced by Mix Parameters %u\n",
+                       submix_element->audio_element_id, mix_presentation_id);
+                ret = AVERROR_INVALIDDATA;
+                goto fail;
+            }
+
+            for (int k = 0; k < mix_presentation->count_label; k++) {
+                char *annotation = NULL;
+                ret = label_string(pbc, &annotation);
+                if (ret < 0)
+                    goto fail;
+                ret = av_dict_set(&submix_element->annotations, mix_presentation->language_label[k], annotation,
+                                  AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE);
+                if (ret < 0)
+                    goto fail;
+            }
+
+            submix_element->headphones_rendering_mode = avio_r8(pbc) >> 6;
+
+            rendering_config_extension_size = ffio_read_leb(pbc);
+            avio_skip(pbc, rendering_config_extension_size);
+
+            ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN,
+                              audio_element->element,
+                              &submix_element->element_mix_config);
+            if (ret < 0)
+                goto fail;
+            submix_element->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8);
+        }
+
+        ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, &sub_mix->output_mix_config);
+        if (ret < 0)
+            goto fail;
+        sub_mix->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8);
+
+        sub_mix->nb_layouts = ffio_read_leb(pbc);
+        sub_mix->layouts = av_calloc(sub_mix->nb_layouts, sizeof(*sub_mix->layouts));
+        if (!sub_mix->layouts) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        for (int j = 0; j < sub_mix->nb_layouts; j++) {
+            AVIAMFSubmixLayout *submix_layout;
+            int info_type;
+            int byte = avio_r8(pbc);
+
+            submix_layout = sub_mix->layouts[j] = av_mallocz(sizeof(*submix_layout));
+            if (!submix_layout) {
+                ret = AVERROR(ENOMEM);
+                goto fail;
+            }
+
+            submix_layout->layout_type = byte >> 6;
+            if (submix_layout->layout_type < AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS &&
+                submix_layout->layout_type > AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL) {
+                av_log(s, AV_LOG_ERROR, "Invalid Layout type %u in a submix from Mix Presentation %u\n",
+                       submix_layout->layout_type, mix_presentation_id);
+                ret = AVERROR_INVALIDDATA;
+                goto fail;
+            }
+            if (submix_layout->layout_type == 2) {
+                int sound_system;
+                sound_system = (byte >> 2) & 0xF;
+                av_channel_layout_copy(&submix_layout->sound_system, &ff_iamf_sound_system_map[sound_system].layout);
+            }
+
+            info_type = avio_r8(pbc);
+            submix_layout->integrated_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8);
+            submix_layout->digital_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8);
+
+            if (info_type & 1)
+                submix_layout->true_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8);
+            if (info_type & 2) {
+                unsigned int num_anchored_loudness = avio_r8(pbc);
+
+                for (int k = 0; k < num_anchored_loudness; k++) {
+                    unsigned int anchor_element = avio_r8(pbc);
+                    AVRational anchored_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8);
+                    if (anchor_element == IAMF_ANCHOR_ELEMENT_DIALOGUE)
+                        submix_layout->dialogue_anchored_loudness = anchored_loudness;
+                    else if (anchor_element <= IAMF_ANCHOR_ELEMENT_ALBUM)
+                        submix_layout->album_anchored_loudness = anchored_loudness;
+                    else
+                        av_log(s, AV_LOG_DEBUG, "Unknown anchor_element. Ignoring\n");
+                }
+            }
+
+            if (info_type & 0xFC) {
+                unsigned int info_type_size = ffio_read_leb(pbc);
+                avio_skip(pbc, info_type_size);
+            }
+        }
+    }
+
+    c->mix_presentations[c->nb_mix_presentations++] = mix_presentation;
+
+    len -= avio_tell(pbc);
+    if (len)
+        av_log(s, AV_LOG_WARNING, "Underread in mix_presentation_obu. %d bytes left at the end\n", len);
+
+    ret = 0;
+fail:
+    av_free(buf);
+    if (ret < 0)
+        ff_iamf_free_mix_presentation(&mix_presentation);
+    return ret;
+}
+
+int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size,
+                             unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type,
+                             unsigned *skip_samples, unsigned *discard_padding)
+{
+    GetBitContext gb;
+    int ret, extension_flag, trimming, start;
+    unsigned skip = 0, discard = 0;
+    unsigned size;
+
+    ret = init_get_bits8(&gb, buf, FFMIN(buf_size, MAX_IAMF_OBU_HEADER_SIZE));
+    if (ret < 0)
+        return ret;
+
+    *type          = get_bits(&gb, 5);
+    /*redundant      =*/ get_bits1(&gb);
+    trimming       = get_bits1(&gb);
+    extension_flag = get_bits1(&gb);
+
+    *obu_size = get_leb(&gb);
+    if (*obu_size > INT_MAX)
+        return AVERROR_INVALIDDATA;
+
+    start = get_bits_count(&gb) / 8;
+
+    if (trimming) {
+        discard = get_leb(&gb); // num_samples_to_trim_at_end
+        skip = get_leb(&gb); // num_samples_to_trim_at_start
+    }
+
+    if (skip_samples)
+        *skip_samples = skip;
+    if (discard_padding)
+        *discard_padding = discard;
+
+    if (extension_flag) {
+        unsigned int extension_bytes;
+        extension_bytes = get_leb(&gb);
+        if (extension_bytes > INT_MAX / 8)
+            return AVERROR_INVALIDDATA;
+        skip_bits_long(&gb, extension_bytes * 8);
+    }
+
+    if (get_bits_left(&gb) < 0)
+        return AVERROR_INVALIDDATA;
+
+    size = *obu_size + start;
+    if (size > INT_MAX)
+        return AVERROR_INVALIDDATA;
+
+    *obu_size -= get_bits_count(&gb) / 8 - start;
+    *start_pos = size - *obu_size;
+
+    return size;
+}
+
+int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb,
+                                int max_size, void *log_ctx)
+{
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE];
+    int ret;
+
+    while (1) {
+        unsigned obu_size;
+        enum IAMF_OBU_Type type;
+        int start_pos, len, size;
+
+        if ((ret = ffio_ensure_seekback(pb, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size))) < 0)
+            return ret;
+        size = avio_read(pb, header, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size));
+        if (size < 0)
+            return size;
+
+        len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type, NULL, NULL);
+        if (len < 0 || obu_size > max_size) {
+            av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu header\n");
+            avio_seek(pb, -size, SEEK_CUR);
+            return len;
+        }
+
+        if (type >= IAMF_OBU_IA_PARAMETER_BLOCK && type < IAMF_OBU_IA_SEQUENCE_HEADER) {
+            avio_seek(pb, -size, SEEK_CUR);
+            break;
+        }
+
+        avio_seek(pb, -(size - start_pos), SEEK_CUR);
+        switch (type) {
+        case IAMF_OBU_IA_CODEC_CONFIG:
+            ret = codec_config_obu(log_ctx, c, pb, obu_size);
+            break;
+        case IAMF_OBU_IA_AUDIO_ELEMENT:
+            ret = audio_element_obu(log_ctx, c, pb, obu_size);
+            break;
+        case IAMF_OBU_IA_MIX_PRESENTATION:
+            ret = mix_presentation_obu(log_ctx, c, pb, obu_size);
+            break;
+        case IAMF_OBU_IA_TEMPORAL_DELIMITER:
+            break;
+        default: {
+            int64_t offset = avio_skip(pb, obu_size);
+            if (offset < 0)
+                ret = offset;
+            break;
+        }
+        }
+        if (ret < 0) {
+            av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu type %d\n", type);
+            return ret;
+        }
+        max_size -= obu_size + start_pos;
+        if (max_size < 0)
+            return AVERROR_INVALIDDATA;
+        if (!max_size)
+            break;
+    }
+
+    return 0;
+}
diff --git a/libavformat/iamf_parse.h b/libavformat/iamf_parse.h
new file mode 100644
index 0000000000..f4f297ecd4
--- /dev/null
+++ b/libavformat/iamf_parse.h
@@ -0,0 +1,38 @@
+/*
+ * Immersive Audio Model and Formats parsing
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFORMAT_IAMF_PARSE_H
+#define AVFORMAT_IAMF_PARSE_H
+
+#include <stdint.h>
+
+#include "libavutil/iamf.h"
+#include "avio.h"
+#include "iamf.h"
+
+int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size,
+                             unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type,
+                             unsigned *skip_samples, unsigned *discard_padding);
+
+int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb,
+                                int size, void *log_ctx);
+
+#endif /* AVFORMAT_IAMF_PARSE_H */
diff --git a/libavformat/iamfdec.c b/libavformat/iamfdec.c
new file mode 100644
index 0000000000..97438f2662
--- /dev/null
+++ b/libavformat/iamfdec.c
@@ -0,0 +1,495 @@
+/*
+ * Immersive Audio Model and Formats demuxer
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config_components.h"
+
+#include "libavutil/avassert.h"
+#include "libavutil/iamf.h"
+#include "libavutil/intreadwrite.h"
+#include "libavutil/log.h"
+#include "libavcodec/mathops.h"
+#include "avformat.h"
+#include "avio_internal.h"
+#include "demux.h"
+#include "iamf.h"
+#include "iamf_parse.h"
+#include "internal.h"
+
+typedef struct IAMFDemuxContext {
+    IAMFContext iamf;
+
+    // Packet side data
+    AVIAMFParamDefinition *mix;
+    size_t mix_size;
+    AVIAMFParamDefinition *demix;
+    size_t demix_size;
+    AVIAMFParamDefinition *recon;
+    size_t recon_size;
+} IAMFDemuxContext;
+
+static AVStream *find_stream_by_id(AVFormatContext *s, int id)
+{
+    for (int i = 0; i < s->nb_streams; i++)
+        if (s->streams[i]->id == id)
+            return s->streams[i];
+
+    av_log(s, AV_LOG_ERROR, "Invalid stream id %d\n", id);
+    return NULL;
+}
+
+static int audio_frame_obu(AVFormatContext *s, AVPacket *pkt, int len,
+                           enum IAMF_OBU_Type type,
+                           unsigned skip_samples, unsigned discard_padding,
+                           int id_in_bitstream)
+{
+    const IAMFDemuxContext *const c = s->priv_data;
+    AVStream *st;
+    int ret, audio_substream_id;
+
+    if (id_in_bitstream) {
+        unsigned explicit_audio_substream_id;
+        int64_t pos = avio_tell(s->pb);
+        explicit_audio_substream_id = ffio_read_leb(s->pb);
+        len -= avio_tell(s->pb) - pos;
+        audio_substream_id = explicit_audio_substream_id;
+    } else
+        audio_substream_id = type - IAMF_OBU_IA_AUDIO_FRAME_ID0;
+
+    st = find_stream_by_id(s, audio_substream_id);
+    if (!st)
+        return AVERROR_INVALIDDATA;
+
+    ret = av_get_packet(s->pb, pkt, len);
+    if (ret < 0)
+        return ret;
+    if (ret != len)
+        return AVERROR_INVALIDDATA;
+
+    if (skip_samples || discard_padding) {
+        uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, 10);
+        if (!side_data)
+            return AVERROR(ENOMEM);
+        AV_WL32(side_data, skip_samples);
+        AV_WL32(side_data + 4, discard_padding);
+    }
+    if (c->mix) {
+        uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, c->mix_size);
+        if (!side_data)
+            return AVERROR(ENOMEM);
+        memcpy(side_data, c->mix, c->mix_size);
+    }
+    if (c->demix) {
+        uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, c->demix_size);
+        if (!side_data)
+            return AVERROR(ENOMEM);
+        memcpy(side_data, c->demix, c->demix_size);
+    }
+    if (c->recon) {
+        uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, c->recon_size);
+        if (!side_data)
+            return AVERROR(ENOMEM);
+        memcpy(side_data, c->recon, c->recon_size);
+    }
+
+    pkt->stream_index = st->index;
+    return 0;
+}
+
+static const IAMFParamDefinition *get_param_definition(AVFormatContext *s, unsigned int parameter_id)
+{
+    const IAMFDemuxContext *const c = s->priv_data;
+    const IAMFContext *const iamf = &c->iamf;
+    const IAMFParamDefinition *param_definition = NULL;
+
+    for (int i = 0; i < iamf->nb_param_definitions; i++)
+        if (iamf->param_definitions[i]->param->parameter_id == parameter_id) {
+            param_definition = iamf->param_definitions[i];
+            break;
+        }
+
+    return param_definition;
+}
+
+static int parameter_block_obu(AVFormatContext *s, int len)
+{
+    IAMFDemuxContext *const c = s->priv_data;
+    const IAMFParamDefinition *param_definition;
+    const AVIAMFParamDefinition *param;
+    AVIAMFParamDefinition *out_param = NULL;
+    FFIOContext b;
+    AVIOContext *pb;
+    uint8_t *buf;
+    unsigned int duration, constant_subblock_duration;
+    unsigned int nb_subblocks;
+    unsigned int parameter_id;
+    size_t out_param_size;
+    int ret;
+
+    buf = av_malloc(len);
+    if (!buf)
+        return AVERROR(ENOMEM);
+
+    ret = avio_read(s->pb, buf, len);
+    if (ret != len) {
+        if (ret >= 0)
+            ret = AVERROR_INVALIDDATA;
+        goto fail;
+    }
+
+    ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL);
+    pb = &b.pub;
+
+    parameter_id = ffio_read_leb(pb);
+    param_definition = get_param_definition(s, parameter_id);
+    if (!param_definition) {
+        av_log(s, AV_LOG_VERBOSE, "Non existant parameter_id %d referenced in a parameter block. Ignoring\n",
+               parameter_id);
+        ret = 0;
+        goto fail;
+    }
+
+    param = param_definition->param;
+    if (param->param_definition_mode) {
+        duration = ffio_read_leb(pb);
+        constant_subblock_duration = ffio_read_leb(pb);
+        if (constant_subblock_duration == 0)
+            nb_subblocks = ffio_read_leb(pb);
+        else
+            nb_subblocks = duration / constant_subblock_duration;
+    } else {
+        duration = param->duration;
+        constant_subblock_duration = param->constant_subblock_duration;
+        nb_subblocks = param->nb_subblocks;
+        if (!nb_subblocks)
+            nb_subblocks = duration / constant_subblock_duration;
+    }
+
+    out_param = av_iamf_param_definition_alloc(param->param_definition_type, nb_subblocks, &out_param_size);
+    if (!out_param) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    out_param->parameter_id = param->parameter_id;
+    out_param->param_definition_type = param->param_definition_type;
+    out_param->parameter_rate = param->parameter_rate;
+    out_param->param_definition_mode = param->param_definition_mode;
+    out_param->duration = duration;
+    out_param->constant_subblock_duration = constant_subblock_duration;
+    out_param->nb_subblocks = nb_subblocks;
+
+    for (int i = 0; i < nb_subblocks; i++) {
+        void *subblock = av_iamf_param_definition_get_subblock(out_param, i);
+        unsigned int subblock_duration = constant_subblock_duration;
+
+        if (param->param_definition_mode && !constant_subblock_duration)
+            subblock_duration = ffio_read_leb(pb);
+
+        switch (param->param_definition_type) {
+        case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: {
+            AVIAMFMixGain *mix = subblock;
+
+            mix->animation_type = ffio_read_leb(pb);
+            if (mix->animation_type > AV_IAMF_ANIMATION_TYPE_BEZIER) {
+                ret = 0;
+                av_free(out_param);
+                goto fail;
+            }
+
+            mix->start_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8);
+            if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR)
+                mix->end_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8);
+            if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) {
+                mix->control_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8);
+                mix->control_point_relative_time = av_make_q(avio_r8(pb), 1 << 8);
+            }
+            mix->subblock_duration = subblock_duration;
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: {
+            AVIAMFDemixingInfo *demix = subblock;
+
+            demix->dmixp_mode = avio_r8(pb) >> 5;
+            demix->subblock_duration = subblock_duration;
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: {
+            AVIAMFReconGain *recon = subblock;
+            const AVIAMFAudioElement *audio_element = param_definition->audio_element;
+
+            av_assert0(audio_element);
+            for (int i = 0; i < audio_element->nb_layers; i++) {
+                const AVIAMFLayer *layer = audio_element->layers[i];
+                if (layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN) {
+                    unsigned int recon_gain_flags = ffio_read_leb(pb);
+                    unsigned int bitcount = 7 + 5 * !!(recon_gain_flags & 0x80);
+                    recon_gain_flags = (recon_gain_flags & 0x7F) | ((recon_gain_flags & 0xFF00) >> 1);
+                    for (int j = 0; j < bitcount; j++) {
+                        if (recon_gain_flags & (1 << j))
+                            recon->recon_gain[i][j] = avio_r8(pb);
+                    }
+                }
+            }
+            recon->subblock_duration = subblock_duration;
+            break;
+        }
+        default:
+            av_assert0(0);
+        }
+    }
+
+    len -= avio_tell(pb);
+    if (len) {
+       int level = (s->error_recognition & AV_EF_EXPLODE) ? AV_LOG_ERROR : AV_LOG_WARNING;
+       av_log(s, level, "Underread in parameter_block_obu. %d bytes left at the end\n", len);
+    }
+
+    switch (param->param_definition_type) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        av_free(c->mix);
+        c->mix = out_param;
+        c->mix_size = out_param_size;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        av_free(c->demix);
+        c->demix = out_param;
+        c->demix_size = out_param_size;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        av_free(c->recon);
+        c->recon = out_param;
+        c->recon_size = out_param_size;
+        break;
+    default:
+        av_assert0(0);
+    }
+
+    ret = 0;
+fail:
+    if (ret < 0)
+        av_free(out_param);
+    av_free(buf);
+
+    return ret;
+}
+
+static int iamf_read_packet(AVFormatContext *s, AVPacket *pkt)
+{
+    IAMFDemuxContext *const c = s->priv_data;
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE];
+    unsigned obu_size;
+    int ret;
+
+    while (1) {
+        enum IAMF_OBU_Type type;
+        unsigned skip_samples, discard_padding;
+        int len, size, start_pos;
+
+        if ((ret = ffio_ensure_seekback(s->pb, MAX_IAMF_OBU_HEADER_SIZE)) < 0)
+            return ret;
+        size = avio_read(s->pb, header, MAX_IAMF_OBU_HEADER_SIZE);
+        if (size < 0)
+            return size;
+
+        len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type,
+                                       &skip_samples, &discard_padding);
+        if (len < 0) {
+            av_log(s, AV_LOG_ERROR, "Failed to read obu\n");
+            return len;
+        }
+        avio_seek(s->pb, -(size - start_pos), SEEK_CUR);
+
+        if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17)
+            return audio_frame_obu(s, pkt, obu_size, type,
+                                   skip_samples, discard_padding,
+                                   type == IAMF_OBU_IA_AUDIO_FRAME);
+        else if (type == IAMF_OBU_IA_PARAMETER_BLOCK) {
+            ret = parameter_block_obu(s, obu_size);
+            if (ret < 0)
+                return ret;
+        } else if (type == IAMF_OBU_IA_TEMPORAL_DELIMITER) {
+            av_freep(&c->mix);
+            c->mix_size = 0;
+            av_freep(&c->demix);
+            c->demix_size = 0;
+            av_freep(&c->recon);
+            c->recon_size = 0;
+        } else {
+            int64_t offset = avio_skip(s->pb, obu_size);
+            if (offset < 0) {
+                ret = offset;
+                break;
+            }
+        }
+    }
+
+    return ret;
+}
+
+//return < 0 if we need more data
+static int get_score(const uint8_t *buf, int buf_size, enum IAMF_OBU_Type type, int *seq)
+{
+    if (type == IAMF_OBU_IA_SEQUENCE_HEADER) {
+        if (buf_size < 4 || AV_RB32(buf) != MKBETAG('i','a','m','f'))
+            return 0;
+        *seq = 1;
+        return -1;
+    }
+    if (type >= IAMF_OBU_IA_CODEC_CONFIG && type <= IAMF_OBU_IA_TEMPORAL_DELIMITER)
+        return *seq ? -1 : 0;
+    if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17)
+        return *seq ? AVPROBE_SCORE_EXTENSION + 1 : 0;
+    return 0;
+}
+
+static int iamf_probe(const AVProbeData *p)
+{
+    unsigned obu_size;
+    enum IAMF_OBU_Type type;
+    int seq = 0, cnt = 0, start_pos;
+    int ret;
+
+    while (1) {
+        int size = ff_iamf_parse_obu_header(p->buf + cnt, p->buf_size - cnt,
+                                            &obu_size, &start_pos, &type,
+                                            NULL, NULL);
+        if (size < 0)
+            return 0;
+
+        ret = get_score(p->buf + cnt + start_pos,
+                        p->buf_size - cnt - start_pos,
+                        type, &seq);
+        if (ret >= 0)
+            return ret;
+
+        cnt += FFMIN(size, p->buf_size - cnt);
+    }
+    return 0;
+}
+
+static int iamf_read_header(AVFormatContext *s)
+{
+    IAMFDemuxContext *const c = s->priv_data;
+    IAMFContext *const iamf = &c->iamf;
+    int ret;
+
+    ret = ff_iamfdec_read_descriptors(iamf, s->pb, INT_MAX, s);
+    if (ret < 0)
+        return ret;
+
+    for (int i = 0; i < iamf->nb_audio_elements; i++) {
+        IAMFAudioElement *audio_element = iamf->audio_elements[i];
+        AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, NULL);
+
+        if (!stg)
+            return AVERROR(ENOMEM);
+
+        stg->id = audio_element->audio_element_id;
+        stg->params.iamf_audio_element = audio_element->element;
+        audio_element->element = NULL;
+
+        for (int j = 0; j < audio_element->nb_substreams; j++) {
+            IAMFSubStream *substream = &audio_element->substreams[j];
+            AVStream *st = avformat_new_stream(s, NULL);
+
+            if (!st)
+                return AVERROR(ENOMEM);
+
+            ret = avformat_stream_group_add_stream(stg, st);
+            if (ret < 0)
+                return ret;
+
+            ret = avcodec_parameters_copy(st->codecpar, substream->codecpar);
+            if (ret < 0)
+                return ret;
+
+            st->id = substream->audio_substream_id;
+            avpriv_set_pts_info(st, 64, 1, st->codecpar->sample_rate);
+        }
+    }
+
+    for (int i = 0; i < iamf->nb_mix_presentations; i++) {
+        IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i];
+        AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, NULL);
+        const AVIAMFMixPresentation *mix = mix_presentation->mix;
+
+        if (!stg)
+            return AVERROR(ENOMEM);
+
+        stg->id = mix_presentation->mix_presentation_id;
+        stg->params.iamf_mix_presentation = mix_presentation->mix;
+        mix_presentation->mix = NULL;
+
+        for (int j = 0; j < mix->nb_submixes; j++) {
+            AVIAMFSubmix *sub_mix = mix->submixes[j];
+
+            for (int k = 0; k < sub_mix->nb_elements; k++) {
+                AVIAMFSubmixElement *submix_element = sub_mix->elements[k];
+                AVStreamGroup *audio_element = NULL;
+
+                for (int l = 0; l < s->nb_stream_groups; l++)
+                    if (s->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT &&
+                        s->stream_groups[l]->id == submix_element->audio_element_id) {
+                        audio_element = s->stream_groups[l];
+                        break;
+                    }
+                av_assert0(audio_element);
+
+                for (int l = 0; l < audio_element->nb_streams; l++) {
+                    ret = avformat_stream_group_add_stream(stg, audio_element->streams[l]);
+                    if (ret < 0 && ret != AVERROR(EEXIST))
+                        return ret;
+                }
+            }
+        }
+    }
+
+    return 0;
+}
+
+static int iamf_read_close(AVFormatContext *s)
+{
+    IAMFDemuxContext *const c = s->priv_data;
+
+    ff_iamf_uninit_context(&c->iamf);
+
+    av_freep(&c->mix);
+    c->mix_size = 0;
+    av_freep(&c->demix);
+    c->demix_size = 0;
+    av_freep(&c->recon);
+    c->recon_size = 0;
+
+    return 0;
+}
+
+const AVInputFormat ff_iamf_demuxer = {
+    .name           = "iamf",
+    .long_name      = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"),
+    .priv_data_size = sizeof(IAMFDemuxContext),
+    .flags_internal = FF_FMT_INIT_CLEANUP,
+    .read_probe     = iamf_probe,
+    .read_header    = iamf_read_header,
+    .read_packet    = iamf_read_packet,
+    .read_close     = iamf_read_close,
+    .extensions     = "iamf",
+    .flags          = AVFMT_GENERIC_INDEX | AVFMT_NO_BYTE_SEEK | AVFMT_NOTIMESTAMPS | AVFMT_SHOW_IDS,
+};
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 8/8] avformat: Immersive Audio Model and Formats muxer
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
                   ` (6 preceding siblings ...)
  2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 7/8] avformat: Immersive Audio Model and Formats demuxer James Almer
@ 2023-12-05 22:44 ` James Almer
  2023-12-10 21:52 ` [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
  8 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-05 22:44 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavformat/Makefile      |   1 +
 libavformat/allformats.c  |   1 +
 libavformat/iamf_writer.c | 823 ++++++++++++++++++++++++++++++++++++++
 libavformat/iamf_writer.h |  51 +++
 libavformat/iamfenc.c     | 388 ++++++++++++++++++
 5 files changed, 1264 insertions(+)
 create mode 100644 libavformat/iamf_writer.c
 create mode 100644 libavformat/iamf_writer.h
 create mode 100644 libavformat/iamfenc.c

diff --git a/libavformat/Makefile b/libavformat/Makefile
index f23c22792b..581e378d95 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -259,6 +259,7 @@ OBJS-$(CONFIG_HLS_DEMUXER)               += hls.o hls_sample_encryption.o
 OBJS-$(CONFIG_HLS_MUXER)                 += hlsenc.o hlsplaylist.o avc.o
 OBJS-$(CONFIG_HNM_DEMUXER)               += hnm.o
 OBJS-$(CONFIG_IAMF_DEMUXER)              += iamfdec.o iamf_parse.o iamf.o
+OBJS-$(CONFIG_IAMF_MUXER)                += iamfenc.o iamf_writer.o iamf.o
 OBJS-$(CONFIG_ICO_DEMUXER)               += icodec.o
 OBJS-$(CONFIG_ICO_MUXER)                 += icoenc.o
 OBJS-$(CONFIG_IDCIN_DEMUXER)             += idcin.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index 6e520b78a6..ce6be5f04d 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -213,6 +213,7 @@ extern const AVInputFormat  ff_hls_demuxer;
 extern const FFOutputFormat ff_hls_muxer;
 extern const AVInputFormat  ff_hnm_demuxer;
 extern const AVInputFormat  ff_iamf_demuxer;
+extern const FFOutputFormat ff_iamf_muxer;
 extern const AVInputFormat  ff_ico_demuxer;
 extern const FFOutputFormat ff_ico_muxer;
 extern const AVInputFormat  ff_idcin_demuxer;
diff --git a/libavformat/iamf_writer.c b/libavformat/iamf_writer.c
new file mode 100644
index 0000000000..fc31174b53
--- /dev/null
+++ b/libavformat/iamf_writer.c
@@ -0,0 +1,823 @@
+/*
+ * Immersive Audio Model and Formats muxing helpers and structs
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/channel_layout.h"
+#include "libavutil/intreadwrite.h"
+#include "libavutil/iamf.h"
+#include "libavutil/mem.h"
+#include "libavcodec/get_bits.h"
+#include "libavcodec/flac.h"
+#include "libavcodec/mpeg4audio.h"
+#include "libavcodec/put_bits.h"
+#include "avformat.h"
+#include "avio_internal.h"
+#include "iamf.h"
+#include "iamf_writer.h"
+
+
+static int update_extradata(IAMFCodecConfig *codec_config)
+{
+    GetBitContext gb;
+    PutBitContext pb;
+    int ret;
+
+    switch(codec_config->codec_id) {
+    case AV_CODEC_ID_OPUS:
+        if (codec_config->extradata_size < 19)
+            return AVERROR_INVALIDDATA;
+        codec_config->extradata_size -= 8;
+        memmove(codec_config->extradata, codec_config->extradata + 8, codec_config->extradata_size);
+        AV_WB8(codec_config->extradata + 1, 2); // set channels to stereo
+        break;
+    case AV_CODEC_ID_FLAC: {
+        uint8_t buf[13];
+
+        init_put_bits(&pb, buf, sizeof(buf));
+        ret = init_get_bits8(&gb, codec_config->extradata, codec_config->extradata_size);
+        if (ret < 0)
+            return ret;
+
+        put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize
+        put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize
+        put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate
+        skip_bits(&gb, 3);
+        put_bits(&pb, 3, 1); // set channels to stereo
+        ret = put_bits_left(&pb);
+        put_bits(&pb, ret, get_bits(&gb, ret));
+        flush_put_bits(&pb);
+
+        memcpy(codec_config->extradata, buf, sizeof(buf));
+        break;
+    }
+    default:
+        break;
+    }
+
+    return 0;
+}
+
+static int fill_codec_config(IAMFContext *iamf, const AVStreamGroup *stg,
+                             IAMFCodecConfig *codec_config)
+{
+    const AVStream *st = stg->streams[0];
+    IAMFCodecConfig **tmp;
+    int j, ret = 0;
+
+    codec_config->codec_id = st->codecpar->codec_id;
+    codec_config->sample_rate = st->codecpar->sample_rate;
+    codec_config->codec_tag = st->codecpar->codec_tag;
+    codec_config->nb_samples = st->codecpar->frame_size;
+    codec_config->seek_preroll = st->codecpar->seek_preroll;
+    if (st->codecpar->extradata_size) {
+        codec_config->extradata = av_memdup(st->codecpar->extradata, st->codecpar->extradata_size);
+        if (!codec_config->extradata)
+            return AVERROR(ENOMEM);
+        codec_config->extradata_size = st->codecpar->extradata_size;
+        ret = update_extradata(codec_config);
+        if (ret < 0)
+            goto fail;
+    }
+
+    for (j = 0; j < iamf->nb_codec_configs; j++) {
+        if (!memcmp(iamf->codec_configs[j], codec_config, offsetof(IAMFCodecConfig, extradata)) &&
+            (!codec_config->extradata_size || !memcmp(iamf->codec_configs[j]->extradata,
+                                                      codec_config->extradata, codec_config->extradata_size)))
+            break;
+    }
+
+    if (j < iamf->nb_codec_configs) {
+        av_free(iamf->codec_configs[j]->extradata);
+        av_free(iamf->codec_configs[j]);
+        iamf->codec_configs[j] = codec_config;
+        return j;
+    }
+
+    tmp = av_realloc_array(iamf->codec_configs, iamf->nb_codec_configs + 1, sizeof(*iamf->codec_configs));
+    if (!tmp) {
+        ret = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    iamf->codec_configs = tmp;
+    iamf->codec_configs[iamf->nb_codec_configs] = codec_config;
+    codec_config->codec_config_id = iamf->nb_codec_configs;
+
+    return iamf->nb_codec_configs++;
+
+fail:
+    av_freep(&codec_config->extradata);
+    return ret;
+}
+
+static IAMFParamDefinition *add_param_definition(IAMFContext *iamf, AVIAMFParamDefinition *param)
+{
+    IAMFParamDefinition **tmp, *param_definition;
+
+    tmp = av_realloc_array(iamf->param_definitions, iamf->nb_param_definitions + 1,
+                           sizeof(*iamf->param_definitions));
+    if (!tmp)
+        return NULL;
+
+    iamf->param_definitions = tmp;
+
+    param_definition = av_mallocz(sizeof(*param_definition));
+    if (!param_definition)
+        return NULL;
+
+    param_definition->param = param;
+    iamf->param_definitions[iamf->nb_param_definitions++] = param_definition;
+
+    return param_definition;
+}
+
+int ff_iamf_add_audio_element(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx)
+{
+    const AVIAMFAudioElement *iamf_audio_element;
+    IAMFAudioElement **tmp, *audio_element;
+    IAMFCodecConfig *codec_config;
+    int ret;
+
+    if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT)
+        return AVERROR(EINVAL);
+
+    iamf_audio_element = stg->params.iamf_audio_element;
+    if (iamf_audio_element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) {
+        const AVIAMFLayer *layer = iamf_audio_element->layers[0];
+        if (iamf_audio_element->nb_layers != 1) {
+            av_log(log_ctx, AV_LOG_ERROR, "Invalid amount of layers for SCENE_BASED audio element. Must be 1\n");
+            return AVERROR(EINVAL);
+        }
+        if (layer->ch_layout.order != AV_CHANNEL_ORDER_CUSTOM &&
+            layer->ch_layout.order != AV_CHANNEL_ORDER_AMBISONIC) {
+            av_log(log_ctx, AV_LOG_ERROR, "Invalid channel layout for SCENE_BASED audio element\n");
+            return AVERROR(EINVAL);
+        }
+        if (layer->ambisonics_mode >= AV_IAMF_AMBISONICS_MODE_PROJECTION) {
+            av_log(log_ctx, AV_LOG_ERROR, "Unsuported ambisonics mode %d\n", layer->ambisonics_mode);
+            return AVERROR_PATCHWELCOME;
+        }
+        for (int i = 0; i < stg->nb_streams; i++) {
+            if (stg->streams[i]->codecpar->ch_layout.nb_channels > 1) {
+                av_log(log_ctx, AV_LOG_ERROR, "Invalid amount of channels in a stream for MONO mode ambisonics\n");
+                return AVERROR(EINVAL);
+            }
+        }
+    } else
+        for (int j, i = 0; i < iamf_audio_element->nb_layers; i++) {
+            const AVIAMFLayer *layer = iamf_audio_element->layers[i];
+            for (j = 0; j < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); j++)
+                if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[j]))
+                    break;
+
+            if (j >= FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts)) {
+                av_log(log_ctx, AV_LOG_ERROR, "Unsupported channel layout in stream group #%d\n", i);
+                return AVERROR(EINVAL);
+            }
+        }
+
+    for (int i = 0; i < iamf->nb_audio_elements; i++) {
+        if (stg->id == iamf->audio_elements[i]->audio_element_id) {
+            av_log(log_ctx, AV_LOG_ERROR, "Duplicated Audio Element id %"PRId64"\n", stg->id);
+            return AVERROR(EINVAL);
+        }
+    }
+
+    codec_config = av_mallocz(sizeof(*codec_config));
+    if (!codec_config)
+        return AVERROR(ENOMEM);
+
+    ret = fill_codec_config(iamf, stg, codec_config);
+    if (ret < 0) {
+        av_free(codec_config);
+        return ret;
+    }
+
+    audio_element = av_mallocz(sizeof(*audio_element));
+    if (!audio_element)
+        return AVERROR(ENOMEM);
+
+    audio_element->element = stg->params.iamf_audio_element;
+    audio_element->audio_element_id = stg->id;
+    audio_element->codec_config_id = ret;
+
+    audio_element->substreams = av_calloc(stg->nb_streams, sizeof(*audio_element->substreams));
+    if (!audio_element->substreams)
+        return AVERROR(ENOMEM);
+    audio_element->nb_substreams = stg->nb_streams;
+
+    audio_element->layers = av_calloc(iamf_audio_element->nb_layers, sizeof(*audio_element->layers));
+    if (!audio_element->layers)
+        return AVERROR(ENOMEM);
+
+    for (int i = 0, j = 0; i < iamf_audio_element->nb_layers; i++) {
+        int nb_channels = iamf_audio_element->layers[i]->ch_layout.nb_channels;
+
+        IAMFLayer *layer = &audio_element->layers[i];
+        if (!layer)
+            return AVERROR(ENOMEM);
+        memset(layer, 0, sizeof(*layer));
+
+        if (i)
+            nb_channels -= iamf_audio_element->layers[i - 1]->ch_layout.nb_channels;
+        for (; nb_channels > 0 && j < stg->nb_streams; j++) {
+            const AVStream *st = stg->streams[j];
+            IAMFSubStream *substream = &audio_element->substreams[j];
+
+            substream->audio_substream_id = st->id;
+            layer->substream_count++;
+            layer->coupled_substream_count += st->codecpar->ch_layout.nb_channels == 2;
+            nb_channels -= st->codecpar->ch_layout.nb_channels;
+        }
+        if (nb_channels) {
+            av_log(log_ctx, AV_LOG_ERROR, "Invalid channel count across substreams in layer %u from stream group %u\n",
+                   i, stg->index);
+            return AVERROR(EINVAL);
+        }
+    }
+
+    if (iamf_audio_element->demixing_info) {
+        AVIAMFParamDefinition *param = iamf_audio_element->demixing_info;
+        IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id);
+
+        if (param->nb_subblocks != 1) {
+            av_log(log_ctx, AV_LOG_ERROR, "nb_subblocks in demixing_info for stream group %u is not 1\n", stg->index);
+            return AVERROR(EINVAL);
+        }
+    if (!param_definition) {
+            param_definition = add_param_definition(iamf, param);
+            if (!param_definition)
+                return AVERROR(ENOMEM);
+        }
+        param_definition->audio_element = iamf_audio_element;
+    }
+    if (iamf_audio_element->recon_gain_info) {
+        AVIAMFParamDefinition *param = iamf_audio_element->recon_gain_info;
+        IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id);
+
+        if (param->nb_subblocks != 1) {
+            av_log(log_ctx, AV_LOG_ERROR, "nb_subblocks in recon_gain_info for stream group %u is not 1\n", stg->index);
+            return AVERROR(EINVAL);
+        }
+
+        if (!param_definition) {
+            param_definition = add_param_definition(iamf, param);
+            if (!param_definition)
+                return AVERROR(ENOMEM);
+        }
+        param_definition->audio_element = iamf_audio_element;
+    }
+
+    tmp = av_realloc_array(iamf->audio_elements, iamf->nb_audio_elements + 1, sizeof(*iamf->audio_elements));
+    if (!tmp)
+        return AVERROR(ENOMEM);
+
+    iamf->audio_elements = tmp;
+    iamf->audio_elements[iamf->nb_audio_elements++] = audio_element;
+
+    return 0;
+}
+
+int ff_iamf_add_mix_presentation(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx)
+{
+    IAMFMixPresentation **tmp, *mix_presentation;
+
+    if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION)
+        return AVERROR(EINVAL);
+
+    for (int i = 0; i < iamf->nb_mix_presentations; i++) {
+        if (stg->id == iamf->mix_presentations[i]->mix_presentation_id) {
+            av_log(log_ctx, AV_LOG_ERROR, "Duplicate Mix Presentation id %"PRId64"\n", stg->id);
+            return AVERROR(EINVAL);
+        }
+    }
+
+    mix_presentation = av_mallocz(sizeof(*mix_presentation));
+    if (!mix_presentation)
+        return AVERROR(ENOMEM);
+
+    mix_presentation->mix = stg->params.iamf_mix_presentation;
+    mix_presentation->mix_presentation_id = stg->id;
+
+    for (int i = 0; i < mix_presentation->mix->nb_submixes; i++) {
+        const AVIAMFSubmix *submix = mix_presentation->mix->submixes[i];
+        AVIAMFParamDefinition *param = submix->output_mix_config;
+        IAMFParamDefinition *param_definition;
+
+        if (!param) {
+            av_log(log_ctx, AV_LOG_ERROR, "output_mix_config is not present in submix %u from "
+                                          "Mix Presentation ID %"PRId64"\n", i, stg->id);
+            return AVERROR(EINVAL);
+        }
+
+        param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id);
+        if (!param_definition) {
+            param_definition = add_param_definition(iamf, param);
+            if (!param_definition)
+                return AVERROR(ENOMEM);
+        }
+
+        for (int j = 0; j < submix->nb_elements; j++) {
+            const AVIAMFAudioElement *iamf_audio_element = NULL;
+            const AVIAMFSubmixElement *element = submix->elements[j];
+            param = element->element_mix_config;
+
+            if (!param) {
+                av_log(log_ctx, AV_LOG_ERROR, "element_mix_config is not present for element %u in submix %u from "
+                                              "Mix Presentation ID %"PRId64"\n", j, i, stg->id);
+                return AVERROR(EINVAL);
+            }
+            param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id);
+            if (!param_definition) {
+                param_definition = add_param_definition(iamf, param);
+                if (!param_definition)
+                    return AVERROR(ENOMEM);
+            }
+            for (int k = 0; k < iamf->nb_audio_elements; k++)
+                if (iamf->audio_elements[k]->audio_element_id == element->audio_element_id) {
+                    iamf_audio_element = iamf->audio_elements[k]->element;
+                    break;
+                }
+            param_definition->audio_element = iamf_audio_element;
+        }
+    }
+
+    tmp = av_realloc_array(iamf->mix_presentations, iamf->nb_mix_presentations + 1, sizeof(*iamf->mix_presentations));
+    if (!tmp)
+        return AVERROR(ENOMEM);
+
+    iamf->mix_presentations = tmp;		
+    iamf->mix_presentations[iamf->nb_mix_presentations++] = mix_presentation;
+
+    return 0;
+}
+
+static int iamf_write_codec_config(const IAMFContext *iamf,
+                                   const IAMFCodecConfig *codec_config,
+                                   AVIOContext *pb)
+{
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    PutBitContext pbc;
+    int dyn_size;
+
+    int ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    ffio_write_leb(dyn_bc, codec_config->codec_config_id);
+    avio_wl32(dyn_bc, codec_config->codec_tag);
+
+    ffio_write_leb(dyn_bc, codec_config->nb_samples);
+    avio_wb16(dyn_bc, codec_config->seek_preroll);
+
+    switch(codec_config->codec_id) {
+    case AV_CODEC_ID_OPUS:
+        avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size);
+        break;
+    case AV_CODEC_ID_AAC:
+        return AVERROR_PATCHWELCOME;
+    case AV_CODEC_ID_FLAC:
+        avio_w8(dyn_bc, 0x80);
+        avio_wb24(dyn_bc, codec_config->extradata_size);
+        avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size);
+        break;
+    case AV_CODEC_ID_PCM_S16LE:
+        avio_w8(dyn_bc, 0);
+        avio_w8(dyn_bc, 16);
+        avio_wb32(dyn_bc, codec_config->sample_rate);
+        break;
+    case AV_CODEC_ID_PCM_S24LE:
+        avio_w8(dyn_bc, 0);
+        avio_w8(dyn_bc, 24);
+        avio_wb32(dyn_bc, codec_config->sample_rate);
+        break;
+    case AV_CODEC_ID_PCM_S32LE:
+        avio_w8(dyn_bc, 0);
+        avio_w8(dyn_bc, 32);
+        avio_wb32(dyn_bc, codec_config->sample_rate);
+        break;
+    case AV_CODEC_ID_PCM_S16BE:
+        avio_w8(dyn_bc, 1);
+        avio_w8(dyn_bc, 16);
+        avio_wb32(dyn_bc, codec_config->sample_rate);
+        break;
+    case AV_CODEC_ID_PCM_S24BE:
+        avio_w8(dyn_bc, 1);
+        avio_w8(dyn_bc, 24);
+        avio_wb32(dyn_bc, codec_config->sample_rate);
+        break;
+    case AV_CODEC_ID_PCM_S32BE:
+        avio_w8(dyn_bc, 1);
+        avio_w8(dyn_bc, 32);
+        avio_wb32(dyn_bc, codec_config->sample_rate);
+        break;
+    default:
+        break;
+    }
+
+    init_put_bits(&pbc, header, sizeof(header));
+    put_bits(&pbc, 5, IAMF_OBU_IA_CODEC_CONFIG);
+    put_bits(&pbc, 3, 0);
+    flush_put_bits(&pbc);
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    avio_write(pb, header, put_bytes_count(&pbc, 1));
+    ffio_write_leb(pb, dyn_size);
+    avio_write(pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    return 0;
+}
+
+static inline int rescale_rational(AVRational q, int b)
+{
+    return av_clip_int16(av_rescale(q.num, b, q.den));
+}
+
+static int scalable_channel_layout_config(const IAMFAudioElement *audio_element,
+                                          AVIOContext *dyn_bc)
+{
+    const AVIAMFAudioElement *element = audio_element->element;
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    PutBitContext pb;
+
+    init_put_bits(&pb, header, sizeof(header));
+    put_bits(&pb, 3, element->nb_layers);
+    put_bits(&pb, 5, 0);
+    flush_put_bits(&pb);
+    avio_write(dyn_bc, header, put_bytes_count(&pb, 1));
+    for (int i = 0; i < element->nb_layers; i++) {
+        AVIAMFLayer *layer = element->layers[i];
+        int layout;
+        for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); layout++) {
+            if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[layout]))
+                break;
+        }
+        init_put_bits(&pb, header, sizeof(header));
+        put_bits(&pb, 4, layout);
+        put_bits(&pb, 1, !!layer->output_gain_flags);
+        put_bits(&pb, 1, !!(layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN));
+        put_bits(&pb, 2, 0); // reserved
+        put_bits(&pb, 8, audio_element->layers[i].substream_count);
+        put_bits(&pb, 8, audio_element->layers[i].coupled_substream_count);
+        if (layer->output_gain_flags) {
+            put_bits(&pb, 6, layer->output_gain_flags);
+            put_bits(&pb, 2, 0);
+            put_bits(&pb, 16, rescale_rational(layer->output_gain, 1 << 8));
+        }
+        flush_put_bits(&pb);
+        avio_write(dyn_bc, header, put_bytes_count(&pb, 1));
+    }
+
+    return 0;
+}
+
+static int ambisonics_config(const IAMFAudioElement *audio_element,
+                             AVIOContext *dyn_bc)
+{
+    const AVIAMFAudioElement *element = audio_element->element;
+    AVIAMFLayer *layer = element->layers[0];
+
+    ffio_write_leb(dyn_bc, 0); // ambisonics_mode
+    ffio_write_leb(dyn_bc, layer->ch_layout.nb_channels); // output_channel_count
+    ffio_write_leb(dyn_bc, audio_element->nb_substreams); // substream_count
+
+    if (layer->ch_layout.order == AV_CHANNEL_ORDER_AMBISONIC)
+        for (int i = 0; i < layer->ch_layout.nb_channels; i++)
+            avio_w8(dyn_bc, i);
+    else
+        for (int i = 0; i < layer->ch_layout.nb_channels; i++)
+            avio_w8(dyn_bc, layer->ch_layout.u.map[i].id);
+
+    return 0;
+}
+
+static int param_definition(const AVIAMFParamDefinition *param,
+                            AVIOContext *dyn_bc)
+{
+    ffio_write_leb(dyn_bc, param->parameter_id);
+    ffio_write_leb(dyn_bc, param->parameter_rate);
+    avio_w8(dyn_bc, !!param->param_definition_mode << 7);
+    if (!param->param_definition_mode) {
+        ffio_write_leb(dyn_bc, param->duration);
+        ffio_write_leb(dyn_bc, param->constant_subblock_duration);
+        if (param->constant_subblock_duration == 0) {
+            ffio_write_leb(dyn_bc, param->nb_subblocks);
+            for (int i = 0; i < param->nb_subblocks; i++) {
+                const void *subblock = av_iamf_param_definition_get_subblock(param, i);
+
+                switch (param->param_definition_type) {
+                case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: {
+                    const AVIAMFMixGain *mix = subblock;
+                    ffio_write_leb(dyn_bc, mix->subblock_duration);
+                    break;
+                }
+                case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: {
+                    const AVIAMFDemixingInfo *demix = subblock;
+                    ffio_write_leb(dyn_bc, demix->subblock_duration);
+                    break;
+                }
+                case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: {
+                    const AVIAMFReconGain *recon = subblock;
+                    ffio_write_leb(dyn_bc, recon->subblock_duration);
+                    break;
+                }
+                }
+            }
+        }
+    }
+
+    return 0;
+}
+
+static int iamf_write_audio_element(const IAMFContext *iamf,
+                                    const IAMFAudioElement *audio_element,
+                                    AVIOContext *pb, void *log_ctx)
+{
+    const AVIAMFAudioElement *element = audio_element->element;
+    const IAMFCodecConfig *codec_config = iamf->codec_configs[audio_element->codec_config_id];
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    PutBitContext pbc;
+    int param_definition_types = AV_IAMF_PARAMETER_DEFINITION_DEMIXING, dyn_size;
+
+    int ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    ffio_write_leb(dyn_bc, audio_element->audio_element_id);
+
+    init_put_bits(&pbc, header, sizeof(header));
+    put_bits(&pbc, 3, element->audio_element_type);
+    put_bits(&pbc, 5, 0);
+    flush_put_bits(&pbc);
+    avio_write(dyn_bc, header, put_bytes_count(&pbc, 1));
+
+    ffio_write_leb(dyn_bc, audio_element->codec_config_id);
+    ffio_write_leb(dyn_bc, audio_element->nb_substreams);
+
+    for (int i = 0; i < audio_element->nb_substreams; i++)
+        ffio_write_leb(dyn_bc, audio_element->substreams[i].audio_substream_id);
+
+    if (element->nb_layers == 1)
+        param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_DEMIXING;
+    if (element->nb_layers > 1)
+        param_definition_types |= AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN;
+    if (codec_config->codec_tag == MKTAG('f','L','a','C') ||
+        codec_config->codec_tag == MKTAG('i','p','c','m'))
+        param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN;
+
+    ffio_write_leb(dyn_bc, av_popcount(param_definition_types)); // num_parameters
+
+    if (param_definition_types & 1) {
+        const AVIAMFParamDefinition *param = element->demixing_info;
+        const AVIAMFDemixingInfo *demix;
+
+        if (!param) {
+            av_log(log_ctx, AV_LOG_ERROR, "demixing_info needed but not set in Stream Group #%u\n",
+                   audio_element->audio_element_id);
+            return AVERROR(EINVAL);
+        }
+
+        demix = av_iamf_param_definition_get_subblock(param, 0);
+        ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_DEMIXING); // param_definition_type
+        param_definition(param, dyn_bc);
+
+        avio_w8(dyn_bc, demix->dmixp_mode << 5); // dmixp_mode
+        avio_w8(dyn_bc, element->default_w << 4); // default_w
+    }
+    if (param_definition_types & 2) {
+        const AVIAMFParamDefinition *param = element->recon_gain_info;
+
+        if (!param) {
+            av_log(log_ctx, AV_LOG_ERROR, "recon_gain_info needed but not set in Stream Group #%u\n",
+                   audio_element->audio_element_id);
+            return AVERROR(EINVAL);
+        }
+        ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN); // param_definition_type
+        param_definition(param, dyn_bc);
+    }
+
+    if (element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) {
+        ret = scalable_channel_layout_config(audio_element, dyn_bc);
+        if (ret < 0)
+            return ret;
+    } else {
+        ret = ambisonics_config(audio_element, dyn_bc);
+        if (ret < 0)
+            return ret;
+    }
+
+    init_put_bits(&pbc, header, sizeof(header));
+    put_bits(&pbc, 5, IAMF_OBU_IA_AUDIO_ELEMENT);
+    put_bits(&pbc, 3, 0);
+    flush_put_bits(&pbc);
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    avio_write(pb, header, put_bytes_count(&pbc, 1));
+    ffio_write_leb(pb, dyn_size);
+    avio_write(pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    return 0;
+}
+
+static int iamf_write_mixing_presentation(const IAMFContext *iamf,
+                                          const IAMFMixPresentation *mix_presentation,
+                                          AVIOContext *pb, void *log_ctx)
+{
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    const AVIAMFMixPresentation *mix = mix_presentation->mix;
+    const AVDictionaryEntry *tag = NULL;
+    PutBitContext pbc;
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    int dyn_size;
+
+    int ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    ffio_write_leb(dyn_bc, mix_presentation->mix_presentation_id); // mix_presentation_id
+    ffio_write_leb(dyn_bc, av_dict_count(mix->annotations)); // count_label
+
+    while ((tag = av_dict_iterate(mix->annotations, tag)))
+        avio_put_str(dyn_bc, tag->key);
+    while ((tag = av_dict_iterate(mix->annotations, tag)))
+        avio_put_str(dyn_bc, tag->value);
+
+    ffio_write_leb(dyn_bc, mix->nb_submixes);
+    for (int i = 0; i < mix->nb_submixes; i++) {
+        const AVIAMFSubmix *sub_mix = mix->submixes[i];
+
+        ffio_write_leb(dyn_bc, sub_mix->nb_elements);
+        for (int j = 0; j < sub_mix->nb_elements; j++) {
+            const IAMFAudioElement *audio_element = NULL;
+            const AVIAMFSubmixElement *submix_element = sub_mix->elements[j];
+
+            for (int k = 0; k < iamf->nb_audio_elements; k++)
+                if (iamf->audio_elements[k]->audio_element_id == submix_element->audio_element_id) {
+                    audio_element = iamf->audio_elements[k];
+                    break;
+                }
+
+            av_assert0(audio_element);
+            ffio_write_leb(dyn_bc, submix_element->audio_element_id);
+
+            if (av_dict_count(submix_element->annotations) != av_dict_count(mix->annotations)) {
+                av_log(log_ctx, AV_LOG_ERROR, "Inconsistent amount of labels in submix %d from Mix Presentation id #%u\n",
+                       j, audio_element->audio_element_id);
+                return AVERROR(EINVAL);
+            }
+            while ((tag = av_dict_iterate(submix_element->annotations, tag)))
+                avio_put_str(dyn_bc, tag->value);
+
+            init_put_bits(&pbc, header, sizeof(header));
+            put_bits(&pbc, 2, submix_element->headphones_rendering_mode);
+            put_bits(&pbc, 6, 0); // reserved
+            flush_put_bits(&pbc);
+            avio_write(dyn_bc, header, put_bytes_count(&pbc, 1));
+            ffio_write_leb(dyn_bc, 0); // rendering_config_extension_size
+            param_definition(submix_element->element_mix_config, dyn_bc);
+            avio_wb16(dyn_bc, rescale_rational(submix_element->default_mix_gain, 1 << 8));
+        }
+        param_definition(sub_mix->output_mix_config, dyn_bc);
+        avio_wb16(dyn_bc, rescale_rational(sub_mix->default_mix_gain, 1 << 8));
+
+        ffio_write_leb(dyn_bc, sub_mix->nb_layouts); // nb_layouts
+        for (int i = 0; i < sub_mix->nb_layouts; i++) {
+            const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[i];
+            int layout, info_type;
+            int dialogue = submix_layout->dialogue_anchored_loudness.num &&
+                           submix_layout->dialogue_anchored_loudness.den;
+            int album = submix_layout->album_anchored_loudness.num &&
+                        submix_layout->album_anchored_loudness.den;
+
+            if (layout == FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) {
+                av_log(log_ctx, AV_LOG_ERROR, "Invalid Sound System value in a submix\n");
+                return AVERROR(EINVAL);
+            }
+
+            if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) {
+                for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_sound_system_map); layout++) {
+                    if (!av_channel_layout_compare(&submix_layout->sound_system, &ff_iamf_sound_system_map[layout].layout))
+                        break;
+                }
+                if (layout == FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) {
+                    av_log(log_ctx, AV_LOG_ERROR, "Invalid Sound System value in a submix\n");
+                    return AVERROR(EINVAL);
+                }
+            }
+            init_put_bits(&pbc, header, sizeof(header));
+            put_bits(&pbc, 2, submix_layout->layout_type); // layout_type
+            if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) {
+                put_bits(&pbc, 4, ff_iamf_sound_system_map[layout].id); // sound_system
+                put_bits(&pbc, 2, 0); // reserved
+            } else
+                put_bits(&pbc, 6, 0); // reserved
+            flush_put_bits(&pbc);
+            avio_write(dyn_bc, header, put_bytes_count(&pbc, 1));
+
+            info_type  = (submix_layout->true_peak.num && submix_layout->true_peak.den);
+            info_type |= (dialogue || album) << 1;
+            avio_w8(dyn_bc, info_type);
+            avio_wb16(dyn_bc, rescale_rational(submix_layout->integrated_loudness, 1 << 8));
+            avio_wb16(dyn_bc, rescale_rational(submix_layout->digital_peak, 1 << 8));
+            if (info_type & 1)
+                avio_wb16(dyn_bc, rescale_rational(submix_layout->true_peak, 1 << 8));
+            if (info_type & 2) {
+                avio_w8(dyn_bc, dialogue + album); // num_anchored_loudness
+                if (dialogue) {
+                    avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_DIALOGUE);
+                    avio_wb16(dyn_bc, rescale_rational(submix_layout->dialogue_anchored_loudness, 1 << 8));
+                }
+                if (album) {
+                    avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_ALBUM);
+                    avio_wb16(dyn_bc, rescale_rational(submix_layout->album_anchored_loudness, 1 << 8));
+                }
+            }
+        }
+    }
+
+    init_put_bits(&pbc, header, sizeof(header));
+    put_bits(&pbc, 5, IAMF_OBU_IA_MIX_PRESENTATION);
+    put_bits(&pbc, 3, 0);
+    flush_put_bits(&pbc);
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    avio_write(pb, header, put_bytes_count(&pbc, 1));
+    ffio_write_leb(pb, dyn_size);
+    avio_write(pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    return 0;
+}
+
+int ff_iamf_write_descriptors(const IAMFContext *iamf, AVIOContext *pb, void *log_ctx)
+{
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    PutBitContext pbc;
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    int dyn_size;
+
+    int ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    // Sequence Header
+    init_put_bits(&pbc, header, sizeof(header));
+    put_bits(&pbc, 5, IAMF_OBU_IA_SEQUENCE_HEADER);
+    put_bits(&pbc, 3, 0);
+    flush_put_bits(&pbc);
+
+    avio_write(dyn_bc, header, put_bytes_count(&pbc, 1));
+    ffio_write_leb(dyn_bc, 6);
+    avio_wb32(dyn_bc, MKBETAG('i','a','m','f'));
+    avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // primary_profile
+    avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // additional_profile
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    avio_write(pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    for (int i = 0; i < iamf->nb_codec_configs; i++) {
+        ret = iamf_write_codec_config(iamf, iamf->codec_configs[i], pb);
+        if (ret < 0)
+            return ret;
+    }
+
+    for (int i = 0; i < iamf->nb_audio_elements; i++) {
+        ret = iamf_write_audio_element(iamf, iamf->audio_elements[i], pb, log_ctx);
+        if (ret < 0)
+            return ret;
+    }
+
+    for (int i = 0; i < iamf->nb_mix_presentations; i++) {
+        ret = iamf_write_mixing_presentation(iamf, iamf->mix_presentations[i], pb, log_ctx);
+        if (ret < 0)
+            return ret;
+    }
+
+    return 0;
+}
diff --git a/libavformat/iamf_writer.h b/libavformat/iamf_writer.h
new file mode 100644
index 0000000000..93354670b8
--- /dev/null
+++ b/libavformat/iamf_writer.h
@@ -0,0 +1,51 @@
+/*
+ * Immersive Audio Model and Formats muxing helpers and structs
+ * Copyright (c) 2023 James Almer <jamrial@gmail.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFORMAT_IAMF_WRITER_H
+#define AVFORMAT_IAMF_WRITER_H
+
+#include <stdint.h>
+
+#include "libavutil/common.h"
+#include "avformat.h"
+#include "avio.h"
+#include "iamf.h"
+
+static inline IAMFParamDefinition *ff_iamf_get_param_definition(const IAMFContext *iamf,
+                                                                unsigned int parameter_id)
+{
+    IAMFParamDefinition *param_definition = NULL;
+
+    for (int i = 0; i < iamf->nb_param_definitions; i++)
+        if (iamf->param_definitions[i]->param->parameter_id == parameter_id) {
+            param_definition = iamf->param_definitions[i];
+            break;
+        }
+
+    return param_definition;
+}
+
+int ff_iamf_add_audio_element(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx);
+int ff_iamf_add_mix_presentation(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx);
+
+int ff_iamf_write_descriptors(const IAMFContext *iamf, AVIOContext *pb, void *log_ctx);
+
+#endif /* AVFORMAT_IAMF_WRITER_H */
diff --git a/libavformat/iamfenc.c b/libavformat/iamfenc.c
new file mode 100644
index 0000000000..1dbb8b21d4
--- /dev/null
+++ b/libavformat/iamfenc.c
@@ -0,0 +1,388 @@
+/*
+ * IAMF muxer
+ * Copyright (c) 2023 James Almer
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "libavutil/avassert.h"
+#include "libavutil/common.h"
+#include "libavutil/iamf.h"
+#include "libavcodec/get_bits.h"
+#include "libavcodec/put_bits.h"
+#include "avformat.h"
+#include "avio_internal.h"
+#include "iamf.h"
+#include "iamf_writer.h"
+#include "internal.h"
+#include "mux.h"
+
+typedef struct IAMFMuxContext {
+    IAMFContext iamf;
+
+    int first_stream_id;
+} IAMFMuxContext;
+
+static int iamf_init(AVFormatContext *s)
+{
+    IAMFMuxContext *const c = s->priv_data;
+    IAMFContext *const iamf = &c->iamf;
+    int nb_audio_elements = 0, nb_mix_presentations = 0;
+    int ret;
+
+    if (!s->nb_streams) {
+        av_log(s, AV_LOG_ERROR, "There must be at least one stream\n");
+        return AVERROR(EINVAL);
+    }
+
+    for (int i = 0; i < s->nb_streams; i++) {
+        if (s->streams[i]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO ||
+            (s->streams[i]->codecpar->codec_tag != MKTAG('m','p','4','a') &&
+             s->streams[i]->codecpar->codec_tag != MKTAG('O','p','u','s') &&
+             s->streams[i]->codecpar->codec_tag != MKTAG('f','L','a','C') &&
+             s->streams[i]->codecpar->codec_tag != MKTAG('i','p','c','m'))) {
+            av_log(s, AV_LOG_ERROR, "Unsupported codec id %s\n",
+                   avcodec_get_name(s->streams[i]->codecpar->codec_id));
+            return AVERROR(EINVAL);
+        }
+
+        if (s->streams[i]->codecpar->ch_layout.nb_channels > 2) {
+            av_log(s, AV_LOG_ERROR, "Unsupported channel layout on stream #%d\n", i);
+            return AVERROR(EINVAL);
+        }
+
+        for (int j = 0; j < i; j++) {
+            if (s->streams[i]->id == s->streams[j]->id) {
+                av_log(s, AV_LOG_ERROR, "Duplicated stream id %d\n", s->streams[j]->id);
+                return AVERROR(EINVAL);
+            }
+        }
+    }
+
+    if (!s->nb_stream_groups) {
+        av_log(s, AV_LOG_ERROR, "There must be at least two stream groups\n");
+        return AVERROR(EINVAL);
+    }
+
+    for (int i = 0; i < s->nb_stream_groups; i++) {
+        const AVStreamGroup *stg = s->stream_groups[i];
+
+        if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT)
+            nb_audio_elements++;
+        if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION)
+            nb_mix_presentations++;
+    }
+    if ((nb_audio_elements < 1 && nb_audio_elements > 2) || nb_mix_presentations < 1) {
+        av_log(s, AV_LOG_ERROR, "There must be >= 1 and <= 2 IAMF_AUDIO_ELEMENT and at least "
+                                "one IAMF_MIX_PRESENTATION stream groups\n");
+        return AVERROR(EINVAL);
+    }
+
+    for (int i = 0; i < s->nb_stream_groups; i++) {
+        const AVStreamGroup *stg = s->stream_groups[i];
+        if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT)
+            continue;
+
+        ret = ff_iamf_add_audio_element(iamf, stg, s);
+        if (ret < 0)
+            return ret;
+    }
+
+    for (int i = 0; i < s->nb_stream_groups; i++) {
+        const AVStreamGroup *stg = s->stream_groups[i];
+        if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION)
+            continue;
+
+        ret = ff_iamf_add_mix_presentation(iamf, stg, s);
+        if (ret < 0)
+            return ret;
+    }
+
+    c->first_stream_id = s->streams[0]->id;
+
+    return 0;
+}
+
+static int iamf_write_header(AVFormatContext *s)
+{
+    IAMFMuxContext *const c = s->priv_data;
+    IAMFContext *const iamf = &c->iamf;
+    int ret;
+
+    ret = ff_iamf_write_descriptors(iamf, s->pb, s);
+    if (ret < 0)
+        return ret;
+
+    c->first_stream_id = s->streams[0]->id;
+
+    return 0;
+}
+
+static inline int rescale_rational(AVRational q, int b)
+{
+    return av_clip_int16(av_rescale(q.num, b, q.den));
+}
+
+static int write_parameter_block(AVFormatContext *s, const AVIAMFParamDefinition *param)
+{
+    const IAMFMuxContext *const c = s->priv_data;
+    const IAMFContext *const iamf = &c->iamf;
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id);
+    PutBitContext pb;
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    int dyn_size, ret;
+
+    if (param->param_definition_type > AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) {
+        av_log(s, AV_LOG_DEBUG, "Ignoring side data with unknown param_definition_type %u\n",
+               param->param_definition_type);
+        return 0;
+    }
+
+    if (!param_definition) {
+        av_log(s, AV_LOG_ERROR, "Non-existent Parameter Definition with ID %u referenced by a packet\n",
+               param->parameter_id);
+        return AVERROR(EINVAL);
+    }
+
+    if (param->param_definition_type != param_definition->param->param_definition_type ||
+        param->param_definition_mode != param_definition->param->param_definition_mode) {
+        av_log(s, AV_LOG_ERROR, "Inconsistent param_definition_mode or param_definition_type values "
+                                "for Parameter Definition with ID %u in a packet\n",
+               param->parameter_id);
+        return AVERROR(EINVAL);
+    }
+
+    ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    // Sequence Header
+    init_put_bits(&pb, header, sizeof(header));
+    put_bits(&pb, 5, IAMF_OBU_IA_PARAMETER_BLOCK);
+    put_bits(&pb, 3, 0);
+    flush_put_bits(&pb);
+    avio_write(s->pb, header, put_bytes_count(&pb, 1));
+
+    ffio_write_leb(dyn_bc, param->parameter_id);
+    if (param->param_definition_mode) {
+        ffio_write_leb(dyn_bc, param->duration);
+        ffio_write_leb(dyn_bc, param->constant_subblock_duration);
+        if (param->constant_subblock_duration == 0)
+            ffio_write_leb(dyn_bc, param->nb_subblocks);
+    }
+
+    for (int i = 0; i < param->nb_subblocks; i++) {
+        const void *subblock = av_iamf_param_definition_get_subblock(param, i);
+
+        switch (param->param_definition_type) {
+        case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: {
+            const AVIAMFMixGain *mix = subblock;
+            if (param->param_definition_mode && param->constant_subblock_duration == 0)
+                ffio_write_leb(dyn_bc, mix->subblock_duration);
+
+            ffio_write_leb(dyn_bc, mix->animation_type);
+
+            avio_wb16(dyn_bc, rescale_rational(mix->start_point_value, 1 << 8));
+            if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR)
+                avio_wb16(dyn_bc, rescale_rational(mix->end_point_value, 1 << 8));
+            if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) {
+                avio_wb16(dyn_bc, rescale_rational(mix->control_point_value, 1 << 8));
+                avio_w8(dyn_bc, av_clip_uint8(av_rescale(mix->control_point_relative_time.num, 1 << 8,
+                                                         mix->control_point_relative_time.den)));
+            }
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: {
+            const AVIAMFDemixingInfo *demix = subblock;
+            if (param->param_definition_mode && param->constant_subblock_duration == 0)
+                ffio_write_leb(dyn_bc, demix->subblock_duration);
+
+            avio_w8(dyn_bc, demix->dmixp_mode << 5);
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: {
+            const AVIAMFReconGain *recon = subblock;
+            const AVIAMFAudioElement *audio_element = param_definition->audio_element;
+
+            if (param->param_definition_mode && param->constant_subblock_duration == 0)
+                ffio_write_leb(dyn_bc, recon->subblock_duration);
+
+            if (!audio_element) {
+                av_log(s, AV_LOG_ERROR, "Invalid Parameter Definition with ID %u referenced by a packet\n", param->parameter_id);
+                return AVERROR(EINVAL);
+            }
+
+            for (int j = 0; j < audio_element->nb_layers; j++) {
+                const AVIAMFLayer *layer = audio_element->layers[j];
+
+                if (layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN) {
+                    unsigned int recon_gain_flags = 0;
+                    int k = 0;
+
+                    for (; k < 7; k++)
+                        recon_gain_flags |= (1 << k) * !!recon->recon_gain[j][k];
+                    for (; k < 12; k++)
+                        recon_gain_flags |= (2 << k) * !!recon->recon_gain[j][k];
+                    if (recon_gain_flags >> 8)
+                        recon_gain_flags |= (1 << k);
+
+                    ffio_write_leb(dyn_bc, recon_gain_flags);
+                    for (k = 0; k < 12; k++) {
+                        if (recon->recon_gain[j][k])
+                            avio_w8(dyn_bc, recon->recon_gain[j][k]);
+                    }
+                }
+            }
+            break;
+        }
+        default:
+            av_assert0(0);
+        }
+    }
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    ffio_write_leb(s->pb, dyn_size);
+    avio_write(s->pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    return 0;
+}
+
+static int iamf_write_packet(AVFormatContext *s, AVPacket *pkt)
+{
+    const IAMFMuxContext *const c = s->priv_data;
+    AVStream *st = s->streams[pkt->stream_index];
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    PutBitContext pb;
+    AVIOContext *dyn_bc;
+    uint8_t *side_data, *dyn_buf = NULL;
+    unsigned int skip_samples = 0, discard_padding = 0;
+    size_t side_data_size;
+    int dyn_size, type = st->id <= 17 ? st->id + IAMF_OBU_IA_AUDIO_FRAME_ID0 : IAMF_OBU_IA_AUDIO_FRAME;
+    int ret;
+
+    if (s->nb_stream_groups && st->id == c->first_stream_id) {
+        AVIAMFParamDefinition *mix =
+            (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, NULL);
+        AVIAMFParamDefinition *demix =
+            (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, NULL);
+        AVIAMFParamDefinition *recon =
+            (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, NULL);
+
+        if (mix) {
+            ret = write_parameter_block(s, mix);
+            if (ret < 0)
+               return ret;
+        }
+        if (demix) {
+            ret = write_parameter_block(s, demix);
+            if (ret < 0)
+               return ret;
+        }
+        if (recon) {
+            ret = write_parameter_block(s, recon);
+            if (ret < 0)
+               return ret;
+        }
+    }
+    side_data = av_packet_get_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES,
+                                        &side_data_size);
+
+    if (side_data && side_data_size >= 10) {
+        skip_samples = AV_RL32(side_data);
+        discard_padding = AV_RL32(side_data + 4);
+    }
+
+    ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    init_put_bits(&pb, header, sizeof(header));
+    put_bits(&pb, 5, type);
+    put_bits(&pb, 1, 0); // obu_redundant_copy
+    put_bits(&pb, 1, skip_samples || discard_padding);
+    put_bits(&pb, 1, 0); // obu_extension_flag
+    flush_put_bits(&pb);
+    avio_write(s->pb, header, put_bytes_count(&pb, 1));
+
+    if (skip_samples || discard_padding) {
+        ffio_write_leb(dyn_bc, discard_padding);
+        ffio_write_leb(dyn_bc, skip_samples);
+    }
+
+    if (st->id > 17)
+        ffio_write_leb(dyn_bc, st->id);
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    ffio_write_leb(s->pb, dyn_size + pkt->size);
+    avio_write(s->pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+    avio_write(s->pb, pkt->data, pkt->size);
+
+    return 0;
+}
+
+static void iamf_deinit(AVFormatContext *s)
+{
+    IAMFMuxContext *const c = s->priv_data;
+    IAMFContext *const iamf = &c->iamf;
+
+    for (int i = 0; i < iamf->nb_audio_elements; i++) {
+        IAMFAudioElement *audio_element = iamf->audio_elements[i];
+        audio_element->element = NULL;
+    }
+
+    for (int i = 0; i < iamf->nb_mix_presentations; i++) {
+        IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i];
+        mix_presentation->mix = NULL;
+    }
+
+    ff_iamf_uninit_context(iamf);
+
+    return;
+}
+
+static const AVCodecTag iamf_codec_tags[] = {
+    { AV_CODEC_ID_AAC,       MKTAG('m','p','4','a') },
+    { AV_CODEC_ID_FLAC,      MKTAG('f','L','a','C') },
+    { AV_CODEC_ID_OPUS,      MKTAG('O','p','u','s') },
+    { AV_CODEC_ID_PCM_S16LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S16BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S24LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S24BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S32LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S32BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_NONE,      MKTAG('i','p','c','m') }
+};
+
+const FFOutputFormat ff_iamf_muxer = {
+    .p.name            = "iamf",
+    .p.long_name       = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"),
+    .p.extensions      = "iamf",
+    .priv_data_size    = sizeof(IAMFMuxContext),
+    .p.audio_codec     = AV_CODEC_ID_OPUS,
+    .init              = iamf_init,
+    .deinit            = iamf_deinit,
+    .write_header      = iamf_write_header,
+    .write_packet      = iamf_write_packet,
+    .p.codec_tag       = (const AVCodecTag* const []){ iamf_codec_tags, NULL },
+    .p.flags           = AVFMT_GLOBALHEADER | AVFMT_NOTIMESTAMPS,
+};
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup
  2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
                   ` (7 preceding siblings ...)
  2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 8/8] avformat: Immersive Audio Model and Formats muxer James Almer
@ 2023-12-10 21:52 ` James Almer
  8 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-10 21:52 UTC (permalink / raw)
  To: ffmpeg-devel

On 12/5/2023 7:43 PM, James Almer wrote:
> Addressed Anton's comments and added some documentation. Also split the
> common code some more in order to facilitate using it from different
> modules.
> I'm withdrawing the MP4 code for now as i've noticed a bug in the spec
> and reported it. Depending on what happens to that, i'll resubmit it.
> 
> James Almer (8):
>    avutil: introduce an Immersive Audio Model and Formats API
>    avformat: introduce AVStreamGroup
>    ffmpeg: add support for muxing AVStreamGroups
>    avcodec/packet: add IAMF Parameters side data types
>    avcodec/get_bits: add get_leb()
>    avformat/aviobuf: add ffio_read_leb() and ffio_write_leb()
>    avformat: Immersive Audio Model and Formats demuxer
>    avformat: Immersive Audio Model and Formats muxer
> 
>   doc/fftools-common-opts.texi    |   17 +-
>   fftools/ffmpeg.h                |    2 +
>   fftools/ffmpeg_mux_init.c       |  335 ++++++++++
>   fftools/ffmpeg_opt.c            |    2 +
>   libavcodec/avpacket.c           |    3 +
>   libavcodec/bitstream.h          |    2 +
>   libavcodec/bitstream_template.h |   23 +
>   libavcodec/get_bits.h           |   24 +
>   libavcodec/packet.h             |   24 +
>   libavformat/Makefile            |    2 +
>   libavformat/allformats.c        |    2 +
>   libavformat/avformat.c          |  185 +++++-
>   libavformat/avformat.h          |  169 +++++
>   libavformat/avio_internal.h     |   10 +
>   libavformat/aviobuf.c           |   33 +
>   libavformat/dump.c              |  147 +++-
>   libavformat/iamf.c              |  125 ++++
>   libavformat/iamf.h              |  162 +++++
>   libavformat/iamf_parse.c        | 1106 +++++++++++++++++++++++++++++++
>   libavformat/iamf_parse.h        |   38 ++
>   libavformat/iamf_writer.c       |  823 +++++++++++++++++++++++
>   libavformat/iamf_writer.h       |   51 ++
>   libavformat/iamfdec.c           |  495 ++++++++++++++
>   libavformat/iamfenc.c           |  388 +++++++++++
>   libavformat/internal.h          |   33 +
>   libavformat/options.c           |  139 ++++
>   libavutil/Makefile              |    2 +
>   libavutil/iamf.c                |  564 ++++++++++++++++
>   libavutil/iamf.h                |  573 ++++++++++++++++
>   29 files changed, 5445 insertions(+), 34 deletions(-)
>   create mode 100644 libavformat/iamf.c
>   create mode 100644 libavformat/iamf.h
>   create mode 100644 libavformat/iamf_parse.c
>   create mode 100644 libavformat/iamf_parse.h
>   create mode 100644 libavformat/iamf_writer.c
>   create mode 100644 libavformat/iamf_writer.h
>   create mode 100644 libavformat/iamfdec.c
>   create mode 100644 libavformat/iamfenc.c
>   create mode 100644 libavutil/iamf.c
>   create mode 100644 libavutil/iamf.h

Will apply the set (with version bumps and APIChanges/Changelog entries) 
soon unless there are objections.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
@ 2023-12-11 10:19   ` Anton Khirnov
  0 siblings, 0 replies; 23+ messages in thread
From: Anton Khirnov @ 2023-12-11 10:19 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-05 23:43:55)
> diff --git a/libavutil/iamf.h b/libavutil/iamf.h
> new file mode 100644
> index 0000000000..bc0363153d
> --- /dev/null
> +++ b/libavutil/iamf.h
> +/**
> + * @file
> + * Immersive Audio Model and Formats API header
> + * @see <a href="https://aomediacodec.github.io/iamf/">Immersive Audio Model and Formats</a>
> + */
> +
> +#include <stdint.h>
> +#include <stddef.h>
> +
> +#include "attributes.h"
> +#include "avassert.h"
> +#include "channel_layout.h"
> +#include "dict.h"
> +#include "rational.h"
> +
> +/**
> + * @defgroup lavf_iamf_params Parameter Definition
> + * @{
> + * Parameters as defined in section 3.6.1 and 3.8 of IAMF.
> + * @}
> + * @defgroup lavf_iamf_audio Audio Element
> + * @{
> + * Audio Elements as defined in section 3.6 of IAMF.
> + * @}
> + * @defgroup lavf_iamf_mix Mix Presentation
> + * @{
> + * Mix Presentations as defined in section 3.7 of IAMF.
> + * @}
> + *
> + * @}
> + * @addtogroup lavf_iamf_params
> + * @{
> + */
> +enum AVIAMFAnimationType {
> +    AV_IAMF_ANIMATION_TYPE_STEP,
> +    AV_IAMF_ANIMATION_TYPE_LINEAR,
> +    AV_IAMF_ANIMATION_TYPE_BEZIER,
> +};
> +
> +/**
> + * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF.
> + *
> + * Subblocks in AVIAMFParamDefinition use this struct when the value or
> + * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
> + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.

This is the wrong place for this second paragraph. IMO it should go into
doxy for either param_definition_type, or
AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.

> +/**
> + * Parameters as defined in section 3.6.1 of IAMF.

It's not immediately obvious how the contents are structured, so I'd
extend it with something like:

The struct is allocated along with an array of subblocks, which are
either AVIAMFMixGain, AVIAMFDemixingInfo, or AVIAMFReconGain depending
on the value of param_definition_type. This array is placed
subblocks_offset bytes after the start of this struct.

> + */
> +typedef struct AVIAMFParamDefinition {
> +    const AVClass *av_class;
> +
> +    size_t subblocks_offset;

Offset in bytes from the start of this struct, at which the subblocks
array is located.

> +    size_t subblock_size;

Size in bytes of each element in the subblocks array.

> +
> +    enum AVIAMFParamDefinitionType param_definition_type;

This struct is already called ParamDefinition, so this could be just
type. Also:

Parameters type. Determines the type of the subblock elements.

> +    unsigned int nb_subblocks;

Number of subblocks in the array.

> +
> +    unsigned int parameter_id;
> +    unsigned int parameter_rate;
> +    unsigned int param_definition_mode;
> +    unsigned int duration;
> +    unsigned int constant_subblock_duration;

These should be documented.

> +typedef struct AVIAMFLayer {
> +    const AVClass *av_class;
> +
> +    AVChannelLayout ch_layout;
> +
> +    /**
> +     * A bitmask which may contain a combination of AV_IAMF_LAYER_FLAG_* flags.
> +     */
> +    unsigned int flags;
> +    /**
> +     * Output gain channel flags as defined in section 3.6.2 of IAMF.
> +     *
> +     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
> +     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
> +     * must be 0 otherwise.
> +     */
> +    unsigned int output_gain_flags;
> +    /**
> +     * Output gain as defined in section 3.6.2 of IAMF.
> +     *
> +     * Must be 0 if @ref output_gain_flags is 0.
> +     */
> +    AVRational output_gain;
> +    /**
> +     * Ambisonics mode as defined in section 3.6.3 of IAMF.
> +     *
> +     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
> +     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE.
> +     *
> +     * If AV_IAMF_AMBISONICS_MODE_MONO, channel_mapping is defined implicitly
> +     * (Ambisonic Order) or explicitly (Custom Order with ambi channels) in
> +     * @ref ch_layout.
> +     * If AV_IAMF_AMBISONICS_MODE_PROJECTION, @ref demixing_matrix must be set.
> +     */
> +    enum AVIAMFAmbisonicsMode ambisonics_mode;
> +
> +    /**
> +     * Demixing matrix as defined in section 3.6.3 of IAMF.
> +     *
> +     * May be set only if @ref ambisonics_mode == AV_IAMF_AMBISONICS_MODE_PROJECTION,
> +     * must be NULL otherwise.
> +     */
> +    AVRational *demixing_matrix;

How many elements are in the array?

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup James Almer
@ 2023-12-11 10:59   ` Anton Khirnov
  2023-12-11 12:48     ` James Almer
  2023-12-11 11:03   ` Anton Khirnov
  1 sibling, 1 reply; 23+ messages in thread
From: Anton Khirnov @ 2023-12-11 10:59 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-05 23:43:56)
> @@ -2819,6 +2972,22 @@ AVRational av_guess_frame_rate(AVFormatContext *ctx, AVStream *stream,
>  int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st,
>                                      const char *spec);
>  
> +/**
> + * Check if the group stg contained in s is matched by the stream group
> + * specifier spec.
> + *
> + * See the "stream group specifiers" chapter in the documentation for the
> + * syntax of spec.
> + *
> + * @return  >0 if stg is matched by spec;
> + *          0  if stg is not matched by spec;
> + *          AVERROR code if spec is invalid
> + *
> + * @note  A stream group specifier can match several groups in the format.
> + */
> +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg,
> +                                          const char *spec);

What is this for? It does not seem to be used in this set.

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup James Almer
  2023-12-11 10:59   ` Anton Khirnov
@ 2023-12-11 11:03   ` Anton Khirnov
  2023-12-11 12:53     ` James Almer
  1 sibling, 1 reply; 23+ messages in thread
From: Anton Khirnov @ 2023-12-11 11:03 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-05 23:43:56)
> +/**
> + * Remove a stream group from its AVFormatContext and free it.
> + * The group must be the last stream of the AVFormatContext.
> + */
> +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg);

When would this be useful?

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups
  2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups James Almer
@ 2023-12-11 11:48   ` Anton Khirnov
  2023-12-11 12:46     ` James Almer
  0 siblings, 1 reply; 23+ messages in thread
From: Anton Khirnov @ 2023-12-11 11:48 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-05 23:43:57)
> Starting with IAMF support.
> 
> Signed-off-by: James Almer <jamrial@gmail.com>
> ---
>  fftools/ffmpeg.h          |   2 +
>  fftools/ffmpeg_mux_init.c | 335 ++++++++++++++++++++++++++++++++++++++
>  fftools/ffmpeg_opt.c      |   2 +
>  3 files changed, 339 insertions(+)

Missing documentation.

> +static int of_add_groups(Muxer *mux, const OptionsContext *o)
> +{
> +    AVFormatContext *oc = mux->fc;
> +    int ret;
> +
> +    /* process manually set groups */
> +    for (int i = 0; i < o->nb_stream_groups; i++) {
> +        AVDictionary *dict = NULL, *tmp = NULL;
> +        const AVDictionaryEntry *e;
> +        AVStreamGroup *stg = NULL;
> +        int type;
> +        const char *token;
> +        char *str, *ptr = NULL;
> +        const AVOption opts[] = {
> +            { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT,
> +                    { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" },
> +                { "iamf_audio_element",    NULL, 0, AV_OPT_TYPE_CONST,
> +                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT },    .unit = "type" },
> +                { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
> +                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
> +            { NULL },
> +        };
> +         const AVClass class = {
> +            .class_name = "StreamGroupType",
> +            .item_name  = av_default_item_name,
> +            .option     = opts,
> +            .version    = LIBAVUTIL_VERSION_INT,
> +        };
> +        const AVClass *pclass = &class;
> +
> +        str = av_strdup(o->stream_groups[i].u.str);
> +        if (!str)
> +            goto end;
> +
> +        token = av_strtok(str, ",", &ptr);
> +        if (token) {

Too many indentation levels, move this whole block into a separate
function.

> +            ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY);
> +            if (ret < 0) {
> +                av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token);
> +                goto end;
> +            }
> +
> +            // "type" is not a user settable option in AVStreamGroup

This comment confuses me.

> +            e = av_dict_get(dict, "type", NULL, 0);
> +            if (!e) {
> +                av_log(mux, AV_LOG_ERROR, "No type define for Steam Group %d\n", i);

1) Steam
2) defined? Or maybe specified.
3) Print the string, not the index.

> +                ret = AVERROR(EINVAL);
> +                goto end;
> +            }
> +
> +            ret = av_opt_eval_int(&pclass, opts, e->value, &type);
> +            if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) {
> +                av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
> +                goto end;
> +            }
> +
> +            av_dict_copy(&tmp, dict, 0);
> +            stg = avformat_stream_group_create(oc, type, &tmp);
> +            if (!stg) {
> +                ret = AVERROR(ENOMEM);
> +                goto end;
> +            }
> +            av_dict_set(&tmp, "type", NULL, 0);
> +
> +            e = NULL;
> +            while (e = av_dict_get(dict, "st", e, 0)) {
> +                unsigned int idx = strtol(e->value, NULL, 0);
> +                if (idx >= oc->nb_streams) {
> +                    av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx);
> +                    ret = AVERROR(EINVAL);
> +                    goto end;
> +                }

This block seems confused about signedness of e->value.

> +                avformat_stream_group_add_stream(stg, oc->streams[idx]);

Unchecked return value.


-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups
  2023-12-11 11:48   ` Anton Khirnov
@ 2023-12-11 12:46     ` James Almer
  2023-12-12 11:26       ` Anton Khirnov
  0 siblings, 1 reply; 23+ messages in thread
From: James Almer @ 2023-12-11 12:46 UTC (permalink / raw)
  To: ffmpeg-devel

On 12/11/2023 8:48 AM, Anton Khirnov wrote:
> Quoting James Almer (2023-12-05 23:43:57)
>> Starting with IAMF support.
>>
>> Signed-off-by: James Almer <jamrial@gmail.com>
>> ---
>>   fftools/ffmpeg.h          |   2 +
>>   fftools/ffmpeg_mux_init.c | 335 ++++++++++++++++++++++++++++++++++++++
>>   fftools/ffmpeg_opt.c      |   2 +
>>   3 files changed, 339 insertions(+)
> 
> Missing documentation.

Will do.

> 
>> +static int of_add_groups(Muxer *mux, const OptionsContext *o)
>> +{
>> +    AVFormatContext *oc = mux->fc;
>> +    int ret;
>> +
>> +    /* process manually set groups */
>> +    for (int i = 0; i < o->nb_stream_groups; i++) {
>> +        AVDictionary *dict = NULL, *tmp = NULL;
>> +        const AVDictionaryEntry *e;
>> +        AVStreamGroup *stg = NULL;
>> +        int type;
>> +        const char *token;
>> +        char *str, *ptr = NULL;
>> +        const AVOption opts[] = {
>> +            { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT,
>> +                    { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" },
>> +                { "iamf_audio_element",    NULL, 0, AV_OPT_TYPE_CONST,
>> +                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT },    .unit = "type" },
>> +                { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
>> +                    { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
>> +            { NULL },
>> +        };
>> +         const AVClass class = {
>> +            .class_name = "StreamGroupType",
>> +            .item_name  = av_default_item_name,
>> +            .option     = opts,
>> +            .version    = LIBAVUTIL_VERSION_INT,
>> +        };
>> +        const AVClass *pclass = &class;
>> +
>> +        str = av_strdup(o->stream_groups[i].u.str);
>> +        if (!str)
>> +            goto end;
>> +
>> +        token = av_strtok(str, ",", &ptr);
>> +        if (token) {
> 
> Too many indentation levels, move this whole block into a separate
> function.
> 
>> +            ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY);
>> +            if (ret < 0) {
>> +                av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token);
>> +                goto end;
>> +            }
>> +
>> +            // "type" is not a user settable option in AVStreamGroup
> 
> This comment confuses me.

AVStreamGroup.type is not setteable through AVOptions, but it of course 
needs to be supported by the CLI. So i catch it and remove it from the 
dict that will be used for avformat_stream_group_create().

I can change the comment to "'type' is not a user settable AVOption".

> 
>> +            e = av_dict_get(dict, "type", NULL, 0);
>> +            if (!e) {
>> +                av_log(mux, AV_LOG_ERROR, "No type define for Steam Group %d\n", i);
> 
> 1) Steam
> 2) defined? Or maybe specified.

Will change to specified.

> 3) Print the string, not the index.
> 
>> +                ret = AVERROR(EINVAL);
>> +                goto end;
>> +            }
>> +
>> +            ret = av_opt_eval_int(&pclass, opts, e->value, &type);
>> +            if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) {
>> +                av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
>> +                goto end;
>> +            }
>> +
>> +            av_dict_copy(&tmp, dict, 0);
>> +            stg = avformat_stream_group_create(oc, type, &tmp);
>> +            if (!stg) {
>> +                ret = AVERROR(ENOMEM);
>> +                goto end;
>> +            }
>> +            av_dict_set(&tmp, "type", NULL, 0);
>> +
>> +            e = NULL;
>> +            while (e = av_dict_get(dict, "st", e, 0)) {
>> +                unsigned int idx = strtol(e->value, NULL, 0);
>> +                if (idx >= oc->nb_streams) {
>> +                    av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx);
>> +                    ret = AVERROR(EINVAL);
>> +                    goto end;
>> +                }
> 
> This block seems confused about signedness of e->value.

You mean change %d to %u?

> 
>> +                avformat_stream_group_add_stream(stg, oc->streams[idx]);
> 
> Unchecked return value.
> 
> 
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-11 10:59   ` Anton Khirnov
@ 2023-12-11 12:48     ` James Almer
  2023-12-11 12:51       ` Anton Khirnov
  0 siblings, 1 reply; 23+ messages in thread
From: James Almer @ 2023-12-11 12:48 UTC (permalink / raw)
  To: ffmpeg-devel

On 12/11/2023 7:59 AM, Anton Khirnov wrote:
> Quoting James Almer (2023-12-05 23:43:56)
>> @@ -2819,6 +2972,22 @@ AVRational av_guess_frame_rate(AVFormatContext *ctx, AVStream *stream,
>>   int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st,
>>                                       const char *spec);
>>   
>> +/**
>> + * Check if the group stg contained in s is matched by the stream group
>> + * specifier spec.
>> + *
>> + * See the "stream group specifiers" chapter in the documentation for the
>> + * syntax of spec.
>> + *
>> + * @return  >0 if stg is matched by spec;
>> + *          0  if stg is not matched by spec;
>> + *          AVERROR code if spec is invalid
>> + *
>> + * @note  A stream group specifier can match several groups in the format.
>> + */
>> +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg,
>> +                                          const char *spec);
> 
> What is this for? It does not seem to be used in this set.

Remnant from when i used in the CLI while developing this set.
I can remove it, but would it not be useful for some other API user?
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-11 12:48     ` James Almer
@ 2023-12-11 12:51       ` Anton Khirnov
  0 siblings, 0 replies; 23+ messages in thread
From: Anton Khirnov @ 2023-12-11 12:51 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-11 13:48:17)
> On 12/11/2023 7:59 AM, Anton Khirnov wrote:
> > Quoting James Almer (2023-12-05 23:43:56)
> >> @@ -2819,6 +2972,22 @@ AVRational av_guess_frame_rate(AVFormatContext *ctx, AVStream *stream,
> >>   int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st,
> >>                                       const char *spec);
> >>   
> >> +/**
> >> + * Check if the group stg contained in s is matched by the stream group
> >> + * specifier spec.
> >> + *
> >> + * See the "stream group specifiers" chapter in the documentation for the
> >> + * syntax of spec.
> >> + *
> >> + * @return  >0 if stg is matched by spec;
> >> + *          0  if stg is not matched by spec;
> >> + *          AVERROR code if spec is invalid
> >> + *
> >> + * @note  A stream group specifier can match several groups in the format.
> >> + */
> >> +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg,
> >> +                                          const char *spec);
> > 
> > What is this for? It does not seem to be used in this set.
> 
> Remnant from when i used in the CLI while developing this set.
> I can remove it, but would it not be useful for some other API user?

I doubt this API is useful to anyone and would much prefer to move it
back to the CLI where it belongs.

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-11 11:03   ` Anton Khirnov
@ 2023-12-11 12:53     ` James Almer
  2023-12-12 11:46       ` Anton Khirnov
  0 siblings, 1 reply; 23+ messages in thread
From: James Almer @ 2023-12-11 12:53 UTC (permalink / raw)
  To: ffmpeg-devel

On 12/11/2023 8:03 AM, Anton Khirnov wrote:
> Quoting James Almer (2023-12-05 23:43:56)
>> +/**
>> + * Remove a stream group from its AVFormatContext and free it.
>> + * The group must be the last stream of the AVFormatContext.
>> + */
>> +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg);
> 
> When would this be useful?

I can't think of a use case right now, at least for the raw iamf demuxer 
and mp4 implementations. I added it for feature parity with AVStream and 
because it was four lines long.

Will remove.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups
  2023-12-11 12:46     ` James Almer
@ 2023-12-12 11:26       ` Anton Khirnov
  0 siblings, 0 replies; 23+ messages in thread
From: Anton Khirnov @ 2023-12-12 11:26 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-11 13:46:36)
> AVStreamGroup.type is not setteable through AVOptions, but it of course 
> needs to be supported by the CLI. So i catch it and remove it from the 
> dict that will be used for avformat_stream_group_create().
> 
> I can change the comment to "'type' is not a user settable AVOption".

That seems better, thanks.

> > 3) Print the string, not the index.
> > 
> >> +                ret = AVERROR(EINVAL);
> >> +                goto end;
> >> +            }
> >> +
> >> +            ret = av_opt_eval_int(&pclass, opts, e->value, &type);
> >> +            if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) {
> >> +                av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
> >> +                goto end;
> >> +            }
> >> +
> >> +            av_dict_copy(&tmp, dict, 0);
> >> +            stg = avformat_stream_group_create(oc, type, &tmp);
> >> +            if (!stg) {
> >> +                ret = AVERROR(ENOMEM);
> >> +                goto end;
> >> +            }
> >> +            av_dict_set(&tmp, "type", NULL, 0);
> >> +
> >> +            e = NULL;
> >> +            while (e = av_dict_get(dict, "st", e, 0)) {
> >> +                unsigned int idx = strtol(e->value, NULL, 0);
> >> +                if (idx >= oc->nb_streams) {
> >> +                    av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx);
> >> +                    ret = AVERROR(EINVAL);
> >> +                    goto end;
> >> +                }
> > 
> > This block seems confused about signedness of e->value.
> 
> You mean change %d to %u?

I mean strtol will parse the string into a signed number, then you
assign the result into unsigned, and print it as signed. It's probably
more user-friendly to keep parsing it as signed, and add a check for
idx >= 0.

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
  2023-12-11 12:53     ` James Almer
@ 2023-12-12 11:46       ` Anton Khirnov
  0 siblings, 0 replies; 23+ messages in thread
From: Anton Khirnov @ 2023-12-12 11:46 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-11 13:53:40)
> On 12/11/2023 8:03 AM, Anton Khirnov wrote:
> > Quoting James Almer (2023-12-05 23:43:56)
> >> +/**
> >> + * Remove a stream group from its AVFormatContext and free it.
> >> + * The group must be the last stream of the AVFormatContext.
> >> + */
> >> +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg);
> > 
> > When would this be useful?
> 
> I can't think of a use case right now, at least for the raw iamf demuxer 
> and mp4 implementations. I added it for feature parity with AVStream and 
> because it was four lines long.

Removing streams seems like a disgusting hack with very limited usage,
that should really not exist.

I'd rather not pretend we support removing previously added groups,
unless there is a very strong case for it.

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API
  2023-12-18 11:04   ` Anton Khirnov
@ 2023-12-18 18:10     ` James Almer
  0 siblings, 0 replies; 23+ messages in thread
From: James Almer @ 2023-12-18 18:10 UTC (permalink / raw)
  To: ffmpeg-devel

On 12/18/2023 8:04 AM, Anton Khirnov wrote:
> Quoting James Almer (2023-12-14 21:14:26)
>> +/**
>> + * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF.
>> + */
>> +typedef struct AVIAMFMixGain {
>> +    const AVClass *av_class;
>> +
>> +    /**
>> +     * Duration for the given subblock. It must not be 0.
> 
> In what units? Same for all durations in this patch.

parameter_rate. Amended.

> 
>> +typedef struct AVIAMFParamDefinition {
>> +    const AVClass *av_class;
>> +
>> +    /**
>> +     * Offset in bytes from the start of this struct, at which the subblocks
>> +     * array is located.
>> +     */
>> +    size_t subblocks_offset;
>> +    /**
>> +     * Size in bytes of each element in the subblocks array.
>> +     */
>> +    size_t subblock_size;
>> +    /**
>> +     * Number of subblocks in the array.
>> +     *
>> +     * Must be 0 if @ref constant_subblock_duration is not 0.

Removed this line as it's bogus.

>> +     */
>> +    unsigned int nb_subblocks;
>> +
>> +    /**
>> +     * Parameters type. Determines the type of the subblock elements.
>> +     */
>> +    enum AVIAMFParamDefinitionType type;
>> +
>> +    /**
>> +     * Identifier for the paremeter substream.
>> +     */
>> +    unsigned int parameter_id;
>> +    /**
>> +     * Sample rate for the paremeter substream. It must not be 0.
>> +     */
>> +    unsigned int parameter_rate;
>> +
>> +    /**
>> +     * The duration of the all subblocks in this parameter definition.
>> +     *
>> +     * May be 0, in which case all duration values should be specified in
>> +     * another parameter definition referencing the same parameter_id.
>> +     */
>> +    unsigned int duration;
>> +    /**
>> +     * The duration of every subblock in the case where all subblocks, with
>> +     * the optional exception of the last subblock, have equal durations.
>> +     *
>> +     * Must be 0 if subblocks have different durations.
>> +     */
>> +    unsigned int constant_subblock_duration;
> 
> This also seems like should be a flags field.

No, duration and subblock duration are not the same thing. The former is 
the accumulated duration of all subblocks in a given parameter 
definition. subblock durations can be smaller, and only if they are 
constant will constant_subblock_duration be set to a value other than 0.

> 
> Otherwise looks good.
> 

Pushed. Thanks a lot for looking at it.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API
  2023-12-14 20:14 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
@ 2023-12-18 11:04   ` Anton Khirnov
  2023-12-18 18:10     ` James Almer
  0 siblings, 1 reply; 23+ messages in thread
From: Anton Khirnov @ 2023-12-18 11:04 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Quoting James Almer (2023-12-14 21:14:26)
> +/**
> + * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF.
> + */
> +typedef struct AVIAMFMixGain {
> +    const AVClass *av_class;
> +
> +    /**
> +     * Duration for the given subblock. It must not be 0.

In what units? Same for all durations in this patch.

> +typedef struct AVIAMFParamDefinition {
> +    const AVClass *av_class;
> +
> +    /**
> +     * Offset in bytes from the start of this struct, at which the subblocks
> +     * array is located.
> +     */
> +    size_t subblocks_offset;
> +    /**
> +     * Size in bytes of each element in the subblocks array.
> +     */
> +    size_t subblock_size;
> +    /**
> +     * Number of subblocks in the array.
> +     *
> +     * Must be 0 if @ref constant_subblock_duration is not 0.
> +     */
> +    unsigned int nb_subblocks;
> +
> +    /**
> +     * Parameters type. Determines the type of the subblock elements.
> +     */
> +    enum AVIAMFParamDefinitionType type;
> +
> +    /**
> +     * Identifier for the paremeter substream.
> +     */
> +    unsigned int parameter_id;
> +    /**
> +     * Sample rate for the paremeter substream. It must not be 0.
> +     */
> +    unsigned int parameter_rate;
> +
> +    /**
> +     * The duration of the all subblocks in this parameter definition.
> +     *
> +     * May be 0, in which case all duration values should be specified in
> +     * another parameter definition referencing the same parameter_id.
> +     */
> +    unsigned int duration;
> +    /**
> +     * The duration of every subblock in the case where all subblocks, with
> +     * the optional exception of the last subblock, have equal durations.
> +     *
> +     * Must be 0 if subblocks have different durations.
> +     */
> +    unsigned int constant_subblock_duration;

This also seems like should be a flags field.

Otherwise looks good.

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API
  2023-12-14 20:14 [FFmpeg-devel] [PATCH v7 " James Almer
@ 2023-12-14 20:14 ` James Almer
  2023-12-18 11:04   ` Anton Khirnov
  0 siblings, 1 reply; 23+ messages in thread
From: James Almer @ 2023-12-14 20:14 UTC (permalink / raw)
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavutil/Makefile |   2 +
 libavutil/iamf.c   | 563 ++++++++++++++++++++++++++++++++++++++++
 libavutil/iamf.h   | 620 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1185 insertions(+)
 create mode 100644 libavutil/iamf.c
 create mode 100644 libavutil/iamf.h

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 4711f8cde8..62cc1a1831 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -51,6 +51,7 @@ HEADERS = adler32.h                                                     \
           hwcontext_videotoolbox.h                                      \
           hwcontext_vdpau.h                                             \
           hwcontext_vulkan.h                                            \
+          iamf.h                                                        \
           imgutils.h                                                    \
           intfloat.h                                                    \
           intreadwrite.h                                                \
@@ -140,6 +141,7 @@ OBJS = adler32.o                                                        \
        hdr_dynamic_vivid_metadata.o                                     \
        hmac.o                                                           \
        hwcontext.o                                                      \
+       iamf.o                                                           \
        imgutils.o                                                       \
        integer.o                                                        \
        intmath.o                                                        \
diff --git a/libavutil/iamf.c b/libavutil/iamf.c
new file mode 100644
index 0000000000..62b6051049
--- /dev/null
+++ b/libavutil/iamf.c
@@ -0,0 +1,563 @@
+/*
+ * Immersive Audio Model and Formats helper functions and defines
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <limits.h>
+#include <stddef.h>
+#include <stdint.h>
+
+#include "avassert.h"
+#include "error.h"
+#include "iamf.h"
+#include "log.h"
+#include "mem.h"
+#include "opt.h"
+
+#define IAMF_ADD_FUNC_TEMPLATE(parent_type, parent_name, child_type, child_name, suffix)                   \
+child_type *av_iamf_ ## parent_name ## _add_ ## child_name(parent_type *parent_name)                       \
+{                                                                                                          \
+    child_type **child_name ## suffix, *child_name;                                                        \
+                                                                                                           \
+    if (parent_name->nb_## child_name ## suffix == UINT_MAX)                                               \
+        return NULL;                                                                                       \
+                                                                                                           \
+    child_name ## suffix = av_realloc_array(parent_name->child_name ## suffix,                             \
+                                            parent_name->nb_## child_name ## suffix + 1,                   \
+                                            sizeof(*parent_name->child_name ## suffix));                   \
+    if (!child_name ## suffix)                                                                             \
+        return NULL;                                                                                       \
+                                                                                                           \
+    parent_name->child_name ## suffix = child_name ## suffix;                                              \
+                                                                                                           \
+    child_name = parent_name->child_name ## suffix[parent_name->nb_## child_name ## suffix]                \
+               = av_mallocz(sizeof(*child_name));                                                          \
+    if (!child_name)                                                                                       \
+        return NULL;                                                                                       \
+                                                                                                           \
+    child_name->av_class = &child_name ## _class;                                                          \
+    av_opt_set_defaults(child_name);                                                                       \
+    parent_name->nb_## child_name ## suffix++;                                                             \
+                                                                                                           \
+    return child_name;                                                                                     \
+}
+
+#define FLAGS AV_OPT_FLAG_ENCODING_PARAM
+
+//
+// Param Definition
+//
+#define OFFSET(x) offsetof(AVIAMFMixGain, x)
+static const AVOption mix_gain_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { "animation_type", "set animation_type", OFFSET(animation_type), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 2, FLAGS },
+    { "start_point_value", "set start_point_value", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "end_point_value", "set end_point_value", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "control_point_value", "set control_point_value", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "control_point_relative_time", "set control_point_relative_time", OFFSET(animation_type), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, 0.0, 1.0, FLAGS },
+    { NULL },
+};
+
+static const AVClass mix_gain_class = {
+    .class_name     = "AVIAMFSubmixElement",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = mix_gain_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFDemixingInfo, x)
+static const AVOption demixing_info_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { "dmixp_mode", "set dmixp_mode", OFFSET(dmixp_mode), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 6, FLAGS },
+    { NULL },
+};
+
+static const AVClass demixing_info_class = {
+    .class_name     = "AVIAMFDemixingInfo",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = demixing_info_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFReconGain, x)
+static const AVOption recon_gain_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { NULL },
+};
+
+static const AVClass recon_gain_class = {
+    .class_name     = "AVIAMFReconGain",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = recon_gain_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFParamDefinition, x)
+static const AVOption param_definition_options[] = {
+    { "parameter_id", "set parameter_id", OFFSET(parameter_id), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "parameter_rate", "set parameter_rate", OFFSET(parameter_rate), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "duration", "set duration", OFFSET(duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "constant_subblock_duration", "set constant_subblock_duration", OFFSET(constant_subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { NULL },
+};
+
+static const AVClass *param_definition_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        ret = &mix_gain_class;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        ret = &demixing_info_class;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        ret = &recon_gain_class;
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass param_definition_class = {
+    .class_name          = "AVIAMFParamDefinition",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = param_definition_options,
+    .child_class_iterate = param_definition_child_iterate,
+};
+
+const AVClass *av_iamf_param_definition_get_class(void)
+{
+    return &param_definition_class;
+}
+
+AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type,
+                                                      unsigned int nb_subblocks, size_t *out_size)
+{
+
+    struct MixGainStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFMixGain m;
+    };
+    struct DemixStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFDemixingInfo d;
+    };
+    struct ReconGainStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFReconGain r;
+    };
+    size_t subblocks_offset, subblock_size;
+    size_t size;
+    AVIAMFParamDefinition *par;
+
+    switch (type) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        subblocks_offset = offsetof(struct MixGainStruct, m);
+        subblock_size = sizeof(AVIAMFMixGain);
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        subblocks_offset = offsetof(struct DemixStruct, d);
+        subblock_size = sizeof(AVIAMFDemixingInfo);
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        subblocks_offset = offsetof(struct ReconGainStruct, r);
+        subblock_size = sizeof(AVIAMFReconGain);
+        break;
+    default:
+        return NULL;
+    }
+
+    size = subblocks_offset;
+    if (nb_subblocks > (SIZE_MAX - size) / subblock_size)
+        return NULL;
+    size += subblock_size * nb_subblocks;
+
+    par = av_mallocz(size);
+    if (!par)
+        return NULL;
+
+    par->av_class = &param_definition_class;
+    av_opt_set_defaults(par);
+
+    par->type = type;
+    par->nb_subblocks = nb_subblocks;
+    par->subblock_size = subblock_size;
+    par->subblocks_offset = subblocks_offset;
+
+    for (int i = 0; i < nb_subblocks; i++) {
+        void *subblock = av_iamf_param_definition_get_subblock(par, i);
+
+        switch (type) {
+        case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+            ((AVIAMFMixGain *)subblock)->av_class = &mix_gain_class;
+            break;
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+            ((AVIAMFDemixingInfo *)subblock)->av_class = &demixing_info_class;
+            break;
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+            ((AVIAMFReconGain *)subblock)->av_class = &recon_gain_class;
+            break;
+        default:
+            av_assert0(0);
+        }
+
+        av_opt_set_defaults(subblock);
+    }
+
+    if (out_size)
+        *out_size = size;
+
+    return par;
+}
+
+//
+// Audio Element
+//
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFLayer, x)
+static const AVOption layer_options[] = {
+    { "ch_layout", "set ch_layout", OFFSET(ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = NULL }, 0, 0, FLAGS },
+    { "flags", "set flags", OFFSET(flags), AV_OPT_TYPE_FLAGS,
+        {.i64 = 0 }, 0, AV_IAMF_LAYER_FLAG_RECON_GAIN, FLAGS, "flags" },
+            {"recon_gain",  "Recon gain is present", 0, AV_OPT_TYPE_CONST,
+                {.i64 = AV_IAMF_LAYER_FLAG_RECON_GAIN }, INT_MIN, INT_MAX, FLAGS, "flags"},
+    { "output_gain_flags", "set output_gain_flags", OFFSET(output_gain_flags), AV_OPT_TYPE_FLAGS,
+        {.i64 = 0 }, 0, (1 << 6) - 1, FLAGS, "output_gain_flags" },
+            {"FL",  "Left channel",            0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 5 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"FR",  "Right channel",           0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 4 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"BL",  "Left surround channel",   0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 3 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"BR",  "Right surround channel",  0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 2 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"TFL", "Left top front channel",  0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 1 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+            {"TFR", "Right top front channel", 0, AV_OPT_TYPE_CONST,
+                {.i64 = 1 << 0 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+    { "output_gain", "set output_gain", OFFSET(output_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "ambisonics_mode", "set ambisonics_mode", OFFSET(ambisonics_mode), AV_OPT_TYPE_INT,
+            { .i64 = AV_IAMF_AMBISONICS_MODE_MONO },
+            AV_IAMF_AMBISONICS_MODE_MONO, AV_IAMF_AMBISONICS_MODE_PROJECTION, FLAGS, "ambisonics_mode" },
+        { "mono",       NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AMBISONICS_MODE_MONO },       .unit = "ambisonics_mode" },
+        { "projection", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AMBISONICS_MODE_PROJECTION }, .unit = "ambisonics_mode" },
+    { NULL },
+};
+
+static const AVClass layer_class = {
+    .class_name     = "AVIAMFLayer",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = layer_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFAudioElement, x)
+static const AVOption audio_element_options[] = {
+    { "audio_element_type", "set audio_element_type", OFFSET(audio_element_type), AV_OPT_TYPE_INT,
+            {.i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL },
+            AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, FLAGS, "audio_element_type" },
+        { "channel", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, .unit = "audio_element_type" },
+        { "scene",   NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE },   .unit = "audio_element_type" },
+    { "default_w", "set default_w", OFFSET(default_w), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 10, FLAGS },
+    { NULL },
+};
+
+static const AVClass *audio_element_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &layer_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass audio_element_class = {
+    .class_name          = "AVIAMFAudioElement",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = audio_element_options,
+    .child_class_iterate = audio_element_child_iterate,
+};
+
+const AVClass *av_iamf_audio_element_get_class(void)
+{
+    return &audio_element_class;
+}
+
+AVIAMFAudioElement *av_iamf_audio_element_alloc(void)
+{
+    AVIAMFAudioElement *audio_element = av_mallocz(sizeof(*audio_element));
+
+    if (audio_element) {
+        audio_element->av_class = &audio_element_class;
+        av_opt_set_defaults(audio_element);
+    }
+
+    return audio_element;
+}
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFAudioElement, audio_element, AVIAMFLayer, layer, s)
+
+void av_iamf_audio_element_free(AVIAMFAudioElement **paudio_element)
+{
+    AVIAMFAudioElement *audio_element = *paudio_element;
+
+    if (!audio_element)
+        return;
+
+    for (int i = 0; i < audio_element->nb_layers; i++) {
+        AVIAMFLayer *layer = audio_element->layers[i];
+        av_opt_free(layer);
+        av_free(layer->demixing_matrix);
+        av_free(layer);
+    }
+    av_free(audio_element->layers);
+
+    av_free(audio_element->demixing_info);
+    av_free(audio_element->recon_gain_info);
+    av_freep(paudio_element);
+}
+
+//
+// Mix Presentation
+//
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmixElement, x)
+static const AVOption submix_element_options[] = {
+    { "headphones_rendering_mode", "Headphones rendering mode", OFFSET(headphones_rendering_mode), AV_OPT_TYPE_INT,
+            { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO },
+            AV_IAMF_HEADPHONES_MODE_STEREO, AV_IAMF_HEADPHONES_MODE_BINAURAL, FLAGS, "headphones_rendering_mode" },
+        { "stereo",   NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO },   .unit = "headphones_rendering_mode" },
+        { "binaural", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_HEADPHONES_MODE_BINAURAL }, .unit = "headphones_rendering_mode" },
+    { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "annotations", "Annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, { .str = NULL }, 0, 0, FLAGS },
+    { NULL },
+};
+
+static void *submix_element_child_next(void *obj, void *prev)
+{
+    AVIAMFSubmixElement *submix_element = obj;
+    if (!prev)
+        return submix_element->element_mix_config;
+
+    return NULL;
+}
+
+static const AVClass *submix_element_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &param_definition_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass element_class = {
+    .class_name          = "AVIAMFSubmixElement",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = submix_element_options,
+    .child_next          = submix_element_child_next,
+    .child_class_iterate = submix_element_child_iterate,
+};
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixElement, element, s)
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmixLayout, x)
+static const AVOption submix_layout_options[] = {
+    { "layout_type", "Layout type", OFFSET(layout_type), AV_OPT_TYPE_INT,
+            { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS },
+            AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL, FLAGS, "layout_type" },
+        { "loudspeakers", NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, .unit = "layout_type" },
+        { "binaural",     NULL, 0, AV_OPT_TYPE_CONST,
+                   { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL },     .unit = "layout_type" },
+    { "sound_system", "Sound System", OFFSET(sound_system), AV_OPT_TYPE_CHLAYOUT, { .str = NULL }, 0, 0, FLAGS },
+    { "integrated_loudness", "Integrated loudness", OFFSET(integrated_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "digital_peak", "Digital peak", OFFSET(digital_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "true_peak", "True peak", OFFSET(true_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "dialog_anchored_loudness", "Anchored loudness (Dialog)", OFFSET(dialogue_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "album_anchored_loudness", "Anchored loudness (Album)", OFFSET(album_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { NULL },
+};
+
+static const AVClass layout_class = {
+    .class_name     = "AVIAMFSubmixLayout",
+    .item_name      = av_default_item_name,
+    .version        = LIBAVUTIL_VERSION_INT,
+    .option         = submix_layout_options,
+};
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixLayout, layout, s)
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmix, x)
+static const AVOption submix_presentation_options[] = {
+    { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { NULL },
+};
+
+static void *submix_presentation_child_next(void *obj, void *prev)
+{
+    AVIAMFSubmix *sub_mix = obj;
+    if (!prev)
+        return sub_mix->output_mix_config;
+
+    return NULL;
+}
+
+static const AVClass *submix_presentation_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case 0:
+        ret = &element_class;
+        break;
+    case 1:
+        ret = &layout_class;
+        break;
+    case 2:
+        ret = &param_definition_class;
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass submix_class = {
+    .class_name          = "AVIAMFSubmix",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = submix_presentation_options,
+    .child_next          = submix_presentation_child_next,
+    .child_class_iterate = submix_presentation_child_iterate,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFMixPresentation, x)
+static const AVOption mix_presentation_options[] = {
+    { "annotations", "set annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, {.str = NULL }, 0, 0, FLAGS },
+    { NULL },
+};
+
+#undef OFFSET
+#undef FLAGS
+
+static const AVClass *mix_presentation_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &submix_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass mix_presentation_class = {
+    .class_name          = "AVIAMFMixPresentation",
+    .item_name           = av_default_item_name,
+    .version             = LIBAVUTIL_VERSION_INT,
+    .option              = mix_presentation_options,
+    .child_class_iterate = mix_presentation_child_iterate,
+};
+
+const AVClass *av_iamf_mix_presentation_get_class(void)
+{
+    return &mix_presentation_class;
+}
+
+AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void)
+{
+    AVIAMFMixPresentation *mix_presentation = av_mallocz(sizeof(*mix_presentation));
+
+    if (mix_presentation) {
+        mix_presentation->av_class = &mix_presentation_class;
+        av_opt_set_defaults(mix_presentation);
+    }
+
+    return mix_presentation;
+}
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFMixPresentation, mix_presentation, AVIAMFSubmix, submix, es)
+
+void av_iamf_mix_presentation_free(AVIAMFMixPresentation **pmix_presentation)
+{
+    AVIAMFMixPresentation *mix_presentation = *pmix_presentation;
+
+    if (!mix_presentation)
+        return;
+
+    for (int i = 0; i < mix_presentation->nb_submixes; i++) {
+        AVIAMFSubmix *sub_mix = mix_presentation->submixes[i];
+        for (int j = 0; j < sub_mix->nb_elements; j++) {
+            AVIAMFSubmixElement *submix_element = sub_mix->elements[j];
+            av_opt_free(submix_element);
+            av_free(submix_element->element_mix_config);
+            av_free(submix_element);
+        }
+        av_free(sub_mix->elements);
+        for (int j = 0; j < sub_mix->nb_layouts; j++) {
+            AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[j];
+            av_opt_free(submix_layout);
+            av_free(submix_layout);
+        }
+        av_free(sub_mix->layouts);
+        av_free(sub_mix->output_mix_config);
+        av_free(sub_mix);
+    }
+    av_opt_free(mix_presentation);
+    av_free(mix_presentation->submixes);
+
+    av_freep(pmix_presentation);
+}
diff --git a/libavutil/iamf.h b/libavutil/iamf.h
new file mode 100644
index 0000000000..7038b71a27
--- /dev/null
+++ b/libavutil/iamf.h
@@ -0,0 +1,620 @@
+/*
+ * Immersive Audio Model and Formats helper functions and defines
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_IAMF_H
+#define AVUTIL_IAMF_H
+
+/**
+ * @file
+ * Immersive Audio Model and Formats API header
+ * @see <a href="https://aomediacodec.github.io/iamf/">Immersive Audio Model and Formats</a>
+ */
+
+#include <stdint.h>
+#include <stddef.h>
+
+#include "attributes.h"
+#include "avassert.h"
+#include "channel_layout.h"
+#include "dict.h"
+#include "rational.h"
+
+/**
+ * @defgroup lavf_iamf_params Parameter Definition
+ * @{
+ * Parameters as defined in section 3.6.1 and 3.8 of IAMF.
+ * @}
+ * @defgroup lavf_iamf_audio Audio Element
+ * @{
+ * Audio Elements as defined in section 3.6 of IAMF.
+ * @}
+ * @defgroup lavf_iamf_mix Mix Presentation
+ * @{
+ * Mix Presentations as defined in section 3.7 of IAMF.
+ * @}
+ *
+ * @}
+ * @addtogroup lavf_iamf_params
+ * @{
+ */
+enum AVIAMFAnimationType {
+    AV_IAMF_ANIMATION_TYPE_STEP,
+    AV_IAMF_ANIMATION_TYPE_LINEAR,
+    AV_IAMF_ANIMATION_TYPE_BEZIER,
+};
+
+/**
+ * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF.
+ */
+typedef struct AVIAMFMixGain {
+    const AVClass *av_class;
+
+    /**
+     * Duration for the given subblock. It must not be 0.
+     */
+    unsigned int subblock_duration;
+    /**
+     * The type of animation applied to the parameter values.
+     */
+    enum AVIAMFAnimationType animation_type;
+    /**
+     * Parameter value that is applied at the start of the subblock.
+     * Applies to all defined Animation Types.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational start_point_value;
+    /**
+     * Parameter value that is applied at the end of the subblock.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_LINEAR and
+     * AV_IAMF_ANIMATION_TYPE_BEZIER Animation Types.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational end_point_value;
+    /**
+     * Parameter value of the middle control point of a quadratic Bezier
+     * curve, i.e., its y-axis value.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational control_point_value;
+    /**
+     * Parameter value of the time of the middle control point of a
+     * quadratic Bezier curve, i.e., its x-axis value.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type.
+     *
+     * Valid range of values is 0.0 to 1.0
+     */
+    AVRational control_point_relative_time;
+} AVIAMFMixGain;
+
+/**
+ * Demixing Info Parameter Data as defined in section 3.8.2 of IAMF.
+ */
+typedef struct AVIAMFDemixingInfo {
+    const AVClass *av_class;
+
+    /**
+     * Duration for the given subblock. It must not be 0.
+     */
+    unsigned int subblock_duration;
+    /**
+     * Pre-defined combination of demixing parameters.
+     */
+    unsigned int dmixp_mode;
+} AVIAMFDemixingInfo;
+
+/**
+ * Recon Gain Info Parameter Data as defined in section 3.8.3 of IAMF.
+ */
+typedef struct AVIAMFReconGain {
+    const AVClass *av_class;
+
+    /**
+     * Duration for the given subblock. It must not be 0.
+     */
+    unsigned int subblock_duration;
+
+    /**
+     * Array of gain values to be applied to each channel for each layer
+     * defined in the Audio Element referencing the parent Parameter Definition.
+     * Values for layers where the AV_IAMF_LAYER_FLAG_RECON_GAIN flag is not set
+     * are undefined.
+     *
+     * Channel order is: FL, C, FR, SL, SR, TFL, TFR, BL, BR, TBL, TBR, LFE
+     */
+    uint8_t recon_gain[6][12];
+} AVIAMFReconGain;
+
+enum AVIAMFParamDefinitionType {
+   /**
+    * Subblocks are of struct type AVIAMFMixGain
+    */
+    AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN,
+   /**
+    * Subblocks are of struct type AVIAMFDemixingInfo
+    */
+    AV_IAMF_PARAMETER_DEFINITION_DEMIXING,
+   /**
+    * Subblocks are of struct type AVIAMFReconGain
+    */
+    AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN,
+};
+
+/**
+ * Parameters as defined in section 3.6.1 of IAMF.
+ *
+ * The struct is allocated by av_iamf_param_definition_alloc() along with an
+ * array of subblocks, its type depending on the value of type.
+ * This array is placed subblocks_offset bytes after the start of this struct.
+ */
+typedef struct AVIAMFParamDefinition {
+    const AVClass *av_class;
+
+    /**
+     * Offset in bytes from the start of this struct, at which the subblocks
+     * array is located.
+     */
+    size_t subblocks_offset;
+    /**
+     * Size in bytes of each element in the subblocks array.
+     */
+    size_t subblock_size;
+    /**
+     * Number of subblocks in the array.
+     *
+     * Must be 0 if @ref constant_subblock_duration is not 0.
+     */
+    unsigned int nb_subblocks;
+
+    /**
+     * Parameters type. Determines the type of the subblock elements.
+     */
+    enum AVIAMFParamDefinitionType type;
+
+    /**
+     * Identifier for the paremeter substream.
+     */
+    unsigned int parameter_id;
+    /**
+     * Sample rate for the paremeter substream. It must not be 0.
+     */
+    unsigned int parameter_rate;
+
+    /**
+     * The duration of the all subblocks in this parameter definition.
+     *
+     * May be 0, in which case all duration values should be specified in
+     * another parameter definition referencing the same parameter_id.
+     */
+    unsigned int duration;
+    /**
+     * The duration of every subblock in the case where all subblocks, with
+     * the optional exception of the last subblock, have equal durations.
+     *
+     * Must be 0 if subblocks have different durations.
+     */
+    unsigned int constant_subblock_duration;
+} AVIAMFParamDefinition;
+
+const AVClass *av_iamf_param_definition_get_class(void);
+
+/**
+ * Allocates memory for AVIAMFParamDefinition, plus an array of {@code nb_subblocks}
+ * amount of subblocks of the given type and initializes the variables. Can be
+ * freed with a normal av_free() call.
+ *
+ * @param size if non-NULL, the size in bytes of the resulting data array is written here.
+ */
+AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type,
+                                                      unsigned int nb_subblocks, size_t *size);
+
+/**
+ * Get the subblock at the specified {@code idx}. Must be between 0 and nb_subblocks - 1.
+ *
+ * The @ref AVIAMFParamDefinition.type "param definition type" defines
+ * the struct type of the returned pointer.
+ */
+static av_always_inline void*
+av_iamf_param_definition_get_subblock(const AVIAMFParamDefinition *par, unsigned int idx)
+{
+    av_assert0(idx < par->nb_subblocks);
+    return (void *)((uint8_t *)par + par->subblocks_offset + idx * par->subblock_size);
+}
+
+/**
+ * @}
+ * @addtogroup lavf_iamf_audio
+ * @{
+ */
+
+enum AVIAMFAmbisonicsMode {
+    AV_IAMF_AMBISONICS_MODE_MONO,
+    AV_IAMF_AMBISONICS_MODE_PROJECTION,
+};
+
+/**
+ * Recon gain information for the layer is present in AVIAMFReconGain
+ */
+#define AV_IAMF_LAYER_FLAG_RECON_GAIN (1 << 0)
+
+/**
+ * A layer defining a Channel Layout in the Audio Element.
+ *
+ * When @ref AVIAMFAudioElement.audio_element_type "the parent's Audio Element type"
+ * is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this corresponds to an Scalable Channel
+ * Layout layer as defined in section 3.6.2 of IAMF.
+ * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel
+ * layout as defined in section 3.6.3 of IAMF.
+ */
+typedef struct AVIAMFLayer {
+    const AVClass *av_class;
+
+    AVChannelLayout ch_layout;
+
+    /**
+     * A bitmask which may contain a combination of AV_IAMF_LAYER_FLAG_* flags.
+     */
+    unsigned int flags;
+    /**
+     * Output gain channel flags as defined in section 3.6.2 of IAMF.
+     *
+     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
+     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
+     * must be 0 otherwise.
+     */
+    unsigned int output_gain_flags;
+    /**
+     * Output gain as defined in section 3.6.2 of IAMF.
+     *
+     * Must be 0 if @ref output_gain_flags is 0.
+     */
+    AVRational output_gain;
+    /**
+     * Ambisonics mode as defined in section 3.6.3 of IAMF.
+     *
+     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
+     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE.
+     *
+     * If AV_IAMF_AMBISONICS_MODE_MONO, channel_mapping is defined implicitly
+     * (Ambisonic Order) or explicitly (Custom Order with ambi channels) in
+     * @ref ch_layout.
+     * If AV_IAMF_AMBISONICS_MODE_PROJECTION, @ref demixing_matrix must be set.
+     */
+    enum AVIAMFAmbisonicsMode ambisonics_mode;
+
+    /**
+     * Demixing matrix as defined in section 3.6.3 of IAMF.
+     *
+     * The length of the array is ch_layout.nb_channels multiplied by the sum of
+     * the amount of streams in the group plus the amount of streams in the group
+     * that are stereo.
+     *
+     * May be set only if @ref ambisonics_mode == AV_IAMF_AMBISONICS_MODE_PROJECTION,
+     * must be NULL otherwise.
+     */
+    AVRational *demixing_matrix;
+} AVIAMFLayer;
+
+
+enum AVIAMFAudioElementType {
+    AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
+    AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE,
+};
+
+typedef struct AVIAMFAudioElement {
+    const AVClass *av_class;
+
+    AVIAMFLayer **layers;
+    /**
+     * Number of layers, or channel groups, in the Audio Element.
+     * There may be 6 layers at most, and for @ref audio_element_type
+     * AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there may be exactly 1.
+     *
+     * Set by av_iamf_audio_element_add_layer(), must not be
+     * modified by any other code.
+     */
+    unsigned int nb_layers;
+
+    /**
+     * Demixing information used to reconstruct a scalable channel audio
+     * representation.
+     * The @ref AVIAMFParamDefinition.type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_DEMIXING.
+     */
+    AVIAMFParamDefinition *demixing_info;
+    /**
+     * Recon gain information used to reconstruct a scalable channel audio
+     * representation.
+     * The @ref AVIAMFParamDefinition.type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN.
+     */
+    AVIAMFParamDefinition *recon_gain_info;
+
+    /**
+     * Audio element type as defined in section 3.6 of IAMF.
+     */
+    enum AVIAMFAudioElementType audio_element_type;
+
+    /**
+     * Default weight value as defined in section 3.6 of IAMF.
+     */
+    unsigned int default_w;
+} AVIAMFAudioElement;
+
+const AVClass *av_iamf_audio_element_get_class(void);
+
+/**
+ * Allocates a AVIAMFAudioElement, and initializes its fields with default values.
+ * No layers are allocated. Must be freed with av_iamf_audio_element_free().
+ *
+ * @see av_iamf_audio_element_add_layer()
+ */
+AVIAMFAudioElement *av_iamf_audio_element_alloc(void);
+
+/**
+ * Allocate a layer and add it to a given AVIAMFAudioElement.
+ * It is freed by av_iamf_audio_element_free() alongside the rest of the parent
+ * AVIAMFAudioElement.
+ *
+ * @return a pointer to the allocated layer.
+ */
+AVIAMFLayer *av_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element);
+
+void av_iamf_audio_element_free(AVIAMFAudioElement **audio_element);
+
+/**
+ * @}
+ * @addtogroup lavf_iamf_mix
+ * @{
+ */
+
+enum AVIAMFHeadphonesMode {
+    /**
+     * The referenced Audio Element shall be rendered to stereo loudspeakers.
+     */
+    AV_IAMF_HEADPHONES_MODE_STEREO,
+    /**
+     * The referenced Audio Element shall be rendered with a binaural renderer.
+     */
+    AV_IAMF_HEADPHONES_MODE_BINAURAL,
+};
+
+typedef struct AVIAMFSubmixElement {
+    const AVClass *av_class;
+
+    /**
+     * The id of the Audio Element this submix element references.
+     */
+    unsigned int audio_element_id;
+
+    /**
+     * Information required required for applying any processing to the
+     * referenced and rendered Audio Element before being summed with other
+     * processed Audio Elements.
+     * The @ref AVIAMFParamDefinition.type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+     */
+    AVIAMFParamDefinition *element_mix_config;
+
+    /**
+     * Default mix gain value to apply when there are no AVIAMFParamDefinition
+     * with @ref element_mix_config "element_mix_config's"
+     * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a
+     * given audio frame.
+     */
+    AVRational default_mix_gain;
+
+    /**
+     * A value that indicates whether the referenced channel-based Audio Element
+     * shall be rendered to stereo loudspeakers or spatialized with a binaural
+     * renderer when played back on headphones.
+     * If the Audio Element is not of @ref AVIAMFAudioElement.audio_element_type
+     * "type" AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, then this field is undefined.
+     */
+    enum AVIAMFHeadphonesMode headphones_rendering_mode;
+
+    /**
+     * A dictionary of strings describing the submix in different languages.
+     * Must have the same amount of entries as
+     * @ref AVIAMFMixPresentation.annotations "the mix's annotations", stored
+     * in the same order, and with the same key strings.
+     *
+     * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47 that
+     * specifies the language for the string stored in
+     * @ref AVDictionaryEntry.value "value".
+     */
+    AVDictionary *annotations;
+} AVIAMFSubmixElement;
+
+enum AVIAMFSubmixLayoutType {
+    /**
+     * The layout follows the loudspeaker sound system convention of ITU-2051-3.
+     */
+    AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS = 2,
+    /**
+     * The layout is binaural.
+     */
+    AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL = 3,
+};
+
+typedef struct AVIAMFSubmixLayout {
+    const AVClass *av_class;
+
+    enum AVIAMFSubmixLayoutType layout_type;
+
+    /**
+     * Channel layout matching one of Sound Systems A to J of ITU-2051-3, plus
+     * 7.1.2ch and 3.1.2ch
+     * If layout_type is not AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, this field
+     * is undefined.
+     */
+    AVChannelLayout sound_system;
+    /**
+     * The program integrated loudness information, as defined in
+     * ITU-1770-4.
+     */
+    AVRational integrated_loudness;
+    /**
+     * The digital (sampled) peak value of the audio signal, as defined
+     * in ITU-1770-4.
+     */
+    AVRational digital_peak;
+    /**
+     * The true peak of the audio signal, as defined in ITU-1770-4.
+     */
+    AVRational true_peak;
+    /**
+     * The Dialogue loudness information, as defined in ITU-1770-4.
+     */
+    AVRational dialogue_anchored_loudness;
+    /**
+     * The Album loudness information, as defined in ITU-1770-4.
+     */
+    AVRational album_anchored_loudness;
+} AVIAMFSubmixLayout;
+
+typedef struct AVIAMFSubmix {
+    const AVClass *av_class;
+
+    /**
+     * Array of submix elements.
+     *
+     * Set by av_iamf_submix_add_element(), must not be modified by any
+     * other code.
+     */
+    AVIAMFSubmixElement **elements;
+    /**
+     * Number of elements in the submix.
+     *
+     * Set by av_iamf_submix_add_element(), must not be modified by any
+     * other code.
+     */
+    unsigned int nb_elements;
+
+    /**
+     * Array of submix layouts.
+     *
+     * Set by av_iamf_submix_add_layout(), must not be modified by any
+     * other code.
+     */
+    AVIAMFSubmixLayout **layouts;
+    /**
+     * Number of layouts in the submix.
+     *
+     * Set by av_iamf_submix_add_layout(), must not be modified by any
+     * other code.
+     */
+    unsigned int nb_layouts;
+
+    /**
+     * Information required for post-processing the mixed audio signal to
+     * generate the audio signal for playback.
+     * The @ref AVIAMFParamDefinition.type "type" must be
+     * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+     */
+    AVIAMFParamDefinition *output_mix_config;
+
+    /**
+     * Default mix gain value to apply when there are no AVIAMFParamDefinition
+     * with @ref output_mix_config "output_mix_config's"
+     * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a
+     * given audio frame.
+     */
+    AVRational default_mix_gain;
+} AVIAMFSubmix;
+
+typedef struct AVIAMFMixPresentation {
+    const AVClass *av_class;
+
+    /**
+     * Array of submixes.
+     *
+     * Set by av_iamf_mix_presentation_add_submix(), must not be modified
+     * by any other code.
+     */
+    AVIAMFSubmix **submixes;
+    /**
+     * Number of submixes in the presentation.
+     *
+     * Set by av_iamf_mix_presentation_add_submix(), must not be modified
+     * by any other code.
+     */
+    unsigned int nb_submixes;
+
+    /**
+     * A dictionary of strings describing the mix in different languages.
+     * Must have the same amount of entries as every
+     * @ref AVIAMFSubmixElement.annotations "Submix element annotations",
+     * stored in the same order, and with the same key strings.
+     *
+     * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47
+     * that specifies the language for the string stored in
+     * @ref AVDictionaryEntry.value "value".
+     */
+    AVDictionary *annotations;
+} AVIAMFMixPresentation;
+
+const AVClass *av_iamf_mix_presentation_get_class(void);
+
+/**
+ * Allocates a AVIAMFMixPresentation, and initializes its fields with default
+ * values. No submixes are allocated.
+ * Must be freed with av_iamf_mix_presentation_free().
+ *
+ * @see av_iamf_mix_presentation_add_submix()
+ */
+AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void);
+
+/**
+ * Allocate a submix and add it to a given AVIAMFMixPresentation.
+ * It is freed by av_iamf_mix_presentation_free() alongside the rest of the
+ * parent AVIAMFMixPresentation.
+ *
+ * @return a pointer to the allocated submix.
+ */
+AVIAMFSubmix *av_iamf_mix_presentation_add_submix(AVIAMFMixPresentation *mix_presentation);
+
+/**
+ * Allocate a submix element and add it to a given AVIAMFSubmix.
+ * It is freed by av_iamf_mix_presentation_free() alongside the rest of the
+ * parent AVIAMFSubmix.
+ *
+ * @return a pointer to the allocated submix.
+ */
+AVIAMFSubmixElement *av_iamf_submix_add_element(AVIAMFSubmix *submix);
+
+/**
+ * Allocate a submix layout and add it to a given AVIAMFSubmix.
+ * It is freed by av_iamf_mix_presentation_free() alongside the rest of the
+ * parent AVIAMFSubmix.
+ *
+ * @return a pointer to the allocated submix.
+ */
+AVIAMFSubmixLayout *av_iamf_submix_add_layout(AVIAMFSubmix *submix);
+
+void av_iamf_mix_presentation_free(AVIAMFMixPresentation **mix_presentation);
+/**
+ * @}
+ */
+
+#endif /* AVUTIL_IAMF_H */
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-12-18 18:10 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-05 22:43 [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
2023-12-11 10:19   ` Anton Khirnov
2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup James Almer
2023-12-11 10:59   ` Anton Khirnov
2023-12-11 12:48     ` James Almer
2023-12-11 12:51       ` Anton Khirnov
2023-12-11 11:03   ` Anton Khirnov
2023-12-11 12:53     ` James Almer
2023-12-12 11:46       ` Anton Khirnov
2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups James Almer
2023-12-11 11:48   ` Anton Khirnov
2023-12-11 12:46     ` James Almer
2023-12-12 11:26       ` Anton Khirnov
2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 4/8] avcodec/packet: add IAMF Parameters side data types James Almer
2023-12-05 22:43 ` [FFmpeg-devel] [PATCH 5/8] avcodec/get_bits: add get_leb() James Almer
2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 6/8] avformat/aviobuf: add ffio_read_leb() and ffio_write_leb() James Almer
2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 7/8] avformat: Immersive Audio Model and Formats demuxer James Almer
2023-12-05 22:44 ` [FFmpeg-devel] [PATCH 8/8] avformat: Immersive Audio Model and Formats muxer James Almer
2023-12-10 21:52 ` [FFmpeg-devel] [PATCH v6 0/8] avformat: introduce AVStreamGroup James Almer
2023-12-14 20:14 [FFmpeg-devel] [PATCH v7 " James Almer
2023-12-14 20:14 ` [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API James Almer
2023-12-18 11:04   ` Anton Khirnov
2023-12-18 18:10     ` James Almer

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git