Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info
@ 2025-07-18 10:30 Timothée Regaud
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 2/4] avcodec: add option to export " Timothée Regaud
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Timothée Regaud @ 2025-07-18 10:30 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Timothee Regaud

From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>

Adds the generic data structures to libavutil. The block structure is recursive so that codecs with tree-based partitioning can be supported later; for now, only H.264 is implemented.
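
The header below stores offsets instead of pointers so that the side data stays valid when its buffer is copied or moved. The following is a minimal sketch of that idea, not the patch's code: DemoInfo, DemoBlock and every demo_* function are hypothetical, simplified stand-ins for AVVideoCodingInfo and AVVideoCodingInfoBlock, chosen so that the example compiles without FFmpeg headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
typedef struct DemoBlock {
    int16_t x, y;
    uint8_t w, h;
    size_t  mv_offset;   /* offset from the start of DemoInfo; 0 = absent */
    uint8_t num_mv;
} DemoBlock;

typedef struct DemoInfo {
    uint32_t nb_blocks;
    size_t   blocks_offset;  /* offset from the start of DemoInfo */
} DemoInfo;

/* Turn an offset stored in the buffer back into a pointer. */
static void *demo_ptr(const void *base, size_t offset)
{
    return (uint8_t *)base + offset;
}

/* Build a one-block buffer: header, then the block, then one [x, y] vector. */
static uint8_t *demo_build(size_t *size)
{
    const size_t blocks_off = sizeof(DemoInfo);
    const size_t mv_off     = blocks_off + sizeof(DemoBlock);
    uint8_t   *buf;
    DemoInfo  *info;
    DemoBlock *blk;
    int16_t  (*mv)[2];

    *size = mv_off + sizeof(int16_t[2]);
    buf   = calloc(1, *size);
    if (!buf)
        return NULL;

    info                = (DemoInfo *)buf;
    info->nb_blocks     = 1;
    info->blocks_offset = blocks_off;

    blk            = demo_ptr(buf, info->blocks_offset);
    blk->x         = 16;
    blk->y         = 32;
    blk->w         = 16;
    blk->h         = 16;
    blk->num_mv    = 1;
    blk->mv_offset = mv_off;

    mv       = demo_ptr(buf, blk->mv_offset);
    mv[0][0] = -3;
    mv[0][1] = 7;
    return buf;
}

/* Read the horizontal component of the first vector of block 0. */
static int demo_first_mv_x(const uint8_t *buf)
{
    const DemoInfo  *info = (const DemoInfo *)buf;
    const DemoBlock *blk  = demo_ptr(buf, info->blocks_offset);
    const int16_t  (*mv)[2];

    if (!blk->num_mv || !blk->mv_offset)
        return 0;
    mv = demo_ptr(buf, blk->mv_offset);
    return mv[0][0];
}

/* Copy the buffer elsewhere and read it again: offsets need no fix-up. */
static int demo_relocated_mv_x(void)
{
    size_t   size;
    uint8_t *src = demo_build(&size);
    uint8_t *dst = src ? malloc(size) : NULL;
    int      x   = 0;

    if (dst) {
        memcpy(dst, src, size);
        memset(src, 0, size);  /* the copy must not depend on the source */
        x = demo_first_mv_x(dst);
    }
    free(dst);
    free(src);
    return x;
}
```

Because every reference is resolved as base + offset, the copy can be read without patching any field, which a layout with raw pointers would not allow.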

Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
---
 libavutil/Makefile            |   1 +
 libavutil/frame.h             |   7 ++
 libavutil/side_data.c         |   1 +
 libavutil/video_coding_info.h | 163 ++++++++++++++++++++++++++++++++++
 4 files changed, 172 insertions(+)
 create mode 100644 libavutil/video_coding_info.h

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 94a56bb72f..44e51ab7ae 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -93,6 +93,7 @@ HEADERS = adler32.h                                                     \
           tree.h                                                        \
           twofish.h                                                     \
           uuid.h                                                        \
+          video_coding_info.h                                           \
           version.h                                                     \
           video_enc_params.h                                            \
           xtea.h                                                        \
diff --git a/libavutil/frame.h b/libavutil/frame.h
index c50cd263d9..f4404472a0 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -254,6 +254,13 @@ enum AVFrameSideDataType {
      * libavutil/tdrdi.h.
      */
     AV_FRAME_DATA_3D_REFERENCE_DISPLAYS,
+
+    /**
+     * Detailed block-level coding information. The data is an AVVideoCodingInfo
+     * structure. This is exported by video decoders and can be used by filters
+     * for analysis and visualization.
+     */
+    AV_FRAME_DATA_VIDEO_CODING_INFO,
 };
 
 enum AVActiveFormatDescription {
diff --git a/libavutil/side_data.c b/libavutil/side_data.c
index fa2a2c2a13..b938ef6f52 100644
--- a/libavutil/side_data.c
+++ b/libavutil/side_data.c
@@ -56,6 +56,7 @@ static const AVSideDataDescriptor sd_props[] = {
     [AV_FRAME_DATA_SEI_UNREGISTERED]            = { "H.26[45] User Data Unregistered SEI message",  AV_SIDE_DATA_PROP_MULTI },
     [AV_FRAME_DATA_VIDEO_HINT]                  = { "Encoding video hint",                          AV_SIDE_DATA_PROP_SIZE_DEPENDENT },
     [AV_FRAME_DATA_3D_REFERENCE_DISPLAYS]       = { "3D Reference Displays Information",            AV_SIDE_DATA_PROP_GLOBAL },
+    [AV_FRAME_DATA_VIDEO_CODING_INFO]           = { "Video Coding Info",                            AV_SIDE_DATA_PROP_SIZE_DEPENDENT },
 };
 
 const AVSideDataDescriptor *av_frame_side_data_desc(enum AVFrameSideDataType type)
diff --git a/libavutil/video_coding_info.h b/libavutil/video_coding_info.h
new file mode 100644
index 0000000000..17e9345892
--- /dev/null
+++ b/libavutil/video_coding_info.h
@@ -0,0 +1,163 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_VIDEO_CODING_INFO_H
+#define AVUTIL_VIDEO_CODING_INFO_H
+
+#include <stdint.h>
+#include <stddef.h>
+
+/**
+ * @file
+ * @ingroup lavu_frame
+ * Structures for describing block-level video coding information.
+ */
+
+/**
+ * @defgroup lavu_video_coding_info Video Coding Info
+ * @ingroup lavu_frame
+ *
+ * @{
+ * Structures for describing block-level video coding information, to be
+ * attached to an AVFrame as side data.
+ *
+ * All pointer-like members in these structures are offsets relative to the
+ * start of the AVVideoCodingInfo struct to ensure the side data is
+ * self-contained and relocatable. This is critical as the underlying buffer
+ * may be moved in memory.
+ */
+
+/**
+ * Structure to hold inter-prediction information for a block.
+ */
+typedef struct AVBlockInterInfo {
+    /**
+     * Offsets to motion vectors for list 0 and list 1, relative to the
+     * start of the AVVideoCodingInfo struct.
+     * The data for each list is an array of [x, y] pairs of int16_t.
+     * The number of vectors is given by num_mv.
+     * An offset of 0 indicates this data is not present.
+     */
+    size_t mv_offset[2];
+
+    /**
+     * Offsets to reference indices for list 0 and list 1, relative to the
+     * start of the AVVideoCodingInfo struct.
+     * The data is an array of int8_t. A value of -1 indicates the reference
+     * is not used for a specific partition.
+     * An offset of 0 indicates this data is not present.
+     */
+    size_t ref_idx_offset[2];
+    /**
+     * Number of motion vectors for list 0 and list 1.
+     */
+    uint8_t num_mv[2];
+} AVBlockInterInfo;
+
+/**
+ * Structure to hold intra-prediction information for a block.
+ */
+typedef struct AVBlockIntraInfo {
+    /**
+     * Offset to an array of intra prediction modes, relative to the
+     * start of the AVVideoCodingInfo struct.
+     * The number of modes is given by num_pred_modes.
+     */
+    size_t pred_mode_offset;
+
+    /**
+     * Number of intra prediction modes.
+     */
+    uint8_t num_pred_modes;
+
+    /**
+     * Chroma intra prediction mode.
+     */
+    uint8_t chroma_pred_mode;
+} AVBlockIntraInfo;
+
+/**
+ * Main structure for a single coding block.
+ * This structure can be recursive for codecs that use tree-based partitioning.
+ */
+typedef struct AVVideoCodingInfoBlock {
+    /**
+     * Position (x, y) and size (w, h) of the block, in pixels,
+     * relative to the top-left corner of the frame.
+     */
+    int16_t x, y;
+    uint8_t w, h;
+
+    /**
+     * Flag indicating if the block is intra-coded.
+     * 1 if intra, 0 if inter.
+     */
+    uint8_t is_intra;
+
+    /**
+     * The original, codec-specific type of this block or macroblock.
+     * This allows a filter to have codec-specific logic for interpreting
+     * the generic prediction information based on the source codec.
+     * For example, for H.264, this would store the MB type flags (MB_TYPE_*).
+     */
+    uint32_t codec_specific_type;
+
+    union {
+        AVBlockIntraInfo intra;
+        AVBlockInterInfo inter;
+    };
+
+    /**
+     * Number of child blocks this block is partitioned into.
+     * If 0, this is a leaf node in the partition tree.
+     */
+    uint8_t num_children;
+
+    /**
+     * Offset to an array of child AVVideoCodingInfoBlock structures, relative
+     * to the start of the AVVideoCodingInfo struct.
+     * This allows for recursive representation of coding structures.
+     * An offset of 0 indicates there are no children.
+     */
+    size_t children_offset;
+} AVVideoCodingInfoBlock;
+
+/**
+ * Top-level structure to be attached to an AVFrame as side data.
+ * It contains an array of the highest-level coding blocks (e.g., CTUs or MBs).
+ */
+typedef struct AVVideoCodingInfo {
+    /**
+     * Number of top-level blocks in the frame.
+     */
+    uint32_t nb_blocks;
+
+    /**
+     * Offset to an array of top-level blocks, relative to the start of the
+     * AVVideoCodingInfo struct.
+     * The actual data for these blocks, and any child blocks or sub-data,
+     * is stored contiguously in the AVBufferRef attached to the side data.
+     */
+    size_t blocks_offset;
+} AVVideoCodingInfo;
+
+/**
+ * @}
+ */
+
+#endif /* AVUTIL_VIDEO_CODING_INFO_H */
-- 
2.39.5

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* [FFmpeg-devel] [PATCH 2/4] avcodec: add option to export video coding info
  2025-07-18 10:30 [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Timothée Regaud
@ 2025-07-18 10:30 ` Timothée Regaud
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 3/4] avcodec/h264dec: implement export of video coding info for H.264 Timothée Regaud
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Timothée Regaud @ 2025-07-18 10:30 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Timothee Regaud

From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>

Adds the AV_CODEC_EXPORT_DATA_VIDEO_CODING_INFO flag and the corresponding video_coding_info option to the options table, allowing users to enable this feature.
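
As a rough illustration of how the new bit sits next to the existing export flags, the sketch below redefines the flag values locally under hypothetical DEMO_* names so that it compiles without FFmpeg headers. demo_wants_coding_info() is also hypothetical; it only mirrors the kind of test a decoder performs on export_side_data.

```c
#include <assert.h>

/* Local copies of the export flags from libavcodec/avcodec.h, plus the bit
 * proposed in this patch. The DEMO_* names are assumptions made for this
 * sketch; only the bit positions are taken from FFmpeg and the patch. */
#define DEMO_EXPORT_DATA_MVS               (1 << 0)
#define DEMO_EXPORT_DATA_PRFT              (1 << 1)
#define DEMO_EXPORT_DATA_VIDEO_ENC_PARAMS  (1 << 2)
#define DEMO_EXPORT_DATA_FILM_GRAIN        (1 << 3)
#define DEMO_EXPORT_DATA_ENHANCEMENTS      (1 << 4)
#define DEMO_EXPORT_DATA_VIDEO_CODING_INFO (1 << 5)

/* All flags that existed before this patch. */
#define DEMO_EXPORT_DATA_EXISTING                                   \
    (DEMO_EXPORT_DATA_MVS | DEMO_EXPORT_DATA_PRFT |                 \
     DEMO_EXPORT_DATA_VIDEO_ENC_PARAMS |                            \
     DEMO_EXPORT_DATA_FILM_GRAIN | DEMO_EXPORT_DATA_ENHANCEMENTS)

/* Returns 1 if the caller asked for video coding info, 0 otherwise. */
static int demo_wants_coding_info(int export_side_data)
{
    return !!(export_side_data & DEMO_EXPORT_DATA_VIDEO_CODING_INFO);
}
```

With the patch applied, a caller would presumably set the real flag in AVCodecContext.export_side_data before opening the decoder, or pass video_coding_info as a value of the export_side_data option.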

Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
---
 libavcodec/avcodec.h       | 6 ++++++
 libavcodec/options_table.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index a004cccd2d..a1f95f78dd 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -405,6 +405,12 @@ typedef struct RcOverride{
  */
 #define AV_CODEC_EXPORT_DATA_ENHANCEMENTS (1 << 4)
 
+/**
+ * Export detailed video coding information from the decoder.
+ * @see AV_FRAME_DATA_VIDEO_CODING_INFO
+ */
+#define AV_CODEC_EXPORT_DATA_VIDEO_CODING_INFO (1 << 5)
+
 /**
  * The decoder will keep a reference to the frame and may reuse it later.
  */
diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h
index c525cde80a..b773055068 100644
--- a/libavcodec/options_table.h
+++ b/libavcodec/options_table.h
@@ -90,6 +90,7 @@ static const AVOption avcodec_options[] = {
 {"prft", "export Producer Reference Time through packet side data", 0, AV_OPT_TYPE_CONST, {.i64 = AV_CODEC_EXPORT_DATA_PRFT}, INT_MIN, INT_MAX, A|V|S|E, .unit = "export_side_data"},
 {"venc_params", "export video encoding parameters through frame side data", 0, AV_OPT_TYPE_CONST, {.i64 = AV_CODEC_EXPORT_DATA_VIDEO_ENC_PARAMS}, INT_MIN, INT_MAX, V|D, .unit = "export_side_data"},
 {"film_grain", "export film grain parameters through frame side data", 0, AV_OPT_TYPE_CONST, {.i64 = AV_CODEC_EXPORT_DATA_FILM_GRAIN}, INT_MIN, INT_MAX, V|D, .unit = "export_side_data"},
+{"video_coding_info", "export video coding information through frame side data", 0, AV_OPT_TYPE_CONST, {.i64 = AV_CODEC_EXPORT_DATA_VIDEO_CODING_INFO}, INT_MIN, INT_MAX, V|D, .unit = "export_side_data"},
 {"enhancements", "export picture enhancement metadata through frame side data", 0, AV_OPT_TYPE_CONST, {.i64 = AV_CODEC_EXPORT_DATA_ENHANCEMENTS}, INT_MIN, INT_MAX, V|D, .unit = "export_side_data"},
 {"time_base", NULL, OFFSET(time_base), AV_OPT_TYPE_RATIONAL, {.dbl = 0}, 0, INT_MAX},
 {"g", "set the group of picture (GOP) size", OFFSET(gop_size), AV_OPT_TYPE_INT, {.i64 = 12 }, INT_MIN, INT_MAX, V|E},
-- 
2.39.5

* [FFmpeg-devel] [PATCH 3/4] avcodec/h264dec: implement export of video coding info for H.264
  2025-07-18 10:30 [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Timothée Regaud
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 2/4] avcodec: add option to export " Timothée Regaud
@ 2025-07-18 10:30 ` Timothée Regaud
  2025-07-18 14:17   ` Michael Niedermayer
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 4/4] vf_codecview: add support for AV_FRAME_DATA_VIDEO_CODING_INFO Timothée Regaud
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Timothée Regaud @ 2025-07-18 10:30 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Timothee Regaud

From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>

Hooks into the H.264 decoder to populate the new generic video coding info structures. It handles allocation of the side data buffer, collection of modes/MVs/refs for all macroblock types, and attachment of the final side data buffer to the output frame.

This should serve as a template for adding support for other codecs down the line.
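
The buffer allocated per picture is split into four consecutive regions: the header, one parent block per macroblock, a pool of four child blocks per macroblock, and a fixed-size sub-data slot per macroblock. The sketch below reproduces that arithmetic under stated assumptions: DemoLayout and the demo_* functions are hypothetical, and the header and block sizes are passed in as parameters because the real sizeof values depend on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-macroblock sub-data budget, following the constants in the patch:
 * two lists, each with up to 16 [x, y] vectors and 16 reference indices. */
#define DEMO_MV_BYTES_PER_LIST  (16 * sizeof(int16_t[2]))  /* 64 bytes */
#define DEMO_REF_BYTES_PER_LIST (16 * sizeof(int8_t))      /* 16 bytes */
#define DEMO_SUB_DATA_PER_MB \
    (2 * (DEMO_MV_BYTES_PER_LIST + DEMO_REF_BYTES_PER_LIST))

/* Byte offsets of each region, relative to the start of the buffer. */
typedef struct DemoLayout {
    size_t blocks_offset;    /* parent blocks, one per macroblock     */
    size_t children_offset;  /* pool of 4 child blocks per macroblock */
    size_t sub_data_offset;  /* pool of MV / ref index / mode bytes   */
    size_t total_size;       /* size to allocate                      */
} DemoLayout;

static DemoLayout demo_layout(size_t mb_num, size_t header_size,
                              size_t block_size)
{
    DemoLayout l;

    l.blocks_offset   = header_size;
    l.children_offset = l.blocks_offset   + mb_num * block_size;
    l.sub_data_offset = l.children_offset + mb_num * 4 * block_size;
    l.total_size      = l.sub_data_offset + mb_num * DEMO_SUB_DATA_PER_MB;
    return l;
}

/* Offset of the sub-data slot that belongs to macroblock mb_xy. */
static size_t demo_mb_sub_data_offset(DemoLayout l, size_t mb_xy)
{
    return l.sub_data_offset + mb_xy * DEMO_SUB_DATA_PER_MB;
}
```

Every macroblock reserves its child blocks and the full sub-data slot whether or not it uses them, which trades memory for constant-time addressing by mb_xy.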

Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
---
 Changelog                     |   1 +
 libavcodec/h264_mb.c          | 150 ++++++++++++++++++++++++++++++++++
 libavcodec/h264_mb_template.c |   3 +
 libavcodec/h264_picture.c     |   3 +
 libavcodec/h264_slice.c       |  19 +++++
 libavcodec/h264dec.c          |  17 ++++
 libavcodec/h264dec.h          |  12 +++
 7 files changed, 205 insertions(+)

diff --git a/Changelog b/Changelog
index ad2361a481..360f6fd28a 100644
--- a/Changelog
+++ b/Changelog
@@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release,
 releases are sorted from youngest to oldest.
 
 version <next>:
+- avcodec: add generic side data export for video coding info
 - Drop support for OpenSSL < 1.1.0
 - yasm support dropped, users need to use nasm
 - VVC VAAPI decoder
diff --git a/libavcodec/h264_mb.c b/libavcodec/h264_mb.c
index 0d6562b583..af790fd854 100644
--- a/libavcodec/h264_mb.c
+++ b/libavcodec/h264_mb.c
@@ -37,6 +37,156 @@
 #include "rectangle.h"
 #include "threadframe.h"
 
+/**
+ * Collects detailed mode, reference, and motion vector information for the
+ * current macroblock and stores it in the picture's coding_info buffer.
+ * This populates the generic AVVideoCodingInfoBlock structure for an H.264 macroblock.
+ */
+static void ff_h264_collect_coding_info(const H264Context *h, H264SliceContext *sl)
+{
+    AVVideoCodingInfo *coding_info;
+    AVVideoCodingInfoBlock *block;
+    AVVideoCodingInfoBlock *blocks_array;
+    uint8_t *mb_sub_data_base;
+    int mb_type;
+    int i, j, list;
+
+    if (!h->cur_pic_ptr || !h->cur_pic_ptr->coding_info_ref) {
+        return;
+    }
+
+    if (sl->mb_xy >= h->mb_num) {
+        return;
+    }
+
+    coding_info = (AVVideoCodingInfo*)h->cur_pic_ptr->coding_info_ref->data;
+    blocks_array = (AVVideoCodingInfoBlock*)((uint8_t*)coding_info + coding_info->blocks_offset);
+    block = &blocks_array[sl->mb_xy];
+    mb_type = h->cur_pic.mb_type[sl->mb_xy];
+
+    AVVideoCodingInfoBlock *child_blocks_pool = (AVVideoCodingInfoBlock*)(blocks_array + h->mb_num);
+    uint8_t *sub_data_pool_start = (uint8_t*)(child_blocks_pool + h->mb_num * 4);
+    mb_sub_data_base = sub_data_pool_start + sl->mb_xy * H264_MAX_SUB_DATA_PER_MB;
+
+    block->x = sl->mb_x * 16;
+    block->y = sl->mb_y * 16;
+    block->w = 16;
+    block->h = 16;
+    block->is_intra = IS_INTRA(mb_type);
+    block->codec_specific_type = mb_type;
+    block->num_children = 0;
+    block->children_offset = 0;
+
+    if (IS_8X8(mb_type)) {
+        block->num_children = 4;
+        block->children_offset = (uint8_t*)(child_blocks_pool + (sl->mb_xy * 4)) - (uint8_t*)coding_info;
+        const size_t sub_data_per_child = H264_MAX_SUB_DATA_PER_MB >> 2;
+
+        for (i = 0; i < 4; i++) {
+            AVVideoCodingInfoBlock *child = &((AVVideoCodingInfoBlock*)((uint8_t*)coding_info + block->children_offset))[i];
+            uint8_t *child_sub_data_base = mb_sub_data_base + i * sub_data_per_child;
+            int sub_mb_type = sl->sub_mb_type[i];
+            // Calculate 8x8 sub-block offsets from the raster-scan index 'i'.
+            int part_x = (i & 1) * 8; // Isolates bit 0 for horizontal position.
+            int part_y = (i & 2) * 4; // Isolates bit 1 for vertical position.
+
+            child->x = block->x + part_x;
+            child->y = block->y + part_y;
+            child->w = 8;
+            child->h = 8;
+            child->is_intra = 0;
+            child->codec_specific_type = sub_mb_type;
+            child->num_children = 0;
+            child->children_offset = 0;
+
+            int num_partitions = 1;
+            if (IS_SUB_8X4(sub_mb_type) || IS_SUB_4X8(sub_mb_type)) num_partitions = 2;
+            if (IS_SUB_4X4(sub_mb_type)) num_partitions = 4;
+
+            // Define the memory layout for this child's sub-data
+            int16_t (*mv_l0)[2]   = (void*)child_sub_data_base;
+            int8_t  *ref_idx_l0 = (int8_t*)(mv_l0 + num_partitions);
+            int16_t (*mv_l1)[2]   = (void*)(ref_idx_l0 + num_partitions);
+            int8_t  *ref_idx_l1 = (int8_t*)(mv_l1 + num_partitions);
+
+            for (list = 0; list < 2; list++) {
+                if (USES_LIST(sub_mb_type, list)) {
+                    child->inter.num_mv[list] = num_partitions;
+                    child->inter.mv_offset[list] = !list ? ((uint8_t*)mv_l0 - (uint8_t*)coding_info) : ((uint8_t*)mv_l1 - (uint8_t*)coding_info);
+                    child->inter.ref_idx_offset[list] = !list ? ((uint8_t*)ref_idx_l0 - (uint8_t*)coding_info) : ((uint8_t*)ref_idx_l1 - (uint8_t*)coding_info);
+
+                    // Reconstruct pointers to write data
+                    int16_t (*current_mv)[2] = (int16_t (*)[2])((uint8_t*)coding_info + child->inter.mv_offset[list]);
+                    int8_t *current_ref_idx = (int8_t*)((uint8_t*)coding_info + child->inter.ref_idx_offset[list]);
+
+                    for (j = 0; j < num_partitions; j++) {
+                        int block_idx = i * 4;
+                        if (IS_SUB_8X4(sub_mb_type)) block_idx += j * 2;
+                        else if (IS_SUB_4X8(sub_mb_type)) block_idx += j;
+                        else if (IS_SUB_4X4(sub_mb_type)) block_idx += j;
+
+                        current_ref_idx[j] = sl->ref_cache[list][scan8[block_idx]];
+                        current_mv[j][0]   = sl->mv_cache[list][scan8[block_idx]][0];
+                        current_mv[j][1]   = sl->mv_cache[list][scan8[block_idx]][1];
+                    }
+                } else {
+                    child->inter.num_mv[list] = 0;
+                    child->inter.mv_offset[list] = 0;
+                    child->inter.ref_idx_offset[list] = 0;
+                }
+            }
+        }
+    } else if (block->is_intra) {
+        block->intra.pred_mode_offset = (uint8_t*)mb_sub_data_base - (uint8_t*)coding_info;
+        int8_t *pred_mode = (int8_t*)mb_sub_data_base; // Keep temporary pointer to write data
+        if (IS_INTRA4x4(mb_type)) {
+            block->intra.num_pred_modes = 16;
+            for (i = 0; i < 16; i++)
+                pred_mode[i] = sl->intra4x4_pred_mode_cache[scan8[i]];
+        } else {
+            block->intra.num_pred_modes = 1;
+            pred_mode[0] = sl->intra16x16_pred_mode;
+        }
+        block->intra.chroma_pred_mode = sl->chroma_pred_mode;
+    } else { // Non-8x8 Inter modes
+        int num_mvs = 0;
+        if (IS_16X16(mb_type)) num_mvs = 1;
+        else if (IS_16X8(mb_type) || IS_8X16(mb_type)) num_mvs = 2;
+
+        // Define the memory layout for this block's sub-data
+        int16_t (*mv_l0)[2]   = (void*)mb_sub_data_base;
+        int16_t (*mv_l1)[2]   = mv_l0 + num_mvs;
+        int8_t  *ref_idx_l0 = (int8_t*)(mv_l1 + num_mvs);
+        int8_t  *ref_idx_l1 = ref_idx_l0 + num_mvs;
+
+        for (list = 0; list < 2; list++) {
+            if (USES_LIST(mb_type, list)) {
+                block->inter.num_mv[list] = num_mvs;
+                block->inter.mv_offset[list] = !list ? ((uint8_t*)mv_l0 - (uint8_t*)coding_info) : ((uint8_t*)mv_l1 - (uint8_t*)coding_info);
+                block->inter.ref_idx_offset[list] = !list ? ((uint8_t*)ref_idx_l0 - (uint8_t*)coding_info) : ((uint8_t*)ref_idx_l1 - (uint8_t*)coding_info);
+
+                // Reconstruct pointers to write data
+                int16_t (*current_mv)[2] = (int16_t (*)[2])((uint8_t*)coding_info + block->inter.mv_offset[list]);
+                int8_t *current_ref_idx = (int8_t*)((uint8_t*)coding_info + block->inter.ref_idx_offset[list]);
+
+                for (i = 0; i < num_mvs; i++) {
+                    int block_idx = 0;
+                    if (IS_16X8(mb_type)) block_idx = i * 8;
+                    else if (IS_8X16(mb_type)) block_idx = i * 4;
+
+                    current_ref_idx[i] = sl->ref_cache[list][scan8[block_idx]];
+                    current_mv[i][0]   = sl->mv_cache[list][scan8[block_idx]][0];
+                    current_mv[i][1]   = sl->mv_cache[list][scan8[block_idx]][1];
+                }
+            } else {
+                block->inter.num_mv[list] = 0;
+                block->inter.mv_offset[list] = 0;
+                block->inter.ref_idx_offset[list] = 0;
+            }
+        }
+    }
+}
+
 static inline int get_lowest_part_list_y(H264SliceContext *sl,
                                          int n, int height, int y_offset, int list)
 {
diff --git a/libavcodec/h264_mb_template.c b/libavcodec/h264_mb_template.c
index d5ea26a6e3..6dc09f0611 100644
--- a/libavcodec/h264_mb_template.c
+++ b/libavcodec/h264_mb_template.c
@@ -53,6 +53,9 @@ static av_noinline void FUNC(hl_decode_mb)(const H264Context *h, H264SliceContex
     const int block_h   = 16 >> h->chroma_y_shift;
     const int chroma422 = CHROMA422(h);
 
+    // Collect macroblock information after decoding
+    ff_h264_collect_coding_info(h, sl);
+
     dest_y  = h->cur_pic.f->data[0] + ((mb_x << PIXEL_SHIFT)     + mb_y * sl->linesize)  * 16;
     dest_cb = h->cur_pic.f->data[1] +  (mb_x << PIXEL_SHIFT) * 8 + mb_y * sl->uvlinesize * block_h;
     dest_cr = h->cur_pic.f->data[2] +  (mb_x << PIXEL_SHIFT) * 8 + mb_y * sl->uvlinesize * block_h;
diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
index f5d2b31cd6..5572f45fae 100644
--- a/libavcodec/h264_picture.c
+++ b/libavcodec/h264_picture.c
@@ -35,6 +35,7 @@
 #include "libavutil/refstruct.h"
 #include "thread.h"
 #include "threadframe.h"
+#include "libavutil/mem.h"
 
 void ff_h264_unref_picture(H264Picture *pic)
 {
@@ -56,6 +57,7 @@ void ff_h264_unref_picture(H264Picture *pic)
         av_refstruct_unref(&pic->ref_index[i]);
     }
     av_refstruct_unref(&pic->decode_error_flags);
+    av_buffer_unref(&pic->coding_info_ref);
 
     memset((uint8_t*)pic + off, 0, sizeof(*pic) - off);
 }
@@ -103,6 +105,7 @@ static void h264_copy_picture_params(H264Picture *dst, const H264Picture *src)
     dst->mb_height     = src->mb_height;
     dst->mb_stride     = src->mb_stride;
     dst->needs_fg      = src->needs_fg;
+    av_buffer_replace(&dst->coding_info_ref, src->coding_info_ref);
 }
 
 int ff_h264_ref_picture(H264Picture *dst, const H264Picture *src)
diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c
index 7e53e38cca..4398cf2f98 100644
--- a/libavcodec/h264_slice.c
+++ b/libavcodec/h264_slice.c
@@ -266,6 +266,25 @@ static int alloc_picture(H264Context *h, H264Picture *pic)
     pic->mb_height = h->mb_height;
     pic->mb_stride = h->mb_stride;
 
+    // Allocate the coding info buffer for this picture.
+    if (h->avctx->export_side_data & AV_CODEC_EXPORT_DATA_VIDEO_CODING_INFO) {
+        // Total size must account for the main struct, the array of parent blocks,
+        // a pool for all potential child blocks, and the sub-data for all blocks.
+        // For H.264, the max children per MB is 4 (for 8x8 mode).
+        size_t coding_info_size = sizeof(AVVideoCodingInfo) +
+                              h->mb_num * sizeof(AVVideoCodingInfoBlock) +      // Parent blocks
+                              h->mb_num * 4 * sizeof(AVVideoCodingInfoBlock) +  // Pool for child blocks
+                              h->mb_num * H264_MAX_SUB_DATA_PER_MB;             // Pool for sub-data (MVs, modes)
+
+
+        pic->coding_info_ref = av_buffer_allocz(coding_info_size);
+        if (!pic->coding_info_ref)
+            goto fail;
+        AVVideoCodingInfo *info = (AVVideoCodingInfo*)pic->coding_info_ref->data;
+        info->nb_blocks = h->mb_num;
+        info->blocks_offset = sizeof(AVVideoCodingInfo);
+    }
+
     return 0;
 fail:
     ff_h264_unref_picture(pic);
diff --git a/libavcodec/h264dec.c b/libavcodec/h264dec.c
index 82b85b3387..a7b9e56db3 100644
--- a/libavcodec/h264dec.c
+++ b/libavcodec/h264dec.c
@@ -887,6 +887,23 @@ static int output_frame(H264Context *h, AVFrame *dst, H264Picture *srcp)
             goto fail;
     }
 
+    // Attach the coding info collected for this picture.
+    if (srcp->coding_info_ref) {
+        AVFrameSideData *side_data;
+
+        av_log(h->avctx, AV_LOG_DEBUG, "Attaching coding_info to frame %"PRId64"\n", dst->pts);
+
+        // Create a new side data entry.
+        side_data = av_frame_new_side_data_from_buf(dst, AV_FRAME_DATA_VIDEO_CODING_INFO, srcp->coding_info_ref);
+        if (!side_data) {
+            av_log(h->avctx, AV_LOG_ERROR, "Failed to allocate side data for coding info.\n");
+        } else {
+            // The AVFrame now owns the buffer, so we release our reference to it.
+            // It will be freed when the frame is unreferenced.
+            srcp->coding_info_ref = NULL;
+        }
+    }
+
     if (!(h->avctx->export_side_data & AV_CODEC_EXPORT_DATA_FILM_GRAIN))
         av_frame_remove_side_data(dst, AV_FRAME_DATA_FILM_GRAIN_PARAMS);
 
diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
index c28d278240..e52f737766 100644
--- a/libavcodec/h264dec.h
+++ b/libavcodec/h264dec.h
@@ -45,6 +45,7 @@
 #include "mpegutils.h"
 #include "threadframe.h"
 #include "videodsp.h"
+#include "libavutil/video_coding_info.h"
 
 #define H264_MAX_PICTURE_COUNT 36
 
@@ -102,6 +103,14 @@
 // does this mb use listX, note does not work if subMBs
 #define USES_LIST(a, list) ((a) & ((MB_TYPE_P0L0 | MB_TYPE_P1L0) << (2 * (list))))
 
+/* Constants for AVVideoCodingInfo buffer allocation for H.264.
+ * Max sub-data per MB is for inter prediction with 16 partitions. */
+#define H264_MAX_MV_SIZE_PER_LIST  (16 * sizeof(int16_t[2]))
+#define H264_MAX_REF_SIZE_PER_LIST (16 * sizeof(int8_t))
+#define H264_INTER_SUB_DATA_SIZE   (2 * (H264_MAX_MV_SIZE_PER_LIST + H264_MAX_REF_SIZE_PER_LIST))
+#define H264_INTRA_SUB_DATA_SIZE   (16 * sizeof(int8_t))
+#define H264_MAX_SUB_DATA_PER_MB   FFMAX(H264_INTER_SUB_DATA_SIZE, H264_INTRA_SUB_DATA_SIZE)
+
 /**
  * Memory management control operation.
  */
@@ -164,6 +173,9 @@ typedef struct H264Picture {
     atomic_int *decode_error_flags;
 
     int gray;
+
+    // Buffer to store macroblock mode information for this picture.
+    AVBufferRef *coding_info_ref;
 } H264Picture;
 
 typedef struct H264Ref {
-- 
2.39.5

* [FFmpeg-devel] [PATCH 4/4] vf_codecview: add support for AV_FRAME_DATA_VIDEO_CODING_INFO
  2025-07-18 10:30 [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Timothée Regaud
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 2/4] avcodec: add option to export " Timothée Regaud
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 3/4] avcodec/h264dec: implement export of video coding info for H.264 Timothée Regaud
@ 2025-07-18 10:30 ` Timothée Regaud
  2025-07-18 15:48 ` [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Michael Niedermayer
  2025-07-18 17:42 ` Lynne
  4 siblings, 0 replies; 7+ messages in thread
From: Timothée Regaud @ 2025-07-18 10:30 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Timothee Regaud

From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>

The filter now checks for AV_FRAME_DATA_VIDEO_CODING_INFO and contains a recursive logging function to traverse the block-partitioning tree. This demonstrates how a consumer would parse the new generic data structure.
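
The traversal pattern a consumer needs can be sketched independently of the filter. The example below is not the filter's logging code: DemoTree, DemoNode and the demo_* functions are hypothetical stand-ins that keep only the fields needed to walk the tree, and the walk counts leaf blocks instead of printing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced versions of the block and top-level structures. */
typedef struct DemoNode {
    uint8_t w, h;
    uint8_t num_children;
    size_t  children_offset;  /* relative to the buffer start; 0 = leaf */
} DemoNode;

typedef struct DemoTree {
    uint32_t nb_blocks;
    size_t   blocks_offset;   /* relative to the buffer start */
} DemoTree;

/* Resolve an offset into an array of nodes. */
static const DemoNode *demo_nodes(const void *base, size_t offset)
{
    return (const DemoNode *)((const uint8_t *)base + offset);
}

/* Count the leaf blocks below one node by following child offsets. */
static unsigned demo_count_leaves(const void *base, const DemoNode *node)
{
    unsigned n = 0;

    if (!node->num_children || !node->children_offset)
        return 1;

    const DemoNode *child = demo_nodes(base, node->children_offset);
    for (int i = 0; i < node->num_children; i++)
        n += demo_count_leaves(base, &child[i]);
    return n;
}

/* Two 16x16 top-level blocks; the second is split into four 8x8 children. */
static unsigned demo_example_leaf_count(void)
{
    const size_t top_off   = sizeof(DemoTree);
    const size_t child_off = top_off + 2 * sizeof(DemoNode);
    const size_t size      = child_off + 4 * sizeof(DemoNode);
    uint8_t *buf = calloc(1, size);
    unsigned leaves = 0;

    if (!buf)
        return 0;

    DemoTree *tree      = (DemoTree *)buf;
    tree->nb_blocks     = 2;
    tree->blocks_offset = top_off;

    DemoNode *top = (DemoNode *)(buf + top_off);
    top[0].w = top[0].h = 16;
    top[1].w = top[1].h = 16;
    top[1].num_children    = 4;
    top[1].children_offset = child_off;

    DemoNode *child = (DemoNode *)(buf + child_off);
    for (int i = 0; i < 4; i++)
        child[i].w = child[i].h = 8;

    const DemoNode *blocks = demo_nodes(buf, tree->blocks_offset);
    for (uint32_t i = 0; i < tree->nb_blocks; i++)
        leaves += demo_count_leaves(buf, &blocks[i]);

    free(buf);
    return leaves;
}
```

For H.264 the recursion is at most one level deep, but the same function would handle deeper partition trees without change.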

Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
---
 libavfilter/vf_codecview.c | 186 +++++++++++++++++++++++++++++++++++++
 1 file changed, 186 insertions(+)

diff --git a/libavfilter/vf_codecview.c b/libavfilter/vf_codecview.c
index a4a701b00c..f381c01067 100644
--- a/libavfilter/vf_codecview.c
+++ b/libavfilter/vf_codecview.c
@@ -39,6 +39,14 @@
 #include "qp_table.h"
 #include "video.h"
 
+#include "libavcodec/h264.h"
+#include "libavcodec/h264pred.h"
+#include "libavutil/video_coding_info.h"
+#include "libavcodec/h264dec.h"
+#include "libavcodec/mpegutils.h"
+
+#define GET_PTR(base, offset) ((void*)((uint8_t*)(base) + (offset)))
+
 #define MV_P_FOR  (1<<0)
 #define MV_B_FOR  (1<<1)
 #define MV_B_BACK (1<<2)
@@ -56,6 +64,8 @@ typedef struct CodecViewContext {
     int hsub, vsub;
     int qp;
     int block;
+    int show_modes;
+    int frame_count;
 } CodecViewContext;
 
 #define OFFSET(x) offsetof(CodecViewContext, x)
@@ -78,9 +88,58 @@ static const AVOption codecview_options[] = {
         CONST("pf", "P-frames", FRAME_TYPE_P, "frame_type"),
         CONST("bf", "B-frames", FRAME_TYPE_B, "frame_type"),
     { "block",      "set block partitioning structure to visualize", OFFSET(block), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, FLAGS },
+    { "show_modes", "Visualize macroblock modes", OFFSET(show_modes), AV_OPT_TYPE_BOOL, {.i64=0}, 0, 1, FLAGS },
     { NULL }
 };
 
+static const char *get_intra_4x4_mode_name(int mode) {
+    if (mode < 0) return "N/A"; // Handle unavailable edge blocks
+    switch (mode) {
+    case VERT_PRED:            return "V";
+    case HOR_PRED:             return "H";
+    case DC_PRED:              return "DC";
+    case DIAG_DOWN_LEFT_PRED:  return "DL";
+    case DIAG_DOWN_RIGHT_PRED: return "DR";
+    case VERT_RIGHT_PRED:      return "VR";
+    case HOR_DOWN_PRED:        return "HD";
+    case VERT_LEFT_PRED:       return "VL";
+    case HOR_UP_PRED:          return "HU";
+    default:                   return "?";
+    }
+}
+
+static const char *get_intra_16x16_mode_name(int mode) {
+    switch (mode) {
+    case VERT_PRED8x8:   return "Vertical";
+    case HOR_PRED8x8:    return "Horizontal";
+    case DC_PRED8x8:     return "DC";
+    case PLANE_PRED8x8:  return "Plane";
+    default:             return "Unknown";
+    }
+}
+
+/**
+ * Get a string representation for an inter sub-macroblock type.
+ * For B-frames, this indicates prediction direction (L0, L1, BI).
+ * For P-frames, this indicates partition size (8x8, 8x4, etc.).
+ */
+static const char *get_inter_sub_mb_type_name(uint32_t type, char pict_type) {
+    if (pict_type == 'B') {
+        if (type & MB_TYPE_DIRECT2) return "D";
+        int has_l0 = (type & MB_TYPE_L0);
+        int has_l1 = (type & MB_TYPE_L1);
+        if (has_l0 && has_l1) return "BI";
+        if (has_l0) return "L0";
+        if (has_l1) return "L1";
+    } else if (pict_type == 'P') {
+        if (IS_SUB_8X8(type)) return "8x8";
+        if (IS_SUB_8X4(type)) return "8x4";
+        if (IS_SUB_4X8(type)) return "4x8";
+        if (IS_SUB_4X4(type)) return "4x4";
+    }
+    return "?";
+}
+
 AVFILTER_DEFINE_CLASS(codecview);
 
 static int clip_line(int *sx, int *sy, int *ex, int *ey, int maxx)
@@ -219,12 +278,139 @@ static void draw_block_rectangle(uint8_t *buf, int sx, int sy, int w, int h, ptr
     }
 }
 
+static void format_mv_info(char *buf, size_t buf_size, const AVVideoCodingInfo *info_base,
+                           const AVBlockInterInfo *inter, int list, int mv_idx)
+{
+    // Check if the list is active, the index is valid, and offsets are set.
+    if (inter->num_mv[list] <= mv_idx || !inter->mv_offset[list] || !inter->ref_idx_offset[list]) {
+        return;
+    }
+
+    int16_t (*mv)[2]   = GET_PTR(info_base, inter->mv_offset[list]);
+    int8_t  *ref_idx = GET_PTR(info_base, inter->ref_idx_offset[list]);
+
+    if (ref_idx[mv_idx] >= 0) {
+        snprintf(buf, buf_size, " L%d[ref%d, %4d, %4d]",
+                 list,
+                 ref_idx[mv_idx],
+                 mv[mv_idx][0],
+                 mv[mv_idx][1]);
+    }
+}
+
+/**
+ * Recursive function to log a block and its children.
+ * The traversal handles any tree-based partitioning; type names are decoded with H.264 macros.
+ */
+static void log_block_info(AVFilterContext *ctx, const AVVideoCodingInfo *info_base,
+                           const AVVideoCodingInfoBlock *block,
+                           char pict_type, int64_t frame_num, int indent_level)
+{
+    char indent[16] = {0};
+    char line_buf[1024];
+    char info_buf[512];
+    char mv_buf[256];
+    int mb_type = block->codec_specific_type;
+
+    if (indent_level > 0) { // clamp so deep trees cannot overflow indent[]
+        memset(indent, '\t', FFMIN(indent_level, (int)sizeof(indent) - 1));
+    }
+
+    // Common prefix for all log lines
+    snprintf(line_buf, sizeof(line_buf), "F:%-3"PRId64" |%c| %s%-3dx%-3d @(%4d,%4d)|",
+             frame_num, pict_type, indent, block->w, block->h, block->x, block->y);
+
+    if (block->is_intra) {
+        int8_t *pred_mode = GET_PTR(info_base, block->intra.pred_mode_offset);
+        if (IS_INTRA4x4(mb_type)) {
+            snprintf(info_buf, sizeof(info_buf),
+                     "Intra: I_4x4 P:[%s,%s,%s,%s|%s,%s,%s,%s|%s,%s,%s,%s|%s,%s,%s,%s]",
+                     get_intra_4x4_mode_name(pred_mode[0]), get_intra_4x4_mode_name(pred_mode[1]),
+                     get_intra_4x4_mode_name(pred_mode[2]), get_intra_4x4_mode_name(pred_mode[3]),
+                     get_intra_4x4_mode_name(pred_mode[4]), get_intra_4x4_mode_name(pred_mode[5]),
+                     get_intra_4x4_mode_name(pred_mode[6]), get_intra_4x4_mode_name(pred_mode[7]),
+                     get_intra_4x4_mode_name(pred_mode[8]), get_intra_4x4_mode_name(pred_mode[9]),
+                     get_intra_4x4_mode_name(pred_mode[10]), get_intra_4x4_mode_name(pred_mode[11]),
+                     get_intra_4x4_mode_name(pred_mode[12]), get_intra_4x4_mode_name(pred_mode[13]),
+                     get_intra_4x4_mode_name(pred_mode[14]), get_intra_4x4_mode_name(pred_mode[15]));
+        } else if (IS_INTRA16x16(mb_type)) {
+            snprintf(info_buf, sizeof(info_buf), "Intra: I_16x16 M:%-8s",
+                     get_intra_16x16_mode_name(pred_mode[0]));
+        } else {
+            snprintf(info_buf, sizeof(info_buf), "Intra: Type %d", mb_type);
+        }
+        av_log(ctx, AV_LOG_INFO, "%s%s\n", line_buf, info_buf);
+    } else { // Inter
+        const char *prefix = (pict_type == 'P') ? "P" : "B";
+        const char *type_str = "Unknown";
+
+        // Use codec_specific_type to get a human-readable name
+        if (IS_SKIP(mb_type)) type_str = "Skip";
+        else if (IS_16X16(mb_type)) type_str = "16x16";
+        else if (IS_16X8(mb_type)) type_str = "16x8";
+        else if (IS_8X16(mb_type)) type_str = "8x16";
+        else if (IS_8X8(mb_type)) type_str = "8x8";
+        else type_str = get_inter_sub_mb_type_name(mb_type, pict_type); // For sub-partitions
+
+        snprintf(info_buf, sizeof(info_buf), "Inter: %s_%s", prefix, type_str);
+
+        // If there are no children, this is a leaf node, print its MVs.
+        if (!block->num_children) {
+            mv_buf[0] = '\0';
+            // A block can have multiple MVs (e.g., 8x4 partition has 2)
+            for (int i = 0; i < FFMAX(block->inter.num_mv[0], block->inter.num_mv[1]); i++) {
+                char temp_mv_buf[128] = {0};
+                if (block->inter.num_mv[0] > i && block->inter.mv_offset[0])
+                    format_mv_info(temp_mv_buf, sizeof(temp_mv_buf), info_base, &block->inter, 0, i);
+                if (pict_type == 'B' && block->inter.num_mv[1] > i && block->inter.mv_offset[1])
+                    format_mv_info(temp_mv_buf + strlen(temp_mv_buf), sizeof(temp_mv_buf) - strlen(temp_mv_buf), info_base, &block->inter, 1, i);
+
+                if (i > 0) strncat(mv_buf, " |", sizeof(mv_buf) - strlen(mv_buf) - 1);
+                strncat(mv_buf, temp_mv_buf, sizeof(mv_buf) - strlen(mv_buf) - 1);
+            }
+            av_log(ctx, AV_LOG_INFO, "%s%s%s\n", line_buf, info_buf, mv_buf);
+        } else {
+            // This is a parent node, just print its type and recurse.
+            av_log(ctx, AV_LOG_INFO, "%s%s\n", line_buf, info_buf);
+        }
+    }
+
+    // Recursive call for children
+    if (block->num_children > 0 && block->children_offset > 0) {
+        const AVVideoCodingInfoBlock *children = GET_PTR(info_base, block->children_offset);
+        for (int i = 0; i < block->num_children; i++) {
+            log_block_info(ctx, info_base, &children[i], pict_type, frame_num, indent_level + 1);
+        }
+    }
+}
+
+static void log_coding_info(AVFilterContext *ctx, AVFrame *frame, int64_t frame_num)
+{
+    AVFrameSideData *sd = av_frame_get_side_data(frame, AV_FRAME_DATA_VIDEO_CODING_INFO);
+    if (!sd)
+        return;
+
+    const AVVideoCodingInfo *coding_info = (const AVVideoCodingInfo *)sd->data;
+    const AVVideoCodingInfoBlock *blocks_array = GET_PTR(coding_info, coding_info->blocks_offset);
+    char pict_type = av_get_picture_type_char(frame->pict_type);
+
+    for (int i = 0; i < coding_info->nb_blocks; i++) {
+        log_block_info(ctx, coding_info, &blocks_array[i], pict_type, frame_num, 0);
+    }
+}
+
 static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
 {
     AVFilterContext *ctx = inlink->dst;
     CodecViewContext *s = ctx->priv;
     AVFilterLink *outlink = ctx->outputs[0];
 
+    if (s->show_modes) {
+        log_coding_info(ctx, frame, s->frame_count);
+    }
+
+    s->frame_count++;
+
     if (s->qp) {
         enum AVVideoEncParamsType qp_type;
         int qstride, ret;
-- 
2.39.5

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* Re: [FFmpeg-devel] [PATCH 3/4] avcodec/h264dec: implement export of video coding info for H.264
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 3/4] avcodec/h264dec: implement export of video coding info for H.264 Timothée Regaud
@ 2025-07-18 14:17   ` Michael Niedermayer
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Niedermayer @ 2025-07-18 14:17 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


On Fri, Jul 18, 2025 at 12:30:54PM +0200, Timothée Regaud wrote:
> From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
> 
> Hooks into the H.264 decoder to populate the new generic video coding info structures. It handles allocation of the side data buffer, collection of modes/MVs/refs for all macroblock types, and attachment of the final side data buffer to the output frame.
> 
> This should serve as a template for adding support for other codecs down the line.
> 
> Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
> ---
>  Changelog                     |   1 +
>  libavcodec/h264_mb.c          | 150 ++++++++++++++++++++++++++++++++++
>  libavcodec/h264_mb_template.c |   3 +
>  libavcodec/h264_picture.c     |   3 +
>  libavcodec/h264_slice.c       |  19 +++++
>  libavcodec/h264dec.c          |  17 ++++
>  libavcodec/h264dec.h          |  12 +++
>  7 files changed, 205 insertions(+)
> 
[...]
> @@ -102,6 +103,14 @@
>  // does this mb use listX, note does not work if subMBs
>  #define USES_LIST(a, list) ((a) & ((MB_TYPE_P0L0 | MB_TYPE_P1L0) << (2 * (list))))
>  
> +/* Constants for AVVideoCodingInfo buffer allocation for H.264.
> + * Max sub-data per MB is for inter prediction with 16 partitions. */
> +static const size_t H264_MAX_MV_SIZE_PER_LIST = 16 * sizeof(int16_t[2]);
> +static const size_t H264_MAX_REF_SIZE_PER_LIST = 16 * sizeof(int8_t);
> +static const size_t H264_INTER_SUB_DATA_SIZE = 2 * (H264_MAX_MV_SIZE_PER_LIST + H264_MAX_REF_SIZE_PER_LIST);
> +static const size_t H264_INTRA_SUB_DATA_SIZE = 16 * sizeof(int8_t);
> +static const size_t H264_MAX_SUB_DATA_PER_MB = FFMAX(H264_INTER_SUB_DATA_SIZE, H264_INTRA_SUB_DATA_SIZE);

CC	libavfilter/vf_codecview.o
In file included from src/libavfilter/vf_codecview.c:45:0:
src/libavcodec/h264dec.h:110:48: error: initializer element is not constant
 static const size_t H264_INTER_SUB_DATA_SIZE = 2 * (H264_MAX_MV_SIZE_PER_LIST + H264_MAX_REF_SIZE_PER_LIST);
                                                ^
In file included from src/libavutil/error.h:30:0,
                 from src/libavutil/common.h:43,
                 from src/libavutil/avutil.h:300,
                 from src/libavutil/opt.h:31,
                 from src/libavfilter/vf_codecview.c:34:
src/libavutil/macros.h:47:20: error: initializer element is not constant
 #define FFMAX(a,b) ((a) > (b) ? (a) : (b))
                    ^
src/libavcodec/h264dec.h:112:48: note: in expansion of macro ‘FFMAX’
 static const size_t H264_MAX_SUB_DATA_PER_MB = FFMAX(H264_INTER_SUB_DATA_SIZE, H264_INTRA_SUB_DATA_SIZE);

arm-linux-gnueabi-gcc-7 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
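
For context on the error above: in C (unlike C++), a const-qualified object is not a constant expression, so one static const size_t cannot be used in the initializer of another at file scope. A minimal sketch of one possible fix, assuming the sizes only need to be compile-time constants, is to use macros. SKETCH_MAX, sketch_max_sub_data_per_mb and sketch_sub_data_size are hypothetical names standing in for FFMAX and the decoder's own code; this is not the change that was actually made to the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for FFMAX from libavutil/macros.h, so the sketch is self-contained. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Macros expand to integer constant expressions, so they may appear in
 * file-scope initializers and in other macros. */
#define H264_MAX_MV_SIZE_PER_LIST  (16 * sizeof(int16_t[2]))
#define H264_MAX_REF_SIZE_PER_LIST (16 * sizeof(int8_t))
#define H264_INTER_SUB_DATA_SIZE   (2 * (H264_MAX_MV_SIZE_PER_LIST + H264_MAX_REF_SIZE_PER_LIST))
#define H264_INTRA_SUB_DATA_SIZE   (16 * sizeof(int8_t))
#define H264_MAX_SUB_DATA_PER_MB   SKETCH_MAX(H264_INTER_SUB_DATA_SIZE, H264_INTRA_SUB_DATA_SIZE)

/* This file-scope initializer is valid C because every operand is constant. */
static const size_t sketch_max_sub_data_per_mb = H264_MAX_SUB_DATA_PER_MB;

/* Hypothetical helper: bytes of sub-data to reserve for nb_mbs macroblocks. */
static size_t sketch_sub_data_size(size_t nb_mbs)
{
    assert(nb_mbs <= SIZE_MAX / sketch_max_sub_data_per_mb); /* overflow guard */
    return nb_mbs * sketch_max_sub_data_per_mb;
}
```

Keeping such constants out of a header that other libraries include (here vf_codecview.c pulls in h264dec.h) would also avoid defining a private copy of each object in every translation unit.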

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope


* Re: [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info
  2025-07-18 10:30 [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Timothée Regaud
                   ` (2 preceding siblings ...)
  2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 4/4] vf_codecview: add support for AV_FRAME_DATA_VIDEO_CODING_INFO Timothée Regaud
@ 2025-07-18 15:48 ` Michael Niedermayer
  2025-07-18 17:42 ` Lynne
  4 siblings, 0 replies; 7+ messages in thread
From: Michael Niedermayer @ 2025-07-18 15:48 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


Hi

On Fri, Jul 18, 2025 at 12:30:52PM +0200, Timothée Regaud wrote:
> From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
> 
> Adds the generic data structures to libavutil. The design is recursive to support other codecs, even though the implementation is only for H.264 for now.
> 
> Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
> ---
>  libavutil/Makefile            |   1 +
>  libavutil/frame.h             |   7 ++
>  libavutil/side_data.c         |   1 +
>  libavutil/video_coding_info.h | 163 ++++++++++++++++++++++++++++++++++
>  4 files changed, 172 insertions(+)
>  create mode 100644 libavutil/video_coding_info.h
> 
> diff --git a/libavutil/Makefile b/libavutil/Makefile
> index 94a56bb72f..44e51ab7ae 100644
> --- a/libavutil/Makefile
> +++ b/libavutil/Makefile
> @@ -93,6 +93,7 @@ HEADERS = adler32.h                                                     \
>            tree.h                                                        \
>            twofish.h                                                     \
>            uuid.h                                                        \
> +          video_coding_info.h                                           \
>            version.h                                                     \
>            video_enc_params.h                                            \
>            xtea.h                                                        \
> diff --git a/libavutil/frame.h b/libavutil/frame.h
> index c50cd263d9..f4404472a0 100644
> --- a/libavutil/frame.h
> +++ b/libavutil/frame.h
> @@ -254,6 +254,13 @@ enum AVFrameSideDataType {
>       * libavutil/tdrdi.h.
>       */
>      AV_FRAME_DATA_3D_REFERENCE_DISPLAYS,
> +
> +    /**
> +     * Detailed block-level coding information. The data is an AVVideoCodingInfo
> +     * structure. This is exported by video decoders and can be used by filters
> +     * for analysis and visualization.
> +     */
> +    AV_FRAME_DATA_VIDEO_CODING_INFO,
>  };
>  
>  enum AVActiveFormatDescription {
> diff --git a/libavutil/side_data.c b/libavutil/side_data.c
> index fa2a2c2a13..b938ef6f52 100644
> --- a/libavutil/side_data.c
> +++ b/libavutil/side_data.c
> @@ -56,6 +56,7 @@ static const AVSideDataDescriptor sd_props[] = {
>      [AV_FRAME_DATA_SEI_UNREGISTERED]            = { "H.26[45] User Data Unregistered SEI message",  AV_SIDE_DATA_PROP_MULTI },
>      [AV_FRAME_DATA_VIDEO_HINT]                  = { "Encoding video hint",                          AV_SIDE_DATA_PROP_SIZE_DEPENDENT },
>      [AV_FRAME_DATA_3D_REFERENCE_DISPLAYS]       = { "3D Reference Displays Information",            AV_SIDE_DATA_PROP_GLOBAL },
> +    [AV_FRAME_DATA_VIDEO_CODING_INFO]           = { "Video Coding Info",                            AV_SIDE_DATA_PROP_SIZE_DEPENDENT },
>  };
>  
>  const AVSideDataDescriptor *av_frame_side_data_desc(enum AVFrameSideDataType type)
> diff --git a/libavutil/video_coding_info.h b/libavutil/video_coding_info.h
> new file mode 100644
> index 0000000000..17e9345892
> --- /dev/null
> +++ b/libavutil/video_coding_info.h
> @@ -0,0 +1,163 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_VIDEO_CODING_INFO_H
> +#define AVUTIL_VIDEO_CODING_INFO_H
> +
> +#include <stdint.h>
> +#include <stddef.h>
> +
> +/**
> + * @file
> + * @ingroup lavu_frame
> + * Structures for describing block-level video coding information.
> + */
> +
> +/**
> + * @defgroup lavu_video_coding_info Video Coding Info
> + * @ingroup lavu_frame
> + *
> + * @{
> + * Structures for describing block-level video coding information, to be
> + * attached to an AVFrame as side data.
> + *
> + * All pointer-like members in these structures are offsets relative to the
> + * start of the AVVideoCodingInfo struct to ensure the side data is
> + * self-contained and relocatable. This is critical as the underlying buffer
> + * may be moved in memory.
> + */
> +

> +/**
> + * Structure to hold inter-prediction information for a block.
> + */
> +typedef struct AVBlockInterInfo {
> +    /**
> +     * Offsets to motion vectors for list 0 and list 1, relative to the
> +     * start of the AVVideoCodingInfo struct.
> +     * The data for each list is an array of [x, y] pairs of int16_t.
> +     * The number of vectors is given by num_mv.
> +     * An offset of 0 indicates this data is not present.
> +     */
> +    size_t mv_offset[2];

int16 is not enough, with growing picture sizes and growing precision of
motion vectors.

Also, the MV precision is needed somewhere, somehow, or the vectors could
not be visualized by generic code.
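
A minimal sketch of the two points above, under stated assumptions: SketchMV and both helper functions are hypothetical and not part of the posted patch. Storing the precision next to 32-bit components lets generic code convert a vector to pixels without codec knowledge. As far as I can tell, AVMotionVector in libavutil/motion_vector.h already takes a similar approach with int32_t motion_x/motion_y and a motion_scale field. The first helper shows the arithmetic behind the int16 concern: at 1/16-pel precision an int16_t component reaches only 2047 whole pixels, less than the width of a 3840-pixel frame.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, not part of the posted patch. */
typedef struct SketchMV {
    int32_t  x, y;   /* displacement in units of 1/scale pixel */
    uint16_t scale;  /* 4 for quarter-pel (as in H.264), 16 for 1/16-pel */
} SketchMV;

/* Largest displacement, in whole pixels, that an int16_t component can hold. */
static int sketch_int16_reach_pixels(uint16_t scale)
{
    assert(scale > 0);
    return INT16_MAX / scale;
}

/* Convert a vector to pixel units so generic code can draw it. */
static void sketch_mv_to_pixels(const SketchMV *mv, double *px, double *py)
{
    assert(mv->scale > 0);
    *px = (double)mv->x / mv->scale;
    *py = (double)mv->y / mv->scale;
}
```

Whether the scale belongs per vector, per block, or once per frame is a design choice this sketch does not settle; a per-frame field would be the most compact if a codec never mixes precisions.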


> +
> +    /**
> +     * Offsets to reference indices for list 0 and list 1, relative to the
> +     * start of the AVVideoCodingInfo struct.
> +     * The data is an array of int8_t. A value of -1 indicates the reference
> +     * is not used for a specific partition.
> +     * An offset of 0 indicates this data is not present.
> +     */
> +    size_t ref_idx_offset[2];
> +    /**
> +     * Number of motion vectors for list 0 and list 1.
> +     */
> +    uint8_t num_mv[2];
> +} AVBlockInterInfo;

Weighted bi-prediction needs the weights too.

And for more than 1 MV, the question becomes what the other vectors
are: bipred? affine MC? ...

Also, if you want to be really generic, you need to allow blocks
that don't span across the luma and chroma planes, i.e. allow
different block structures (and motion vectors) per plane.

I am not sure how generic we want to be and how useful that is.

But it seems you want your patch to be quite generic?

I think it's more important to allow this to be extensible
than to support everything we can think of.

That is, maybe also store the size of the struct somewhere so
that elements can be added to the end without breaking
anything. At least for the main block structure.

I mean, a future codec might allow non-rectangular blocks, but we
don't want to think about that today.

Maybe it's best to keep this as simple as possible but extensible.
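
The "store the size of the struct" idea can be sketched as follows. All names here (SketchBlock, SketchCodingInfo, sketch_alloc, sketch_get_block) are hypothetical; the layout loosely follows the blocks_offset/block_size pattern that, to my knowledge, AVVideoEncParams in libavutil/video_enc_params.h uses. The producer records the per-block stride it was compiled with, and consumers step through the array with that stride instead of sizeof(), so fields appended later do not break older readers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical block struct; new fields may only be appended at the end. */
typedef struct SketchBlock {
    int32_t x, y, w, h;
} SketchBlock;

/* Hypothetical top-level header stored at the start of the side data buffer. */
typedef struct SketchCodingInfo {
    uint32_t nb_blocks;
    size_t   blocks_offset; /* bytes from the start of this struct */
    size_t   block_size;    /* sizeof(SketchBlock) as seen by the producer */
} SketchCodingInfo;

/* Allocate header plus block array in one buffer (no overflow checks: sketch only). */
static SketchCodingInfo *sketch_alloc(uint32_t nb_blocks, size_t block_size)
{
    /* Round the header size up so the block array stays suitably aligned. */
    size_t offset = (sizeof(SketchCodingInfo) + 15) & ~(size_t)15;
    SketchCodingInfo *info;

    assert(block_size >= sizeof(SketchBlock));
    info = calloc(1, offset + nb_blocks * block_size);
    if (!info)
        return NULL;
    info->nb_blocks     = nb_blocks;
    info->blocks_offset = offset;
    info->block_size    = block_size;
    return info;
}

/* Index by the stored stride, never by sizeof(SketchBlock). */
static const SketchBlock *sketch_get_block(const SketchCodingInfo *info, uint32_t idx)
{
    assert(idx < info->nb_blocks);
    return (const SketchBlock *)((const uint8_t *)info + info->blocks_offset +
                                 (size_t)idx * info->block_size);
}
```

A reader built against the smaller struct still finds block 1 correctly when a newer producer passes a larger block_size, because only the stored stride is used for addressing.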


> +
> +/**
> + * Structure to hold intra-prediction information for a block.
> + */
> +typedef struct AVBlockIntraInfo {
> +    /**
> +     * Offset to an array of intra prediction modes, relative to the
> +     * start of the AVVideoCodingInfo struct.
> +     * The number of modes is given by num_pred_modes.
> +     */
> +    size_t pred_mode_offset;
> +
> +    /**
> +     * Number of intra prediction modes.
> +     */
> +    uint8_t num_pred_modes;
> +
> +    /**
> +     * Chroma intra prediction mode.
> +     */
> +    uint8_t chroma_pred_mode;
> +} AVBlockIntraInfo;

Classifying the prediction as directional, DC, or non-directional, and
for directional modes giving the direction, could be useful.

Otherwise the prediction mode number requires codec-specific knowledge
to interpret.
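
One way such a classification could look, as a rough sketch: the enum and both mapping functions are hypothetical. The mode numbers are the ones the H.264 specification assigns to Intra4x4PredMode (0 vertical, 1 horizontal, 2 DC, 3 to 8 diagonal variants) and Intra16x16PredMode (0 vertical, 1 horizontal, 2 DC, 3 plane). They are not necessarily the values of FFmpeg's internal *_PRED constants, so an exporter would have to translate before calling anything like this.

```c
#include <assert.h>

/* Hypothetical codec-independent classes, not part of the posted patch. */
typedef enum SketchIntraClass {
    SKETCH_INTRA_DC,
    SKETCH_INTRA_DIRECTIONAL,
    SKETCH_INTRA_NON_DIRECTIONAL,
    SKETCH_INTRA_UNKNOWN
} SketchIntraClass;

/* Classify an H.264 Intra4x4PredMode value (specification numbering). */
static SketchIntraClass sketch_h264_intra4x4_class(int mode)
{
    if (mode == 2)
        return SKETCH_INTRA_DC;
    if (mode >= 0 && mode <= 8)
        return SKETCH_INTRA_DIRECTIONAL; /* vertical, horizontal, diagonals */
    return SKETCH_INTRA_UNKNOWN;         /* e.g. -1 for unavailable blocks */
}

/* Classify an H.264 Intra16x16PredMode value (specification numbering). */
static SketchIntraClass sketch_h264_intra16x16_class(int mode)
{
    switch (mode) {
    case 0:
    case 1:  return SKETCH_INTRA_DIRECTIONAL;     /* vertical, horizontal */
    case 2:  return SKETCH_INTRA_DC;
    case 3:  return SKETCH_INTRA_NON_DIRECTIONAL; /* plane */
    default: return SKETCH_INTRA_UNKNOWN;
    }
}

/* Short label for logging. */
static const char *sketch_intra_class_name(SketchIntraClass c)
{
    assert(c >= SKETCH_INTRA_DC && c <= SKETCH_INTRA_UNKNOWN);
    switch (c) {
    case SKETCH_INTRA_DC:              return "DC";
    case SKETCH_INTRA_DIRECTIONAL:     return "directional";
    case SKETCH_INTRA_NON_DIRECTIONAL: return "non-directional";
    default:                           return "unknown";
    }
}
```

Exporting a direction as well would need an angle convention shared across codecs, which this sketch leaves open.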

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

While the State exists there can be no freedom; when there is freedom there
will be no State. -- Vladimir Lenin


* Re: [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info
  2025-07-18 10:30 [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Timothée Regaud
                   ` (3 preceding siblings ...)
  2025-07-18 15:48 ` [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Michael Niedermayer
@ 2025-07-18 17:42 ` Lynne
  4 siblings, 0 replies; 7+ messages in thread
From: Lynne @ 2025-07-18 17:42 UTC (permalink / raw)
  To: ffmpeg-devel

On 18/07/2025 19:30, Timothée Regaud wrote:
> From: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
> 
> Adds the generic data structures to libavutil. The design is recursive to support other codecs, even though the implementation is only for H.264 for now.
> 
> Signed-off-by: Timothee Regaud <timothee.informatique@regaud-chapuy.fr>
> ---
>   libavutil/Makefile            |   1 +
>   libavutil/frame.h             |   7 ++
>   libavutil/side_data.c         |   1 +
>   libavutil/video_coding_info.h | 163 ++++++++++++++++++++++++++++++++++
>   4 files changed, 172 insertions(+)
>   create mode 100644 libavutil/video_coding_info.h
> 
> diff --git a/libavutil/Makefile b/libavutil/Makefile
> index 94a56bb72f..44e51ab7ae 100644
> --- a/libavutil/Makefile
> +++ b/libavutil/Makefile
> @@ -93,6 +93,7 @@ HEADERS = adler32.h                                                     \
>             tree.h                                                        \
>             twofish.h                                                     \
>             uuid.h                                                        \
> +          video_coding_info.h                                           \
>             version.h                                                     \
>             video_enc_params.h                                            \
>             xtea.h                                                        \
> diff --git a/libavutil/frame.h b/libavutil/frame.h
> index c50cd263d9..f4404472a0 100644
> --- a/libavutil/frame.h
> +++ b/libavutil/frame.h
> @@ -254,6 +254,13 @@ enum AVFrameSideDataType {
>        * libavutil/tdrdi.h.
>        */
>       AV_FRAME_DATA_3D_REFERENCE_DISPLAYS,
> +
> +    /**
> +     * Detailed block-level coding information. The data is an AVVideoCodingInfo
> +     * structure. This is exported by video decoders and can be used by filters
> +     * for analysis and visualization.
> +     */
> +    AV_FRAME_DATA_VIDEO_CODING_INFO,
>   };
>   
>   enum AVActiveFormatDescription {
> diff --git a/libavutil/side_data.c b/libavutil/side_data.c
> index fa2a2c2a13..b938ef6f52 100644
> --- a/libavutil/side_data.c
> +++ b/libavutil/side_data.c
> @@ -56,6 +56,7 @@ static const AVSideDataDescriptor sd_props[] = {
>       [AV_FRAME_DATA_SEI_UNREGISTERED]            = { "H.26[45] User Data Unregistered SEI message",  AV_SIDE_DATA_PROP_MULTI },
>       [AV_FRAME_DATA_VIDEO_HINT]                  = { "Encoding video hint",                          AV_SIDE_DATA_PROP_SIZE_DEPENDENT },
>       [AV_FRAME_DATA_3D_REFERENCE_DISPLAYS]       = { "3D Reference Displays Information",            AV_SIDE_DATA_PROP_GLOBAL },
> +    [AV_FRAME_DATA_VIDEO_CODING_INFO]           = { "Video Coding Info",                            AV_SIDE_DATA_PROP_SIZE_DEPENDENT },
>   };
>   
>   const AVSideDataDescriptor *av_frame_side_data_desc(enum AVFrameSideDataType type)
> diff --git a/libavutil/video_coding_info.h b/libavutil/video_coding_info.h
> new file mode 100644
> index 0000000000..17e9345892
> --- /dev/null
> +++ b/libavutil/video_coding_info.h
> @@ -0,0 +1,163 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_VIDEO_CODING_INFO_H
> +#define AVUTIL_VIDEO_CODING_INFO_H
> +
> +#include <stdint.h>
> +#include <stddef.h>
> +
> +/**
> + * @file
> + * @ingroup lavu_frame
> + * Structures for describing block-level video coding information.
> + */
> +
> +/**
> + * @defgroup lavu_video_coding_info Video Coding Info
> + * @ingroup lavu_frame
> + *
> + * @{
> + * Structures for describing block-level video coding information, to be
> + * attached to an AVFrame as side data.
> + *
> + * All pointer-like members in these structures are offsets relative to the
> + * start of the AVVideoCodingInfo struct to ensure the side data is
> + * self-contained and relocatable. This is critical as the underlying buffer
> + * may be moved in memory.
> + */
> +
> +/**
> + * Structure to hold inter-prediction information for a block.
> + */
> +typedef struct AVBlockInterInfo {
> +    /**
> +     * Offsets to motion vectors for list 0 and list 1, relative to the
> +     * start of the AVVideoCodingInfo struct.
> +     * The data for each list is an array of [x, y] pairs of int16_t.
> +     * The number of vectors is given by num_mv.
> +     * An offset of 0 indicates this data is not present.
> +     */
> +    size_t mv_offset[2];
> +
> +    /**
> +     * Offsets to reference indices for list 0 and list 1, relative to the
> +     * start of the AVVideoCodingInfo struct.
> +     * The data is an array of int8_t. A value of -1 indicates the reference
> +     * is not used for a specific partition.
> +     * An offset of 0 indicates this data is not present.
> +     */
> +    size_t ref_idx_offset[2];
> +    /**
> +     * Number of motion vectors for list 0 and list 1.
> +     */
> +    uint8_t num_mv[2];
> +} AVBlockInterInfo;
> +
> +/**
> + * Structure to hold intra-prediction information for a block.
> + */
> +typedef struct AVBlockIntraInfo {
> +    /**
> +     * Offset to an array of intra prediction modes, relative to the
> +     * start of the AVVideoCodingInfo struct.
> +     * The number of modes is given by num_pred_modes.
> +     */
> +    size_t pred_mode_offset;
> +
> +    /**
> +     * Number of intra prediction modes.
> +     */
> +    uint8_t num_pred_modes;
> +
> +    /**
> +     * Chroma intra prediction mode.
> +     */
> +    uint8_t chroma_pred_mode;
> +} AVBlockIntraInfo;
> +
> +/**
> + * Main structure for a single coding block.
> + * This structure can be recursive for codecs that use tree-based partitioning.
> + */
> +typedef struct AVVideoCodingInfoBlock {
> +    /**
> +     * Position (x, y) and size (w, h) of the block, in pixels,
> +     * relative to the top-left corner of the frame.
> +     */
> +    int16_t x, y;
> +    uint8_t w, h;
> +
> +    /**
> +     * Flag indicating if the block is intra-coded.
> +     * 1 if intra, 0 if inter.
> +     */
> +    uint8_t is_intra;
> +
> +    /**
> +     * The original, codec-specific type of this block or macroblock.
> +     * This allows a filter to have codec-specific logic for interpreting
> +     * the generic prediction information based on the source codec.
> +     * For example, for H.264, this would store the MB type flags (MB_TYPE_*).
> +     */
> +    uint32_t codec_specific_type;
> +
> +    union {
> +        AVBlockIntraInfo intra;
> +        AVBlockInterInfo inter;
> +    };
> +
> +    /**
> +     * Number of child blocks this block is partitioned into.
> +     * If 0, this is a leaf node in the partition tree.
> +     */
> +    uint8_t num_children;
> +
> +    /**
> +     * Offset to an array of child AVVideoCodingInfoBlock structures, relative
> +     * to the start of the AVVideoCodingInfo struct.
> +     * This allows for recursive representation of coding structures.
> +     * An offset of 0 indicates there are no children.
> +     */
> +    size_t children_offset;
> +} AVVideoCodingInfoBlock;
> +
> +/**
> + * Top-level structure to be attached to an AVFrame as side data.
> + * It contains an array of the highest-level coding blocks (e.g., CTUs or MBs).
> + */
> +typedef struct AVVideoCodingInfo {
> +    /**
> +     * Number of top-level blocks in the frame.
> +     */
> +    uint32_t nb_blocks;
> +
> +    /**
> +     * Offset to an array of top-level blocks, relative to the start of the
> +     * AVVideoCodingInfo struct.
> +     * The actual data for these blocks, and any child blocks or sub-data,
> +     * is stored contiguously in the AVBufferRef attached to the side data.
> +     */
> +    size_t blocks_offset;
> +} AVVideoCodingInfo;
> +
> +/**
> + * @}
> + */
> +
> +#endif /* AVUTIL_VIDEO_CODING_INFO_H */

Absolutely not.
Use and extend libavutil/video_enc_params.h instead.
And if at all possible, don't implement an inspection tool in ffmpeg
*just because you want to*. Parsing a bitstream and displaying it is not
very complicated, but exposing an API very much is.

end of thread, other threads:[~2025-07-18 17:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-18 10:30 [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Timothée Regaud
2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 2/4] avcodec: add option to export " Timothée Regaud
2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 3/4] avcodec/h264dec: implement export of video coding info for H.264 Timothée Regaud
2025-07-18 14:17   ` Michael Niedermayer
2025-07-18 10:30 ` [FFmpeg-devel] [PATCH 4/4] vf_codecview: add support for AV_FRAME_DATA_VIDEO_CODING_INFO Timothée Regaud
2025-07-18 15:48 ` [FFmpeg-devel] [PATCH 1/4] avutil: add generic side data for video coding info Michael Niedermayer
2025-07-18 17:42 ` Lynne
