* [FFmpeg-devel] [PATCH v3 0/7] APV support
@ 2025-04-23 20:45 Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 1/7] lavc: APV codec ID and descriptor Mark Thompson
` (6 more replies)
0 siblings, 7 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
v3:
* Updated to match specification v4 (released a week ago). The main change is the bitstream signature, which is mandatory and helpfully makes probing a lot easier.
* Demuxer changed to use bytestream (thanks to Andreas for his comments).
* Improvements to AVX2 code (thanks to James for his comments).
* Decoder metadata support (not well tested; needs proper samples).
* Raw muxer added for easier testing (round-trip through cbs is the identity).
* Some other minor changes.
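
For readers who want to see why the signature helps with probing, here is a rough standalone sketch, not code from this series. It assumes the access unit layout that cbs_apv_split_fragment() in patch 2/7 parses: a 4-byte big-endian signature 'aPv1', followed by PBUs that are each prefixed with a 32-bit big-endian pbu_size of at least 8 bytes. The names read_be32(), apv_has_signature(), apv_count_pbus() and APV_SIGNATURE_SKETCH are hypothetical stand-ins for AV_RB32() and APV_SIGNATURE and do not exist in FFmpeg.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MKBETAG('a', 'P', 'v', '1'). */
#define APV_SIGNATURE_SKETCH 0x61507631u

/* Read a 32-bit big-endian value (what AV_RB32() does in FFmpeg). */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Return 1 if the buffer starts with the APV signature, 0 otherwise. */
static int apv_has_signature(const uint8_t *buf, size_t size)
{
    if (size < 4)
        return 0;
    return read_be32(buf) == APV_SIGNATURE_SKETCH;
}

/*
 * Walk the PBUs of one access unit and return how many there are,
 * or -1 if the signature or any pbu_size field is inconsistent.
 * The size checks mirror the ones in cbs_apv_split_fragment().
 */
static int apv_count_pbus(const uint8_t *buf, size_t size)
{
    int count = 0;

    if (!apv_has_signature(buf, size))
        return -1;
    buf  += 4;
    size -= 4;

    while (size > 0) {
        uint32_t pbu_size;

        if (size < 8)
            return -1;          /* too short for size field + header */
        pbu_size = read_be32(buf);
        if (pbu_size < 8)
            return -1;          /* pbu_size too small */
        buf  += 4;
        size -= 4;
        if (pbu_size > size)
            return -1;          /* PBU runs past the end of the buffer */
        buf  += pbu_size;
        size -= pbu_size;
        count++;
    }
    return count;
}
```

A prober built this way can reject most non-APV input after reading four bytes, and the PBU walk gives a cheap consistency check before any real parsing is attempted.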
Thanks,
- Mark
Mark Thompson (7):
lavc: APV codec ID and descriptor
lavc/cbs: APV support
lavf: APV demuxer
lavc: APV decoder
lavc/apv: AVX2 transquant for x86-64
lavc: APV metadata bitstream filter
lavf: APV muxer
configure | 2 +
libavcodec/Makefile | 2 +
libavcodec/allcodecs.c | 1 +
libavcodec/apv.h | 89 ++++
libavcodec/apv_decode.c | 403 ++++++++++++++++++
libavcodec/apv_decode.h | 80 ++++
libavcodec/apv_dsp.c | 140 +++++++
libavcodec/apv_dsp.h | 39 ++
libavcodec/apv_entropy.c | 200 +++++++++
libavcodec/bitstream_filters.c | 1 +
libavcodec/bsf/Makefile | 1 +
libavcodec/bsf/apv_metadata.c | 134 ++++++
libavcodec/cbs.c | 6 +
libavcodec/cbs_apv.c | 408 ++++++++++++++++++
libavcodec/cbs_apv.h | 207 ++++++++++
libavcodec/cbs_apv_syntax_template.c | 596 +++++++++++++++++++++++++++
libavcodec/cbs_internal.h | 4 +
libavcodec/codec_desc.c | 7 +
libavcodec/codec_id.h | 1 +
libavcodec/x86/Makefile | 2 +
libavcodec/x86/apv_dsp.asm | 311 ++++++++++++++
libavcodec/x86/apv_dsp_init.c | 44 ++
libavformat/Makefile | 2 +
libavformat/allformats.c | 2 +
libavformat/apvdec.c | 241 +++++++++++
libavformat/apvenc.c | 40 ++
libavformat/cbs.h | 1 +
tests/checkasm/Makefile | 1 +
tests/checkasm/apv_dsp.c | 109 +++++
tests/checkasm/checkasm.c | 3 +
tests/checkasm/checkasm.h | 1 +
tests/fate/checkasm.mak | 1 +
32 files changed, 3079 insertions(+)
create mode 100644 libavcodec/apv.h
create mode 100644 libavcodec/apv_decode.c
create mode 100644 libavcodec/apv_decode.h
create mode 100644 libavcodec/apv_dsp.c
create mode 100644 libavcodec/apv_dsp.h
create mode 100644 libavcodec/apv_entropy.c
create mode 100644 libavcodec/bsf/apv_metadata.c
create mode 100644 libavcodec/cbs_apv.c
create mode 100644 libavcodec/cbs_apv.h
create mode 100644 libavcodec/cbs_apv_syntax_template.c
create mode 100644 libavcodec/x86/apv_dsp.asm
create mode 100644 libavcodec/x86/apv_dsp_init.c
create mode 100644 libavformat/apvdec.c
create mode 100644 libavformat/apvenc.c
create mode 100644 tests/checkasm/apv_dsp.c
--
2.47.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH v3 1/7] lavc: APV codec ID and descriptor
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support Mark Thompson
` (5 subsequent siblings)
6 siblings, 0 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
---
libavcodec/codec_desc.c | 7 +++++++
libavcodec/codec_id.h | 1 +
2 files changed, 8 insertions(+)
diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index 9fb190e35a..88fed478a3 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -1985,6 +1985,13 @@ static const AVCodecDescriptor codec_descriptors[] = {
.props = AV_CODEC_PROP_LOSSY | AV_CODEC_PROP_LOSSLESS,
.mime_types= MT("image/jxl"),
},
+ {
+ .id = AV_CODEC_ID_APV,
+ .type = AVMEDIA_TYPE_VIDEO,
+ .name = "apv",
+ .long_name = NULL_IF_CONFIG_SMALL("Advanced Professional Video"),
+ .props = AV_CODEC_PROP_INTRA_ONLY | AV_CODEC_PROP_LOSSY,
+ },
/* various PCM "codecs" */
{
diff --git a/libavcodec/codec_id.h b/libavcodec/codec_id.h
index 2f6efe8261..be0a65bcb9 100644
--- a/libavcodec/codec_id.h
+++ b/libavcodec/codec_id.h
@@ -329,6 +329,7 @@ enum AVCodecID {
AV_CODEC_ID_DNXUC,
AV_CODEC_ID_RV60,
AV_CODEC_ID_JPEGXL_ANIM,
+ AV_CODEC_ID_APV,
/* various PCM "codecs" */
AV_CODEC_ID_FIRST_AUDIO = 0x10000, ///< A dummy id pointing at the start of audio codecs
--
2.47.2
* [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 1/7] lavc: APV codec ID and descriptor Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
2025-04-24 0:02 ` James Almer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer Mark Thompson
` (4 subsequent siblings)
6 siblings, 1 reply; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
---
configure | 1 +
libavcodec/Makefile | 1 +
libavcodec/apv.h | 89 ++++
libavcodec/cbs.c | 6 +
libavcodec/cbs_apv.c | 408 ++++++++++++++++++
libavcodec/cbs_apv.h | 207 ++++++++++
libavcodec/cbs_apv_syntax_template.c | 596 +++++++++++++++++++++++++++
libavcodec/cbs_internal.h | 4 +
libavformat/cbs.h | 1 +
9 files changed, 1313 insertions(+)
create mode 100644 libavcodec/apv.h
create mode 100644 libavcodec/cbs_apv.c
create mode 100644 libavcodec/cbs_apv.h
create mode 100644 libavcodec/cbs_apv_syntax_template.c
diff --git a/configure b/configure
index c94b8eac43..ca404d2797 100755
--- a/configure
+++ b/configure
@@ -2562,6 +2562,7 @@ CONFIG_EXTRA="
bswapdsp
cabac
cbs
+ cbs_apv
cbs_av1
cbs_h264
cbs_h265
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 7bd1dbec9a..a5f5c4e904 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -83,6 +83,7 @@ OBJS-$(CONFIG_BLOCKDSP) += blockdsp.o
OBJS-$(CONFIG_BSWAPDSP) += bswapdsp.o
OBJS-$(CONFIG_CABAC) += cabac.o
OBJS-$(CONFIG_CBS) += cbs.o cbs_bsf.o
+OBJS-$(CONFIG_CBS_APV) += cbs_apv.o
OBJS-$(CONFIG_CBS_AV1) += cbs_av1.o
OBJS-$(CONFIG_CBS_H264) += cbs_h2645.o cbs_sei.o h2645_parse.o
OBJS-$(CONFIG_CBS_H265) += cbs_h2645.o cbs_sei.o h2645_parse.o
diff --git a/libavcodec/apv.h b/libavcodec/apv.h
new file mode 100644
index 0000000000..14ca27bf31
--- /dev/null
+++ b/libavcodec/apv.h
@@ -0,0 +1,89 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_APV_H
+#define AVCODEC_APV_H
+
+// Signature value in APV bitstreams (section 5.3.1).
+#define APV_SIGNATURE MKBETAG('a', 'P', 'v', '1')
+
+// PBU types (section 5.3.3).
+enum {
+ APV_PBU_PRIMARY_FRAME = 1,
+ APV_PBU_NON_PRIMARY_FRAME = 2,
+ APV_PBU_PREVIEW_FRAME = 25,
+ APV_PBU_DEPTH_FRAME = 26,
+ APV_PBU_ALPHA_FRAME = 27,
+ APV_PBU_ACCESS_UNIT_INFORMATION = 65,
+ APV_PBU_METADATA = 66,
+ APV_PBU_FILLER = 67,
+};
+
+// Format parameters (section 4.2).
+enum {
+ APV_MAX_NUM_COMP = 4,
+ APV_MB_WIDTH = 16,
+ APV_MB_HEIGHT = 16,
+ APV_TR_SIZE = 8,
+};
+
+// Chroma formats (section 4.2).
+enum {
+ APV_CHROMA_FORMAT_400 = 0,
+ APV_CHROMA_FORMAT_422 = 2,
+ APV_CHROMA_FORMAT_444 = 3,
+ APV_CHROMA_FORMAT_4444 = 4,
+};
+
+// Coefficient limits (section 5.3.15).
+enum {
+ APV_BLK_COEFFS = (APV_TR_SIZE * APV_TR_SIZE),
+ APV_MIN_TRANS_COEFF = -32768,
+ APV_MAX_TRANS_COEFF = 32767,
+};
+
+// Profiles (section 10.1.3).
+enum {
+ APV_PROFILE_422_10 = 33,
+ APV_PROFILE_422_12 = 44,
+ APV_PROFILE_444_10 = 55,
+ APV_PROFILE_444_12 = 66,
+ APV_PROFILE_4444_10 = 77,
+ APV_PROFILE_4444_12 = 88,
+ APV_PROFILE_400_10 = 99,
+};
+
+// General level limits for tiles (section 10.1.4.1).
+enum {
+ APV_MIN_TILE_WIDTH_IN_MBS = 16,
+ APV_MIN_TILE_HEIGHT_IN_MBS = 8,
+ APV_MAX_TILE_COLS = 20,
+ APV_MAX_TILE_ROWS = 20,
+ APV_MAX_TILE_COUNT = APV_MAX_TILE_COLS * APV_MAX_TILE_ROWS,
+};
+
+// Metadata types (section 10.3.1).
+enum {
+ APV_METADATA_ITU_T_T35 = 4,
+ APV_METADATA_MDCV = 5,
+ APV_METADATA_CLL = 6,
+ APV_METADATA_FILLER = 10,
+ APV_METADATA_USER_DEFINED = 170,
+};
+
+#endif /* AVCODEC_APV_H */
diff --git a/libavcodec/cbs.c b/libavcodec/cbs.c
index ba1034a72e..9b485420d5 100644
--- a/libavcodec/cbs.c
+++ b/libavcodec/cbs.c
@@ -31,6 +31,9 @@
static const CodedBitstreamType *const cbs_type_table[] = {
+#if CBS_APV
+ &CBS_FUNC(type_apv),
+#endif
#if CBS_AV1
&CBS_FUNC(type_av1),
#endif
@@ -58,6 +61,9 @@ static const CodedBitstreamType *const cbs_type_table[] = {
};
const enum AVCodecID CBS_FUNC(all_codec_ids)[] = {
+#if CBS_APV
+ AV_CODEC_ID_APV,
+#endif
#if CBS_AV1
AV_CODEC_ID_AV1,
#endif
diff --git a/libavcodec/cbs_apv.c b/libavcodec/cbs_apv.c
new file mode 100644
index 0000000000..baf07ced07
--- /dev/null
+++ b/libavcodec/cbs_apv.c
@@ -0,0 +1,408 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/mem.h"
+#include "cbs.h"
+#include "cbs_internal.h"
+#include "cbs_apv.h"
+
+
+static int cbs_apv_get_num_comp(const APVRawFrameHeader *fh)
+{
+ switch (fh->frame_info.chroma_format_idc) {
+ case APV_CHROMA_FORMAT_400:
+ return 1;
+ case APV_CHROMA_FORMAT_422:
+ case APV_CHROMA_FORMAT_444:
+ return 3;
+ case APV_CHROMA_FORMAT_4444:
+ return 4;
+ default:
+ av_assert0(0 && "Invalid chroma_format_idc");
+ }
+}
+
+static void cbs_apv_derive_tile_info(APVDerivedTileInfo *ti,
+ const APVRawFrameHeader *fh)
+{
+ int frame_width_in_mbs = (fh->frame_info.frame_width + 15) / 16;
+ int frame_height_in_mbs = (fh->frame_info.frame_height + 15) / 16;
+ int start_mb, i;
+
+ start_mb = 0;
+ for (i = 0; start_mb < frame_width_in_mbs; i++) {
+ ti->col_starts[i] = start_mb * APV_MB_WIDTH;
+ start_mb += fh->tile_info.tile_width_in_mbs;
+ }
+ av_assert0(i <= APV_MAX_TILE_COLS);
+ ti->col_starts[i] = frame_width_in_mbs * APV_MB_WIDTH;
+ ti->tile_cols = i;
+
+ start_mb = 0;
+ for (i = 0; start_mb < frame_height_in_mbs; i++) {
+ av_assert0(i < APV_MAX_TILE_ROWS);
+ ti->row_starts[i] = start_mb * APV_MB_HEIGHT;
+ start_mb += fh->tile_info.tile_height_in_mbs;
+ }
+ av_assert0(i <= APV_MAX_TILE_ROWS);
+ ti->row_starts[i] = frame_height_in_mbs * APV_MB_HEIGHT;
+ ti->tile_rows = i;
+
+ ti->num_tiles = ti->tile_cols * ti->tile_rows;
+}
+
+
+#define HEADER(name) do { \
+ ff_cbs_trace_header(ctx, name); \
+ } while (0)
+
+#define CHECK(call) do { \
+ err = (call); \
+ if (err < 0) \
+ return err; \
+ } while (0)
+
+#define SUBSCRIPTS(subs, ...) (subs > 0 ? ((int[subs + 1]){ subs, __VA_ARGS__ }) : NULL)
+
+
+#define u(width, name, range_min, range_max) \
+ xu(width, name, current->name, range_min, range_max, 0, )
+#define ub(width, name) \
+ xu(width, name, current->name, 0, MAX_UINT_BITS(width), 0, )
+#define us(width, name, range_min, range_max, subs, ...) \
+ xu(width, name, current->name, range_min, range_max, subs, __VA_ARGS__)
+#define ubs(width, name, subs, ...) \
+ xu(width, name, current->name, 0, MAX_UINT_BITS(width), subs, __VA_ARGS__)
+
+#define fixed(width, name, value) do { \
+ av_unused uint32_t fixed_value = value; \
+ xu(width, name, fixed_value, value, value, 0, ); \
+ } while (0)
+
+
+#define READ
+#define READWRITE read
+#define RWContext GetBitContext
+#define FUNC(name) cbs_apv_read_ ## name
+
+#define xu(width, name, var, range_min, range_max, subs, ...) do { \
+ uint32_t value; \
+ CHECK(ff_cbs_read_unsigned(ctx, rw, width, #name, \
+ SUBSCRIPTS(subs, __VA_ARGS__), \
+ &value, range_min, range_max)); \
+ var = value; \
+ } while (0)
+
+#define infer(name, value) do { \
+ current->name = value; \
+ } while (0)
+
+#define byte_alignment(rw) (get_bits_count(rw) % 8)
+
+#include "cbs_apv_syntax_template.c"
+
+#undef READ
+#undef READWRITE
+#undef RWContext
+#undef FUNC
+#undef xu
+#undef infer
+#undef byte_alignment
+
+#define WRITE
+#define READWRITE write
+#define RWContext PutBitContext
+#define FUNC(name) cbs_apv_write_ ## name
+
+#define xu(width, name, var, range_min, range_max, subs, ...) do { \
+ uint32_t value = var; \
+ CHECK(ff_cbs_write_unsigned(ctx, rw, width, #name, \
+ SUBSCRIPTS(subs, __VA_ARGS__), \
+ value, range_min, range_max)); \
+ } while (0)
+
+#define infer(name, value) do { \
+ if (current->name != (value)) { \
+ av_log(ctx->log_ctx, AV_LOG_ERROR, \
+ "%s does not match inferred value: " \
+ "%"PRId64", but should be %"PRId64".\n", \
+ #name, (int64_t)current->name, (int64_t)(value)); \
+ return AVERROR_INVALIDDATA; \
+ } \
+ } while (0)
+
+#define byte_alignment(rw) (put_bits_count(rw) % 8)
+
+#include "cbs_apv_syntax_template.c"
+
+#undef WRITE
+#undef READWRITE
+#undef RWContext
+#undef FUNC
+#undef xu
+#undef infer
+#undef byte_alignment
+
+
+static int cbs_apv_split_fragment(CodedBitstreamContext *ctx,
+ CodedBitstreamFragment *frag,
+ int header)
+{
+ uint8_t *data = frag->data;
+ size_t size = frag->data_size;
+ uint32_t signature;
+ int err, trace;
+
+ // Don't include parsing here in trace output.
+ trace = ctx->trace_enable;
+ ctx->trace_enable = 0;
+
+ signature = AV_RB32(data);
+ if (signature != APV_SIGNATURE) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR,
+ "Invalid APV access unit: bad signature %08x.\n",
+ signature);
+ err = AVERROR_INVALIDDATA;
+ goto fail;
+ }
+ data += 4;
+ size -= 4;
+
+ while (size > 0) {
+ GetBitContext gbc;
+ uint32_t pbu_size;
+ APVRawPBUHeader pbu_header;
+
+ if (size < 8) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Invalid PBU: "
+ "fragment too short (%"SIZE_SPECIFIER" bytes).\n",
+ size);
+ err = AVERROR_INVALIDDATA;
+ goto fail;
+ }
+
+ pbu_size = AV_RB32(data);
+ if (pbu_size < 8) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Invalid PBU: "
+ "pbu_size too small (%"PRIu32" bytes).\n",
+ pbu_size);
+ err = AVERROR_INVALIDDATA;
+ goto fail;
+ }
+
+ data += 4;
+ size -= 4;
+
+ if (pbu_size > size) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Invalid PBU: "
+ "pbu_size too large (%"PRIu32" bytes).\n",
+ pbu_size);
+ err = AVERROR_INVALIDDATA;
+ goto fail;
+ }
+
+ init_get_bits(&gbc, data, 8 * pbu_size);
+
+ err = cbs_apv_read_pbu_header(ctx, &gbc, &pbu_header);
+ if (err < 0)
+ goto fail;
+
+ // Could select/skip frames based on type/group_id here.
+
+ err = ff_cbs_append_unit_data(frag, pbu_header.pbu_type,
+ data, pbu_size, frag->data_ref);
+ if (err < 0)
+ goto fail;
+
+ data += pbu_size;
+ size -= pbu_size;
+ }
+
+ err = 0;
+fail:
+ ctx->trace_enable = trace;
+ return err;
+}
+
+static int cbs_apv_read_unit(CodedBitstreamContext *ctx,
+ CodedBitstreamUnit *unit)
+{
+ GetBitContext gbc;
+ int err;
+
+ err = init_get_bits(&gbc, unit->data, 8 * unit->data_size);
+ if (err < 0)
+ return err;
+
+ err = ff_cbs_alloc_unit_content(ctx, unit);
+ if (err < 0)
+ return err;
+
+ switch (unit->type) {
+ case APV_PBU_PRIMARY_FRAME:
+ {
+ APVRawFrame *frame = unit->content;
+
+ err = cbs_apv_read_frame(ctx, &gbc, frame);
+ if (err < 0)
+ return err;
+
+ // Each tile inside the frame has pointers into the unit
+ // data buffer; make a single reference here for all of
+ // them together.
+ frame->tile_data_ref = av_buffer_ref(unit->data_ref);
+ if (!frame->tile_data_ref)
+ return AVERROR(ENOMEM);
+ }
+ break;
+ case APV_PBU_ACCESS_UNIT_INFORMATION:
+ {
+ err = cbs_apv_read_au_info(ctx, &gbc, unit->content);
+ if (err < 0)
+ return err;
+ }
+ break;
+ case APV_PBU_METADATA:
+ {
+ err = cbs_apv_read_metadata(ctx, &gbc, unit->content);
+ if (err < 0)
+ return err;
+ }
+ break;
+ case APV_PBU_FILLER:
+ {
+ err = cbs_apv_read_filler(ctx, &gbc, unit->content);
+ if (err < 0)
+ return err;
+ }
+ break;
+ default:
+ return AVERROR(ENOSYS);
+ }
+
+ return 0;
+}
+
+static int cbs_apv_write_unit(CodedBitstreamContext *ctx,
+ CodedBitstreamUnit *unit,
+ PutBitContext *pbc)
+{
+ int err;
+
+ switch (unit->type) {
+ case APV_PBU_PRIMARY_FRAME:
+ {
+ APVRawFrame *frame = unit->content;
+
+ err = cbs_apv_write_frame(ctx, pbc, frame);
+ if (err < 0)
+ return err;
+ }
+ break;
+ case APV_PBU_ACCESS_UNIT_INFORMATION:
+ {
+ err = cbs_apv_write_au_info(ctx, pbc, unit->content);
+ if (err < 0)
+ return err;
+ }
+ break;
+ case APV_PBU_METADATA:
+ {
+ err = cbs_apv_write_metadata(ctx, pbc, unit->content);
+ if (err < 0)
+ return err;
+ }
+ break;
+ case APV_PBU_FILLER:
+ {
+ err = cbs_apv_write_filler(ctx, pbc, unit->content);
+ if (err < 0)
+ return err;
+ }
+ break;
+ default:
+ return AVERROR(ENOSYS);
+ }
+
+ return 0;
+}
+
+static int cbs_apv_assemble_fragment(CodedBitstreamContext *ctx,
+ CodedBitstreamFragment *frag)
+{
+ size_t size = 4, pos;
+
+ for (int i = 0; i < frag->nb_units; i++)
+ size += frag->units[i].data_size + 4;
+
+ frag->data_ref = av_buffer_alloc(size + AV_INPUT_BUFFER_PADDING_SIZE);
+ if (!frag->data_ref)
+ return AVERROR(ENOMEM);
+ frag->data = frag->data_ref->data;
+ memset(frag->data + size, 0, AV_INPUT_BUFFER_PADDING_SIZE);
+
+ AV_WB32(frag->data, APV_SIGNATURE);
+ pos = 4;
+ for (int i = 0; i < frag->nb_units; i++) {
+ AV_WB32(frag->data + pos, frag->units[i].data_size);
+ pos += 4;
+
+ memcpy(frag->data + pos, frag->units[i].data,
+ frag->units[i].data_size);
+ pos += frag->units[i].data_size;
+ }
+ av_assert0(pos == size);
+ frag->data_size = size;
+
+ return 0;
+}
+
+
+static const CodedBitstreamUnitTypeDescriptor cbs_apv_unit_types[] = {
+ {
+ .nb_unit_types = CBS_UNIT_TYPE_RANGE,
+ .unit_type.range = {
+ .start = APV_PBU_PRIMARY_FRAME,
+ .end = APV_PBU_ALPHA_FRAME,
+ },
+ .content_type = CBS_CONTENT_TYPE_INTERNAL_REFS,
+ .content_size = sizeof(APVRawFrame),
+ .type.ref = {
+ .nb_offsets = 1,
+ .offsets = { offsetof(APVRawFrame, tile_data_ref) -
+ sizeof(void*) },
+ },
+ },
+
+ CBS_UNIT_TYPE_POD(APV_PBU_METADATA, APVRawMetadata),
+
+ CBS_UNIT_TYPE_END_OF_LIST
+};
+
+const CodedBitstreamType ff_cbs_type_apv = {
+ .codec_id = AV_CODEC_ID_APV,
+
+ .priv_data_size = sizeof(CodedBitstreamAPVContext),
+
+ .unit_types = cbs_apv_unit_types,
+
+ .split_fragment = &cbs_apv_split_fragment,
+ .read_unit = &cbs_apv_read_unit,
+ .write_unit = &cbs_apv_write_unit,
+ .assemble_fragment = &cbs_apv_assemble_fragment,
+};
diff --git a/libavcodec/cbs_apv.h b/libavcodec/cbs_apv.h
new file mode 100644
index 0000000000..cbaeb45acb
--- /dev/null
+++ b/libavcodec/cbs_apv.h
@@ -0,0 +1,207 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_CBS_APV_H
+#define AVCODEC_CBS_APV_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include "libavutil/buffer.h"
+#include "apv.h"
+
+// Arbitrary limits to avoid large structures.
+#define CBS_APV_MAX_AU_FRAMES 8
+#define CBS_APV_MAX_METADATA_PAYLOADS 8
+
+
+typedef struct APVRawPBUHeader {
+ uint8_t pbu_type;
+ uint16_t group_id;
+ uint8_t reserved_zero_8bits;
+} APVRawPBUHeader;
+
+typedef struct APVRawFiller {
+ size_t filler_size;
+} APVRawFiller;
+
+typedef struct APVRawFrameInfo {
+ uint8_t profile_idc;
+ uint8_t level_idc;
+ uint8_t band_idc;
+ uint8_t reserved_zero_5bits;
+ uint32_t frame_width;
+ uint32_t frame_height;
+ uint8_t chroma_format_idc;
+ uint8_t bit_depth_minus8;
+ uint8_t capture_time_distance;
+ uint8_t reserved_zero_8bits;
+} APVRawFrameInfo;
+
+typedef struct APVRawQuantizationMatrix {
+ uint8_t q_matrix[APV_MAX_NUM_COMP][APV_TR_SIZE][APV_TR_SIZE];
+} APVRawQuantizationMatrix;
+
+typedef struct APVRawTileInfo {
+ uint32_t tile_width_in_mbs;
+ uint32_t tile_height_in_mbs;
+ uint8_t tile_size_present_in_fh_flag;
+ uint32_t tile_size_in_fh[APV_MAX_TILE_COUNT];
+} APVRawTileInfo;
+
+typedef struct APVRawFrameHeader {
+ APVRawFrameInfo frame_info;
+ uint8_t reserved_zero_8bits;
+
+ uint8_t color_description_present_flag;
+ uint8_t color_primaries;
+ uint8_t transfer_characteristics;
+ uint8_t matrix_coefficients;
+ uint8_t full_range_flag;
+
+ uint8_t use_q_matrix;
+ APVRawQuantizationMatrix quantization_matrix;
+
+ APVRawTileInfo tile_info;
+
+ uint8_t reserved_zero_8bits_2;
+} APVRawFrameHeader;
+
+typedef struct APVRawTileHeader {
+ uint16_t tile_header_size;
+ uint16_t tile_index;
+ uint32_t tile_data_size[APV_MAX_NUM_COMP];
+ uint8_t tile_qp [APV_MAX_NUM_COMP];
+ uint8_t reserved_zero_8bits;
+} APVRawTileHeader;
+
+typedef struct APVRawTile {
+ APVRawTileHeader tile_header;
+
+ uint8_t *tile_data[APV_MAX_NUM_COMP];
+ uint8_t *tile_dummy_byte;
+ uint32_t tile_dummy_byte_size;
+} APVRawTile;
+
+typedef struct APVRawFrame {
+ APVRawPBUHeader pbu_header;
+ APVRawFrameHeader frame_header;
+ uint32_t tile_size[APV_MAX_TILE_COUNT];
+ APVRawTile tile [APV_MAX_TILE_COUNT];
+ APVRawFiller filler;
+
+ AVBufferRef *tile_data_ref;
+} APVRawFrame;
+
+typedef struct APVRawAUInfo {
+ uint16_t num_frames;
+
+ uint8_t pbu_type [CBS_APV_MAX_AU_FRAMES];
+ uint8_t group_id [CBS_APV_MAX_AU_FRAMES];
+ uint8_t reserved_zero_8bits[CBS_APV_MAX_AU_FRAMES];
+ APVRawFrameInfo frame_info [CBS_APV_MAX_AU_FRAMES];
+
+ uint8_t reserved_zero_8bits_2;
+
+ APVRawFiller filler;
+} APVRawAUInfo;
+
+typedef struct APVRawMetadataITUTT35 {
+ uint8_t itu_t_t35_country_code;
+ uint8_t itu_t_t35_country_code_extension;
+
+ uint8_t *data;
+ AVBufferRef *data_ref;
+ size_t data_size;
+} APVRawMetadataITUTT35;
+
+typedef struct APVRawMetadataMDCV {
+ uint16_t primary_chromaticity_x[3];
+ uint16_t primary_chromaticity_y[3];
+ uint16_t white_point_chromaticity_x;
+ uint16_t white_point_chromaticity_y;
+ uint32_t max_mastering_luminance;
+ uint32_t min_mastering_luminance;
+} APVRawMetadataMDCV;
+
+typedef struct APVRawMetadataCLL {
+ uint16_t max_cll;
+ uint16_t max_fall;
+} APVRawMetadataCLL;
+
+typedef struct APVRawMetadataFiller {
+ uint32_t payload_size;
+} APVRawMetadataFiller;
+
+typedef struct APVRawMetadataUserDefined {
+ uint8_t uuid[16];
+
+ uint8_t *data;
+ AVBufferRef *data_ref;
+ size_t data_size;
+} APVRawMetadataUserDefined;
+
+typedef struct APVRawMetadataUndefined {
+ uint8_t *data;
+ AVBufferRef *data_ref;
+ size_t data_size;
+} APVRawMetadataUndefined;
+
+typedef struct APVRawMetadataPayload {
+ uint32_t payload_type;
+ uint32_t payload_size;
+ union {
+ APVRawMetadataITUTT35 itu_t_t35;
+ APVRawMetadataMDCV mdcv;
+ APVRawMetadataCLL cll;
+ APVRawMetadataFiller filler;
+ APVRawMetadataUserDefined user_defined;
+ APVRawMetadataUndefined undefined;
+ };
+} APVRawMetadataPayload;
+
+typedef struct APVRawMetadata {
+ APVRawPBUHeader pbu_header;
+
+ uint32_t metadata_size;
+ uint32_t metadata_count;
+
+ APVRawMetadataPayload payloads[CBS_APV_MAX_METADATA_PAYLOADS];
+
+ APVRawFiller filler;
+} APVRawMetadata;
+
+
+typedef struct APVDerivedTileInfo {
+ uint8_t tile_cols;
+ uint8_t tile_rows;
+ uint16_t num_tiles;
+ // The spec uses an extra element on the end of these arrays
+ // not corresponding to any tile.
+ uint16_t col_starts[APV_MAX_TILE_COLS + 1];
+ uint16_t row_starts[APV_MAX_TILE_ROWS + 1];
+} APVDerivedTileInfo;
+
+typedef struct CodedBitstreamAPVContext {
+ int bit_depth;
+ int num_comp;
+
+ APVDerivedTileInfo tile_info;
+} CodedBitstreamAPVContext;
+
+#endif /* AVCODEC_CBS_APV_H */
diff --git a/libavcodec/cbs_apv_syntax_template.c b/libavcodec/cbs_apv_syntax_template.c
new file mode 100644
index 0000000000..7864dfca9d
--- /dev/null
+++ b/libavcodec/cbs_apv_syntax_template.c
@@ -0,0 +1,596 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+static int FUNC(pbu_header)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawPBUHeader *current)
+{
+ int err;
+
+ ub(8, pbu_type);
+ ub(16, group_id);
+ u(8, reserved_zero_8bits, 0, 0);
+
+ return 0;
+}
+
+static int FUNC(byte_alignment)(CodedBitstreamContext *ctx, RWContext *rw)
+{
+ int err;
+
+ while (byte_alignment(rw) != 0)
+ fixed(1, alignment_bit_equal_to_zero, 0);
+
+ return 0;
+}
+
+static int FUNC(filler)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawFiller *current)
+{
+ int err;
+
+#ifdef READ
+ current->filler_size = 0;
+ while (show_bits(rw, 8) == 0xff) {
+ fixed(8, ff_byte, 0xff);
+ ++current->filler_size;
+ }
+#else
+ {
+ uint32_t i;
+ for (i = 0; i < current->filler_size; i++)
+ fixed(8, ff_byte, 0xff);
+ }
+#endif
+
+ return 0;
+}
+
+static int FUNC(frame_info)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawFrameInfo *current)
+{
+ int err;
+
+ ub(8, profile_idc);
+ ub(8, level_idc);
+ ub(3, band_idc);
+
+ u(5, reserved_zero_5bits, 0, 0);
+
+ ub(24, frame_width);
+ ub(24, frame_height);
+
+ u(4, chroma_format_idc, 0, 4);
+ if (current->chroma_format_idc == 1) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR,
+ "chroma_format_idc 1 for 4:2:0 is not allowed in APV.\n");
+ return AVERROR_INVALIDDATA;
+ }
+
+ u(4, bit_depth_minus8, 2, 8);
+
+ ub(8, capture_time_distance);
+
+ u(8, reserved_zero_8bits, 0, 0);
+
+ return 0;
+}
+
+static int FUNC(quantization_matrix)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawQuantizationMatrix *current)
+{
+ const CodedBitstreamAPVContext *priv = ctx->priv_data;
+ int err;
+
+ for (int c = 0; c < priv->num_comp; c++) {
+ for (int y = 0; y < 8; y++) {
+ for (int x = 0; x < 8 ; x++) {
+ us(8, q_matrix[c][x][y], 1, 255, 3, c, x, y);
+ }
+ }
+ }
+
+ return 0;
+}
+
+static int FUNC(tile_info)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawTileInfo *current,
+ const APVRawFrameHeader *fh)
+{
+ CodedBitstreamAPVContext *priv = ctx->priv_data;
+ int err;
+
+ u(20, tile_width_in_mbs,
+ APV_MIN_TILE_WIDTH_IN_MBS, MAX_UINT_BITS(20));
+ u(20, tile_height_in_mbs,
+ APV_MIN_TILE_HEIGHT_IN_MBS, MAX_UINT_BITS(20));
+
+ ub(1, tile_size_present_in_fh_flag);
+
+ cbs_apv_derive_tile_info(&priv->tile_info, fh);
+
+ if (current->tile_size_present_in_fh_flag) {
+ for (int t = 0; t < priv->tile_info.num_tiles; t++) {
+ us(32, tile_size_in_fh[t], 10, MAX_UINT_BITS(32), 1, t);
+ }
+ }
+
+ return 0;
+}
+
+static int FUNC(frame_header)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawFrameHeader *current)
+{
+ CodedBitstreamAPVContext *priv = ctx->priv_data;
+ int err;
+
+ CHECK(FUNC(frame_info)(ctx, rw, &current->frame_info));
+
+ u(8, reserved_zero_8bits, 0, 0);
+
+ ub(1, color_description_present_flag);
+ if (current->color_description_present_flag) {
+ ub(8, color_primaries);
+ ub(8, transfer_characteristics);
+ ub(8, matrix_coefficients);
+ ub(1, full_range_flag);
+ } else {
+ infer(color_primaries, 2);
+ infer(transfer_characteristics, 2);
+ infer(matrix_coefficients, 2);
+ infer(full_range_flag, 0);
+ }
+
+ priv->bit_depth = current->frame_info.bit_depth_minus8 + 8;
+ priv->num_comp = cbs_apv_get_num_comp(current);
+
+ ub(1, use_q_matrix);
+ if (current->use_q_matrix) {
+ CHECK(FUNC(quantization_matrix)(ctx, rw,
+ &current->quantization_matrix));
+ } else {
+ for (int c = 0; c < priv->num_comp; c++) {
+ for (int y = 0; y < 8; y++) {
+ for (int x = 0; x < 8 ; x++) {
+ infer(quantization_matrix.q_matrix[c][y][x], 16);
+ }
+ }
+ }
+ }
+
+ CHECK(FUNC(tile_info)(ctx, rw, &current->tile_info, current));
+
+ u(8, reserved_zero_8bits_2, 0, 0);
+
+ CHECK(FUNC(byte_alignment)(ctx, rw));
+
+ return 0;
+}
+
+static int FUNC(tile_header)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawTileHeader *current, int tile_idx)
+{
+ const CodedBitstreamAPVContext *priv = ctx->priv_data;
+ uint16_t expected_tile_header_size;
+ uint8_t max_qp;
+ int err;
+
+ expected_tile_header_size = 4 + priv->num_comp * (4 + 1) + 1;
+
+ u(16, tile_header_size,
+ expected_tile_header_size, expected_tile_header_size);
+
+ u(16, tile_index, tile_idx, tile_idx);
+
+ for (int c = 0; c < priv->num_comp; c++) {
+ us(32, tile_data_size[c], 1, MAX_UINT_BITS(32), 1, c);
+ }
+
+ max_qp = 3 + priv->bit_depth * 6;
+ for (int c = 0; c < priv->num_comp; c++) {
+ us(8, tile_qp[c], 0, max_qp, 1, c);
+ }
+
+ u(8, reserved_zero_8bits, 0, 0);
+
+ return 0;
+}
+
+static int FUNC(tile)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawTile *current, int tile_idx)
+{
+ const CodedBitstreamAPVContext *priv = ctx->priv_data;
+ int err;
+
+ CHECK(FUNC(tile_header)(ctx, rw, &current->tile_header, tile_idx));
+
+ for (int c = 0; c < priv->num_comp; c++) {
+ uint32_t comp_size = current->tile_header.tile_data_size[c];
+#ifdef READ
+ int pos = get_bits_count(rw);
+ av_assert0(pos % 8 == 0);
+ current->tile_data[c] = (uint8_t*)align_get_bits(rw);
+ skip_bits_long(rw, 8 * comp_size);
+#else
+ if (put_bytes_left(rw, 0) < comp_size)
+ return AVERROR(ENOSPC);
+ ff_copy_bits(rw, current->tile_data[c], comp_size * 8);
+#endif
+ }
+
+ return 0;
+}
+
+static int FUNC(frame)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawFrame *current)
+{
+ const CodedBitstreamAPVContext *priv = ctx->priv_data;
+ int err;
+
+ HEADER("Frame");
+
+ CHECK(FUNC(pbu_header)(ctx, rw, &current->pbu_header));
+
+ CHECK(FUNC(frame_header)(ctx, rw, &current->frame_header));
+
+ for (int t = 0; t < priv->tile_info.num_tiles; t++) {
+ us(32, tile_size[t], 10, MAX_UINT_BITS(32), 1, t);
+
+ CHECK(FUNC(tile)(ctx, rw, &current->tile[t], t));
+ }
+
+ CHECK(FUNC(filler)(ctx, rw, &current->filler));
+
+ return 0;
+}
+
+static int FUNC(au_info)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawAUInfo *current)
+{
+ int err;
+
+ HEADER("Access Unit Information");
+
+ u(16, num_frames, 1, CBS_APV_MAX_AU_FRAMES);
+
+ for (int i = 0; i < current->num_frames; i++) {
+ ubs(8, pbu_type[i], 1, i);
+ ubs(8, group_id[i], 1, i);
+
+ us(8, reserved_zero_8bits[i], 0, 0, 1, i);
+
+ CHECK(FUNC(frame_info)(ctx, rw, &current->frame_info[i]));
+ }
+
+ u(8, reserved_zero_8bits_2, 0, 0);
+
+ return 0;
+}
+
+static int FUNC(metadata_itu_t_t35)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataITUTT35 *current,
+ size_t payload_size)
+{
+ int err;
+ size_t read_size = payload_size - 1;
+
+ HEADER("ITU-T T.35 Metadata");
+
+ ub(8, itu_t_t35_country_code);
+
+ if (current->itu_t_t35_country_code == 0xff) {
+ ub(8, itu_t_t35_country_code_extension);
+ --read_size;
+ }
+
+#ifdef READ
+ current->data_size = read_size;
+ current->data_ref = av_buffer_alloc(current->data_size);
+ if (!current->data_ref)
+ return AVERROR(ENOMEM);
+ current->data = current->data_ref->data;
+#else
+ if (current->data_size != read_size) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Write size mismatch: "
+ "payload %zu but expecting %zu\n",
+ current->data_size, read_size);
+ return AVERROR(EINVAL);
+ }
+#endif
+
+ for (size_t i = 0; i < current->data_size; i++) {
+ xu(8, itu_t_t35_payload[i],
+ current->data[i], 0x00, 0xff, 1, i);
+ }
+
+ return 0;
+}
+
+static int FUNC(metadata_mdcv)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataMDCV *current)
+{
+ int err, i;
+
+ HEADER("MDCV Metadata");
+
+ for (i = 0; i < 3; i++) {
+ ubs(16, primary_chromaticity_x[i], 1, i);
+ ubs(16, primary_chromaticity_y[i], 1, i);
+ }
+
+ ub(16, white_point_chromaticity_x);
+ ub(16, white_point_chromaticity_y);
+
+ ub(32, max_mastering_luminance);
+ ub(32, min_mastering_luminance);
+
+ return 0;
+}
+
+static int FUNC(metadata_cll)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataCLL *current)
+{
+ int err;
+
+ HEADER("CLL Metadata");
+
+ ub(16, max_cll);
+ ub(16, max_fall);
+
+ return 0;
+}
+
+static int FUNC(metadata_filler)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataFiller *current,
+ size_t payload_size)
+{
+ int err;
+
+ HEADER("Filler Metadata");
+
+ for (size_t i = 0; i < payload_size; i++)
+ fixed(8, ff_byte, 0xff);
+
+ return 0;
+}
+
+static int FUNC(metadata_user_defined)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataUserDefined *current,
+ size_t payload_size)
+{
+ int err;
+
+ HEADER("User-Defined Metadata");
+
+ for (int i = 0; i < 16; i++)
+ ubs(8, uuid[i], 1, i);
+
+#ifdef READ
+ current->data_size = payload_size - 16;
+ current->data_ref = av_buffer_alloc(current->data_size);
+ if (!current->data_ref)
+ return AVERROR(ENOMEM);
+ current->data = current->data_ref->data;
+#else
+ if (current->data_size != payload_size - 16) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Write size mismatch: "
+ "payload %zu but expecting %zu\n",
+ current->data_size, payload_size - 16);
+ return AVERROR(EINVAL);
+ }
+#endif
+
+ for (size_t i = 0; i < current->data_size; i++) {
+ xu(8, user_defined_data_payload[i],
+ current->data[i], 0x00, 0xff, 1, i);
+ }
+
+ return 0;
+}
+
+static int FUNC(metadata_undefined)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataUndefined *current,
+ size_t payload_size)
+{
+ int err;
+
+ HEADER("Undefined Metadata");
+
+#ifdef READ
+ current->data_size = payload_size;
+ current->data_ref = av_buffer_alloc(current->data_size);
+ if (!current->data_ref)
+ return AVERROR(ENOMEM);
+ current->data = current->data_ref->data;
+#else
+ if (current->data_size != payload_size) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Write size mismatch: "
+ "payload %zu but expecting %zu\n",
+ current->data_size, payload_size);
+ return AVERROR(EINVAL);
+ }
+#endif
+
+ for (size_t i = 0; i < current->data_size; i++) {
+ xu(8, undefined_metadata_payload_byte[i],
+ current->data[i], 0x00, 0xff, 1, i);
+ }
+
+ return 0;
+}
+
+static int FUNC(metadata_payload)(CodedBitstreamContext *ctx,
+ RWContext *rw,
+ APVRawMetadataPayload *current)
+{
+ int err;
+
+ switch (current->payload_type) {
+ case APV_METADATA_ITU_T_T35:
+ CHECK(FUNC(metadata_itu_t_t35)(ctx, rw,
+ &current->itu_t_t35,
+ current->payload_size));
+ break;
+ case APV_METADATA_MDCV:
+ CHECK(FUNC(metadata_mdcv)(ctx, rw, &current->mdcv));
+ break;
+ case APV_METADATA_CLL:
+ CHECK(FUNC(metadata_cll)(ctx, rw, &current->cll));
+ break;
+ case APV_METADATA_FILLER:
+ CHECK(FUNC(metadata_filler)(ctx, rw,
+ &current->filler,
+ current->payload_size));
+ break;
+ case APV_METADATA_USER_DEFINED:
+ CHECK(FUNC(metadata_user_defined)(ctx, rw,
+ &current->user_defined,
+ current->payload_size));
+ break;
+ default:
+ CHECK(FUNC(metadata_undefined)(ctx, rw,
+ &current->undefined,
+ current->payload_size));
+ }
+
+ return 0;
+}
+
+static int FUNC(metadata)(CodedBitstreamContext *ctx, RWContext *rw,
+ APVRawMetadata *current)
+{
+ int err;
+
+#ifdef WRITE
+ PutBitContext metadata_start_state;
+ uint32_t metadata_start_position;
+ int trace;
+#endif
+
+ HEADER("Metadata");
+
+ CHECK(FUNC(pbu_header)(ctx, rw, &current->pbu_header));
+
+#ifdef READ
+ ub(32, metadata_size);
+
+ uint32_t metadata_bytes_left = current->metadata_size;
+ for (int p = 0; p < CBS_APV_MAX_METADATA_PAYLOADS; p++) {
+ APVRawMetadataPayload *pl = &current->payloads[p];
+ uint32_t tmp;
+
+ pl->payload_type = 0;
+ while (show_bits(rw, 8) == 0xff) {
+ fixed(8, ff_byte, 0xff);
+ pl->payload_type += 255;
+ --metadata_bytes_left;
+ }
+ xu(8, metadata_payload_type, tmp, 0, 254, 0);
+ pl->payload_type += tmp;
+ --metadata_bytes_left;
+
+ pl->payload_size = 0;
+ while (show_bits(rw, 8) == 0xff) {
+ fixed(8, ff_byte, 0xff);
+ pl->payload_size += 255;
+ --metadata_bytes_left;
+ }
+ xu(8, metadata_payload_size, tmp, 0, 254, 0);
+ pl->payload_size += tmp;
+ --metadata_bytes_left;
+
+ if (pl->payload_size > metadata_bytes_left) {
+ av_log(ctx->log_ctx, AV_LOG_ERROR, "Invalid metadata: "
+ "payload_size larger than remaining metadata size "
+ "(%"PRIu32" bytes).\n", pl->payload_size);
+ return AVERROR_INVALIDDATA;
+ }
+
+ CHECK(FUNC(metadata_payload)(ctx, rw, pl));
+
+ metadata_bytes_left -= pl->payload_size;
+
+ current->metadata_count = p + 1;
+ if (metadata_bytes_left == 0)
+ break;
+ }
+#else
+ // Two passes: the first write finds the size (with tracing
+ // disabled), the second write does the real write.
+
+ metadata_start_state = *rw;
+ metadata_start_position = put_bits_count(rw);
+
+ trace = ctx->trace_enable;
+ ctx->trace_enable = 0;
+
+ for (int pass = 1; pass <= 2; pass++) {
+ *rw = metadata_start_state;
+
+ ub(32, metadata_size);
+
+ for (int p = 0; p < current->metadata_count; p++) {
+ APVRawMetadataPayload *pl = &current->payloads[p];
+ uint32_t payload_start_position;
+ uint32_t tmp;
+
+ payload_start_position = put_bits_count(rw);
+
+ tmp = pl->payload_type;
+ while (tmp >= 255) {
+ fixed(8, ff_byte, 0xff);
+ tmp -= 255;
+ }
+ xu(8, metadata_payload_type, tmp, 0, 254, 0);
+
+ tmp = pl->payload_size;
+ while (tmp >= 255) {
+ fixed(8, ff_byte, 0xff);
+ tmp -= 255;
+ }
+ xu(8, metadata_payload_size, tmp, 0, 254, 0);
+
+ err = FUNC(metadata_payload)(ctx, rw, pl);
+ ctx->trace_enable = trace;
+ if (err < 0)
+ return err;
+
+ if (pass == 1) {
+ pl->payload_size = (put_bits_count(rw) -
+ payload_start_position) / 8;
+ }
+ }
+
+ if (pass == 1) {
+ current->metadata_size = (put_bits_count(rw) -
+ metadata_start_position) / 8;
+ ctx->trace_enable = trace;
+ }
+ }
+#endif
+
+ CHECK(FUNC(filler)(ctx, rw, &current->filler));
+
+ return 0;
+}
diff --git a/libavcodec/cbs_internal.h b/libavcodec/cbs_internal.h
index 1ed1f04c15..c3265924ba 100644
--- a/libavcodec/cbs_internal.h
+++ b/libavcodec/cbs_internal.h
@@ -42,6 +42,9 @@
#define CBS_TRACE 1
#endif
+#ifndef CBS_APV
+#define CBS_APV CONFIG_CBS_APV
+#endif
#ifndef CBS_AV1
#define CBS_AV1 CONFIG_CBS_AV1
#endif
@@ -383,6 +386,7 @@ int CBS_FUNC(write_signed)(CodedBitstreamContext *ctx, PutBitContext *pbc,
#define CBS_UNIT_TYPE_END_OF_LIST { .nb_unit_types = 0 }
+extern const CodedBitstreamType CBS_FUNC(type_apv);
extern const CodedBitstreamType CBS_FUNC(type_av1);
extern const CodedBitstreamType CBS_FUNC(type_h264);
extern const CodedBitstreamType CBS_FUNC(type_h265);
diff --git a/libavformat/cbs.h b/libavformat/cbs.h
index e4dc231001..0fab3a7457 100644
--- a/libavformat/cbs.h
+++ b/libavformat/cbs.h
@@ -22,6 +22,7 @@
#define CBS_PREFIX lavf_cbs
#define CBS_WRITE 0
#define CBS_TRACE 0
+#define CBS_APV 0
#define CBS_H264 0
#define CBS_H265 0
#define CBS_H266 0
--
2.47.2
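
Note on the metadata payload_type / payload_size fields handled in FUNC(metadata) above: each value is written as a run of 0xff bytes (each worth 255) followed by one final byte in the range 0..254. A minimal standalone sketch of that coding follows. The sketch_* names are hypothetical and not part of the patch, and it works on a plain byte buffer rather than through the CBS read/write macros.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one extended value: sum 255 for every leading 0xff byte, then
 * add the final (non-0xff) byte. Returns 0 on success, -1 if the
 * buffer ends before the final byte. (Hypothetical helper.) */
static int sketch_read_ext(const uint8_t *buf, size_t size,
                           size_t *pos, uint32_t *value)
{
    uint32_t v = 0;

    while (*pos < size && buf[*pos] == 0xff) {
        v += 255;
        ++*pos;
    }
    if (*pos >= size)
        return -1;

    v += buf[(*pos)++];
    *value = v;
    return 0;
}

/* Write one extended value. Returns the number of bytes written, or 0
 * if the buffer is too small. (Hypothetical helper.) */
static size_t sketch_write_ext(uint8_t *buf, size_t size, uint32_t value)
{
    size_t n = 0;

    while (value >= 255) {
        if (n >= size)
            return 0;
        buf[n++] = 0xff;
        value -= 255;
    }
    if (n >= size)
        return 0;

    buf[n++] = (uint8_t)value;
    return n;
}
```

A value that is an exact multiple of 255 still needs a terminating zero byte, which is why the final byte is limited to 0..254 in the xu() calls.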
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 17+ messages in thread
* [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 1/7] lavc: APV codec ID and descriptor Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
2025-04-24 0:10 ` James Almer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder Mark Thompson
` (3 subsequent siblings)
6 siblings, 1 reply; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
Demuxes raw streams as defined in draft spec section 10.2.
---
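
For reference, the start of a raw stream as checked by apv_probe() is three big-endian 32-bit fields: au_size, signature and pbu_size. A minimal standalone sketch of the same checks is below. The sketch_* names are hypothetical, and the signature value 'aPv1' is an assumption here, since APV_SIGNATURE is defined in apv.h and not shown in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value of APV_SIGNATURE ('aPv1'); the real definition lives
 * in libavcodec/apv.h. */
#define SKETCH_APV_SIGNATURE 0x61507631u

/* Read a big-endian 32-bit value. */
static uint32_t sketch_rb32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] <<  8 | (uint32_t)p[3];
}

/* Returns 1 if the buffer starts like a raw APV access unit, else 0.
 * The size thresholds mirror those used in apv_probe(). */
static int sketch_check_au_prefix(const uint8_t *buf, size_t size)
{
    if (size < 12)
        return 0;
    if (sketch_rb32(buf) < 24)                         /* au_size */
        return 0;
    if (sketch_rb32(buf + 4) != SKETCH_APV_SIGNATURE)  /* signature */
        return 0;
    if (sketch_rb32(buf + 8) < 16)                     /* pbu_size */
        return 0;
    return 1;
}
```

The real probe goes further and validates the PBU and frame headers via apv_extract_header_info() before returning AVPROBE_SCORE_MAX.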
libavformat/Makefile | 1 +
libavformat/allformats.c | 1 +
libavformat/apvdec.c | 241 +++++++++++++++++++++++++++++++++++++++
3 files changed, 243 insertions(+)
create mode 100644 libavformat/apvdec.c
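
As an illustration of how apv_extract_header_info() turns the packed format byte into a column of apv_format_table, here is a small standalone sketch. The struct and function names are hypothetical. It only assumes what the patch shows: chroma_format_idc in the high nibble, bit_depth_minus8 in the low nibble, and table columns for 8, 10, 12, 14 and 16 bits.

```c
#include <stdint.h>

typedef struct SketchFormatBits {
    uint8_t chroma_format_idc;
    uint8_t bit_depth_minus8;
    int     bit_depth_index;  /* table column, or -1 if unsupported */
} SketchFormatBits;

/* Unpack the format byte and derive the table column. */
static SketchFormatBits sketch_unpack_format(uint8_t byte)
{
    SketchFormatBits f;

    f.chroma_format_idc = byte >> 4;
    f.bit_depth_minus8  = byte & 0xf;

    /* Depths above 16 bits and odd depths are rejected, as in the
     * demuxer. */
    if (f.bit_depth_minus8 > 8 || f.bit_depth_minus8 % 2)
        f.bit_depth_index = -1;
    else
        f.bit_depth_index = f.bit_depth_minus8 / 2;

    return f;
}
```

For example, a byte of 0x22 would give chroma_format_idc 2 (4:2:2) with 10-bit samples, which selects column 1.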
diff --git a/libavformat/Makefile b/libavformat/Makefile
index a94ac66e7e..ef96c2762e 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -119,6 +119,7 @@ OBJS-$(CONFIG_APTX_DEMUXER) += aptxdec.o
OBJS-$(CONFIG_APTX_MUXER) += rawenc.o
OBJS-$(CONFIG_APTX_HD_DEMUXER) += aptxdec.o
OBJS-$(CONFIG_APTX_HD_MUXER) += rawenc.o
+OBJS-$(CONFIG_APV_DEMUXER) += apvdec.o
OBJS-$(CONFIG_AQTITLE_DEMUXER) += aqtitledec.o subtitles.o
OBJS-$(CONFIG_ARGO_ASF_DEMUXER) += argo_asf.o
OBJS-$(CONFIG_ARGO_ASF_MUXER) += argo_asf.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index 445f13f42a..90a4fe64ec 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -72,6 +72,7 @@ extern const FFInputFormat ff_aptx_demuxer;
extern const FFOutputFormat ff_aptx_muxer;
extern const FFInputFormat ff_aptx_hd_demuxer;
extern const FFOutputFormat ff_aptx_hd_muxer;
+extern const FFInputFormat ff_apv_demuxer;
extern const FFInputFormat ff_aqtitle_demuxer;
extern const FFInputFormat ff_argo_asf_demuxer;
extern const FFOutputFormat ff_argo_asf_muxer;
diff --git a/libavformat/apvdec.c b/libavformat/apvdec.c
new file mode 100644
index 0000000000..04f9ef0a8f
--- /dev/null
+++ b/libavformat/apvdec.c
@@ -0,0 +1,241 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavcodec/apv.h"
+#include "libavcodec/bytestream.h"
+
+#include "avformat.h"
+#include "avio_internal.h"
+#include "demux.h"
+#include "internal.h"
+
+
+typedef struct APVHeaderInfo {
+ uint8_t pbu_type;
+ uint16_t group_id;
+
+ uint8_t profile_idc;
+ uint8_t level_idc;
+ uint8_t band_idc;
+
+ int frame_width;
+ int frame_height;
+
+ uint8_t chroma_format_idc;
+ uint8_t bit_depth_minus8;
+
+ enum AVPixelFormat pixel_format;
+} APVHeaderInfo;
+
+static const enum AVPixelFormat apv_format_table[5][5] = {
+ { AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16 },
+ { 0 }, // 4:2:0 is not valid.
+ { AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV422P16 },
+ { AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV444P10, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV444P14, AV_PIX_FMT_YUV444P16 },
+ { AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_YUVA444P12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_YUVA444P16 },
+};
+
+static int apv_extract_header_info(APVHeaderInfo *info,
+ GetByteContext *gbc)
+{
+ int zero, byte, bit_depth_index;
+
+ info->pbu_type = bytestream2_get_byte(gbc);
+ info->group_id = bytestream2_get_be16(gbc);
+
+ zero = bytestream2_get_byte(gbc);
+ if (zero != 0)
+ return AVERROR_INVALIDDATA;
+
+ if (info->pbu_type != APV_PBU_PRIMARY_FRAME)
+ return AVERROR_INVALIDDATA;
+
+ info->profile_idc = bytestream2_get_byte(gbc);
+ info->level_idc = bytestream2_get_byte(gbc);
+
+ byte = bytestream2_get_byte(gbc);
+ info->band_idc = byte >> 3;
+ zero = byte & 7;
+ if (zero != 0)
+ return AVERROR_INVALIDDATA;
+
+ info->frame_width = bytestream2_get_be24(gbc);
+ info->frame_height = bytestream2_get_be24(gbc);
+ if (info->frame_width < 1 || info->frame_width > 65536 ||
+ info->frame_height < 1 || info->frame_height > 65536)
+ return AVERROR_INVALIDDATA;
+
+ byte = bytestream2_get_byte(gbc);
+ info->chroma_format_idc = byte >> 4;
+ info->bit_depth_minus8 = byte & 0xf;
+
+ if (info->bit_depth_minus8 > 8) {
+ return AVERROR_INVALIDDATA;
+ }
+ if (info->bit_depth_minus8 % 2) {
+ // Odd bit depths are technically valid but not useful here.
+ return AVERROR_INVALIDDATA;
+ }
+ bit_depth_index = info->bit_depth_minus8 / 2;
+
+ switch (info->chroma_format_idc) {
+ case APV_CHROMA_FORMAT_400:
+ case APV_CHROMA_FORMAT_422:
+ case APV_CHROMA_FORMAT_444:
+ case APV_CHROMA_FORMAT_4444:
+ info->pixel_format = apv_format_table[info->chroma_format_idc][bit_depth_index];
+ break;
+ default:
+ return AVERROR_INVALIDDATA;
+ }
+
+ // Ignore capture_time_distance.
+ bytestream2_skip(gbc, 1);
+
+ zero = bytestream2_get_byte(gbc);
+ if (zero != 0)
+ return AVERROR_INVALIDDATA;
+
+ return 1;
+}
+
+static int apv_probe(const AVProbeData *p)
+{
+ GetByteContext gbc;
+ APVHeaderInfo header;
+ uint32_t au_size, signature, pbu_size;
+ int err;
+
+ if (p->buf_size < 28) {
+ // Too small to fit an APV header.
+ return 0;
+ }
+
+ bytestream2_init(&gbc, p->buf, p->buf_size);
+
+ au_size = bytestream2_get_be32(&gbc);
+ if (au_size < 24) {
+ // Too small.
+ return 0;
+ }
+ signature = bytestream2_get_be32(&gbc);
+ if (signature != APV_SIGNATURE) {
+ // Signature is mandatory.
+ return 0;
+ }
+ pbu_size = bytestream2_get_be32(&gbc);
+ if (pbu_size < 16) {
+ // Too small.
+ return 0;
+ }
+
+ err = apv_extract_header_info(&header, &gbc);
+ if (err < 0) {
+ // Header does not look like APV.
+ return 0;
+ }
+ return AVPROBE_SCORE_MAX;
+}
+
+static int apv_read_header(AVFormatContext *s)
+{
+ AVStream *st;
+ GetByteContext gbc;
+ APVHeaderInfo header;
+ uint8_t buffer[28];
+ uint32_t au_size, signature, pbu_size;
+ int err, size;
+
+ err = ffio_ensure_seekback(s->pb, sizeof(buffer));
+ if (err < 0)
+ return err;
+ size = avio_read(s->pb, buffer, sizeof(buffer));
+ if (size < 0)
+ return size;
+
+ bytestream2_init(&gbc, buffer, sizeof(buffer));
+
+ au_size = bytestream2_get_be32(&gbc);
+ if (au_size < 24) {
+ // Too small.
+ return AVERROR_INVALIDDATA;
+ }
+ signature = bytestream2_get_be32(&gbc);
+ if (signature != APV_SIGNATURE) {
+ // Signature is mandatory.
+ return AVERROR_INVALIDDATA;
+ }
+ pbu_size = bytestream2_get_be32(&gbc);
+ if (pbu_size < 16) {
+ // Too small.
+ return AVERROR_INVALIDDATA;
+ }
+
+ err = apv_extract_header_info(&header, &gbc);
+ if (err < 0)
+ return err;
+
+ st = avformat_new_stream(s, NULL);
+ if (!st)
+ return AVERROR(ENOMEM);
+
+ st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
+ st->codecpar->codec_id = AV_CODEC_ID_APV;
+ st->codecpar->format = header.pixel_format;
+ st->codecpar->profile = header.profile_idc;
+ st->codecpar->level = header.level_idc;
+ st->codecpar->width = header.frame_width;
+ st->codecpar->height = header.frame_height;
+
+ st->avg_frame_rate = (AVRational){ 30, 1 };
+ avpriv_set_pts_info(st, 64, 1, 30);
+
+ avio_seek(s->pb, -size, SEEK_CUR);
+
+ return 0;
+}
+
+static int apv_read_packet(AVFormatContext *s, AVPacket *pkt)
+{
+ uint32_t au_size;
+ int ret;
+
+ au_size = avio_rb32(s->pb);
+ if (au_size == 0 && avio_feof(s->pb))
+ return AVERROR_EOF;
+ if (au_size < 16 || au_size > 1 << 24) {
+ av_log(s, AV_LOG_ERROR, "APV AU has invalid size %"PRIu32".\n", au_size);
+ return AVERROR_INVALIDDATA;
+ }
+
+ ret = av_get_packet(s->pb, pkt, au_size);
+ pkt->flags = AV_PKT_FLAG_KEY;
+
+ return ret;
+}
+
+const FFInputFormat ff_apv_demuxer = {
+ .p.name = "apv",
+ .p.long_name = NULL_IF_CONFIG_SMALL("APV raw bitstream"),
+ .p.extensions = "apv",
+ .p.flags = AVFMT_GENERIC_INDEX | AVFMT_NOTIMESTAMPS,
+ .flags_internal = FF_INFMT_FLAG_INIT_CLEANUP,
+ .read_probe = apv_probe,
+ .read_header = apv_read_header,
+ .read_packet = apv_read_packet,
+};
--
2.47.2
^ permalink raw reply [flat|nested] 17+ messages in thread
* [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
` (2 preceding siblings ...)
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
2025-04-24 3:04 ` James Almer
2025-04-25 17:25 ` Michael Niedermayer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64 Mark Thompson
` (2 subsequent siblings)
6 siblings, 2 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
---
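
A note on the quantiser handling in apv_decode_tile_component(): tile_qp is split into a scale factor (qp % 6, looked up in apv_level_scale) and a shift (qp / 6). The sketch below isolates that split. The table values are copied from the patch, while the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Same values as apv_level_scale in the decoder. */
static const uint8_t sketch_level_scale[6] = { 40, 45, 51, 57, 64, 71 };

typedef struct SketchQP {
    int level_scale;  /* multiplied into the quantisation matrix */
    int qp_shift;     /* applied separately in the transquant step */
} SketchQP;

/* Split a tile QP into scale and shift. Assumes qp >= 0. */
static SketchQP sketch_split_qp(int qp)
{
    SketchQP r;

    r.level_scale = sketch_level_scale[qp % 6];
    r.qp_shift    = qp / 6;
    return r;
}
```

With the max_qp of 3 + bit_depth * 6 from the tile header syntax, a 10-bit stream has qp up to 63, which splits into scale 57 and shift 10. As the decoder comment says, the shift is kept apart because folding it into the 16-bit scaled matrix would overflow.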
configure | 1 +
libavcodec/Makefile | 1 +
libavcodec/allcodecs.c | 1 +
libavcodec/apv_decode.c | 403 +++++++++++++++++++++++++++++++++++++++
libavcodec/apv_decode.h | 80 ++++++++
libavcodec/apv_dsp.c | 136 +++++++++++++
libavcodec/apv_dsp.h | 37 ++++
libavcodec/apv_entropy.c | 200 +++++++++++++++++++
8 files changed, 859 insertions(+)
create mode 100644 libavcodec/apv_decode.c
create mode 100644 libavcodec/apv_decode.h
create mode 100644 libavcodec/apv_dsp.c
create mode 100644 libavcodec/apv_dsp.h
create mode 100644 libavcodec/apv_entropy.c
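
The decoder runs one execute2() job per (tile, component) pair. A standalone sketch of how a job number maps back to a tile position, following the arithmetic in apv_decode_tile_component(), is below. The names are hypothetical, and tile_cols stands in for tile_info->tile_cols.

```c
typedef struct SketchJob {
    int tile_index;  /* raster-order tile number */
    int comp_index;  /* component within the tile */
    int tile_x;      /* tile column */
    int tile_y;      /* tile row */
} SketchJob;

/* Map a flat job index to its tile and component. Assumes
 * num_comp > 0 and tile_cols > 0. */
static SketchJob sketch_map_job(int job, int num_comp, int tile_cols)
{
    SketchJob j;

    j.tile_index = job / num_comp;
    j.comp_index = job % num_comp;
    j.tile_y     = j.tile_index / tile_cols;
    j.tile_x     = j.tile_index % tile_cols;
    return j;
}
```

Since every component of every tile has its own entropy-coded data, the jobs need no ordering between them, which is what makes the flat job_count = num_tiles * num_comp split possible.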
diff --git a/configure b/configure
index ca404d2797..ee270b770c 100755
--- a/configure
+++ b/configure
@@ -2935,6 +2935,7 @@ apng_decoder_select="inflate_wrapper"
apng_encoder_select="deflate_wrapper llvidencdsp"
aptx_encoder_select="audio_frame_queue"
aptx_hd_encoder_select="audio_frame_queue"
+apv_decoder_select="cbs_apv"
asv1_decoder_select="blockdsp bswapdsp idctdsp"
asv1_encoder_select="aandcttables bswapdsp fdctdsp pixblockdsp"
asv2_decoder_select="blockdsp bswapdsp idctdsp"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index a5f5c4e904..e674671460 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -244,6 +244,7 @@ OBJS-$(CONFIG_APTX_HD_DECODER) += aptxdec.o aptx.o
OBJS-$(CONFIG_APTX_HD_ENCODER) += aptxenc.o aptx.o
OBJS-$(CONFIG_APNG_DECODER) += png.o pngdec.o pngdsp.o
OBJS-$(CONFIG_APNG_ENCODER) += png.o pngenc.o
+OBJS-$(CONFIG_APV_DECODER) += apv_decode.o apv_entropy.o apv_dsp.o
OBJS-$(CONFIG_ARBC_DECODER) += arbc.o
OBJS-$(CONFIG_ARGO_DECODER) += argo.o
OBJS-$(CONFIG_SSA_DECODER) += assdec.o ass.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index f10519617e..09f06c71d6 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -47,6 +47,7 @@ extern const FFCodec ff_anm_decoder;
extern const FFCodec ff_ansi_decoder;
extern const FFCodec ff_apng_encoder;
extern const FFCodec ff_apng_decoder;
+extern const FFCodec ff_apv_decoder;
extern const FFCodec ff_arbc_decoder;
extern const FFCodec ff_argo_decoder;
extern const FFCodec ff_asv1_encoder;
diff --git a/libavcodec/apv_decode.c b/libavcodec/apv_decode.c
new file mode 100644
index 0000000000..0cc4f57dab
--- /dev/null
+++ b/libavcodec/apv_decode.c
@@ -0,0 +1,403 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/mastering_display_metadata.h"
+#include "libavutil/mem_internal.h"
+#include "libavutil/pixdesc.h"
+
+#include "apv.h"
+#include "apv_decode.h"
+#include "apv_dsp.h"
+#include "avcodec.h"
+#include "cbs.h"
+#include "cbs_apv.h"
+#include "codec_internal.h"
+#include "decode.h"
+#include "thread.h"
+
+
+typedef struct APVDecodeContext {
+ CodedBitstreamContext *cbc;
+ APVDSPContext dsp;
+
+ CodedBitstreamFragment au;
+ APVDerivedTileInfo tile_info;
+
+ APVVLCLUT decode_lut;
+
+ AVFrame *output_frame;
+} APVDecodeContext;
+
+static const enum AVPixelFormat apv_format_table[5][5] = {
+ { AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16 },
+ { 0 }, // 4:2:0 is not valid.
+ { AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV422P16 },
+ { AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV444P10, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV444P14, AV_PIX_FMT_YUV444P16 },
+ { AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_YUVA444P12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_YUVA444P16 },
+};
+
+static int apv_decode_check_format(AVCodecContext *avctx,
+ const APVRawFrameHeader *header)
+{
+ int err, bit_depth;
+
+ avctx->profile = header->frame_info.profile_idc;
+ avctx->level = header->frame_info.level_idc;
+
+ bit_depth = header->frame_info.bit_depth_minus8 + 8;
+ if (bit_depth < 8 || bit_depth > 16 || bit_depth % 2) {
+ avpriv_request_sample(avctx, "Bit depth %d", bit_depth);
+ return AVERROR_PATCHWELCOME;
+ }
+ avctx->pix_fmt =
+ apv_format_table[header->frame_info.chroma_format_idc][(bit_depth - 8) >> 1];
+
+ err = ff_set_dimensions(avctx,
+ FFALIGN(header->frame_info.frame_width, 16),
+ FFALIGN(header->frame_info.frame_height, 16));
+ if (err < 0) {
+ // Unsupported frame size.
+ return err;
+ }
+ avctx->width = header->frame_info.frame_width;
+ avctx->height = header->frame_info.frame_height;
+
+ avctx->sample_aspect_ratio = (AVRational){ 1, 1 };
+
+ avctx->color_primaries = header->color_primaries;
+ avctx->color_trc = header->transfer_characteristics;
+ avctx->colorspace = header->matrix_coefficients;
+ avctx->color_range = header->full_range_flag ? AVCOL_RANGE_JPEG
+ : AVCOL_RANGE_MPEG;
+ avctx->chroma_sample_location = AVCHROMA_LOC_TOPLEFT;
+
+ avctx->refs = 0;
+ avctx->has_b_frames = 0;
+
+ return 0;
+}
+
+static av_cold int apv_decode_init(AVCodecContext *avctx)
+{
+ APVDecodeContext *apv = avctx->priv_data;
+ int err;
+
+ err = ff_cbs_init(&apv->cbc, AV_CODEC_ID_APV, avctx);
+ if (err < 0)
+ return err;
+
+ ff_apv_entropy_build_decode_lut(&apv->decode_lut);
+
+ ff_apv_dsp_init(&apv->dsp);
+
+ if (avctx->extradata) {
+ av_log(avctx, AV_LOG_WARNING,
+ "APV does not support extradata.\n");
+ }
+
+ return 0;
+}
+
+static av_cold int apv_decode_close(AVCodecContext *avctx)
+{
+ APVDecodeContext *apv = avctx->priv_data;
+
+ ff_cbs_fragment_free(&apv->au);
+ ff_cbs_close(&apv->cbc);
+
+ return 0;
+}
+
+static int apv_decode_block(AVCodecContext *avctx,
+ void *output,
+ ptrdiff_t pitch,
+ GetBitContext *gbc,
+ APVEntropyState *entropy_state,
+ int bit_depth,
+ int qp_shift,
+ const uint16_t *qmatrix)
+{
+ APVDecodeContext *apv = avctx->priv_data;
+ int err;
+
+ LOCAL_ALIGNED_32(int16_t, coeff, [64]);
+
+ err = ff_apv_entropy_decode_block(coeff, gbc, entropy_state);
+ if (err < 0)
+ return err;
+
+ apv->dsp.decode_transquant(output, pitch,
+ coeff, qmatrix,
+ bit_depth, qp_shift);
+
+ return 0;
+}
+
+static int apv_decode_tile_component(AVCodecContext *avctx, void *data,
+ int job, int thread)
+{
+ APVRawFrame *input = data;
+ APVDecodeContext *apv = avctx->priv_data;
+ const CodedBitstreamAPVContext *apv_cbc = apv->cbc->priv_data;
+ const APVDerivedTileInfo *tile_info = &apv_cbc->tile_info;
+
+ int tile_index = job / apv_cbc->num_comp;
+ int comp_index = job % apv_cbc->num_comp;
+
+ const AVPixFmtDescriptor *pix_fmt_desc =
+ av_pix_fmt_desc_get(avctx->pix_fmt);
+
+ int sub_w = comp_index == 0 ? 1 : pix_fmt_desc->log2_chroma_w + 1;
+ int sub_h = comp_index == 0 ? 1 : pix_fmt_desc->log2_chroma_h + 1;
+
+ APVRawTile *tile = &input->tile[tile_index];
+
+ int tile_y = tile_index / tile_info->tile_cols;
+ int tile_x = tile_index % tile_info->tile_cols;
+
+ int tile_start_x = tile_info->col_starts[tile_x];
+ int tile_start_y = tile_info->row_starts[tile_y];
+
+ int tile_width = tile_info->col_starts[tile_x + 1] - tile_start_x;
+ int tile_height = tile_info->row_starts[tile_y + 1] - tile_start_y;
+
+ int tile_mb_width = tile_width / APV_MB_WIDTH;
+ int tile_mb_height = tile_height / APV_MB_HEIGHT;
+
+ int blk_mb_width = 2 / sub_w;
+ int blk_mb_height = 2 / sub_h;
+
+ int bit_depth;
+ int qp_shift;
+ LOCAL_ALIGNED_32(uint16_t, qmatrix_scaled, [64]);
+
+ GetBitContext gbc;
+
+ APVEntropyState entropy_state = {
+ .log_ctx = avctx,
+ .decode_lut = &apv->decode_lut,
+ .prev_dc = 0,
+ .prev_dc_diff = 20,
+ .prev_1st_ac_level = 0,
+ };
+
+ init_get_bits8(&gbc, tile->tile_data[comp_index],
+ tile->tile_header.tile_data_size[comp_index]);
+
+ // Combine the bitstream quantisation matrix with the qp scaling
+ // in advance. (Including qp_shift as well would overflow 16 bits.)
+ // Fix the row ordering at the same time.
+ {
+ static const uint8_t apv_level_scale[6] = { 40, 45, 51, 57, 64, 71 };
+ int qp = tile->tile_header.tile_qp[comp_index];
+ int level_scale = apv_level_scale[qp % 6];
+
+ bit_depth = apv_cbc->bit_depth;
+ qp_shift = qp / 6;
+
+ for (int y = 0; y < 8; y++) {
+ for (int x = 0; x < 8; x++)
+ qmatrix_scaled[y * 8 + x] = level_scale *
+ input->frame_header.quantization_matrix.q_matrix[comp_index][x][y];
+ }
+ }
+
+ for (int mb_y = 0; mb_y < tile_mb_height; mb_y++) {
+ for (int mb_x = 0; mb_x < tile_mb_width; mb_x++) {
+ for (int blk_y = 0; blk_y < blk_mb_height; blk_y++) {
+ for (int blk_x = 0; blk_x < blk_mb_width; blk_x++) {
+ int frame_y = (tile_start_y +
+ APV_MB_HEIGHT * mb_y +
+ APV_TR_SIZE * blk_y) / sub_h;
+ int frame_x = (tile_start_x +
+ APV_MB_WIDTH * mb_x +
+ APV_TR_SIZE * blk_x) / sub_w;
+
+ ptrdiff_t frame_pitch = apv->output_frame->linesize[comp_index];
+ uint8_t *block_start = apv->output_frame->data[comp_index] +
+ frame_y * frame_pitch + 2 * frame_x;
+
+ apv_decode_block(avctx,
+ block_start, frame_pitch,
+ &gbc, &entropy_state,
+ bit_depth,
+ qp_shift,
+ qmatrix_scaled);
+ }
+ }
+ }
+ }
+
+ av_log(avctx, AV_LOG_DEBUG,
+ "Decoded tile %d component %d: %dx%d MBs starting at (%d,%d)\n",
+ tile_index, comp_index, tile_mb_width, tile_mb_height,
+ tile_start_x, tile_start_y);
+
+ return 0;
+}
+
+static int apv_decode(AVCodecContext *avctx, AVFrame *output,
+ APVRawFrame *input)
+{
+ APVDecodeContext *apv = avctx->priv_data;
+ const CodedBitstreamAPVContext *apv_cbc = apv->cbc->priv_data;
+ const APVDerivedTileInfo *tile_info = &apv_cbc->tile_info;
+ int err, job_count;
+
+ err = apv_decode_check_format(avctx, &input->frame_header);
+ if (err < 0) {
+ av_log(avctx, AV_LOG_ERROR, "Unsupported format parameters.\n");
+ return err;
+ }
+
+ err = ff_thread_get_buffer(avctx, output, 0);
+ if (err) {
+ av_log(avctx, AV_LOG_ERROR, "Failed to get output frame buffer.\n");
+ return err;
+ }
+
+ apv->output_frame = output;
+
+ // Each component within a tile is independent of every other,
+ // so we can decode all in parallel.
+ job_count = tile_info->num_tiles * apv_cbc->num_comp;
+
+ avctx->execute2(avctx, apv_decode_tile_component,
+ input, NULL, job_count);
+
+ return 0;
+}
+
+static int apv_decode_metadata(AVCodecContext *avctx, AVFrame *frame,
+ const APVRawMetadata *md)
+{
+ int err;
+
+ for (int i = 0; i < md->metadata_count; i++) {
+ const APVRawMetadataPayload *pl = &md->payloads[i];
+
+ switch (pl->payload_type) {
+ case APV_METADATA_MDCV:
+ {
+ const APVRawMetadataMDCV *mdcv = &pl->mdcv;
+ AVMasteringDisplayMetadata *mdm;
+
+ err = ff_decode_mastering_display_new(avctx, frame, &mdm);
+ if (err < 0)
+ return err;
+
+ if (mdm) {
+ for (int i = 0; i < 3; i++) {
+ mdm->display_primaries[i][0] =
+ av_make_q(mdcv->primary_chromaticity_x[i], 1 << 16);
+ mdm->display_primaries[i][1] =
+ av_make_q(mdcv->primary_chromaticity_y[i], 1 << 16);
+ }
+
+ mdm->white_point[0] =
+ av_make_q(mdcv->white_point_chromaticity_x, 1 << 16);
+ mdm->white_point[1] =
+ av_make_q(mdcv->white_point_chromaticity_y, 1 << 16);
+
+ mdm->max_luminance =
+ av_make_q(mdcv->max_mastering_luminance, 1 << 8);
+ mdm->min_luminance =
+ av_make_q(mdcv->min_mastering_luminance, 1 << 14);
+
+ mdm->has_primaries = 1;
+ mdm->has_luminance = 1;
+ }
+ }
+ break;
+ case APV_METADATA_CLL:
+ {
+ const APVRawMetadataCLL *cll = &pl->cll;
+ AVContentLightMetadata *clm;
+
+ err = ff_decode_content_light_new(avctx, frame, &clm);
+ if (err < 0)
+ return err;
+
+ if (clm) {
+ clm->MaxCLL = cll->max_cll;
+ clm->MaxFALL = cll->max_fall;
+ }
+ }
+ break;
+ default:
+ break; // Ignore other types of metadata.
+ }
+ }
+
+ return 0;
+}
+
+static int apv_decode_frame(AVCodecContext *avctx, AVFrame *frame,
+ int *got_frame, AVPacket *packet)
+{
+ APVDecodeContext *apv = avctx->priv_data;
+ CodedBitstreamFragment *au = &apv->au;
+ int err;
+
+ err = ff_cbs_read_packet(apv->cbc, au, packet);
+ if (err < 0) {
+ av_log(avctx, AV_LOG_ERROR, "Failed to read packet.\n");
+ return err;
+ }
+
+ for (int i = 0; i < au->nb_units; i++) {
+ CodedBitstreamUnit *pbu = &au->units[i];
+
+ switch (pbu->type) {
+ case APV_PBU_PRIMARY_FRAME:
+ err = apv_decode(avctx, frame, pbu->content);
+ if (err < 0)
+ return err;
+ *got_frame = 1;
+ break;
+ case APV_PBU_METADATA:
+ apv_decode_metadata(avctx, frame, pbu->content);
+ break;
+ case APV_PBU_ACCESS_UNIT_INFORMATION:
+ case APV_PBU_FILLER:
+ // Ignored by the decoder.
+ break;
+ default:
+ av_log(avctx, AV_LOG_WARNING,
+ "Ignoring unsupported PBU type %d.\n", pbu->type);
+ }
+ }
+
+ ff_cbs_fragment_reset(au);
+
+ return packet->size;
+}
+
+const FFCodec ff_apv_decoder = {
+ .p.name = "apv",
+ CODEC_LONG_NAME("Advanced Professional Video"),
+ .p.type = AVMEDIA_TYPE_VIDEO,
+ .p.id = AV_CODEC_ID_APV,
+ .priv_data_size = sizeof(APVDecodeContext),
+ .init = apv_decode_init,
+ .close = apv_decode_close,
+ FF_CODEC_DECODE_CB(apv_decode_frame),
+ .p.capabilities = AV_CODEC_CAP_DR1 |
+ AV_CODEC_CAP_SLICE_THREADS |
+ AV_CODEC_CAP_FRAME_THREADS,
+};
diff --git a/libavcodec/apv_decode.h b/libavcodec/apv_decode.h
new file mode 100644
index 0000000000..34c6176ea0
--- /dev/null
+++ b/libavcodec/apv_decode.h
@@ -0,0 +1,80 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_APV_DECODE_H
+#define AVCODEC_APV_DECODE_H
+
+#include <stdint.h>
+
+#include "apv.h"
+#include "avcodec.h"
+#include "get_bits.h"
+
+
+// Number of bits in the entropy look-up tables.
+// It may be desirable to tune this per-architecture, as a larger LUT
+// trades greater memory use for fewer instructions.
+// (N bits -> 24*2^N bytes of tables; 9 -> 12KB of tables.)
+#define APV_VLC_LUT_BITS 9
+#define APV_VLC_LUT_SIZE (1 << APV_VLC_LUT_BITS)
+
+typedef struct APVVLCLUTEntry {
+ uint16_t result; // Return value if not reading more.
+ uint8_t consume; // Number of bits to consume.
+ uint8_t more; // Whether to read additional bits.
+} APVVLCLUTEntry;
+
+typedef struct APVVLCLUT {
+ APVVLCLUTEntry lut[6][APV_VLC_LUT_SIZE];
+} APVVLCLUT;
+
+typedef struct APVEntropyState {
+ void *log_ctx;
+
+ const APVVLCLUT *decode_lut;
+
+ int16_t prev_dc;
+ int16_t prev_dc_diff;
+ int16_t prev_1st_ac_level;
+} APVEntropyState;
+
+
+/**
+ * Build the decoder VLC look-up table.
+ */
+void ff_apv_entropy_build_decode_lut(APVVLCLUT *decode_lut);
+
+/**
+ * Entropy decode a single 8x8 block to coefficients.
+ *
+ * Outputs in block order (dezigzag already applied).
+ */
+int ff_apv_entropy_decode_block(int16_t *coeff,
+ GetBitContext *gbc,
+ APVEntropyState *state);
+
+/**
+ * Read a single APV VLC code.
+ *
+ * This entrypoint is exposed for testing.
+ */
+unsigned int ff_apv_read_vlc(GetBitContext *gbc, int k_param,
+ const APVVLCLUT *lut);
+
+
+#endif /* AVCODEC_APV_DECODE_H */
diff --git a/libavcodec/apv_dsp.c b/libavcodec/apv_dsp.c
new file mode 100644
index 0000000000..fe11cd6b94
--- /dev/null
+++ b/libavcodec/apv_dsp.c
@@ -0,0 +1,136 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/common.h"
+
+#include "apv.h"
+#include "apv_dsp.h"
+
+
+static const int8_t apv_trans_matrix[8][8] = {
+ { 64, 64, 64, 64, 64, 64, 64, 64 },
+ { 89, 75, 50, 18, -18, -50, -75, -89 },
+ { 84, 35, -35, -84, -84, -35, 35, 84 },
+ { 75, -18, -89, -50, 50, 89, 18, -75 },
+ { 64, -64, -64, 64, 64, -64, -64, 64 },
+ { 50, -89, 18, 75, -75, -18, 89, -50 },
+ { 35, -84, 84, -35, -35, 84, -84, 35 },
+ { 18, -50, 75, -89, 89, -75, 50, -18 },
+};
+
+static void apv_decode_transquant_c(void *output,
+ ptrdiff_t pitch,
+ const int16_t *input_flat,
+ const int16_t *qmatrix_flat,
+ int bit_depth,
+ int qp_shift)
+{
+ const int16_t (*input)[8] = (const int16_t(*)[8])input_flat;
+ const int16_t (*qmatrix)[8] = (const int16_t(*)[8])qmatrix_flat;
+
+ int16_t scaled_coeff[8][8];
+ int32_t recon_sample[8][8];
+
+ // Dequant.
+ {
+ // Note that level_scale was already combined into qmatrix
+ // before we got here.
+ int bd_shift = bit_depth + 3 - 5;
+
+ for (int y = 0; y < 8; y++) {
+ for (int x = 0; x < 8; x++) {
+ int coeff = (((input[y][x] * qmatrix[y][x]) << qp_shift) +
+ (1 << (bd_shift - 1))) >> bd_shift;
+
+ scaled_coeff[y][x] =
+ av_clip(coeff, APV_MIN_TRANS_COEFF,
+ APV_MAX_TRANS_COEFF);
+ }
+ }
+ }
+
+ // Transform.
+ {
+ int32_t tmp[8][8];
+
+ // Vertical transform of columns.
+ for (int x = 0; x < 8; x++) {
+ for (int i = 0; i < 8; i++) {
+ int sum = 0;
+ for (int j = 0; j < 8; j++)
+ sum += apv_trans_matrix[j][i] * scaled_coeff[j][x];
+ tmp[i][x] = sum;
+ }
+ }
+
+ // Renormalise.
+ for (int x = 0; x < 8; x++) {
+ for (int y = 0; y < 8; y++)
+ tmp[y][x] = (tmp[y][x] + 64) >> 7;
+ }
+
+ // Horizontal transform of rows.
+ for (int y = 0; y < 8; y++) {
+ for (int i = 0; i < 8; i++) {
+ int sum = 0;
+ for (int j = 0; j < 8; j++)
+ sum += apv_trans_matrix[j][i] * tmp[y][j];
+ recon_sample[y][i] = sum;
+ }
+ }
+ }
+
+ // Output.
+ if (bit_depth == 8) {
+ uint8_t *ptr = output;
+ int bd_shift = 20 - bit_depth;
+
+ for (int y = 0; y < 8; y++) {
+ for (int x = 0; x < 8; x++) {
+ int sample = ((recon_sample[y][x] +
+ (1 << (bd_shift - 1))) >> bd_shift) +
+ (1 << (bit_depth - 1));
+ ptr[x] = av_clip_uintp2(sample, bit_depth);
+ }
+ ptr += pitch;
+ }
+ } else {
+ uint16_t *ptr = output;
+ int bd_shift = 20 - bit_depth;
+ pitch /= 2; // Pitch was in bytes, 2 bytes per sample.
+
+ for (int y = 0; y < 8; y++) {
+ for (int x = 0; x < 8; x++) {
+ int sample = ((recon_sample[y][x] +
+ (1 << (bd_shift - 1))) >> bd_shift) +
+ (1 << (bit_depth - 1));
+ ptr[x] = av_clip_uintp2(sample, bit_depth);
+ }
+ ptr += pitch;
+ }
+ }
+}
+
+av_cold void ff_apv_dsp_init(APVDSPContext *dsp)
+{
+ dsp->decode_transquant = apv_decode_transquant_c;
+}
diff --git a/libavcodec/apv_dsp.h b/libavcodec/apv_dsp.h
new file mode 100644
index 0000000000..31645b8581
--- /dev/null
+++ b/libavcodec/apv_dsp.h
@@ -0,0 +1,37 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_APV_DSP_H
+#define AVCODEC_APV_DSP_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+
+typedef struct APVDSPContext {
+ void (*decode_transquant)(void *output,
+ ptrdiff_t pitch,
+ const int16_t *input,
+ const int16_t *qmatrix,
+ int bit_depth,
+ int qp_shift);
+} APVDSPContext;
+
+void ff_apv_dsp_init(APVDSPContext *dsp);
+
+#endif /* AVCODEC_APV_DSP_H */
diff --git a/libavcodec/apv_entropy.c b/libavcodec/apv_entropy.c
new file mode 100644
index 0000000000..00e0b4fbdf
--- /dev/null
+++ b/libavcodec/apv_entropy.c
@@ -0,0 +1,200 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "apv.h"
+#include "apv_decode.h"
+
+
+void ff_apv_entropy_build_decode_lut(APVVLCLUT *decode_lut)
+{
+ const int code_len = APV_VLC_LUT_BITS;
+ const int lut_size = APV_VLC_LUT_SIZE;
+
+ for (int k = 0; k <= 5; k++) {
+ for (unsigned int code = 0; code < lut_size; code++) {
+ APVVLCLUTEntry *ent = &decode_lut->lut[k][code];
+ unsigned int first_bit = code & (1 << code_len - 1);
+ unsigned int remaining_bits = code ^ first_bit;
+
+ if (first_bit) {
+ ent->consume = 1 + k;
+ ent->result = remaining_bits >> (code_len - k - 1);
+ ent->more = 0;
+ } else {
+ unsigned int second_bit = code & (1 << code_len - 2);
+ remaining_bits ^= second_bit;
+
+ if (second_bit) {
+ unsigned int bits_left = code_len - 2;
+ unsigned int first_set = bits_left - av_log2(remaining_bits);
+ unsigned int last_bits = first_set - 1 + k;
+
+ if (first_set + last_bits <= bits_left) {
+ // Whole code fits here.
+ ent->consume = 2 + first_set + last_bits;
+ ent->result = ((2 << k) +
+ (((1 << first_set - 1) - 1) << k) +
+ ((code >> bits_left - first_set - last_bits) & (1 << last_bits) - 1));
+ ent->more = 0;
+ } else {
+ // Need to read more, collapse to default.
+ ent->consume = 2;
+ ent->more = 1;
+ }
+ } else {
+ ent->consume = 2 + k;
+ ent->result = (1 << k) + (remaining_bits >> (code_len - k - 2));
+ ent->more = 0;
+ }
+ }
+ }
+ }
+}
+
+av_always_inline
+static unsigned int apv_read_vlc(GetBitContext *gbc, int k_param,
+ const APVVLCLUT *lut)
+{
+ unsigned int next_bits;
+ const APVVLCLUTEntry *ent;
+
+ next_bits = show_bits(gbc, APV_VLC_LUT_BITS);
+ ent = &lut->lut[k_param][next_bits];
+
+ if (ent->more) {
+ unsigned int leading_zeroes;
+
+ skip_bits(gbc, ent->consume);
+
+ next_bits = show_bits(gbc, 16);
+ leading_zeroes = 15 - av_log2(next_bits);
+
+ skip_bits(gbc, leading_zeroes + 1);
+
+ return (2 << k_param) +
+ ((1 << leading_zeroes) - 1) * (1 << k_param) +
+ get_bits(gbc, leading_zeroes + k_param);
+ } else {
+ skip_bits(gbc, ent->consume);
+ return ent->result;
+ }
+}
+
+unsigned int ff_apv_read_vlc(GetBitContext *gbc, int k_param,
+ const APVVLCLUT *lut)
+{
+ return apv_read_vlc(gbc, k_param, lut);
+}
+
+int ff_apv_entropy_decode_block(int16_t *coeff,
+ GetBitContext *gbc,
+ APVEntropyState *state)
+{
+ const APVVLCLUT *lut = state->decode_lut;
+ int k_param;
+
+ // DC coefficient.
+ {
+ int abs_dc_coeff_diff;
+ int sign_dc_coeff_diff;
+ int dc_coeff;
+
+ k_param = av_clip(state->prev_dc_diff >> 1, 0, 5);
+ abs_dc_coeff_diff = apv_read_vlc(gbc, k_param, lut);
+
+ if (abs_dc_coeff_diff > 0)
+ sign_dc_coeff_diff = get_bits1(gbc);
+ else
+ sign_dc_coeff_diff = 0;
+
+ if (sign_dc_coeff_diff)
+ dc_coeff = state->prev_dc - abs_dc_coeff_diff;
+ else
+ dc_coeff = state->prev_dc + abs_dc_coeff_diff;
+
+ if (dc_coeff < APV_MIN_TRANS_COEFF ||
+ dc_coeff > APV_MAX_TRANS_COEFF) {
+ av_log(state->log_ctx, AV_LOG_ERROR,
+ "Out-of-range DC coefficient value: %d "
+ "(from prev_dc %d abs_dc_coeff_diff %d sign_dc_coeff_diff %d)\n",
+ dc_coeff, state->prev_dc, abs_dc_coeff_diff, sign_dc_coeff_diff);
+ return AVERROR_INVALIDDATA;
+ }
+
+ coeff[0] = dc_coeff;
+
+ state->prev_dc = dc_coeff;
+ state->prev_dc_diff = abs_dc_coeff_diff;
+ }
+
+ // AC coefficients.
+ {
+ int scan_pos = 1;
+ int first_ac = 1;
+ int prev_level = state->prev_1st_ac_level;
+ int prev_run = 0;
+
+ do {
+ int coeff_zero_run;
+
+ k_param = av_clip(prev_run >> 2, 0, 2);
+ coeff_zero_run = apv_read_vlc(gbc, k_param, lut);
+
+ if (coeff_zero_run > APV_BLK_COEFFS - scan_pos) {
+ av_log(state->log_ctx, AV_LOG_ERROR,
+ "Out-of-range zero-run value: %d (at scan pos %d)\n",
+ coeff_zero_run, scan_pos);
+ return AVERROR_INVALIDDATA;
+ }
+
+ for (int i = 0; i < coeff_zero_run; i++) {
+ coeff[ff_zigzag_direct[scan_pos]] = 0;
+ ++scan_pos;
+ }
+ prev_run = coeff_zero_run;
+
+ if (scan_pos < APV_BLK_COEFFS) {
+ int abs_ac_coeff_minus1;
+ int sign_ac_coeff;
+ int level;
+
+ k_param = av_clip(prev_level >> 2, 0, 4);
+ abs_ac_coeff_minus1 = apv_read_vlc(gbc, k_param, lut);
+ sign_ac_coeff = get_bits(gbc, 1);
+
+ if (sign_ac_coeff)
+ level = -abs_ac_coeff_minus1 - 1;
+ else
+ level = abs_ac_coeff_minus1 + 1;
+
+ coeff[ff_zigzag_direct[scan_pos]] = level;
+
+ prev_level = abs_ac_coeff_minus1 + 1;
+ if (first_ac) {
+ state->prev_1st_ac_level = prev_level;
+ first_ac = 0;
+ }
+
+ ++scan_pos;
+ }
+
+ } while (scan_pos < APV_BLK_COEFFS);
+ }
+
+ return 0;
+}
--
2.47.2
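
Reviewer note on the VLC shape: the LUT builder and apv_read_vlc() in apv_entropy.c above imply three code forms ("1" + k bits, "00" + k bits, and "01" + zero-run + terminating one + suffix). Below is a minimal bit-at-a-time sketch of the same mapping, meant only as a readable cross-check and not as the decoder's implementation. BitString, read_bit(), read_bits() and ref_read_vlc() are hypothetical names that are not part of the patch, and the reader consumes a string of '0'/'1' characters instead of a GetBitContext.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical bit source: a NUL-terminated string of '0'/'1' characters.
typedef struct BitString {
    const char *bits;
    size_t      pos;
} BitString;

// Read one bit; reading past the end yields 0 without advancing.
static unsigned int read_bit(BitString *bs)
{
    if (bs->bits[bs->pos] == '\0')
        return 0;
    return bs->bits[bs->pos++] == '1';
}

// Read n bits, most significant bit first.
static unsigned int read_bits(BitString *bs, int n)
{
    unsigned int v = 0;
    while (n-- > 0)
        v = (v << 1) | read_bit(bs);
    return v;
}

// Bit-at-a-time equivalent of the LUT-based reader, for well-formed input.
static unsigned int ref_read_vlc(BitString *bs, int k)
{
    unsigned int leading_zeroes = 0;

    assert(k >= 0 && k <= 5);  // the LUT in the patch covers k = 0..5

    // Form 1: "1" followed by k bits.
    if (read_bit(bs))
        return read_bits(bs, k);

    // Form 2: "00" followed by k bits, offset by 1 << k.
    if (!read_bit(bs))
        return (1u << k) + read_bits(bs, k);

    // Form 3: "01", a run of zeroes ended by a one, then
    // (leading_zeroes + k) suffix bits.  The run is bounded so that
    // truncated input cannot loop forever.
    while (leading_zeroes < 15 && !read_bit(bs))
        leading_zeroes++;

    return (2u << k) +
           (((1u << leading_zeroes) - 1) << k) +
           read_bits(bs, leading_zeroes + k);
}
```

With k = 0 the strings "1", "00", "011" and "01010" decode to 0, 1, 2 and 3, so the value ranges of the three forms are contiguous.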
* [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
` (3 preceding siblings ...)
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
2025-04-24 2:55 ` James Almer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 6/7] lavc: APV metadata bitstream filter Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 7/7] lavf: APV muxer Mark Thompson
6 siblings, 1 reply; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
Typical checkasm result on Alder Lake:
decode_transquant_8_c: 464.2 ( 1.00x)
decode_transquant_8_avx2: 86.2 ( 5.38x)
decode_transquant_10_c: 481.6 ( 1.00x)
decode_transquant_10_avx2: 83.5 ( 5.77x)
---
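Note for review: the column pass in the assembly splits the 8-point transform into an even half (rows 0, 4, 2, 6) and an odd half (rows 1, 3, 5, 7) and then combines them with a butterfly. The scalar sketch below is only an illustration of that decomposition, checked against the direct matrix form used by apv_decode_transquant_c(). The function names itx8_direct(), itx8_butterfly() and itx8_forms_match() are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

static const int8_t trans_matrix[8][8] = {
    { 64,  64,  64,  64,  64,  64,  64,  64 },
    { 89,  75,  50,  18, -18, -50, -75, -89 },
    { 84,  35, -35, -84, -84, -35,  35,  84 },
    { 75, -18, -89, -50,  50,  89,  18, -75 },
    { 64, -64, -64,  64,  64, -64, -64,  64 },
    { 50, -89,  18,  75, -75, -18,  89, -50 },
    { 35, -84,  84, -35, -35,  84, -84,  35 },
    { 18, -50,  75, -89,  89, -75,  50, -18 },
};

// Direct form: out[i] = sum over j of M[j][i] * in[j].
static void itx8_direct(int32_t out[8], const int16_t in[8])
{
    for (int i = 0; i < 8; i++) {
        int32_t sum = 0;
        for (int j = 0; j < 8; j++)
            sum += trans_matrix[j][i] * in[j];
        out[i] = sum;
    }
}

// Even/odd form, mirroring the structure of the AVX2 column pass.
static void itx8_butterfly(int32_t out[8], const int16_t in[8])
{
    // Even half: constant pairs as in tmatrix_col_even.
    int32_t a = 64 * in[0] + 64 * in[4];
    int32_t b = 64 * in[0] - 64 * in[4];
    int32_t c = 84 * in[2] + 35 * in[6];
    int32_t d = 35 * in[2] - 84 * in[6];
    int32_t e0 = a + c, e1 = b + d, e2 = b - d, e3 = a - c;

    // Odd half: one row of tmatrix_col_odd per output.
    int32_t o0 = 89 * in[1] + 75 * in[3] + 50 * in[5] + 18 * in[7];
    int32_t o1 = 75 * in[1] - 18 * in[3] - 89 * in[5] - 50 * in[7];
    int32_t o2 = 50 * in[1] - 89 * in[3] + 18 * in[5] + 75 * in[7];
    int32_t o3 = 18 * in[1] - 50 * in[3] + 75 * in[5] - 89 * in[7];

    // Final butterfly: sums give outputs 0..3, differences give 7..4.
    out[0] = e0 + o0;  out[7] = e0 - o0;
    out[1] = e1 + o1;  out[6] = e1 - o1;
    out[2] = e2 + o2;  out[5] = e2 - o2;
    out[3] = e3 + o3;  out[4] = e3 - o3;
}

// Returns 1 if both forms agree on the given input, 0 otherwise.
static int itx8_forms_match(const int16_t in[8])
{
    int32_t ref[8], alt[8];

    itx8_direct(ref, in);
    itx8_butterfly(alt, in);
    for (int i = 0; i < 8; i++) {
        if (ref[i] != alt[i])
            return 0;
    }
    return 1;
}
```

The butterfly form needs 24 multiplies per column where the direct form needs 64, which is presumably part of where the speedup comes from. The row pass in the assembly does not use this decomposition and multiplies by the full matrix instead.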
libavcodec/apv_dsp.c | 4 +
libavcodec/apv_dsp.h | 2 +
libavcodec/x86/Makefile | 2 +
libavcodec/x86/apv_dsp.asm | 311 ++++++++++++++++++++++++++++++++++
libavcodec/x86/apv_dsp_init.c | 44 +++++
tests/checkasm/Makefile | 1 +
tests/checkasm/apv_dsp.c | 109 ++++++++++++
tests/checkasm/checkasm.c | 3 +
tests/checkasm/checkasm.h | 1 +
tests/fate/checkasm.mak | 1 +
10 files changed, 478 insertions(+)
create mode 100644 libavcodec/x86/apv_dsp.asm
create mode 100644 libavcodec/x86/apv_dsp_init.c
create mode 100644 tests/checkasm/apv_dsp.c
diff --git a/libavcodec/apv_dsp.c b/libavcodec/apv_dsp.c
index fe11cd6b94..fd814ef900 100644
--- a/libavcodec/apv_dsp.c
+++ b/libavcodec/apv_dsp.c
@@ -133,4 +133,8 @@ static void apv_decode_transquant_c(void *output,
av_cold void ff_apv_dsp_init(APVDSPContext *dsp)
{
dsp->decode_transquant = apv_decode_transquant_c;
+
+#if ARCH_X86_64
+ ff_apv_dsp_init_x86_64(dsp);
+#endif
}
diff --git a/libavcodec/apv_dsp.h b/libavcodec/apv_dsp.h
index 31645b8581..c63d6a88ee 100644
--- a/libavcodec/apv_dsp.h
+++ b/libavcodec/apv_dsp.h
@@ -34,4 +34,6 @@ typedef struct APVDSPContext {
void ff_apv_dsp_init(APVDSPContext *dsp);
+void ff_apv_dsp_init_x86_64(APVDSPContext *dsp);
+
#endif /* AVCODEC_APV_DSP_H */
diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
index 5d53515381..821c410a0f 100644
--- a/libavcodec/x86/Makefile
+++ b/libavcodec/x86/Makefile
@@ -44,6 +44,7 @@ OBJS-$(CONFIG_ADPCM_G722_DECODER) += x86/g722dsp_init.o
OBJS-$(CONFIG_ADPCM_G722_ENCODER) += x86/g722dsp_init.o
OBJS-$(CONFIG_ALAC_DECODER) += x86/alacdsp_init.o
OBJS-$(CONFIG_APNG_DECODER) += x86/pngdsp_init.o
+OBJS-$(CONFIG_APV_DECODER) += x86/apv_dsp_init.o
OBJS-$(CONFIG_CAVS_DECODER) += x86/cavsdsp.o
OBJS-$(CONFIG_CFHD_DECODER) += x86/cfhddsp_init.o
OBJS-$(CONFIG_CFHD_ENCODER) += x86/cfhdencdsp_init.o
@@ -149,6 +150,7 @@ X86ASM-OBJS-$(CONFIG_ADPCM_G722_DECODER) += x86/g722dsp.o
X86ASM-OBJS-$(CONFIG_ADPCM_G722_ENCODER) += x86/g722dsp.o
X86ASM-OBJS-$(CONFIG_ALAC_DECODER) += x86/alacdsp.o
X86ASM-OBJS-$(CONFIG_APNG_DECODER) += x86/pngdsp.o
+X86ASM-OBJS-$(CONFIG_APV_DECODER) += x86/apv_dsp.o
X86ASM-OBJS-$(CONFIG_CAVS_DECODER) += x86/cavsidct.o
X86ASM-OBJS-$(CONFIG_CFHD_ENCODER) += x86/cfhdencdsp.o
X86ASM-OBJS-$(CONFIG_CFHD_DECODER) += x86/cfhddsp.o
diff --git a/libavcodec/x86/apv_dsp.asm b/libavcodec/x86/apv_dsp.asm
new file mode 100644
index 0000000000..12d96481de
--- /dev/null
+++ b/libavcodec/x86/apv_dsp.asm
@@ -0,0 +1,311 @@
+;******************************************************************************
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in the hope that it will be useful,
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;* Lesser General Public License for more details.
+;*
+;* You should have received a copy of the GNU Lesser General Public
+;* License along with FFmpeg; if not, write to the Free Software
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+;******************************************************************************
+
+%include "libavutil/x86/x86util.asm"
+
+%if ARCH_X86_64
+
+SECTION_RODATA 32
+
+; Full matrix for row transform.
+const tmatrix_row
+ dw 64, 89, 84, 75, 64, 50, 35, 18
+ dw 64, -18, -84, 50, 64, -75, -35, 89
+ dw 64, 75, 35, -18, -64, -89, -84, -50
+ dw 64, -50, -35, 89, -64, -18, 84, -75
+ dw 64, 50, -35, -89, -64, 18, 84, 75
+ dw 64, -75, 35, 18, -64, 89, -84, 50
+ dw 64, 18, -84, -50, 64, 75, -35, -89
+ dw 64, -89, 84, -75, 64, -50, 35, -18
+
+; Constant pairs for broadcast in column transform.
+const tmatrix_col_even
+ dw 64, 64, 64, -64
+ dw 84, 35, 35, -84
+const tmatrix_col_odd
+ dw 89, 75, 50, 18
+ dw 75, -18, -89, -50
+ dw 50, -89, 18, 75
+ dw 18, -50, 75, -89
+
+; Memory targets for vpbroadcastd (register version requires AVX512).
+cextern pd_1
+const sixtyfour
+ dd 64
+
+SECTION .text
+
+; void ff_apv_decode_transquant_avx2(void *output,
+; ptrdiff_t pitch,
+; const int16_t *input,
+; const int16_t *qmatrix,
+; int bit_depth,
+; int qp_shift);
+
+INIT_YMM avx2
+
+cglobal apv_decode_transquant, 5, 7, 16, output, pitch, input, qmatrix, bit_depth, qp_shift, tmp
+
+ ; Load input and dequantise
+
+ vpbroadcastd m10, [pd_1]
+ lea tmpd, [bit_depthd - 2]
+ movd xm8, qp_shiftm
+ movd xm9, tmpd
+ vpslld m10, m10, xm9
+ vpsrld m10, m10, 1
+
+ ; m8 = scalar qp_shift
+ ; m9 = scalar bd_shift
+ ; m10 = vector 1 << (bd_shift - 1)
+ ; m11 = qmatrix load
+
+%macro LOAD_AND_DEQUANT 2 ; (xmm input, constant offset)
+ vpmovsxwd m%1, [inputq + %2]
+ vpmovsxwd m11, [qmatrixq + %2]
+ vpmaddwd m%1, m%1, m11
+ vpslld m%1, m%1, xm8
+ vpaddd m%1, m%1, m10
+ vpsrad m%1, m%1, xm9
+ vpackssdw m%1, m%1, m%1
+%endmacro
+
+ LOAD_AND_DEQUANT 0, 0x00
+ LOAD_AND_DEQUANT 1, 0x10
+ LOAD_AND_DEQUANT 2, 0x20
+ LOAD_AND_DEQUANT 3, 0x30
+ LOAD_AND_DEQUANT 4, 0x40
+ LOAD_AND_DEQUANT 5, 0x50
+ LOAD_AND_DEQUANT 6, 0x60
+ LOAD_AND_DEQUANT 7, 0x70
+
+ ; mN = row N words 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
+
+ ; Transform columns
+ ; This applies a 1-D DCT butterfly
+
+ vpunpcklwd m12, m0, m4
+ vpunpcklwd m13, m2, m6
+ vpunpcklwd m14, m1, m3
+ vpunpcklwd m15, m5, m7
+
+ ; m12 = rows 0 and 4 interleaved
+ ; m13 = rows 2 and 6 interleaved
+ ; m14 = rows 1 and 3 interleaved
+ ; m15 = rows 5 and 7 interleaved
+
+ lea tmpq, [tmatrix_col_even]
+ vpbroadcastd m0, [tmpq + 0x00]
+ vpbroadcastd m1, [tmpq + 0x04]
+ vpbroadcastd m2, [tmpq + 0x08]
+ vpbroadcastd m3, [tmpq + 0x0c]
+
+ vpmaddwd m4, m12, m0
+ vpmaddwd m5, m12, m1
+ vpmaddwd m6, m13, m2
+ vpmaddwd m7, m13, m3
+ vpaddd m8, m4, m6
+ vpaddd m9, m5, m7
+ vpsubd m10, m5, m7
+ vpsubd m11, m4, m6
+
+ lea tmpq, [tmatrix_col_odd]
+ vpbroadcastd m0, [tmpq + 0x00]
+ vpbroadcastd m1, [tmpq + 0x04]
+ vpbroadcastd m2, [tmpq + 0x08]
+ vpbroadcastd m3, [tmpq + 0x0c]
+
+ vpmaddwd m4, m14, m0
+ vpmaddwd m5, m15, m1
+ vpmaddwd m6, m14, m2
+ vpmaddwd m7, m15, m3
+ vpaddd m12, m4, m5
+ vpaddd m13, m6, m7
+
+ vpbroadcastd m0, [tmpq + 0x10]
+ vpbroadcastd m1, [tmpq + 0x14]
+ vpbroadcastd m2, [tmpq + 0x18]
+ vpbroadcastd m3, [tmpq + 0x1c]
+
+ vpmaddwd m4, m14, m0
+ vpmaddwd m5, m15, m1
+ vpmaddwd m6, m14, m2
+ vpmaddwd m7, m15, m3
+ vpaddd m14, m4, m5
+ vpaddd m15, m6, m7
+
+ vpaddd m0, m8, m12
+ vpaddd m1, m9, m13
+ vpaddd m2, m10, m14
+ vpaddd m3, m11, m15
+ vpsubd m4, m11, m15
+ vpsubd m5, m10, m14
+ vpsubd m6, m9, m13
+ vpsubd m7, m8, m12
+
+ ; Mid-transform normalisation
+ ; Note that outputs here are fitted to 16 bits
+
+ vpbroadcastd m8, [sixtyfour]
+
+%macro NORMALISE 1
+ vpaddd m%1, m%1, m8
+ vpsrad m%1, m%1, 7
+ vpackssdw m%1, m%1, m%1
+ vpermq m%1, m%1, q3120
+%endmacro
+
+ NORMALISE 0
+ NORMALISE 1
+ NORMALISE 2
+ NORMALISE 3
+ NORMALISE 4
+ NORMALISE 5
+ NORMALISE 6
+ NORMALISE 7
+
+ ; mN = row N words 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+
+ ; Transform rows
+ ; This multiplies the rows directly by the transform matrix,
+ ; avoiding the need to transpose anything
+
+ lea tmpq, [tmatrix_row]
+ mova m12, [tmpq + 0x00]
+ mova m13, [tmpq + 0x20]
+ mova m14, [tmpq + 0x40]
+ mova m15, [tmpq + 0x60]
+
+%macro TRANS_ROW_STEP 1
+ vpmaddwd m8, m%1, m12
+ vpmaddwd m9, m%1, m13
+ vpmaddwd m10, m%1, m14
+ vpmaddwd m11, m%1, m15
+ vphaddd m8, m8, m9
+ vphaddd m10, m10, m11
+ vphaddd m%1, m8, m10
+%endmacro
+
+ TRANS_ROW_STEP 0
+ TRANS_ROW_STEP 1
+ TRANS_ROW_STEP 2
+ TRANS_ROW_STEP 3
+ TRANS_ROW_STEP 4
+ TRANS_ROW_STEP 5
+ TRANS_ROW_STEP 6
+ TRANS_ROW_STEP 7
+
+ ; Renormalise, clip and store output
+
+ vpbroadcastd m14, [pd_1]
+ mov tmpd, 20
+ sub tmpd, bit_depthd
+ movd xm9, tmpd
+ dec tmpd
+ movd xm13, tmpd
+ movd xm15, bit_depthd
+ vpslld m8, m14, xm13
+ vpslld m12, m14, xm15
+ vpsrld m10, m12, 1
+ vpsubd m12, m12, m14
+ vpxor m11, m11, m11
+
+ ; m8 = vector 1 << (bd_shift - 1)
+ ; m9 = scalar bd_shift
+ ; m10 = vector 1 << (bit_depth - 1)
+ ; m11 = zero
+ ; m12 = vector (1 << bit_depth) - 1
+
+ cmp bit_depthd, 8
+ jne store_10
+
+ lea tmpq, [pitchq + 2*pitchq]
+%macro NORMALISE_AND_STORE_8 4
+ vpaddd m%1, m%1, m8
+ vpaddd m%2, m%2, m8
+ vpaddd m%3, m%3, m8
+ vpaddd m%4, m%4, m8
+ vpsrad m%1, m%1, xm9
+ vpsrad m%2, m%2, xm9
+ vpsrad m%3, m%3, xm9
+ vpsrad m%4, m%4, xm9
+ vpaddd m%1, m%1, m10
+ vpaddd m%2, m%2, m10
+ vpaddd m%3, m%3, m10
+ vpaddd m%4, m%4, m10
+ ; m%1 = A0-3 A4-7
+ ; m%2 = B0-3 B4-7
+ ; m%3 = C0-3 C4-7
+ ; m%4 = D0-3 D4-7
+ vpackusdw m%1, m%1, m%2
+ vpackusdw m%3, m%3, m%4
+ ; m%1 = A0-3 B0-3 A4-7 B4-7
+ ; m%3 = C0-3 D0-3 C4-7 D4-7
+ vpermq m%1, m%1, q3120
+ vpermq m%2, m%3, q3120
+ ; m%1 = A0-3 A4-7 B0-3 B4-7
+ ; m%2 = C0-3 C4-7 D0-3 D4-7
+ vpackuswb m%1, m%1, m%2
+ ; m%1 = A0-3 A4-7 C0-3 C4-7 B0-3 B4-7 D0-3 D4-7
+ vextracti128 xm%2, m%1, 1
+ vpsrldq xm%3, xm%1, 8
+ vpsrldq xm%4, xm%2, 8
+ vmovq [outputq], xm%1
+ vmovq [outputq + pitchq], xm%2
+ vmovq [outputq + 2*pitchq], xm%3
+ vmovq [outputq + tmpq], xm%4
+ lea outputq, [outputq + 4*pitchq]
+%endmacro
+
+ NORMALISE_AND_STORE_8 0, 1, 2, 3
+ NORMALISE_AND_STORE_8 4, 5, 6, 7
+
+ RET
+
+store_10:
+
+%macro NORMALISE_AND_STORE_10 2
+ vpaddd m%1, m%1, m8
+ vpaddd m%2, m%2, m8
+ vpsrad m%1, m%1, xm9
+ vpsrad m%2, m%2, xm9
+ vpaddd m%1, m%1, m10
+ vpaddd m%2, m%2, m10
+ vpmaxsd m%1, m%1, m11
+ vpmaxsd m%2, m%2, m11
+ vpminsd m%1, m%1, m12
+ vpminsd m%2, m%2, m12
+ ; m%1 = A0-3 A4-7
+ ; m%2 = B0-3 B4-7
+ vpackusdw m%1, m%1, m%2
+ ; m%1 = A0-3 B0-3 A4-7 B4-7
+ vpermq m%1, m%1, q3120
+ ; m%1 = A0-3 A4-7 B0-3 B4-7
+ vextracti128 [outputq], m%1, 0
+ vextracti128 [outputq + pitchq], m%1, 1
+ lea outputq, [outputq + 2*pitchq]
+%endmacro
+
+ NORMALISE_AND_STORE_10 0, 1
+ NORMALISE_AND_STORE_10 2, 3
+ NORMALISE_AND_STORE_10 4, 5
+ NORMALISE_AND_STORE_10 6, 7
+
+ RET
+
+%endif ; ARCH_X86_64
diff --git a/libavcodec/x86/apv_dsp_init.c b/libavcodec/x86/apv_dsp_init.c
new file mode 100644
index 0000000000..39360a0ad2
--- /dev/null
+++ b/libavcodec/x86/apv_dsp_init.c
@@ -0,0 +1,44 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/x86/asm.h"
+#include "libavutil/x86/cpu.h"
+#include "libavcodec/apv_dsp.h"
+
+#if ARCH_X86_64
+
+void ff_apv_decode_transquant_avx2(void *output,
+ ptrdiff_t pitch,
+ const int16_t *input,
+ const int16_t *qmatrix,
+ int bit_depth,
+ int qp_shift);
+
+av_cold void ff_apv_dsp_init_x86_64(APVDSPContext *dsp)
+{
+ int cpu_flags = av_get_cpu_flags();
+
+ if (EXTERNAL_AVX2_FAST(cpu_flags)) {
+ dsp->decode_transquant = ff_apv_decode_transquant_avx2;
+ }
+}
+
+#endif /* ARCH_X86_64 */
diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index d5c50e5599..193c1e4633 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -28,6 +28,7 @@ AVCODECOBJS-$(CONFIG_AAC_DECODER) += aacpsdsp.o \
sbrdsp.o
AVCODECOBJS-$(CONFIG_AAC_ENCODER) += aacencdsp.o
AVCODECOBJS-$(CONFIG_ALAC_DECODER) += alacdsp.o
+AVCODECOBJS-$(CONFIG_APV_DECODER) += apv_dsp.o
AVCODECOBJS-$(CONFIG_DCA_DECODER) += synth_filter.o
AVCODECOBJS-$(CONFIG_DIRAC_DECODER) += diracdsp.o
AVCODECOBJS-$(CONFIG_EXR_DECODER) += exrdsp.o
diff --git a/tests/checkasm/apv_dsp.c b/tests/checkasm/apv_dsp.c
new file mode 100644
index 0000000000..b3adb8ca06
--- /dev/null
+++ b/tests/checkasm/apv_dsp.c
@@ -0,0 +1,109 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "checkasm.h"
+
+#include "libavutil/attributes.h"
+#include "libavutil/mem_internal.h"
+#include "libavcodec/apv_dsp.h"
+
+
+static void check_decode_transquant_8(void)
+{
+ LOCAL_ALIGNED_16(int16_t, input, [64]);
+ LOCAL_ALIGNED_16(int16_t, qmatrix, [64]);
+ LOCAL_ALIGNED_16(uint8_t, new_output, [64]);
+ LOCAL_ALIGNED_16(uint8_t, ref_output, [64]);
+
+ declare_func(void,
+ uint8_t *output,
+ ptrdiff_t pitch,
+ const int16_t *input,
+ const int16_t *qmatrix,
+ int bit_depth,
+ int qp_shift);
+
+ for (int i = 0; i < 64; i++) {
+ // Any signed 11-bit integer.
+ input[i] = rnd() % 2048 - 1024;
+
+ // qmatrix input is premultiplied by level_scale, so
+ // range is 1 to 255 * 71. Interesting values are all
+ // at the low end of that, though.
+ qmatrix[i] = rnd() % 16 + 16;
+ }
+
+ call_ref(ref_output, 8, input, qmatrix, 8, 4);
+ call_new(new_output, 8, input, qmatrix, 8, 4);
+
+ if (memcmp(new_output, ref_output, 64 * sizeof(*ref_output)))
+ fail();
+
+ bench_new(new_output, 8, input, qmatrix, 8, 4);
+}
+
+static void check_decode_transquant_10(void)
+{
+ LOCAL_ALIGNED_16( int16_t, input, [64]);
+ LOCAL_ALIGNED_16( int16_t, qmatrix, [64]);
+ LOCAL_ALIGNED_16(uint16_t, new_output, [64]);
+ LOCAL_ALIGNED_16(uint16_t, ref_output, [64]);
+
+ declare_func(void,
+ uint16_t *output,
+ ptrdiff_t pitch,
+ const int16_t *input,
+ const int16_t *qmatrix,
+ int bit_depth,
+ int qp_shift);
+
+ for (int i = 0; i < 64; i++) {
+ // Any signed 14-bit integer.
+ input[i] = rnd() % 16384 - 8192;
+
+ // qmatrix input is premultiplied by level_scale, so
+ // range is 1 to 255 * 71. Interesting values are all
+ // at the low end of that, though.
+ qmatrix[i] = 16; //rnd() % 16 + 16;
+ }
+
+ call_ref(ref_output, 16, input, qmatrix, 10, 4);
+ call_new(new_output, 16, input, qmatrix, 10, 4);
+
+ if (memcmp(new_output, ref_output, 64 * sizeof(*ref_output)))
+ fail();
+
+ bench_new(new_output, 16, input, qmatrix, 10, 4);
+}
+
+void checkasm_check_apv_dsp(void)
+{
+ APVDSPContext dsp;
+
+ ff_apv_dsp_init(&dsp);
+
+ if (check_func(dsp.decode_transquant, "decode_transquant_8"))
+ check_decode_transquant_8();
+
+ if (check_func(dsp.decode_transquant, "decode_transquant_10"))
+ check_decode_transquant_10();
+
+ report("apv_dsp");
+}
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 412b8b2cd1..3bb82ed0e5 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -129,6 +129,9 @@ static const struct {
#if CONFIG_ALAC_DECODER
{ "alacdsp", checkasm_check_alacdsp },
#endif
+ #if CONFIG_APV_DECODER
+ { "apv_dsp", checkasm_check_apv_dsp },
+ #endif
#if CONFIG_AUDIODSP
{ "audiodsp", checkasm_check_audiodsp },
#endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index ad239fb2a4..a6b5965e02 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -83,6 +83,7 @@ void checkasm_check_ac3dsp(void);
void checkasm_check_aes(void);
void checkasm_check_afir(void);
void checkasm_check_alacdsp(void);
+void checkasm_check_apv_dsp(void);
void checkasm_check_audiodsp(void);
void checkasm_check_av_tx(void);
void checkasm_check_blend(void);
diff --git a/tests/fate/checkasm.mak b/tests/fate/checkasm.mak
index 6d42df148e..720c5fd77e 100644
--- a/tests/fate/checkasm.mak
+++ b/tests/fate/checkasm.mak
@@ -4,6 +4,7 @@ FATE_CHECKASM = fate-checkasm-aacencdsp \
fate-checkasm-aes \
fate-checkasm-af_afir \
fate-checkasm-alacdsp \
+ fate-checkasm-apv_dsp \
fate-checkasm-audiodsp \
fate-checkasm-av_tx \
fate-checkasm-blockdsp \
--
2.47.2
* [FFmpeg-devel] [PATCH v3 6/7] lavc: APV metadata bitstream filter
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
` (4 preceding siblings ...)
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64 Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 7/7] lavf: APV muxer Mark Thompson
6 siblings, 0 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
---
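Note for review: the filter treats a negative option value as "leave this field alone", and sets color_description_present_flag as soon as any one field is overridden. The sketch below restates that rule on stand-in types so the consequence is easy to see: fields that are not overridden keep whatever value the parsed header already held. ColourHeader, ColourOverrides and apply_colour_overrides() are hypothetical names for illustration only; the patch itself operates on APVRawFrameHeader and APVMetadataContext.

```c
#include <stdint.h>

// Hypothetical stand-in for the colour fields of a frame header.
typedef struct ColourHeader {
    uint8_t color_description_present_flag;
    uint8_t color_primaries;
    uint8_t transfer_characteristics;
    uint8_t matrix_coefficients;
    uint8_t full_range_flag;
} ColourHeader;

// Hypothetical stand-in for the filter options; negative means "keep".
typedef struct ColourOverrides {
    int color_primaries;
    int transfer_characteristics;
    int matrix_coefficients;
    int full_range_flag;
} ColourOverrides;

static void apply_colour_overrides(ColourHeader *hdr,
                                   const ColourOverrides *opt)
{
    // Nothing requested: the header is left untouched, flag included.
    if (opt->color_primaries          < 0 &&
        opt->transfer_characteristics < 0 &&
        opt->matrix_coefficients      < 0 &&
        opt->full_range_flag          < 0)
        return;

    // Any override forces the colour description to be signalled.
    hdr->color_description_present_flag = 1;

    if (opt->color_primaries >= 0)
        hdr->color_primaries = (uint8_t)opt->color_primaries;
    if (opt->transfer_characteristics >= 0)
        hdr->transfer_characteristics = (uint8_t)opt->transfer_characteristics;
    if (opt->matrix_coefficients >= 0)
        hdr->matrix_coefficients = (uint8_t)opt->matrix_coefficients;
    if (opt->full_range_flag >= 0)
        hdr->full_range_flag = (uint8_t)opt->full_range_flag;
}
```

If a stream had no colour description and only one field is overridden, the remaining fields are written out with the values the parser left in the structure, so it may be worth setting all four together in that case.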
libavcodec/bitstream_filters.c | 1 +
libavcodec/bsf/Makefile | 1 +
libavcodec/bsf/apv_metadata.c | 134 +++++++++++++++++++++++++++++++++
3 files changed, 136 insertions(+)
create mode 100644 libavcodec/bsf/apv_metadata.c
diff --git a/libavcodec/bitstream_filters.c b/libavcodec/bitstream_filters.c
index f923411bee..da9d0d2513 100644
--- a/libavcodec/bitstream_filters.c
+++ b/libavcodec/bitstream_filters.c
@@ -25,6 +25,7 @@
#include "bsf_internal.h"
extern const FFBitStreamFilter ff_aac_adtstoasc_bsf;
+extern const FFBitStreamFilter ff_apv_metadata_bsf;
extern const FFBitStreamFilter ff_av1_frame_merge_bsf;
extern const FFBitStreamFilter ff_av1_frame_split_bsf;
extern const FFBitStreamFilter ff_av1_metadata_bsf;
diff --git a/libavcodec/bsf/Makefile b/libavcodec/bsf/Makefile
index 40b7fc6e9b..39ea091b50 100644
--- a/libavcodec/bsf/Makefile
+++ b/libavcodec/bsf/Makefile
@@ -2,6 +2,7 @@ clean::
$(RM) $(CLEANSUFFIXES:%=libavcodec/bsf/%)
OBJS-$(CONFIG_AAC_ADTSTOASC_BSF) += bsf/aac_adtstoasc.o
+OBJS-$(CONFIG_APV_METADATA_BSF) += bsf/apv_metadata.o
OBJS-$(CONFIG_AV1_FRAME_MERGE_BSF) += bsf/av1_frame_merge.o
OBJS-$(CONFIG_AV1_FRAME_SPLIT_BSF) += bsf/av1_frame_split.o
OBJS-$(CONFIG_AV1_METADATA_BSF) += bsf/av1_metadata.o
diff --git a/libavcodec/bsf/apv_metadata.c b/libavcodec/bsf/apv_metadata.c
new file mode 100644
index 0000000000..a1cdcf86c8
--- /dev/null
+++ b/libavcodec/bsf/apv_metadata.c
@@ -0,0 +1,134 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/common.h"
+#include "libavutil/opt.h"
+
+#include "bsf.h"
+#include "bsf_internal.h"
+#include "cbs.h"
+#include "cbs_bsf.h"
+#include "cbs_apv.h"
+
+typedef struct APVMetadataContext {
+ CBSBSFContext common;
+
+ int color_primaries;
+ int transfer_characteristics;
+ int matrix_coefficients;
+ int full_range_flag;
+} APVMetadataContext;
+
+
+static int apv_metadata_update_frame_header(AVBSFContext *bsf,
+ APVRawFrameHeader *hdr)
+{
+ APVMetadataContext *ctx = bsf->priv_data;
+
+ if (ctx->color_primaries >= 0 ||
+ ctx->transfer_characteristics >= 0 ||
+ ctx->matrix_coefficients >= 0 ||
+ ctx->full_range_flag >= 0) {
+ hdr->color_description_present_flag = 1;
+
+ if (ctx->color_primaries >= 0)
+ hdr->color_primaries = ctx->color_primaries;
+ if (ctx->transfer_characteristics >= 0)
+ hdr->transfer_characteristics = ctx->transfer_characteristics;
+ if (ctx->matrix_coefficients >= 0)
+ hdr->matrix_coefficients = ctx->matrix_coefficients;
+ if (ctx->full_range_flag >= 0)
+ hdr->full_range_flag = ctx->full_range_flag;
+ }
+
+ return 0;
+}
+
+static int apv_metadata_update_fragment(AVBSFContext *bsf, AVPacket *pkt,
+ CodedBitstreamFragment *frag)
+{
+ int err, i;
+
+ for (i = 0; i < frag->nb_units; i++) {
+ if (frag->units[i].type == APV_PBU_PRIMARY_FRAME) {
+ APVRawFrame *pbu = frag->units[i].content;
+ err = apv_metadata_update_frame_header(bsf, &pbu->frame_header);
+ if (err < 0)
+ return err;
+ }
+ }
+
+ return 0;
+}
+
+static const CBSBSFType apv_metadata_type = {
+ .codec_id = AV_CODEC_ID_APV,
+ .fragment_name = "access unit",
+ .unit_name = "PBU",
+ .update_fragment = &apv_metadata_update_fragment,
+};
+
+static int apv_metadata_init(AVBSFContext *bsf)
+{
+ return ff_cbs_bsf_generic_init(bsf, &apv_metadata_type);
+}
+
+#define OFFSET(x) offsetof(APVMetadataContext, x)
+#define FLAGS (AV_OPT_FLAG_VIDEO_PARAM|AV_OPT_FLAG_BSF_PARAM)
+static const AVOption apv_metadata_options[] = {
+ { "color_primaries", "Set color primaries (section 5.3.5)",
+ OFFSET(color_primaries), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, 255, FLAGS },
+ { "transfer_characteristics", "Set transfer characteristics (section 5.3.5)",
+ OFFSET(transfer_characteristics), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, 255, FLAGS },
+ { "matrix_coefficients", "Set matrix coefficients (section 5.3.5)",
+ OFFSET(matrix_coefficients), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, 255, FLAGS },
+
+ { "full_range_flag", "Set full range flag (section 5.3.5)",
+ OFFSET(full_range_flag), AV_OPT_TYPE_INT,
+ { .i64 = -1 }, -1, 1, FLAGS, .unit = "cr" },
+ { "tv", "TV (limited) range", 0, AV_OPT_TYPE_CONST,
+ { .i64 = 0 }, .flags = FLAGS, .unit = "cr" },
+ { "pc", "PC (full) range", 0, AV_OPT_TYPE_CONST,
+ { .i64 = 1 }, .flags = FLAGS, .unit = "cr" },
+
+ { NULL }
+};
+
+static const AVClass apv_metadata_class = {
+ .class_name = "apv_metadata_bsf",
+ .item_name = av_default_item_name,
+ .option = apv_metadata_options,
+ .version = LIBAVUTIL_VERSION_INT,
+};
+
+static const enum AVCodecID apv_metadata_codec_ids[] = {
+ AV_CODEC_ID_APV, AV_CODEC_ID_NONE,
+};
+
+const FFBitStreamFilter ff_apv_metadata_bsf = {
+ .p.name = "apv_metadata",
+ .p.codec_ids = apv_metadata_codec_ids,
+ .p.priv_class = &apv_metadata_class,
+ .priv_data_size = sizeof(APVMetadataContext),
+ .init = &apv_metadata_init,
+ .close = &ff_cbs_bsf_generic_close,
+ .filter = &ff_cbs_bsf_generic_filter,
+};
--
2.47.2
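The override rule in apv_metadata_update_frame_header() can be read in isolation as the sketch below. ColorOpts, ColorHeader and apply_color_opts() are hypothetical stand-ins for APVMetadataContext, APVRawFrameHeader and the patch's function, so this is only a model of the logic, not the filter itself: an option left at -1 means "do not touch", and setting any one option forces color_description_present_flag to 1.

```c
#include <stdint.h>

/* Hypothetical stand-in for the option fields of APVMetadataContext.
 * A value of -1 means "leave the header field unchanged". */
typedef struct ColorOpts {
    int color_primaries;
    int transfer_characteristics;
    int matrix_coefficients;
    int full_range_flag;
} ColorOpts;

/* Hypothetical stand-in for the colour fields of APVRawFrameHeader. */
typedef struct ColorHeader {
    uint8_t color_description_present_flag;
    uint8_t color_primaries;
    uint8_t transfer_characteristics;
    uint8_t matrix_coefficients;
    uint8_t full_range_flag;
} ColorHeader;

static void apply_color_opts(const ColorOpts *o, ColorHeader *h)
{
    /* Nothing requested: the header is left exactly as parsed. */
    if (o->color_primaries < 0 && o->transfer_characteristics < 0 &&
        o->matrix_coefficients < 0 && o->full_range_flag < 0)
        return;

    /* At least one field is overridden, so the description must be present. */
    h->color_description_present_flag = 1;

    if (o->color_primaries >= 0)
        h->color_primaries = (uint8_t)o->color_primaries;
    if (o->transfer_characteristics >= 0)
        h->transfer_characteristics = (uint8_t)o->transfer_characteristics;
    if (o->matrix_coefficients >= 0)
        h->matrix_coefficients = (uint8_t)o->matrix_coefficients;
    if (o->full_range_flag >= 0)
        h->full_range_flag = (uint8_t)o->full_range_flag;
}
```

Fields that are not overridden keep whatever value the parsed header already held, which matters when the description was absent in the input.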
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH v3 7/7] lavf: APV muxer
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
` (5 preceding siblings ...)
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 6/7] lavc: APV metadata bitstream filter Mark Thompson
@ 2025-04-23 20:45 ` Mark Thompson
6 siblings, 0 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-23 20:45 UTC (permalink / raw)
To: ffmpeg-devel
---
libavformat/Makefile | 1 +
libavformat/allformats.c | 1 +
libavformat/apvenc.c | 40 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 42 insertions(+)
create mode 100644 libavformat/apvenc.c
diff --git a/libavformat/Makefile b/libavformat/Makefile
index ef96c2762e..6c9992adab 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -120,6 +120,7 @@ OBJS-$(CONFIG_APTX_MUXER) += rawenc.o
OBJS-$(CONFIG_APTX_HD_DEMUXER) += aptxdec.o
OBJS-$(CONFIG_APTX_HD_MUXER) += rawenc.o
OBJS-$(CONFIG_APV_DEMUXER) += apvdec.o
+OBJS-$(CONFIG_APV_MUXER) += apvenc.o
OBJS-$(CONFIG_AQTITLE_DEMUXER) += aqtitledec.o subtitles.o
OBJS-$(CONFIG_ARGO_ASF_DEMUXER) += argo_asf.o
OBJS-$(CONFIG_ARGO_ASF_MUXER) += argo_asf.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index 90a4fe64ec..b5a23f9c17 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -73,6 +73,7 @@ extern const FFOutputFormat ff_aptx_muxer;
extern const FFInputFormat ff_aptx_hd_demuxer;
extern const FFOutputFormat ff_aptx_hd_muxer;
extern const FFInputFormat ff_apv_demuxer;
+extern const FFOutputFormat ff_apv_muxer;
extern const FFInputFormat ff_aqtitle_demuxer;
extern const FFInputFormat ff_argo_asf_demuxer;
extern const FFOutputFormat ff_argo_asf_muxer;
diff --git a/libavformat/apvenc.c b/libavformat/apvenc.c
new file mode 100644
index 0000000000..9c4d33fdae
--- /dev/null
+++ b/libavformat/apvenc.c
@@ -0,0 +1,40 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavcodec/apv.h"
+
+#include "avformat.h"
+#include "mux.h"
+
+static int apv_write_packet(AVFormatContext *s, AVPacket *pkt)
+{
+ avio_wb32(s->pb, pkt->size);
+ avio_write(s->pb, pkt->data, pkt->size);
+ return 0;
+}
+
+const FFOutputFormat ff_apv_muxer = {
+ .p.name = "apv",
+ .p.long_name = NULL_IF_CONFIG_SMALL("APV raw bitstream"),
+ .p.extensions = "apv",
+ .p.audio_codec = AV_CODEC_ID_NONE,
+ .p.video_codec = AV_CODEC_ID_APV,
+ .p.subtitle_codec = AV_CODEC_ID_NONE,
+ .flags_internal = FF_OFMT_FLAG_MAX_ONE_OF_EACH,
+ .write_packet = apv_write_packet,
+};
--
2.47.2
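The framing that apv_write_packet() produces is a 32-bit big-endian size followed by the access unit bytes. A minimal sketch of the same layout on a plain memory buffer, assuming a hypothetical helper frame_au() that is not part of the patch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lays out one access unit the way the muxer above
 * writes it, a 32-bit big-endian size followed by the payload.
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t frame_au(uint8_t *dst, size_t dst_size,
                       const uint8_t *au, uint32_t au_size)
{
    if (dst_size < 4 || dst_size - 4 < au_size)
        return 0;

    /* Same byte order as avio_wb32(): most significant byte first. */
    dst[0] = (uint8_t)(au_size >> 24);
    dst[1] = (uint8_t)(au_size >> 16);
    dst[2] = (uint8_t)(au_size >> 8);
    dst[3] = (uint8_t)(au_size);

    memcpy(dst + 4, au, au_size);
    return 4 + (size_t)au_size;
}
```

This is the same layout the demuxer in patch 3/7 reads back, which is what makes the round trip useful for testing.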
* Re: [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support Mark Thompson
@ 2025-04-24 0:02 ` James Almer
2025-04-24 20:16 ` Mark Thompson
0 siblings, 1 reply; 17+ messages in thread
From: James Almer @ 2025-04-24 0:02 UTC (permalink / raw)
To: ffmpeg-devel
On 4/23/2025 5:45 PM, Mark Thompson wrote:
> +static int cbs_apv_split_fragment(CodedBitstreamContext *ctx,
> + CodedBitstreamFragment *frag,
> + int header)
> +{
> + uint8_t *data = frag->data;
> + size_t size = frag->data_size;
> + uint32_t signature;
> + int err, trace;
To prepare CBS for the presence of extradata, make this function a no-op
when header != 0.
> +
> + // Don't include parsing here in trace output.
> + trace = ctx->trace_enable;
> + ctx->trace_enable = 0;
> +
> + signature = AV_RB32(data);
> + if (signature != APV_SIGNATURE) {
> + av_log(ctx->log_ctx, AV_LOG_ERROR,
> + "Invalid APV access unit: bad signature %08x.\n",
> + signature);
> + err = AVERROR_INVALIDDATA;
> + goto fail;
> + }
> + data += 4;
> + size -= 4;
> +
> + while (size > 0) {
> + GetBitContext gbc;
> + uint32_t pbu_size;
> + APVRawPBUHeader pbu_header;
> +
> + if (size < 8) {
> + av_log(ctx->log_ctx, AV_LOG_ERROR, "Invalid PBU: "
> + "fragment too short (%"SIZE_SPECIFIER" bytes).\n",
> + size);
> + err = AVERROR_INVALIDDATA;
> + goto fail;
> + }
* Re: [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer Mark Thompson
@ 2025-04-24 0:10 ` James Almer
2025-04-24 20:15 ` Mark Thompson
0 siblings, 1 reply; 17+ messages in thread
From: James Almer @ 2025-04-24 0:10 UTC (permalink / raw)
To: ffmpeg-devel
On 4/23/2025 5:45 PM, Mark Thompson wrote:
> +static int apv_read_header(AVFormatContext *s)
> +{
> + AVStream *st;
> + GetByteContext gbc;
> + APVHeaderInfo header;
> + uint8_t buffer[28];
> + uint32_t au_size, signature, pbu_size;
> + int err, size;
> +
> + err = ffio_ensure_seekback(s->pb, sizeof(buffer));
Isn't 28 bytes small enough that a backwards avio_seek() should always
succeed?
> + if (err < 0)
> + return err;
> + size = avio_read(s->pb, buffer, sizeof(buffer));
> + if (size < 0)
> + return size;
> +
> + bytestream2_init(&gbc, buffer, sizeof(buffer));
> +
> + au_size = bytestream2_get_be32(&gbc);
> + if (au_size < 24) {
> + // Too small.
> + return AVERROR_INVALIDDATA;
> + }
> + signature = bytestream2_get_be32(&gbc);
> + if (signature != APV_SIGNATURE) {
> + // Signature is mandatory.
> + return AVERROR_INVALIDDATA;
> + }
> + pbu_size = bytestream2_get_be32(&gbc);
> + if (pbu_size < 16) {
> + // Too small.
> + return AVERROR_INVALIDDATA;
> + }
> +
> + err = apv_extract_header_info(&header, &gbc);
> + if (err < 0)
> + return err;
> +
> + st = avformat_new_stream(s, NULL);
> + if (!st)
> + return AVERROR(ENOMEM);
> +
> + st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
> + st->codecpar->codec_id = AV_CODEC_ID_APV;
> + st->codecpar->format = header.pixel_format;
> + st->codecpar->profile = header.profile_idc;
> + st->codecpar->level = header.level_idc;
> + st->codecpar->width = header.frame_width;
> + st->codecpar->height = header.frame_height;
> +
> + st->avg_frame_rate = (AVRational){ 30, 1 };
> + avpriv_set_pts_info(st, 64, 1, 30);
> +
> + avio_seek(s->pb, -size, SEEK_CUR);
> +
> + return 0;
> +}
> +
> +static int apv_read_packet(AVFormatContext *s, AVPacket *pkt)
> +{
> + uint32_t au_size;
> + int ret;
> +
> + au_size = avio_rb32(s->pb);
> + if (au_size == 0 && avio_feof(s->pb))
> + return AVERROR_EOF;
> + if (au_size < 16 || au_size > 1 << 24) {
Might be a good idea to also check for the signature.
> + av_log(s, AV_LOG_ERROR, "APV AU is bad\n");
> + return AVERROR_INVALIDDATA;
> + }
> +
> + ret = av_get_packet(s->pb, pkt, au_size);
> + pkt->flags = AV_PKT_FLAG_KEY;
> +
> + return ret;
> +}
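A sketch of what the extra signature check in apv_read_packet() could look like, written against a plain 8-byte prefix rather than AVIOContext. check_au_prefix() and rb32() are hypothetical names, and the constant assumes APV_SIGNATURE is the four bytes 'aPv1' (worth confirming against libavcodec/apv.h). The size bounds are the ones from the quoted code.

```c
#include <stdint.h>

/* Assumed value of APV_SIGNATURE: the bytes 'a' 'P' 'v' '1', big-endian. */
#define SKETCH_APV_SIGNATURE 0x61507631u

/* Read a 32-bit big-endian value. */
static uint32_t rb32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] <<  8 | (uint32_t)p[3];
}

/* prefix holds the 4-byte AU size followed by the first 4 bytes of the AU.
 * Returns the AU size if both look valid, 0 otherwise. */
static uint32_t check_au_prefix(const uint8_t prefix[8])
{
    uint32_t au_size   = rb32(prefix);
    uint32_t signature = rb32(prefix + 4);

    /* Same bounds as the quoted demuxer code. */
    if (au_size < 16 || au_size > (1u << 24))
        return 0;
    /* The additional check: the AU must start with the signature. */
    if (signature != SKETCH_APV_SIGNATURE)
        return 0;
    return au_size;
}
```

In the demuxer itself the signature is part of the access unit, so the check would presumably run on the first four bytes of the packet data once av_get_packet() has returned.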
* Re: [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64 Mark Thompson
@ 2025-04-24 2:55 ` James Almer
2025-04-24 20:37 ` Mark Thompson
0 siblings, 1 reply; 17+ messages in thread
From: James Almer @ 2025-04-24 2:55 UTC (permalink / raw)
To: ffmpeg-devel
[-- Attachment #1.1.1: Type: text/plain, Size: 22344 bytes --]
On 4/23/2025 5:45 PM, Mark Thompson wrote:
> Typical checkasm result on Alder Lake:
>
> decode_transquant_8_c: 464.2 ( 1.00x)
> decode_transquant_8_avx2: 86.2 ( 5.38x)
> decode_transquant_10_c: 481.6 ( 1.00x)
> decode_transquant_10_avx2: 83.5 ( 5.77x)
> ---
> libavcodec/apv_dsp.c | 4 +
> libavcodec/apv_dsp.h | 2 +
> libavcodec/x86/Makefile | 2 +
> libavcodec/x86/apv_dsp.asm | 311 ++++++++++++++++++++++++++++++++++
> libavcodec/x86/apv_dsp_init.c | 44 +++++
> tests/checkasm/Makefile | 1 +
> tests/checkasm/apv_dsp.c | 109 ++++++++++++
> tests/checkasm/checkasm.c | 3 +
> tests/checkasm/checkasm.h | 1 +
> tests/fate/checkasm.mak | 1 +
> 10 files changed, 478 insertions(+)
> create mode 100644 libavcodec/x86/apv_dsp.asm
> create mode 100644 libavcodec/x86/apv_dsp_init.c
> create mode 100644 tests/checkasm/apv_dsp.c
>
> diff --git a/libavcodec/apv_dsp.c b/libavcodec/apv_dsp.c
> index fe11cd6b94..fd814ef900 100644
> --- a/libavcodec/apv_dsp.c
> +++ b/libavcodec/apv_dsp.c
> @@ -133,4 +133,8 @@ static void apv_decode_transquant_c(void *output,
> av_cold void ff_apv_dsp_init(APVDSPContext *dsp)
> {
> dsp->decode_transquant = apv_decode_transquant_c;
> +
> +#if ARCH_X86_64
> + ff_apv_dsp_init_x86_64(dsp);
> +#endif
> }
> diff --git a/libavcodec/apv_dsp.h b/libavcodec/apv_dsp.h
> index 31645b8581..c63d6a88ee 100644
> --- a/libavcodec/apv_dsp.h
> +++ b/libavcodec/apv_dsp.h
> @@ -34,4 +34,6 @@ typedef struct APVDSPContext {
>
> void ff_apv_dsp_init(APVDSPContext *dsp);
>
> +void ff_apv_dsp_init_x86_64(APVDSPContext *dsp);
> +
> #endif /* AVCODEC_APV_DSP_H */
> diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
> index 5d53515381..821c410a0f 100644
> --- a/libavcodec/x86/Makefile
> +++ b/libavcodec/x86/Makefile
> @@ -44,6 +44,7 @@ OBJS-$(CONFIG_ADPCM_G722_DECODER) += x86/g722dsp_init.o
> OBJS-$(CONFIG_ADPCM_G722_ENCODER) += x86/g722dsp_init.o
> OBJS-$(CONFIG_ALAC_DECODER) += x86/alacdsp_init.o
> OBJS-$(CONFIG_APNG_DECODER) += x86/pngdsp_init.o
> +OBJS-$(CONFIG_APV_DECODER) += x86/apv_dsp_init.o
> OBJS-$(CONFIG_CAVS_DECODER) += x86/cavsdsp.o
> OBJS-$(CONFIG_CFHD_DECODER) += x86/cfhddsp_init.o
> OBJS-$(CONFIG_CFHD_ENCODER) += x86/cfhdencdsp_init.o
> @@ -149,6 +150,7 @@ X86ASM-OBJS-$(CONFIG_ADPCM_G722_DECODER) += x86/g722dsp.o
> X86ASM-OBJS-$(CONFIG_ADPCM_G722_ENCODER) += x86/g722dsp.o
> X86ASM-OBJS-$(CONFIG_ALAC_DECODER) += x86/alacdsp.o
> X86ASM-OBJS-$(CONFIG_APNG_DECODER) += x86/pngdsp.o
> +X86ASM-OBJS-$(CONFIG_APV_DECODER) += x86/apv_dsp.o
> X86ASM-OBJS-$(CONFIG_CAVS_DECODER) += x86/cavsidct.o
> X86ASM-OBJS-$(CONFIG_CFHD_ENCODER) += x86/cfhdencdsp.o
> X86ASM-OBJS-$(CONFIG_CFHD_DECODER) += x86/cfhddsp.o
> diff --git a/libavcodec/x86/apv_dsp.asm b/libavcodec/x86/apv_dsp.asm
> new file mode 100644
> index 0000000000..12d96481de
> --- /dev/null
> +++ b/libavcodec/x86/apv_dsp.asm
> @@ -0,0 +1,311 @@
> +;************************************************************************
> +;* This file is part of FFmpeg.
> +;*
> +;* FFmpeg is free software; you can redistribute it and/or
> +;* modify it under the terms of the GNU Lesser General Public
> +;* License as published by the Free Software Foundation; either
> +;* version 2.1 of the License, or (at your option) any later version.
> +;*
> +;* FFmpeg is distributed in the hope that it will be useful,
> +;* but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +;* Lesser General Public License for more details.
> +;*
> +;* You should have received a copy of the GNU Lesser General Public
> +;* License along with FFmpeg; if not, write to the Free Software
> +;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> +;******************************************************************************
> +
> +%include "libavutil/x86/x86util.asm"
> +
> +%if ARCH_X86_64
> +
> +SECTION_RODATA 32
> +
> +; Full matrix for row transform.
> +const tmatrix_row
> + dw 64, 89, 84, 75, 64, 50, 35, 18
> + dw 64, -18, -84, 50, 64, -75, -35, 89
> + dw 64, 75, 35, -18, -64, -89, -84, -50
> + dw 64, -50, -35, 89, -64, -18, 84, -75
> + dw 64, 50, -35, -89, -64, 18, 84, 75
> + dw 64, -75, 35, 18, -64, 89, -84, 50
> + dw 64, 18, -84, -50, 64, 75, -35, -89
> + dw 64, -89, 84, -75, 64, -50, 35, -18
> +
> +; Constant pairs for broadcast in column transform.
> +const tmatrix_col_even
> + dw 64, 64, 64, -64
> + dw 84, 35, 35, -84
> +const tmatrix_col_odd
> + dw 89, 75, 50, 18
> + dw 75, -18, -89, -50
> + dw 50, -89, 18, 75
> + dw 18, -50, 75, -89
> +
> +; Memory targets for vpbroadcastd (register version requires AVX512).
> +cextern pd_1
> +const sixtyfour
> + dd 64
> +
> +SECTION .text
> +
> +; void ff_apv_decode_transquant_avx2(void *output,
> +; ptrdiff_t pitch,
> +; const int16_t *input,
> +; const int16_t *qmatrix,
> +; int bit_depth,
> +; int qp_shift);
> +
> +INIT_YMM avx2
> +
> +cglobal apv_decode_transquant, 5, 7, 16, output, pitch, input, qmatrix, bit_depth, qp_shift, tmp
> +
> + ; Load input and dequantise
> +
> + vpbroadcastd m10, [pd_1]
> + lea tmpd, [bit_depthd - 2]
> + movd xm8, qp_shiftm
> + movd xm9, tmpd
> + vpslld m10, m10, xm9
> + vpsrld m10, m10, 1
> +
> + ; m8 = scalar qp_shift
> + ; m9 = scalar bd_shift
> + ; m10 = vector 1 << (bd_shift - 1)
> + ; m11 = qmatrix load
> +
> +%macro LOAD_AND_DEQUANT 2 ; (xmm input, constant offset)
> + vpmovsxwd m%1, [inputq + %2]
> + vpmovsxwd m11, [qmatrixq + %2]
> + vpmaddwd m%1, m%1, m11
> + vpslld m%1, m%1, xm8
> + vpaddd m%1, m%1, m10
> + vpsrad m%1, m%1, xm9
> + vpackssdw m%1, m%1, m%1
> +%endmacro
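For readers following the rounding, a scalar C sketch of what LOAD_AND_DEQUANT appears to compute per coefficient. dequant_coeff() and clip_int16() are illustrative names, not from the patch. The sketch assumes the qmatrix entry is non-negative (so the vpmaddwd on sign-extended words reduces to a plain product) and that >> on a negative int is an arithmetic shift, as vpsrad is.

```c
#include <stdint.h>

/* Signed saturation to 16 bits, as vpackssdw does. */
static int16_t clip_int16(int32_t v)
{
    if (v < INT16_MIN) return INT16_MIN;
    if (v > INT16_MAX) return INT16_MAX;
    return (int16_t)v;
}

/* One coefficient of the dequantisation step.
 * qm is assumed non-negative (it is premultiplied by level_scale). */
static int16_t dequant_coeff(int16_t coeff, int16_t qm,
                             int bit_depth, int qp_shift)
{
    int bd_shift = bit_depth - 2;          /* lea tmpd, [bit_depthd - 2]   */
    int32_t v = (int32_t)coeff * qm;       /* vpmaddwd                     */

    /* vpslld: shift as unsigned so that overflow wraps like the SIMD op. */
    v = (int32_t)((uint32_t)v << qp_shift);
    v += 1 << (bd_shift - 1);              /* vpaddd with m10 (rounding)   */
    v >>= bd_shift;                        /* vpsrad (arithmetic shift)    */

    return clip_int16(v);                  /* vpackssdw                    */
}
```

With bit_depth 8 and qp_shift 4 the net effect is a multiply by qm/4 with round-half-up, saturated to int16_t.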
> +
> + LOAD_AND_DEQUANT 0, 0x00
> + LOAD_AND_DEQUANT 1, 0x10
> + LOAD_AND_DEQUANT 2, 0x20
> + LOAD_AND_DEQUANT 3, 0x30
> + LOAD_AND_DEQUANT 4, 0x40
> + LOAD_AND_DEQUANT 5, 0x50
> + LOAD_AND_DEQUANT 6, 0x60
> + LOAD_AND_DEQUANT 7, 0x70
> +
> + ; mN = row N words 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
> +
> + ; Transform columns
> + ; This applies a 1-D DCT butterfly
> +
> + vpunpcklwd m12, m0, m4
> + vpunpcklwd m13, m2, m6
> + vpunpcklwd m14, m1, m3
> + vpunpcklwd m15, m5, m7
> +
> + ; m12 = rows 0 and 4 interleaved
> + ; m13 = rows 2 and 6 interleaved
> + ; m14 = rows 1 and 3 interleaved
> + ; m15 = rows 5 and 7 interleaved
> +
> + lea tmpq, [tmatrix_col_even]
> + vpbroadcastd m0, [tmpq + 0x00]
> + vpbroadcastd m1, [tmpq + 0x04]
> + vpbroadcastd m2, [tmpq + 0x08]
> + vpbroadcastd m3, [tmpq + 0x0c]
How about
vbroadcasti128 m0, [tmatrix_col_even]
pshufd m1, m0, q1111
pshufd m2, m0, q2222
pshufd m3, m0, q3333
pshufd m0, m0, q0000
So you remove the lea, and do a single load from memory within a single
cross-lane instruction, instead of four of each.
Same below.
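The equivalence behind this suggestion can be checked with a small scalar model. The sketch below is not intrinsics code: Ymm and the model_* functions are hypothetical helpers that only mimic how each instruction places dwords, assuming x86inc's qNNNN immediates select the same dword n for every position in a lane.

```c
#include <stdint.h>
#include <string.h>

/* Eight dwords of a modelled 256-bit register (two 128-bit lanes). */
typedef struct { int32_t d[8]; } Ymm;

/* vpbroadcastd ymm, [mem + 4*n]: one dword copied to all eight slots. */
static Ymm model_vpbroadcastd(const int16_t *mem, int n)
{
    Ymm r;
    int32_t v;
    memcpy(&v, mem + 2 * n, sizeof(v));
    for (int i = 0; i < 8; i++)
        r.d[i] = v;
    return r;
}

/* vbroadcasti128 ymm, [mem]: the same 16 bytes copied into both lanes. */
static Ymm model_vbroadcasti128(const int16_t *mem)
{
    Ymm r;
    memcpy(&r.d[0], mem, 16);
    memcpy(&r.d[4], mem, 16);
    return r;
}

/* pshufd ymm, ymm, qnnnn: within each lane, every slot takes dword n. */
static Ymm model_pshufd_splat(Ymm src, int n)
{
    Ymm r;
    for (int lane = 0; lane < 2; lane++)
        for (int i = 0; i < 4; i++)
            r.d[lane * 4 + i] = src.d[lane * 4 + n];
    return r;
}
```

For the 16 bytes at tmatrix_col_even, model_pshufd_splat(model_vbroadcasti128(t), n) matches model_vpbroadcastd(t, n) for n = 0..3, which is why one load plus four in-lane shuffles can replace four broadcast loads.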
> +
> + vpmaddwd m4, m12, m0
> + vpmaddwd m5, m12, m1
> + vpmaddwd m6, m13, m2
> + vpmaddwd m7, m13, m3
> + vpaddd m8, m4, m6
> + vpaddd m9, m5, m7
> + vpsubd m10, m5, m7
> + vpsubd m11, m4, m6
> +
> + lea tmpq, [tmatrix_col_odd]
> + vpbroadcastd m0, [tmpq + 0x00]
> + vpbroadcastd m1, [tmpq + 0x04]
> + vpbroadcastd m2, [tmpq + 0x08]
> + vpbroadcastd m3, [tmpq + 0x0c]
> +
> + vpmaddwd m4, m14, m0
> + vpmaddwd m5, m15, m1
> + vpmaddwd m6, m14, m2
> + vpmaddwd m7, m15, m3
> + vpaddd m12, m4, m5
> + vpaddd m13, m6, m7
> +
> + vpbroadcastd m0, [tmpq + 0x10]
> + vpbroadcastd m1, [tmpq + 0x14]
> + vpbroadcastd m2, [tmpq + 0x18]
> + vpbroadcastd m3, [tmpq + 0x1c]
> +
> + vpmaddwd m4, m14, m0
> + vpmaddwd m5, m15, m1
> + vpmaddwd m6, m14, m2
> + vpmaddwd m7, m15, m3
> + vpaddd m14, m4, m5
> + vpaddd m15, m6, m7
> +
> + vpaddd m0, m8, m12
> + vpaddd m1, m9, m13
> + vpaddd m2, m10, m14
> + vpaddd m3, m11, m15
> + vpsubd m4, m11, m15
> + vpsubd m5, m10, m14
> + vpsubd m6, m9, m13
> + vpsubd m7, m8, m12
> +
> + ; Mid-transform normalisation
> + ; Note that outputs here are fitted to 16 bits
> +
> + vpbroadcastd m8, [sixtyfour]
> +
> +%macro NORMALISE 1
> + vpaddd m%1, m%1, m8
> + vpsrad m%1, m%1, 7
> + vpackssdw m%1, m%1, m%1
> + vpermq m%1, m%1, q3120
> +%endmacro
> +
> + NORMALISE 0
> + NORMALISE 1
> + NORMALISE 2
> + NORMALISE 3
> + NORMALISE 4
> + NORMALISE 5
> + NORMALISE 6
> + NORMALISE 7
> +
> + ; mN = row N words 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
> +
> + ; Transform rows
> + ; This multiplies the rows directly by the transform matrix,
> + ; avoiding the need to transpose anything
> +
> + lea tmpq, [tmatrix_row]
> + mova m12, [tmpq + 0x00]
> + mova m13, [tmpq + 0x20]
> + mova m14, [tmpq + 0x40]
> + mova m15, [tmpq + 0x60]
> +
> +%macro TRANS_ROW_STEP 1
> + vpmaddwd m8, m%1, m12
> + vpmaddwd m9, m%1, m13
> + vpmaddwd m10, m%1, m14
> + vpmaddwd m11, m%1, m15
> + vphaddd m8, m8, m9
> + vphaddd m10, m10, m11
> + vphaddd m%1, m8, m10
> +%endmacro
> +
> + TRANS_ROW_STEP 0
> + TRANS_ROW_STEP 1
> + TRANS_ROW_STEP 2
> + TRANS_ROW_STEP 3
> + TRANS_ROW_STEP 4
> + TRANS_ROW_STEP 5
> + TRANS_ROW_STEP 6
> + TRANS_ROW_STEP 7
> +
> + ; Renormalise, clip and store output
> +
> + vpbroadcastd m14, [pd_1]
> + mov tmpd, 20
> + sub tmpd, bit_depthd
> + movd xm9, tmpd
> + dec tmpd
> + movd xm13, tmpd
> + movd xm15, bit_depthd
> + vpslld m8, m14, xm13
> + vpslld m12, m14, xm15
> + vpsrld m10, m12, 1
> + vpsubd m12, m12, m14
> + vpxor m11, m11, m11
> +
> + ; m8 = vector 1 << (bd_shift - 1)
> + ; m9 = scalar bd_shift
> + ; m10 = vector 1 << (bit_depth - 1)
> + ; m11 = zero
> + ; m12 = vector (1 << bit_depth) - 1
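A scalar C sketch of the final step these constants set up, for one output sample. renorm_pixel() is an illustrative name, and >> on a negative value is assumed to be an arithmetic shift as with vpsrad. The explicit clamp mirrors the 10-bit path (vpmaxsd/vpminsd); the 8-bit path gets its clamp from the saturating packs instead, so treat the 8-bit case here as a model rather than an exact transcription.

```c
#include <stdint.h>

/* Round, shift down, re-centre around mid-grey and clamp to the pixel range. */
static int renorm_pixel(int32_t v, int bit_depth)
{
    int bd_shift = 20 - bit_depth;            /* scalar in xm9  */
    int32_t max  = (1 << bit_depth) - 1;      /* vector in m12  */

    v = (v + (1 << (bd_shift - 1))) >> bd_shift;  /* add m8, vpsrad xm9 */
    v += 1 << (bit_depth - 1);                    /* add m10            */

    if (v < 0)
        v = 0;                                    /* vpmaxsd with m11   */
    if (v > max)
        v = max;                                  /* vpminsd with m12   */
    return (int)v;
}
```

A zero residual therefore lands on 128 for 8-bit output and 512 for 10-bit output.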
> +
> + cmp bit_depthd, 8
> + jne store_10
> +
> + lea tmpq, [pitchq + 2*pitchq]
> +%macro NORMALISE_AND_STORE_8 4
> + vpaddd m%1, m%1, m8
> + vpaddd m%2, m%2, m8
> + vpaddd m%3, m%3, m8
> + vpaddd m%4, m%4, m8
> + vpsrad m%1, m%1, xm9
> + vpsrad m%2, m%2, xm9
> + vpsrad m%3, m%3, xm9
> + vpsrad m%4, m%4, xm9
> + vpaddd m%1, m%1, m10
> + vpaddd m%2, m%2, m10
> + vpaddd m%3, m%3, m10
> + vpaddd m%4, m%4, m10
> + ; m%1 = A0-3 A4-7
> + ; m%2 = B0-3 B4-7
> + ; m%3 = C0-3 C4-7
> + ; m%4 = D0-3 D4-7
> + vpackusdw m%1, m%1, m%2
> + vpackusdw m%3, m%3, m%4
> + ; m%1 = A0-3 B0-3 A4-7 B4-7
> + ; m%3 = C0-3 D0-3 C4-7 D4-7
> + vpermq m%1, m%1, q3120
> + vpermq m%2, m%3, q3120
> + ; m%1 = A0-3 A4-7 B0-3 B4-7
> + ; m%2 = C0-3 C4-7 D0-3 D4-7
> + vpackuswb m%1, m%1, m%2
> + ; m%1 = A0-3 A4-7 C0-3 C4-7 B0-3 B4-7 D0-3 D4-7
> + vextracti128 xm%2, m%1, 1
> + vpsrldq xm%3, xm%1, 8
> + vpsrldq xm%4, xm%2, 8
> + vmovq [outputq], xm%1
> + vmovq [outputq + pitchq], xm%2
> + vmovq [outputq + 2*pitchq], xm%3
> + vmovq [outputq + tmpq], xm%4
vextracti128 xm%2, m%1, 1
vmovq [outputq], xm%1
vmovq [outputq + pitchq], xm%2
vpextrq [outputq + 2*pitchq], xm%1, 1
vpextrq [outputq + tmpq], xm%2, 1
Saves you two vpsrldq, and may or may not be faster. Feel free to bench
or ignore.
> + lea outputq, [outputq + 4*pitchq]
> +%endmacro
> +
> + NORMALISE_AND_STORE_8 0, 1, 2, 3
> + NORMALISE_AND_STORE_8 4, 5, 6, 7
> +
> + RET
> +
> +store_10:
> +
> +%macro NORMALISE_AND_STORE_10 2
> + vpaddd m%1, m%1, m8
> + vpaddd m%2, m%2, m8
> + vpsrad m%1, m%1, xm9
> + vpsrad m%2, m%2, xm9
> + vpaddd m%1, m%1, m10
> + vpaddd m%2, m%2, m10
> + vpmaxsd m%1, m%1, m11
> + vpmaxsd m%2, m%2, m11
> + vpminsd m%1, m%1, m12
> + vpminsd m%2, m%2, m12
> + ; m%1 = A0-3 A4-7
> + ; m%2 = B0-3 B4-7
> + vpackusdw m%1, m%1, m%2
> + ; m%1 = A0-3 B0-3 A4-7 B4-7
> + vpermq m%1, m%1, q3120
> + ; m%1 = A0-3 A4-7 B0-3 B4-7
> + vextracti128 [outputq], m%1, 0
mova [outputq], xm%1
There's pretty much never a good reason to use extract for the lower bits.
> + vextracti128 [outputq + pitchq], m%1, 1
> + lea outputq, [outputq + 2*pitchq]
> +%endmacro
> +
> + NORMALISE_AND_STORE_10 0, 1
> + NORMALISE_AND_STORE_10 2, 3
> + NORMALISE_AND_STORE_10 4, 5
> + NORMALISE_AND_STORE_10 6, 7
> +
> + RET
> +
> +%endif ; ARCH_X86_64
> diff --git a/libavcodec/x86/apv_dsp_init.c b/libavcodec/x86/apv_dsp_init.c
> new file mode 100644
> index 0000000000..39360a0ad2
> --- /dev/null
> +++ b/libavcodec/x86/apv_dsp_init.c
> @@ -0,0 +1,44 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "config.h"
> +#include "libavutil/attributes.h"
> +#include "libavutil/cpu.h"
> +#include "libavutil/x86/asm.h"
> +#include "libavutil/x86/cpu.h"
> +#include "libavcodec/apv_dsp.h"
> +
> +#if ARCH_X86_64
> +
> +void ff_apv_decode_transquant_avx2(void *output,
> + ptrdiff_t pitch,
> + const int16_t *input,
> + const int16_t *qmatrix,
> + int bit_depth,
> + int qp_shift);
> +
> +av_cold void ff_apv_dsp_init_x86_64(APVDSPContext *dsp)
> +{
> + int cpu_flags = av_get_cpu_flags();
> +
> + if (EXTERNAL_AVX2_FAST(cpu_flags)) {
> + dsp->decode_transquant = ff_apv_decode_transquant_avx2;
> + }
> +}
> +
> +#endif /* ARCH_X86_64 */
> diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
> index d5c50e5599..193c1e4633 100644
> --- a/tests/checkasm/Makefile
> +++ b/tests/checkasm/Makefile
> @@ -28,6 +28,7 @@ AVCODECOBJS-$(CONFIG_AAC_DECODER) += aacpsdsp.o \
> sbrdsp.o
> AVCODECOBJS-$(CONFIG_AAC_ENCODER) += aacencdsp.o
> AVCODECOBJS-$(CONFIG_ALAC_DECODER) += alacdsp.o
> +AVCODECOBJS-$(CONFIG_APV_DECODER) += apv_dsp.o
> AVCODECOBJS-$(CONFIG_DCA_DECODER) += synth_filter.o
> AVCODECOBJS-$(CONFIG_DIRAC_DECODER) += diracdsp.o
> AVCODECOBJS-$(CONFIG_EXR_DECODER) += exrdsp.o
> diff --git a/tests/checkasm/apv_dsp.c b/tests/checkasm/apv_dsp.c
> new file mode 100644
> index 0000000000..b3adb8ca06
> --- /dev/null
> +++ b/tests/checkasm/apv_dsp.c
> @@ -0,0 +1,109 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <stdint.h>
> +
> +#include "checkasm.h"
> +
> +#include "libavutil/attributes.h"
> +#include "libavutil/mem_internal.h"
> +#include "libavcodec/apv_dsp.h"
> +
> +
> +static void check_decode_transquant_8(void)
> +{
> + LOCAL_ALIGNED_16(int16_t, input, [64]);
> + LOCAL_ALIGNED_16(int16_t, qmatrix, [64]);
> + LOCAL_ALIGNED_16(uint8_t, new_output, [64]);
> + LOCAL_ALIGNED_16(uint8_t, ref_output, [64]);
> +
> + declare_func(void,
> + uint8_t *output,
nit: this parameter is void*, so maybe just use that instead.
> + ptrdiff_t pitch,
> + const int16_t *input,
> + const int16_t *qmatrix,
> + int bit_depth,
> + int qp_shift);
> +
> + for (int i = 0; i < 64; i++) {
> + // Any signed 12-bit integer.
> + input[i] = rnd() % 2048 - 1024;
> +
> + // qmatrix input is premultiplied by level_scale, so
> + // range is 1 to 255 * 71. Interesting values are all
> + // at the low end of that, though.
> + qmatrix[i] = rnd() % 16 + 16;
> + }
> +
> + call_ref(ref_output, 8, input, qmatrix, 8, 4);
> + call_new(new_output, 8, input, qmatrix, 8, 4);
> +
> + if (memcmp(new_output, ref_output, 64 * sizeof(*ref_output)))
> + fail();
> +
> + bench_new(new_output, 8, input, qmatrix, 8, 4);
> +}
> +
> +static void check_decode_transquant_10(void)
> +{
> + LOCAL_ALIGNED_16( int16_t, input, [64]);
> + LOCAL_ALIGNED_16( int16_t, qmatrix, [64]);
> + LOCAL_ALIGNED_16(uint16_t, new_output, [64]);
> + LOCAL_ALIGNED_16(uint16_t, ref_output, [64]);
> +
> + declare_func(void,
> + uint16_t *output,
Ditto.
> + ptrdiff_t pitch,
> + const int16_t *input,
> + const int16_t *qmatrix,
> + int bit_depth,
> + int qp_shift);
> +
> + for (int i = 0; i < 64; i++) {
> + // Any signed 14-bit integer.
> + input[i] = rnd() % 16384 - 8192;
> +
> + // qmatrix input is premultiplied by level_scale, so
> + // range is 1 to 255 * 71. Interesting values are all
> + // at the low end of that, though.
> + qmatrix[i] = 16; //rnd() % 16 + 16;
> + }
> +
> + call_ref(ref_output, 16, input, qmatrix, 10, 4);
> + call_new(new_output, 16, input, qmatrix, 10, 4);
> +
> + if (memcmp(new_output, ref_output, 64 * sizeof(*ref_output)))
> + fail();
> +
> + bench_new(new_output, 16, input, qmatrix, 10, 4);
> +}
> +
> +void checkasm_check_apv_dsp(void)
> +{
> + APVDSPContext dsp;
> +
> + ff_apv_dsp_init(&dsp);
> +
> + if (check_func(dsp.decode_transquant, "decode_transquant_8"))
> + check_decode_transquant_8();
> +
> + if (check_func(dsp.decode_transquant, "decode_transquant_10"))
> + check_decode_transquant_10();
> +
> + report("apv_dsp");
report("decode_transquant");
So you get
- apv_dsp.decode_transquant [OK]
instead of
- apv_dsp.apv_dsp [OK]
In the checkasm output.
> +}
> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> index 412b8b2cd1..3bb82ed0e5 100644
> --- a/tests/checkasm/checkasm.c
> +++ b/tests/checkasm/checkasm.c
> @@ -129,6 +129,9 @@ static const struct {
> #if CONFIG_ALAC_DECODER
> { "alacdsp", checkasm_check_alacdsp },
> #endif
> + #if CONFIG_APV_DECODER
> + { "apv_dsp", checkasm_check_apv_dsp },
> + #endif
> #if CONFIG_AUDIODSP
> { "audiodsp", checkasm_check_audiodsp },
> #endif
> diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
> index ad239fb2a4..a6b5965e02 100644
> --- a/tests/checkasm/checkasm.h
> +++ b/tests/checkasm/checkasm.h
> @@ -83,6 +83,7 @@ void checkasm_check_ac3dsp(void);
> void checkasm_check_aes(void);
> void checkasm_check_afir(void);
> void checkasm_check_alacdsp(void);
> +void checkasm_check_apv_dsp(void);
> void checkasm_check_audiodsp(void);
> void checkasm_check_av_tx(void);
> void checkasm_check_blend(void);
> diff --git a/tests/fate/checkasm.mak b/tests/fate/checkasm.mak
> index 6d42df148e..720c5fd77e 100644
> --- a/tests/fate/checkasm.mak
> +++ b/tests/fate/checkasm.mak
> @@ -4,6 +4,7 @@ FATE_CHECKASM = fate-checkasm-aacencdsp \
> fate-checkasm-aes \
> fate-checkasm-af_afir \
> fate-checkasm-alacdsp \
> + fate-checkasm-apv_dsp \
> fate-checkasm-audiodsp \
> fate-checkasm-av_tx \
> fate-checkasm-blockdsp \
* Re: [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder Mark Thompson
@ 2025-04-24 3:04 ` James Almer
2025-04-25 17:25 ` Michael Niedermayer
1 sibling, 0 replies; 17+ messages in thread
From: James Almer @ 2025-04-24 3:04 UTC (permalink / raw)
To: ffmpeg-devel
On 4/23/2025 5:45 PM, Mark Thompson wrote:
> ---
> configure | 1 +
> libavcodec/Makefile | 1 +
> libavcodec/allcodecs.c | 1 +
> libavcodec/apv_decode.c | 403 +++++++++++++++++++++++++++++++++++++++
> libavcodec/apv_decode.h | 80 ++++++++
> libavcodec/apv_dsp.c | 136 +++++++++++++
> libavcodec/apv_dsp.h | 37 ++++
> libavcodec/apv_entropy.c | 200 +++++++++++++++++++
> 8 files changed, 859 insertions(+)
> create mode 100644 libavcodec/apv_decode.c
> create mode 100644 libavcodec/apv_decode.h
> create mode 100644 libavcodec/apv_dsp.c
> create mode 100644 libavcodec/apv_dsp.h
> create mode 100644 libavcodec/apv_entropy.c
>
> diff --git a/configure b/configure
> index ca404d2797..ee270b770c 100755
> --- a/configure
> +++ b/configure
> @@ -2935,6 +2935,7 @@ apng_decoder_select="inflate_wrapper"
> apng_encoder_select="deflate_wrapper llvidencdsp"
> aptx_encoder_select="audio_frame_queue"
> aptx_hd_encoder_select="audio_frame_queue"
> +apv_decoder_select="cbs_apv"
> asv1_decoder_select="blockdsp bswapdsp idctdsp"
> asv1_encoder_select="aandcttables bswapdsp fdctdsp pixblockdsp"
> asv2_decoder_select="blockdsp bswapdsp idctdsp"
> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
> index a5f5c4e904..e674671460 100644
> --- a/libavcodec/Makefile
> +++ b/libavcodec/Makefile
> @@ -244,6 +244,7 @@ OBJS-$(CONFIG_APTX_HD_DECODER) += aptxdec.o aptx.o
> OBJS-$(CONFIG_APTX_HD_ENCODER) += aptxenc.o aptx.o
> OBJS-$(CONFIG_APNG_DECODER) += png.o pngdec.o pngdsp.o
> OBJS-$(CONFIG_APNG_ENCODER) += png.o pngenc.o
> +OBJS-$(CONFIG_APV_DECODER) += apv_decode.o apv_entropy.o apv_dsp.o
> OBJS-$(CONFIG_ARBC_DECODER) += arbc.o
> OBJS-$(CONFIG_ARGO_DECODER) += argo.o
> OBJS-$(CONFIG_SSA_DECODER) += assdec.o ass.o
> diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
> index f10519617e..09f06c71d6 100644
> --- a/libavcodec/allcodecs.c
> +++ b/libavcodec/allcodecs.c
> @@ -47,6 +47,7 @@ extern const FFCodec ff_anm_decoder;
> extern const FFCodec ff_ansi_decoder;
> extern const FFCodec ff_apng_encoder;
> extern const FFCodec ff_apng_decoder;
> +extern const FFCodec ff_apv_decoder;
> extern const FFCodec ff_arbc_decoder;
> extern const FFCodec ff_argo_decoder;
> extern const FFCodec ff_asv1_encoder;
> diff --git a/libavcodec/apv_decode.c b/libavcodec/apv_decode.c
> new file mode 100644
> index 0000000000..0cc4f57dab
> --- /dev/null
> +++ b/libavcodec/apv_decode.c
> @@ -0,0 +1,403 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/mastering_display_metadata.h"
> +#include "libavutil/mem_internal.h"
> +#include "libavutil/pixdesc.h"
> +
> +#include "apv.h"
> +#include "apv_decode.h"
> +#include "apv_dsp.h"
> +#include "avcodec.h"
> +#include "cbs.h"
> +#include "cbs_apv.h"
> +#include "codec_internal.h"
> +#include "decode.h"
> +#include "thread.h"
> +
> +
> +typedef struct APVDecodeContext {
> + CodedBitstreamContext *cbc;
> + APVDSPContext dsp;
> +
> + CodedBitstreamFragment au;
> + APVDerivedTileInfo tile_info;
> +
> + APVVLCLUT decode_lut;
> +
> + AVFrame *output_frame;
> +} APVDecodeContext;
> +
> +static const enum AVPixelFormat apv_format_table[5][5] = {
> + { AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16 },
> + { 0 }, // 4:2:0 is not valid.
> + { AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_YUV422P16 },
> + { AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV444P10, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_YUV444P16 },
> + { AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_YUVA444P12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_YUVA444P16 },
> +};
> +
> +static int apv_decode_check_format(AVCodecContext *avctx,
> + const APVRawFrameHeader *header)
> +{
> + int err, bit_depth;
> +
> + avctx->profile = header->frame_info.profile_idc;
> + avctx->level = header->frame_info.level_idc;
> +
> + bit_depth = header->frame_info.bit_depth_minus8 + 8;
> + if (bit_depth < 8 || bit_depth > 16 || bit_depth % 2) {
> + avpriv_request_sample(avctx, "Bit depth %d", bit_depth);
> + return AVERROR_PATCHWELCOME;
> + }
> + avctx->pix_fmt =
> + apv_format_table[header->frame_info.chroma_format_idc][bit_depth - 4 >> 2];
> +
> + err = ff_set_dimensions(avctx,
> + FFALIGN(header->frame_info.frame_width, 16),
> + FFALIGN(header->frame_info.frame_height, 16));
> + if (err < 0) {
> + // Unsupported frame size.
> + return err;
> + }
> + avctx->width = header->frame_info.frame_width;
> + avctx->height = header->frame_info.frame_height;
> +
> + avctx->sample_aspect_ratio = (AVRational){ 1, 1 };
> +
> + avctx->color_primaries = header->color_primaries;
> + avctx->color_trc = header->transfer_characteristics;
> + avctx->colorspace = header->matrix_coefficients;
> + avctx->color_range = header->full_range_flag ? AVCOL_RANGE_JPEG
> + : AVCOL_RANGE_MPEG;
> + avctx->chroma_sample_location = AVCHROMA_LOC_TOPLEFT;
> +
> + avctx->refs = 0;
> + avctx->has_b_frames = 0;
> +
> + return 0;
> +}
> +
> +static av_cold int apv_decode_init(AVCodecContext *avctx)
> +{
> + APVDecodeContext *apv = avctx->priv_data;
> + int err;
> +
> + err = ff_cbs_init(&apv->cbc, AV_CODEC_ID_APV, avctx);
> + if (err < 0)
> + return err;
> +
> + ff_apv_entropy_build_decode_lut(&apv->decode_lut);
> +
> + ff_apv_dsp_init(&apv->dsp);
> +
> + if (avctx->extradata) {
> + av_log(avctx, AV_LOG_WARNING,
> + "APV does not support extradata.\n");
Either remove this in preparation for extradata to be
exported/generated, or only print it if avctx->internal->is_copy is
false. Otherwise it will be printed thread_count times when using
frame threading.
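As a rough sketch of the second option (only warn from the primary context), here is a standalone model. MockDecoder and both mock_* functions are hypothetical stand-ins, not FFmpeg types or API, and it assumes one primary context with the remaining thread contexts marked as copies:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the decoder context; not FFmpeg's real types. */
typedef struct MockDecoder {
    const unsigned char *extradata; /* NULL when the caller supplied none */
    int is_copy;                    /* nonzero for per-thread copies */
} MockDecoder;

/* Warn about extradata only from the primary (non-copy) context.
 * Returns 1 if a warning was printed, 0 otherwise. */
static int mock_init_warn_extradata(const MockDecoder *dec)
{
    assert(dec != NULL);
    if (dec->extradata && !dec->is_copy) {
        fprintf(stderr, "APV does not support extradata.\n");
        return 1;
    }
    return 0;
}

/* Model init being run once per thread context: the first context is the
 * primary one, the rest are marked as copies (an assumption of this sketch).
 * Returns how many warnings were printed in total. */
static int mock_count_warnings(int thread_count, const unsigned char *extradata)
{
    int warnings = 0;

    for (int i = 0; i < thread_count; i++) {
        MockDecoder dec = { extradata, i > 0 };
        warnings += mock_init_warn_extradata(&dec);
    }
    return warnings;
}
```

With eight thread contexts the warning shows up once rather than eight times, which is the point of the is_copy check.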
> + }
> +
> + return 0;
> +}
> +
> +static av_cold int apv_decode_close(AVCodecContext *avctx)
> +{
> + APVDecodeContext *apv = avctx->priv_data;
> +
> + ff_cbs_fragment_free(&apv->au);
> + ff_cbs_close(&apv->cbc);
> +
> + return 0;
> +}
> +
> +static int apv_decode_block(AVCodecContext *avctx,
> + void *output,
> + ptrdiff_t pitch,
> + GetBitContext *gbc,
> + APVEntropyState *entropy_state,
> + int bit_depth,
> + int qp_shift,
> + const uint16_t *qmatrix)
> +{
> + APVDecodeContext *apv = avctx->priv_data;
> + int err;
> +
> + LOCAL_ALIGNED_32(int16_t, coeff, [64]);
> +
> + err = ff_apv_entropy_decode_block(coeff, gbc, entropy_state);
> + if (err < 0)
> + return 0;
> +
> + apv->dsp.decode_transquant(output, pitch,
> + coeff, qmatrix,
> + bit_depth, qp_shift);
> +
> + return 0;
> +}
> +
> +static int apv_decode_tile_component(AVCodecContext *avctx, void *data,
> + int job, int thread)
> +{
> + APVRawFrame *input = data;
> + APVDecodeContext *apv = avctx->priv_data;
> + const CodedBitstreamAPVContext *apv_cbc = apv->cbc->priv_data;
> + const APVDerivedTileInfo *tile_info = &apv_cbc->tile_info;
> +
> + int tile_index = job / apv_cbc->num_comp;
> + int comp_index = job % apv_cbc->num_comp;
> +
> + const AVPixFmtDescriptor *pix_fmt_desc =
> + av_pix_fmt_desc_get(avctx->pix_fmt);
> +
> + int sub_w = comp_index == 0 ? 1 : pix_fmt_desc->log2_chroma_w + 1;
> + int sub_h = comp_index == 0 ? 1 : pix_fmt_desc->log2_chroma_h + 1;
> +
> + APVRawTile *tile = &input->tile[tile_index];
> +
> + int tile_y = tile_index / tile_info->tile_cols;
> + int tile_x = tile_index % tile_info->tile_cols;
> +
> + int tile_start_x = tile_info->col_starts[tile_x];
> + int tile_start_y = tile_info->row_starts[tile_y];
> +
> + int tile_width = tile_info->col_starts[tile_x + 1] - tile_start_x;
> + int tile_height = tile_info->row_starts[tile_y + 1] - tile_start_y;
> +
> + int tile_mb_width = tile_width / APV_MB_WIDTH;
> + int tile_mb_height = tile_height / APV_MB_HEIGHT;
> +
> + int blk_mb_width = 2 / sub_w;
> + int blk_mb_height = 2 / sub_h;
> +
> + int bit_depth;
> + int qp_shift;
> + LOCAL_ALIGNED_32(uint16_t, qmatrix_scaled, [64]);
> +
> + GetBitContext gbc;
> +
> + APVEntropyState entropy_state = {
> + .log_ctx = avctx,
> + .decode_lut = &apv->decode_lut,
> + .prev_dc = 0,
> + .prev_dc_diff = 20,
> + .prev_1st_ac_level = 0,
> + };
> +
> + init_get_bits8(&gbc, tile->tile_data[comp_index],
> + tile->tile_header.tile_data_size[comp_index]);
> +
> + // Combine the bitstream quantisation matrix with the qp scaling
> + // in advance. (Including qp_shift as well would overflow 16 bits.)
> + // Fix the row ordering at the same time.
> + {
> + static const uint8_t apv_level_scale[6] = { 40, 45, 51, 57, 64, 71 };
> + int qp = tile->tile_header.tile_qp[comp_index];
> + int level_scale = apv_level_scale[qp % 6];
> +
> + bit_depth = apv_cbc->bit_depth;
> + qp_shift = qp / 6;
> +
> + for (int y = 0; y < 8; y++) {
> + for (int x = 0; x < 8; x++)
> + qmatrix_scaled[y * 8 + x] = level_scale *
> + input->frame_header.quantization_matrix.q_matrix[comp_index][x][y];
> + }
> + }
> +
> + for (int mb_y = 0; mb_y < tile_mb_height; mb_y++) {
> + for (int mb_x = 0; mb_x < tile_mb_width; mb_x++) {
> + for (int blk_y = 0; blk_y < blk_mb_height; blk_y++) {
> + for (int blk_x = 0; blk_x < blk_mb_width; blk_x++) {
> + int frame_y = (tile_start_y +
> + APV_MB_HEIGHT * mb_y +
> + APV_TR_SIZE * blk_y) / sub_h;
> + int frame_x = (tile_start_x +
> + APV_MB_WIDTH * mb_x +
> + APV_TR_SIZE * blk_x) / sub_w;
> +
> + ptrdiff_t frame_pitch = apv->output_frame->linesize[comp_index];
> + uint8_t *block_start = apv->output_frame->data[comp_index] +
> + frame_y * frame_pitch + 2 * frame_x;
> +
> + apv_decode_block(avctx,
> + block_start, frame_pitch,
> + &gbc, &entropy_state,
> + bit_depth,
> + qp_shift,
> + qmatrix_scaled);
> + }
> + }
> + }
> + }
> +
> + av_log(avctx, AV_LOG_DEBUG,
> + "Decoded tile %d component %d: %dx%d MBs starting at (%d,%d)\n",
> + tile_index, comp_index, tile_mb_width, tile_mb_height,
> + tile_start_x, tile_start_y);
> +
> + return 0;
> +}
> +
> +static int apv_decode(AVCodecContext *avctx, AVFrame *output,
> + APVRawFrame *input)
> +{
> + APVDecodeContext *apv = avctx->priv_data;
> + const CodedBitstreamAPVContext *apv_cbc = apv->cbc->priv_data;
> + const APVDerivedTileInfo *tile_info = &apv_cbc->tile_info;
> + int err, job_count;
> +
> + err = apv_decode_check_format(avctx, &input->frame_header);
> + if (err < 0) {
> + av_log(avctx, AV_LOG_ERROR, "Unsupported format parameters.\n");
> + return err;
> + }
> +
> + err = ff_thread_get_buffer(avctx, output, 0);
> + if (err) {
> + av_log(avctx, AV_LOG_ERROR, "No output frame supplied.\n");
> + return err;
> + }
> +
> + apv->output_frame = output;
> +
> + // Each component within a tile is independent of every other,
> + // so we can decode all in parallel.
> + job_count = tile_info->num_tiles * apv_cbc->num_comp;
> +
> + avctx->execute2(avctx, apv_decode_tile_component,
> + input, NULL, job_count);
> +
> + return 0;
> +}
> +
> +static int apv_decode_metadata(AVCodecContext *avctx, AVFrame *frame,
> + const APVRawMetadata *md)
> +{
> + int err;
> +
> + for (int i = 0; i < md->metadata_count; i++) {
> + const APVRawMetadataPayload *pl = &md->payloads[i];
> +
> + switch (pl->payload_type) {
> + case APV_METADATA_MDCV:
> + {
> + const APVRawMetadataMDCV *mdcv = &pl->mdcv;
> + AVMasteringDisplayMetadata *mdm;
> +
> + err = ff_decode_mastering_display_new(avctx, frame, &mdm);
> + if (err < 0)
> + return err;
> +
> + if (mdm) {
> + for (int i = 0; i < 3; i++) {
> + mdm->display_primaries[i][0] =
> + av_make_q(mdcv->primary_chromaticity_x[i], 1 << 16);
> + mdm->display_primaries[i][1] =
> + av_make_q(mdcv->primary_chromaticity_y[i], 1 << 16);
> + }
> +
> + mdm->white_point[0] =
> + av_make_q(mdcv->white_point_chromaticity_x, 1 << 16);
> + mdm->white_point[1] =
> + av_make_q(mdcv->white_point_chromaticity_y, 1 << 16);
> +
> + mdm->max_luminance =
> + av_make_q(mdcv->max_mastering_luminance, 1 << 8);
> + mdm->min_luminance =
> + av_make_q(mdcv->min_mastering_luminance, 1 << 14);
> +
> + mdm->has_primaries = 1;
> + mdm->has_luminance = 1;
> + }
> + }
> + break;
> + case APV_METADATA_CLL:
> + {
> + const APVRawMetadataCLL *cll = &pl->cll;
> + AVContentLightMetadata *clm;
> +
> + err = ff_decode_content_light_new(avctx, frame, &clm);
> + if (err < 0)
> + return err;
> +
> + if (clm) {
> + clm->MaxCLL = cll->max_cll;
> + clm->MaxFALL = cll->max_fall;
> + }
> + }
> + break;
> + default:
> + // Ignore other types of metadata.
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int apv_decode_frame(AVCodecContext *avctx, AVFrame *frame,
> + int *got_frame, AVPacket *packet)
> +{
> + APVDecodeContext *apv = avctx->priv_data;
> + CodedBitstreamFragment *au = &apv->au;
> + int err;
> +
> + err = ff_cbs_read_packet(apv->cbc, au, packet);
> + if (err < 0) {
> + av_log(avctx, AV_LOG_ERROR, "Failed to read packet.\n");
> + return err;
> + }
> +
> + for (int i = 0; i < au->nb_units; i++) {
> + CodedBitstreamUnit *pbu = &au->units[i];
> +
> + switch (pbu->type) {
> + case APV_PBU_PRIMARY_FRAME:
If the other frame types are not going to be supported for now, then
define decompose_unit_types to ignore them.
> + err = apv_decode(avctx, frame, pbu->content);
> + if (err < 0)
> + return err;
> + *got_frame = 1;
> + break;
> + case APV_PBU_METADATA:
> + apv_decode_metadata(avctx, frame, pbu->content);
> + break;
> + case APV_PBU_ACCESS_UNIT_INFORMATION:
> + case APV_PBU_FILLER:
And add these too.
> + // Ignored by the decoder.
> + break;
> + default:
> + av_log(avctx, AV_LOG_WARNING,
Maybe VERBOSE instead? If a sample has unsupported frame types, this
will be spammed at standard log levels.
If anything, print at WARNING level only for PBU types not currently defined.
> + "Ignoring unsupported PBU type %d.\n", pbu->type);
> + }
> + }
> +
> + ff_cbs_fragment_reset(au);
> +
> + return packet->size;
> +}
> +
> +const FFCodec ff_apv_decoder = {
> + .p.name = "apv",
> + CODEC_LONG_NAME("Advanced Professional Video"),
> + .p.type = AVMEDIA_TYPE_VIDEO,
> + .p.id = AV_CODEC_ID_APV,
> + .priv_data_size = sizeof(APVDecodeContext),
> + .init = apv_decode_init,
> + .close = apv_decode_close,
> + FF_CODEC_DECODE_CB(apv_decode_frame),
> + .p.capabilities = AV_CODEC_CAP_DR1 |
> + AV_CODEC_CAP_SLICE_THREADS |
> + AV_CODEC_CAP_FRAME_THREADS,
> +};
* Re: [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer
2025-04-24 0:10 ` James Almer
@ 2025-04-24 20:15 ` Mark Thompson
0 siblings, 0 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-24 20:15 UTC (permalink / raw)
To: ffmpeg-devel
On 24/04/2025 01:10, James Almer wrote:
> On 4/23/2025 5:45 PM, Mark Thompson wrote:
>> +static int apv_read_header(AVFormatContext *s)
>> +{
>> + AVStream *st;
>> + GetByteContext gbc;
>> + APVHeaderInfo header;
>> + uint8_t buffer[28];
>> + uint32_t au_size, signature, pbu_size;
>> + int err, size;
>> +
>> + err = ffio_ensure_seekback(s->pb, sizeof(buffer));
>
> Isn't 28 bytes small enough that a backwards avio_seek() should always succeed?
I don't see any documentation to that effect and I'm not familiar with the details of the implementation.
There are various calls in other places in lavf which give single-digit numbers to ffio_ensure_seekback, so I'll keep it unless there is a guarantee somewhere that I've missed.
>> + if (err < 0)
>> + return err;
>> + size = avio_read(s->pb, buffer, sizeof(buffer));
>> + if (size < 0)
>> + return size;
>> +
>> + bytestream2_init(&gbc, buffer, sizeof(buffer));
>> +
>> + au_size = bytestream2_get_be32(&gbc);
>> + if (au_size < 24) {
>> + // Too small.
>> + return AVERROR_INVALIDDATA;
>> + }
>> + signature = bytestream2_get_be32(&gbc);
>> + if (signature != APV_SIGNATURE) {
>> + // Signature is mandatory.
>> + return AVERROR_INVALIDDATA;
>> + }
>> + pbu_size = bytestream2_get_be32(&gbc);
>> + if (pbu_size < 16) {
>> + // Too small.
>> + return AVERROR_INVALIDDATA;
>> + }
>> +
>> + err = apv_extract_header_info(&header, &gbc);
>> + if (err < 0)
>> + return err;
>> +
>> + st = avformat_new_stream(s, NULL);
>> + if (!st)
>> + return AVERROR(ENOMEM);
>> +
>> + st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
>> + st->codecpar->codec_id = AV_CODEC_ID_APV;
>> + st->codecpar->format = header.pixel_format;
>> + st->codecpar->profile = header.profile_idc;
>> + st->codecpar->level = header.level_idc;
>> + st->codecpar->width = header.frame_width;
>> + st->codecpar->height = header.frame_height;
>> +
>> + st->avg_frame_rate = (AVRational){ 30, 1 };
>> + avpriv_set_pts_info(st, 64, 1, 30);
>> +
>> + avio_seek(s->pb, -size, SEEK_CUR);
>> +
>> + return 0;
>> +}
>> +
>> +static int apv_read_packet(AVFormatContext *s, AVPacket *pkt)
>> +{
>> + uint32_t au_size;
>> + int ret;
>> +
>> + au_size = avio_rb32(s->pb);
>> + if (au_size == 0 && avio_feof(s->pb))
>> + return AVERROR_EOF;
>> + if (au_size < 16 || au_size > 1 << 24) {
>
> Might be a good idea to also check for the signature.
Fair, added.
Also made the upper limit a bit bigger because it feels a little too close if 16K video is ever a thing.
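For reference, a minimal sketch of what the combined size and signature check could look like on a buffer holding the first 8 bytes of an access unit (size, then signature, as in apv_read_header above). The demux_* names are hypothetical, the signature value is assumed to be "aPv1" read big-endian, and the raised upper limit is a placeholder since the new value is not stated here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values assumed for illustration only. */
#define DEMUX_SIGNATURE   0x61507631u /* "aPv1" as a big-endian word; assumed */
#define DEMUX_MIN_AU_SIZE 16u         /* lower bound used in apv_read_packet */
#define DEMUX_MAX_AU_SIZE (1u << 26)  /* hypothetical raised upper limit */

/* Read a 32-bit big-endian value from four bytes. */
static uint32_t demux_rb32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] <<  8 | (uint32_t)p[3];
}

/* Check the AU size field and the signature that follows it.
 * Returns 0 if both look valid, -1 otherwise. */
static int demux_check_au_prefix(const uint8_t *buf, size_t len)
{
    uint32_t au_size, signature;

    assert(buf != NULL);
    if (len < 8)
        return -1; /* not enough bytes for size + signature */

    au_size   = demux_rb32(buf);
    signature = demux_rb32(buf + 4);

    if (au_size < DEMUX_MIN_AU_SIZE || au_size > DEMUX_MAX_AU_SIZE)
        return -1;
    if (signature != DEMUX_SIGNATURE)
        return -1;
    return 0;
}
```

In the demuxer itself the signature would have to be read or peeked from the AVIOContext rather than a flat buffer; the sketch only shows the ordering of the checks.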
>> + av_log(s, AV_LOG_ERROR, "APV AU is bad\n");
>> + return AVERROR_INVALIDDATA;
>> + }
>> +
>> + ret = av_get_packet(s->pb, pkt, au_size);
>> + pkt->flags = AV_PKT_FLAG_KEY;
>> +
>> + return ret;
>> +}
Thanks,
- Mark
* Re: [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support
2025-04-24 0:02 ` James Almer
@ 2025-04-24 20:16 ` Mark Thompson
0 siblings, 0 replies; 17+ messages in thread
From: Mark Thompson @ 2025-04-24 20:16 UTC (permalink / raw)
To: ffmpeg-devel
On 24/04/2025 01:02, James Almer wrote:
> On 4/23/2025 5:45 PM, Mark Thompson wrote:
>> +static int cbs_apv_split_fragment(CodedBitstreamContext *ctx,
>> + CodedBitstreamFragment *frag,
>> + int header)
>> +{
>> + uint8_t *data = frag->data;
>> + size_t size = frag->data_size;
>> + uint32_t signature;
>> + int err, trace;
>
> To prepare CBS for the presence of extradata, make this function a no-op when header != 0.
Yep, done.
I've also added the check that we actually have 4 bytes available for the signature (missed when I added it to the AVPacket).
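Roughly, the two changes amount to the following shape. This is a standalone sketch rather than the actual cbs_apv.c code: frag_split_prefix and its return convention are hypothetical, and the signature constant is assumed to be "aPv1" read big-endian:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAG_SIGNATURE 0x61507631u /* "aPv1" as a big-endian word; assumed */

/* Read a 32-bit big-endian value from four bytes. */
static uint32_t frag_rb32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] <<  8 | (uint32_t)p[3];
}

/* Validate the start of a fragment before splitting it into PBUs.
 * On success returns 0 and sets *payload_offset to where the PBUs start
 * (0 when nothing is consumed). Returns -1 on invalid data. */
static int frag_split_prefix(const uint8_t *data, size_t size, int header,
                             size_t *payload_offset)
{
    assert(payload_offset != NULL);
    *payload_offset = 0;

    if (header)
        return 0;  /* extradata: no-op, nothing to split */

    if (size < 4)
        return -1; /* too short to even hold the signature */

    if (frag_rb32(data) != FRAG_SIGNATURE)
        return -1; /* bad signature */

    *payload_offset = 4;
    return 0;
}
```

The size check has to come before the 32-bit read, otherwise a packet shorter than four bytes would be read out of bounds.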
>> +
>> + // Don't include parsing here in trace output.
>> + trace = ctx->trace_enable;
>> + ctx->trace_enable = 0;
>> +
>> + signature = AV_RB32(data);
>> + if (signature != APV_SIGNATURE) {
>> + av_log(ctx->log_ctx, AV_LOG_ERROR,
>> + "Invalid APV access unit: bad signature %08x.\n",
>> + signature);
>> + err = AVERROR_INVALIDDATA;
>> + goto fail;
>> + }
>> + data += 4;
>> + size -= 4;
>> +
>> + while (size > 0) {
>> + GetBitContext gbc;
>> + uint32_t pbu_size;
>> + APVRawPBUHeader pbu_header;
>> +
>> + if (size < 8) {
>> + av_log(ctx->log_ctx, AV_LOG_ERROR, "Invalid PBU: "
>> + "fragment too short (%"SIZE_SPECIFIER" bytes).\n",
>> + size);
>> + err = AVERROR_INVALIDDATA;
>> + goto fail;
>> + }
Thanks,
- Mark
* Re: [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64
2025-04-24 2:55 ` James Almer
@ 2025-04-24 20:37 ` Mark Thompson
2025-04-24 21:41 ` James Almer
0 siblings, 1 reply; 17+ messages in thread
From: Mark Thompson @ 2025-04-24 20:37 UTC (permalink / raw)
To: ffmpeg-devel
On 24/04/2025 03:55, James Almer wrote:
> On 4/23/2025 5:45 PM, Mark Thompson wrote:
>> Typical checkasm result on Alder Lake:
>>
>> decode_transquant_8_c: 464.2 ( 1.00x)
>> decode_transquant_8_avx2: 86.2 ( 5.38x)
>> decode_transquant_10_c: 481.6 ( 1.00x)
>> decode_transquant_10_avx2: 83.5 ( 5.77x)
>> ---
>> libavcodec/apv_dsp.c | 4 +
>> libavcodec/apv_dsp.h | 2 +
>> libavcodec/x86/Makefile | 2 +
>> libavcodec/x86/apv_dsp.asm | 311 ++++++++++++++++++++++++++++++++++
>> libavcodec/x86/apv_dsp_init.c | 44 +++++
>> tests/checkasm/Makefile | 1 +
>> tests/checkasm/apv_dsp.c | 109 ++++++++++++
>> tests/checkasm/checkasm.c | 3 +
>> tests/checkasm/checkasm.h | 1 +
>> tests/fate/checkasm.mak | 1 +
>> 10 files changed, 478 insertions(+)
>> create mode 100644 libavcodec/x86/apv_dsp.asm
>> create mode 100644 libavcodec/x86/apv_dsp_init.c
>> create mode 100644 tests/checkasm/apv_dsp.c
>>
>> ...
>> diff --git a/libavcodec/x86/apv_dsp.asm b/libavcodec/x86/apv_dsp.asm
>> new file mode 100644
>> index 0000000000..12d96481de
>> --- /dev/null
>> +++ b/libavcodec/x86/apv_dsp.asm
>> @@ -0,0 +1,311 @@
>> +;************************************************************************
>> +;* This file is part of FFmpeg.
>> +;*
>> +;* FFmpeg is free software; you can redistribute it and/or
>> +;* modify it under the terms of the GNU Lesser General Public
>> +;* License as published by the Free Software Foundation; either
>> +;* version 2.1 of the License, or (at your option) any later version.
>> +;*
>> +;* FFmpeg is distributed in the hope that it will be useful,
>> +;* but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +;* Lesser General Public License for more details.
>> +;*
>> +;* You should have received a copy of the GNU Lesser General Public
>> +;* License along with FFmpeg; if not, write to the Free Software
>> +;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> +;******************************************************************************
>> +
>> +%include "libavutil/x86/x86util.asm"
>> +
>> +%if ARCH_X86_64
>> +
>> +SECTION_RODATA 32
>> +
>> +; Full matrix for row transform.
>> +const tmatrix_row
>> + dw 64, 89, 84, 75, 64, 50, 35, 18
>> + dw 64, -18, -84, 50, 64, -75, -35, 89
>> + dw 64, 75, 35, -18, -64, -89, -84, -50
>> + dw 64, -50, -35, 89, -64, -18, 84, -75
>> + dw 64, 50, -35, -89, -64, 18, 84, 75
>> + dw 64, -75, 35, 18, -64, 89, -84, 50
>> + dw 64, 18, -84, -50, 64, 75, -35, -89
>> + dw 64, -89, 84, -75, 64, -50, 35, -18
>> +
>> +; Constant pairs for broadcast in column transform.
>> +const tmatrix_col_even
>> + dw 64, 64, 64, -64
>> + dw 84, 35, 35, -84
>> +const tmatrix_col_odd
>> + dw 89, 75, 50, 18
>> + dw 75, -18, -89, -50
>> + dw 50, -89, 18, 75
>> + dw 18, -50, 75, -89
>> +
>> +; Memory targets for vpbroadcastd (register version requires AVX512).
>> +cextern pd_1
>> +const sixtyfour
>> + dd 64
>> +
>> +SECTION .text
>> +
>> +; void ff_apv_decode_transquant_avx2(void *output,
>> +; ptrdiff_t pitch,
>> +; const int16_t *input,
>> +; const int16_t *qmatrix,
>> +; int bit_depth,
>> +; int qp_shift);
>> +
>> +INIT_YMM avx2
>> +
>> +cglobal apv_decode_transquant, 5, 7, 16, output, pitch, input, qmatrix, bit_depth, qp_shift, tmp
>> +
>> + ; Load input and dequantise
>> +
>> + vpbroadcastd m10, [pd_1]
>> + lea tmpd, [bit_depthd - 2]
>> + movd xm8, qp_shiftm
>> + movd xm9, tmpd
>> + vpslld m10, m10, xm9
>> + vpsrld m10, m10, 1
>> +
>> + ; m8 = scalar qp_shift
>> + ; m9 = scalar bd_shift
>> + ; m10 = vector 1 << (bd_shift - 1)
>> + ; m11 = qmatrix load
>> +
>> +%macro LOAD_AND_DEQUANT 2 ; (xmm input, constant offset)
>> + vpmovsxwd m%1, [inputq + %2]
>> + vpmovsxwd m11, [qmatrixq + %2]
>> + vpmaddwd m%1, m%1, m11
>> + vpslld m%1, m%1, xm8
>> + vpaddd m%1, m%1, m10
>> + vpsrad m%1, m%1, xm9
>> + vpackssdw m%1, m%1, m%1
>> +%endmacro
>> +
>> + LOAD_AND_DEQUANT 0, 0x00
>> + LOAD_AND_DEQUANT 1, 0x10
>> + LOAD_AND_DEQUANT 2, 0x20
>> + LOAD_AND_DEQUANT 3, 0x30
>> + LOAD_AND_DEQUANT 4, 0x40
>> + LOAD_AND_DEQUANT 5, 0x50
>> + LOAD_AND_DEQUANT 6, 0x60
>> + LOAD_AND_DEQUANT 7, 0x70
>> +
>> + ; mN = row N words 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
>> +
>> + ; Transform columns
>> + ; This applies a 1-D DCT butterfly
>> +
>> + vpunpcklwd m12, m0, m4
>> + vpunpcklwd m13, m2, m6
>> + vpunpcklwd m14, m1, m3
>> + vpunpcklwd m15, m5, m7
>> +
>> + ; m12 = rows 0 and 4 interleaved
>> + ; m13 = rows 2 and 6 interleaved
>> + ; m14 = rows 1 and 3 interleaved
>> + ; m15 = rows 5 and 7 interleaved
>> +
>> + lea tmpq, [tmatrix_col_even]
>> + vpbroadcastd m0, [tmpq + 0x00]
>> + vpbroadcastd m1, [tmpq + 0x04]
>> + vpbroadcastd m2, [tmpq + 0x08]
>> + vpbroadcastd m3, [tmpq + 0x0c]
>
> How about
>
> vbroadcasti128 m0, [tmatrix_col_even]
> pshufd m1, m0, q1111
> pshufd m2, m0, q2222
> pshufd m3, m0, q3333
> pshufd m0, m0, q0000
>
> So you remove the lea, and do a single load from memory within a single cross-lane instruction, instead of four of each.
>
> Same below.
The broadcasts from memory are not slow, they don't read from either lane.
I can't measure a difference but instruction tables have vpbroadcastd as 1/3 and pshufd as 1/2 so I think I'll take that as a tie-break? (lea is free and they will all load together, the vbroadcasti128 load is unaligned but pretty sure that is irrelevant.)
>> +
>> + vpmaddwd m4, m12, m0
>> + vpmaddwd m5, m12, m1
>> + vpmaddwd m6, m13, m2
>> + vpmaddwd m7, m13, m3
>> + vpaddd m8, m4, m6
>> + vpaddd m9, m5, m7
>> + vpsubd m10, m5, m7
>> + vpsubd m11, m4, m6
>> +
>> + lea tmpq, [tmatrix_col_odd]
>> + vpbroadcastd m0, [tmpq + 0x00]
>> + vpbroadcastd m1, [tmpq + 0x04]
>> + vpbroadcastd m2, [tmpq + 0x08]
>> + vpbroadcastd m3, [tmpq + 0x0c]
>> +
>> + vpmaddwd m4, m14, m0
>> + vpmaddwd m5, m15, m1
>> + vpmaddwd m6, m14, m2
>> + vpmaddwd m7, m15, m3
>> + vpaddd m12, m4, m5
>> + vpaddd m13, m6, m7
>> +
>> + vpbroadcastd m0, [tmpq + 0x10]
>> + vpbroadcastd m1, [tmpq + 0x14]
>> + vpbroadcastd m2, [tmpq + 0x18]
>> + vpbroadcastd m3, [tmpq + 0x1c]
>> +
>> + vpmaddwd m4, m14, m0
>> + vpmaddwd m5, m15, m1
>> + vpmaddwd m6, m14, m2
>> + vpmaddwd m7, m15, m3
>> + vpaddd m14, m4, m5
>> + vpaddd m15, m6, m7
>> +
>> + vpaddd m0, m8, m12
>> + vpaddd m1, m9, m13
>> + vpaddd m2, m10, m14
>> + vpaddd m3, m11, m15
>> + vpsubd m4, m11, m15
>> + vpsubd m5, m10, m14
>> + vpsubd m6, m9, m13
>> + vpsubd m7, m8, m12
>> +
>> + ; Mid-transform normalisation
>> + ; Note that outputs here are fitted to 16 bits
>> +
>> + vpbroadcastd m8, [sixtyfour]
>> +
>> +%macro NORMALISE 1
>> + vpaddd m%1, m%1, m8
>> + vpsrad m%1, m%1, 7
>> + vpackssdw m%1, m%1, m%1
>> + vpermq m%1, m%1, q3120
>> +%endmacro
>> +
>> + NORMALISE 0
>> + NORMALISE 1
>> + NORMALISE 2
>> + NORMALISE 3
>> + NORMALISE 4
>> + NORMALISE 5
>> + NORMALISE 6
>> + NORMALISE 7
>> +
>> + ; mN = row N words 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
>> +
>> + ; Transform rows
>> + ; This multiplies the rows directly by the transform matrix,
>> + ; avoiding the need to transpose anything
>> +
>> + lea tmpq, [tmatrix_row]
>> + mova m12, [tmpq + 0x00]
>> + mova m13, [tmpq + 0x20]
>> + mova m14, [tmpq + 0x40]
>> + mova m15, [tmpq + 0x60]
>> +
>> +%macro TRANS_ROW_STEP 1
>> + vpmaddwd m8, m%1, m12
>> + vpmaddwd m9, m%1, m13
>> + vpmaddwd m10, m%1, m14
>> + vpmaddwd m11, m%1, m15
>> + vphaddd m8, m8, m9
>> + vphaddd m10, m10, m11
>> + vphaddd m%1, m8, m10
>> +%endmacro
>> +
>> + TRANS_ROW_STEP 0
>> + TRANS_ROW_STEP 1
>> + TRANS_ROW_STEP 2
>> + TRANS_ROW_STEP 3
>> + TRANS_ROW_STEP 4
>> + TRANS_ROW_STEP 5
>> + TRANS_ROW_STEP 6
>> + TRANS_ROW_STEP 7
>> +
>> + ; Renormalise, clip and store output
>> +
>> + vpbroadcastd m14, [pd_1]
>> + mov tmpd, 20
>> + sub tmpd, bit_depthd
>> + movd xm9, tmpd
>> + dec tmpd
>> + movd xm13, tmpd
>> + movd xm15, bit_depthd
>> + vpslld m8, m14, xm13
>> + vpslld m12, m14, xm15
>> + vpsrld m10, m12, 1
>> + vpsubd m12, m12, m14
>> + vpxor m11, m11, m11
>> +
>> + ; m8 = vector 1 << (bd_shift - 1)
>> + ; m9 = scalar bd_shift
>> + ; m10 = vector 1 << (bit_depth - 1)
>> + ; m11 = zero
>> + ; m12 = vector (1 << bit_depth) - 1
>> +
>> + cmp bit_depthd, 8
>> + jne store_10
>> +
>> + lea tmpq, [pitchq + 2*pitchq]
>> +%macro NORMALISE_AND_STORE_8 4
>> + vpaddd m%1, m%1, m8
>> + vpaddd m%2, m%2, m8
>> + vpaddd m%3, m%3, m8
>> + vpaddd m%4, m%4, m8
>> + vpsrad m%1, m%1, xm9
>> + vpsrad m%2, m%2, xm9
>> + vpsrad m%3, m%3, xm9
>> + vpsrad m%4, m%4, xm9
>> + vpaddd m%1, m%1, m10
>> + vpaddd m%2, m%2, m10
>> + vpaddd m%3, m%3, m10
>> + vpaddd m%4, m%4, m10
>> + ; m%1 = A0-3 A4-7
>> + ; m%2 = B0-3 B4-7
>> + ; m%3 = C0-3 C4-7
>> + ; m%4 = D0-3 D4-7
>> + vpackusdw m%1, m%1, m%2
>> + vpackusdw m%3, m%3, m%4
>> + ; m%1 = A0-3 B0-3 A4-7 B4-7
>> + ; m%2 = C0-3 D0-3 C4-7 D4-7
>> + vpermq m%1, m%1, q3120
>> + vpermq m%2, m%3, q3120
>> + ; m%1 = A0-3 A4-7 B0-3 B4-7
>> + ; m%2 = C0-3 C4-7 D0-3 D4-7
>> + vpackuswb m%1, m%1, m%2
>> + ; m%1 = A0-3 A4-7 C0-3 C4-7 B0-3 B4-7 D0-3 D4-7
>> + vextracti128 xm%2, m%1, 1
>> + vpsrldq xm%3, xm%1, 8
>> + vpsrldq xm%4, xm%2, 8
>> + vmovq [outputq], xm%1
>> + vmovq [outputq + pitchq], xm%2
>> + vmovq [outputq + 2*pitchq], xm%3
>> + vmovq [outputq + tmpq], xm%4
>
> vextracti128 xm%2, m%1, 1
> vmovq [outputq], xm%1
> vmovq [outputq + pitchq], xm%2
> vpextrq [outputq + 2*pitchq], xm%1, 1
> vpextrq [outputq + tmpq], xm%2, 1
>
> Saves you two vpsrldq, and may or may not be faster. Feel free to bench or ignore.
My measurements are not accurate enough to see a difference but instruction tables say it should win, so I'll take it.
>> + lea outputq, [outputq + 4*pitchq]
>> +%endmacro
>> +
>> + NORMALISE_AND_STORE_8 0, 1, 2, 3
>> + NORMALISE_AND_STORE_8 4, 5, 6, 7
>> +
>> + RET
>> +
>> +store_10:
>> +
>> +%macro NORMALISE_AND_STORE_10 2
>> + vpaddd m%1, m%1, m8
>> + vpaddd m%2, m%2, m8
>> + vpsrad m%1, m%1, xm9
>> + vpsrad m%2, m%2, xm9
>> + vpaddd m%1, m%1, m10
>> + vpaddd m%2, m%2, m10
>> + vpmaxsd m%1, m%1, m11
>> + vpmaxsd m%2, m%2, m11
>> + vpminsd m%1, m%1, m12
>> + vpminsd m%2, m%2, m12
>> + ; m%1 = A0-3 A4-7
>> + ; m%2 = B0-3 B4-7
>> + vpackusdw m%1, m%1, m%2
>> + ; m%1 = A0-3 B0-3 A4-7 B4-7
>> + vpermq m%1, m%1, q3120
>> + ; m%1 = A0-3 A4-7 B0-3 B4-7
>> + vextracti128 [outputq], m%1, 0
>
> mova [outputq], xm%1
>
> There's pretty much never a good reason to use extract for the lower bits.
Yep. Nicely symmetrical code is not enough of a good reason :(
>> + vextracti128 [outputq + pitchq], m%1, 1
>> + lea outputq, [outputq + 2*pitchq]
>> +%endmacro
>> +
>> + NORMALISE_AND_STORE_10 0, 1
>> + NORMALISE_AND_STORE_10 2, 3
>> + NORMALISE_AND_STORE_10 4, 5
>> + NORMALISE_AND_STORE_10 6, 7
>> +
>> + RET
>> +
>> +%endif ; ARCH_X86_64
>> ...
>> diff --git a/tests/checkasm/apv_dsp.c b/tests/checkasm/apv_dsp.c
>> new file mode 100644
>> index 0000000000..b3adb8ca06
>> --- /dev/null
>> +++ b/tests/checkasm/apv_dsp.c
>> @@ -0,0 +1,109 @@
>> +/*
>> + * This file is part of FFmpeg.
>> + *
>> + * FFmpeg is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * FFmpeg is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with FFmpeg; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> + */
>> +
>> +#include <stdint.h>
>> +
>> +#include "checkasm.h"
>> +
>> +#include "libavutil/attributes.h"
>> +#include "libavutil/mem_internal.h"
>> +#include "libavcodec/apv_dsp.h"
>> +
>> +
>> +static void check_decode_transquant_8(void)
>> +{
>> + LOCAL_ALIGNED_16(int16_t, input, [64]);
>> + LOCAL_ALIGNED_16(int16_t, qmatrix, [64]);
>> + LOCAL_ALIGNED_16(uint8_t, new_output, [64]);
>> + LOCAL_ALIGNED_16(uint8_t, ref_output, [64]);
>> +
>> + declare_func(void,
>> + uint8_t *output,
>
> nit: this parameter is void*, so maybe just use that instead.
Fair. I was wondering whether to template two functions, which would let each keep the type, but the fact that the branch is completely predictable makes that seem like a waste.
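
A minimal scalar sketch of the single-function approach, assuming a hypothetical helper that receives already-reconstructed samples: the output is a void pointer, the pitch is in bytes (matching the 8 and 16 passed by the tests here), and one branch on bit_depth picks the store width. The helper name and the clipping range are illustrative, not the actual code in apv_dsp.c.

```c
#include <stddef.h>
#include <stdint.h>

static int clip_int(int v, int lo, int hi)
{
    return v < lo ? lo : v > hi ? hi : v;
}

/* Hypothetical helper: write an 8x8 block of reconstructed samples.
 * pitch is in bytes; bit_depth selects 8-bit or 16-bit stores. */
void store_block_8x8(void *output, ptrdiff_t pitch,
                     const int32_t *samples, int bit_depth)
{
    const int max = (1 << bit_depth) - 1;

    for (int y = 0; y < 8; y++) {
        uint8_t *row = (uint8_t *)output + y * pitch;

        /* The same way for every call at a given depth, so the
         * branch is trivially predictable. */
        if (bit_depth == 8) {
            for (int x = 0; x < 8; x++)
                row[x] = (uint8_t)clip_int(samples[8 * y + x], 0, max);
        } else {
            uint16_t *row16 = (uint16_t *)row;
            for (int x = 0; x < 8; x++)
                row16[x] = (uint16_t)clip_int(samples[8 * y + x], 0, max);
        }
    }
}
```

A templated pair of functions would keep uint8_t * and uint16_t * in the signatures, at the cost of duplicating everything around the store.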
>> + ptrdiff_t pitch,
>> + const int16_t *input,
>> + const int16_t *qmatrix,
>> + int bit_depth,
>> + int qp_shift);
>> +
>> + for (int i = 0; i < 64; i++) {
>> + // Any signed 12-bit integer.
>> + input[i] = rnd() % 2048 - 1024;
>> +
>> + // qmatrix input is premultiplied by level_scale, so
>> + // range is 1 to 255 * 71. Interesting values are all
>> + // at the low end of that, though.
>> + qmatrix[i] = rnd() % 16 + 16;
>> + }
>> +
>> + call_ref(ref_output, 8, input, qmatrix, 8, 4);
>> + call_new(new_output, 8, input, qmatrix, 8, 4);
>> +
>> + if (memcmp(new_output, ref_output, 64 * sizeof(*ref_output)))
>> + fail();
>> +
>> + bench_new(new_output, 8, input, qmatrix, 8, 4);
>> +}
>> +
>> +static void check_decode_transquant_10(void)
>> +{
>> + LOCAL_ALIGNED_16( int16_t, input, [64]);
>> + LOCAL_ALIGNED_16( int16_t, qmatrix, [64]);
>> + LOCAL_ALIGNED_16(uint16_t, new_output, [64]);
>> + LOCAL_ALIGNED_16(uint16_t, ref_output, [64]);
>> +
>> + declare_func(void,
>> + uint16_t *output,
>
> Ditto.
>
>> + ptrdiff_t pitch,
>> + const int16_t *input,
>> + const int16_t *qmatrix,
>> + int bit_depth,
>> + int qp_shift);
>> +
>> + for (int i = 0; i < 64; i++) {
>> + // Any signed 14-bit integer.
>> + input[i] = rnd() % 16384 - 8192;
>> +
>> + // qmatrix input is premultiplied by level_scale, so
>> + // range is 1 to 255 * 71. Interesting values are all
>> + // at the low end of that, though.
>> + qmatrix[i] = 16; //rnd() % 16 + 16;
>> + }
>> +
>> + call_ref(ref_output, 16, input, qmatrix, 10, 4);
>> + call_new(new_output, 16, input, qmatrix, 10, 4);
>> +
>> + if (memcmp(new_output, ref_output, 64 * sizeof(*ref_output)))
>> + fail();
>> +
>> + bench_new(new_output, 16, input, qmatrix, 10, 4);
>> +}
>> +
>> +void checkasm_check_apv_dsp(void)
>> +{
>> + APVDSPContext dsp;
>> +
>> + ff_apv_dsp_init(&dsp);
>> +
>> + if (check_func(dsp.decode_transquant, "decode_transquant_8"))
>> + check_decode_transquant_8();
>> +
>> + if (check_func(dsp.decode_transquant, "decode_transquant_10"))
>> + check_decode_transquant_10();
>> +
>> + report("apv_dsp");
>
> report("decode_transquant");
>
> So you get
>
> - apv_dsp.decode_transquant [OK]
>
> instead of
>
> - apv_dsp.apv_dsp [OK]
>
> In the checkasm output.
Ah, I hadn't connected that output to the string here. Fixed.
>> +}
>> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
>> index 412b8b2cd1..3bb82ed0e5 100644
>> --- a/tests/checkasm/checkasm.c
>> +++ b/tests/checkasm/checkasm.c
>> @@ -129,6 +129,9 @@ static const struct {
>> #if CONFIG_ALAC_DECODER
>> { "alacdsp", checkasm_check_alacdsp },
>> #endif
>> + #if CONFIG_APV_DECODER
>> + { "apv_dsp", checkasm_check_apv_dsp },
>> + #endif
>> #if CONFIG_AUDIODSP
>> { "audiodsp", checkasm_check_audiodsp },
>> #endif
>> ...
Thanks,
- Mark
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64
2025-04-24 20:37 ` Mark Thompson
@ 2025-04-24 21:41 ` James Almer
0 siblings, 0 replies; 17+ messages in thread
From: James Almer @ 2025-04-24 21:41 UTC (permalink / raw)
To: ffmpeg-devel
On 4/24/2025 5:37 PM, Mark Thompson wrote:
> On 24/04/2025 03:55, James Almer wrote:
>> On 4/23/2025 5:45 PM, Mark Thompson wrote:
>>> Typical checkasm result on Alder Lake:
>>>
>>> decode_transquant_8_c: 464.2 ( 1.00x)
>>> decode_transquant_8_avx2: 86.2 ( 5.38x)
>>> decode_transquant_10_c: 481.6 ( 1.00x)
>>> decode_transquant_10_avx2: 83.5 ( 5.77x)
>>> ---
>>> libavcodec/apv_dsp.c | 4 +
>>> libavcodec/apv_dsp.h | 2 +
>>> libavcodec/x86/Makefile | 2 +
>>> libavcodec/x86/apv_dsp.asm | 311 ++++++++++++++++++++++++++++++++++
>>> libavcodec/x86/apv_dsp_init.c | 44 +++++
>>> tests/checkasm/Makefile | 1 +
>>> tests/checkasm/apv_dsp.c | 109 ++++++++++++
>>> tests/checkasm/checkasm.c | 3 +
>>> tests/checkasm/checkasm.h | 1 +
>>> tests/fate/checkasm.mak | 1 +
>>> 10 files changed, 478 insertions(+)
>>> create mode 100644 libavcodec/x86/apv_dsp.asm
>>> create mode 100644 libavcodec/x86/apv_dsp_init.c
>>> create mode 100644 tests/checkasm/apv_dsp.c
>>>
>>> ...
>>> diff --git a/libavcodec/x86/apv_dsp.asm b/libavcodec/x86/apv_dsp.asm
>>> new file mode 100644
>>> index 0000000000..12d96481de
>>> --- /dev/null
>>> +++ b/libavcodec/x86/apv_dsp.asm
>>> @@ -0,0 +1,311 @@
>>> +;************************************************************************
>>> +;* This file is part of FFmpeg.
>>> +;*
>>> +;* FFmpeg is free software; you can redistribute it and/or
>>> +;* modify it under the terms of the GNU Lesser General Public
>>> +;* License as published by the Free Software Foundation; either
>>> +;* version 2.1 of the License, or (at your option) any later version.
>>> +;*
>>> +;* FFmpeg is distributed in the hope that it will be useful,
>>> +;* but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>> +;* Lesser General Public License for more details.
>>> +;*
>>> +;* You should have received a copy of the GNU Lesser General Public
>>> +;* License along with FFmpeg; if not, write to the Free Software
>>> +;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> +;******************************************************************************
>>> +
>>> +%include "libavutil/x86/x86util.asm"
>>> +
>>> +%if ARCH_X86_64
>>> +
>>> +SECTION_RODATA 32
>>> +
>>> +; Full matrix for row transform.
>>> +const tmatrix_row
>>> + dw 64, 89, 84, 75, 64, 50, 35, 18
>>> + dw 64, -18, -84, 50, 64, -75, -35, 89
>>> + dw 64, 75, 35, -18, -64, -89, -84, -50
>>> + dw 64, -50, -35, 89, -64, -18, 84, -75
>>> + dw 64, 50, -35, -89, -64, 18, 84, 75
>>> + dw 64, -75, 35, 18, -64, 89, -84, 50
>>> + dw 64, 18, -84, -50, 64, 75, -35, -89
>>> + dw 64, -89, 84, -75, 64, -50, 35, -18
>>> +
>>> +; Constant pairs for broadcast in column transform.
>>> +const tmatrix_col_even
>>> + dw 64, 64, 64, -64
>>> + dw 84, 35, 35, -84
>>> +const tmatrix_col_odd
>>> + dw 89, 75, 50, 18
>>> + dw 75, -18, -89, -50
>>> + dw 50, -89, 18, 75
>>> + dw 18, -50, 75, -89
>>> +
>>> +; Memory targets for vpbroadcastd (register version requires AVX512).
>>> +cextern pd_1
>>> +const sixtyfour
>>> + dd 64
>>> +
>>> +SECTION .text
>>> +
>>> +; void ff_apv_decode_transquant_avx2(void *output,
>>> +; ptrdiff_t pitch,
>>> +; const int16_t *input,
>>> +; const int16_t *qmatrix,
>>> +; int bit_depth,
>>> +; int qp_shift);
>>> +
>>> +INIT_YMM avx2
>>> +
>>> +cglobal apv_decode_transquant, 5, 7, 16, output, pitch, input, qmatrix, bit_depth, qp_shift, tmp
>>> +
>>> + ; Load input and dequantise
>>> +
>>> + vpbroadcastd m10, [pd_1]
>>> + lea tmpd, [bit_depthd - 2]
>>> + movd xm8, qp_shiftm
>>> + movd xm9, tmpd
>>> + vpslld m10, m10, xm9
>>> + vpsrld m10, m10, 1
>>> +
>>> + ; m8 = scalar qp_shift
>>> + ; m9 = scalar bd_shift
>>> + ; m10 = vector 1 << (bd_shift - 1)
>>> + ; m11 = qmatrix load
>>> +
>>> +%macro LOAD_AND_DEQUANT 2 ; (xmm input, constant offset)
>>> + vpmovsxwd m%1, [inputq + %2]
>>> + vpmovsxwd m11, [qmatrixq + %2]
>>> + vpmaddwd m%1, m%1, m11
>>> + vpslld m%1, m%1, xm8
>>> + vpaddd m%1, m%1, m10
>>> + vpsrad m%1, m%1, xm9
>>> + vpackssdw m%1, m%1, m%1
>>> +%endmacro
>>> +
>>> + LOAD_AND_DEQUANT 0, 0x00
>>> + LOAD_AND_DEQUANT 1, 0x10
>>> + LOAD_AND_DEQUANT 2, 0x20
>>> + LOAD_AND_DEQUANT 3, 0x30
>>> + LOAD_AND_DEQUANT 4, 0x40
>>> + LOAD_AND_DEQUANT 5, 0x50
>>> + LOAD_AND_DEQUANT 6, 0x60
>>> + LOAD_AND_DEQUANT 7, 0x70
>>> +
>>> + ; mN = row N words 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
>>> +
>>> + ; Transform columns
>>> + ; This applies a 1-D DCT butterfly
>>> +
>>> + vpunpcklwd m12, m0, m4
>>> + vpunpcklwd m13, m2, m6
>>> + vpunpcklwd m14, m1, m3
>>> + vpunpcklwd m15, m5, m7
>>> +
>>> + ; m12 = rows 0 and 4 interleaved
>>> + ; m13 = rows 2 and 6 interleaved
>>> + ; m14 = rows 1 and 3 interleaved
>>> + ; m15 = rows 5 and 7 interleaved
>>> +
>>> + lea tmpq, [tmatrix_col_even]
>>> + vpbroadcastd m0, [tmpq + 0x00]
>>> + vpbroadcastd m1, [tmpq + 0x04]
>>> + vpbroadcastd m2, [tmpq + 0x08]
>>> + vpbroadcastd m3, [tmpq + 0x0c]
>>
>> How about
>>
>> vbroadcasti128 m0, [tmatrix_col_even]
>> pshufd m1, m0, q1111
>> pshufd m2, m0, q2222
>> pshufd m3, m0, q3333
>> pshufd m0, m0, q0000
>>
>> So you remove the lea, and do a single load from memory within a single cross-lane instruction, instead of four of each.
>>
>> Same below.
>
> The broadcasts from memory are not slow, they don't read from either lane.
>
> I can't measure a difference, but instruction tables have vpbroadcastd as 1/3 and pshufd as 1/2, so I think I'll take that as a tie-break? (lea is free and they will all load together; the vbroadcasti128 load is unaligned, but I'm pretty sure that is irrelevant.)
AVX doesn't care about alignment outside of instructions that are
explicit about it (so movdqa/movaps). vbroadcasti128 in any case loads
16 bytes and tmatrix_col_even seems to be 16 byte aligned.
Looking at Skylake and newer, vpbroadcastd has 4 cycle latency and 0.5
throughput, so by the time the results are stored, the pmaddwd will be
executed. Meanwhile, vbroadcasti128 has 3 latency, so the pshufd will
not execute immediately.
I guess your version may be better.
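
For anyone following along, the two alternatives produce identical register contents; only the instruction mix differs. The scalar C sketch below models a ymm register as eight 32-bit lanes and compares four memory broadcasts against one 128-bit broadcast followed by per-lane shuffles. The names are illustrative, and the model captures results only, not the latency or throughput discussed above.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of a ymm register as eight 32-bit lanes. */
typedef struct { uint32_t d[8]; } Ymm;

/* Pack two 16-bit coefficients into one dword, low word first. */
uint32_t pair(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* vpbroadcastd m, [mem + 4*n]: every lane gets dword n from memory. */
Ymm broadcast_dword(const uint32_t *mem, int n)
{
    Ymm r;
    for (int i = 0; i < 8; i++)
        r.d[i] = mem[n];
    return r;
}

/* vbroadcasti128 m, [mem]: both 128-bit halves get the same 16 bytes. */
Ymm broadcast_128(const uint32_t mem[4])
{
    Ymm r;
    for (int i = 0; i < 8; i++)
        r.d[i] = mem[i & 3];
    return r;
}

/* pshufd m, src, qNNNN: replicate dword n within each 128-bit half. */
Ymm pshufd_splat(Ymm src, int n)
{
    Ymm r;
    for (int half = 0; half < 2; half++)
        for (int i = 0; i < 4; i++)
            r.d[4 * half + i] = src.d[4 * half + n];
    return r;
}
```

With the four coefficient pairs from tmatrix_col_even, broadcast_dword(mem, n) and pshufd_splat(broadcast_128(mem), n) agree for n = 0..3.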
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder Mark Thompson
2025-04-24 3:04 ` James Almer
@ 2025-04-25 17:25 ` Michael Niedermayer
1 sibling, 0 replies; 17+ messages in thread
From: Michael Niedermayer @ 2025-04-25 17:25 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Hi
On Wed, Apr 23, 2025 at 09:45:22PM +0100, Mark Thompson wrote:
> ---
[...]
> + case APV_METADATA_CLL:
> + {
> + const APVRawMetadataCLL *cll = &pl->cll;
> + AVContentLightMetadata *clm;
> +
> + err = ff_decode_content_light_new(avctx, frame, &clm);
> + if (err < 0)
> + return err;
> +
> + if (clm) {
> + clm->MaxCLL = cll->max_cll;
> + clm->MaxFALL = cll->max_fall;
> + }
> + }
> + break;
> + default:
> + // Ignore other types of metadata.
> + }
src/libavcodec/apv_decode.c: In function ‘apv_decode_metadata’:
src/libavcodec/apv_decode.c:342:9: error: label at end of compound statement
342 | default:
| ^~~~~~~
[...]
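
In C standards before C23, a label (including case and default) has to be followed by a statement, and a comment does not count as one. A minimal sketch of the usual fix, with made-up names standing in for the real metadata types and handler:

```c
/* Hypothetical metadata type values, for illustration only. */
enum { META_CLL = 1, META_OTHER = 2 };

int handle_metadata(int type, int *cll_seen)
{
    switch (type) {
    case META_CLL:
        *cll_seen = 1;
        break;
    default:
        // Ignore other types of metadata.
        break; /* a statement after the label keeps pre-C23 compilers happy */
    }
    return 0;
}
```

An empty statement (a lone semicolon) after the label would work as well; break is the more common spelling in switch statements.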
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-04-25 17:25 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
2025-04-23 20:45 [FFmpeg-devel] [PATCH v3 0/7] APV support Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 1/7] lavc: APV codec ID and descriptor Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 2/7] lavc/cbs: APV support Mark Thompson
2025-04-24 0:02 ` James Almer
2025-04-24 20:16 ` Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 3/7] lavf: APV demuxer Mark Thompson
2025-04-24 0:10 ` James Almer
2025-04-24 20:15 ` Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 4/7] lavc: APV decoder Mark Thompson
2025-04-24 3:04 ` James Almer
2025-04-25 17:25 ` Michael Niedermayer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 5/7] lavc/apv: AVX2 transquant for x86-64 Mark Thompson
2025-04-24 2:55 ` James Almer
2025-04-24 20:37 ` Mark Thompson
2025-04-24 21:41 ` James Almer
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 6/7] lavc: APV metadata bitstream filter Mark Thompson
2025-04-23 20:45 ` [FFmpeg-devel] [PATCH v3 7/7] lavf: APV muxer Mark Thompson
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel