From: Andreas Rheinhardt
Date: Mon, 23 Jun 2025 13:36:05 +0000
To: ffmpeg-devel@ffmpeg.org
Cc: Andreas Rheinhardt
Subject: [FFmpeg-devel] [PATCH 05/48] avcodec/mpegvideoenc: Allocate blocks as part of MPVEncContext
Message-Id: <8172fdacf46be3ff2a8b83f4633b88cf5e8998f5.1750685809.git.ffmpegagent@gmail.com>

From: Andreas Rheinhardt

This avoids mpegvideo.c having to deal with the fact that the
encoders use two sets of blocks and is in preparation for not
allocating blocks at all.

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/h263enc.h       |  2 +-
 libavcodec/mpeg4videodec.c |  1 -
 libavcodec/mpeg4videoenc.c |  2 +-
 libavcodec/mpegvideo.c     | 12 +++---
 libavcodec/mpegvideo.h     |  3 +-
 libavcodec/mpegvideo_enc.c | 78 +++++++++++++++++++-------------------
 libavcodec/mpegvideoenc.h  |  5 +++
 7 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/libavcodec/h263enc.h b/libavcodec/h263enc.h
index 20e0c57326..5a6e7d500d 100644
--- a/libavcodec/h263enc.h
+++ b/libavcodec/h263enc.h
@@ -87,7 +87,7 @@ static inline int get_p_cbp(MPVEncContext *const s,
         for (int i = 0; i < 6; i++) {
             if (s->c.block_last_index[i] >= 0 && !((cbp >> (5 - i)) & 1)) {
                 s->c.block_last_index[i] = -1;
-                s->c.bdsp.clear_block(s->c.block[i]);
+                s->c.bdsp.clear_block(s->block[i]);
             }
         }
     } else {
diff --git a/libavcodec/mpeg4videodec.c b/libavcodec/mpeg4videodec.c
index 2e3609b0d5..34f383bbbd 100644
--- a/libavcodec/mpeg4videodec.c
+++ b/libavcodec/mpeg4videodec.c
@@ -3826,7 +3826,6 @@ static av_cold void clear_context(MpegEncContext *s)
     memset(s->thread_context, 0, sizeof(s->thread_context));
 
     s->block = NULL;
-    s->blocks = NULL;
     s->ac_val_base = NULL;
     s->ac_val = NULL;
     memset(&s->sc, 0, sizeof(s->sc));
diff --git a/libavcodec/mpeg4videoenc.c b/libavcodec/mpeg4videoenc.c
index a9d673707a..1aa35aa70a 100644
--- a/libavcodec/mpeg4videoenc.c
+++ b/libavcodec/mpeg4videoenc.c
@@ -433,7 +433,7 @@ static inline int get_b_cbp(MPVEncContext *const s, int16_t block[6][64],
         for (i = 0; i < 6; i++) {
             if (s->c.block_last_index[i] >= 0 && ((cbp >> (5 - i)) & 1) == 0) {
                 s->c.block_last_index[i] = -1;
-                s->c.bdsp.clear_block(s->c.block[i]);
+                s->c.bdsp.clear_block(s->block[i]);
             }
         }
     } else {
diff --git a/libavcodec/mpegvideo.c b/libavcodec/mpegvideo.c
index ff2703f487..3cbd686558 100644
--- a/libavcodec/mpegvideo.c
+++ b/libavcodec/mpegvideo.c
@@ -116,9 +116,11 @@ av_cold void ff_mpv_idct_init(MpegEncContext *s)
 
 static av_cold int init_duplicate_context(MpegEncContext *s)
 {
-    if (!FF_ALLOCZ_TYPED_ARRAY(s->blocks, 1 + s->encoding))
-        return AVERROR(ENOMEM);
-    s->block = s->blocks[0];
+    if (!s->encoding) {
+        s->block = av_mallocz(12 * sizeof(*s->block));
+        if (!s->block)
+            return AVERROR(ENOMEM);
+    }
 
     return 0;
 }
@@ -158,8 +160,7 @@ static av_cold void free_duplicate_context(MpegEncContext *s)
     s->sc.obmc_scratchpad = NULL;
     s->sc.linesize = 0;
 
-    av_freep(&s->blocks);
-    s->block = NULL;
+    av_freep(&s->block);
 }
 
 static av_cold void free_duplicate_contexts(MpegEncContext *s)
@@ -175,7 +176,6 @@ int ff_update_duplicate_context(MpegEncContext *dst, const MpegEncContext *src)
 {
 #define COPY(M) \
     M(ScratchpadContext, sc) \
-    M(void*, blocks) \
     M(void*, block) \
     M(int, start_mb_y) \
     M(int, end_mb_y) \
diff --git a/libavcodec/mpegvideo.h b/libavcodec/mpegvideo.h
index 55a490adc7..4ff5e8906d 100644
--- a/libavcodec/mpegvideo.h
+++ b/libavcodec/mpegvideo.h
@@ -309,8 +309,7 @@ typedef struct MpegEncContext {
     int interlaced_dct;
     int first_field; ///< is 1 for the first field of a field picture 0 otherwise
 
-    int16_t (*block)[64]; ///< points to one of the following blocks
-    int16_t (*blocks)[12][64]; // for HQ mode we need to keep the best block
+    int16_t (*block)[64];
     int (*decode_mb)(struct MpegEncContext *s, int16_t block[12][64]); // used by some codecs to avoid a switch()
 
 #define SLICE_OK 0
diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
index 1f51f237fa..1ceb6296c4 100644
--- a/libavcodec/mpegvideo_enc.c
+++ b/libavcodec/mpegvideo_enc.c
@@ -538,6 +538,8 @@ static av_cold int init_slice_buffers(MPVMainEncContext *const m)
     for (unsigned i = 0; i < nb_slices; ++i) {
         MPVEncContext *const s2 = s->c.enc_contexts[i];
 
+        s2->block = s2->blocks[0];
+
         if (dct_error) {
             s2->dct_offset = s->dct_offset;
             s2->dct_error_sum = (void*)dct_error;
@@ -2184,7 +2186,7 @@ static inline void dct_single_coeff_elimination(MPVEncContext *const s,
     int score = 0;
     int run = 0;
     int i;
-    int16_t *block = s->c.block[n];
+    int16_t *block = s->block[n];
     const int last_index = s->c.block_last_index[n];
     int skip_dc;
 
@@ -2399,27 +2401,27 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
             }
         }
 
-        s->pdsp.get_pixels(s->c.block[0], ptr_y, wrap_y);
-        s->pdsp.get_pixels(s->c.block[1], ptr_y + 8, wrap_y);
-        s->pdsp.get_pixels(s->c.block[2], ptr_y + dct_offset, wrap_y);
-        s->pdsp.get_pixels(s->c.block[3], ptr_y + dct_offset + 8, wrap_y);
+        s->pdsp.get_pixels(s->block[0], ptr_y, wrap_y);
+        s->pdsp.get_pixels(s->block[1], ptr_y + 8, wrap_y);
+        s->pdsp.get_pixels(s->block[2], ptr_y + dct_offset, wrap_y);
+        s->pdsp.get_pixels(s->block[3], ptr_y + dct_offset + 8, wrap_y);
 
         if (s->c.avctx->flags & AV_CODEC_FLAG_GRAY) {
             skip_dct[4] = 1;
             skip_dct[5] = 1;
         } else {
-            s->pdsp.get_pixels(s->c.block[4], ptr_cb, wrap_c);
-            s->pdsp.get_pixels(s->c.block[5], ptr_cr, wrap_c);
+            s->pdsp.get_pixels(s->block[4], ptr_cb, wrap_c);
+            s->pdsp.get_pixels(s->block[5], ptr_cr, wrap_c);
             if (chroma_format == CHROMA_422) {
-                s->pdsp.get_pixels(s->c.block[6], ptr_cb + uv_dct_offset, wrap_c);
-                s->pdsp.get_pixels(s->c.block[7], ptr_cr + uv_dct_offset, wrap_c);
+                s->pdsp.get_pixels(s->block[6], ptr_cb + uv_dct_offset, wrap_c);
+                s->pdsp.get_pixels(s->block[7], ptr_cr + uv_dct_offset, wrap_c);
             } else if (chroma_format == CHROMA_444) {
-                s->pdsp.get_pixels(s->c.block[ 6], ptr_cb + 8, wrap_c);
-                s->pdsp.get_pixels(s->c.block[ 7], ptr_cr + 8, wrap_c);
-                s->pdsp.get_pixels(s->c.block[ 8], ptr_cb + uv_dct_offset, wrap_c);
-                s->pdsp.get_pixels(s->c.block[ 9], ptr_cr + uv_dct_offset, wrap_c);
-                s->pdsp.get_pixels(s->c.block[10], ptr_cb + uv_dct_offset + 8, wrap_c);
-                s->pdsp.get_pixels(s->c.block[11], ptr_cr + uv_dct_offset + 8, wrap_c);
+                s->pdsp.get_pixels(s->block[ 6], ptr_cb + 8, wrap_c);
+                s->pdsp.get_pixels(s->block[ 7], ptr_cr + 8, wrap_c);
+                s->pdsp.get_pixels(s->block[ 8], ptr_cb + uv_dct_offset, wrap_c);
+                s->pdsp.get_pixels(s->block[ 9], ptr_cr + uv_dct_offset, wrap_c);
+                s->pdsp.get_pixels(s->block[10], ptr_cb + uv_dct_offset + 8, wrap_c);
+                s->pdsp.get_pixels(s->block[11], ptr_cr + uv_dct_offset + 8, wrap_c);
             }
         }
     } else {
@@ -2483,23 +2485,23 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
             }
         }
 
-        s->pdsp.diff_pixels(s->c.block[0], ptr_y, dest_y, wrap_y);
-        s->pdsp.diff_pixels(s->c.block[1], ptr_y + 8, dest_y + 8, wrap_y);
-        s->pdsp.diff_pixels(s->c.block[2], ptr_y + dct_offset,
+        s->pdsp.diff_pixels(s->block[0], ptr_y, dest_y, wrap_y);
+        s->pdsp.diff_pixels(s->block[1], ptr_y + 8, dest_y + 8, wrap_y);
+        s->pdsp.diff_pixels(s->block[2], ptr_y + dct_offset,
                             dest_y + dct_offset, wrap_y);
-        s->pdsp.diff_pixels(s->c.block[3], ptr_y + dct_offset + 8,
+        s->pdsp.diff_pixels(s->block[3], ptr_y + dct_offset + 8,
                             dest_y + dct_offset + 8, wrap_y);
 
         if (s->c.avctx->flags & AV_CODEC_FLAG_GRAY) {
             skip_dct[4] = 1;
             skip_dct[5] = 1;
         } else {
-            s->pdsp.diff_pixels(s->c.block[4], ptr_cb, dest_cb, wrap_c);
-            s->pdsp.diff_pixels(s->c.block[5], ptr_cr, dest_cr, wrap_c);
+            s->pdsp.diff_pixels(s->block[4], ptr_cb, dest_cb, wrap_c);
+            s->pdsp.diff_pixels(s->block[5], ptr_cr, dest_cr, wrap_c);
             if (!chroma_y_shift) { /* 422 */
-                s->pdsp.diff_pixels(s->c.block[6], ptr_cb + uv_dct_offset,
+                s->pdsp.diff_pixels(s->block[6], ptr_cb + uv_dct_offset,
                                     dest_cb + uv_dct_offset, wrap_c);
-                s->pdsp.diff_pixels(s->c.block[7], ptr_cr + uv_dct_offset,
+                s->pdsp.diff_pixels(s->block[7], ptr_cr + uv_dct_offset,
                                     dest_cr + uv_dct_offset, wrap_c);
             }
         }
@@ -2554,7 +2556,7 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
                 get_visual_weight(weight[7], ptr_cr + uv_dct_offset,
                                   wrap_c);
         }
-        memcpy(orig[0], s->c.block[0], sizeof(int16_t) * 64 * mb_block_count);
+        memcpy(orig[0], s->block[0], sizeof(int16_t) * 64 * mb_block_count);
     }
 
     /* DCT & quantize */
@@ -2563,14 +2565,14 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
         for (i = 0; i < mb_block_count; i++) {
             if (!skip_dct[i]) {
                 int overflow;
-                s->c.block_last_index[i] = s->dct_quantize(s, s->c.block[i], i, s->c.qscale, &overflow);
+                s->c.block_last_index[i] = s->dct_quantize(s, s->block[i], i, s->c.qscale, &overflow);
                 // FIXME we could decide to change to quantizer instead of
                 // clipping
                 // JS: I don't think that would be a good idea it could lower
                 // quality instead of improve it. Just INTRADC clipping
                 // deserves changes in quantizer
                 if (overflow)
-                    clip_coeffs(s, s->c.block[i], s->c.block_last_index[i]);
+                    clip_coeffs(s, s->block[i], s->c.block_last_index[i]);
             } else
                 s->c.block_last_index[i] = -1;
         }
@@ -2578,7 +2580,7 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
             for (i = 0; i < mb_block_count; i++) {
                 if (!skip_dct[i]) {
                     s->c.block_last_index[i] =
-                        dct_quantize_refine(s, s->c.block[i], weight[i],
+                        dct_quantize_refine(s, s->block[i], weight[i],
                                             orig[i], i, s->c.qscale);
                 }
             }
@@ -2602,12 +2604,12 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
     if ((s->c.avctx->flags & AV_CODEC_FLAG_GRAY) && s->c.mb_intra) {
         s->c.block_last_index[4] =
         s->c.block_last_index[5] = 0;
-        s->c.block[4][0] =
-        s->c.block[5][0] = (1024 + s->c.c_dc_scale / 2) / s->c.c_dc_scale;
+        s->block[4][0] =
+        s->block[5][0] = (1024 + s->c.c_dc_scale / 2) / s->c.c_dc_scale;
         if (!chroma_y_shift) { /* 422 / 444 */
             for (i=6; i<12; i++) {
                 s->c.block_last_index[i] = 0;
-                s->c.block[i][0] = s->c.block[4][0];
+                s->block[i][0] = s->block[4][0];
             }
         }
     }
@@ -2618,7 +2620,7 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
             int j;
             if (s->c.block_last_index[i] > 0) {
                 for (j = 63; j > 0; j--) {
-                    if (s->c.block[i][s->c.intra_scantable.permutated[j]])
+                    if (s->block[i][s->c.intra_scantable.permutated[j]])
                         break;
                 }
                 s->c.block_last_index[i] = j;
@@ -2626,7 +2628,7 @@ static av_always_inline void encode_mb_internal(MPVEncContext *const s,
         }
     }
 
-    s->encode_mb(s, s->c.block, motion_x, motion_y);
+    s->encode_mb(s, s->block, motion_x, motion_y);
 }
 
 static void encode_mb(MPVEncContext *const s, int motion_x, int motion_y)
@@ -2649,11 +2651,11 @@ typedef struct MBBackup {
         int qscale;
         int block_last_index[8];
         int interlaced_dct;
-        int16_t (*block)[64];
     } c;
     int mv_bits, i_tex_bits, p_tex_bits, i_count, misc_bits, last_bits;
     int dquant;
     int esc3_level_length;
+    int16_t (*block)[64];
     PutBitContext pb, pb2, tex_pb;
 } MBBackup;
 
@@ -2713,7 +2715,7 @@ static inline void AFTER ## _context_after_encode(DST_TYPE *const d, \
         d->pb2 = s->pb2; \
         d->tex_pb = s->tex_pb; \
     } \
-    d->c.block = s->c.block; \
+    d->block = s->block; \
     for (int i = 0; i < 8; i++) \
         d->c.block_last_index[i] = s->c.block_last_index[i]; \
     d->c.interlaced_dct = s->c.interlaced_dct; \
@@ -2734,7 +2736,7 @@ static void encode_mb_hq(MPVEncContext *const s, MBBackup *const backup, MBBacku
 
     reset_context_before_encode(s, backup);
 
-    s->c.block = s->c.blocks[*next_block];
+    s->block = s->blocks[*next_block];
     s->pb = pb[*next_block];
     if (s->c.data_partitioning) {
         s->pb2 = pb2 [*next_block];
@@ -2758,7 +2760,7 @@ static void encode_mb_hq(MPVEncContext *const s, MBBackup *const backup, MBBacku
     }
 
     if (s->c.avctx->mb_decision == FF_MB_DECISION_RD) {
-        mpv_reconstruct_mb(s, s->c.block);
+        mpv_reconstruct_mb(s, s->block);
 
         score *= s->lambda2;
         score += sse_mb(s) << FF_LAMBDA_SHIFT;
@@ -3446,7 +3448,7 @@ static int encode_thread(AVCodecContext *c, void *arg){
                 }
 
                 if (s->c.avctx->mb_decision == FF_MB_DECISION_BITS)
-                    mpv_reconstruct_mb(s, s->c.block);
+                    mpv_reconstruct_mb(s, s->block);
             } else {
                 int motion_x = 0, motion_y = 0;
                 s->c.mv_type = MV_TYPE_16X16;
@@ -3569,7 +3571,7 @@ static int encode_thread(AVCodecContext *c, void *arg){
                     s->c.out_format == FMT_H263 && s->c.pict_type != AV_PICTURE_TYPE_B)
                     ff_h263_update_mb(s);
 
-                mpv_reconstruct_mb(s, s->c.block);
+                mpv_reconstruct_mb(s, s->block);
             }
 
             s->c.cur_pic.qscale_table[xy] = s->c.qscale;
diff --git a/libavcodec/mpegvideoenc.h b/libavcodec/mpegvideoenc.h
index 5510b43f86..8ad2fe2e8a 100644
--- a/libavcodec/mpegvideoenc.h
+++ b/libavcodec/mpegvideoenc.h
@@ -31,6 +31,7 @@
 #include
 
 #include "libavutil/avassert.h"
+#include "libavutil/mem_internal.h"
 #include "libavutil/opt.h"
 #include "fdctdsp.h"
 #include "motion_est.h"
@@ -110,6 +111,8 @@ typedef struct MPVEncContext {
 
     int coded_score[12];
 
+    int16_t (*block)[64]; ///< points into blocks below
+
     /** precomputed matrix (combine qscale and DCT renorm) */
     int (*q_intra_matrix)[64];
     int (*q_chroma_intra_matrix)[64];
@@ -173,6 +176,8 @@ typedef struct MPVEncContext {
     int (*sum_abs_dctelem)(const int16_t *block);
 
     int intra_penalty;
+
+    DECLARE_ALIGNED_32(int16_t, blocks)[2][12][64]; // for HQ mode we need to keep the best block
 } MPVEncContext;
 
 typedef struct MPVMainEncContext {
-- 
ffmpeg-codebot
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
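
For readers following the series, here is a rough, standalone sketch of
the layout this patch moves to: the two block sets live inside the
encoder context itself and `block` merely points at the set currently
being written. It is not FFmpeg code. `SketchEncContext`, `sketch_init`,
`sketch_select` and `sketch_try_two` are hypothetical names, C11
`_Alignas(32)` stands in for DECLARE_ALIGNED_32, and the candidate loop
is only loosely modelled on how encode_mb_hq flips *next_block when a
candidate wins.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for MPVEncContext: only the two fields the patch adds. */
typedef struct SketchEncContext {
    int16_t (*block)[64];                   /* points into blocks below */
    _Alignas(32) int16_t blocks[2][12][64]; /* two sets, so the best one survives */
} SketchEncContext;

/* No heap allocation: the storage is part of the context. */
static void sketch_init(SketchEncContext *s)
{
    memset(s->blocks, 0, sizeof(s->blocks));
    s->block = s->blocks[0];
}

/* Choose which of the two sets the next candidate is written into. */
static void sketch_select(SketchEncContext *s, int next_block)
{
    s->block = s->blocks[next_block];
}

/* Try two candidates (dc[] is dummy data, score[] their assumed costs),
 * keep the cheaper one and return the index of the set holding it. */
static int sketch_try_two(SketchEncContext *s,
                          const int16_t dc[2], const int score[2])
{
    int next_block = 0, best = -1, best_score = INT_MAX;

    for (int cand = 0; cand < 2; cand++) {
        sketch_select(s, next_block);
        s->block[0][0] = dc[cand];
        if (score[cand] < best_score) {
            best_score  = score[cand];
            best        = next_block;
            next_block ^= 1; /* protect the winner; write the next try elsewhere */
        }
    }
    sketch_select(s, best);
    return best;
}
```

Since the storage is a struct member, the sketch has no allocation
failure path and nothing to free.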