From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 18 Sep 2025 16:52:59 -0000
Message-ID: <175821438045.25.17531496873284204896@463a07221176>
X-MailFrom: code@ffmpeg.org
Reply-To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] avcodec/h274: Make H274FilmGrainDatabase a shared object (PR #20549)
List-Id: FFmpeg development discussions and patches
From: mkver via ffmpeg-devel
Cc: mkver
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

PR #20549 opened by mkver
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20549
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20549.patch

Right now, the private contexts of every decoder supporting
H.274 film grain synthesis (namely H.264, HEVC and VVC)
contain a H274FilmGrainDatabase; said structure is very large
(700442B before this commit) and takes up the overwhelming
majority of said contexts: Removing it reduces sizeof(H264Context)
by 92.88%, sizeof(HEVCContext) by 97.78% and sizeof(VVCContext)
by 99.86%. This is especially important for H.264 and HEVC
when using frame-threading.

The content of said film grain database does not depend on
any input parameter; it is shareable between all its users and
could be hardcoded in the binary (but isn't, because it is so huge).

This commit adds a database with static storage duration to h274.c
and uses it instead of the elements in the private contexts above.
It is still lazily initialized as-needed; a mutex is used
for the necessary synchronization. An alternative would be to use
an AV_ONCE to initialize the whole database either in the decoders'
init function (which would be wasteful given that most videos
don't use film grain synthesis) or in ff_h274_apply_film_grain().


>>From 8569976d747f422c4ce88a0248525b60259541de Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt
Date: Thu, 18 Sep 2025 17:57:16 +0200
Subject: [PATCH] avcodec/h274: Make H274FilmGrainDatabase a shared object

Right now, the private contexts of every decoder supporting
H.274 film grain synthesis (namely H.264, HEVC and VVC)
contain a H274FilmGrainDatabase; said structure is very large
(700442B before this commit) and takes up the overwhelming
majority of said contexts: Removing it reduces sizeof(H264Context)
by 92.88%, sizeof(HEVCContext) by 97.78% and sizeof(VVCContext)
by 99.86%. This is especially important for H.264 and HEVC
when using frame-threading.

The content of said film grain database does not depend on
any input parameter; it is shareable between all its users and
could be hardcoded in the binary (but isn't, because it is so huge).

This commit adds a database with static storage duration to h274.c
and uses it instead of the elements in the private contexts above.
It is still lazily initialized as-needed; a mutex is used
for the necessary synchronization. An alternative would be to use
an AV_ONCE to initialize the whole database either in the decoders'
init function (which would be wasteful given that most videos
don't use film grain synthesis) or in ff_h274_apply_film_grain().

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/h264_picture.c |  4 ++--
 libavcodec/h264dec.h      |  3 ---
 libavcodec/h274.c         | 42 ++++++++++++++++++++++++++++++---------
 libavcodec/h274.h         | 11 ----------
 libavcodec/hevc/hevcdec.c |  4 ++--
 libavcodec/hevc/hevcdec.h |  3 ---
 libavcodec/vvc/dec.c      |  3 +--
 libavcodec/vvc/dec.h      |  1 -
 8 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
index f5d2b31cd6..aa3d2629c8 100644
--- a/libavcodec/h264_picture.c
+++ b/libavcodec/h264_picture.c
@@ -30,10 +30,10 @@
 #include "error_resilience.h"
 #include "avcodec.h"
 #include "h264dec.h"
+#include "h274.h"
 #include "hwaccel_internal.h"
 #include "mpegutils.h"
 #include "libavutil/refstruct.h"
-#include "thread.h"
 #include "threadframe.h"
 
 void ff_h264_unref_picture(H264Picture *pic)
@@ -213,7 +213,7 @@ int ff_h264_field_end(H264Context *h, H264SliceContext *sl, int in_setup)
 
             err = AVERROR_INVALIDDATA;
             if (sd) // a decoding error may have happened before the side data could be allocated
-                err = ff_h274_apply_film_grain(cur->f_grain, cur->f, &h->h274db,
+                err = ff_h274_apply_film_grain(cur->f_grain, cur->f,
                                                (AVFilmGrainParams *) sd->data);
             if (err < 0) {
                 av_log(h->avctx, AV_LOG_WARNING, "Failed synthesizing film "
diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
index 1df99015cc..74fd09dfaa 100644
--- a/libavcodec/h264dec.h
+++ b/libavcodec/h264dec.h
@@ -28,7 +28,6 @@
 #ifndef AVCODEC_H264DEC_H
 #define AVCODEC_H264DEC_H
 
-#include "libavutil/buffer.h"
 #include "libavutil/mem_internal.h"
 
 #include "cabac.h"
@@ -41,7 +40,6 @@
 #include "h264dsp.h"
 #include "h264pred.h"
 #include "h264qpel.h"
-#include "h274.h"
 #include "mpegutils.h"
 #include "threadframe.h"
 #include "videodsp.h"
@@ -344,7 +342,6 @@ typedef struct H264Context {
     H264DSPContext h264dsp;
     H264ChromaContext h264chroma;
     H264QpelContext h264qpel;
-    H274FilmGrainDatabase h274db;
 
     H264Picture DPB[H264_MAX_PICTURE_COUNT];
     H264Picture *cur_pic_ptr;
diff --git a/libavcodec/h274.c b/libavcodec/h274.c
index 332d0c2c52..81bbea7bfa 100644
--- a/libavcodec/h274.c
+++ b/libavcodec/h274.c
@@ -25,6 +25,8 @@
  * @author Niklas Haas
  */
 
+#include <stdatomic.h>
+
 #include "libavutil/avassert.h"
 #include "libavutil/bswap.h"
 #include "libavcodec/bswapdsp.h"
@@ -32,9 +34,23 @@
 #include "libavutil/imgutils.h"
 #include "libavutil/md5.h"
 #include "libavutil/mem.h"
+#include "libavutil/thread.h"
 
 #include "h274.h"
 
+typedef struct H274FilmGrainDatabase {
+    // Database of film grain patterns, lazily computed as-needed
+    int8_t db[13 /* h */][13 /* v */][64][64];
+    atomic_uint residency[6];
+
+    // Temporary buffer for slice generation
+    int16_t slice_tmp[64][64];
+} H274FilmGrainDatabase;
+
+static H274FilmGrainDatabase film_grain_db;
+
+static AVMutex mutex = AV_MUTEX_INITIALIZER;
+
 static const int8_t Gaussian_LUT[2048+4];
 static const uint32_t Seed_LUT[256];
 static const int8_t R64T[64][64];
@@ -106,13 +122,23 @@ static void init_slice_c(int8_t out[64][64], uint8_t h, uint8_t v,
     }
 }
 
-static void init_slice(H274FilmGrainDatabase *database, uint8_t h, uint8_t v)
+static void init_slice(uint8_t h, uint8_t v)
 {
-    if (database->residency[h] & (1 << v))
+    unsigned bitpos = h * 13 + v;
+    unsigned res = atomic_load_explicit(&film_grain_db.residency[bitpos / 32],
+                                        memory_order_acquire);
+
+    if (res & (1U << (bitpos & 31)))
         return;
 
-    database->residency[h] |= (1 << v);
-    init_slice_c(database->db[h][v], h, v, database->slice_tmp);
+    ff_mutex_lock(&mutex);
+    res = atomic_load_explicit(&film_grain_db.residency[bitpos / 32], memory_order_relaxed);
+    if (!(res & (1U << (bitpos & 31)))) {
+        init_slice_c(film_grain_db.db[h][v], h, v, film_grain_db.slice_tmp);
+        atomic_store_explicit(&film_grain_db.residency[bitpos / 32],
+                              res | (1U << (bitpos & 31)), memory_order_release);
+    }
+    ff_mutex_unlock(&mutex);
 }
 
 // Computes the average of an 8x8 block
@@ -160,7 +186,6 @@ static void deblock_8x8_c(int8_t *out, const int out_stride)
 // deblocking step (note that this implies writing to the previous block).
 static av_always_inline void generate(int8_t *out, int out_stride,
                                       const uint8_t *in, int in_stride,
-                                      H274FilmGrainDatabase *database,
                                       const AVFilmGrainH274Params *h274,
                                       int c, int invert, int deblock,
                                       int y_offset, int x_offset)
@@ -198,14 +223,14 @@ static av_always_inline void generate(int8_t *out, int out_stride,
     h = av_clip(h274->comp_model_value[c][s][1], 2, 14) - 2;
     v = av_clip(h274->comp_model_value[c][s][2], 2, 14) - 2;
 
-    init_slice(database, h, v);
+    init_slice(h, v);
 
     scale = h274->comp_model_value[c][s][0];
     if (invert)
         scale = -scale;
 
     synth_grain_8x8_c(out, out_stride, scale, shift,
-                      &database->db[h][v][y_offset][x_offset]);
+                      &film_grain_db.db[h][v][y_offset][x_offset]);
 
     if (deblock)
         deblock_8x8_c(out, out_stride);
@@ -220,7 +245,6 @@ static void add_8x8_clip_c(uint8_t *out, const uint8_t *a, const int8_t *b,
 }
 
 int ff_h274_apply_film_grain(AVFrame *out_frame, const AVFrame *in_frame,
-                             H274FilmGrainDatabase *database,
                              const AVFilmGrainParams *params)
 {
     AVFilmGrainH274Params h274 = params->codec.h274;
@@ -275,7 +299,7 @@ int ff_h274_apply_film_grain(AVFrame *out_frame, const AVFrame *in_frame,
                     for (int xx = 0; xx < 16 && x+xx < width; xx += 8) {
                         generate(grain + (y+yy) * grain_stride + (x+xx), grain_stride,
                                  in + (y+yy) * in_stride + (x+xx), in_stride,
-                                 database, &h274, c, invert, (x+xx) > 0,
+                                 &h274, c, invert, (x+xx) > 0,
                                  y_offset + yy, x_offset + xx);
                     }
                 }
diff --git a/libavcodec/h274.h b/libavcodec/h274.h
index 055dd591d2..7bbc3e8aa7 100644
--- a/libavcodec/h274.h
+++ b/libavcodec/h274.h
@@ -30,16 +30,6 @@
 
 #include "libavutil/film_grain_params.h"
 
-// Must be initialized to {0} prior to first usage
-typedef struct H274FilmGrainDatabase {
-    // Database of film grain patterns, lazily computed as-needed
-    int8_t db[13 /* h */][13 /* v */][64][64];
-    uint16_t residency[13 /* h */]; // bit field of v
-
-    // Temporary buffer for slice generation
-    int16_t slice_tmp[64][64];
-} H274FilmGrainDatabase;
-
 /**
  * Check whether ff_h274_apply_film_grain() supports the given parameter combination.
  *
@@ -61,7 +51,6 @@ static inline int ff_h274_film_grain_params_supported(int model_id, enum AVPixel
 // ff_h274_film_grain_params_supported() coincide with actual values
 // from the frames and params.
 int ff_h274_apply_film_grain(AVFrame *out, const AVFrame *in,
-                             H274FilmGrainDatabase *db,
                              const AVFilmGrainParams *params);
 
 typedef struct H274HashContext H274HashContext;
diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
index ba12d84474..b27d1d79e8 100644
--- a/libavcodec/hevc/hevcdec.c
+++ b/libavcodec/hevc/hevcdec.c
@@ -45,6 +45,7 @@
 #include "codec_internal.h"
 #include "decode.h"
 #include "golomb.h"
+#include "h274.h"
 #include "hevc.h"
 #include "parse.h"
 #include "hevcdec.h"
@@ -3496,8 +3497,7 @@ static int hevc_frame_end(HEVCContext *s, HEVCLayerContext *l)
             av_assert0(0);
             return AVERROR_BUG;
         case AV_FILM_GRAIN_PARAMS_H274:
-            ret = ff_h274_apply_film_grain(out->frame_grain, out->f,
-                                           &s->h274db, fgp);
+            ret = ff_h274_apply_film_grain(out->frame_grain, out->f, fgp);
             break;
         case AV_FILM_GRAIN_PARAMS_AV1:
             ret = ff_aom_apply_film_grain(out->frame_grain, out->f, fgp);
diff --git a/libavcodec/hevc/hevcdec.h b/libavcodec/hevc/hevcdec.h
index 911d5c3ae3..8394740c4b 100644
--- a/libavcodec/hevc/hevcdec.h
+++ b/libavcodec/hevc/hevcdec.h
@@ -32,9 +32,7 @@
 #include "libavcodec/bswapdsp.h"
 #include "libavcodec/cabac.h"
 #include "libavcodec/dovi_rpu.h"
-#include "libavcodec/get_bits.h"
 #include "libavcodec/h2645_parse.h"
-#include "libavcodec/h274.h"
 #include "libavcodec/progressframe.h"
 #include "libavcodec/videodsp.h"
 
@@ -537,7 +535,6 @@ typedef struct HEVCContext {
     HEVCDSPContext hevcdsp;
     VideoDSPContext vdsp;
     BswapDSPContext bdsp;
-    H274FilmGrainDatabase h274db;
     /** used on BE to byteswap the lines for checksumming */
     uint8_t *checksum_buf;
 
diff --git a/libavcodec/vvc/dec.c b/libavcodec/vvc/dec.c
index 6f52306080..b31fceef40 100644
--- a/libavcodec/vvc/dec.c
+++ b/libavcodec/vvc/dec.c
@@ -1091,8 +1091,7 @@ static int frame_end(VVCContext *s, VVCFrameContext *fc)
             av_assert0(0);
             return AVERROR_BUG;
         case AV_FILM_GRAIN_PARAMS_H274:
-            ret = ff_h274_apply_film_grain(fc->ref->frame_grain, fc->ref->frame,
-                                           &s->h274db, fgp);
+            ret = ff_h274_apply_film_grain(fc->ref->frame_grain, fc->ref->frame, fgp);
             if (ret < 0)
                 return ret;
             break;
diff --git a/libavcodec/vvc/dec.h b/libavcodec/vvc/dec.h
index 5f8065b38b..bfb8a2e20d 100644
--- a/libavcodec/vvc/dec.h
+++ b/libavcodec/vvc/dec.h
@@ -222,7 +222,6 @@ typedef struct VVCContext {
     CodedBitstreamFragment current_frame;
 
     VVCParamSets ps;
-    H274FilmGrainDatabase h274db;
 
     int temporal_id;  ///< temporal_id_plus1 - 1
     int poc_tid0;
-- 
2.49.1

_______________________________________________
ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org
To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org
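
The lazy initialization in init_slice() may be easier to follow in
isolation. Below is a minimal standalone sketch of the same pattern,
not the code from the patch: an atomic residency bitmap is checked
with an acquire load on the fast path, and only on a miss is a mutex
taken, the bit re-checked and the slice computed before the bit is
published with a release store. Assumptions: pthread_mutex_t stands
in for AVMutex; slice_db, compute_slice(), get_slice() and init_calls
are hypothetical names; each slice is reduced to a single byte
instead of a 64x64 pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* 13 x 13 slices -> 169 residency bits -> 6 words of 32 bits each */
enum { GRID = 13, SLICES = GRID * GRID, WORDS = (SLICES + 31) / 32 };

static int8_t          slice_db[GRID][GRID]; /* stand-in for the 64x64 patterns */
static atomic_uint     residency[WORDS];     /* zeroed: static storage duration */
static atomic_uint     init_calls;           /* demo only: counts computations */
static pthread_mutex_t db_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the expensive pattern generation. */
static void compute_slice(unsigned h, unsigned v)
{
    slice_db[h][v] = (int8_t)((int)(h * GRID + v) - 84);
    atomic_fetch_add_explicit(&init_calls, 1, memory_order_relaxed);
}

static int8_t get_slice(unsigned h, unsigned v)
{
    assert(h < GRID && v < GRID);

    unsigned     bitpos = h * GRID + v;
    atomic_uint *word   = &residency[bitpos / 32];
    unsigned     mask   = 1U << (bitpos & 31);

    /* Fast path: the acquire load pairs with the release store below,
     * so a set bit implies that the slice contents are visible. */
    if (!(atomic_load_explicit(word, memory_order_acquire) & mask)) {
        pthread_mutex_lock(&db_mutex);
        /* Re-check under the lock; another thread may have won the race.
         * Relaxed is enough here: every store to this word happens
         * while holding db_mutex. */
        unsigned res = atomic_load_explicit(word, memory_order_relaxed);
        if (!(res & mask)) {
            compute_slice(h, v);
            atomic_store_explicit(word, res | mask, memory_order_release);
        }
        pthread_mutex_unlock(&db_mutex);
    }
    return slice_db[h][v];
}
```

Because the bitmap is only ever written while the mutex is held, the
plain store of res | mask cannot lose bits set by other threads. The
AV_ONCE alternative mentioned in the commit message would drop the
bitmap entirely, at the cost of computing all 169 slices on first use.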