From: John Cox
To: FFmpeg development discussions and patches
Date: Fri, 27 May 2022 14:51:17 +0100
Subject: [FFmpeg-devel] [PATCH] hevc: If hwaccel avoid creation/use of s/w only arrays

Hwaccel doesn't use any of the boundary strength, pcm, slice address, etc.
arrays, which can be >100k each for 4k video. This patch avoids their
initial allocation and the zeroing at the start of every frame. On a Pi4
the memsets can use 10% CPU on 4k 60Hz decode; this fixes that.

Signed-off-by: John Cox
---
 libavcodec/hevc_refs.c | 35 +++++++++++++++++++++--------------
 libavcodec/hevcdec.c   | 42 +++++++++++++++++++++++++++++-------------
 2 files changed, 50 insertions(+), 27 deletions(-)

diff --git a/libavcodec/hevc_refs.c b/libavcodec/hevc_refs.c
index fe18ca2b1d..ab3103f66c 100644
--- a/libavcodec/hevc_refs.c
+++ b/libavcodec/hevc_refs.c
@@ -97,18 +97,22 @@ static HEVCFrame *alloc_frame(HEVCContext *s)
         if (!frame->rpl_buf)
             goto fail;
 
-        frame->tab_mvf_buf = av_buffer_pool_get(s->tab_mvf_pool);
-        if (!frame->tab_mvf_buf)
-            goto fail;
-        frame->tab_mvf = (MvField *)frame->tab_mvf_buf->data;
+        if (s->tab_mvf_pool) {
+            frame->tab_mvf_buf = av_buffer_pool_get(s->tab_mvf_pool);
+            if (!frame->tab_mvf_buf)
+                goto fail;
+            frame->tab_mvf = (MvField *)frame->tab_mvf_buf->data;
+        }
 
-        frame->rpl_tab_buf = av_buffer_pool_get(s->rpl_tab_pool);
-        if (!frame->rpl_tab_buf)
-            goto fail;
-        frame->rpl_tab = (RefPicListTab **)frame->rpl_tab_buf->data;
-        frame->ctb_count = s->ps.sps->ctb_width * s->ps.sps->ctb_height;
-        for (j = 0; j < frame->ctb_count; j++)
-            frame->rpl_tab[j] = (RefPicListTab *)frame->rpl_buf->data;
+        if (s->rpl_tab_pool) {
+            frame->rpl_tab_buf = av_buffer_pool_get(s->rpl_tab_pool);
+            if (!frame->rpl_tab_buf)
+                goto fail;
+            frame->rpl_tab = (RefPicListTab **)frame->rpl_tab_buf->data;
+            frame->ctb_count = s->ps.sps->ctb_width * s->ps.sps->ctb_height;
+            for (j = 0; j < frame->ctb_count; j++)
+                frame->rpl_tab[j] = (RefPicListTab *)frame->rpl_buf->data;
+        }
 
         frame->frame->top_field_first = s->sei.picture_timing.picture_struct == AV_PICTURE_STRUCTURE_TOP_FIELD;
         frame->frame->interlaced_frame = (s->sei.picture_timing.picture_struct == AV_PICTURE_STRUCTURE_TOP_FIELD) || (s->sei.picture_timing.picture_struct == AV_PICTURE_STRUCTURE_BOTTOM_FIELD);
@@ -283,14 +287,17 @@ static int init_slice_rpl(HEVCContext *s)
     int ctb_count = frame->ctb_count;
     int ctb_addr_ts = s->ps.pps->ctb_addr_rs_to_ts[s->sh.slice_segment_addr];
     int i;
+    RefPicListTab * const tab = (RefPicListTab *)frame->rpl_buf->data + s->slice_idx;
 
     if (s->slice_idx >= frame->rpl_buf->size / sizeof(RefPicListTab))
         return AVERROR_INVALIDDATA;
 
-    for (i = ctb_addr_ts; i < ctb_count; i++)
-        frame->rpl_tab[i] = (RefPicListTab *)frame->rpl_buf->data + s->slice_idx;
+    if (frame->rpl_tab) {
+        for (i = ctb_addr_ts; i < ctb_count; i++)
+            frame->rpl_tab[i] = tab;
+    }
 
-    frame->refPicList = (RefPicList *)frame->rpl_tab[ctb_addr_ts];
+    frame->refPicList = tab->refPicList;
 
     return 0;
 }
diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
index f782ea6394..48b059ce45 100644
--- a/libavcodec/hevcdec.c
+++ b/libavcodec/hevcdec.c
@@ -504,6 +504,16 @@ static int set_sps(HEVCContext *s, const HEVCSPS *sps,
     if (!sps)
         return 0;
 
+    // If hwaccel then we don't need all the s/w decode helper arrays
+    if (s->avctx->hwaccel) {
+        export_stream_params(s, sps);
+
+        s->avctx->pix_fmt = pix_fmt;
+        s->ps.sps = sps;
+        s->ps.vps = (HEVCVPS*) s->ps.vps_list[s->ps.sps->vps_id]->data;
+        return 0;
+    }
+
     ret = pic_arrays_init(s, sps);
     if (ret < 0)
         goto fail;
@@ -3008,11 +3018,13 @@ static int hevc_frame_start(HEVCContext *s)
                            ((s->ps.sps->height >> s->ps.sps->log2_min_cb_size) + 1);
     int ret;
 
-    memset(s->horizontal_bs, 0, s->bs_width * s->bs_height);
-    memset(s->vertical_bs, 0, s->bs_width * s->bs_height);
-    memset(s->cbf_luma, 0, s->ps.sps->min_tb_width * s->ps.sps->min_tb_height);
-    memset(s->is_pcm, 0, (s->ps.sps->min_pu_width + 1) * (s->ps.sps->min_pu_height + 1));
-    memset(s->tab_slice_address, -1, pic_size_in_ctb * sizeof(*s->tab_slice_address));
+    if (s->horizontal_bs) {
+        memset(s->horizontal_bs, 0, s->bs_width * s->bs_height);
+        memset(s->vertical_bs, 0, s->bs_width * s->bs_height);
+        memset(s->cbf_luma, 0, s->ps.sps->min_tb_width * s->ps.sps->min_tb_height);
+        memset(s->is_pcm, 0, (s->ps.sps->min_pu_width + 1) * (s->ps.sps->min_pu_height + 1));
+        memset(s->tab_slice_address, -1, pic_size_in_ctb * sizeof(*s->tab_slice_address));
+    }
 
     s->is_decoded = 0;
     s->first_nal_type = s->nal_unit_type;
@@ -3555,15 +3567,19 @@ static int hevc_ref_frame(HEVCContext *s, HEVCFrame *dst, HEVCFrame *src)
         dst->needs_fg = 1;
     }
 
-    dst->tab_mvf_buf = av_buffer_ref(src->tab_mvf_buf);
-    if (!dst->tab_mvf_buf)
-        goto fail;
-    dst->tab_mvf = src->tab_mvf;
+    if (src->tab_mvf_buf) {
+        dst->tab_mvf_buf = av_buffer_ref(src->tab_mvf_buf);
+        if (!dst->tab_mvf_buf)
+            goto fail;
+        dst->tab_mvf = src->tab_mvf;
+    }
 
-    dst->rpl_tab_buf = av_buffer_ref(src->rpl_tab_buf);
-    if (!dst->rpl_tab_buf)
-        goto fail;
-    dst->rpl_tab = src->rpl_tab;
+    if (src->rpl_tab_buf) {
+        dst->rpl_tab_buf = av_buffer_ref(src->rpl_tab_buf);
+        if (!dst->rpl_tab_buf)
+            goto fail;
+        dst->rpl_tab = src->rpl_tab;
+    }
 
     dst->rpl_buf = av_buffer_ref(src->rpl_buf);
     if (!dst->rpl_buf)
-- 
2.34.1
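
For readers who want the idea without the surrounding decoder, here is a minimal standalone sketch of the pattern the patch applies: allocate the software-only tables only when no hwaccel is in use, and let the per-frame reset skip the memsets when the pointers are NULL. Everything in it (DemoCtx, demo_tables_init, demo_frame_start, demo_tables_free) is a hypothetical name made up for illustration. It is not FFmpeg code and makes no claim about how pic_arrays_init() or the buffer pools behave.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a decoder context; not an FFmpeg type. */
typedef struct DemoCtx {
    size_t   bs_size;        /* bytes in each boundary-strength table */
    uint8_t *horizontal_bs;  /* stays NULL when a hwaccel is in use */
    uint8_t *vertical_bs;    /* stays NULL when a hwaccel is in use */
} DemoCtx;

/* Set up the context. With hwaccel != 0 nothing is allocated. */
static int demo_tables_init(DemoCtx *c, size_t bs_size, int hwaccel)
{
    memset(c, 0, sizeof(*c));
    c->bs_size = bs_size;
    if (hwaccel)
        return 0;            /* software-only tables are not needed */

    c->horizontal_bs = malloc(bs_size);
    c->vertical_bs   = malloc(bs_size);
    if (!c->horizontal_bs || !c->vertical_bs) {
        free(c->horizontal_bs);
        free(c->vertical_bs);
        c->horizontal_bs = c->vertical_bs = NULL;
        return -1;
    }
    return 0;
}

/* Per-frame reset. Returns the number of bytes cleared, so the saving
 * in the hwaccel case (zero bytes touched) is easy to observe. */
static size_t demo_frame_start(DemoCtx *c)
{
    if (!c->horizontal_bs)
        return 0;            /* hwaccel path: skip the memsets */

    memset(c->horizontal_bs, 0, c->bs_size);
    memset(c->vertical_bs,   0, c->bs_size);
    return 2 * c->bs_size;
}

/* Release the tables; safe to call on the hwaccel (NULL) case too. */
static void demo_tables_free(DemoCtx *c)
{
    free(c->horizontal_bs);
    free(c->vertical_bs);
    c->horizontal_bs = c->vertical_bs = NULL;
}
```

As in the patch, a single pointer (horizontal_bs) serves as the sentinel for the whole group of tables. That only holds if the tables are always allocated and freed together, which the sketch guarantees in demo_tables_init().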