From: Zhao Zhili
To: ffmpeg-devel@ffmpeg.org
Cc: Zhao Zhili
Date: Thu, 20 Feb 2025 00:50:31 +0800
Subject: [FFmpeg-devel] [PATCH 2/2] aarch64/hevcdsp_idct_neon: Add implementation for idct dc 12

Reduce binary size at the same time. The performance compared to
clang -O3 is the same.
---
 libavcodec/aarch64/hevcdsp_idct_neon.S    | 43 ++++++++++++++---------
 libavcodec/aarch64/hevcdsp_init_aarch64.c |  8 +++++
 2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S
index 4543ab6b07..e1fee0cd80 100644
--- a/libavcodec/aarch64/hevcdsp_idct_neon.S
+++ b/libavcodec/aarch64/hevcdsp_idct_neon.S
@@ -901,15 +901,33 @@ endfunc
 .endm
 
 // void ff_hevc_idct_NxN_dc_DEPTH_neon(int16_t *coeffs)
-.macro idct_dc size, bitdepth
-function ff_hevc_idct_\size\()x\size\()_dc_\bitdepth\()_neon, export=1
+.macro idct_dc size
+function ff_hevc_idct_\size\()x\size\()_dc_10_neon, export=1
         ldrsh           w1, [x0]
         add             w1, w1, #1
         asr             w1, w1, #1
-        add             w1, w1, #(1 << (13 - \bitdepth))
-        asr             w1, w1, #(14 - \bitdepth)
-        dup             v0.8h, w1
+        add             w1, w1, #(1 << (13 - 10))
+        asr             w1, w1, #(14 - 10)
+        b               2f
+endfunc
+
+function ff_hevc_idct_\size\()x\size\()_dc_12_neon, export=1
+        ldrsh           w1, [x0]
+        add             w1, w1, #1
+        asr             w1, w1, #1
+        add             w1, w1, #(1 << (13 - 12))
+        asr             w1, w1, #(14 - 12)
+        b               2f
+endfunc
 
+function ff_hevc_idct_\size\()x\size\()_dc_8_neon, export=1
+        ldrsh           w1, [x0]
+        add             w1, w1, #1
+        asr             w1, w1, #1
+        add             w1, w1, #(1 << (13 - 8))
+        asr             w1, w1, #(14 - 8)
+2:
+        dup             v0.8h, w1
 .if \size < 8
         stp             q0, q0, [x0]
 .else
@@ -932,14 +950,7 @@ function ff_hevc_idct_\size\()x\size\()_dc_\bitdepth\()_neon, export=1
 endfunc
 .endm
 
-idct_dc 4, 8
-idct_dc 4, 10
-
-idct_dc 8, 8
-idct_dc 8, 10
-
-idct_dc 16, 8
-idct_dc 16, 10
-
-idct_dc 32, 8
-idct_dc 32, 10
+idct_dc 4
+idct_dc 8
+idct_dc 16
+idct_dc 32
diff --git a/libavcodec/aarch64/hevcdsp_init_aarch64.c b/libavcodec/aarch64/hevcdsp_init_aarch64.c
index 386d7c59c8..5dd470baaa 100644
--- a/libavcodec/aarch64/hevcdsp_init_aarch64.c
+++ b/libavcodec/aarch64/hevcdsp_init_aarch64.c
@@ -91,6 +91,10 @@ void ff_hevc_idct_4x4_dc_10_neon(int16_t *coeffs);
 void ff_hevc_idct_8x8_dc_10_neon(int16_t *coeffs);
 void ff_hevc_idct_16x16_dc_10_neon(int16_t *coeffs);
 void ff_hevc_idct_32x32_dc_10_neon(int16_t *coeffs);
+void ff_hevc_idct_4x4_dc_12_neon(int16_t *coeffs);
+void ff_hevc_idct_8x8_dc_12_neon(int16_t *coeffs);
+void ff_hevc_idct_16x16_dc_12_neon(int16_t *coeffs);
+void ff_hevc_idct_32x32_dc_12_neon(int16_t *coeffs);
 void ff_hevc_transform_luma_4x4_neon_8(int16_t *coeffs);
 
 #define NEON8_FNASSIGN(member, v, h, fn, ext) \
@@ -267,5 +271,9 @@ av_cold void ff_hevc_dsp_init_aarch64(HEVCDSPContext *c, const int bit_depth)
         c->add_residual[1] = ff_hevc_add_residual_8x8_12_neon;
         c->add_residual[2] = ff_hevc_add_residual_16x16_12_neon;
         c->add_residual[3] = ff_hevc_add_residual_32x32_12_neon;
+        c->idct_dc[0] = ff_hevc_idct_4x4_dc_12_neon;
+        c->idct_dc[1] = ff_hevc_idct_8x8_dc_12_neon;
+        c->idct_dc[2] = ff_hevc_idct_16x16_dc_12_neon;
+        c->idct_dc[3] = ff_hevc_idct_32x32_dc_12_neon;
     }
 }
-- 
2.46.0
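
For readers who want to check the arithmetic, the per-depth prologue in the assembly above (ldrsh, add #1, asr #1, add #(1 << (13 - depth)), asr #(14 - depth), then a broadcast store) can be sketched in scalar C roughly as follows. This is only an illustration, not code from the patch: the name idct_dc_ref and its size and bit_depth parameters are hypothetical, and it assumes that >> on a negative int is an arithmetic shift, as asr is on AArch64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference for the DC-only inverse transform.
 * size is the block width (4, 8, 16 or 32); bit_depth is 8, 10 or 12.
 * Assumes >> on a negative int behaves as an arithmetic shift. */
static void idct_dc_ref(int16_t *coeffs, int size, int bit_depth)
{
    assert(bit_depth >= 8 && bit_depth <= 12);

    int shift = 14 - bit_depth;       /* asr w1, w1, #(14 - depth)       */
    int add   = 1 << (shift - 1);     /* add w1, w1, #(1 << (13 - depth)) */

    /* First stage: (dc + 1) >> 1, then second-stage rounding and shift. */
    int dc = (((coeffs[0] + 1) >> 1) + add) >> shift;

    /* Broadcast the DC value over the whole block (dup + stores in asm). */
    for (int i = 0; i < size * size; i++)
        coeffs[i] = (int16_t)dc;
}
```

Under these assumptions, an input DC coefficient of 100 becomes 13 at 12 bits, 3 at 10 bits and 1 at 8 bits. Only the two immediates differ between bit depths, which is why the three entry points in the patch can branch to a shared store tail at label 2.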