From: Zhao Zhili
To: ffmpeg-devel@ffmpeg.org
Cc: Zhao Zhili
Date: Sat, 7 Jun 2025 18:18:11 +0800
Subject: [FFmpeg-devel] [PATCH 1/2] wasm/hevc: Add sao_band_filter

From: Zhao Zhili

hevc_sao_band_8_8_c:                                    63.0 ( 1.00x)
hevc_sao_band_8_8_simd128:                              10.4 ( 6.06x)
hevc_sao_band_16_8_c:                                  230.4 ( 1.00x)
hevc_sao_band_16_8_simd128:                             22.9 (10.07x)
hevc_sao_band_32_8_c:                                  900.4 ( 1.00x)
hevc_sao_band_32_8_simd128:                             81.5 (11.05x)
hevc_sao_band_48_8_c:                                 2009.1 ( 1.00x)
hevc_sao_band_48_8_simd128:                            170.2 (11.80x)
hevc_sao_band_64_8_c:                                 3535.0 ( 1.00x)
hevc_sao_band_64_8_simd128:                            297.5 (11.88x)

Signed-off-by: Zhao Zhili
---
 libavcodec/wasm/hevc/Makefile   |   3 +-
 libavcodec/wasm/hevc/dsp_init.c |   7 ++
 libavcodec/wasm/hevc/sao.c      | 113 ++++++++++++++++++++++++++++++++
 libavcodec/wasm/hevc/sao.h      |  41 ++++++++++++
 4 files changed, 163 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/wasm/hevc/sao.c
 create mode 100644 libavcodec/wasm/hevc/sao.h

diff --git a/libavcodec/wasm/hevc/Makefile b/libavcodec/wasm/hevc/Makefile
index 132daa3106..7e8ab3776e 100644
--- a/libavcodec/wasm/hevc/Makefile
+++ b/libavcodec/wasm/hevc/Makefile
@@ -1,3 +1,4 @@
 OBJS-$(CONFIG_HEVC_DECODER)             += wasm/hevc/dsp_init.o
 
-SIMD128-OBJS-$(CONFIG_HEVC_DECODER)     += wasm/hevc/idct.o
+SIMD128-OBJS-$(CONFIG_HEVC_DECODER)     += wasm/hevc/idct.o \
+                                           wasm/hevc/sao.o
diff --git a/libavcodec/wasm/hevc/dsp_init.c b/libavcodec/wasm/hevc/dsp_init.c
index e5c8a2ebb6..76a1031ff4 100644
--- a/libavcodec/wasm/hevc/dsp_init.c
+++ b/libavcodec/wasm/hevc/dsp_init.c
@@ -21,6 +21,7 @@
 #include "libavutil/cpu_internal.h"
 #include "libavcodec/hevc/dsp.h"
 #include "libavcodec/wasm/hevc/idct.h"
+#include "libavcodec/wasm/hevc/sao.h"
 
 av_cold void ff_hevc_dsp_init_wasm(HEVCDSPContext *c, const int bit_depth)
 {
@@ -35,6 +36,12 @@ av_cold void ff_hevc_dsp_init_wasm(HEVCDSPContext *c, const int bit_depth)
         c->idct[1] = ff_hevc_idct_8x8_8_simd128;
         c->idct[2] = ff_hevc_idct_16x16_8_simd128;
         c->idct[3] = ff_hevc_idct_32x32_8_simd128;
+
+        c->sao_band_filter[0] = ff_hevc_sao_band_filter_8x8_8_simd128;
+        c->sao_band_filter[1] =
+        c->sao_band_filter[2] =
+        c->sao_band_filter[3] =
+        c->sao_band_filter[4] = ff_hevc_sao_band_filter_16x16_8_simd128;
     } else if (bit_depth == 10) {
         c->idct[0] = ff_hevc_idct_4x4_10_simd128;
         c->idct[1] = ff_hevc_idct_8x8_10_simd128;
diff --git a/libavcodec/wasm/hevc/sao.c b/libavcodec/wasm/hevc/sao.c
new file mode 100644
index 0000000000..82134af7f3
--- /dev/null
+++ b/libavcodec/wasm/hevc/sao.c
@@ -0,0 +1,113 @@
+/*
+ * Copyright (c) 2025 Zhao Zhili
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "sao.h"
+
+#include <wasm_simd128.h>
+
+void ff_hevc_sao_band_filter_8x8_8_simd128(uint8_t *dst, const uint8_t *src,
+                                           ptrdiff_t stride_dst,
+                                           ptrdiff_t stride_src,
+                                           const int16_t *sao_offset_val,
+                                           int sao_left_class, int width,
+                                           int height)
+{
+    int8_t offset_table[32] = {0};
+    v128_t offset_low, offset_high;
+
+    for (int k = 0; k < 4; k++)
+        offset_table[(k + sao_left_class) & 31] = (int8_t)sao_offset_val[k + 1];
+
+    offset_low = wasm_v128_load(offset_table);
+    offset_high = wasm_v128_load(&offset_table[16]);
+
+    for (int y = height; y > 0; y -= 2) {
+        v128_t src_v, src_high;
+        v128_t v0, v1;
+
+        src_v = wasm_v128_load64_zero(src);
+        src += stride_src;
+        src_v = wasm_v128_load64_lane(src, src_v, 1);
+        src += stride_src;
+
+        v0 = wasm_u8x16_shr(src_v, 3);
+        v1 = wasm_i8x16_sub(v0, wasm_i8x16_const_splat(16));
+        v0 = wasm_i8x16_swizzle(offset_low, v0);
+        v1 = wasm_i8x16_swizzle(offset_high, v1);
+        v0 = wasm_v128_or(v0, v1);
+        src_high = wasm_u16x8_extend_high_u8x16(src_v);
+        v1 = wasm_i16x8_extend_high_i8x16(v0);
+        src_v = wasm_u16x8_extend_low_u8x16(src_v);
+        v0 = wasm_i16x8_extend_low_i8x16(v0);
+
+        v0 = wasm_i16x8_add_sat(src_v, v0);
+        v1 = wasm_i16x8_add_sat(src_high, v1);
+        v0 = wasm_u8x16_narrow_i16x8(v0, v1);
+
+        wasm_v128_store64_lane(dst, v0, 0);
+        dst += stride_dst;
+        wasm_v128_store64_lane(dst, v0, 1);
+        dst += stride_dst;
+    }
+}
+
+void ff_hevc_sao_band_filter_16x16_8_simd128(uint8_t *dst, const uint8_t *src,
+                                             ptrdiff_t stride_dst,
+                                             ptrdiff_t stride_src,
+                                             const int16_t *sao_offset_val,
+                                             int sao_left_class, int width,
+                                             int height)
+{
+    int8_t offset_table[32] = {0};
+    v128_t offset_low, offset_high;
+
+    for (int k = 0; k < 4; k++)
+        offset_table[(k + sao_left_class) & 31] = (int8_t)sao_offset_val[k + 1];
+
+    offset_low = wasm_v128_load(offset_table);
+    offset_high = wasm_v128_load(&offset_table[16]);
+
+    for (int y = height; y > 0; y--) {
+        for (int x = 0; x < width; x += 16) {
+            v128_t src_v, src_high;
+            v128_t v0, v1;
+
+            src_v = wasm_v128_load(&src[x]);
+
+            v0 = wasm_u8x16_shr(src_v, 3);
+            v1 = wasm_i8x16_sub(v0, wasm_i8x16_const_splat(16));
+            v0 = wasm_i8x16_swizzle(offset_low, v0);
+            v1 = wasm_i8x16_swizzle(offset_high, v1);
+            v0 = wasm_v128_or(v0, v1);
+            src_high = wasm_u16x8_extend_high_u8x16(src_v);
+            v1 = wasm_i16x8_extend_high_i8x16(v0);
+            src_v = wasm_u16x8_extend_low_u8x16(src_v);
+            v0 = wasm_i16x8_extend_low_i8x16(v0);
+
+            v0 = wasm_i16x8_add_sat(src_v, v0);
+            v1 = wasm_i16x8_add_sat(src_high, v1);
+            v0 = wasm_u8x16_narrow_i16x8(v0, v1);
+            wasm_v128_store(&dst[x], v0);
+        }
+
+        dst += stride_dst;
+        src += stride_src;
+    }
+}
diff --git a/libavcodec/wasm/hevc/sao.h b/libavcodec/wasm/hevc/sao.h
new file mode 100644
index 0000000000..6119ec90f1
--- /dev/null
+++ b/libavcodec/wasm/hevc/sao.h
@@ -0,0 +1,41 @@
+/*
+ * Copyright (c) 2025 Zhao Zhili
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_WASM_HEVC_SAO_H
+#define AVCODEC_WASM_HEVC_SAO_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+void ff_hevc_sao_band_filter_8x8_8_simd128(uint8_t *_dst, const uint8_t *_src,
+                                           ptrdiff_t _stride_dst,
+                                           ptrdiff_t _stride_src,
+                                           const int16_t *sao_offset_val,
+                                           int sao_left_class, int width,
+                                           int height);
+
+void ff_hevc_sao_band_filter_16x16_8_simd128(uint8_t *_dst, const uint8_t *_src,
+                                             ptrdiff_t _stride_dst,
+                                             ptrdiff_t _stride_src,
+                                             const int16_t *sao_offset_val,
+                                             int sao_left_class, int width,
+                                             int height);
+
+#endif
\ No newline at end of file
-- 
2.43.0
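Note for readers of the patch: the two simd128 functions build a 32-entry offset table indexed by the top five bits of each 8-bit sample, then add the looked-up offset with saturation. A scalar sketch of that same lookup may help when checking the vector code lane by lane. It is an illustration only and not part of the patch: the name sao_band_ref is hypothetical, the table setup is copied from the patch, and the explicit clamp stands in for the saturating add plus unsigned narrow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, NOT part of the patch. Assumes 8-bit samples. */
static void sao_band_ref(uint8_t *dst, const uint8_t *src,
                         ptrdiff_t stride_dst, ptrdiff_t stride_src,
                         const int16_t *sao_offset_val,
                         int sao_left_class, int width, int height)
{
    int8_t offset_table[32] = {0};

    /* Four consecutive bands, starting at sao_left_class and wrapping
     * modulo 32, receive an offset; every other band keeps offset 0. */
    for (int k = 0; k < 4; k++)
        offset_table[(k + sao_left_class) & 31] = (int8_t)sao_offset_val[k + 1];

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* Band index is the sample's top five bits (sample >> 3). */
            int v = src[x] + offset_table[src[x] >> 3];

            /* Clamp to the 8-bit range, as the SIMD path does by saturation. */
            dst[x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
        dst += stride_dst;
        src += stride_src;
    }
}
```

In the vector code the table is split into two 16-byte halves. wasm_i8x16_swizzle yields 0 for any index outside 0..15, so OR-ing the lookup on the index with the lookup on index - 16 gives the same result as the single 32-entry lookup above.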