From: flow gg <hlefthleft@gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] af_afir: RISC-V V fcmul_add
Date: Mon, 13 Nov 2023 17:43:01 +0800
Message-ID: <CAEa-L+uLB2dnEV3UEHERvpgB2aUjmWOjps_K9d2U037nBqDy4g@mail.gmail.com>
In-Reply-To: <B5EA077E-39A9-4083-99D4-934F766BD572@remlab.net>

[-- Attachment #1: Type: text/plain, Size: 2905 bytes --]

Sorry for the long delay in responding.

How does the modified patch look now?

It no longer uses register stride (I learned from your code), and it now
uses shNadd instead of mul.
It uses m4 and m2, as they were slightly faster than m8 and m4.

benchmark:
fcmul_add_c: 2179
fcmul_add_rvv_f32: 1652

On Thu, 28 Sep 2023 at 21:33, Rémi Denis-Courmont <remi@remlab.net> wrote:

>
>
> On 28 September 2023 08:45:44 GMT+03:00, flow gg <hlefthleft@gmail.com>
> wrote:
> >Okay, I reverted the volatile in ff_read_time
> >
> >How about this version?
>
> It's still using register stride which is all but guaranteed to be slow on
> any hardware and should only be used as a last resort.
>
> The code is also missing scheduling for multi-issue and unrolling with the
> group multiplier.
>
> And lastly, while that probably won't change much, there are no reasons to
> use mul here. You can use shNadd like existing code does.
>
> >
> >Use vls instead of vlseg, and use vfmacc
> >
> >The benchmark is sometimes better, sometimes the same
> >
> >fcmul_add_c: 3.5
> >fcmul_add_rvv_f32: 3.5
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.5
> >fcmul_add_rvv_f32: 4.2
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.2
> >fcmul_add_rvv_f32: 4.2
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.5
> >fcmul_add_rvv_f32: 4.2
> > - af_afir.fcmul_add [OK]
> >fcmul_add_c: 4.7
> >fcmul_add_rvv_f32: 3.5
> >
> >
> >On Thu, 28 Sep 2023 at 00:41, Rémi Denis-Courmont <remi@remlab.net> wrote:
> >
> >> On Tuesday, 26 September 2023, 12.24.58 EEST, flow gg wrote:
> >> > benchmark:
> >> > fcmul_add_c: 19.7
> >> > fcmul_add_rvv_f32: 6.7
> >>
> >> With optimisations enabled and the benchmarking fix, I get this (on the
> >> same hardware, I believe):
> >>
> >> fcmul_add_c: 3.5
> >> fcmul_add_rvv_f32: 6.7
> >>
> >> For sure unfortunate design limitations of T-Head C910 are to blame to
> >> no small extent. It is not the first occurrence of an RVV optimisation
> >> that turns out worse than scalar due to those, and I still have honest
> >> hopes that newer (and conformant) IP would give saner results, but... I
> >> also believe that the code could be improved regardless.
> >>
> >> --
> >> Rémi Denis-Courmont
> >> http://www.remlab.net/
> >>
> >>
> >>
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel@ffmpeg.org
> >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
> >>

[-- Attachment #2: af_afir-RISC-V-V-fcmul_add.patch --]
[-- Type: text/x-patch, Size: 5417 bytes --]

From 4199887247d31348385cd864b4efd6f4c02740f2 Mon Sep 17 00:00:00 2001
From: sunyuechi <sunyuechi@iscas.ac.cn>
Date: Fri, 3 Nov 2023 10:35:53 +0800
Subject: [PATCH] af_afir: RISC-V V fcmul_add

benchmark:
fcmul_add_c: 2179
fcmul_add_rvv_f32: 1652
---
 libavfilter/af_afirdsp.h         |  3 ++
 libavfilter/riscv/Makefile       |  2 ++
 libavfilter/riscv/af_afir_init.c | 39 ++++++++++++++++++++
 libavfilter/riscv/af_afir_rvv.S  | 61 ++++++++++++++++++++++++++++++++
 4 files changed, 105 insertions(+)
 create mode 100644 libavfilter/riscv/Makefile
 create mode 100644 libavfilter/riscv/af_afir_init.c
 create mode 100644 libavfilter/riscv/af_afir_rvv.S

diff --git a/libavfilter/af_afirdsp.h b/libavfilter/af_afirdsp.h
index 4208501393..d2d1e909c1 100644
--- a/libavfilter/af_afirdsp.h
+++ b/libavfilter/af_afirdsp.h
@@ -34,6 +34,7 @@ typedef struct AudioFIRDSPContext {
 } AudioFIRDSPContext;

 void ff_afir_init_x86(AudioFIRDSPContext *s);
+void ff_afir_init_riscv(AudioFIRDSPContext *s);

 static void fcmul_add_c(float *sum, const float *t, const float *c, ptrdiff_t len)
 {
@@ -76,6 +77,8 @@ static av_unused void ff_afir_init(AudioFIRDSPContext *dsp)

 #if ARCH_X86
     ff_afir_init_x86(dsp);
+#elif ARCH_RISCV
+    ff_afir_init_riscv(dsp);
 #endif
 }

diff --git a/libavfilter/riscv/Makefile b/libavfilter/riscv/Makefile
new file mode 100644
index 0000000000..0b968a9c0d
--- /dev/null
+++ b/libavfilter/riscv/Makefile
@@ -0,0 +1,2 @@
+OBJS += riscv/af_afir_init.o
+RVV-OBJS += riscv/af_afir_rvv.o
diff --git a/libavfilter/riscv/af_afir_init.c b/libavfilter/riscv/af_afir_init.c
new file mode 100644
index 0000000000..13df8341e7
--- /dev/null
+++ b/libavfilter/riscv/af_afir_init.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright (c) 2023 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavfilter/af_afirdsp.h"
+
+void ff_fcmul_add_rvv(float *sum, const float *t, const float *c,
+                      ptrdiff_t len);
+
+av_cold void ff_afir_init_riscv(AudioFIRDSPContext *s)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_F32)
+        s->fcmul_add = ff_fcmul_add_rvv;
+#endif
+}
diff --git a/libavfilter/riscv/af_afir_rvv.S b/libavfilter/riscv/af_afir_rvv.S
new file mode 100644
index 0000000000..078cac8e7e
--- /dev/null
+++ b/libavfilter/riscv/af_afir_rvv.S
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2023 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+// void ff_fcmul_add(float *sum, const float *t, const float *c, int len)
+func ff_fcmul_add_rvv, zve32f
+        li          t1, 32
+1:
+        vsetvli     t0, a3, e64, m4, ta, ma
+        vle64.v     v12, (a0)
+        sub         a3, a3, t0
+        vsetvli     zero, zero, e32, m2, ta, ma
+        vnsrl.vx    v8, v12, zero
+        vnsrl.vx    v10, v12, t1
+        vsetvli     zero, zero, e64, m4, ta, ma
+        vle64.v     v12, (a1)
+        sh3add      a1, t0, a1
+        vsetvli     zero, zero, e32, m2, ta, ma
+        vnsrl.vx    v0, v12, zero
+        vnsrl.vx    v2, v12, t1
+        vsetvli     zero, zero, e64, m4, ta, ma
+        vle64.v     v12, (a2)
+        sh3add      a2, t0, a2
+        vsetvli     zero, zero, e32, m2, ta, ma
+        vnsrl.vx    v4, v12, zero
+        vnsrl.vx    v6, v12, t1
+        vfmacc.vv   v8, v0, v4
+        vfnmsac.vv  v8, v2, v6
+        vfmacc.vv   v10, v0, v6
+        vfmacc.vv   v10, v2, v4
+        vsseg2e32.v v8, (a0)
+        sh3add      a0, t0, a0
+        bgtz        a3, 1b
+
+        flw         fa0, 0(a1)
+        flw         fa1, 0(a2)
+        flw         fa2, 0(a0)
+        fmul.s      fa0, fa0, fa1
+        fadd.s      fa2, fa2, fa0
+        fsw         fa2, 0(a0)
+
+        ret
+endfunc
-- 
2.42.1

[-- Attachment #3: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
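
For readers following the assembly in the patch above, here is a minimal C sketch of what the vector loop appears to compute, written from a reading of the instructions rather than taken from the patch. The names fcmul_add_ref, fcmul_add_split and split_pair are hypothetical and exist only for this illustration. The second function mirrors the data flow of the loop: each interleaved (re, im) pair is loaded as 64 bits and split with shifts by 0 and 32, as vle64.v followed by two vnsrl.vx does, then four multiply-accumulate steps are applied. It assumes a little-endian host and does not model vector lengths, LMUL grouping, or the fused rounding of vfmacc/vfnmsac.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Straightforward scalar reading: sum += t * c on interleaved (re, im)
 * pairs, followed by one real-only trailing element (the scalar tail
 * after the loop in the assembly). Hypothetical helper name. */
static void fcmul_add_ref(float *sum, const float *t, const float *c,
                          ptrdiff_t len)
{
    ptrdiff_t n;

    assert(len >= 0);
    for (n = 0; n < len; n++) {
        const float tre = t[2 * n], tim = t[2 * n + 1];
        const float cre = c[2 * n], cim = c[2 * n + 1];

        sum[2 * n]     += tre * cre - tim * cim;
        sum[2 * n + 1] += tre * cim + tim * cre;
    }
    sum[2 * n] += t[2 * n] * c[2 * n];
}

/* Split one interleaved pair via a 64-bit load and shifts by 0 and 32.
 * Assumption: little-endian byte order, so the low half is re. */
static void split_pair(const float *p, float *re, float *im)
{
    uint64_t bits;
    uint32_t lo, hi;

    memcpy(&bits, p, sizeof(bits));
    lo = (uint32_t)(bits >> 0);   /* like vnsrl.vx ..., zero */
    hi = (uint32_t)(bits >> 32);  /* like vnsrl.vx ..., t1 (t1 = 32) */
    memcpy(re, &lo, sizeof(lo));
    memcpy(im, &hi, sizeof(hi));
}

/* Same result, following the order of operations in the vector loop. */
static void fcmul_add_split(float *sum, const float *t, const float *c,
                            ptrdiff_t len)
{
    ptrdiff_t n;

    assert(len >= 0);
    for (n = 0; n < len; n++) {
        float sre, sim, tre, tim, cre, cim;

        split_pair(sum + 2 * n, &sre, &sim);
        split_pair(t + 2 * n, &tre, &tim);
        split_pair(c + 2 * n, &cre, &cim);

        sre += tre * cre;  /* vfmacc.vv  v8,  v0, v4 */
        sre -= tim * cim;  /* vfnmsac.vv v8,  v2, v6 */
        sim += tre * cim;  /* vfmacc.vv  v10, v0, v6 */
        sim += tim * cre;  /* vfmacc.vv  v10, v2, v4 */

        sum[2 * n]     = sre;  /* vsseg2e32.v stores re and im interleaved */
        sum[2 * n + 1] = sim;
    }
    sum[2 * n] += t[2 * n] * c[2 * n];
}
```

Both functions read and write 2 * len + 1 floats per buffer. On inputs where every intermediate value is exactly representable they should agree bit for bit; on general inputs the results can differ in the last bits depending on whether the compiler or hardware fuses the multiply and add.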