[FFmpeg-devel] [PATCHv2 0/10] RISC-V V floating point DSP

From: Rémi Denis-Courmont @ 2022-09-04 13:54 UTC
  To: ffmpeg-devel

The following changes since commit b6e8fc1c201d58672639134a737137e1ba7b55fe:

  avcodec/speexdec: improve support for speex in non-ogg (2022-09-04 11:31:57 +0200)

are awaiting thorough bashing at your earliest convenience up to:

  riscv: float vector dot product with RVV (2022-09-04 16:45:38 +0300)

Changes since v1:

- Removed stray define.
- Fixed mismatch between byte and element size in mul-scalar.
- Added fmul, fmac, dmul, dmac, fmul-add, fmul-reverse, fmul-window.
- Added float butterfly and dot product.

All operations are unrolled to the maximum group size (8), with the
exception of overlap/add. The latter seems to require a minimum of 6
vectors (maybe 5 with extremely careful ordering), so its group size is
only 4.
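
The register budget behind that choice can be sketched in C. This is only
an illustration of the arithmetic, not code from the series: the helper
name max_lmul_for() is hypothetical, and it assumes the standard RVV layout
of 32 vector registers used in aligned groups of LMUL registers, so that
32 / LMUL groups are available at once.

```c
#include <assert.h>

/* RVV has 32 architectural vector registers (v0..v31). With a group
 * multiplier LMUL, registers are used in aligned groups of LMUL, so only
 * 32 / LMUL distinct register groups exist. */
enum { RVV_NUM_VREGS = 32 };

/* Hypothetical helper: largest LMUL in {8, 4, 2, 1} that still leaves at
 * least `live_groups` register groups usable at the same time. */
static int max_lmul_for(int live_groups)
{
    assert(live_groups >= 1 && live_groups <= RVV_NUM_VREGS);

    for (int lmul = 8; lmul > 1; lmul /= 2)
        if (RVV_NUM_VREGS / lmul >= live_groups)
            return lmul;
    return 1;
}
```

With 6 live vectors, LMUL=8 offers only 4 groups, so the helper falls back
to 4. Getting down to 5 live vectors would not change the result, since 5
still exceeds the 4 groups available at LMUL=8.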

The pointer arithmetic could be slightly optimised with the SH2ADD and
SH3ADD instructions from the Zba extension. However, this would require
either more conditional code or mandatory Zba support, for probably
negligible performance gains.
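
For reference, here is a C model of what is being traded off. It is a
sketch, not code from the series: the function names are made up for
illustration, and it assumes the loops advance each pointer by an element
count scaled by the element size (4 bytes for float, 8 for double).
SH2ADD and SH3ADD (from the Zba address-generation extension) compute
rs2 + (rs1 << 2) and rs2 + (rs1 << 3) in one instruction, where the base
ISA needs a shift followed by an add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The shift amounts below match the element sizes used by the float DSP. */
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "shift-by-2 / shift-by-3 scaling assumes 4- and 8-byte elements");

/* Base ISA: two instructions, e.g.
 *     slli t1, t0, 2
 *     add  a1, a1, t1
 * Illustrative model only. */
static uintptr_t addr_shift_then_add(uintptr_t base, size_t count,
                                     unsigned shift)
{
    uintptr_t offset = (uintptr_t)count << shift; /* slli */
    return base + offset;                         /* add  */
}

/* Zba: sh2add rd, rs1, rs2  computes  rd = rs2 + (rs1 << 2). */
static uintptr_t sh2add(uintptr_t rs1, uintptr_t rs2)
{
    return rs2 + (rs1 << 2);
}

/* Zba: sh3add rd, rs1, rs2  computes  rd = rs2 + (rs1 << 3). */
static uintptr_t sh3add(uintptr_t rs1, uintptr_t rs2)
{
    return rs2 + (rs1 << 3);
}
```

Both forms yield the same address, so the saving is at most one
instruction per pointer update in each loop iteration.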

----------------------------------------------------------------
Rémi Denis-Courmont (10):
      riscv: add CPU flags for the RISC-V Vector extension
      riscv: initial common header for assembler macros
      riscv: float vector-scalar multiplication with RVV
      riscv: float vector-vector multiplication with RVV
      riscv: float vector multiply-accumulate with RVV
      riscv: float vector multiplication-addition with RVV
      riscv: float vector sum-and-difference with RVV
      riscv: float reversed vector multiplication with RVV
      riscv: float vector windowed overlap/add with RVV
      riscv: float vector dot product with RVV

 libavutil/cpu.c                  |  14 +++
 libavutil/cpu.h                  |   6 +
 libavutil/cpu_internal.h         |   1 +
 libavutil/float_dsp.c            |   2 +
 libavutil/float_dsp.h            |   1 +
 libavutil/riscv/Makefile         |   3 +
 libavutil/riscv/asm.S            |  33 +++++
 libavutil/riscv/cpu.c            |  57 +++++++++
 libavutil/riscv/float_dsp_init.c |  67 ++++++++++
 libavutil/riscv/float_dsp_rvv.S  | 255 +++++++++++++++++++++++++++++++++++++++
 10 files changed, 439 insertions(+)
 create mode 100644 libavutil/riscv/Makefile
 create mode 100644 libavutil/riscv/asm.S
 create mode 100644 libavutil/riscv/cpu.c
 create mode 100644 libavutil/riscv/float_dsp_init.c
 create mode 100644 libavutil/riscv/float_dsp_rvv.S

-- 
レミ・デニ-クールモン
http://www.remlab.net/

