* [FFmpeg-devel] [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
From: yunfei_zhou--- via ffmpeg-devel @ 2025-11-14 1:52 UTC
To: ffmpeg-devel; +Cc: yunfei_zhou
Hi all,
I hope this message finds you well.
My name is Yunfei Zhou, and I am a Software Development Engineer at Alibaba DAMO Academy, where I focus on video coding and decoding optimization.
We are currently exploring vectorization optimizations for HEVC decoding using the RISC-V Vector Extension (RVV) and are eager to contribute our work to the FFmpeg community.
Before proceeding, we would like to understand whether there are any existing or ongoing efforts in this area to avoid duplication and, ideally, align or collaborate with current initiatives.
Could you kindly share an update on the current status of RISC-V RVV support for HEVC decoding in FFmpeg? Specifically, we’d appreciate any information regarding:
* Recent developments or patches related to RVV-based HEVC optimizations,
* Planned future work in this domain, and
* Available documentation or resources that could help us better understand the existing codebase and optimization strategies.
Thank you very much for your time and support. We look forward to contributing to and learning from the FFmpeg community.
Best regards,
Yunfei Zhou
Alibaba DAMO Academy
_______________________________________________
ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org
To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org
* [FFmpeg-devel] Re: [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
From: Zhao Zhili via ffmpeg-devel @ 2025-11-14 4:20 UTC
To: FFmpeg development discussions and patches; +Cc: yunfei_zhou, Zhao Zhili
> On Nov 14, 2025, at 09:52, yunfei_zhou--- via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote:
>
> Hi all,
> I hope this message finds you well.
> My name is Yunfei Zhou, and I am a Software Development Engineer at Alibaba DAMO Academy, where I focus on video coding and decoding optimization.
> We are currently exploring vectorization optimizations for HEVC decoding using the RISC-V Vector Extension (RVV) and are eager to contribute our work to the FFmpeg community.
> Before proceeding, we would like to understand whether there are any existing or ongoing efforts in this area to avoid duplication and, ideally, align or collaborate with current initiatives.
> Could you kindly share an update on the current status of RISC-V RVV support for HEVC decoding in FFmpeg? Specifically, we’d appreciate any information regarding:
>
> * Recent developments or patches related to RVV-based HEVC optimizations,
Search for "RVV" in the pull requests on code.ffmpeg.org:
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls?state=open&type=all&labels=&milestone=0&project=0&assignee=0&poster=0&sort=&q=rvv
>
> * Planned future work in this domain, and
> * Available documentation or resources that could help us better understand the existing codebase and optimization strategies.
> Thank you very much for your time and support. We look forward to contributing to and learning from the FFmpeg community.
The general documentation on contributing is here:
https://www.ffmpeg.org/developer.html#Introduction
I have also written a short summary of asm optimization:
https://gist.github.com/quink-black/316ac42e0482f4158dd7df3003794576
> Best regards,
> Yunfei Zhou
> Alibaba DAMO Academy
* [FFmpeg-devel] Re: [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
From: Rémi Denis-Courmont via ffmpeg-devel @ 2025-11-14 14:21 UTC
To: FFmpeg development discussions and patches
Cc: yunfei_zhou, Rémi Denis-Courmont
Nihao,
On 14 November 2025 03:52:51 GMT+02:00, yunfei_zhou--- via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote:
>Before proceeding, we would like to understand whether there are any existing or ongoing efforts in this area to avoid duplication and, ideally, align or collaborate with current initiatives.
You can find the existing code in the official Git repo. Ongoing efforts are unknown to us. You had probably better ask the RISE multimedia group than FFmpeg-devel. I suppose you or one of your colleagues should have access. (I don't think anyone here has.)
>* Available documentation or resources that could help us better understand the existing codebase and optimization strategies.
To be honest, in my experience, while it is obviously possible to optimise video decoding with RVV, the current implementations are not competitive (with e.g. Armv8 AdvSIMD), mainly because of two issues:
1) Segmented loads and stores are slow. Because video decoding often involves transposition, we would really need segmented unit-strided accesses to run as fast, or almost as fast, as single-segment unit-strided accesses of the same size. Likewise, we need segmented register-strided accesses to be almost as fast as single-segment register-strided accesses.
2) Because RVV is scalable, and video decoding uses a lot of fixed-size and/or small vectors, we need instruction execution cost to scale according to VL or next_power_of_two(VL). Currently it seems to scale according to VLMAX, which means larger vectors make optimisations worse rather than better.
(This is based on benchmarks for your C910 and C908 cores, and SpacemiT's X60. I don't have access to any other hardware at the moment.)
Point being, the available hardware seems a little bit immature, so we don't really have settled optimisation strategies.
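For readers unfamiliar with why segmented accesses matter for transposition, here is a minimal Python sketch of what a segmented load does. It is only a behavioural model, not FFmpeg code: the function names are hypothetical, strides are counted in elements rather than bytes, and nothing here says anything about speed. It models the behaviour of the RVV 1.0 segment loads (the `vlseg<nf>e<eew>.v` and `vlsseg<nf>e<eew>.v` families): field f of segment j lands in element j of destination register f.

```python
def segmented_load(memory, base, nfields, vl):
    """Unit-stride segmented load: segment j occupies nfields consecutive elements."""
    return strided_segmented_load(memory, base, stride=nfields, nfields=nfields, vl=vl)


def strided_segmented_load(memory, base, stride, nfields, vl):
    """Strided segmented load: segment j starts at base + j * stride.

    Assumption: stride is in elements here (the real instructions use bytes).
    Field f of segment j lands in element j of destination register f.
    """
    regs = [[0] * vl for _ in range(nfields)]
    for j in range(vl):
        for f in range(nfields):
            regs[f][j] = memory[base + j * stride + f]
    return regs


# A 4x4 block stored row-major and contiguously: each row is one segment.
block = list(range(16))
columns = segmented_load(block, base=0, nfields=4, vl=4)

# The same block embedded at column 2 of a hypothetical 8-element-wide frame.
frame_width = 8
frame = [0] * (frame_width * 4)
for row in range(4):
    for col in range(4):
        frame[row * frame_width + 2 + col] = block[row * 4 + col]
columns_from_frame = strided_segmented_load(
    frame, base=2, stride=frame_width, nfields=4, vl=4
)
```

In both cases each destination register ends up holding one column of the block, so the load itself performs the transpose. That is why the relative speed of segmented versus single-segment accesses matters so much in point 1.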
Br,
* [FFmpeg-devel] Reply: Re: [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
From: yunfei_zhou--- via ffmpeg-devel @ 2025-11-15 2:50 UTC
To: FFmpeg development discussions and patches
Cc: Rémi Denis-Courmont, yunfei_zhou
Hi Zhili & Rémi Denis-Courmont,
Thank you very much, Zhao Zhili, for the helpful pointers! I’ll immediately review the latest RVV-related patches and pull requests on code.ffmpeg.org <https://code.ffmpeg.org/> and study your excellent summary on assembly optimization.
I also sincerely appreciate Rémi Denis-Courmont’s detailed feedback. In response to the points you raised, I’d like to share a bit about our current efforts:
* RISE Multimedia Group: We’ll reach out to our internal colleagues to check whether there are any ongoing initiatives within that community, to avoid duplication and explore potential collaboration.
* Segmented load/store performance: We’ve encountered similar bottlenecks in our video decoding optimizations. To address this, we’re actively proposing new vector instructions tailored for media workloads to the RISC-V International standards body. At the same time, we’re working closely with RISC-V CPU microarchitecture teams to improve the hardware efficiency of these memory operations.
* Scalable VLEN (variable vector length) challenges: I fully agree with your observation. The scalability of RVV is meant to provide flexibility: ideally, a single optimized implementation should adapt gracefully across different VLEN configurations. However, in practice, video codecs like HEVC predominantly use fixed-size blocks (e.g., 4×4, 8×8). As a result, an algorithm optimized for VLEN=128 may not perform better, and may even regress, on a VLEN=256 system, despite the latter having higher theoretical compute throughput. This forces us to develop separate optimizations per VLEN, which undermines the original intent of RVV’s scalability. We believe there’s significant room for discussion and innovation here, both in software strategies and hardware design.
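The VLEN point can be made concrete with a small Python sketch. It uses the standard RVV relation VLMAX = LMUL × VLEN / SEW; everything else is an assumption for illustration: the function names are hypothetical, and `modelled_cost` is a toy model of the two scaling behaviours discussed in this thread, not a measurement of any core.

```python
from fractions import Fraction


def vlmax(vlen_bits, sew_bits, lmul=Fraction(1)):
    """VLMAX = LMUL * VLEN / SEW, the element count of one register group."""
    return int(Fraction(lmul) * vlen_bits / sew_bits)


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def lane_utilisation(vl, vlen_bits, sew_bits):
    """Fraction of one register's elements that a VL-element operation uses."""
    return Fraction(vl, vlmax(vlen_bits, sew_bits))


def modelled_cost(vl, vlen_bits, sew_bits, policy):
    """Toy cost in element slots per instruction; not measured on any hardware."""
    if policy == "vlmax":
        return vlmax(vlen_bits, sew_bits)
    if policy == "vl":
        return next_power_of_two(vl)
    raise ValueError(f"unknown policy: {policy}")


# One row of an 8x8 block of 16-bit coefficients: VL = 8, SEW = 16.
for vlen in (128, 256, 512):
    print(
        vlen,
        lane_utilisation(8, vlen, 16),
        modelled_cost(8, vlen, 16, "vlmax"),
        modelled_cost(8, vlen, 16, "vl"),
    )
```

Under the "vlmax" policy the modelled cost of the same 8-element row doubles each time VLEN doubles, while under the "vl" policy it stays constant. That is the regression described above, expressed as arithmetic.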
We recognize that the RISC-V vector ecosystem is still evolving rapidly. Nevertheless, we’re confident that through close hardware-software co-design, RVV can become highly competitive in video coding workloads over time. RISC-V’s open nature makes it especially well-suited for such collaborative improvements—and we warmly welcome any performance insights, suggestions, or discussions from the community.
Thank you again for your valuable input and support!
Best regards,
Yunfei Zhou
Alibaba DAMO Academy
* [FFmpeg-devel] Re: Reply: Re: [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
From: Rémi Denis-Courmont via ffmpeg-devel @ 2025-11-15 11:51 UTC
To: FFmpeg development discussions and patches; +Cc: Rémi Denis-Courmont
Nihao,
On Saturday, 15 November 2025, at 4:50:11 Eastern European Standard Time,
yunfei_zhou--- via ffmpeg-devel wrote:
> Segmented load/store performance: We’ve encountered similar bottlenecks in
> our video decoding optimizations. To address this, we’re actively proposing
> new vector instructions tailored for media workloads to the RISC-V
> International standards body. At the same time, we’re working closely with
> RISC-V CPU microarchitecture teams to improve the hardware efficiency of
> these memory operations.
Nathan Edge (Google / RISE) gathered a list of useful instructions for
multimedia at last year's VDD in Seoul. I do not know what came out of it
though. However, as far as segmented loads and stores are concerned, I don't
think that the instruction set has much of a problem. This looks like an
implementation limitation.
Of course, FFmpeg could use an in-register transpose instruction. There are
cases of transposition not immediately following a load or preceding a store -
particularly with video codec two-dimensional transforms. But at the same
time, FFmpeg probably has to retain support for RVV 1.0 and RVA23 processors
for a long time. Any new instruction set extension will require additional
specialised optimisations, adding to the maintenance burden. So from the open-
source project's standpoint, that really should be the last resort.
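
To illustrate where a transpose sits inside a two-dimensional transform, away from any load or store, here is a minimal Python sketch of a separable transform done as row pass, transpose, row pass, transpose. The 4-point matrix `H` is a hypothetical Hadamard-like stand-in, not the HEVC transform, and the helper names are made up for this example.

```python
# Hypothetical 4-point transform matrix (Hadamard-like), NOT the HEVC DCT.
H = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def transpose(block):
    """Swap rows and columns; stands in for an in-register transpose."""
    return [list(col) for col in zip(*block)]


def transform_rows(block):
    """Apply the 1-D transform to every row, i.e. compute block @ H^T."""
    return [[sum(h * x for h, x in zip(h_row, row)) for h_row in H] for row in block]


def transform_2d(block):
    """Row pass, transpose, row pass, transpose: yields H @ block @ H^T."""
    tmp = transform_rows(block)  # X H^T
    tmp = transpose(tmp)         # H X^T, data is already "in registers" here
    tmp = transform_rows(tmp)    # H X^T H^T
    return transpose(tmp)        # H X H^T


def matmul(a, b):
    """Plain matrix product, used only as a reference."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


residual = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
result = transform_2d(residual)
reference = matmul(matmul(H, residual), transpose(H))
```

The first `transpose` call operates on intermediate results, so a segmented load or store cannot absorb it. Without a dedicated instruction it has to be emulated, for example by a round trip through memory.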
For comparison, FFmpeg is still in the process of removing MMX in favour of
SSE2 and co... Maybe we will be done before MMX turns 30.
--
德尼-库尔蒙‧雷米
https://www.remlab.net/