Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: yunfei_zhou--- via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
To: "FFmpeg development discussions and patches" <ffmpeg-devel@ffmpeg.org>
Cc: "Rémi Denis-Courmont" <remi@remlab.net>,
	"yunfei_zhou@linux.alibaba.com" <yunfei_zhou@linux.alibaba.com>
Subject: [FFmpeg-devel] Reply: Re: [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
Date: Sat, 15 Nov 2025 10:50:11 +0800
Message-ID: <3f548521-c75c-4ed3-a566-e21e2216288e.yunfei_zhou@linux.alibaba.com>
In-Reply-To: <08719E19-103B-4A65-BFE6-04507613189E@remlab.net>

Hi Zhili & Rémi Denis-Courmont,
Thank you very much, Zhao Zhili, for the helpful pointers! I'll review the latest RVV-related patches and pull requests on code.ffmpeg.org <https://code.ffmpeg.org/> right away and study your excellent summary on assembly optimization.
I also sincerely appreciate Rémi Denis-Courmont’s detailed feedback. In response to the points you raised, I’d like to share a bit about our current efforts:

 * RISE Multimedia Group: We'll reach out to our internal colleagues to check whether there are any ongoing initiatives within that community, to avoid duplication and explore potential collaboration.

 * Segmented load/store performance: We've encountered similar bottlenecks in our video decoding optimizations. To address this, we're actively proposing new vector instructions tailored for media workloads to the RISC-V International standards body. At the same time, we're working closely with RISC-V CPU microarchitecture teams to improve the hardware efficiency of these memory operations.

 * Scalable VLEN (variable vector length) challenges: I fully agree with your observation. The scalability of RVV is meant to provide flexibility: ideally, a single optimized implementation should adapt gracefully across different VLEN configurations. In practice, however, video codecs like HEVC predominantly use fixed-size blocks (e.g., 4×4, 8×8). As a result, an algorithm optimized for VLEN=128 may not perform better, and may even regress, on a VLEN=256 system, despite the latter having higher theoretical compute throughput. This forces us to develop separate optimizations per VLEN, which undermines the original intent of RVV's scalability. We believe there's significant room for discussion and innovation here, both in software strategies and in hardware design.
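
To make the VLEN point concrete, here is a minimal sketch of the lane arithmetic only, not a performance measurement. It assumes 16-bit samples, LMUL=1, and one row of an 8×8 block per vector operation; the function names are hypothetical and chosen for illustration. It uses the RVV relation VLMAX = LMUL × VLEN / SEW.

```python
from fractions import Fraction


def vlmax(vlen_bits, sew_bits, lmul=1):
    """Elements per register group: VLMAX = LMUL * VLEN / SEW."""
    return lmul * vlen_bits // sew_bits


def lane_utilization(avl, vlen_bits, sew_bits, lmul=1):
    """Fraction of the available lanes that a fixed-size request occupies.

    Simplification: vl is taken as min(AVL, VLMAX), which holds whenever
    the requested length does not exceed VLMAX.
    """
    limit = vlmax(vlen_bits, sew_bits, lmul)
    vl = min(avl, limit)
    return Fraction(vl, limit)


if __name__ == "__main__":
    # One row of an 8x8 block of 16-bit samples (assumed sizes).
    for vlen in (128, 256, 512):
        print(vlen, vlmax(vlen, 16), lane_utilization(8, vlen, 16))
```

Under these assumptions the row fills every lane at VLEN=128, half of them at VLEN=256, and a quarter at VLEN=512. If instruction cost follows VLMAX rather than VL, the wider machines do the same useful work at a higher cost per instruction.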
We recognize that the RISC-V vector ecosystem is still evolving rapidly. Nevertheless, we’re confident that through close hardware-software co-design, RVV can become highly competitive in video coding workloads over time. RISC-V’s open nature makes it especially well-suited for such collaborative improvements—and we warmly welcome any performance insights, suggestions, or discussions from the community.
Thank you again for your valuable input and support!
Best regards,
Yunfei Zhou
Alibaba DAMO Academy
------------------------------------------------------------------
From: Rémi Denis-Courmont via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
Sent: Friday, 14 November 2025, 22:22
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: "yunfei_zhou@linux.alibaba.com" <yunfei_zhou@linux.alibaba.com>; "Rémi Denis-Courmont" <remi@remlab.net>
Subject: [FFmpeg-devel] Re: [Question]Inquiry Regarding RISC-V RVV Optimization for HEVC Decoding in FFmpeg
Nihao,
On 14 November 2025 03:52:51 GMT+02:00, yunfei_zhou--- via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote:
>Before proceeding, we would like to understand whether there are any existing or ongoing efforts in this area to avoid duplication and, ideally, align or collaborate with current initiatives.
You can find the existing code in the official Git repo. We don't know of any ongoing efforts. You would probably do better to ask the RISE multimedia group than FFmpeg-devel. I suppose you or one of your colleagues should have access. (I don't think anyone here has.)
> * Available documentation or resources that could help us better understand the existing codebase and optimization strategies.
To be honest, in my experience, while it is obviously possible to optimise video decoding with RVV, the current implementations are not competitive (with e.g. Armv8 AdvSIMD), mainly because of two issues:
1) Segmented loads and stores are slow. Because video decoding often involves transposition, we would really need segmented unit-strided accesses to run as fast or almost as fast as single-segment unit-strided accesses of the same size. Likewise we need segmented register-strided accesses to be almost as fast as single-segment register-strided accesses.
2) Because RVV is scalable, and video decoding uses a lot of fixed-size and/or small vectors, we need instruction execution cost to scale according to VL or next_power_of_two(VL). Currently it seems to scale according to VLMAX, which means larger vectors make optimisations worse rather than better.
(This is based on benchmarks for your C910 and C908 cores, and SpacemiT's X60. I don't have access to any other hardware at the moment.)
Point being, the available hardware seems a little bit immature, so we don't really have settled optimisation strategies.
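
For readers unfamiliar with why segmented accesses matter for transposition, the following is a minimal software model of the data movement, not of any real core's timing. It mimics the semantics of an RVV unit-stride segment load (in the style of vlseg4e8.v): field f of each memory segment lands in destination register f. The function name and the 4×4 block are assumptions made for illustration.

```python
def segment_load(mem, nf, vl):
    """Model a unit-stride segment load with nf fields and vl segments.

    Returns nf "registers"; register f holds field f of every segment,
    i.e. mem[i * nf + f] for i in 0..vl-1.
    """
    return [[mem[i * nf + f] for i in range(vl)] for f in range(nf)]


if __name__ == "__main__":
    # A row-major 4x4 block: each row is one 4-field segment.
    block = list(range(16))
    for reg in segment_load(block, nf=4, vl=4):
        print(reg)
```

With a row-major 4×4 block as input, each output register holds one column, so a single segmented load performs the transpose that would otherwise need several shuffle or strided operations. That is why its speed relative to a plain unit-stride load of the same size matters so much here.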
Br,
_______________________________________________
ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org
To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org

Thread overview: 5+ messages
2025-11-14  1:52 [FFmpeg-devel] " yunfei_zhou--- via ffmpeg-devel
2025-11-14  4:20 ` [FFmpeg-devel] " Zhao Zhili via ffmpeg-devel
2025-11-14 14:21 ` Rémi Denis-Courmont via ffmpeg-devel
2025-11-15  2:50   ` yunfei_zhou--- via ffmpeg-devel [this message]
2025-11-15 11:51     ` [FFmpeg-devel] Re: Reply: Re: " Rémi Denis-Courmont via ffmpeg-devel
