From: "Rémi Denis-Courmont" <remi@remlab.net>
To: "ffmpeg-devel@ffmpeg.org" <ffmpeg-devel@ffmpeg.org>
Cc: Michael Platzer <michael.platzer@axelera.ai>
Subject: Re: [FFmpeg-devel] RISC-V vector DSP functions: Motivation for commit 446b009
Date: Fri, 19 Jan 2024 19:14:02 +0200
Message-ID: <2165884.x4K2pz0Oi7@basile.remlab.net>
In-Reply-To: <DB8P194MB0839826005206AA13F7CA7AC97702@DB8P194MB0839.EURP194.PROD.OUTLOOK.COM>

Hi,

On Friday, 19 January 2024, 17:30:00 EET, Michael Platzer via ffmpeg-devel
wrote:
> Commit 446b0090cbb66ee614dcf6ca79c78dc8eb7f0e37 by Remi Denis-Courmont has
> replaced RISC-V vector loads and stores with negative stride with vrgather
> (generalized permutation within vector registers) instructions in order to
> reverse the elements in a vector register. The commit message explains that
> this change was done, but it does not explain why.

It was faster on what was the best approximation of real hardware available
at the time, i.e. a Sipeed Lichee Pi4A board. There are no benchmarks in the
commit because I don't like to publish benchmarks collected from prototypes.
Nevertheless, I think the commit message hints enough that anybody could
easily guess that it was a performance optimisation, if I'm being honest.

This is not exactly surprising: typical hardware can only access so many
memory addresses simultaneously (i.e. one or maybe two), so indexed loads and
strided loads are bound to be much slower than unit-strided loads.

Maybe you have access to special hardware that is able to optimise the
special case of strides equal to minus one to reduce the number of memory
accesses. But I didn't back then, and as a matter of fact, I still don't.
Hardware donations are welcome.

> I fail to see what could possibly have motivated this change.
> The RISC-V vector loads and stores support negative stride values for use
> cases such as this one.

[Citation required]

> Using vrgather instead replaces the more specific operation with a more
> generic one,

That is a very subjective and unsubstantiated assertion. This feels a bit
hypocritical while you are attacking me for not providing justification.

As far as I can tell, neither instruction is specific to reversing vector
element order. An actual real-life specific instruction exists on Arm in the
form of vector-reverse. I don't know of any ISA with load-reverse or
store-reverse.

> which is likely to be less performant on most HW architectures.

Would you care to define "most architectures"? I only know one commercially
available hardware architecture as of today, the Kendryte K230 SoC with the
T-Head C908 CPU, so I can't make much sense of your sentence here.

> In addition, it requires to setup an index vector,

That is irrelevant since in this loop, the vector bank is not a bottleneck.
The loop can run with maximum LMUL either way. And besides, the loop turned
out to be faster with a smaller multiplier.

> thus raising dynamic instruction count.

It adds only one instruction (reverse subtraction) in the main loop, and even
that could be optimised away if relevant.

-- 
レミ・デニ-クールモン
http://www.remlab.net/

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
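For readers following the thread, the two ways of reversing a vector that are being compared can be modelled in plain scalar C. This is only an illustrative sketch, not the FFmpeg assembly from the commit: the function names and the fixed `VL` are hypothetical, and the loops merely mimic the per-element semantics of a stride -1 load on one side, and of a unit-stride load followed by `vid.v`, `vrsub.vx` and `vrgather.vv` on the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VL 8 /* hypothetical number of elements per vector register */

/* Model of a strided load with stride -1: every element comes from its
 * own address, walking backwards from `last`. */
static void load_reversed_strided(float *dst, const float *last, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = *(last - (ptrdiff_t)i);
}

/* Model of vid.v followed by vrsub.vx: idx[i] = (vl - 1) - i.
 * The reverse subtraction is the one extra instruction in the loop. */
static void make_reverse_index(uint32_t *idx, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        idx[i] = (uint32_t)(vl - 1 - i);
}

/* Model of vrgather.vv: dst[i] = src[idx[i]]. Out-of-range indices give
 * zero (the real instruction compares against VLMAX; this model uses vl). */
static void gather(float *dst, const float *src, const uint32_t *idx,
                   size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = idx[i] < vl ? src[idx[i]] : 0.0f;
}

/* Unit-stride load into a "register", then permutation within it. */
static void load_reversed_gather(float *dst, const float *first, size_t vl)
{
    float reg[VL];
    uint32_t idx[VL];

    assert(vl <= VL);
    memcpy(reg, first, vl * sizeof(*reg)); /* one contiguous access */
    make_reverse_index(idx, vl);
    gather(dst, reg, idx, vl);
}
```

Both functions produce the same reversed output. The difference the thread is arguing about is where the work happens: the first model touches `vl` separate addresses, while the second does a single contiguous load and reorders elements inside the register file. Which one is faster depends on the hardware, which a scalar model like this cannot show.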