* [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Sean McGovern @ 2024-06-06  7:43 UTC
To: FFmpeg development discussions and patches

Hi,

Attached inline is a _non-working_ implementation of flac_wasted32 for
VSX developed on a POWER9 in little-endian mode but probably just as
usable on POWER{8,10}.

I'm not sure why probably one of the simplest DSP functions in lavc
does not work for me, I imagine this is probably something endian
related even though IBM's documentation for vec_sl()[1] does not
suggest any.

Here's my code:

#define VSX_STRIDE 16

void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
{
    register vec_s32 vec1;
    register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
    register vec_s32 shifted;

    for (int i = 0; i < len; i += VSX_STRIDE) {
        vec1 = vec_vsx_ld(i, decoded);
        shifted = vec_sl(vec1, vec2);
        vec_vsx_st(shifted, i, decoded);
    }
}

Anyone with experience with AltiVec or VSX see something obvious I am missing?

-- Sean McGovern

[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Rémi Denis-Courmont @ 2024-06-06  9:53 UTC
To: FFmpeg development discussions and patches

On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com> wrote:
>Hi,
>
>Attached inline is a _non-working_ implementation of flac_wasted32 for
>VSX developed on a POWER9 in little-endian mode but probably just as
>usable on POWER{8,10}.
>
>I'm not sure why probably one of the simplest DSP functions in lavc
>does not work for me, I imagine this is probably something endian
>related even though IBM's documentation for vec_sl()[1] does not
>suggest any.

Mixing up bytes and elements in the iterator. But you should be able to
track this down with gdb or good ol' printf().

>Here's my code:
>
>#define VSX_STRIDE 16
>
>void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
>{
>    register vec_s32 vec1;
>    register vec_u32 vec2 = { wasted, wasted, wasted, wasted };

There should be an instruction to splat a scalar to a vector. Better yet
use vector-scalar shift, if VSX has it.

>    register vec_s32 shifted;
>
>    for (int i = 0; i < len; i += VSX_STRIDE) {
>        vec1 = vec_vsx_ld(i, decoded);
>        shifted = vec_sl(vec1, vec2);
>        vec_vsx_st(shifted, i, decoded);
>    }
>}
>
>Anyone with experience with AltiVec or VSX see something obvious I am missing?
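
The byte/element mix-up can be reproduced without POWER hardware.
vec_vsx_ld() and vec_vsx_st() take a byte offset, while len counts
int32_t elements, so a loop that compares i against len and also passes
i as the offset stops after roughly a quarter of the buffer. The sketch
below is a portable C model of that behaviour, not VSX code: load16(),
store16(), shl4() and both wasted32_* functions are hypothetical
stand-ins written for illustration, and len is assumed to be a multiple
of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of four int32_t lanes. */
typedef struct { int32_t lane[4]; } vec4_s32;

/* Like vec_vsx_ld(): the offset is in BYTES from the base pointer. */
static vec4_s32 load16(size_t byte_off, const int32_t *base)
{
    vec4_s32 v;
    memcpy(&v, (const unsigned char *)base + byte_off, sizeof(v));
    return v;
}

/* Like vec_vsx_st(): the offset is in BYTES from the base pointer. */
static void store16(vec4_s32 v, size_t byte_off, int32_t *base)
{
    memcpy((unsigned char *)base + byte_off, &v, sizeof(v));
}

/* Per-lane left shift, standing in for vec_sl() with a splatted count. */
static vec4_s32 shl4(vec4_s32 v, int wasted)
{
    for (int k = 0; k < 4; k++)
        v.lane[k] = (int32_t)((uint32_t)v.lane[k] << wasted);
    return v;
}

/* Loop shape as posted: i is bounded by an element count (len)
 * but is also used as a byte offset and stepped by 16 bytes. */
static void wasted32_as_posted(int32_t *decoded, int wasted, int len)
{
    assert(len >= 0 && len % 4 == 0);
    for (int i = 0; i < len; i += 16)
        store16(shl4(load16((size_t)i, decoded), wasted), (size_t)i, decoded);
}

/* One possible fix: let i count elements (4 per vector) and convert
 * to bytes only where the offset is passed to the load and store. */
static void wasted32_fixed(int32_t *decoded, int wasted, int len)
{
    assert(len >= 0 && len % 4 == 0);
    for (int i = 0; i < len; i += 4) {
        const size_t off = (size_t)i * sizeof(*decoded);
        store16(shl4(load16(off, decoded), wasted), off, decoded);
    }
}
```

With 16 samples, wasted32_as_posted() shifts only decoded[0..3] and
leaves the other twelve untouched, while wasted32_fixed() shifts all
sixteen.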

* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Sean McGovern @ 2024-06-06 16:51 UTC
To: FFmpeg development discussions and patches

On Thu, Jun 6, 2024, 05:53 Rémi Denis-Courmont <remi@remlab.net> wrote:

> On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com> wrote:
> >Hi,
> >
> >Attached inline is a _non-working_ implementation of flac_wasted32 for
> >VSX developed on a POWER9 in little-endian mode but probably just as
> >usable on POWER{8,10}.
> >
> >I'm not sure why probably one of the simplest DSP functions in lavc
> >does not work for me, I imagine this is probably something endian
> >related even though IBM's documentation for vec_sl()[1] does not
> >suggest any.
>
> Mixing up bytes and elements in the iterator. But you should be able to
> track this down with gdb or good ol' printf().
>
> >Here's my code:
> >
> >#define VSX_STRIDE 16
> >
> >void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
> >{
> >    register vec_s32 vec1;
> >    register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
>
> There should be an instruction to splat a scalar to a vector. Better yet
> use vector-scalar shift, if VSX has it.
>

In the POWER ISA, vec_splat() only accepts an immediate, so I think this
is the only way to do it in flac_wasted32.

> >    register vec_s32 shifted;
> >
> >    for (int i = 0; i < len; i += VSX_STRIDE) {
> >        vec1 = vec_vsx_ld(i, decoded);
> >        shifted = vec_sl(vec1, vec2);
> >        vec_vsx_st(shifted, i, decoded);
> >    }
> >}
> >
> >Anyone with experience with AltiVec or VSX see something obvious I am missing?

* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Sean McGovern @ 2024-06-26 22:01 UTC
To: FFmpeg development discussions and patches

Hi,

On Thu, Jun 6, 2024, 12:51 Sean McGovern <gseanmcg@gmail.com> wrote:

> On Thu, Jun 6, 2024, 05:53 Rémi Denis-Courmont <remi@remlab.net> wrote:
>
>> On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com> wrote:
>> >Hi,
>> >
>> >Attached inline is a _non-working_ implementation of flac_wasted32 for
>> >VSX developed on a POWER9 in little-endian mode but probably just as
>> >usable on POWER{8,10}.
>> >
>> >I'm not sure why probably one of the simplest DSP functions in lavc
>> >does not work for me, I imagine this is probably something endian
>> >related even though IBM's documentation for vec_sl()[1] does not
>> >suggest any.
>>
>> Mixing up bytes and elements in the iterator. But you should be able to
>> track this down with gdb or good ol' printf().
>>
>> >Here's my code:
>> >
>> >#define VSX_STRIDE 16
>> >
>> >void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
>> >{
>> >    register vec_s32 vec1;
>> >    register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
>>
>> There should be an instruction to splat a scalar to a vector. Better yet
>> use vector-scalar shift, if VSX has it.
>>
>
> In the POWER ISA, vec_splat() only accepts an immediate, so I think this
> is the only way to do it in flac_wasted32.
>
>> >    register vec_s32 shifted;
>> >
>> >    for (int i = 0; i < len; i += VSX_STRIDE) {
>> >        vec1 = vec_vsx_ld(i, decoded);
>> >        shifted = vec_sl(vec1, vec2);
>> >        vec_vsx_st(shifted, i, decoded);
>> >    }
>> >}
>> >
>> >Anyone with experience with AltiVec or VSX see something obvious I am missing?

I feel the need to correct myself here: it turns out there is a way --
vec_splat() only accepts an immediate but vec_splats()[1] is what I need
instead.

Thanks for the tips, I have a working version of wasted32 for VSX now.
I'll tackle wasted33 next and then submit them up.

-- Sean McGovern

[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-splats
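
A sketch of how the loop might look once the shift count comes from
vec_splats() and the index counts elements. This is a guess at the shape
of a working version, not the code from the eventual patch: the name
flac_wasted32_sketch is hypothetical, and the scalar tail loop and the
__VSX__ guard are assumptions added so that the block also builds (and
falls back to plain C) on non-POWER compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical name; shifts every sample left by `wasted` bits in place. */
static void flac_wasted32_sketch(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    assert(wasted >= 0 && wasted < 32 && len >= 0);

#if defined(__VSX__)
    /* vec_splats() broadcasts a run-time scalar to all four lanes,
     * unlike vec_splat(), which selects a lane by immediate index. */
    const __vector unsigned int shift = vec_splats((unsigned int)wasted);

    /* i counts int32_t elements; the intrinsics want a byte offset. */
    for (; i + 4 <= len; i += 4) {
        const long off = (long)i * (long)sizeof(*decoded);
        __vector signed int v = vec_vsx_ld(off, decoded);
        vec_vsx_st(vec_sl(v, shift), off, decoded);
    }
#endif

    /* Scalar tail (and the whole job when VSX is unavailable). */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

The tail loop is there because this sketch makes no assumption about
len being a multiple of four; whether the real caller guarantees padding
is not something the thread states.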

Thread overview: 4+ messages
2024-06-06  7:43 [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le Sean McGovern
2024-06-06  9:53 ` Rémi Denis-Courmont
2024-06-06 16:51   ` Sean McGovern
2024-06-26 22:01     ` Sean McGovern