* [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Sean McGovern @ 2024-06-06 7:43 UTC
To: FFmpeg development discussions and patches
Hi,
Attached inline is a _non-working_ implementation of flac_wasted32 for
VSX, developed on a POWER9 in little-endian mode but probably just as
usable on POWER{8,10}.
I'm not sure why one of the simplest DSP functions in lavc does not
work for me. I imagine the cause is endian-related, even though IBM's
documentation for vec_sl()[1] does not suggest any endian dependence.
Here's my code:
#define VSX_STRIDE 16
void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
{
register vec_s32 vec1;
register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
register vec_s32 shifted;
for (int i = 0; i < len; i += VSX_STRIDE) {
vec1 = vec_vsx_ld(i, decoded);
shifted = vec_sl(vec1, vec2);
vec_vsx_st(shifted, i, decoded);
}
}
Does anyone with experience with AltiVec or VSX see something obvious that I am missing?
-- Sean McGovern
[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Rémi Denis-Courmont @ 2024-06-06 9:53 UTC
To: FFmpeg development discussions and patches
On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com> wrote:
>Hi,
>
>Attached inline is a _non-working_ implementation of flac_wasted32 for
>VSX developed on a POWER9 in little-endian mode but probably just as
>usable on POWER{8,10}.
>
>I'm not sure why probably one of the simplest DSP functions in lavc
>does not work for me, I imagine this is probably something endian
>related even though IBM's documentation for vec_sl()[1] does not
>suggest any.
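You are mixing up bytes and elements in the loop iterator: len counts 32-bit elements, whereas the offset passed to vec_vsx_ld() and vec_vsx_st() is in bytes. You should be able to track this down with gdb or good ol' printf().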
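The byte/element mix-up can be reproduced without POWER hardware. The following is a minimal portable C sketch, not VSX code and not FFmpeg's implementation: it models one 16-byte vector load, shift, and store with memcpy() at a byte offset, which is how the offset argument of vec_vsx_ld() and vec_vsx_st() is interpreted. The names shift_block16, wasted32_buggy, and wasted32_fixed are hypothetical, and the sketch assumes that len counts int32_t samples, that len is a multiple of 4, and that wasted is in the range 0..31.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16   /* one 128-bit vector */
#define VEC_ELEMS 4    /* int32_t samples per vector */

/* Model of load + shift + store on 16 bytes at a BYTE offset from base. */
static void shift_block16(int32_t *base, int byte_off, int wasted)
{
    int32_t lane[VEC_ELEMS];

    memcpy(lane, (char *)base + byte_off, VEC_BYTES);
    for (int k = 0; k < VEC_ELEMS; k++)
        lane[k] = (int32_t)((uint32_t)lane[k] << wasted);
    memcpy((char *)base + byte_off, lane, VEC_BYTES);
}

/* Same loop shape as the posted code: i is used as a byte offset but is
 * compared against an element count, so the loop stops after about
 * len bytes, i.e. roughly a quarter of the samples. */
static void wasted32_buggy(int32_t *decoded, int wasted, int len)
{
    assert(len >= 0 && len % VEC_ELEMS == 0);
    for (int i = 0; i < len; i += VEC_BYTES)
        shift_block16(decoded, i, wasted);
}

/* One possible fix: let i count samples and convert to bytes at the
 * point where the offset is passed to the load/store. */
static void wasted32_fixed(int32_t *decoded, int wasted, int len)
{
    assert(len >= 0 && len % VEC_ELEMS == 0);
    for (int i = 0; i < len; i += VEC_ELEMS)
        shift_block16(decoded, i * (int)sizeof(int32_t), wasted);
}
```

With 16 samples and wasted = 2, wasted32_buggy() shifts only samples 0 to 3 and leaves the other 12 untouched, while wasted32_fixed() shifts all 16.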
>Here's my code:
>
>#define VSX_STRIDE 16
>
>void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
>{
> register vec_s32 vec1;
> register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
There should be an instruction to splat a scalar to a vector. Better yet, use a vector-scalar shift, if VSX has one.
> register vec_s32 shifted;
>
> for (int i = 0; i < len; i += VSX_STRIDE) {
> vec1 = vec_vsx_ld(i, decoded);
> shifted = vec_sl(vec1, vec2);
> vec_vsx_st(shifted, i, decoded);
> }
>}
>
>Anyone with experience with AltiVec or VSX see something obvious I am missing?
>
>-- Sean McGovern
>
>[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl
* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Sean McGovern @ 2024-06-06 16:51 UTC
To: FFmpeg development discussions and patches
On Thu, Jun 6, 2024, 05:53 Rémi Denis-Courmont <remi@remlab.net> wrote:
> There should be an instruction to splat a scalar to a vector. Better yet
> use vector-scalar shift, if VSX has it.
>
In the POWER ISA, vec_splat() only accepts an immediate, so I think this is
the only way to do it in flac_wasted32.
* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
From: Sean McGovern @ 2024-06-26 22:01 UTC
To: FFmpeg development discussions and patches
Hi,
On Thu, Jun 6, 2024, 12:51 Sean McGovern <gseanmcg@gmail.com> wrote:
> In the POWER ISA, vec_splat() only accepts an immediate, so I think this
> is the only way to do it in flac_wasted32.
I feel the need to correct myself here: it turns out there is a way.
vec_splat() only accepts an immediate element index, but vec_splats()[1]
takes a scalar value, which is what I need instead.
Thanks for the tips, I have a working version of wasted32 for VSX now. I'll
tackle wasted33 next and then submit them both.
-- Sean McGovern
[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-splats
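The working version mentioned above is not included in this thread. As an illustration only, here is a sketch of how the two fixes discussed (vec_splats() for the shift count, and a sample index converted to a byte offset) could fit together. It assumes GCC or Clang with <altivec.h> when __VSX__ is defined and falls back to plain scalar code elsewhere. The name wasted32_sketch is hypothetical, and the scalar tail for lengths that are not a multiple of 4 is an addition of this sketch, not something taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical name; not the function that was eventually submitted. */
static void wasted32_sketch(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    assert(len >= 0 && wasted >= 0 && wasted < 32);

#if defined(__VSX__)
    /* vec_splats() broadcasts a run-time scalar to every lane. */
    const __vector unsigned int count = vec_splats((unsigned int)wasted);

    /* i counts samples; the load/store offset is in bytes. */
    for (; i + 4 <= len; i += 4) {
        int off = i * (int)sizeof(int32_t);
        __vector signed int v = vec_vsx_ld(off, decoded);

        vec_vsx_st(vec_sl(v, count), off, decoded);
    }
#endif

    /* Scalar tail; this is also the whole loop when VSX is unavailable. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

The shift is done on an unsigned value in the scalar path so that negative samples do not trigger undefined behaviour in C.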