[FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

* [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
@ 2024-06-06  7:43 Sean McGovern
  2024-06-06  9:53 ` Rémi Denis-Courmont
  0 siblings, 1 reply; 4+ messages in thread
From: Sean McGovern @ 2024-06-06  7:43 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,

Attached inline is a _non-working_ implementation of flac_wasted32 for
VSX developed on a POWER9 in little-endian mode but probably just as
usable on POWER{8,10}.

I'm not sure why one of the simplest DSP functions in lavc does not
work for me. I imagine it is something endian-related, even though
IBM's documentation for vec_sl()[1] does not suggest any endian
dependence.

Here's my code:

#define VSX_STRIDE 16

void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
{
   register vec_s32 vec1;
   register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
   register vec_s32 shifted;

   for (int i = 0; i < len; i += VSX_STRIDE) {
       vec1 = vec_vsx_ld(i, decoded);
       shifted = vec_sl(vec1, vec2);
       vec_vsx_st(shifted, i, decoded);
   }
}

Anyone with experience with AltiVec or VSX see something obvious I am missing?

-- Sean McGovern

[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
  2024-06-06  7:43 [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le Sean McGovern
@ 2024-06-06  9:53 ` Rémi Denis-Courmont
  2024-06-06 16:51   ` Sean McGovern
  0 siblings, 1 reply; 4+ messages in thread
From: Rémi Denis-Courmont @ 2024-06-06  9:53 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com> wrote:
>Hi,
>
>Attached inline is a _non-working_ implementation of flac_wasted32 for
>VSX developed on a POWER9 in little-endian mode but probably just as
>usable on POWER{8,10}.
>
>I'm not sure why one of the simplest DSP functions in lavc does not
>work for me. I imagine it is something endian-related, even though
>IBM's documentation for vec_sl()[1] does not suggest any endian
>dependence.

Mixing up bytes and elements in the iterator. But you should be able to track this down with gdb or good ol' printf().
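
To make the byte/element mix-up concrete, here is a minimal portable sketch. It assumes that `len` counts int32_t samples (as in the scalar flac_wasted32) and that the first argument of vec_vsx_ld()/vec_vsx_st() is a byte offset. The helper names are hypothetical, and memcpy() stands in for the 16-byte vector load and store so the model runs on any machine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16                                  /* one 128-bit vector */
#define VEC_ELEMS (VEC_BYTES / (int)sizeof(int32_t))  /* 4 samples per vector */

/* Hypothetical stand-in for "load 16 bytes at a byte offset, shift, store". */
static void shl_block_at_byte_offset(int32_t *base, long byte_off, int wasted)
{
    int32_t lanes[VEC_ELEMS];

    memcpy(lanes, (char *)base + byte_off, VEC_BYTES);
    for (int k = 0; k < VEC_ELEMS; k++)
        lanes[k] = (int32_t)((uint32_t)lanes[k] << wasted);
    memcpy((char *)base + byte_off, lanes, VEC_BYTES);
}

/* Shaped like the posted loop: i is passed as a byte offset,
 * but the loop bound is an element count. */
static void wasted32_bytes_vs_elems(int32_t *decoded, int wasted, int len)
{
    for (int i = 0; i < len; i += VEC_BYTES)
        shl_block_at_byte_offset(decoded, i, wasted);
}

/* i counts elements and is converted to bytes for the access.
 * Assumes len is a multiple of 4 to keep the sketch short. */
static void wasted32_elems(int32_t *decoded, int wasted, int len)
{
    assert(len % VEC_ELEMS == 0);
    for (int i = 0; i < len; i += VEC_ELEMS)
        shl_block_at_byte_offset(decoded, (long)i * (long)sizeof(int32_t), wasted);
}
```

With len = 16, the first loop runs once and shifts only samples 0..3, while the second visits all 16 samples.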

>Here's my code:
>
>#define VSX_STRIDE 16
>
>void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
>{
>   register vec_s32 vec1;
>   register vec_u32 vec2 = { wasted, wasted, wasted, wasted };

There should be an instruction to splat a scalar to a vector. Better yet use vector-scalar shift, if VSX has it.

>   register vec_s32 shifted;
>
>   for (int i = 0; i < len; i += VSX_STRIDE) {
>       vec1 = vec_vsx_ld(i, decoded);
>       shifted = vec_sl(vec1, vec2);
>       vec_vsx_st(shifted, i, decoded);
>   }
>}
>
>Anyone with experience with AltiVec or VSX see something obvious I am missing?
>
>-- Sean McGovern
>
>[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl

* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
  2024-06-06  9:53 ` Rémi Denis-Courmont
@ 2024-06-06 16:51   ` Sean McGovern
  2024-06-26 22:01     ` Sean McGovern
  0 siblings, 1 reply; 4+ messages in thread
From: Sean McGovern @ 2024-06-06 16:51 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Thu, Jun 6, 2024, 05:53 Rémi Denis-Courmont <remi@remlab.net> wrote:

>
>
> On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com>
> wrote:
> >Hi,
> >
> >Attached inline is a _non-working_ implementation of flac_wasted32 for
> >VSX developed on a POWER9 in little-endian mode but probably just as
> >usable on POWER{8,10}.
> >
> >I'm not sure why one of the simplest DSP functions in lavc does not
> >work for me. I imagine it is something endian-related, even though
> >IBM's documentation for vec_sl()[1] does not suggest any endian
> >dependence.
>
> Mixing up bytes and elements in the iterator. But you should be able to
> track this down with gdb or good ol' printf().
>
> >Here's my code:
> >
> >#define VSX_STRIDE 16
> >
> >void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
> >{
> >   register vec_s32 vec1;
> >   register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
>
> There should be an instruction to splat a scalar to a vector. Better yet
> use vector-scalar shift, if VSX has it.
>

In the POWER ISA, vec_splat() only accepts an immediate, so I think this is
the only way to do it in flac_wasted32.


> >   register vec_s32 shifted;
> >
> >   for (int i = 0; i < len; i += VSX_STRIDE) {
> >       vec1 = vec_vsx_ld(i, decoded);
> >       shifted = vec_sl(vec1, vec2);
> >       vec_vsx_st(shifted, i, decoded);
> >   }
> >}
> >
> >Anyone with experience with AltiVec or VSX see something obvious I am
> missing?
> >
> >-- Sean McGovern
> >
> >[1]
> https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl

* Re: [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
  2024-06-06 16:51   ` Sean McGovern
@ 2024-06-26 22:01     ` Sean McGovern
  0 siblings, 0 replies; 4+ messages in thread
From: Sean McGovern @ 2024-06-26 22:01 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,


On Thu, Jun 6, 2024, 12:51 Sean McGovern <gseanmcg@gmail.com> wrote:

>
>
> On Thu, Jun 6, 2024, 05:53 Rémi Denis-Courmont <remi@remlab.net> wrote:
>
>>
>>
>> On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gseanmcg@gmail.com>
>> wrote:
>> >Hi,
>> >
>> >Attached inline is a _non-working_ implementation of flac_wasted32 for
>> >VSX developed on a POWER9 in little-endian mode but probably just as
>> >usable on POWER{8,10}.
>> >
>> >I'm not sure why one of the simplest DSP functions in lavc does not
>> >work for me. I imagine it is something endian-related, even though
>> >IBM's documentation for vec_sl()[1] does not suggest any endian
>> >dependence.
>>
>> Mixing up bytes and elements in the iterator. But you should be able to
>> track this down with gdb or good ol' printf().
>>
>> >Here's my code:
>> >
>> >#define VSX_STRIDE 16
>> >
>> >void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
>> >{
>> >   register vec_s32 vec1;
>> >   register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
>>
>> There should be an instruction to splat a scalar to a vector. Better yet
>> use vector-scalar shift, if VSX has it.
>>
>
> In the POWER ISA, vec_splat() only accepts an immediate, so I think this
> is the only way to do it in flac_wasted32.
>
>
>> >   register vec_s32 shifted;
>> >
>> >   for (int i = 0; i < len; i += VSX_STRIDE) {
>> >       vec1 = vec_vsx_ld(i, decoded);
>> >       shifted = vec_sl(vec1, vec2);
>> >       vec_vsx_st(shifted, i, decoded);
>> >   }
>> >}
>> >
>> >Anyone with experience with AltiVec or VSX see something obvious I am
>> missing?
>> >
>> >-- Sean McGovern
>> >
>> >[1]
>> https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl
>
I feel the need to correct myself here: it turns out there is a way --
vec_splat() only accepts an immediate but vec_splats()[1] is what I need
instead.
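
For readers following along, here is a minimal sketch of how the two tips from this thread might combine. It is not the version that was later submitted. The function name is hypothetical, and it assumes the GCC/Clang <altivec.h> intrinsics vec_splats(), vec_sl(), vec_vsx_ld() and vec_vsx_st(), with the load/store offset given in bytes. The vector path is compiled only when __VSX__ is defined, so the same file still builds and runs as plain scalar C elsewhere.

```c
#include <stdint.h>
#if defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical name; a sketch, not the submitted FFmpeg function. */
static void flac_wasted32_sketch(int32_t *decoded, int wasted, int len)
{
    int i = 0;

#if defined(__VSX__)
    /* vec_splats() broadcasts a run-time scalar to every lane,
     * unlike vec_splat(), which selects a lane by immediate index. */
    const __vector unsigned int shift = vec_splats((unsigned int)wasted);

    /* i counts samples; four int32_t samples fit in one 128-bit vector. */
    for (; i + 4 <= len; i += 4) {
        /* The offset argument of vec_vsx_ld()/vec_vsx_st() is in bytes. */
        const long off = (long)i * (long)sizeof(*decoded);
        __vector signed int v = vec_vsx_ld(off, decoded);

        vec_vsx_st(vec_sl(v, shift), off, decoded);
    }
#endif

    /* Scalar tail, which is also the whole loop on non-VSX builds.
     * The cast avoids left-shifting a negative signed value. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

The scalar tail is a design choice for the sketch: it keeps the function correct when len is not a multiple of four without reading or writing past the end of the buffer.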

Thanks for the tips. I have a working version of wasted32 for VSX now. I'll
tackle wasted33 next and then submit them both.

-- Sean McGovern

[1]
https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-splats

Thread overview: 4+ messages
2024-06-06  7:43 [FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le Sean McGovern
2024-06-06  9:53 ` Rémi Denis-Courmont
2024-06-06 16:51   ` Sean McGovern
2024-06-26 22:01     ` Sean McGovern
