Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: flow gg <hlefthleft@gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 1/3] lavc/vp8dsp: R-V V put_bilin_h
Date: Sat, 24 Feb 2024 16:31:36 +0800
Message-ID: <CAEa-L+vSsrs4NrS7SK0OeXBwbAgA+PaJgfZXdfqEFH9TsT6T+g@mail.gmail.com>
In-Reply-To: <B4E257C0-2429-423F-B06A-70935BB3D900@remlab.net>

Okay, thanks for clarifying.

I have used fractional multipliers in many places, mostly not for
correctness but because they often improve performance (though I don't
know why). There are no obvious downsides, so how about leaving this
code as it is?

On Sat, 24 Feb 2024 at 15:39, Rémi Denis-Courmont <remi@remlab.net> wrote:

> Hi,
>
> On 24 February 2024 03:07:36 GMT+02:00, flow gg <hlefthleft@gmail.com>
> wrote:
> > .ifc \len,4
> >-        vsetivli        zero, 5, e8, mf2, ta, ma
> >+        vsetivli        zero, 5, e8, m1, ta, ma
> > .elseif \len == 8
> >         vsetivli        zero, 9, e8, m1, ta, ma
> > .else
> >@@ -112,9 +112,9 @@ endfunc
> >         vslide1down.vx  v2, \dst, t5
> >
> > .ifc \len,4
> >-        vsetivli        zero, 4, e8, mf4, ta, ma
> >+        vsetivli        zero, 4, e8, m1, ta, ma
> > .elseif \len == 8
> >-        vsetivli        zero, 8, e8, mf2, ta, ma
> >+        vsetivli        zero, 8, e8, m1, ta, ma
> >
> >What are the benefits of not using fractional multipliers here?
>
> Insofar as E8/MF4 is guaranteed to work for Zve32x, there are no benefits
> per se.
>
> However, fractional multipliers were added to the specification to enable
> addressing individual vectors whilst the effective multiplier is larger
> than one. This can only happen with mixed widths. Fractions were not
> intended to make vectors shorter - there is the vector length for that
> already.
>
> That's why "E64/MF2" doesn't work, even though it's the same vector bit
> size as "E8/MF2".
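
As a rough illustration of the rule quoted above, here is a minimal C
sketch. It assumes the RVV 1.0 requirement that a fractional LMUL only has
to support SEW values up to LMUL * ELEN; the struct and function names are
hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* LMUL as a fraction num/den: mf4 is {1, 4}, m2 is {2, 1}. (Hypothetical type.) */
struct lmul {
    unsigned num;
    unsigned den;
};

/*
 * Returns true if the combination SEW/LMUL is one that an implementation
 * with the given ELEN has to support, under the assumption stated above:
 *   8 <= SEW <= ELEN  and  SEW <= LMUL * ELEN.
 * For integer LMUL the second condition is always met when SEW <= ELEN.
 * An implementation may support more than this; the check only models
 * what portable code can rely on.
 */
static bool sew_lmul_guaranteed(unsigned sew, struct lmul m, unsigned elen)
{
    if (sew < 8 || sew > elen)
        return false;
    /* SEW <= (num / den) * ELEN, rewritten without division. */
    return sew * m.den <= elen * m.num;
}
```

Under that assumption, sew_lmul_guaranteed(8, (struct lmul){1, 4}, 32)
holds, which matches E8/MF4 being usable with Zve32x, while
sew_lmul_guaranteed(64, (struct lmul){1, 2}, 64) does not, which matches
the E64/MF2 case.
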
>
> > Making this
> >change would result in a 10%-20% slowdown.
>
> That's kind of odd. This may be caused by the slides, but it would be
> strange for hardware to go out of its way to optimise a case that's not
> even intended.
>
> >                            mf2/4   m1
> >vp8_put_bilin4_h_rvv_i32:   158.7   193.7
> >vp8_put_bilin4_hv_rvv_i32:  255.7   302.7
> >vp8_put_bilin8_h_rvv_i32:   318.7   358.7
> >vp8_put_bilin8_hv_rvv_i32:  528.7   569.7
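
To put the quoted timings next to the "10%-20%" figure, the relative
slowdown of each m1 result can be worked out as follows. This is only a
small arithmetic sketch; the helper name slowdown_pct is made up for this
example and the inputs are the numbers from the table.

```c
/*
 * Relative slowdown, in percent, of the m1 timing compared with the
 * fractional-LMUL timing for the same function.
 */
static double slowdown_pct(double fractional, double m1)
{
    return (m1 - fractional) / fractional * 100.0;
}

/* The four pairs quoted in the table: {mf2/4, m1}. */
static const double timings[4][2] = {
    { 158.7, 193.7 },  /* vp8_put_bilin4_h_rvv_i32  */
    { 255.7, 302.7 },  /* vp8_put_bilin4_hv_rvv_i32 */
    { 318.7, 358.7 },  /* vp8_put_bilin8_h_rvv_i32  */
    { 528.7, 569.7 },  /* vp8_put_bilin8_hv_rvv_i32 */
};
```

Calling slowdown_pct(timings[i][0], timings[i][1]) for each row gives
roughly 22%, 18%, 13% and 8%, so the smaller block sizes are hit hardest.
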
> >
> >On Sat, 24 Feb 2024 at 01:18, Rémi Denis-Courmont <remi@remlab.net> wrote:
> >
> >> Hi,
> >>
> >> +
> >> +.macro bilin_h_load dst len
> >> +.ifc \len,4
> >> +        vsetivli        zero, 5, e8, mf2, ta, ma
> >>
> >> Don't use fractional multipliers if you don't mix element widths.
> >>
> >> +.elseif \len == 8
> >> +        vsetivli        zero, 9, e8, m1, ta, ma
> >> +.else
> >> +        vsetivli        zero, 17, e8, m2, ta, ma
> >> +.endif
> >> +
> >> +        vle8.v          \dst, (a2)
> >> +        vslide1down.vx  v2, \dst, t5
> >> +
> >>
> >> +.ifc \len,4
> >> +        vsetivli        zero, 4, e8, mf4, ta, ma
> >>
> >> Same as above.
> >>
> >> +.elseif \len == 8
> >> +        vsetivli        zero, 8, e8, mf2, ta, ma
> >>
> >> Also.
> >>
> >> +.else
> >> +        vsetivli        zero, 16, e8, m1, ta, ma
> >> +.endif
> >>
> >> +        vwmulu.vx       v28, \dst, t1
> >> +        vwmaccu.vx      v28, a5, v2
> >> +        vwaddu.wx       v24, v28, t4
> >> +        vnsra.wi        \dst, v24, 3
> >> +.endm
> >> +
> >> +.macro put_vp8_bilin_h len
> >> +        li              t1, 8
> >> +        li              t4, 4
> >> +        li              t5, 1
> >> +        sub             t1, t1, a5
> >> +1:
> >> +        addi            a4, a4, -1
> >> +        bilin_h_load    v0, \len
> >> +        vse8.v          v0, (a0)
> >> +        add             a2, a2, a3
> >> +        add             a0, a0, a1
> >> +        bnez            a4, 1b
> >> +
> >> +        ret
> >> +.endm
> >> +
> >> +func ff_put_vp8_bilin16_h_rvv, zve32x
> >> +        put_vp8_bilin_h 16
> >> +endfunc
> >> +
> >> +func ff_put_vp8_bilin8_h_rvv, zve32x
> >> +        put_vp8_bilin_h 8
> >> +endfunc
> >> +
> >> +func ff_put_vp8_bilin4_h_rvv, zve32x
> >> +        put_vp8_bilin_h 4
> >> +endfunc
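
For readers following the quoted macro, the arithmetic it performs per row
can be sketched in scalar C. This is only a reference model, assuming that
a5 holds the horizontal fraction mx in eighth-pel units (0..7); the
function name bilin_h_row is made up for this example.

```c
#include <stdint.h>

/*
 * Scalar model of one row of the horizontal bilinear filter.
 * Each output pixel blends src[x] and src[x + 1] with weights
 * (8 - mx) and mx, adds 4 for rounding and shifts right by 3.
 * Note that it reads width + 1 source pixels, which corresponds to
 * the 5, 9 and 17 element loads in the quoted assembly.
 */
static void bilin_h_row(uint8_t *dst, const uint8_t *src, int width, int mx)
{
    const int a = 8 - mx;  /* t1 in the assembly: 8 - a5 */
    const int b = mx;      /* a5 in the assembly (assumed to be mx) */

    for (int x = 0; x < width; x++)
        dst[x] = (uint8_t)((a * src[x] + b * src[x + 1] + 4) >> 3);
}
```

The products do not fit in 8 bits, which is why the vector version uses
the widening vwmulu/vwmaccu instructions before narrowing with vnsra.
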
> >>
> >> --
> >> レミ・デニ-クールモン
> >> http://www.remlab.net/
> >>
> >>
> >>
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel@ffmpeg.org
> >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
> >>


Thread overview: 9+ messages
2024-02-23 14:45 flow gg
2024-02-23 17:17 ` Rémi Denis-Courmont
2024-02-24  1:07   ` flow gg
2024-02-24  7:38     ` Rémi Denis-Courmont
2024-02-24  8:31       ` flow gg [this message]
2024-02-28 20:25         ` Rémi Denis-Courmont
2024-03-03 14:39 ` Rémi Denis-Courmont
2024-03-03 15:03   ` flow gg
2024-03-17 16:42     ` flow gg
