From: Henrik Gramner <henrik@gramner.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2] x86/tx_float: implement inverse MDCT AVX2 assembly
Date: Fri, 2 Sep 2022 16:03:21 +0200
Message-ID: <CAFGUN0rkzOQVwOeg1fy1HxRjrgBMftu0Vj54jLF3FzCqmRmmBw@mail.gmail.com>
In-Reply-To: <NAwlBSj--3-2@lynne.ee>
On Fri, Sep 2, 2022 at 7:55 AM Lynne <dev@lynne.ee> wrote:
> + movd xmm4, strided
> + neg t2d
> + movd xmm5, t2d
> + SPLATD xmm4
> + SPLATD xmm5
> + vperm2f128 m4, m4, m4, 0x00 ; +stride splatted
> + vperm2f128 m5, m5, m5, 0x00 ; -stride splatted
movd xm4, strided
pxor m5, m5
vpbroadcastd m4, xm4
> + mova m2, [lutq] ; load LUT indices
> + pcmpeqd m0, m0 ; zero out a register
> + pmulld m3, m2, m4 ; multiply by +stride
> + pmulld m2, m5 ; multiply by -stride
> + movaps m1, m0
> + vgatherdps m6, [inq + 2*m3], m0 ; im
> + vgatherdps m7, [t1q + 2*m2], m1 ; re
pmulld m2, m4, [lutq]
pcmpeqd m0, m0
mova m1, m0
vgatherdps m6, [inq + 2*m2], m0
psubd m2, m5, m2
vgatherdps m7, [t1q + 2*m2], m1
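
The replacement above drops the second pmulld and instead subtracts the
positive index from a zeroed register (pxor m5, m5). A minimal scalar
sketch in C of why the two give the same 32-bit index, assuming one lane
is modelled at a time; the function and parameter names are hypothetical
and not part of the patch:

```c
#include <stdint.h>

/* One 32-bit lane. pmulld keeps only the low 32 bits of each product,
 * so unsigned (wrapping) arithmetic is used to model it. */

/* Original patch: a second pmulld by a splatted -stride. */
static uint32_t neg_index_mul(uint32_t lut, uint32_t stride)
{
    return lut * (0u - stride);      /* pmulld m2, m5 (m5 = -stride) */
}

/* Suggested sequence: one pmulld, then psubd from a zeroed register. */
static uint32_t neg_index_sub(uint32_t lut, uint32_t stride)
{
    uint32_t pos = lut * stride;     /* pmulld m2, m4, [lutq] */
    return 0u - pos;                 /* psubd  m2, m5, m2 (m5 = 0) */
}
```

Both functions return the same value for every input, since negation
distributes over multiplication modulo 2^32.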
The comment on pcmpeqd is also wrong: the instruction sets all bits to
1, not 0. It could also be moved outside the loop, leaving only a
cheaper register-register move inside the loop.
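
vgatherdps clears its mask register as it completes, which is why a
fresh all-ones mask is needed before every gather. A rough scalar model
in C of the hoisting described above, assuming eight 32-bit lanes; the
loop shape and all names here are hypothetical, not taken from the patch:

```c
#include <stdint.h>
#include <string.h>

#define LANES 8 /* eight 32-bit lanes in a ymm register */

/* Scalar model of vgatherdps: a lane is loaded only if the top bit of
 * its mask element is set, and the mask is all zero afterwards. */
static void gather_model(float *dst, const float *base,
                         const int32_t *idx, uint32_t *mask)
{
    for (int i = 0; i < LANES; i++) {
        if (mask[i] & 0x80000000u)
            dst[i] = base[idx[i]];
        mask[i] = 0;
    }
}

/* Hypothetical loop that sums every gathered element. */
static float sum_gathered(const float *base, const int32_t *idx, int iters)
{
    uint32_t ones[LANES], mask[LANES];
    float sum = 0.0f;

    memset(ones, 0xFF, sizeof(ones));      /* pcmpeqd, once, before the loop */
    for (int n = 0; n < iters; n++) {
        float dst[LANES] = { 0 };
        memcpy(mask, ones, sizeof(mask));  /* register-register move per iteration */
        gather_model(dst, base, idx + n * LANES, mask);
        for (int i = 0; i < LANES; i++)
            sum += dst[i];
    }
    return sum;
}
```

Without the per-iteration copy, the second gather would see a zero mask
and load nothing.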
> + vperm2f128 m0, m0, 0x01 ; flip
> + vperm2f128 m4, m4, 0x01 ; flip (2)
> + shufpd m0, m0, 101b
> + shufpd m4, m4, 101b
vpermpd m0, m0, q0123
vpermpd m4, m4, q0123
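
The vperm2f128 + shufpd pair and the single vpermpd both reverse the
four 64-bit elements of the register. A small C sketch that models each
instruction on an array of four elements, assuming q0123 expands to the
immediate 0x1B as in x86inc; the helper names are hypothetical:

```c
#include <stdint.h>

/* vperm2f128 dst, src, src, 0x01: swap the two 128-bit halves. */
static void swap_halves(const uint64_t in[4], uint64_t out[4])
{
    out[0] = in[2]; out[1] = in[3];
    out[2] = in[0]; out[3] = in[1];
}

/* shufpd dst, src, src, 0101b: swap the two elements inside each half. */
static void swap_within_halves(const uint64_t in[4], uint64_t out[4])
{
    out[0] = in[1]; out[1] = in[0];
    out[2] = in[3]; out[3] = in[2];
}

/* vpermpd dst, src, imm8: element i is picked by bits [2i+1:2i] of imm8. */
static void permpd(const uint64_t in[4], uint64_t out[4], unsigned imm)
{
    for (int i = 0; i < 4; i++)
        out[i] = in[(imm >> (2 * i)) & 3];
}
```

Applying swap_halves followed by swap_within_halves gives the same
result as permpd with 0x1B: the elements in reverse order.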