Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Lynne via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
To: ffmpeg-devel@ffmpeg.org
Cc: Lynne <dev@lynne.ee>
Subject: Re: [FFmpeg-devel] [PATCH] lpc: rewrite lpc_compute_autocorr in external asm
Date: Sun, 26 May 2024 01:24:43 +0200
Message-ID: <b9f4b827-e73a-42aa-b83a-3dc266d4629b@lynne.ee> (raw)
In-Reply-To: <72761d42-0f8f-4abb-8476-6832d14b0774@gmail.com>



On 26/05/2024 00:31, James Almer wrote:
> On 5/25/2024 5:57 PM, Lynne via ffmpeg-devel wrote:
>> The inline asm function had issues running under checkasm.
>> So I came to finish what I started, and wrote the last part
>> of LPC computation in assembly.
>>
>> autocorr_10_c: 135525.8
>> autocorr_10_sse2: 50729.8
>> autocorr_10_fma3: 19007.8
>> autocorr_30_c: 390100.8
>> autocorr_30_sse2: 142478.8
>> autocorr_30_fma3: 50559.8
>> autocorr_32_c: 407058.3
>> autocorr_32_sse2: 151633.3
>> autocorr_32_fma3: 50517.3
>> ---
>>   libavcodec/x86/lpc.asm    | 91 +++++++++++++++++++++++++++++++++++++++
>>   libavcodec/x86/lpc_init.c | 87 ++++---------------------------------
>>   2 files changed, 100 insertions(+), 78 deletions(-)
>>
>> diff --git a/libavcodec/x86/lpc.asm b/libavcodec/x86/lpc.asm
>> index a585c17ef5..790841b7f4 100644
>> --- a/libavcodec/x86/lpc.asm
>> +++ b/libavcodec/x86/lpc.asm
>> @@ -32,6 +32,8 @@ dec_tab_sse2: times 2 dq -2.0
>>   dec_tab_scalar: times 2 dq -1.0
>>   seq_tab_sse2: dq 1.0, 0.0
>> +autoc_init_tab: times 4 dq 1.0
>> +
>>   SECTION .text
>>   %macro APPLY_WELCH_FN 0
>> @@ -261,3 +263,92 @@ APPLY_WELCH_FN
>>   INIT_YMM avx2
>>   APPLY_WELCH_FN
>>   %endif
>> +
>> +%macro COMPUTE_AUTOCORR_FN 0
>> +cglobal lpc_compute_autocorr, 4, 7, 8, data, len, lag, autoc, lag_p, data_l, len_p
> 
> Already mentioned, but it should be 3 not 8.

Already done, as said on IRC not 10 minutes after I submitted it.

> 
>> +
>> +    shl lagd, 3
>> +    shl lenq, 3
>> +    xor lag_pq, lag_pq
>> +
>> +.lag_l:
>> +    movaps m8, [autoc_init_tab]
> 
> m2
> 
>> +
>> +    mov len_pq, lag_pq
>> +
>> +    lea data_lq, [lag_pq + mmsize - 8]
>> +    neg data_lq                     ; -j - mmsize
>> +    add data_lq, dataq              ; data[-j - mmsize]
>> +.len_l:
>> +    ; We waste the upper value here on SSE2,
>> +    ; but we use it on AVX.
>> +    movupd xm0, [dataq + len_pq]    ; data[i]
> 
> movsd

Fixed.

> 
>> +    movupd m1, [data_lq + len_pq]   ; data[i - j]
>> +
>> +%if cpuflag(avx)
> 
> %if mmsize == 32 here and everywhere else.

Done.

> 
>> +    vbroadcastsd m0, xm0
> 
> This is AVX2. AVX only has memory input argument. So use that and save 
> the movsd from above for the FMA3 version.
> 
>> +    vperm2f128 m1, m1, m1, 0x01
> 
> Aren't you loading 16 extra bytes for no reason if you're just going to 
> use the upper 16 bytes from the load above?

Lane swapped, like you mentioned.

>> +%endif
>> +
>> +    shufpd m0, m0, m0, 1100b
> 
> The last argument has two bits, not four. What you're doing here is a 
> splat/broadcast, so you don't need it for FMA3.
> 
>> +    shufpd m1, m1, m1, 0101b
> 
> The upper two bits of imm8 are ignored.

Intentional. Not ignored on FMA3.
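
To make the imm8 point concrete, here is a minimal scalar model of SHUFPD with both sources equal, as in the patch. It is only a sketch: the function name shufpd_model and the lanes parameter are made up for illustration and are not part of FFmpeg or x86inc. The 128-bit form reads only the low two bits of the immediate, while the 256-bit VSHUFPD form reads four, two per 128-bit lane.

```c
#include <assert.h>

/* Hypothetical scalar model of (V)SHUFPD with src1 == src2 == src.
 * lanes = 1 models a 128-bit register, lanes = 2 a 256-bit one.
 * Each 128-bit lane consumes two bits of the immediate:
 *   low bit  selects the lane's low or high element for result element 0,
 *   high bit selects the lane's low or high element for result element 1. */
void shufpd_model(double *dst, const double *src, int lanes, unsigned imm)
{
    double tmp[4];                          /* allows dst == src */

    assert(lanes == 1 || lanes == 2);
    for (int l = 0; l < lanes; l++) {
        const double *in = src + 2 * l;     /* this lane's two elements */
        unsigned bits = imm >> (2 * l);     /* this lane's two selector bits */

        tmp[2 * l + 0] = (bits & 1) ? in[1] : in[0];
        tmp[2 * l + 1] = (bits & 2) ? in[1] : in[0];
    }
    for (int i = 0; i < 2 * lanes; i++)
        dst[i] = tmp[i];
}
```

Under this model, 1100b on a 128-bit register duplicates the low element, and 0101b swaps the two elements of every lane, which on a 256-bit register only works because bits 2 and 3 are read as well.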

>> +
>> +%if cpuflag(fma3)
>> +    fmaddpd m8, m0, m1, m8          ; sum += data[i]*data[i-j]
>> +%else
>> +    mulpd m0, m1
>> +    addpd m8, m0                    ; sum += data[i]*data[i-j]
>> +%endif
>> +
>> +    add len_pq, 8
>> +    cmp len_pq, lenq
>> +    jl .len_l
>> +
>> +    movups [autocq + lag_pq], m8    ; autoc[j] = sum
>> +    add lag_pq, mmsize
>> +    cmp lag_pq, lagq
>> +    jl .lag_l
>> +
>> +    ; The tail computation is guaranteed never to happen
>> +    ; as long as we're doing multiples of 4, rather than 2.
>> +    ; It is trivial to convert this to avx if ever needed.
>> +%if !cpuflag(avx)
> 
> This doesn't seem to be tested as is. Maybe the checkasm should try 
> other lag values?

That's for the checkasm patch. You can trigger this check with
fate-alac-16-lpc-orders as-is.
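
For anyone following the thread without the source at hand, the quantity the quoted loop accumulates can be sketched in scalar C from the comments in the diff ("sum += data[i]*data[i-j]", "autoc[j] = sum") and from autoc_init_tab, which starts every sum at 1.0. This is a rough sketch and not FFmpeg's C implementation: the name autocorr_ref is hypothetical, it writes exactly lag entries, and it starts each inner loop at i = j so that it never reads before data[0], whereas the vector code starts all elements of one vector at the same i.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference for the autocorrelation in the patch.
 * Assumption: autoc has room for `lag` entries and every sum is
 * initialised to 1.0, mirroring autoc_init_tab in the quoted diff. */
void autocorr_ref(const double *data, ptrdiff_t len, int lag, double *autoc)
{
    assert(data && autoc && len >= 0 && lag >= 0);

    for (int j = 0; j < lag; j++) {
        double sum = 1.0;                   /* initial value from autoc_init_tab */

        for (ptrdiff_t i = j; i < len; i++)
            sum += data[i] * data[i - j];   /* sum += data[i]*data[i-j] */

        autoc[j] = sum;                     /* autoc[j] = sum */
    }
}
```

A reference along these lines is the kind of thing a checkasm test could compare the SIMD output against, with a tolerance, for lag values that are not multiples of 4.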



Thread overview: 14+ messages
2024-05-25 20:57 Lynne via ffmpeg-devel
2024-05-25 22:12 ` Michael Niedermayer
2024-05-25 22:31 ` James Almer
2024-05-25 22:45   ` James Almer
2024-05-26  0:02     ` Lynne via ffmpeg-devel
2024-05-26  0:09       ` James Almer
2024-05-25 23:24   ` Lynne via ffmpeg-devel [this message]
2024-05-25 23:41     ` James Almer
2024-05-26  5:45   ` Rémi Denis-Courmont
2024-05-26  0:39 ` James Almer
2024-05-26  1:42 ` [FFmpeg-devel] [PATCH v2] " Lynne via ffmpeg-devel
2024-05-26  1:51   ` James Almer
2024-05-26  2:16     ` James Almer
2024-05-26 19:43   ` Michael Niedermayer
