Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 1/3] x86/ac3dsp: reduce instruction count inside the float_to_fixed24 loop
@ 2023-11-22 19:49 James Almer
  2023-11-22 19:49 ` [FFmpeg-devel] [PATCH 2/3] x86/ac3dsp: add ff_float_to_fixed24_avx2() James Almer
  2023-11-22 19:49 ` [FFmpeg-devel] [PATCH 3/3] avcodec/ac3dsp: make len a size_t in float_to_fixed24 James Almer
  0 siblings, 2 replies; 11+ messages in thread
From: James Almer @ 2023-11-22 19:49 UTC
  To: ffmpeg-devel

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavcodec/x86/ac3dsp.asm | 46 +++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/libavcodec/x86/ac3dsp.asm b/libavcodec/x86/ac3dsp.asm
index a95d359d95..42c8310462 100644
--- a/libavcodec/x86/ac3dsp.asm
+++ b/libavcodec/x86/ac3dsp.asm
@@ -77,16 +77,20 @@ AC3_EXPONENT_MIN
 INIT_XMM sse2
 cglobal float_to_fixed24, 3, 3, 9, dst, src, len
     movaps     m0, [pf_1_24]
+    shl      lenq, 2
+    add      srcq, lenq
+    add      dstq, lenq
+    neg      lenq
 .loop:
-    movaps     m1, [srcq    ]
-    movaps     m2, [srcq+16 ]
-    movaps     m3, [srcq+32 ]
-    movaps     m4, [srcq+48 ]
+    movaps     m1, [srcq+lenq    ]
+    movaps     m2, [srcq+lenq+16 ]
+    movaps     m3, [srcq+lenq+32 ]
+    movaps     m4, [srcq+lenq+48 ]
 %ifdef m8
-    movaps     m5, [srcq+64 ]
-    movaps     m6, [srcq+80 ]
-    movaps     m7, [srcq+96 ]
-    movaps     m8, [srcq+112]
+    movaps     m5, [srcq+lenq+64 ]
+    movaps     m6, [srcq+lenq+80 ]
+    movaps     m7, [srcq+lenq+96 ]
+    movaps     m8, [srcq+lenq+112]
 %endif
     mulps      m1, m0
     mulps      m2, m0
@@ -108,24 +112,20 @@ cglobal float_to_fixed24, 3, 3, 9, dst, src, len
     cvtps2dq   m7, m7
     cvtps2dq   m8, m8
 %endif
-    movdqa  [dstq    ], m1
-    movdqa  [dstq+16 ], m2
-    movdqa  [dstq+32 ], m3
-    movdqa  [dstq+48 ], m4
+    movdqa  [dstq+lenq    ], m1
+    movdqa  [dstq+lenq+16 ], m2
+    movdqa  [dstq+lenq+32 ], m3
+    movdqa  [dstq+lenq+48 ], m4
 %ifdef m8
-    movdqa  [dstq+64 ], m5
-    movdqa  [dstq+80 ], m6
-    movdqa  [dstq+96 ], m7
-    movdqa  [dstq+112], m8
-    add      srcq, 128
-    add      dstq, 128
-    sub      lenq, 32
+    movdqa  [dstq+lenq+64 ], m5
+    movdqa  [dstq+lenq+80 ], m6
+    movdqa  [dstq+lenq+96 ], m7
+    movdqa  [dstq+lenq+112], m8
+    add      lenq, 128
 %else
-    add      srcq, 64
-    add      dstq, 64
-    sub      lenq, 16
+    add      lenq, 64
 %endif
-    ja .loop
+    jl .loop
     RET
 
 ;------------------------------------------------------------------------------
-- 
2.42.1
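
For readers less familiar with the idiom, the loop change in this patch can be sketched in scalar C. This is an illustration only, not code from the patch: the function names are hypothetical, the index counts elements rather than bytes (the assembly scales len by 4 with shl), and the conversion truncates toward zero whereas cvtps2dq rounds according to MXCSR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for mulps + cvtps2dq.
 * Simplification: the cast truncates toward zero; the SIMD code rounds. */
static int32_t to_fixed24(float x)
{
    return (int32_t)(x * 16777216.0f); /* 2^24, the value held in pf_1_24 */
}

/* Before: both pointers advance and the counter is decremented,
 * i.e. three updates per iteration. */
static void float_to_fixed24_ptr(int32_t *dst, const float *src, size_t len)
{
    while (len > 0) {
        *dst++ = to_fixed24(*src++);
        len--;
    }
}

/* After: both pointers are moved to the end of their buffers once, and a
 * single negative offset counts up towards zero, i.e. one update per
 * iteration. */
static void float_to_fixed24_neg(int32_t *dst, const float *src, size_t len)
{
    ptrdiff_t i = -(ptrdiff_t)len;

    src += len;
    dst += len;
    for (; i < 0; i++)
        dst[i] = to_fixed24(src[i]);
}
```

In the assembly the same shape means the flags set by the single add on lenq double as the loop condition: jl keeps looping while the offset is still negative and falls through once it reaches zero.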


Thread overview: 11+ messages
2023-11-22 19:49 [FFmpeg-devel] [PATCH 1/3] x86/ac3dsp: reduce instruction count inside the float_to_fixed24 loop James Almer
2023-11-22 19:49 ` [FFmpeg-devel] [PATCH 2/3] x86/ac3dsp: add ff_float_to_fixed24_avx2() James Almer
2023-11-23  6:56   ` Kieran Kunhya
2023-11-23 11:51     ` James Almer
2023-11-23 15:19       ` Henrik Gramner via ffmpeg-devel
2023-11-22 19:49 ` [FFmpeg-devel] [PATCH 3/3] avcodec/ac3dsp: make len a size_t in float_to_fixed24 James Almer
2023-11-22 20:05   ` Rémi Denis-Courmont
2023-11-24 21:01   ` Michael Niedermayer
2023-11-24 21:03     ` James Almer
2023-11-24 23:49       ` Michael Niedermayer
2023-11-26 10:29       ` Anton Khirnov
