Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 1/4] aarch64: Fix ff_hevc_put_hevc_epel_h48_8_neon_i8mm
@ 2024-03-12 13:12 Martin Storsjö
  2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 2/4] checkasm: hevc_pel: Check the full output in hevc_epel/hevc_qpel Martin Storsjö
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Martin Storsjö @ 2024-03-12 13:12 UTC
  To: ffmpeg-devel; +Cc: Logan Lyu, J . Dekker

The first 32 elements of each row were correct, while the
last 16 were scrambled.

This went unnoticed because the checkasm test erroneously
checked only half of the output (for 8 bit functions), and
apparently none of the samples in "fate-hevc" trigger this
specific function.
---
 libavcodec/aarch64/hevcdsp_epel_neon.S | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
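
For readers following the lane shuffling, here is a small Python model of the
tail store (not part of the patch). It assumes that v20..v23 hold the 32-bit
results for output positions 0,4,8,12 / 1,5,9,13 / 2,6,10,14 / 3,7,11,15,
which is what the ext #1/#2/#3 context lines suggest but is not stated in the
commit message. The helper names are hypothetical stand-ins for the
instructions of the same name and model lane order only, not bit widths.

```python
def zip1(a, b):
    """Interleave the lower halves of a and b (models ZIP1 on .4s lanes)."""
    half = len(a) // 2
    return [x for pair in zip(a[:half], b[:half]) for x in pair]

def zip2(a, b):
    """Interleave the upper halves of a and b (models ZIP2 on .4s lanes)."""
    half = len(a) // 2
    return [x for pair in zip(a[half:], b[half:]) for x in pair]

def xtn_pair(lo, hi):
    """Model XTN + XTN2: narrowed 'lo' fills the low half, 'hi' the high half."""
    return lo + hi

def st2(a, b):
    """Model ST2 {a, b}: memory order is a[0], b[0], a[1], b[1], ..."""
    return [x for pair in zip(a, b) for x in pair]

# Assumed lane contents: each value is the output position it belongs to.
v20, v21, v22, v23 = [0, 4, 8, 12], [1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15]

# Old sequence: narrow v20/v22 and v21/v23 directly, then st2.
old = st2(xtn_pair(v20, v22), xtn_pair(v21, v23))

# New sequence: zip1/zip2 first so each register holds consecutive
# even (or odd) positions, then narrow and st2.
new = st2(xtn_pair(zip1(v20, v22), zip2(v20, v22)),
          xtn_pair(zip1(v21, v23), zip2(v21, v23)))

print("old:", old)
print("new:", new)
```

Under these assumptions the old sequence stores positions in the order
0,1,4,5,8,9,12,13,2,3,6,7,10,11,14,15, while the new one stores 0..15 in
order, which would match the "last 16 were scrambled" description.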

diff --git a/libavcodec/aarch64/hevcdsp_epel_neon.S b/libavcodec/aarch64/hevcdsp_epel_neon.S
index 2dafa09337..d3f0a26f79 100644
--- a/libavcodec/aarch64/hevcdsp_epel_neon.S
+++ b/libavcodec/aarch64/hevcdsp_epel_neon.S
@@ -1572,6 +1572,7 @@ function ff_hevc_put_hevc_epel_h48_8_neon_i8mm, export=1
         xtn2            v22.8h, v26.4s
         xtn             v23.4h, v23.4s
         xtn2            v23.8h, v27.4s
+        add             x7, x0, #64
         st4             {v20.8h, v21.8h, v22.8h, v23.8h}, [x0], x10
         ext             v4.16b, v2.16b, v3.16b, #1
         ext             v5.16b, v2.16b, v3.16b, #2
@@ -1584,11 +1585,14 @@ function ff_hevc_put_hevc_epel_h48_8_neon_i8mm, export=1
         usdot           v21.4s, v4.16b, v30.16b
         usdot           v22.4s, v5.16b, v30.16b
         usdot           v23.4s, v6.16b, v30.16b
-        xtn             v20.4h, v20.4s
-        xtn2            v20.8h, v22.4s
-        xtn             v21.4h, v21.4s
-        xtn2            v21.8h, v23.4s
-        add             x7, x0, #64
+        zip1            v24.4s, v20.4s, v22.4s
+        zip2            v25.4s, v20.4s, v22.4s
+        zip1            v26.4s, v21.4s, v23.4s
+        zip2            v27.4s, v21.4s, v23.4s
+        xtn             v20.4h, v24.4s
+        xtn2            v20.8h, v25.4s
+        xtn             v21.4h, v26.4s
+        xtn2            v21.8h, v27.4s
         st2             {v20.8h, v21.8h}, [x7]
         b.ne            1b
         ret
-- 
2.39.3 (Apple Git-146)

Thread overview: 6+ messages
2024-03-12 13:12 [FFmpeg-devel] [PATCH 1/4] aarch64: Fix ff_hevc_put_hevc_epel_h48_8_neon_i8mm Martin Storsjö
2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 2/4] checkasm: hevc_pel: Check the full output in hevc_epel/hevc_qpel Martin Storsjö
2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 3/4] checkasm: hevc_pel: Split a couple excessively long lines Martin Storsjö
2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 4/4] checkasm: hevc_pel: Use checkasm_check for printing failing output Martin Storsjö
2024-03-14 12:47 ` [FFmpeg-devel] [PATCH 1/4] aarch64: Fix ff_hevc_put_hevc_epel_h48_8_neon_i8mm J. Dekker
2024-03-14 13:16   ` Martin Storsjö
