From: "Martin Storsjö" <martin@martin.st>
To: ffmpeg-devel@ffmpeg.org
Cc: Logan Lyu <Logan.Lyu@myais.com.cn>, "J . Dekker" <jdek@itanimul.li>
Subject: [FFmpeg-devel] [PATCH 1/4] aarch64: Fix ff_hevc_put_hevc_epel_h48_8_neon_i8mm
Date: Tue, 12 Mar 2024 15:12:26 +0200
Message-ID: <20240312131229.1551-1-martin@martin.st> (raw)
The first 32 elements of each row were correct, while the
last 16 were scrambled.
This hasn't been noticed, because the checkasm test erroneously
only checked half of the output (for 8 bit functions), and
apparently none of the samples as part of "fate-hevc" seem to
trigger this specific function.
---
libavcodec/aarch64/hevcdsp_epel_neon.S | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/libavcodec/aarch64/hevcdsp_epel_neon.S b/libavcodec/aarch64/hevcdsp_epel_neon.S
index 2dafa09337..d3f0a26f79 100644
--- a/libavcodec/aarch64/hevcdsp_epel_neon.S
+++ b/libavcodec/aarch64/hevcdsp_epel_neon.S
@@ -1572,6 +1572,7 @@ function ff_hevc_put_hevc_epel_h48_8_neon_i8mm, export=1
xtn2 v22.8h, v26.4s
xtn v23.4h, v23.4s
xtn2 v23.8h, v27.4s
+ add x7, x0, #64
st4 {v20.8h, v21.8h, v22.8h, v23.8h}, [x0], x10
ext v4.16b, v2.16b, v3.16b, #1
ext v5.16b, v2.16b, v3.16b, #2
@@ -1584,11 +1585,14 @@ function ff_hevc_put_hevc_epel_h48_8_neon_i8mm, export=1
usdot v21.4s, v4.16b, v30.16b
usdot v22.4s, v5.16b, v30.16b
usdot v23.4s, v6.16b, v30.16b
- xtn v20.4h, v20.4s
- xtn2 v20.8h, v22.4s
- xtn v21.4h, v21.4s
- xtn2 v21.8h, v23.4s
- add x7, x0, #64
+ zip1 v24.4s, v20.4s, v22.4s
+ zip2 v25.4s, v20.4s, v22.4s
+ zip1 v26.4s, v21.4s, v23.4s
+ zip2 v27.4s, v21.4s, v23.4s
+ xtn v20.4h, v24.4s
+ xtn2 v20.8h, v25.4s
+ xtn v21.4h, v26.4s
+ xtn2 v21.8h, v27.4s
st2 {v20.8h, v21.8h}, [x7]
b.ne 1b
ret
--
2.39.3 (Apple Git-146)
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next reply other threads:[~2024-03-12 13:12 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-12 13:12 Martin Storsjö [this message]
2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 2/4] checkasm: hevc_pel: Check the full output in hevc_epel/hevc_qpel Martin Storsjö
2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 3/4] checkasm: hevc_pel: Split a couple excessively long lines Martin Storsjö
2024-03-12 13:12 ` [FFmpeg-devel] [PATCH 4/4] checkasm: hevc_pel: Use checkasm_check for printing failing output Martin Storsjö
2024-03-14 12:47 ` [FFmpeg-devel] [PATCH 1/4] aarch64: Fix ff_hevc_put_hevc_epel_h48_8_neon_i8mm J. Dekker
2024-03-14 13:16 ` Martin Storsjö
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240312131229.1551-1-martin@martin.st \
--to=martin@martin.st \
--cc=Logan.Lyu@myais.com.cn \
--cc=ffmpeg-devel@ffmpeg.org \
--cc=jdek@itanimul.li \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This inbox may be cloned and mirrored by anyone:
git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git
# If you have public-inbox 1.1+ installed, you may
# initialize and index your mirror using the following commands:
public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
ffmpegdev@gitmailbox.com
public-inbox-index ffmpegdev
Example config snippet for mirrors.
AGPL code for this site: git clone https://public-inbox.org/public-inbox.git