Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: "Martin Storsjö" <martin@martin.st>
To: ffmpeg-devel@ffmpeg.org
Cc: Logan Lyu <Logan.Lyu@myais.com.cn>, "J . Dekker" <jdek@itanimul.li>
Subject: [FFmpeg-devel] [PATCH 00/21] aarch64: hevc: Add missing hevc_pel NEON functions
Date: Mon, 25 Mar 2024 17:02:22 +0200
Message-ID: <20240325150243.59058-1-martin@martin.st>

Hi,

For some time now, we have had fairly complete AArch64 NEON
coverage for the HEVC decoder.

However, some of these functions require the I8MM instruction set
extension, and many of them (but not all) lack a plain NEON
version.

This patchset fills in a regular NEON version of all functions
where we have an I8MM function.

For context: the I8MM instruction set extension is a mandatory
part of Armv8.6-A. For example, Apple M2 and AWS Graviton 3 have
it, but Apple M1 and Ampere Altra don't.
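
To illustrate why a missing plain NEON version matters, here is a
minimal sketch of runtime dispatch, assuming a simplified model of
the init code. All names (the TOY_* flags, ToyDSPContext,
toy_dsp_init) are hypothetical and are not the identifiers used in
hevcdsp_init_aarch64.c; the "neon" and "i8mm" functions are plain C
stand-ins for what would be hand-written assembly. The 4-tap
product of unsigned pixels and signed taps is the kind of operation
that I8MM's mixed-sign dot product accelerates.

```c
#include <stdint.h>

/* Hypothetical CPU feature bits for this sketch (not FFmpeg's real flags). */
enum { TOY_CPU_NEON = 1 << 0, TOY_CPU_I8MM = 1 << 1 };

/* Records which implementation was selected, so the choice can be inspected. */
enum ToyImpl { TOY_IMPL_C, TOY_IMPL_NEON, TOY_IMPL_I8MM };

/* One 4-tap filter step: unsigned 8-bit pixels times signed 8-bit taps. */
typedef int (*toy_dot4_fn)(const uint8_t *px, const int8_t *taps);

typedef struct ToyDSPContext {
    toy_dot4_fn  dot4;
    enum ToyImpl impl;
} ToyDSPContext;

/* Portable C fallback. */
static int toy_dot4_c(const uint8_t *px, const int8_t *taps)
{
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += px[i] * taps[i];
    return sum;
}

/* Stand-ins only: real versions would be assembly, not calls to the C code. */
static int toy_dot4_neon(const uint8_t *px, const int8_t *taps)
{
    return toy_dot4_c(px, taps);
}

static int toy_dot4_i8mm(const uint8_t *px, const int8_t *taps)
{
    return toy_dot4_c(px, taps);
}

/* Start from the C fallback, then let each available extension override it.
 * If the NEON branch were missing, a NEON-only CPU would stay on the C code. */
static ToyDSPContext toy_dsp_init(int cpu_flags)
{
    ToyDSPContext c = { toy_dot4_c, TOY_IMPL_C };

    if (cpu_flags & TOY_CPU_NEON) {
        c.dot4 = toy_dot4_neon;
        c.impl = TOY_IMPL_NEON;
    }
    if (cpu_flags & TOY_CPU_I8MM) {
        c.dot4 = toy_dot4_i8mm;
        c.impl = TOY_IMPL_I8MM;
    }
    return c;
}
```

In this model, a CPU that reports only TOY_CPU_NEON ends up with
the NEON entry, and one that also reports TOY_CPU_I8MM gets the
I8MM entry instead.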

This patchset takes decoding of a 1080p HEVC clip from 402
fps to 649 fps on an Apple M1.

Patch #2 also fixes a subtle bug in the existing implementation:
two functions relied on the contents of the stack below the
stack pointer remaining untouched within a function. If a signal
is delivered, those parts of the stack can be clobbered.
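
As a rough illustration of that hazard, here is a toy model in
portable C. It does not touch the real stack pointer; ToyStack,
toy_deliver_signal and toy_buffer_survives are hypothetical names
invented for this sketch, and the "signal" is simply a write just
below the modelled sp. It only assumes that memory below the stack
pointer may be overwritten at any time, which is the situation
described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_STACK_SIZE 256
#define TOY_BUF_SIZE    64

/* Toy downward-growing stack: bytes at index >= sp are "allocated". */
typedef struct ToyStack {
    uint8_t mem[TOY_STACK_SIZE];
    size_t  sp;
} ToyStack;

/* Model of asynchronous signal delivery: a frame is written just below sp. */
static void toy_deliver_signal(ToyStack *s, size_t frame_size)
{
    size_t n = frame_size < s->sp ? frame_size : s->sp;
    memset(s->mem + s->sp - n, 0xAA, n);
}

/* Allocate a buffer on the toy stack, walk halfway through it, take a
 * "signal", then check the buffer. With iterate_with_sp != 0 the walk is
 * done by moving sp itself, which leaves the first half below sp.
 * Returns 1 if the buffer contents survived, 0 if they were clobbered. */
static int toy_buffer_survives(int iterate_with_sp)
{
    ToyStack s;
    memset(s.mem, 0, sizeof(s.mem));
    s.sp = TOY_STACK_SIZE;

    s.sp -= TOY_BUF_SIZE;               /* allocate the buffer */
    size_t buf = s.sp;
    for (size_t i = 0; i < TOY_BUF_SIZE; i++)
        s.mem[buf + i] = (uint8_t)i;

    size_t cursor = buf + TOY_BUF_SIZE / 2;
    if (iterate_with_sp)
        s.sp = cursor;                  /* unsafe: sp used as the iterator */

    toy_deliver_signal(&s, 16);         /* may happen at any point */

    if (iterate_with_sp)
        s.sp = buf;                     /* rewind, expecting intact data */

    for (size_t i = 0; i < TOY_BUF_SIZE; i++)
        if (s.mem[buf + i] != (uint8_t)i)
            return 0;
    return 1;
}
```

Iterating with a separate cursor keeps the whole buffer at or above
sp, so the modelled signal frame lands outside it; iterating with
sp itself exposes the already-visited part of the buffer.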

// Martin

Martin Storsjö (21):
  aarch64: hevc: Reorder a misplaced function init line
  aarch64: hevc: Don't iterate with sp in
    ff_hevc_put_hevc_qpel_uni_w_hv32/64_8_neon_i8mm
  aarch64: hevc: Merge consecutive stores in
    put_hevc_\type\()_h16_8_neon
  aarch64: hevc: Specialize put_hevc_\type\()_h*_8_neon for horizontal
    looping
  aarch64: hevc: Use ld1r instead of ldr+dup in hevc_qpel_uni_w_h
  aarch64: hevc: Implement a neon version of put_hevc_epel_h*_8
  aarch64: hevc: Implement a neon version of hevc_epel_uni_w_h*_8
  aarch64: hevc: Split the epel_*_hv functions into two parts
  aarch64: hevc: Reorder epel_hv functions to prepare for templating
  aarch64: hevc: Produce epel_hv functions for both plain neon and i8mm
  aarch64: hevc: Produce epel_uni_hv functions for both neon and i8mm
  aarch64: hevc: Produce epel_uni_w_hv functions for both neon and i8mm
  aarch64: hevc: Produce epel_bi_hv functions for both neon and i8mm
  aarch64: hevc: Implement a neon version of hevc_qpel_uni_w_h*_8
  aarch64: hevc: Split the qpel_*_hv functions into two parts
  aarch64: hevc: Deduplicate the hevc_put_hevc_qpel_uni_w_hv*_8_end_neon
    functions
  aarch64: hevc: Reorder qpel_hv functions to prepare for templating
  aarch64: hevc: Produce plain neon versions of qpel_hv
  aarch64: hevc: Produce plain neon versions of qpel_uni_hv
  aarch64: hevc: Produce plain neon versions of qpel_uni_w_hv
  aarch64: hevc: Produce plain neon versions of qpel_bi_hv

 libavcodec/aarch64/hevcdsp_epel_neon.S    | 1529 +++++++++++------
 libavcodec/aarch64/hevcdsp_init_aarch64.c |   96 +-
 libavcodec/aarch64/hevcdsp_qpel_neon.S    | 1804 +++++++++++++--------
 3 files changed, 2291 insertions(+), 1138 deletions(-)

-- 
2.39.3 (Apple Git-146)

Thread overview (26+ messages):
2024-03-25 15:02 Martin Storsjö [this message]
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 01/21] aarch64: hevc: Reorder a misplaced function init line Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 02/21] aarch64: hevc: Don't iterate with sp in ff_hevc_put_hevc_qpel_uni_w_hv32/64_8_neon_i8mm Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 03/21] aarch64: hevc: Merge consecutive stores in put_hevc_\type\()_h16_8_neon Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 04/21] aarch64: hevc: Specialize put_hevc_\type\()_h*_8_neon for horizontal looping Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 05/21] aarch64: hevc: Use ld1r instead of ldr+dup in hevc_qpel_uni_w_h Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 06/21] aarch64: hevc: Implement a neon version of put_hevc_epel_h*_8 Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 07/21] aarch64: hevc: Implement a neon version of hevc_epel_uni_w_h*_8 Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 08/21] aarch64: hevc: Split the epel_*_hv functions into two parts Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 09/21] aarch64: hevc: Reorder epel_hv functions to prepare for templating Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 10/21] aarch64: hevc: Produce epel_hv functions for both plain neon and i8mm Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 11/21] aarch64: hevc: Produce epel_uni_hv functions for both " Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 12/21] aarch64: hevc: Produce epel_uni_w_hv " Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 13/21] aarch64: hevc: Produce epel_bi_hv " Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 14/21] aarch64: hevc: Implement a neon version of hevc_qpel_uni_w_h*_8 Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 15/21] aarch64: hevc: Split the qpel_*_hv functions into two parts Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 16/21] aarch64: hevc: Deduplicate the hevc_put_hevc_qpel_uni_w_hv*_8_end_neon functions Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 17/21] aarch64: hevc: Reorder qpel_hv functions to prepare for templating Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 18/21] aarch64: hevc: Produce plain neon versions of qpel_hv Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 19/21] aarch64: hevc: Produce plain neon versions of qpel_uni_hv Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 20/21] aarch64: hevc: Produce plain neon versions of qpel_uni_w_hv Martin Storsjö
2024-03-25 15:02 ` [FFmpeg-devel] [PATCH 21/21] aarch64: hevc: Produce plain neon versions of qpel_bi_hv Martin Storsjö
2024-03-25 21:15 ` [FFmpeg-devel] [PATCH 00/21] aarch64: hevc: Add missing hevc_pel NEON functions Martin Storsjö
2024-03-25 21:56   ` J. Dekker
2024-03-26  6:01     ` Jean-Baptiste Kempf
2024-03-26  7:09       ` Martin Storsjö
