From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 3/5] aarch64: Add Linux runtime cpu feature detection using getauxval(AT_HWCAP)
Date: Wed, 31 May 2023 22:37:56 +0300 (EEST)
Message-ID: <1a0f544-3865-366b-1f28-746dc6a68533@martin.st>
In-Reply-To: <3789008.K3ae2cLcPR@basile.remlab.net>

On Wed, 31 May 2023, Rémi Denis-Courmont wrote:

> On Tuesday, 30 May 2023, 15.30.41 EEST, Martin Storsjö wrote:
>> Based partially on code by Janne Grunau.
>>
>> ---
>> Updated to use both the direct HWCAP* macros and HWCAP_CPUID. A
>> not unreasonably old distribution like Ubuntu 20.04 does have
>> HWCAP_CPUID but not HWCAP2_I8MM in the distribution provided headers.
>>
>> Alternatively I guess we could carry our own fallback hardcoded values
>> for the HWCAP* values we use and skip HWCAP_CPUID.
>> ---
>>  configure               |  2 ++
>>  libavutil/aarch64/cpu.c | 63 +++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 65 insertions(+)
>>
>> diff --git a/configure b/configure
>> index 50eb27ba0e..b39de74de5 100755
>> --- a/configure
>> +++ b/configure
>> @@ -2209,6 +2209,7 @@ HAVE_LIST_PUB="
>>
>>  HEADERS_LIST="
>>      arpa_inet_h
>> +    asm_hwcap_h
>>      asm_types_h
>>      cdio_paranoia_h
>>      cdio_paranoia_paranoia_h
>> @@ -6432,6 +6433,7 @@ check_headers io.h
>>  enabled libdrm &&
>>      check_headers linux/dma-buf.h
>>
>> +check_headers asm/hwcap.h
>>  check_headers linux/perf_event.h
>>  check_headers libcrystalhd/libcrystalhd_if.h
>>  check_headers malloc.h
>> diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
>> index 0c76f5ad15..4563959ffd 100644
>> --- a/libavutil/aarch64/cpu.c
>> +++ b/libavutil/aarch64/cpu.c
>> @@ -20,6 +20,67 @@
>>  #include "libavutil/cpu_internal.h"
>>  #include "config.h"
>>
>> +#if (defined(__linux__) || defined(__ANDROID__)) && HAVE_GETAUXVAL && HAVE_ASM_HWCAP_H
>> +#include <stdint.h>
>> +#include <asm/hwcap.h>
>> +#include <sys/auxv.h>
>> +
>> +#define get_cpu_feature_reg(reg, val) \
>> +        __asm__("mrs %0, " #reg : "=r" (val))
>> +
>> +static int detect_flags(void)
>> +{
>> +    int flags = 0;
>> +    unsigned long hwcap, hwcap2;
>> +
>> +    // Check for support using direct individual HWCAPs
>> +    hwcap = getauxval(AT_HWCAP);
>> +#ifdef HWCAP_ASIMDDP
>> +    if (hwcap & HWCAP_ASIMDDP)
>> +        flags |= AV_CPU_FLAG_DOTPROD;
>> +#endif
>> +
>> +#ifdef AT_HWCAP2
>> +    hwcap2 = getauxval(AT_HWCAP2);
>> +#ifdef HWCAP2_I8MM
>> +    if (hwcap2 & HWCAP2_I8MM)
>> +        flags |= AV_CPU_FLAG_I8MM;
>> +#endif
>> +#endif
>> +
>> +    // Silence warnings if none of the hwcaps to check are known.
>> +    (void)hwcap;
>> +    (void)hwcap2;
>> +
>> +#if defined(HWCAP_CPUID)
>> +    // The HWCAP_* defines for individual extensions may become available late, as
>> +    // they require updates to userland headers. As a fallback, see if we can access
>> +    // the CPUID registers (trapped via the kernel).
>> +    // See https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html
>
> I don't actually care which method is used and whether to hard-code the
> missing constants or not. But doing both methods is weird. If you are going to
> trigger the TID3 traps anyway, there is no point checking the auxiliary
> vectors before, AFAICT.

Yeah, that's true.

> You *could* check the auxiliary vectors as a run-time fallback if HWCAP_CPUID
> is *not* set, but that only really makes sense for HWCAP_FP and HWCAP_ASIMD, not
> for HWCAP_ASIMDDP (Linux 4.15) and HWCAP2_I8MM (Linux 5.6) which are more recent
> than HWCAP_CPUID (Linux 4.11). And then, that would be only in the corner case
> that FP and/or AdvSIMD were explicitly disabled since they are on by default
> for all AArch64 targets.

Yeah - I guess there's no potential configuration where a kernel does know
about HWCAP_CPUID and newer HWCAPs but has decided to set HWCAP_CPUID to 0
and not handle the trapping?

I considered falling back on the trapping CPUID codepath only if the
individual HWCAPs weren't detected/supported, but that soon becomes quite
a mess if we're adding more than a couple of extensions.

So I guess after all that it's simplest to just go with CPUID, possibly
with a code comment that we could go with individual HWCAPs at some point
in the future if we want to simplify things and don't care about older
systems/toolchains.

// Martin
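For illustration, the CPUID-only approach settled on above could look roughly like the sketch below. This is not the patch's actual code. The EXAMPLE_* names are hypothetical stand-ins (for AV_CPU_FLAG_DOTPROD, AV_CPU_FLAG_I8MM and HWCAP_CPUID), the HWCAP_CPUID bit value is hard-coded here as an assumption so the sketch does not depend on header versions, and treating any nonzero ID field as "feature present" is likewise an assumption. The field positions used are ID_AA64ISAR0_EL1.DP at bits [47:44] and ID_AA64ISAR1_EL1.I8MM at bits [55:52].

```c
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#endif

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM. */
#define EXAMPLE_FLAG_DOTPROD (1 << 0)
#define EXAMPLE_FLAG_I8MM    (1 << 1)

/* Assumed value of HWCAP_CPUID (bit 11 of AT_HWCAP on arm64 Linux),
 * hard-coded so the sketch does not rely on recent userland headers. */
#define EXAMPLE_HWCAP_CPUID (1UL << 11)

/* Extract the 4-bit ID field that starts at bit `shift`. */
static unsigned id_field(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xf);
}

/* Pure decoding step: map raw ID register values to feature flags. */
static int flags_from_id_regs(uint64_t isar0, uint64_t isar1)
{
    int flags = 0;

    /* ID_AA64ISAR0_EL1.DP, bits [47:44]: dot product instructions. */
    if (id_field(isar0, 44) >= 1)
        flags |= EXAMPLE_FLAG_DOTPROD;

    /* ID_AA64ISAR1_EL1.I8MM, bits [55:52]: int8 matrix multiply. */
    if (id_field(isar1, 52) >= 1)
        flags |= EXAMPLE_FLAG_I8MM;

    return flags;
}

/* Read the ID registers only if the kernel advertises HWCAP_CPUID,
 * i.e. it emulates the trapped mrs accesses from userspace.
 * Returns 0 on non-Linux or non-AArch64 builds. */
static int detect_flags_cpuid(void)
{
#if defined(__linux__) && defined(__aarch64__)
    if (getauxval(AT_HWCAP) & EXAMPLE_HWCAP_CPUID) {
        uint64_t isar0, isar1;

        __asm__("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
        __asm__("mrs %0, ID_AA64ISAR1_EL1" : "=r" (isar1));
        return flags_from_id_regs(isar0, isar1);
    }
#endif
    return 0;
}
```

Keeping the bit decoding in a separate function from the register reads means each further extension adds one field check, and the decoding can be exercised on any host with made-up register values.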