Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 3/5] aarch64: Add Linux runtime cpu feature detection using getauxval(AT_HWCAP)
Date: Wed, 31 May 2023 22:37:56 +0300 (EEST)
Message-ID: <1a0f544-3865-366b-1f28-746dc6a68533@martin.st>
In-Reply-To: <3789008.K3ae2cLcPR@basile.remlab.net>

On Wed, 31 May 2023, Rémi Denis-Courmont wrote:

> On Tuesday, 30 May 2023, 15:30:41 EEST, Martin Storsjö wrote:
>> Based partially on code by Janne Grunau.
>> 
>> ---
>> Updated to use both the direct HWCAP* macros and HWCAP_CPUID. A
>> not unreasonably old distribution like Ubuntu 20.04 does have
>> HWCAP_CPUID but not HWCAP2_I8MM in the distribution provided headers.
>> 
>> Alternatively I guess we could carry our own fallback hardcoded values
>> for the HWCAP* values we use and skip HWCAP_CPUID.
>> ---
>>  configure               |  2 ++
>>  libavutil/aarch64/cpu.c | 63 +++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 65 insertions(+)
>> 
>> diff --git a/configure b/configure
>> index 50eb27ba0e..b39de74de5 100755
>> --- a/configure
>> +++ b/configure
>> @@ -2209,6 +2209,7 @@ HAVE_LIST_PUB="
>>
>>  HEADERS_LIST="
>>      arpa_inet_h
>> +    asm_hwcap_h
>>      asm_types_h
>>      cdio_paranoia_h
>>      cdio_paranoia_paranoia_h
>> @@ -6432,6 +6433,7 @@ check_headers io.h
>>  enabled libdrm &&
>>      check_headers linux/dma-buf.h
>> 
>> +check_headers asm/hwcap.h
>>  check_headers linux/perf_event.h
>>  check_headers libcrystalhd/libcrystalhd_if.h
>>  check_headers malloc.h
>> diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
>> index 0c76f5ad15..4563959ffd 100644
>> --- a/libavutil/aarch64/cpu.c
>> +++ b/libavutil/aarch64/cpu.c
>> @@ -20,6 +20,67 @@
>>  #include "libavutil/cpu_internal.h"
>>  #include "config.h"
>> 
>> +#if (defined(__linux__) || defined(__ANDROID__)) && HAVE_GETAUXVAL && HAVE_ASM_HWCAP_H
>> +#include <stdint.h>
>> +#include <asm/hwcap.h>
>> +#include <sys/auxv.h>
>> +
>> +#define get_cpu_feature_reg(reg, val) \
>> +        __asm__("mrs %0, " #reg : "=r" (val))
>> +
>> +static int detect_flags(void)
>> +{
>> +    int flags = 0;
>> +    unsigned long hwcap, hwcap2;
>> +
>> +    // Check for support using direct individual HWCAPs
>> +    hwcap = getauxval(AT_HWCAP);
>> +#ifdef HWCAP_ASIMDDP
>> +    if (hwcap & HWCAP_ASIMDDP)
>> +        flags |= AV_CPU_FLAG_DOTPROD;
>> +#endif
>> +
>> +#ifdef AT_HWCAP2
>> +    hwcap2 = getauxval(AT_HWCAP2);
>> +#ifdef HWCAP2_I8MM
>> +    if (hwcap2 & HWCAP2_I8MM)
>> +        flags |= AV_CPU_FLAG_I8MM;
>> +#endif
>> +#endif
>> +
>> +    // Silence warnings if none of the hwcaps to check are known.
>> +    (void)hwcap;
>> +    (void)hwcap2;
>> +
>> +#if defined(HWCAP_CPUID)
>> +    // The HWCAP_* defines for individual extensions may become available late, as
>> +    // they require updates to userland headers. As a fallback, see if we can access
>> +    // the CPUID registers (trapped via the kernel).
>> +    // See https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html
>
> I don't actually care which method is used and whether to hard-code the
> missing constants or not. But doing both methods is weird. If you are going to
> trigger the TID3 traps anyway, there is no point checking the auxiliary
> vectors before, AFAICT.

Yeah, that's true.
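
For illustration, a CPUID-only variant could look roughly like the sketch
below. It keeps the get_cpu_feature_reg() macro from the patch and reads the
ID registers only when HWCAP_CPUID is set. The SKETCH_FLAG_* values,
id_field(), flags_from_id_regs() and detect_flags_cpuid_only() are
hypothetical names, not what the patch uses. The field positions (DP in
ID_AA64ISAR0_EL1 bits 47:44, I8MM in ID_AA64ISAR1_EL1 bits 55:52) are taken
from the kernel document linked in the patch, and treating any nonzero field
value as "present" is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM;
 * the real values live in libavutil/cpu.h. */
#define SKETCH_FLAG_DOTPROD (1 << 0)
#define SKETCH_FLAG_I8MM    (1 << 1)

/* Extract the 4-bit ID register field whose lowest bit is `shift`. */
static unsigned id_field(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xf);
}

/* Decode the two ID register values into feature flags.
 * Assumption: a nonzero field means the extension is implemented. */
static int flags_from_id_regs(uint64_t isar0, uint64_t isar1)
{
    int flags = 0;

    if (id_field(isar0, 44))   /* ID_AA64ISAR0_EL1.DP */
        flags |= SKETCH_FLAG_DOTPROD;
    if (id_field(isar1, 52))   /* ID_AA64ISAR1_EL1.I8MM */
        flags |= SKETCH_FLAG_I8MM;
    return flags;
}

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>

#ifndef HWCAP_CPUID
#define HWCAP_CPUID (1 << 11)  /* value from the arm64 uapi hwcap.h */
#endif

#define get_cpu_feature_reg(reg, val) \
        __asm__("mrs %0, " #reg : "=r" (val))

static int detect_flags_cpuid_only(void)
{
    uint64_t isar0, isar1;

    /* Only read the registers if the kernel announces that it emulates
     * the mrs accesses; otherwise the reads are expected to fault. */
    if (!(getauxval(AT_HWCAP) & HWCAP_CPUID))
        return 0;

    get_cpu_feature_reg(ID_AA64ISAR0_EL1, isar0);
    get_cpu_feature_reg(ID_AA64ISAR1_EL1, isar1);
    return flags_from_id_regs(isar0, isar1);
}
#endif
```

Splitting the decoding from the register reads keeps the bit twiddling
checkable on any host, while the mrs part only builds on aarch64 Linux.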

> You *could* check the auxiliary vectors as a run-time fallback if HWCAP_CPUID
> is *not* set, but that only really makes sense for HWCAP_FP and HWCAP_ASIMD,
> not for HWCAP_ASIMDDP (Linux 4.15) and HWCAP2_I8MM (Linux 5.6) which are more
> recent than HWCAP_CPUID (Linux 4.11). And then, that would be only in the
> corner case that FP and/or AdvSIMD were explicitly disabled since they are on
> by default for all AArch64 targets.

Yeah - I guess there's no potential configuration where a kernel does know 
about HWCAP_CPUID and newer HWCAPs but has decided to set HWCAP_CPUID to 0 
and not handle the trapping?
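
If such a configuration did exist, the run-time fallback described above
could be sketched as follows: trust the ID registers when HWCAP_CPUID is set,
and only otherwise look at the individual HWCAP bits. This is a sketch, not
what the patch does. The SK_* bit values are copied from the arm64 uapi
hwcap.h and prefixed so the example builds without that header; the
SKETCH_FLAG_* values, select_flags() and fake_id_regs() are hypothetical,
with fake_id_regs() standing in for the real mrs-based register reads.

```c
/* Bit values copied from the arm64 uapi <asm/hwcap.h>. */
#define SK_HWCAP_CPUID   (1UL << 11)
#define SK_HWCAP_ASIMDDP (1UL << 20)
#define SK_HWCAP2_I8MM   (1UL << 13)

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM. */
#define SKETCH_FLAG_DOTPROD (1 << 0)
#define SKETCH_FLAG_I8MM    (1 << 1)

/* Decode the individual HWCAP bits from the auxiliary vector. */
static int flags_from_hwcaps(unsigned long hwcap, unsigned long hwcap2)
{
    int flags = 0;

    if (hwcap & SK_HWCAP_ASIMDDP)
        flags |= SKETCH_FLAG_DOTPROD;
    if (hwcap2 & SK_HWCAP2_I8MM)
        flags |= SKETCH_FLAG_I8MM;
    return flags;
}

/* Pick the detection source at run time. read_id_regs is only called
 * when the kernel announces HWCAP_CPUID, so it never runs on a system
 * where the register reads would fault. */
static int select_flags(unsigned long hwcap, unsigned long hwcap2,
                        int (*read_id_regs)(void))
{
    if (hwcap & SK_HWCAP_CPUID)
        return read_id_regs();
    return flags_from_hwcaps(hwcap, hwcap2);
}

/* Illustration-only stand-in for the mrs-based reads: pretends the
 * ID registers report I8MM and nothing else. */
static int fake_id_regs(void)
{
    return SKETCH_FLAG_I8MM;
}
```

On a real system the callback would wrap the get_cpu_feature_reg() reads,
and hwcap/hwcap2 would come from getauxval(AT_HWCAP) and
getauxval(AT_HWCAP2).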

I considered falling back on the trapping CPUID codepath only if the 
individual HWCAPs weren't detected/supported, but that soon becomes quite 
a mess if we're adding more than a couple extensions.

So I guess after all that it's simplest to just go with CPUID, possibly 
with a code comment that we could go with individual HWCAPs at some point 
in the future if we want to simplify things and don't care about older 
systems/toolchains.
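
For comparison, the other option from the patch notes (individual HWCAPs
only, with our own fallback values for constants missing from older
headers) might look roughly like this. The numeric fallbacks are the values
from the arm64 uapi hwcap.h; the SKETCH_FLAG_* values, decode_hwcaps() and
detect_flags_hwcap_only() are hypothetical names, and the
__aarch64__/__linux__ guard stands in for the configure checks
(HAVE_GETAUXVAL, HAVE_ASM_HWCAP_H) used in the patch.

```c
#if defined(__aarch64__) && defined(__linux__)
#include <asm/hwcap.h>  /* may lack the newer constants on old distros */
#include <sys/auxv.h>
#endif

/* Fallbacks for headers that predate these constants. */
#ifndef HWCAP_ASIMDDP
#define HWCAP_ASIMDDP (1 << 20)
#endif
#ifndef HWCAP2_I8MM
#define HWCAP2_I8MM   (1 << 13)
#endif

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM. */
#define SKETCH_FLAG_DOTPROD (1 << 0)
#define SKETCH_FLAG_I8MM    (1 << 1)

/* Map raw AT_HWCAP / AT_HWCAP2 words to feature flags. */
static int decode_hwcaps(unsigned long hwcap, unsigned long hwcap2)
{
    int flags = 0;

    if (hwcap & HWCAP_ASIMDDP)
        flags |= SKETCH_FLAG_DOTPROD;
    if (hwcap2 & HWCAP2_I8MM)
        flags |= SKETCH_FLAG_I8MM;
    return flags;
}

#if defined(__aarch64__) && defined(__linux__)
static int detect_flags_hwcap_only(void)
{
    unsigned long hwcap2 = 0;

#ifdef AT_HWCAP2
    /* A kernel that does not provide AT_HWCAP2 makes getauxval return 0,
     * which decodes to "no extensions". */
    hwcap2 = getauxval(AT_HWCAP2);
#endif
    return decode_hwcaps(getauxval(AT_HWCAP), hwcap2);
}
#endif
```

With the fallbacks in place, none of the per-feature #ifdefs from the v2
patch are needed inside the detection function itself.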

// Martin
