From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sun, 13 Nov 2022 19:32:33 -0300
From: James Almer
To: ffmpeg-devel@ffmpeg.org
References: <20221024200351.15126-1-jamrial@gmail.com>
In-Reply-To: <20221024200351.15126-1-jamrial@gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH] x86/intreadwrite: use intrinsics instead of inline asm for AV_ZERO128
List-Id: FFmpeg development discussions and patches
Content-Type: text/plain; charset="us-ascii"; Format="flowed"

On 10/24/2022 5:03 PM, James Almer wrote:
> When called inside a loop, the inline asm version results in one pxor
> unnecessarily emitted per iteration, as the contents of the __asm__() block are
> opaque to the compiler's instruction scheduler.
> This is not the case with intrinsics, where pxor will be emitted once with any
> half decent compiler.
>
> The code can be adapted to also work with MSVC, but for now, it will work with
> the same compilers previously supported (GCC, Clang, etc).
>
> Signed-off-by: James Almer
> ---
>  configure                    |  3 +++
>  libavutil/x86/intreadwrite.h | 15 +++++++--------
>  2 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/configure b/configure
> index c5a466657f..5bb83f5b5a 100755
> --- a/configure
> +++ b/configure
> @@ -2222,6 +2222,7 @@ HEADERS_LIST="
>
>  INTRINSICS_LIST="
>      intrinsics_neon
> +    intrinsics_sse2
>  "
>
>  COMPLEX_FUNCS="
> @@ -2636,6 +2637,7 @@ armv6t2_deps="arm"
>  armv8_deps="aarch64"
>  neon_deps_any="aarch64 arm"
>  intrinsics_neon_deps="neon"
> +intrinsics_sse2_deps="sse2"
>  vfp_deps_any="aarch64 arm"
>  vfpv3_deps="vfp"
>  setend_deps="arm"
> @@ -6207,6 +6209,7 @@ elif enabled loongarch; then
>  fi
>
>  check_cc intrinsics_neon arm_neon.h "int16x8_t test = vdupq_n_s16(0)"
> +check_cc intrinsics_sse2 emmintrin.h "__m128i test = _mm_setzero_si128()"
>
>  check_ldflags -Wl,--as-needed
>  check_ldflags -Wl,-z,noexecstack
> diff --git a/libavutil/x86/intreadwrite.h b/libavutil/x86/intreadwrite.h
> index 40f375b013..4a03e60fc6 100644
> --- a/libavutil/x86/intreadwrite.h
> +++ b/libavutil/x86/intreadwrite.h
> @@ -21,6 +21,9 @@
>  #ifndef AVUTIL_X86_INTREADWRITE_H
>  #define AVUTIL_X86_INTREADWRITE_H
>
> +#if HAVE_INTRINSICS_SSE2
> +#include <emmintrin.h>
> +#endif
>  #include <stdint.h>
>  #include "config.h"
>  #include "libavutil/attributes.h"
> @@ -79,20 +82,16 @@ static av_always_inline void AV_COPY128(void *d, const void *s)
>
>  #endif /* __SSE__ */
>
> -#ifdef __SSE2__
> +#if HAVE_INTRINSICS_SSE2 && defined(__SSE2__)
>
>  #define AV_ZERO128 AV_ZERO128
>  static av_always_inline void AV_ZERO128(void *d)
>  {
> -    struct v {uint64_t v[2];};
> -
> -    __asm__("pxor %%xmm0, %%xmm0 \n\t"
> -            "movdqa %%xmm0, %0   \n\t"
> -            : "=m"(*(struct v*)d)
> -            :: "xmm0");
> +    __m128i zero = _mm_setzero_si128();
> +    _mm_store_si128(d, zero);
>  }
>
> -#endif /* __SSE2__ */
> +#endif /* HAVE_INTRINSICS_SSE2 && defined(__SSE2__) */
>
>  #endif /* HAVE_MMX */
>

Will apply.
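
For illustration only, here is a minimal standalone sketch of the loop case the commit message describes. It is not part of the patch: zero128() and zero_blocks() are hypothetical names standing in for AV_ZERO128 and a caller, and the memset() fallback is an assumption added so the sketch also builds without SSE2. Because the store is written with intrinsics, the compiler can see that the zero vector is loop-invariant and may materialize it once before the loop. Whether it actually does depends on the compiler and optimization level.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical stand-in for AV_ZERO128: clears 16 bytes at d.
 * On the SSE2 path d must be 16-byte aligned, as with movdqa. */
static inline void zero128(void *d)
{
#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();   /* visible to the optimizer */
    _mm_store_si128((__m128i *)d, zero);  /* aligned 128-bit store */
#else
    memset(d, 0, 16);                     /* assumed portable fallback */
#endif
}

/* Hypothetical caller: clears n consecutive 16-byte blocks.
 * This is the "called inside a loop" situation from the commit message. */
static void zero_blocks(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        zero128(buf + 16 * i);
}
```

Comparing the output of "gcc -O2 -S" (or clang) for zero_blocks() against a variant that uses the old __asm__() block is one way to check how many pxor instructions end up inside the loop body.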