From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from 5d8f51c41678 (code.ffmpeg.org [188.245.149.3]) by ffbox0-bg.ffmpeg.org (Postfix) with ESMTPS id 8E8BF68ABBC for ; Fri, 5 Sep 2025 18:40:43 +0300 (EEST)
MIME-Version: 1.0
To: ffmpeg-devel@ffmpeg.org
Message-ID: <175708684389.25.2130016717995355706@463a07221176>
Reply-To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] swscale: Disable avx2 hscale 8to15 on IceLake and below due to Intel Gather Data Sampling mitigation performance loss (PR #20446)
From: legrosbuffle via ffmpeg-devel
Cc: legrosbuffle
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

PR #20446 opened by legrosbuffle
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20446
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20446.patch

Intel provided a microcode update to mitigate the Gather Data Sampling
security vulnerability. The mitigation has a huge negative performance
impact on gather instructions. This means that hscale 8to15 avx2, which
uses gather extensively, is no longer faster than SSSE3 on impacted CPUs.
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/gather-data-sampling.html

Broadwell:
hscale_8_to_15__fs_4_dstW_512_c:      3379.5 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    615.7 ( 5.49x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   613.4 ( 5.51x)
hscale_8_to_15__fs_4_dstW_512_avx2:    495.7 ( 6.82x)

Skylake:
hscale_8_to_15__fs_4_dstW_512_c:      3411.4 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    591.0 ( 5.77x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   591.5 ( 5.77x)
hscale_8_to_15__fs_4_dstW_512_avx2:   1386.2 ( 2.46x)

Cascade Lake:
hscale_8_to_15__fs_4_dstW_512_c:      3231.3 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    517.9 ( 6.24x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   521.6 ( 6.19x)
hscale_8_to_15__fs_4_dstW_512_avx2:   1775.0 ( 1.82x)

Sapphire Rapids:
hscale_8_to_15__fs_4_dstW_512_c:      1840.0 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    287.9 ( 6.39x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   293.8 ( 6.26x)
hscale_8_to_15__fs_4_dstW_512_avx2:    219.2 ( 8.40x)

>>From 225382486832df97b74f84666aa5f895df503539 Mon Sep 17 00:00:00 2001
From: Alan Kelly
Date: Fri, 5 Sep 2025 15:17:25 +0000
Subject: [PATCH] swscale: Disable avx2 hscale 8to15 on IceLake and below due to Intel Gather Data Sampling mitigation performance loss

Intel provided a microcode update to mitigate the Gather Data Sampling
security vulnerability. The mitigation has a huge negative performance
impact on gather instructions. This means that hscale 8to15 avx2, which
uses gather extensively, is no longer faster than SSSE3 on impacted CPUs.
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/gather-data-sampling.html

Broadwell:
hscale_8_to_15__fs_4_dstW_512_c:      3379.5 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    615.7 ( 5.49x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   613.4 ( 5.51x)
hscale_8_to_15__fs_4_dstW_512_avx2:    495.7 ( 6.82x)

Skylake:
hscale_8_to_15__fs_4_dstW_512_c:      3411.4 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    591.0 ( 5.77x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   591.5 ( 5.77x)
hscale_8_to_15__fs_4_dstW_512_avx2:   1386.2 ( 2.46x)

Cascade Lake:
hscale_8_to_15__fs_4_dstW_512_c:      3231.3 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    517.9 ( 6.24x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   521.6 ( 6.19x)
hscale_8_to_15__fs_4_dstW_512_avx2:   1775.0 ( 1.82x)

Sapphire Rapids:
hscale_8_to_15__fs_4_dstW_512_c:      1840.0 ( 1.00x)
hscale_8_to_15__fs_4_dstW_512_sse2:    287.9 ( 6.39x)
hscale_8_to_15__fs_4_dstW_512_ssse3:   293.8 ( 6.26x)
hscale_8_to_15__fs_4_dstW_512_avx2:    219.2 ( 8.40x)
---
 libavutil/x86/cpu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libavutil/x86/cpu.c b/libavutil/x86/cpu.c
index d6cd4fab9c..1a592f3bf4 100644
--- a/libavutil/x86/cpu.c
+++ b/libavutil/x86/cpu.c
@@ -244,8 +244,9 @@ int ff_get_cpu_flags_x86(void)
             family == 6 && model < 23)
             rval |= AV_CPU_FLAG_SSSE3SLOW;
 
-        /* Haswell has slow gather */
-        if ((rval & AV_CPU_FLAG_AVX2) && family == 6 && model < 70)
+        /* Ice Lake and below have slow gather due to Gather Data Sampling
+         * mitigation. */
+        if ((rval & AV_CPU_FLAG_AVX2) && family == 6 && model < 143)
             rval |= AV_CPU_FLAG_SLOW_GATHER;
     }
 
-- 
2.49.1