From: Kacper Michajlow
Date: Fri, 18 Jul 2025 13:36:41 +0200
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH v2 1/2] avfilter/vf_blackdetect: add AVX2 SIMD version
In-Reply-To: <20250717104525.1290708-1-ffmpeg@haasn.xyz>

On Thu, 17 Jul 2025 at 12:45, Niklas Haas wrote:
>
> From: Niklas Haas
>
> Requested by a user. Even with autovectorization enabled, the compiler
> performs a quite poor job of optimizing this function, due to not being
> able to take advantage of the pmaxub + pcmpeqb trick for counting the number
> of pixels less than or equal-to a threshold.
>
> blackdetect8_c:     4625.0 ( 1.00x)
> blackdetect8_avx2:   155.1 (29.83x)
> blackdetect16_c:    2529.4 ( 1.00x)
> blackdetect16_avx2:  163.6 (15.46x)

I think we should try to have better standards for reporting performance
metrics. These numbers do not mean much without context. What compiler,
flags, and CPU were used? Sure, we can omit some information if we only
want to show the scaling, but if the result depends heavily on those
things, then we should at least try to be more specific.

Sorry for being pedantic about this, but I think it's important,
especially if we put those values in a commit message which will live
forever in the repository as a vague reference.

> Even with autovectorization enabled

You mention autovectorization being enabled, yet the reported numbers are
without it. To me, this description implies that the performance
comparison shown was made with autovectorization enabled.

When we compare apples to apples, with the C version also built for AVX2
and autovectorized, blackdetect8 gets a more expected 3.74x (gcc) / 2.38x
(clang), depending on the compiler. It's still a good improvement, no
reason to oversell it.
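For anyone reading along who has not seen the pmaxub + pcmpeqb trick from the quoted commit message, here is a minimal scalar sketch in C of the identity it relies on. This is not the code from the patch; max_u8 and count_le_threshold are hypothetical names made up for illustration, and the loop only models per byte what the two instructions do across a whole vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar stand-in for pmaxub: unsigned maximum of two bytes. */
static uint8_t max_u8(uint8_t a, uint8_t b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical helper, not FFmpeg's implementation.
 * Counts the bytes in src[0..n) that are <= threshold by testing
 * max(x, threshold) == threshold, which holds exactly when x <= threshold.
 */
static unsigned count_le_threshold(const uint8_t *src, size_t n,
                                   uint8_t threshold)
{
    unsigned count = 0;

    assert(src != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* pcmpeqb would produce 0xFF for equal bytes; here we add 0 or 1. */
        count += max_u8(src[i], threshold) == threshold;
    }
    return count;
}
```

The detour through max exists because SSE/AVX2 only offer a signed greater-than compare for bytes (pcmpgtb), so an unsigned "less than or equal" has to be built from an unsigned max followed by an equality compare. A compiler looking at the plain C comparison may not find this form on its own, which is presumably what the commit message is getting at.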
For reference, some metrics on my end:

clang 20.1.7

march=generic (default config):
blackdetect8_c:     1591.1 ( 1.00x)
blackdetect8_avx2:   225.1 ( 7.07x)
blackdetect16_c:     643.5 ( 1.00x)
blackdetect16_avx2:  220.6 ( 2.92x)

march=core-avx2:
blackdetect8_c:      526.0 ( 1.00x)
blackdetect8_avx2:   220.9 ( 2.38x)
blackdetect16_c:     318.8 ( 1.00x)
blackdetect16_avx2:  225.9 ( 1.41x)

gcc 14.2.0

-fno-tree-vectorize (default config):
blackdetect8_c:     5126.6 ( 1.00x)
blackdetect8_avx2:   198.0 (25.89x)
blackdetect16_c:    2151.9 ( 1.00x)
blackdetect16_avx2:  196.8 (10.93x)

march=generic -ftree-vectorize:
blackdetect8_c:     1354.4 ( 1.00x)
blackdetect8_avx2:   196.9 ( 6.88x)
blackdetect16_c:     644.2 ( 1.00x)
blackdetect16_avx2:  249.8 ( 2.58x)

march=core-avx2 -ftree-vectorize:
blackdetect8_c:      820.8 ( 1.00x)
blackdetect8_avx2:   219.2 ( 3.74x)
blackdetect16_c:     372.8 ( 1.00x)
blackdetect16_avx2:  201.4 ( 1.85x)

Again, sorry for being pedantic here, but the original numbers give the
wrong impression, especially to someone looking at this from the outside.

- Kacper