From: Alan Kelly via ffmpeg-devel
Date: Mon, 4 Aug 2025 13:49:20 +0000
Message-ID: <20250804135035.465073-1-alankelly@google.com>
To: ffmpeg-devel@ffmpeg.org
Cc: Alan Kelly
Subject: [FFmpeg-devel] [PATCH] swscale: Break loop-carried dependency enabling parallel out of order execution of the gathers.

The gather is unmasked (its mask register is all ones), but vpgatherdd
merges into its destination register, for example ymm4, so it depends on
the value that register held in the previous loop iteration. The
out-of-order scheduler does not know statically that the instruction is
fully unmasked, which prevents parallel out-of-order execution of the
gathers. Zeroing the destination with pxor before each gather breaks this
loop-carried dependency.
---
 libswscale/x86/scale_avx2.asm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libswscale/x86/scale_avx2.asm b/libswscale/x86/scale_avx2.asm
index b4b852d60b..90ee8b0a0e 100644
--- a/libswscale/x86/scale_avx2.asm
+++ b/libswscale/x86/scale_avx2.asm
@@ -68,8 +68,10 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize,
 .innerloop:
 %endif
     vpcmpeqd m13, m13
+    pxor m3, m3 ; break loop-carried dependency
     vpgatherdd m3,[srcmemq + m1], m13
     vpcmpeqd m13, m13
+    pxor m4, m4 ; break loop-carried dependency
     vpgatherdd m4,[srcmemq + m2], m13
     vpunpcklbw m5, m3, m0
     vpunpckhbw m6, m3, m0
@@ -119,6 +121,7 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize,
 .tail_innerloop:
 %endif
     vpcmpeqd xm13, xm13
+    pxor m3, m3 ; break loop-carried dependency
     vpgatherdd xm3,[srcmemq + xm1], xm13
     vpunpcklbw xm5, xm3, xm0
     vpunpckhbw xm6, xm3, xm0
-- 
2.50.1.565.gc32cd1483b-goog
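
For readers who want to try the idea outside the assembly, the pattern in the patch can be sketched with C intrinsics. This is only an illustration and is not libswscale code: the function names (gather_words, gather_words_avx2, gather_words_scalar) are hypothetical, and it assumes GCC or Clang on x86-64. Each iteration builds an all-ones mask (as vpcmpeqd does) and passes a freshly zeroed source register to the masked gather (as pxor does), so the gather's merge input never comes from the previous iteration. The compiler remains free to pick the actual instructions, so the generated code may differ from the hand-written loop.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: out[i] is the 32-bit word at byte offset pos[i] in src. */
static void gather_words_scalar(const uint8_t *src, const int32_t *pos,
                                int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        memcpy(&out[i], src + pos[i], sizeof(out[i]));
}

/* AVX2 version (hypothetical helper): eight gathers per iteration. */
__attribute__((target("avx2")))
static void gather_words_avx2(const uint8_t *src, const int32_t *pos,
                              int32_t *out, size_t n)
{
    assert(n % 8 == 0);
    for (size_t i = 0; i < n; i += 8) {
        __m256i idx  = _mm256_loadu_si256((const __m256i *)(pos + i));
        /* All lanes enabled, the role vpcmpeqd m13, m13 plays in the patch. */
        __m256i mask = _mm256_set1_epi32(-1);
        /* Fresh zero as merge source, the role of the added pxor: the
         * gather does not consume a value from the previous iteration. */
        __m256i zero = _mm256_setzero_si256();
        __m256i v = _mm256_mask_i32gather_epi32(zero, (const int *)src,
                                                idx, mask, 1);
        _mm256_storeu_si256((__m256i *)(out + i), v);
    }
}

/* Dispatch at run time so the sketch also works on CPUs without AVX2. */
static void gather_words(const uint8_t *src, const int32_t *pos,
                         int32_t *out, size_t n)
{
    if (__builtin_cpu_supports("avx2"))
        gather_words_avx2(src, pos, out, n);
    else
        gather_words_scalar(src, pos, out, n);
}
```

The source buffer must extend at least three bytes past the largest offset in pos, since every lane reads a full 32-bit word.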