From: Alan Kelly via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
To: ffmpeg-devel@ffmpeg.org
Cc: Alan Kelly <alankelly@google.com>
Subject: [FFmpeg-devel] [PATCH] swscale: Break loop-carried dependency enabling parallel out of order execution of the gathers.
Date: Mon, 4 Aug 2025 13:49:20 +0000
Message-ID: <20250804135035.465073-1-alankelly@google.com>

The gather is unmasked but the instruction does a merge into ymm4, which
depends on the value of ymm4 from the previous loop iteration. The
out-of-order scheduler does not know statically that the instruction is
fully unmasked, preventing parallel out-of-order execution of the gathers.
---
 libswscale/x86/scale_avx2.asm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libswscale/x86/scale_avx2.asm b/libswscale/x86/scale_avx2.asm
index b4b852d60b..90ee8b0a0e 100644
--- a/libswscale/x86/scale_avx2.asm
+++ b/libswscale/x86/scale_avx2.asm
@@ -68,8 +68,10 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize,
 .innerloop:
 %endif
     vpcmpeqd m13, m13
+    pxor m3, m3 ; break loop-carried dependency
     vpgatherdd m3,[srcmemq + m1], m13
     vpcmpeqd m13, m13
+    pxor m4, m4 ; break loop-carried dependency
     vpgatherdd m4,[srcmemq + m2], m13
     vpunpcklbw m5, m3, m0
     vpunpckhbw m6, m3, m0
@@ -119,6 +121,7 @@ cglobal hscale8to15_%1, 7, 9, 16, pos0, dst, w, srcmem, filter, fltpos, fltsize,
 .tail_innerloop:
 %endif
     vpcmpeqd xm13, xm13
+    pxor m3, m3 ; break loop-carried dependency
     vpgatherdd xm3,[srcmemq + xm1], xm13
     vpunpcklbw xm5, xm3, xm0
     vpunpckhbw xm6, xm3, xm0
-- 
2.50.1.565.gc32cd1483b-goog

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
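
The merge behaviour described in the commit message can be illustrated with a scalar model. This is only a sketch and not FFmpeg code: gather_model, set_full_mask and the two check functions are hypothetical names, and the model assumes eight 32-bit lanes with byte offsets (scale 1), as in the patched loop. It shows that the destination is an input to the gather (lanes whose mask sign bit is clear keep their old value), that an all-ones mask makes the old destination irrelevant to the result, and that the mask is cleared afterwards, which is why the loop re-materialises it with vpcmpeqd before each gather.

```c
#include <stdint.h>
#include <string.h>

#define LANES 8 /* eight 32-bit lanes, as in one ymm register */

/* Hypothetical scalar model of one vpgatherdd: dst is read AND written. */
static void gather_model(int32_t dst[LANES], const uint8_t *base,
                         const int32_t idx[LANES], uint32_t mask[LANES])
{
    for (int i = 0; i < LANES; i++) {
        if (mask[i] & 0x80000000u)  /* only the sign bit of the mask lane selects */
            memcpy(&dst[i], base + idx[i], sizeof dst[i]);
        /* otherwise dst[i] keeps whatever it held before: the merge */
        mask[i] = 0;                /* the mask is zeroed as lanes complete */
    }
}

static void set_full_mask(uint32_t mask[LANES])
{
    for (int i = 0; i < LANES; i++)
        mask[i] = 0xFFFFFFFFu;      /* models vpcmpeqd m13, m13 */
}

/* Returns 1 if a fully unmasked gather gives the same result whatever
 * dst held beforehand, i.e. zeroing dst first cannot change the output. */
static int full_mask_ignores_old_dst(void)
{
    uint8_t  src[64];
    int32_t  idx[LANES], a[LANES], b[LANES];
    uint32_t mask[LANES];

    for (int i = 0; i < 64; i++)
        src[i] = (uint8_t)(i * 7 + 1);
    for (int i = 0; i < LANES; i++) {
        idx[i] = i * 4;             /* byte offsets, scale 1 */
        a[i]   = 0;                 /* dst zeroed first, as the patch does */
        b[i]   = 0x12345678;        /* dst left over from a previous iteration */
    }
    set_full_mask(mask);
    gather_model(a, src, idx, mask);
    set_full_mask(mask);
    gather_model(b, src, idx, mask);
    return memcmp(a, b, sizeof a) == 0;
}

/* Returns 1 if lanes with a clear mask bit keep their old dst value,
 * which is the reason the destination counts as an input operand. */
static int partial_mask_keeps_old_dst(void)
{
    uint8_t  src[64] = {0};
    int32_t  idx[LANES] = {0}, dst[LANES];
    uint32_t mask[LANES] = {0};

    for (int i = 0; i < LANES; i++)
        dst[i] = 0x55555555;
    mask[0] = 0x80000000u;          /* gather lane 0 only */
    gather_model(dst, src, idx, mask);
    if (dst[0] != 0)
        return 0;
    for (int i = 1; i < LANES; i++)
        if (dst[i] != 0x55555555)
            return 0;
    return 1;
}
```

In the model, as in the patch, zeroing the destination before a fully unmasked gather is a no-op for the result; its only purpose is to give the register a fresh value that does not come from the previous iteration.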