From: Evgeny Pavlov
To: ffmpeg-devel@ffmpeg.org
Cc: Evgeny Pavlov
Date: Mon, 31 Jul 2023 13:26:16 +0200
Message-ID: <20230731112703.13730-2-lucenticus@gmail.com>
Subject: [FFmpeg-devel] [PATCH] avfilter/vf_ssim: Fix x86 assembly code for SSIM calculation

This commit fixes bug #10495.

The post-loop compensation code had several bugs:
- The test instruction performs a bitwise AND and sets the flags used by
  the jz branch instruction, so the wrong test conditions led to
  incorrect branching.
- The compensation code was incorrect for some branches.

Signed-off-by: Evgeny Pavlov
---
 libavfilter/x86/vf_ssim.asm | 25 +++++++++++-------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/libavfilter/x86/vf_ssim.asm b/libavfilter/x86/vf_ssim.asm
index 78809305de..e3e0c8104b 100644
--- a/libavfilter/x86/vf_ssim.asm
+++ b/libavfilter/x86/vf_ssim.asm
@@ -228,25 +228,22 @@ cglobal ssim_end_line, 3, 3, 7, sum0, sum1, w
 
     ; subpd the ones we added too much
     test              wd, wd
-    jz .end
+    jz                .end
     add               wd, 4
-    test              wd, 3
-    jz .skip3
-    test              wd, 2
-    jz .skip2
-    test              wd, 1
-    jz .skip1
-.skip3:
+    cmp               wd, 1
+    jz                .skip3
+    cmp               wd, 2
+    jz                .skip2
+.skip1: ; 3 valid => skip 1 invalid
     psrldq            m5, 8
     subpd             m6, m5
-    jmp .end
-.skip2:
-    psrldq            m5, 8
+    jmp               .end
+.skip2: ; 2 valid => skip 2 invalid
     subpd             m6, m5
+    jmp               .end
+.skip3: ; 1 valid => skip 3 invalid
+    psrldq            m3, 8
     subpd             m0, m3
-    jmp .end
-.skip1:
-    psrldq            m3, 16
     subpd             m6, m5
 
 .end:
-- 
2.41.0
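
To illustrate the branching problem described in the commit message, here is a
rough C model of the old and new label dispatch. It is only a sketch, not part
of the patch: the function and enum names are made up for illustration, and it
assumes that wd lies in the range 1..3 after "add wd, 4", as the "N valid"
comments on the new labels suggest. It models only which label is reached, not
the psrldq/subpd arithmetic.

```c
/* Hypothetical names; each value is the number of invalid results that the
 * corresponding label in the patch (.skip1, .skip2, .skip3) compensates for. */
enum skip_target { SKIP1 = 1, SKIP2 = 2, SKIP3 = 3 };

/* Old dispatch. "test a, b" computes a & b and sets ZF when the result is 0;
 * "jz" branches when ZF is set. */
static enum skip_target old_dispatch(int w)
{
    if ((w & 3) == 0) return SKIP3;  /* test wd, 3 ; jz .skip3 */
    if ((w & 2) == 0) return SKIP2;  /* test wd, 2 ; jz .skip2 */
    if ((w & 1) == 0) return SKIP1;  /* test wd, 1 ; jz .skip1 */
    return SKIP3;                    /* no branch taken: falls into .skip3 */
}

/* New dispatch. "cmp a, b" sets ZF when a == b, so jz acts as "jump if equal". */
static enum skip_target new_dispatch(int w)
{
    if (w == 1) return SKIP3;        /* cmp wd, 1 ; jz .skip3 */
    if (w == 2) return SKIP2;        /* cmp wd, 2 ; jz .skip2 */
    return SKIP1;                    /* falls into .skip1 */
}
```

Under that assumption, the old sequence reaches .skip2, .skip1 and .skip3 for
w = 1, 2 and 3, while the new sequence reaches .skip3, .skip2 and .skip1, which
matches the "1 valid => skip 3 invalid" style comments in the patch.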