From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20240225082755.355295-1-jdek@itanimul.li>
From: "Ronald S. Bultje"
Date: Sun, 25 Feb 2024 18:00:10 -0500
To: FFmpeg development discussions and patches
Cc: Henrik Gramner
Reply-To: FFmpeg development discussions and patches
List-Id: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] avcodec/x86/hevc: fix luma 12b overflow
Content-Type: multipart/mixed; boundary="000000000000f8fa7806123cc1ee"

--000000000000f8fa7806123cc1ee
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi,

On Sun, Feb 25, 2024 at 5:30 PM Henrik Gramner via ffmpeg-devel
<ffmpeg-devel@ffmpeg.org> wrote:

> On Sun, Feb 25, 2024 at 5:42 PM Ronald S. Bultje
> wrote:
> > + mova m13, [pw_8]
> > + paddw m10, m12, m12
> > + paddw m12, m10 ; 9 * (q0 - p0) - 3 * ( q1 - p1 )
> > paddw m12, m13; + 8
>
> Memory operand
>
> > + paddw m10, m13, m13
> > + paddw m13, m10 ; abs(9 * (q0 - p0) - 3 * ( q1 - p1 ))
> > + paddw m13, [pw_8]
> [...]
> > + paddw m13, m12, m12
> > + paddw m13, m12 ; 3*abs(m12)
> > + paddw m13, [pw_8]
>
> Another minor improvement would be to reorder the adds like (x + x) +
> (x + 8) instead of ((x + x) + x) + 8 to allow for more
> instruction-level parallelism.
>

New version attached.

Ronald

--000000000000f8fa7806123cc1ee
Content-Type: application/octet-stream; name="0001-hevc-x86-deblock-fix-12bit-overflow.patch"
Content-Disposition: attachment; filename="0001-hevc-x86-deblock-fix-12bit-overflow.patch"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_lt244fed0

RnJvbSA1ZDUxY2MyNjQ3YmY1ZTdkNDJiNmVhMTM4NDUyOWIwMDRkNTYzOWQxIE1vbiBTZXAgMTcg
MDA6MDA6MDAgMjAwMQpGcm9tOiAiUm9uYWxkIFMuIEJ1bHRqZSIgPHJzYnVsdGplQGdtYWlsLmNv
bT4KRGF0ZTogU3VuLCAyNSBGZWIgMjAyNCAxMDo0OTozNSAtMDUwMApTdWJqZWN0OiBbUEFUQ0hd
IGhldmMveDg2L2RlYmxvY2s6IGZpeCAxMmJpdCBvdmVyZmxvdy4KCi0tLQogbGliYXZjb2RlYy94
ODYvaGV2Y19kZWJsb2NrLmFzbSB8IDQwICsrKysrKysrKysrKysrKysrKysrKysrKystLS0tLS0t
LQogMSBmaWxlIGNoYW5nZWQsIDMxIGluc2VydGlvbnMoKyksIDkgZGVsZXRpb25zKC0pCgpkaWZm
IC0tZ2l0IGEvbGliYXZjb2RlYy94ODYvaGV2Y19kZWJsb2NrLmFzbSBiL2xpYmF2Y29kZWMveDg2
L2hldmNfZGVibG9jay5hc20KaW5kZXggODVlZTQ4MDBiYi4uNjFiNzlmODA3OSAxMDA2NDQKLS0t
IGEvbGliYXZjb2RlYy94ODYvaGV2Y19kZWJsb2NrLmFzbQorKysgYi9saWJhdmNvZGVjL3g4Ni9o
ZXZjX2RlYmxvY2suYXNtCkBAIC01NDEsMTkgKzU0MSw0MSBAQCBBTElHTiAxNgogICAgIGFkZCAg
ICAgICAgICAgICBiZXRhcSwgcjEzCiAgICAgc2hyICAgICAgICAgICAgIGJldGFxLCAzOyAoKGJl
dGEgKyAoYmV0YSA+PiAxKSkgPj4gMykpCiAKLSAgICBtb3ZhICAgICAgICAgICAgbTEzLCBbcHdf
OF0KICAgICBwc3VidyAgICAgICAgICAgbTEyLCBtNCwgbTMgOyBxMCAtIHAwCi0gICAgcHNsbHcg
ICAgICAgICAgIG0xMCwgbTEyLCAzOyA4ICogKHEwIC0gcDApCi0gICAgcGFkZHcgICAgICAgICAg
IG0xMiwgbTEwIDsgOSAqIChxMCAtIHAwKQotCisgICAgcGFkZHcgICAgICAgICAgIG0xMCwgbTEy
LCBtMTIKKyAgICBwYWRkdyAgICAgICAgICAgbTEyLCBtMTAgOyAzICogKHEwIC0gcDApCiAgICAg
cHN1YncgICAgICAgICAgIG0xMCwgbTUsIG0yIDsgcTEgLSBwMQotICAgIHBzbGx3ICAgICAgICAg
ICAgbTgsIG0xMCwgMTsgMiAqICggcTEgLSBwMSApCi0gICAgcGFkZHcgICAgICAgICAgIG0xMCwg
bTg7IDMgKiAoIHExIC0gcDEgKQotICAgIHBzdWJ3ICAgICAgICAgICBtMTIsIG0xMDsgOSAqIChx
MCAtIHAwKSAtIDMgKiAoIHExIC0gcDEgKQotICAgIHBhZGR3ICAgICAgICAgICBtMTIsIG0xMzsg
KyA4CisgICAgcHN1YncgICAgICAgICAgIG0xMiwgbTEwIDsgMyAqIChxMCAtIHAwKSAtIChxMSAt
IHAxKQorJWlmICUxIDwgMTIKKyAgICBwYWRkdyAgICAgICAgICAgbTEwLCBtMTIsIG0xMgorICAg
IHBhZGR3ICAgICAgICAgICBtMTIsIFtwd184XTsgKyA4CisgICAgcGFkZHcgICAgICAgICAgIG0x
MiwgbTEwIDsgOSAqIChxMCAtIHAwKSAtIDMgKiAoIHExIC0gcDEgKQogICAgIHBzcmF3ICAgICAg
ICAgICBtMTIsIDQ7ID4+IDQgLCBkZWx0YTAKICAgICBQQUJTVyAgICAgICAgICAgbTEzLCBtMTI7
IGFicyhkZWx0YTApCi0KKyVlbGlmIGNwdWZsYWcoc3NzZTMpCisgICAgcGFic3cgICAgICAgICAg
IG0xMywgbTEyCisgICAgcGFkZHcgICAgICAgICAgIG0xMCwgbTEzLCBtMTMKKyAgICBwYWRkdyAg
ICAgICAgICAgbTEzLCBbcHdfOF0KKyAgICBwYWRkdyAgICAgICAgICAgbTEzLCBtMTAgOyBhYnMo
OSAqIChxMCAtIHAwKSAtIDMgKiAoIHExIC0gcDEgKSkKKyAgICBweG9yICAgICAgICAgICAgbTEw
LCBtMTAKKyAgICBwY21wZ3R3ICAgICAgICAgbTEwLCBtMTIKKyAgICBwYWRkdyAgICAgICAgICAg
bTEzLCBtMTAKKyAgICBwc3JsdyAgICAgICAgICAgbTEzLCA0OyA+PiA0LCBhYnMoZGVsdGEwKQor
ICAgIHBzaWdudyAgICAgICAgICBtMTAsIG0xMywgbTEyCisgICAgU1dBUCAgICAgICAgICAgICAx
MCwgMTIKKyVlbHNlCisgICAgcHhvciAgICAgICAgICAgIG0xMCwgbTEwCisgICAgcGNtcGd0dyAg
ICAgICAgIG0xMCwgbTEyCisgICAgcHhvciAgICAgICAgICAgIG0xMiwgbTEwCisgICAgcHN1Yncg
ICAgICAgICAgIG0xMiwgbTEwIDsgYWJzKCkKKyAgICBwYWRkdyAgICAgICAgICAgbTEzLCBtMTIs
IG0xMgorICAgIHBhZGR3ICAgICAgICAgICBtMTIsIFtwd184XQorICAgIHBhZGR3ICAgICAgICAg
ICBtMTMsIG0xMiA7IDMqYWJzKG0xMikKKyAgICBwYWRkdyAgICAgICAgICAgbTEzLCBtMTAKKyAg
ICBwc3JsdyAgICAgICAgICAgbTEzLCA0CisgICAgcHhvciAgICAgICAgICAgIG0xMiwgbTEzLCBt
MTAKKyAgICBwc3VidyAgICAgICAgICAgbTEyLCBtMTAKKyVlbmRpZgogCiAgICAgcHNsbHcgICAg
ICAgICAgIG0xMCwgbTksIDI7IDggKiB0YwogICAgIHBhZGR3ICAgICAgICAgICBtMTAsIG05OyAx
MCAqIHRjCi0tIAoyLjQzLjEKCg==
--000000000000f8fa7806123cc1ee
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

--000000000000f8fa7806123cc1ee--
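
The arithmetic discussed in this thread can be checked with a small model of
the 16-bit lanes. The sketch below is an illustration only, not code from the
patch: the function names (wrap16, delta0_reference, delta0_naive16,
delta0_split16) are invented for this example, samples are assumed to be
12-bit values in [0, 4095], and paddw/psubw are modelled as wraparound on
signed 16-bit integers. It shows why forming 9 * (q0 - p0) directly can
overflow at 12 bits, and that scaling the absolute value with the adds grouped
as (x + x) + (x + 8), then restoring the sign, gives the exact result.

```python
def wrap16(v):
    """Reduce v to a signed 16-bit value, modelling one paddw/psubw lane."""
    v &= 0xFFFF
    return v - 0x10000 if v >= 0x8000 else v


def delta0_reference(q0, p0, q1, p1):
    """Exact delta0 using unbounded integers; >> floors, like psraw."""
    return (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4


def delta0_naive16(q0, p0, q1, p1):
    """Form 9*(q0 - p0) and 3*(q1 - p1) in 16-bit lanes before subtracting."""
    nine_a = wrap16(9 * wrap16(q0 - p0))    # up to 36855 at 12 bits: wraps
    three_b = wrap16(3 * wrap16(q1 - p1))
    return wrap16(wrap16(nine_a - three_b) + 8) >> 4


def delta0_split16(q0, p0, q1, p1):
    """Compute d = 3*(q0 - p0) - (q1 - p1) first, then scale abs(d) by 3."""
    d = wrap16(3 * wrap16(q0 - p0) - wrap16(q1 - p1))  # |d| <= 16380, fits
    mask = -1 if d < 0 else 0               # all-ones lane where d is negative
    x = abs(d)
    # Grouping suggested in the review: (x + x) and (x + 8) do not depend on
    # each other, so they can be computed in parallel before the final add.
    t = ((x + x) + (x + 8) + mask) & 0xFFFF  # at most 49148: fits unsigned
    t >>= 4                                  # logical shift, like psrlw
    return -t if d < 0 else t                # restore the sign of d


if __name__ == "__main__":
    print(delta0_reference(4095, 0, 0, 0))   # 2303
    print(delta0_naive16(4095, 0, 0, 0))     # -1793 (signed overflow)
    print(delta0_split16(4095, 0, 0, 0))     # 2303
```

Adding the all-ones mask (that is, subtracting 1) for negative d is what makes
the unsigned shift of the magnitude agree with the flooring arithmetic shift
of the signed value: for n = 3 * abs(d), (-n + 8) >> 4 equals -((n + 7) >> 4).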