From: Michael Niedermayer
Date: Thu, 18 Jul 2024 19:40:04 +0200
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 13/39] lavc/ffv1: drop redundant PlaneContext.quant_table
Message-ID: <20240718174004.GE4991@pb2>
In-Reply-To: <172129080994.21847.15080640617406361149@lain.khirnov.net>
References: <20240716171155.31838-1-anton@khirnov.net> <20240716171155.31838-13-anton@khirnov.net> <20240717223238.GW4991@pb2> <172129080994.21847.15080640617406361149@lain.khirnov.net>

On Thu, Jul 18, 2024 at 10:20:09AM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2024-07-18 00:32:38)
> > the data for each decoder task should be together and not scattered around
> > more than needed, reducing cache efficiency
> >
> > putting all this extra code in the inner per pixel loop is not ok
> > especially not for the sake of avoiding a memcpy of a few hundred bytes multiple levels of loops outside
>
> A nice theory, but in practice this patchset makes single-threaded
> decoding about 4% faster overall, on a 1920x1080 10bit sample. That's
> just the ffv1 parts (up to patch 28), full set also improves frame
> threading performance as follows:
> threads         improvement
> ---------------------------
> 2               52% (yes really)
> 4               16%
> 8               12%

I do want the speed improvements, yes. But you are comparing frame
threading, when slice threading performed much better than frame
threading prior to the patchset.

Also, I'd like to see the individual changes that look like they should
make the code slower tested individually.
If they make the code slower, they should be dropped.

Also, the code has a bug; the benchmarks may theoretically change once
it is fixed.

using matrixbench
-i matrixbench_mpeg2.mpg -an -vcodec ffv1 -slices 4 -t 100 -coder 1 -context 1 -bitexact /tmp/ffv1.2-11.avi

prior patchset:
time ./ffmpeg -thread_type slice -threads 1 -i /tmp/ffv1.2-11.avi -f null -
real    0m31.976s
user    0m32.001s
sys     0m0.080s

time ./ffmpeg -thread_type slice -threads 2 -i /tmp/ffv1.2-11.avi -f null -
real    0m18.086s
user    0m34.199s
sys     0m0.089s

time ./ffmpeg -threads 2 -i /tmp/ffv1.2-11.avi -f null -
real    0m33.578s
user    0m33.611s
sys     0m0.052s

time ./ffmpeg -thread_type slice -threads 4 -i /tmp/ffv1.2-11.avi -f null -
real    0m9.189s
user    0m33.608s
sys     0m0.073s

time ./ffmpeg -threads 4 -i /tmp/ffv1.2-11.avi -f null -
real    0m11.159s
user    0m32.712s
sys     0m0.124s

post patchset:
time ./ffmpeg -thread_type slice -threads 1 -i /tmp/ffv1.2-11.avi -f null -
real    0m31.755s
user    0m31.758s
sys     0m0.096s

time ./ffmpeg -thread_type slice -threads 2 -i /tmp/ffv1.2-11.avi -f null -
real    0m17.481s
user    0m33.385s
sys     0m0.076s

time ./ffmpeg -threads 2 -i /tmp/ffv1.2-11.avi -f null -
real    0m16.893s
user    0m33.465s
sys     0m0.113s

time ./ffmpeg -thread_type slice -threads 4 -i /tmp/ffv1.2-11.avi -f null -
real    0m9.180s
user    0m33.500s
sys     0m0.088s

time ./ffmpeg -threads 4 -i /tmp/ffv1.2-11.avi -f null -
real    0m8.811s
user    0m33.338s
sys     0m0.061s

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision of what is before
them, glory and danger alike, and yet notwithstanding go out to meet it.
-- Thucydides
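
To put the timings quoted in the message on a common scale, here is a minimal sketch that turns the reported "real" (wall-clock) values into a percentage of time saved per configuration. The numbers are copied from the message. The dictionary layout, the function names, and the label "frame" for the runs made without -thread_type are assumptions of this sketch, not part of the original benchmark.

```python
# Wall-clock ("real") seconds copied from the timings in the message.
# Keys are (thread_type, threads). "frame" is this sketch's label for the
# runs that passed no -thread_type option (an assumption, see lead-in).
before = {
    ("slice", 1): 31.976,
    ("slice", 2): 18.086,
    ("frame", 2): 33.578,
    ("slice", 4): 9.189,
    ("frame", 4): 11.159,
}
after = {
    ("slice", 1): 31.755,
    ("slice", 2): 17.481,
    ("frame", 2): 16.893,
    ("slice", 4): 9.180,
    ("frame", 4): 8.811,
}


def time_saved_percent(old, new):
    """Percentage reduction in wall-clock time going from old to new."""
    return 100.0 * (old - new) / old


def summarize(before, after):
    """Map each configuration to its rounded percentage of time saved."""
    return {
        key: round(time_saved_percent(before[key], after[key]), 1)
        for key in before
    }


if __name__ == "__main__":
    for (kind, threads), pct in sorted(summarize(before, after).items()):
        print(f"{kind:5s} threads={threads}: {pct:5.1f}% less wall-clock time")
```

Read this way, the large gains are confined to the runs without -thread_type, while the slice-threaded runs change only slightly. These are single runs, so small differences may be within measurement noise.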