From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <ffmpeg-devel-bounces@ffmpeg.org> Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTPS id 004A34CBB9 for <ffmpegdev@gitmailbox.com>; Fri, 11 Apr 2025 20:51:00 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2CF9768C40A; Fri, 11 Apr 2025 23:50:56 +0300 (EEST) Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net [217.70.183.197]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id A32A268C37D for <ffmpeg-devel@ffmpeg.org>; Fri, 11 Apr 2025 23:50:49 +0300 (EEST) Received: by mail.gandi.net (Postfix) with ESMTPSA id 925CC43A22 for <ffmpeg-devel@ffmpeg.org>; Fri, 11 Apr 2025 20:50:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=niedermayer.cc; s=gm1; t=1744404648; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pjuMR0Y8H/5NvMDqVSnJxTCNBG7/ahnOLwYXVcx6sS0=; b=UntrMC8q1R6Fq5uR/OEkdXBgOKa5V/YjuEKW20KtEcunRMVd6/oBzM7lBz6i4ObGOhrEqF dlutTcques7mWY8Ezi7BF4e4RzSjb5UhGVnhP2eWFnEsNhU+dYajwGyIt7kjQkOfKUOaLV S9zyJ7mBxfv+/je/m6j7TYaedvDm5zfzLNvSWPjkTAkgbXJyA/w9fB0EizAbreGapYWXug CiiIyqXk3pmETDFC71hW9l/eQw7mB/ReJNKJHth2Hch0/o+R3oh9Cw414kIMLFaEr9fC/m swHwAaZSk2mr8+8HIcUA3sMQdmwKJNc7TaibnzdkiLwK5/eMCsFqzwdecd5PCA== Date: Fri, 11 Apr 2025 22:50:47 +0200 From: Michael Niedermayer <michael@niedermayer.cc> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Message-ID: <20250411205047.GD4991@pb2> References: <20250408101959.GP4991@pb2> MIME-Version: 1.0 In-Reply-To: <20250408101959.GP4991@pb2> X-GND-State: clean X-GND-Score: -85 X-GND-Cause: gggruggvucftvghtrhhoucdtuddrgeefvddrtddtgddvuddvkeduucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuifetpfffkfdpucggtfgfnhhsuhgsshgtrhhisggvnecuuegrihhlohhuthemuceftddunecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenfghrlhcuvffnffculdduhedmnecujfgurhepfffhvffukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefoihgthhgrvghlucfpihgvuggvrhhmrgihvghruceomhhitghhrggvlhesnhhivgguvghrmhgrhigvrhdrtggtqeenucggtffrrghtthgvrhhnpeffledtfeevfeffheeuuefhtdejieelueeftdeitdfgheetgefffeefteekffdthfenucffohhmrghinhepfhhfmhhpvghgrdhorhhgnecukfhppeeguddrieeirdeijedruddufeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepihhnvghtpeeguddrieeirdeijedruddufedphhgvlhhopehlohgtrghlhhhoshhtpdhmrghilhhfrhhomhepmhhitghhrggvlhesnhhivgguvghrmhgrhigvrhdrtggtpdhnsggprhgtphhtthhopedupdhrtghpthhtohepfhhfmhhpvghgqdguvghvvghlsehffhhmphgvghdrohhrgh X-GND-Sasl: michael@niedermayer.cc Subject: Re: [FFmpeg-devel] [RFC] AVDictionary2 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org> List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>, <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe> List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel> List-Post: <mailto:ffmpeg-devel@ffmpeg.org> List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help> List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>, <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe> Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Content-Type: multipart/mixed; boundary="===============4991605088193853599==" Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org> Archived-At: <https://master.gitmailbox.com/ffmpegdev/20250411205047.GD4991@pb2/> List-Archive: <https://master.gitmailbox.com/ffmpegdev/> List-Post: <mailto:ffmpegdev@gitmailbox.com> --===============4991605088193853599== Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="D5cEKevPIgUj4x1u" Content-Disposition: inline --D5cEKevPIgUj4x1u Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Apr 08, 2025 at 12:19:59PM +0200, Michael Niedermayer wrote: > Hi all >=20 > As i have too many things to do already i did the most logic thing and > started thinking about a new and unrelated idea. >=20 > This is a list of problems and ideas, that everyone is welcome to add to = and > comment on. >=20 > AVDictionary is just bad. >=20 > * its complicated internally with > unneeded alternative (AV_DICT_DONT_STRDUP_VAL/KEY) these are rarely used > and probably not relevant for performance. >=20 > * all basic operations are as slow as possible. > you want to find, update or remove an entry, search through all entries >=20 > * its heavy on memory allocations > 1 malloc for key, 1 malloc for value, 1 realloc on the AVDictionaryEntr= y array > that makes 2+ malloc() for every "foo"=3D"bar" >=20 > Ideas: > 1. put the node struct (AVDictionaryEntry), the key and value in the same > allocated block, 1 malloc() instead of 2. > We can simply concatenate the key and value string, we could even use = the > 0 terminator instead of the 2nd pointer. Either way the whole > can go to the end of the Node structure for a tree > 1b. Now if we did put the key and value together, we can order in the tree > by this combined entity. Why ? because now we have a unique ordering > and also the key+value could be required to be always unique. Simplify= ing > things from what we have now and making it more replicatable, no > more changes in output because order changed > 2. We have a simple AVL tree implementation which we could use to make > all operations O(log n) instead of O(n) > 3. We could go with hash tables, splay trees, critbit trees or something > else. hash tables have issues with malicious/odd input which would > require more complexity to workaround. >=20 > Of course we could also go a step further and eliminate the malloc per > node and put it all in a linear array. > As in, insert -> append at the end, > realloc with every power of 2 size increase > complete rebuild once enough elements are removed > not sure this isnt overkill for a metadata string dictionary >=20 > I probably wont have time to implement this in the near future but as i > was thinking about this, it seemed to make sense to write this down and > post here >=20 > git grep av_dict | wc is 1436 >=20 > So its used a bit, justifying looking at improving it >=20 >=20 > git grep AV_DICT_DONT_STRDUP | wc is 87 > git grep AV_DICT_DONT_STRDUP libavutil/ tests doc | wc is 20 >=20 > Seems not too common and one malloc/copy of a string once per metadata en= try > which is once per file generally, seems a strange optimization to me If anyone wants a testcase that shows dictionary speed issues, try to do anything with this file: https://samples.ffmpeg.org/nut/meta.nut its a single image 256x256 jpeg with a lot of metadata thx [...] --=20 Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Its not that you shouldnt use gotos but rather that you should write readable code and code with gotos often but not always is less readable --D5cEKevPIgUj4x1u Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iF0EABEKAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCZ/mAowAKCRBhHseHBAsP q9CzAJ9oOQcaaplYQCXY5mWqe+LnEoqrjACglKrZaPgj45SmlN2Y/8JdIHsC7Bk= =37Z4 -----END PGP SIGNATURE----- --D5cEKevPIgUj4x1u-- --===============4991605088193853599== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --===============4991605088193853599==--