From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <ffmpeg-devel-bounces@ffmpeg.org> Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTPS id 701C74C381 for <ffmpegdev@gitmailbox.com>; Tue, 8 Apr 2025 10:20:13 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 86446687D38; Tue, 8 Apr 2025 13:20:09 +0300 (EEST) Received: from relay1-d.mail.gandi.net (relay1-d.mail.gandi.net [217.70.183.193]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 69EBF687D1B for <ffmpeg-devel@ffmpeg.org>; Tue, 8 Apr 2025 13:20:02 +0300 (EEST) Received: by mail.gandi.net (Postfix) with ESMTPSA id 570434331D for <ffmpeg-devel@ffmpeg.org>; Tue, 8 Apr 2025 10:20:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=niedermayer.cc; s=gm1; t=1744107601; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type; bh=Edjjl+E/v4l1KLmqR7YR21xsEp5ZZrAvQ6UYN5yxZZ8=; b=ZN1ITlEN0Pm0CwxXXH3elPssj3X9USXMqVTqG+zRv9Gb06yjseiqTn+QOywFl3smnXUTa0 FBZnTWIFCjwLPJWkqjGOvssqNYrTmwNL8Ezcr4/F4U+dSre5CO4httiSL6vH7xMBsOD68u tHSde9+i2DKWqWoKFDIxLm5ozmOnewdaKpBvn+2IX08ve3ehd1pkVxzDWKw6roo0EN2VeK 3Su1XUH+qYPSJ4QclvQ7Vx+LUKiimpniS3ffl3eC3b3cwtf9OWRYZgUxB+w2Cn+1JXslgz U2/tGJOo67v5CgibhSM1sYJ/fYRFD4UogidYCjYdy19sh70dVzs6O9UO7CW/Ng== Date: Tue, 8 Apr 2025 12:19:59 +0200 From: Michael Niedermayer <michael@niedermayer.cc> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Message-ID: <20250408101959.GP4991@pb2> MIME-Version: 1.0 X-GND-State: clean X-GND-Score: -70 X-GND-Cause: gggruggvucftvghtrhhoucdtuddrgeefvddrtddtgddvtddvkedvucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuifetpfffkfdpucggtfgfnhhsuhgsshgtrhhisggvnecuuegrihhlohhuthemuceftddunecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenfghrlhcuvffnffculdeftddmnecujfgurhepfffhvffukfggtggusehgtderredttddvnecuhfhrohhmpefoihgthhgrvghlucfpihgvuggvrhhmrgihvghruceomhhitghhrggvlhesnhhivgguvghrmhgrhigvrhdrtggtqeenucggtffrrghtthgvrhhnpeeifeegvefgvdegledugeehlefhgeffvdeggfdtgeevgeduleevieeuleeiteevffenucfkphepgedurdeiiedrieejrdduudefnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehinhgvthepgedurdeiiedrieejrdduudefpdhhvghloheplhhotggrlhhhohhsthdpmhgrihhlfhhrohhmpehmihgthhgrvghlsehnihgvuggvrhhmrgihvghrrdgttgdpnhgspghrtghpthhtohepuddprhgtphhtthhopehffhhmphgvghdquggvvhgvlhesfhhfmhhpvghgrdhorhhg X-GND-Sasl: michael@niedermayer.cc Subject: [FFmpeg-devel] [RFC] AVDictionary2 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org> List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>, <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe> List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel> List-Post: <mailto:ffmpeg-devel@ffmpeg.org> List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help> List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>, <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe> Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Content-Type: multipart/mixed; boundary="===============1603012628282792555==" Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org> Archived-At: <https://master.gitmailbox.com/ffmpegdev/20250408101959.GP4991@pb2/> List-Archive: <https://master.gitmailbox.com/ffmpegdev/> List-Post: <mailto:ffmpegdev@gitmailbox.com> --===============1603012628282792555== Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="9oKqUQTGbbRxTe6h" Content-Disposition: inline --9oKqUQTGbbRxTe6h Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi all As i have too many things to do already i did the most logic thing and started thinking about a new and unrelated idea. This is a list of problems and ideas, that everyone is welcome to add to and comment on. AVDictionary is just bad. * its complicated internally with unneeded alternative (AV_DICT_DONT_STRDUP_VAL/KEY) these are rarely used and probably not relevant for performance. * all basic operations are as slow as possible. you want to find, update or remove an entry, search through all entries * its heavy on memory allocations 1 malloc for key, 1 malloc for value, 1 realloc on the AVDictionaryEntry = array that makes 2+ malloc() for every "foo"=3D"bar" Ideas: 1. put the node struct (AVDictionaryEntry), the key and value in the same allocated block, 1 malloc() instead of 2. We can simply concatenate the key and value string, we could even use the 0 terminator instead of the 2nd pointer. Either way the whole can go to the end of the Node structure for a tree 1b. Now if we did put the key and value together, we can order in the tree by this combined entity. Why ? because now we have a unique ordering and also the key+value could be required to be always unique. Simplifying things from what we have now and making it more replicatable, no more changes in output because order changed 2. We have a simple AVL tree implementation which we could use to make all operations O(log n) instead of O(n) 3. We could go with hash tables, splay trees, critbit trees or something else. hash tables have issues with malicious/odd input which would require more complexity to workaround. Of course we could also go a step further and eliminate the malloc per node and put it all in a linear array. As in, insert -> append at the end, realloc with every power of 2 size increase complete rebuild once enough elements are removed not sure this isnt overkill for a metadata string dictionary I probably wont have time to implement this in the near future but as i was thinking about this, it seemed to make sense to write this down and post here git grep av_dict | wc is 1436 So its used a bit, justifying looking at improving it git grep AV_DICT_DONT_STRDUP | wc is 87 git grep AV_DICT_DONT_STRDUP libavutil/ tests doc | wc is 20 Seems not too common and one malloc/copy of a string once per metadata entry which is once per file generally, seems a strange optimization to me thx --=20 Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety -- Benjamin Franklin --9oKqUQTGbbRxTe6h Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iF0EABEKAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCZ/T4TAAKCRBhHseHBAsP q9Z2AJ48TecOR7OhnE4hiL6C/+BVKlDStgCgkEz3eLu9mrK4czo8KJHG9BJ4CY8= =t0ux -----END PGP SIGNATURE----- --9oKqUQTGbbRxTe6h-- --===============1603012628282792555== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --===============1603012628282792555==--