From: Michael Niedermayer <michael@niedermayer.cc> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Subject: [FFmpeg-devel] [RFC] AVDictionary2 Date: Tue, 8 Apr 2025 12:19:59 +0200 Message-ID: <20250408101959.GP4991@pb2> (raw) [-- Attachment #1.1: Type: text/plain, Size: 2757 bytes --] Hi all As i have too many things to do already i did the most logic thing and started thinking about a new and unrelated idea. This is a list of problems and ideas, that everyone is welcome to add to and comment on. AVDictionary is just bad. * its complicated internally with unneeded alternative (AV_DICT_DONT_STRDUP_VAL/KEY) these are rarely used and probably not relevant for performance. * all basic operations are as slow as possible. you want to find, update or remove an entry, search through all entries * its heavy on memory allocations 1 malloc for key, 1 malloc for value, 1 realloc on the AVDictionaryEntry array that makes 2+ malloc() for every "foo"="bar" Ideas: 1. put the node struct (AVDictionaryEntry), the key and value in the same allocated block, 1 malloc() instead of 2. We can simply concatenate the key and value string, we could even use the 0 terminator instead of the 2nd pointer. Either way the whole can go to the end of the Node structure for a tree 1b. Now if we did put the key and value together, we can order in the tree by this combined entity. Why ? because now we have a unique ordering and also the key+value could be required to be always unique. Simplifying things from what we have now and making it more replicatable, no more changes in output because order changed 2. We have a simple AVL tree implementation which we could use to make all operations O(log n) instead of O(n) 3. We could go with hash tables, splay trees, critbit trees or something else. hash tables have issues with malicious/odd input which would require more complexity to workaround. Of course we could also go a step further and eliminate the malloc per node and put it all in a linear array. As in, insert -> append at the end, realloc with every power of 2 size increase complete rebuild once enough elements are removed not sure this isnt overkill for a metadata string dictionary I probably wont have time to implement this in the near future but as i was thinking about this, it seemed to make sense to write this down and post here git grep av_dict | wc is 1436 So its used a bit, justifying looking at improving it git grep AV_DICT_DONT_STRDUP | wc is 87 git grep AV_DICT_DONT_STRDUP libavutil/ tests doc | wc is 20 Seems not too common and one malloc/copy of a string once per metadata entry which is once per file generally, seems a strange optimization to me thx -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety -- Benjamin Franklin [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] [-- Attachment #2: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next reply other threads:[~2025-04-08 10:20 UTC|newest] Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top 2025-04-08 10:19 Michael Niedermayer [this message] 2025-04-08 16:10 ` Romain Beauxis 2025-04-08 20:29 ` Michael Niedermayer 2025-04-08 22:18 ` Gerion Entrup 2025-04-08 22:35 ` Michael Niedermayer 2025-04-08 22:37 ` softworkz . 2025-04-08 16:56 ` softworkz . 2025-04-08 18:16 ` Michael Niedermayer 2025-04-08 18:36 ` softworkz . 2025-04-08 19:45 ` Michael Niedermayer 2025-04-08 21:30 ` softworkz . 2025-04-11 19:06 ` Michael Niedermayer 2025-04-12 1:41 ` softworkz . 2025-04-12 11:02 ` softworkz . 2025-04-09 0:00 ` Leo Izen 2025-04-09 16:56 ` Michael Niedermayer 2025-04-10 8:40 ` Nicolas George 2025-04-10 18:31 ` softworkz . 2025-04-11 20:50 ` Michael Niedermayer
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20250408101959.GP4991@pb2 \ --to=michael@niedermayer.cc \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git