From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation)
Date: Tue, 15 Apr 2025 21:11:33 +0200
Message-ID: <20250415191133.GT4991@pb2> (raw)
In-Reply-To: <BN0P223MB03581807CBC5FCBC171EC87CBAB32@BN0P223MB0358.NAMP223.PROD.OUTLOOK.COM>
[-- Attachment #1.1: Type: text/plain, Size: 7420 bytes --]
On Mon, Apr 14, 2025 at 01:02:00PM +0000, softworkz . wrote:
>
>
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > softworkz .
> > Sent: Montag, 14. April 2025 14:40
> > To: FFmpeg development discussions and patches <ffmpeg-
> > devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > approximation)
> >
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > > Michael Niedermayer
> > > Sent: Montag, 14. April 2025 13:33
> > > To: FFmpeg development discussions and patches <ffmpeg-
> > > devel@ffmpeg.org>
> > > Subject: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > > approximation)
> > >
> > > Hi
> > >
> > > I just posted a AVSet implementation i wrote in the last 2 days (yes
> > > thats
> > > why i did dissapear for the last 2 days)
> > >
> > > My plan was to use that AVSet as basis for AVDictionary2 in case
> > > benchmarks indicate that its worth it, so is it ?
> > >
> > > with 3 entries (100000 runs)
> > > AVDictionary 0.040sec
> > > AVSet 0.027sec
> > >
> > > with 5 entries (100000 runs)
> > > AVDictionary 0.065sec
> > > AVSet 0.042sec
> > >
> > > with 10 entries (100000 runs)
> > > AVDictionary 0.193sec
> > > AVSet 0.087sec
> > >
> > > with 100 entries (100000 runs)
> > > AVDictionary 8.7 sec
> > > AVSet 1.4 sec
> > >
> > > with 1000 entries (1000 runs)
> > > AVDictionary 8.0 sec
> > > AVSet 0.240 sec
> > >
> > > with 10000 entries (10 runs)
> > > AVDictionary 7.2 sec
> > > AVSet 0.042 sec
> > >
> > >
> > > I was a bit surprised for the 3 and 5 entry case, maybe my benchmark
> > > is buggy or
> > > AVSet is, but then AVDictionary is pretty bad with memory
> > allocations
> > >
> > > AVDictionary needs to strdup every key and value, needs to allocate
> > > the AVDictionary itself and reallocs the entry array each time
> > > thats 10 memory allocation related calls for adding 3 entries
> > >
> > > while AVSet allocates the AVSet and then uses av_fast_realloc() for
> > > the array
> > > and theres nothing else, the key/value goes in that array too
> > >
> > >
> > > bechmark code used is below:
> > >
> > >
> > > #if 0
> > > for (int runs = 0; runs < 100000; runs++) {
> > > AVSet *set = av_set_new(strcmp, NULL, NULL);
> > > for(int pass = 0; pass < 2; pass++) {
> > > unsigned r = 5;
> > > for(int i=0; i<100; i++) {
> > > r = r*123 + 7;
> > > char str[2*7] = "TESTXXTESTXX";
> > > str[4] = r;
> > > str[5] = r>>8;
> > > if(pass == 0) {
> > > av_set_add(set, str, 2*7, 0);
> > > } else {
> > > av_set_get(set, NULL, str, NULL);
> > > }
> > > }
> > > }
> > > av_set_free(&set);
> > > }
> > > #else
> > > for (int runs = 0; runs < 100000; runs++) {
> > > AVDictionary *dict = NULL;
> > > for(int pass = 0; pass < 2; pass++) {
> > > unsigned r = 5;
> > > for(int i=0; i<100; i++) {
> > > r = r*123 + 7;
> > > char str[7] = "TEST";
> > > str[4] = r;
> > > str[5] = r>>8;
> > > if(pass == 0) {
> > > av_dict_set(&dict, str, str, 0);
> > > } else {
> > > av_dict_get(dict, str, NULL, 0);
> > > }
> > > }
> > > }
> > > av_dict_free(&dict);
> > > }
> > > #endif
> > >
> > >
> > > --
> >
> > Hi Michael,
> >
> >
> > what's not quite realistic is that all keys are starting with the same
> > 4 characters. This affects the lookups of course - and probably
> > (maybe) not equally for both sides.
> >
> > Doesn't the code create duplicate keys (at least when it gets > 65536
> > it will for sure) ?
> >
> > So, I think, the keys should be completely random (all chars).
> >
> > I would also check whether the lookups are successful (just to be
> > sure).
>
> Sorry, I forgot the most important one:
>
> Timing for population and lookup should be measured separately..
Sure, for the v2 (AVMap) i just posted
with TESTXX / TESTXX strings where XX is random
1000 entries
5354505 decicycles in av_map_add, 512 runs, 0 skips
4040575 decicycles in av_map_get, 512 runs, 0 skips
148082828 decicycles in av_dict_set, 512 runs, 0 skips
145828939 decicycles in av_dict_get, 512 runs, 0 skips
100 entries
332015 decicycles in av_map_add, 512 runs, 0 skips
193726 decicycles in av_map_get, 512 runs, 0 skips
1697242 decicycles in av_dict_set, 512 runs, 0 skips
1392837 decicycles in av_dict_get, 512 runs, 0 skips
10 entries
21142 decicycles in av_map_add, 512 runs, 0 skips
11395 decicycles in av_map_get, 512 runs, 0 skips
45663 decicycles in av_dict_set, 512 runs, 0 skips
19756 decicycles in av_dict_get, 512 runs, 0 skips
5 entries
9210 decicycles in av_map_add, 512 runs, 0 skips
4870 decicycles in av_map_get, 511 runs, 1 skips
18823 decicycles in av_dict_set, 512 runs, 0 skips
5483 decicycles in av_dict_get, 512 runs, 0 skips
3 entries
5693 decicycles in av_map_add, 512 runs, 0 skips
2645 decicycles in av_map_get, 512 runs, 0 skips
11462 decicycles in av_dict_set, 511 runs, 1 skips
2532 decicycles in av_dict_get, 512 runs, 0 skips
with XXST / XXST strings where XX is random
1000 entries
5321153 decicycles in av_map_add, 512 runs, 0 skips
4295153 decicycles in av_map_get, 512 runs, 0 skips
70417784 decicycles in av_dict_set, 512 runs, 0 skips
68188612 decicycles in av_dict_get, 512 runs, 0 skips
100 entries
322872 decicycles in av_map_add, 512 runs, 0 skips
216032 decicycles in av_map_get, 511 runs, 1 skips
1022088 decicycles in av_dict_set, 512 runs, 0 skips
723612 decicycles in av_dict_get, 512 runs, 0 skips
10 entries
20993 decicycles in av_map_add, 512 runs, 0 skips
11744 decicycles in av_map_get, 512 runs, 0 skips
38945 decicycles in av_dict_set, 512 runs, 0 skips
11308 decicycles in av_dict_get, 512 runs, 0 skips
5 entries
10007 decicycles in av_map_add, 511 runs, 1 skips
5004 decicycles in av_map_get, 512 runs, 0 skips
17374 decicycles in av_dict_set, 511 runs, 1 skips
3848 decicycles in av_dict_get, 512 runs, 0 skips
3 entries
5896 decicycles in av_map_add, 512 runs, 0 skips
2765 decicycles in av_map_get, 512 runs, 0 skips
11396 decicycles in av_dict_set, 511 runs, 1 skips
2029 decicycles in av_dict_get, 512 runs, 0 skips
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The worst form of inequality is to try to make unequal things equal.
-- Aristotle
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2025-04-15 19:11 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-14 11:33 Michael Niedermayer
2025-04-14 12:40 ` softworkz .
2025-04-14 13:02 ` softworkz .
2025-04-15 19:11 ` Michael Niedermayer [this message]
2025-04-15 20:36 ` Michael Niedermayer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250415191133.GT4991@pb2 \
--to=michael@niedermayer.cc \
--cc=ffmpeg-devel@ffmpeg.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This inbox may be cloned and mirrored by anyone:
git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git
# If you have public-inbox 1.1+ installed, you may
# initialize and index your mirror using the following commands:
public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
ffmpegdev@gitmailbox.com
public-inbox-index ffmpegdev
Example config snippet for mirrors.
AGPL code for this site: git clone https://public-inbox.org/public-inbox.git