Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
* [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation)
@ 2025-04-14 11:33 Michael Niedermayer
  2025-04-14 12:40 ` softworkz .
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Niedermayer @ 2025-04-14 11:33 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


[-- Attachment #1.1: Type: text/plain, Size: 2709 bytes --]

Hi

I just posted a AVSet implementation i wrote in the last 2 days (yes thats
why i did dissapear for the last 2 days)

My plan was to use that AVSet as basis for AVDictionary2 in case
benchmarks indicate that its worth it, so is it ?

with 3 entries (100000 runs)
AVDictionary    0.040sec
AVSet           0.027sec

with 5 entries (100000 runs)
AVDictionary    0.065sec
AVSet           0.042sec

with 10 entries (100000 runs)
AVDictionary    0.193sec
AVSet           0.087sec

with 100 entries (100000 runs)
AVDictionary    8.7  sec
AVSet           1.4  sec

with 1000 entries (1000 runs)
AVDictionary    8.0   sec
AVSet           0.240 sec

with 10000 entries (10 runs)
AVDictionary    7.2   sec
AVSet           0.042 sec


I was a bit surprised for the 3 and 5 entry case, maybe my benchmark is buggy or
AVSet is, but then AVDictionary is pretty bad with memory allocations

AVDictionary needs to strdup every key and value, needs to allocate
the AVDictionary itself and reallocs the entry array each time
thats 10 memory allocation related calls for adding 3 entries

while AVSet allocates the AVSet and then uses av_fast_realloc() for the array
and theres nothing else, the key/value goes in that array too


bechmark code used is below:


#if 0
    for (int runs = 0; runs < 100000; runs++) {
        AVSet *set = av_set_new(strcmp, NULL, NULL);
        for(int pass = 0; pass < 2; pass++) {
            unsigned r = 5;
            for(int i=0; i<100; i++) {
                r = r*123 + 7;
                char str[2*7] = "TESTXXTESTXX";
                str[4] = r;
                str[5] = r>>8;
                if(pass == 0) {
                    av_set_add(set, str, 2*7, 0);
                } else {
                    av_set_get(set, NULL, str, NULL);
                }
            }
        }
        av_set_free(&set);
    }
#else
    for (int runs = 0; runs < 100000; runs++) {
        AVDictionary *dict = NULL;
        for(int pass = 0; pass < 2; pass++) {
            unsigned r = 5;
            for(int i=0; i<100; i++) {
                r = r*123 + 7;
                char str[7] = "TEST";
                str[4] = r;
                str[5] = r>>8;
                if(pass == 0) {
                    av_dict_set(&dict, str, str, 0);
                } else {
                    av_dict_get(dict, str, NULL, 0);
                }
            }
        }
        av_dict_free(&dict);
    }
#endif


-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation)
  2025-04-14 11:33 [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation) Michael Niedermayer
@ 2025-04-14 12:40 ` softworkz .
  2025-04-14 13:02   ` softworkz .
  0 siblings, 1 reply; 5+ messages in thread
From: softworkz . @ 2025-04-14 12:40 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Michael Niedermayer
> Sent: Montag, 14. April 2025 13:33
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> approximation)
> 
> Hi
> 
> I just posted a AVSet implementation i wrote in the last 2 days (yes
> thats
> why i did dissapear for the last 2 days)
> 
> My plan was to use that AVSet as basis for AVDictionary2 in case
> benchmarks indicate that its worth it, so is it ?
> 
> with 3 entries (100000 runs)
> AVDictionary    0.040sec
> AVSet           0.027sec
> 
> with 5 entries (100000 runs)
> AVDictionary    0.065sec
> AVSet           0.042sec
> 
> with 10 entries (100000 runs)
> AVDictionary    0.193sec
> AVSet           0.087sec
> 
> with 100 entries (100000 runs)
> AVDictionary    8.7  sec
> AVSet           1.4  sec
> 
> with 1000 entries (1000 runs)
> AVDictionary    8.0   sec
> AVSet           0.240 sec
> 
> with 10000 entries (10 runs)
> AVDictionary    7.2   sec
> AVSet           0.042 sec
> 
> 
> I was a bit surprised for the 3 and 5 entry case, maybe my benchmark
> is buggy or
> AVSet is, but then AVDictionary is pretty bad with memory allocations
> 
> AVDictionary needs to strdup every key and value, needs to allocate
> the AVDictionary itself and reallocs the entry array each time
> thats 10 memory allocation related calls for adding 3 entries
> 
> while AVSet allocates the AVSet and then uses av_fast_realloc() for
> the array
> and theres nothing else, the key/value goes in that array too
> 
> 
> bechmark code used is below:
> 
> 
> #if 0
>     for (int runs = 0; runs < 100000; runs++) {
>         AVSet *set = av_set_new(strcmp, NULL, NULL);
>         for(int pass = 0; pass < 2; pass++) {
>             unsigned r = 5;
>             for(int i=0; i<100; i++) {
>                 r = r*123 + 7;
>                 char str[2*7] = "TESTXXTESTXX";
>                 str[4] = r;
>                 str[5] = r>>8;
>                 if(pass == 0) {
>                     av_set_add(set, str, 2*7, 0);
>                 } else {
>                     av_set_get(set, NULL, str, NULL);
>                 }
>             }
>         }
>         av_set_free(&set);
>     }
> #else
>     for (int runs = 0; runs < 100000; runs++) {
>         AVDictionary *dict = NULL;
>         for(int pass = 0; pass < 2; pass++) {
>             unsigned r = 5;
>             for(int i=0; i<100; i++) {
>                 r = r*123 + 7;
>                 char str[7] = "TEST";
>                 str[4] = r;
>                 str[5] = r>>8;
>                 if(pass == 0) {
>                     av_dict_set(&dict, str, str, 0);
>                 } else {
>                     av_dict_get(dict, str, NULL, 0);
>                 }
>             }
>         }
>         av_dict_free(&dict);
>     }
> #endif
> 
> 
> --

Hi Michael,


what's not quite realistic is that all keys are starting with the same 4 characters. This affects the lookups of course - and probably (maybe) not equally for both sides.

Doesn't the code create duplicate keys (at least when it gets > 65536 it will for sure) ?

So, I think, the keys should be completely random (all chars).

I would also check whether the lookups are successful (just to be sure).


Best,
sw


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation)
  2025-04-14 12:40 ` softworkz .
@ 2025-04-14 13:02   ` softworkz .
  2025-04-15 19:11     ` Michael Niedermayer
  0 siblings, 1 reply; 5+ messages in thread
From: softworkz . @ 2025-04-14 13:02 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> softworkz .
> Sent: Montag, 14. April 2025 14:40
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> approximation)
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > Michael Niedermayer
> > Sent: Montag, 14. April 2025 13:33
> > To: FFmpeg development discussions and patches <ffmpeg-
> > devel@ffmpeg.org>
> > Subject: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > approximation)
> >
> > Hi
> >
> > I just posted a AVSet implementation i wrote in the last 2 days (yes
> > thats
> > why i did dissapear for the last 2 days)
> >
> > My plan was to use that AVSet as basis for AVDictionary2 in case
> > benchmarks indicate that its worth it, so is it ?
> >
> > with 3 entries (100000 runs)
> > AVDictionary    0.040sec
> > AVSet           0.027sec
> >
> > with 5 entries (100000 runs)
> > AVDictionary    0.065sec
> > AVSet           0.042sec
> >
> > with 10 entries (100000 runs)
> > AVDictionary    0.193sec
> > AVSet           0.087sec
> >
> > with 100 entries (100000 runs)
> > AVDictionary    8.7  sec
> > AVSet           1.4  sec
> >
> > with 1000 entries (1000 runs)
> > AVDictionary    8.0   sec
> > AVSet           0.240 sec
> >
> > with 10000 entries (10 runs)
> > AVDictionary    7.2   sec
> > AVSet           0.042 sec
> >
> >
> > I was a bit surprised for the 3 and 5 entry case, maybe my benchmark
> > is buggy or
> > AVSet is, but then AVDictionary is pretty bad with memory
> allocations
> >
> > AVDictionary needs to strdup every key and value, needs to allocate
> > the AVDictionary itself and reallocs the entry array each time
> > thats 10 memory allocation related calls for adding 3 entries
> >
> > while AVSet allocates the AVSet and then uses av_fast_realloc() for
> > the array
> > and theres nothing else, the key/value goes in that array too
> >
> >
> > bechmark code used is below:
> >
> >
> > #if 0
> >     for (int runs = 0; runs < 100000; runs++) {
> >         AVSet *set = av_set_new(strcmp, NULL, NULL);
> >         for(int pass = 0; pass < 2; pass++) {
> >             unsigned r = 5;
> >             for(int i=0; i<100; i++) {
> >                 r = r*123 + 7;
> >                 char str[2*7] = "TESTXXTESTXX";
> >                 str[4] = r;
> >                 str[5] = r>>8;
> >                 if(pass == 0) {
> >                     av_set_add(set, str, 2*7, 0);
> >                 } else {
> >                     av_set_get(set, NULL, str, NULL);
> >                 }
> >             }
> >         }
> >         av_set_free(&set);
> >     }
> > #else
> >     for (int runs = 0; runs < 100000; runs++) {
> >         AVDictionary *dict = NULL;
> >         for(int pass = 0; pass < 2; pass++) {
> >             unsigned r = 5;
> >             for(int i=0; i<100; i++) {
> >                 r = r*123 + 7;
> >                 char str[7] = "TEST";
> >                 str[4] = r;
> >                 str[5] = r>>8;
> >                 if(pass == 0) {
> >                     av_dict_set(&dict, str, str, 0);
> >                 } else {
> >                     av_dict_get(dict, str, NULL, 0);
> >                 }
> >             }
> >         }
> >         av_dict_free(&dict);
> >     }
> > #endif
> >
> >
> > --
> 
> Hi Michael,
> 
> 
> what's not quite realistic is that all keys are starting with the same
> 4 characters. This affects the lookups of course - and probably
> (maybe) not equally for both sides.
> 
> Doesn't the code create duplicate keys (at least when it gets > 65536
> it will for sure) ?
> 
> So, I think, the keys should be completely random (all chars).
> 
> I would also check whether the lookups are successful (just to be
> sure).

Sorry, I forgot the most important one: 

Timing for population and lookup should be measured separately..

sw

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation)
  2025-04-14 13:02   ` softworkz .
@ 2025-04-15 19:11     ` Michael Niedermayer
  2025-04-15 20:36       ` Michael Niedermayer
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Niedermayer @ 2025-04-15 19:11 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


[-- Attachment #1.1: Type: text/plain, Size: 7420 bytes --]

On Mon, Apr 14, 2025 at 01:02:00PM +0000, softworkz . wrote:
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > softworkz .
> > Sent: Montag, 14. April 2025 14:40
> > To: FFmpeg development discussions and patches <ffmpeg-
> > devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > approximation)
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > > Michael Niedermayer
> > > Sent: Montag, 14. April 2025 13:33
> > > To: FFmpeg development discussions and patches <ffmpeg-
> > > devel@ffmpeg.org>
> > > Subject: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > > approximation)
> > >
> > > Hi
> > >
> > > I just posted a AVSet implementation i wrote in the last 2 days (yes
> > > thats
> > > why i did dissapear for the last 2 days)
> > >
> > > My plan was to use that AVSet as basis for AVDictionary2 in case
> > > benchmarks indicate that its worth it, so is it ?
> > >
> > > with 3 entries (100000 runs)
> > > AVDictionary    0.040sec
> > > AVSet           0.027sec
> > >
> > > with 5 entries (100000 runs)
> > > AVDictionary    0.065sec
> > > AVSet           0.042sec
> > >
> > > with 10 entries (100000 runs)
> > > AVDictionary    0.193sec
> > > AVSet           0.087sec
> > >
> > > with 100 entries (100000 runs)
> > > AVDictionary    8.7  sec
> > > AVSet           1.4  sec
> > >
> > > with 1000 entries (1000 runs)
> > > AVDictionary    8.0   sec
> > > AVSet           0.240 sec
> > >
> > > with 10000 entries (10 runs)
> > > AVDictionary    7.2   sec
> > > AVSet           0.042 sec
> > >
> > >
> > > I was a bit surprised for the 3 and 5 entry case, maybe my benchmark
> > > is buggy or
> > > AVSet is, but then AVDictionary is pretty bad with memory
> > allocations
> > >
> > > AVDictionary needs to strdup every key and value, needs to allocate
> > > the AVDictionary itself and reallocs the entry array each time
> > > thats 10 memory allocation related calls for adding 3 entries
> > >
> > > while AVSet allocates the AVSet and then uses av_fast_realloc() for
> > > the array
> > > and theres nothing else, the key/value goes in that array too
> > >
> > >
> > > bechmark code used is below:
> > >
> > >
> > > #if 0
> > >     for (int runs = 0; runs < 100000; runs++) {
> > >         AVSet *set = av_set_new(strcmp, NULL, NULL);
> > >         for(int pass = 0; pass < 2; pass++) {
> > >             unsigned r = 5;
> > >             for(int i=0; i<100; i++) {
> > >                 r = r*123 + 7;
> > >                 char str[2*7] = "TESTXXTESTXX";
> > >                 str[4] = r;
> > >                 str[5] = r>>8;
> > >                 if(pass == 0) {
> > >                     av_set_add(set, str, 2*7, 0);
> > >                 } else {
> > >                     av_set_get(set, NULL, str, NULL);
> > >                 }
> > >             }
> > >         }
> > >         av_set_free(&set);
> > >     }
> > > #else
> > >     for (int runs = 0; runs < 100000; runs++) {
> > >         AVDictionary *dict = NULL;
> > >         for(int pass = 0; pass < 2; pass++) {
> > >             unsigned r = 5;
> > >             for(int i=0; i<100; i++) {
> > >                 r = r*123 + 7;
> > >                 char str[7] = "TEST";
> > >                 str[4] = r;
> > >                 str[5] = r>>8;
> > >                 if(pass == 0) {
> > >                     av_dict_set(&dict, str, str, 0);
> > >                 } else {
> > >                     av_dict_get(dict, str, NULL, 0);
> > >                 }
> > >             }
> > >         }
> > >         av_dict_free(&dict);
> > >     }
> > > #endif
> > >
> > >
> > > --
> > 
> > Hi Michael,
> > 
> > 
> > what's not quite realistic is that all keys are starting with the same
> > 4 characters. This affects the lookups of course - and probably
> > (maybe) not equally for both sides.
> > 
> > Doesn't the code create duplicate keys (at least when it gets > 65536
> > it will for sure) ?
> > 
> > So, I think, the keys should be completely random (all chars).
> > 
> > I would also check whether the lookups are successful (just to be
> > sure).
> 
> Sorry, I forgot the most important one: 
> 
> Timing for population and lookup should be measured separately..

Sure, for the v2 (AVMap) i just posted

with TESTXX / TESTXX strings where XX is random

1000 entries
  5354505 decicycles in av_map_add,     512 runs,      0 skips
  4040575 decicycles in av_map_get,     512 runs,      0 skips
148082828 decicycles in av_dict_set,     512 runs,      0 skips
145828939 decicycles in av_dict_get,     512 runs,      0 skips

100 entries
 332015 decicycles in av_map_add,     512 runs,      0 skips
 193726 decicycles in av_map_get,     512 runs,      0 skips
1697242 decicycles in av_dict_set,     512 runs,      0 skips
1392837 decicycles in av_dict_get,     512 runs,      0 skips

10 entries
  21142 decicycles in av_map_add,     512 runs,      0 skips
  11395 decicycles in av_map_get,     512 runs,      0 skips
  45663 decicycles in av_dict_set,     512 runs,      0 skips
  19756 decicycles in av_dict_get,     512 runs,      0 skips

5 entries
   9210 decicycles in av_map_add,     512 runs,      0 skips
   4870 decicycles in av_map_get,     511 runs,      1 skips
  18823 decicycles in av_dict_set,     512 runs,      0 skips
   5483 decicycles in av_dict_get,     512 runs,      0 skips

3 entries
   5693 decicycles in av_map_add,     512 runs,      0 skips
   2645 decicycles in av_map_get,     512 runs,      0 skips
  11462 decicycles in av_dict_set,     511 runs,      1 skips
   2532 decicycles in av_dict_get,     512 runs,      0 skips



with XXST / XXST strings where XX is random

1000 entries
 5321153 decicycles in av_map_add,     512 runs,      0 skips
 4295153 decicycles in av_map_get,     512 runs,      0 skips
70417784 decicycles in av_dict_set,     512 runs,      0 skips
68188612 decicycles in av_dict_get,     512 runs,      0 skips

100 entries
 322872 decicycles in av_map_add,     512 runs,      0 skips
 216032 decicycles in av_map_get,     511 runs,      1 skips
1022088 decicycles in av_dict_set,     512 runs,      0 skips
 723612 decicycles in av_dict_get,     512 runs,      0 skips

10 entries
  20993 decicycles in av_map_add,     512 runs,      0 skips
  11744 decicycles in av_map_get,     512 runs,      0 skips
  38945 decicycles in av_dict_set,     512 runs,      0 skips
  11308 decicycles in av_dict_get,     512 runs,      0 skips

5 entries
  10007 decicycles in av_map_add,     511 runs,      1 skips
   5004 decicycles in av_map_get,     512 runs,      0 skips
  17374 decicycles in av_dict_set,     511 runs,      1 skips
   3848 decicycles in av_dict_get,     512 runs,      0 skips

3 entries
   5896 decicycles in av_map_add,     512 runs,      0 skips
   2765 decicycles in av_map_get,     512 runs,      0 skips
  11396 decicycles in av_dict_set,     511 runs,      1 skips
   2029 decicycles in av_dict_get,     512 runs,      0 skips


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation)
  2025-04-15 19:11     ` Michael Niedermayer
@ 2025-04-15 20:36       ` Michael Niedermayer
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Niedermayer @ 2025-04-15 20:36 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


[-- Attachment #1.1: Type: text/plain, Size: 8314 bytes --]

On Tue, Apr 15, 2025 at 09:11:33PM +0200, Michael Niedermayer wrote:
> On Mon, Apr 14, 2025 at 01:02:00PM +0000, softworkz . wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > > softworkz .
> > > Sent: Montag, 14. April 2025 14:40
> > > To: FFmpeg development discussions and patches <ffmpeg-
> > > devel@ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > > approximation)
> > > 
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > > > Michael Niedermayer
> > > > Sent: Montag, 14. April 2025 13:33
> > > > To: FFmpeg development discussions and patches <ffmpeg-
> > > > devel@ffmpeg.org>
> > > > Subject: [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2
> > > > approximation)
> > > >
> > > > Hi
> > > >
> > > > I just posted a AVSet implementation i wrote in the last 2 days (yes
> > > > thats
> > > > why i did dissapear for the last 2 days)
> > > >
> > > > My plan was to use that AVSet as basis for AVDictionary2 in case
> > > > benchmarks indicate that its worth it, so is it ?
> > > >
> > > > with 3 entries (100000 runs)
> > > > AVDictionary    0.040sec
> > > > AVSet           0.027sec
> > > >
> > > > with 5 entries (100000 runs)
> > > > AVDictionary    0.065sec
> > > > AVSet           0.042sec
> > > >
> > > > with 10 entries (100000 runs)
> > > > AVDictionary    0.193sec
> > > > AVSet           0.087sec
> > > >
> > > > with 100 entries (100000 runs)
> > > > AVDictionary    8.7  sec
> > > > AVSet           1.4  sec
> > > >
> > > > with 1000 entries (1000 runs)
> > > > AVDictionary    8.0   sec
> > > > AVSet           0.240 sec
> > > >
> > > > with 10000 entries (10 runs)
> > > > AVDictionary    7.2   sec
> > > > AVSet           0.042 sec
> > > >
> > > >
> > > > I was a bit surprised for the 3 and 5 entry case, maybe my benchmark
> > > > is buggy or
> > > > AVSet is, but then AVDictionary is pretty bad with memory
> > > allocations
> > > >
> > > > AVDictionary needs to strdup every key and value, needs to allocate
> > > > the AVDictionary itself and reallocs the entry array each time
> > > > thats 10 memory allocation related calls for adding 3 entries
> > > >
> > > > while AVSet allocates the AVSet and then uses av_fast_realloc() for
> > > > the array
> > > > and theres nothing else, the key/value goes in that array too
> > > >
> > > >
> > > > bechmark code used is below:
> > > >
> > > >
> > > > #if 0
> > > >     for (int runs = 0; runs < 100000; runs++) {
> > > >         AVSet *set = av_set_new(strcmp, NULL, NULL);
> > > >         for(int pass = 0; pass < 2; pass++) {
> > > >             unsigned r = 5;
> > > >             for(int i=0; i<100; i++) {
> > > >                 r = r*123 + 7;
> > > >                 char str[2*7] = "TESTXXTESTXX";
> > > >                 str[4] = r;
> > > >                 str[5] = r>>8;
> > > >                 if(pass == 0) {
> > > >                     av_set_add(set, str, 2*7, 0);
> > > >                 } else {
> > > >                     av_set_get(set, NULL, str, NULL);
> > > >                 }
> > > >             }
> > > >         }
> > > >         av_set_free(&set);
> > > >     }
> > > > #else
> > > >     for (int runs = 0; runs < 100000; runs++) {
> > > >         AVDictionary *dict = NULL;
> > > >         for(int pass = 0; pass < 2; pass++) {
> > > >             unsigned r = 5;
> > > >             for(int i=0; i<100; i++) {
> > > >                 r = r*123 + 7;
> > > >                 char str[7] = "TEST";
> > > >                 str[4] = r;
> > > >                 str[5] = r>>8;
> > > >                 if(pass == 0) {
> > > >                     av_dict_set(&dict, str, str, 0);
> > > >                 } else {
> > > >                     av_dict_get(dict, str, NULL, 0);
> > > >                 }
> > > >             }
> > > >         }
> > > >         av_dict_free(&dict);
> > > >     }
> > > > #endif
> > > >
> > > >
> > > > --
> > > 
> > > Hi Michael,
> > > 
> > > 
> > > what's not quite realistic is that all keys are starting with the same
> > > 4 characters. This affects the lookups of course - and probably
> > > (maybe) not equally for both sides.
> > > 
> > > Doesn't the code create duplicate keys (at least when it gets > 65536
> > > it will for sure) ?
> > > 
> > > So, I think, the keys should be completely random (all chars).
> > > 
> > > I would also check whether the lookups are successful (just to be
> > > sure).
> > 
> > Sorry, I forgot the most important one: 
> > 
> > Timing for population and lookup should be measured separately..
> 
> Sure, for the v2 (AVMap) i just posted
> 
> with TESTXX / TESTXX strings where XX is random
> 
> 1000 entries
>   5354505 decicycles in av_map_add,     512 runs,      0 skips
>   4040575 decicycles in av_map_get,     512 runs,      0 skips
> 148082828 decicycles in av_dict_set,     512 runs,      0 skips
> 145828939 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 100 entries
>  332015 decicycles in av_map_add,     512 runs,      0 skips
>  193726 decicycles in av_map_get,     512 runs,      0 skips
> 1697242 decicycles in av_dict_set,     512 runs,      0 skips
> 1392837 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 10 entries
>   21142 decicycles in av_map_add,     512 runs,      0 skips
>   11395 decicycles in av_map_get,     512 runs,      0 skips
>   45663 decicycles in av_dict_set,     512 runs,      0 skips
>   19756 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 5 entries
>    9210 decicycles in av_map_add,     512 runs,      0 skips
>    4870 decicycles in av_map_get,     511 runs,      1 skips
>   18823 decicycles in av_dict_set,     512 runs,      0 skips
>    5483 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 3 entries
>    5693 decicycles in av_map_add,     512 runs,      0 skips
>    2645 decicycles in av_map_get,     512 runs,      0 skips
>   11462 decicycles in av_dict_set,     511 runs,      1 skips
>    2532 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 
> 
> with XXST / XXST strings where XX is random
> 
> 1000 entries
>  5321153 decicycles in av_map_add,     512 runs,      0 skips
>  4295153 decicycles in av_map_get,     512 runs,      0 skips
> 70417784 decicycles in av_dict_set,     512 runs,      0 skips
> 68188612 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 100 entries
>  322872 decicycles in av_map_add,     512 runs,      0 skips
>  216032 decicycles in av_map_get,     511 runs,      1 skips
> 1022088 decicycles in av_dict_set,     512 runs,      0 skips
>  723612 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 10 entries
>   20993 decicycles in av_map_add,     512 runs,      0 skips
>   11744 decicycles in av_map_get,     512 runs,      0 skips
>   38945 decicycles in av_dict_set,     512 runs,      0 skips
>   11308 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 5 entries
>   10007 decicycles in av_map_add,     511 runs,      1 skips
>    5004 decicycles in av_map_get,     512 runs,      0 skips
>   17374 decicycles in av_dict_set,     511 runs,      1 skips
>    3848 decicycles in av_dict_get,     512 runs,      0 skips
> 
> 3 entries
>    5896 decicycles in av_map_add,     512 runs,      0 skips
>    2765 decicycles in av_map_get,     512 runs,      0 skips
>   11396 decicycles in av_dict_set,     511 runs,      1 skips
>    2029 decicycles in av_dict_get,     512 runs,      0 skips

This contained a bug, Dictionary used C code written by FFmpeg developers
and did a case insensitive compare while AVMap used strcmp() which
is case sensitive and not written by FFmpeg developers.

I will post new test results with av_strcasecmp() instead
and also put it in a new thread so it can be found easier

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Into a blind darkness they enter who follow after the Ignorance,
they as if into a greater darkness enter who devote themselves
to the Knowledge alone. -- Isha Upanishad

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-04-15 20:36 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-14 11:33 [FFmpeg-devel] AVDictionary vs. AVSet (AVDictionary2 approximation) Michael Niedermayer
2025-04-14 12:40 ` softworkz .
2025-04-14 13:02   ` softworkz .
2025-04-15 19:11     ` Michael Niedermayer
2025-04-15 20:36       ` Michael Niedermayer

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git