* [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits @ 2023-10-24 15:04 Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Michael Niedermayer ` (3 more replies) 0 siblings, 4 replies; 9+ messages in thread From: Michael Niedermayer @ 2023-10-24 15:04 UTC (permalink / raw) To: FFmpeg development discussions and patches This will permit further optimizations Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> --- libavcodec/magicyuv.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/libavcodec/magicyuv.c b/libavcodec/magicyuv.c index 4a143cdbbf7..78d7f44cd65 100644 --- a/libavcodec/magicyuv.c +++ b/libavcodec/magicyuv.c @@ -124,13 +124,13 @@ static void magicyuv_median_pred16(uint16_t *dst, const uint16_t *src1, x = 0; \ for (; x < width-c && get_bits_left(&gb) > 0;) {\ ret = get_vlc_multi(&gb, (uint8_t *)dst + x * b, multi, \ - vlc, vlc_bits, 3); \ + vlc, VLC_BITS, 3); \ if (ret <= 0) \ return AVERROR_INVALIDDATA; \ x += ret; \ } \ for (; x < width && get_bits_left(&gb) > 0; x++) \ - dst[x] = get_vlc2(&gb, vlc, vlc_bits, 3); \ + dst[x] = get_vlc2(&gb, vlc, VLC_BITS, 3); \ dst += stride; \ } @@ -155,7 +155,6 @@ static int magy_decode_slice10(AVCodecContext *avctx, void *tdata, ptrdiff_t stride = p->linesize[i] / 2; const VLC_MULTI_ELEM *const multi = s->multi[i].table; const VLCElem *const vlc = s->vlc[i].table; - const int vlc_bits = s->vlc[i].bits; int flags, pred; int ret = init_get_bits8(&gb, s->buf + s->slices[i][j].start, s->slices[i][j].size); -- 2.17.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small 2023-10-24 15:04 [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Michael Niedermayer @ 2023-10-24 15:04 ` Michael Niedermayer 2023-10-27 3:10 ` Andreas Rheinhardt 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 3/4] avcodec/get_bits: Implement get_vlc_multi() Michael Niedermayer ` (2 subsequent siblings) 3 siblings, 1 reply; 9+ messages in thread From: Michael Niedermayer @ 2023-10-24 15:04 UTC (permalink / raw) To: FFmpeg development discussions and patches Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> --- libavcodec/get_bits.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h index cfcf97c021c..86cea00494a 100644 --- a/libavcodec/get_bits.h +++ b/libavcodec/get_bits.h @@ -581,8 +581,12 @@ static inline const uint8_t *align_get_bits(GetBitContext *s) n = table[index].len; \ \ if (max_depth > 1 && n < 0) { \ - LAST_SKIP_BITS(name, gb, bits); \ - UPDATE_CACHE(name, gb); \ + if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { \ + SKIP_BITS(name, gb, bits); \ + } else { \ + LAST_SKIP_BITS(name, gb, bits); \ + UPDATE_CACHE(name, gb); \ + } \ \ nb_bits = -n; \ \ -- 2.17.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Michael Niedermayer @ 2023-10-27 3:10 ` Andreas Rheinhardt 2023-10-27 18:38 ` Michael Niedermayer 0 siblings, 1 reply; 9+ messages in thread From: Andreas Rheinhardt @ 2023-10-27 3:10 UTC (permalink / raw) To: ffmpeg-devel Michael Niedermayer: > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > --- > libavcodec/get_bits.h | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h > index cfcf97c021c..86cea00494a 100644 > --- a/libavcodec/get_bits.h > +++ b/libavcodec/get_bits.h > @@ -581,8 +581,12 @@ static inline const uint8_t *align_get_bits(GetBitContext *s) > n = table[index].len; \ > \ > if (max_depth > 1 && n < 0) { \ > - LAST_SKIP_BITS(name, gb, bits); \ > - UPDATE_CACHE(name, gb); \ > + if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { \ > + SKIP_BITS(name, gb, bits); \ > + } else { \ > + LAST_SKIP_BITS(name, gb, bits); \ > + UPDATE_CACHE(name, gb); \ > + } \ > \ > nb_bits = -n; \ > \ This is problematic: The GET_VLC macro does not presume that MIN_CACHE_BITS are available; there is code that directly uses GET_VLC instead of get_vlc2(). I had the same idea when I made my VLC patchset, yet I wanted to first apply it (which I forgot). While investigating the above issue, I found out that all users of GET_VLC always call UPDATE_CACHE immediately before GET_VLC, so UPDATE_CACHE should be moved into GET_VLC; furthermore, no user of GET_VLC relies on the reloads inside of GET_VLC. The patches for this are here: https://github.com/mkver/FFmpeg/commits/vlc Shall I send them? Notice that making GET_VLC more standalone enables improvements over the current approach; yet it will not lead to optimal code: E.g. the VLCs in decode_alpha_block() in speedhqdec.c are so short that one could read both VLCs with only one UPDATE_CACHE(); another example is mjpegdec.c which currently does this: GET_VLC(code, re, &s->gb, s->vlcs[1][ac_index].table, 9, 2); i += ((unsigned)code) >> 4; code &= 0xf; if (code) { if (code > MIN_CACHE_BITS - 16) UPDATE_CACHE(re, &s->gb); { int cache = GET_CACHE(re, &s->gb); int sign = (~cache) >> 31; level = (NEG_USR32(sign ^ cache,code) ^ sign) - sign; } LAST_SKIP_BITS(re, &s->gb, code); Because of the reloads in GET_VLC, there will always be at least MIN_CACHE_BITS - 9 (= 16) bits available after GET_VLC, so one can read code (<= 15) bits without updating the cache at all (16 in MIN_CACHE_BITS - 16 is the maximum length of a VLC code used here); this will no longer be possible with this optimization. Btw: I am surprised that there is a branch before UPDATE_CACHE instead of an unconditional UPDATE_CACHE. I also do not really see why this uses these macros directly and not the functions. Given my objection to your patch #1, magicyuv will not benefit from this; a different approach (see https://github.com/mkver/FFmpeg/commit/9b5a977957968c0718dea55a5b15f060ef6201dc) is to add a get_vlc() that uses the nb of bits used to create the VLC and a compile-time upper bound for the maximum length of a VLC code as parameters instead of the maximum depth of the VLC. Reading VLCs for the cached bitstream reader can btw also be improved: https://github.com/mkver/FFmpeg/commit/fba57506a9cf6be2f4aa5eeee7b10d54729fd92a - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small 2023-10-27 3:10 ` Andreas Rheinhardt @ 2023-10-27 18:38 ` Michael Niedermayer 2023-10-30 20:49 ` Andreas Rheinhardt 0 siblings, 1 reply; 9+ messages in thread From: Michael Niedermayer @ 2023-10-27 18:38 UTC (permalink / raw) To: FFmpeg development discussions and patches [-- Attachment #1.1: Type: text/plain, Size: 5428 bytes --] On Fri, Oct 27, 2023 at 05:10:32AM +0200, Andreas Rheinhardt wrote: > Michael Niedermayer: > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > > --- > > libavcodec/get_bits.h | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h > > index cfcf97c021c..86cea00494a 100644 > > --- a/libavcodec/get_bits.h > > +++ b/libavcodec/get_bits.h > > @@ -581,8 +581,12 @@ static inline const uint8_t *align_get_bits(GetBitContext *s) > > n = table[index].len; \ > > \ > > if (max_depth > 1 && n < 0) { \ > > - LAST_SKIP_BITS(name, gb, bits); \ > > - UPDATE_CACHE(name, gb); \ > > + if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { \ > > + SKIP_BITS(name, gb, bits); \ > > + } else { \ > > + LAST_SKIP_BITS(name, gb, bits); \ > > + UPDATE_CACHE(name, gb); \ > > + } \ > > \ > > nb_bits = -n; \ > > \ > > This is problematic: The GET_VLC macro does not presume that > MIN_CACHE_BITS are available; there is code that directly uses GET_VLC > instead of get_vlc2(). > > I had the same idea when I made my VLC patchset, yet I wanted to first > apply it (which I forgot). While investigating the above issue, I found > out that all users of GET_VLC always call UPDATE_CACHE immediately > before GET_VLC, so UPDATE_CACHE should be moved into GET_VLC; > furthermore, no user of GET_VLC relies on the reloads inside of GET_VLC. > The patches for this are here: > https://github.com/mkver/FFmpeg/commits/vlc Shall I send them? > > Notice that making GET_VLC more standalone enables improvements over the > current approach; yet it will not lead to optimal code: E.g. the VLCs in > decode_alpha_block() in speedhqdec.c are so short that one could read > both VLCs with only one UPDATE_CACHE(); another example is mjpegdec.c > which currently does this: > > GET_VLC(code, re, &s->gb, s->vlcs[1][ac_index].table, 9, 2); > > i += ((unsigned)code) >> 4; > code &= 0xf; > if (code) { > if (code > MIN_CACHE_BITS - 16) > UPDATE_CACHE(re, &s->gb); > > { > int cache = GET_CACHE(re, &s->gb); > int sign = (~cache) >> 31; > level = (NEG_USR32(sign ^ cache,code) ^ sign) - sign; > } > > LAST_SKIP_BITS(re, &s->gb, code); > > Because of the reloads in GET_VLC, there will always be at least > MIN_CACHE_BITS - 9 (= 16) bits available after GET_VLC, so one can read > code (<= 15) bits without updating the cache at all (16 in > MIN_CACHE_BITS - 16 is the maximum length of a VLC code used here); this > will no longer be possible with this optimization. > Btw: I am surprised that there is a branch before UPDATE_CACHE instead > of an unconditional UPDATE_CACHE. I also do not really see why this uses > these macros directly and not the functions. > > Given my objection to your patch #1, magicyuv will not benefit from > this; a different approach (see > https://github.com/mkver/FFmpeg/commit/9b5a977957968c0718dea55a5b15f060ef6201dc) > is to add a get_vlc() that uses the nb of bits used to create the VLC > and a compile-time upper bound for the maximum length of a VLC code as > parameters instead of the maximum depth of the VLC. > > Reading VLCs for the cached bitstream reader can btw also be improved: > https://github.com/mkver/FFmpeg/commit/fba57506a9cf6be2f4aa5eeee7b10d54729fd92a speaking of that, i was wondering if the alternatives we had in get_bits.h A32_BITSTREAM_READER wouldnt be worth reinvestigating especially when extended to 64bit some of these readers might perform better There are then just more bits available and fewer reads and fewer mispredicted branches for cached ones It would be somewhat nice if we could avoid having 2 different APIs as we have now with the cached and normal reader. Also the normal one with 64bit would be interresting, more available bits so fewer reads also i was wondering about a vlc reader thats entirely free of conditional branches. Just a loop that in each iteration would step by 0-n symbols forward and update a pointer to which table to use next but i dont think i will have the time to try to implement this. I have alot more ideas than i have time to try sadly, if you wan/can/or did try any of above that would be interresting thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB "You are 36 times more likely to die in a bathtub than at the hands of a terrorist. Also, you are 2.5 times more likely to become a president and 2 times more likely to become an astronaut, than to die in a terrorist attack." -- Thoughty2 [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] [-- Attachment #2: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small 2023-10-27 18:38 ` Michael Niedermayer @ 2023-10-30 20:49 ` Andreas Rheinhardt 2023-10-31 0:25 ` Michael Niedermayer 0 siblings, 1 reply; 9+ messages in thread From: Andreas Rheinhardt @ 2023-10-30 20:49 UTC (permalink / raw) To: ffmpeg-devel Michael Niedermayer: > On Fri, Oct 27, 2023 at 05:10:32AM +0200, Andreas Rheinhardt wrote: >> Michael Niedermayer: >>> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> >>> --- >>> libavcodec/get_bits.h | 8 ++++++-- >>> 1 file changed, 6 insertions(+), 2 deletions(-) >>> >>> diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h >>> index cfcf97c021c..86cea00494a 100644 >>> --- a/libavcodec/get_bits.h >>> +++ b/libavcodec/get_bits.h >>> @@ -581,8 +581,12 @@ static inline const uint8_t *align_get_bits(GetBitContext *s) >>> n = table[index].len; \ >>> \ >>> if (max_depth > 1 && n < 0) { \ >>> - LAST_SKIP_BITS(name, gb, bits); \ >>> - UPDATE_CACHE(name, gb); \ >>> + if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { \ >>> + SKIP_BITS(name, gb, bits); \ >>> + } else { \ >>> + LAST_SKIP_BITS(name, gb, bits); \ >>> + UPDATE_CACHE(name, gb); \ >>> + } \ >>> \ >>> nb_bits = -n; \ >>> \ >> >> This is problematic: The GET_VLC macro does not presume that >> MIN_CACHE_BITS are available; there is code that directly uses GET_VLC >> instead of get_vlc2(). >> >> I had the same idea when I made my VLC patchset, yet I wanted to first >> apply it (which I forgot). While investigating the above issue, I found >> out that all users of GET_VLC always call UPDATE_CACHE immediately >> before GET_VLC, so UPDATE_CACHE should be moved into GET_VLC; >> furthermore, no user of GET_VLC relies on the reloads inside of GET_VLC. >> The patches for this are here: >> https://github.com/mkver/FFmpeg/commits/vlc Shall I send them? >> >> Notice that making GET_VLC more standalone enables improvements over the >> current approach; yet it will not lead to optimal code: E.g. the VLCs in >> decode_alpha_block() in speedhqdec.c are so short that one could read >> both VLCs with only one UPDATE_CACHE(); another example is mjpegdec.c >> which currently does this: >> >> GET_VLC(code, re, &s->gb, s->vlcs[1][ac_index].table, 9, 2); >> >> i += ((unsigned)code) >> 4; >> code &= 0xf; >> if (code) { >> if (code > MIN_CACHE_BITS - 16) >> UPDATE_CACHE(re, &s->gb); >> >> { >> int cache = GET_CACHE(re, &s->gb); >> int sign = (~cache) >> 31; >> level = (NEG_USR32(sign ^ cache,code) ^ sign) - sign; >> } >> >> LAST_SKIP_BITS(re, &s->gb, code); >> >> Because of the reloads in GET_VLC, there will always be at least >> MIN_CACHE_BITS - 9 (= 16) bits available after GET_VLC, so one can read >> code (<= 15) bits without updating the cache at all (16 in >> MIN_CACHE_BITS - 16 is the maximum length of a VLC code used here); this >> will no longer be possible with this optimization. >> Btw: I am surprised that there is a branch before UPDATE_CACHE instead >> of an unconditional UPDATE_CACHE. I also do not really see why this uses >> these macros directly and not the functions. >> >> Given my objection to your patch #1, magicyuv will not benefit from >> this; a different approach (see >> https://github.com/mkver/FFmpeg/commit/9b5a977957968c0718dea55a5b15f060ef6201dc) >> is to add a get_vlc() that uses the nb of bits used to create the VLC >> and a compile-time upper bound for the maximum length of a VLC code as >> parameters instead of the maximum depth of the VLC. >> > >> Reading VLCs for the cached bitstream reader can btw also be improved: >> https://github.com/mkver/FFmpeg/commit/fba57506a9cf6be2f4aa5eeee7b10d54729fd92a > > speaking of that, i was wondering if the alternatives we had in get_bits.h > A32_BITSTREAM_READER wouldnt be worth reinvestigating > especially when extended to 64bit some of these readers might perform better > There are then just more bits available and fewer reads and fewer mispredicted > branches for cached ones > > It would be somewhat nice if we could avoid having 2 different APIs as we have > now with the cached and normal reader. > Also the normal one with 64bit would be interresting, more available bits > so fewer reads I already did something like that in b0fb8e82dde375efb8d5602f6cd479f210c1e93c. But be aware that the most important reason we read more often than we necessarily have to is not the number of bits in the cache, but that we stopped using the macros (like SKIP_CACHE, SKIP_COUNTER) directly and used functions which are more standalone (and IMO also more readable). Furthermore, creating replacements of these macros is complicated for cached bitstream readers, because of the implicit overread checks. Here is the important part of read_nz: if (n > bc->bits_valid) { if (BS_FUNC(priv_refill_32)(bc) < 0) bc->bits_valid = n; } return BS_FUNC(priv_val_get)(bc, n); priv_val_get presumes there to be enough valid bits available; otherwise bits_valid wraps around. This does not mean that there is no hope for avoiding unnecessary reads: We could add an unsigned get_bits_cached(GetBitContext *gb, int n) (basically a get_bits(), but the valid bits are the most significant bits (for BE)) and get_bits_from_cache(unsigned *cache, int n) (basically, SHOW_UBITS+SKIP_CACHE (or priv_val_get for the cached API) and that could then be used as follows: unsigned cache = get_bits_cached(gb, 4 * 4); s->mpeg_f_code[0][0] = get_bits_from_cache(&cache, 4); s->mpeg_f_code[0][1] = get_bits_from_cache(&cache, 4); s->mpeg_f_code[1][0] = get_bits_from_cache(&cache, 4); s->mpeg_f_code[1][1] = get_bits_from_cache(&cache, 4); This works because get_bits_cached() would already remove the specified number of bits from the bitstream; but this also shows its restrictions: It is only usable when the number of bits to be read subsequently is known in advance. Another potential API is to add a show_bits_cached() that is the same as show_bits(), but returns the valid bits in the most significant bits (for BE). This could then be complemented with a get_bits_from_cache() and in the end one needs to skip the number of bits actually consumed. The latter would of course need to validate/sanitize the number of bits consumed for both the cached and uncached reader. For the cached reader, doing so incurs a check that is avoided in case one uses the get_bits_cached() API outlined above (of course, this only applies to the scenarios where one can actually use said API). > > also i was wondering about a vlc reader thats entirely free of conditional > branches. Just a loop that in each iteration would step by 0-n symbols > forward and update a pointer to which table to use next Doesn't this have the downside that short symbols need as many iterations as the longest one? > but i dont think i will have the time to try to implement this. > > I have alot more ideas than i have time to try sadly, if you wan/can/or did > try any of above that would be interresting > _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small 2023-10-30 20:49 ` Andreas Rheinhardt @ 2023-10-31 0:25 ` Michael Niedermayer 0 siblings, 0 replies; 9+ messages in thread From: Michael Niedermayer @ 2023-10-31 0:25 UTC (permalink / raw) To: FFmpeg development discussions and patches [-- Attachment #1.1: Type: text/plain, Size: 2151 bytes --] On Mon, Oct 30, 2023 at 09:49:07PM +0100, Andreas Rheinhardt wrote: > Michael Niedermayer: [...] > > > > also i was wondering about a vlc reader thats entirely free of conditional > > branches. Just a loop that in each iteration would step by 0-n symbols > > forward and update a pointer to which table to use next > > Doesn't this have the downside that short symbols need as many > iterations as the longest one? i dont think so but maybe iam thinking of something else lets assume our main table is 10bits so we read 10 bits look it up in the table and that tells us what there is and lets assume these 10 bits contain 2 complete symbols and one partial first we copy from the table a block into our symbol output list (that contains 2 symbols) second we take a pointer from the table to the next table the incomplete can either now be handled in the next iteration or we could point back to the main 10bit table and only handle the 2 complete in this iteration third we move bits (10) and symbols output pointer (2) forward now we go back to the start of the loop and continue handling the partial symbol first we copy from the table a block into our symbol output list (that contains 0 symbols as our partial one still is unfinished) second we take a pointer from the table to the next table this is the next table to decode the long symbol third we move bits (10) and symbols output pointer (0) forward now we go back to the start of the loop and continue handling the partial symbol first we copy from the table a block into our symbol output list (that finally completes the long symbol so we output it) second we take a pointer from the table to the next which is back at the main table third we move bits (x) and symbols output pointer (1) forward thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User questions about the command line tools should be sent to the ffmpeg-user ML. And questions about how to use libav* should be sent to the libav-user ML. [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] [-- Attachment #2: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* [FFmpeg-devel] [PATCH 3/4] avcodec/get_bits: Implement get_vlc_multi() 2023-10-24 15:04 [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Michael Niedermayer @ 2023-10-24 15:04 ` Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 4/4] avcodec/magicyuv: Set UNCHECKED_BITSTREAM_READER Michael Niedermayer 2023-10-26 21:37 ` [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Andreas Rheinhardt 3 siblings, 0 replies; 9+ messages in thread From: Michael Niedermayer @ 2023-10-24 15:04 UTC (permalink / raw) To: FFmpeg development discussions and patches Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> --- libavcodec/get_bits.h | 61 ++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 58 insertions(+), 3 deletions(-) diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h index 86cea00494a..806e4b16f49 100644 --- a/libavcodec/get_bits.h +++ b/libavcodec/get_bits.h @@ -668,13 +668,68 @@ static av_always_inline int get_vlc2(GetBitContext *s, const VLCElem *table, return code; } -static inline int get_vlc_multi(GetBitContext *s, uint8_t *dst, +/** + * Parse a vlc / vlc_multi code. + * @param bits is the number of bits which will be read at once, must be + * identical to nb_bits in vlc_init(), should be known at compile time + * @param max_depth is the number of times bits bits must be read to completely + * read the longest vlc code + * = (max_vlc_length + bits - 1) / bits + * @param dst the parsed symbol(s) will be stored here. Up to 8 bytes are written + * @returns number of symbols parsed + * If the vlc code is invalid and max_depth=1, then no bits will be removed. + * If the vlc code is invalid and max_depth>1, then the number of bits removed + * is undefined. + */ +static inline int get_vlc_multi(GetBitContext *s, uint8_t * restrict dst, const VLC_MULTI_ELEM *const Jtable, const VLCElem *const table, const int bits, const int max_depth) { - dst[0] = get_vlc2(s, table, bits, max_depth); - return 1; + int ret, nb_bits, n; + unsigned int index; + + OPEN_READER(re, s); + UPDATE_CACHE(re, s); + + index = SHOW_UBITS(re, s, bits); + n = Jtable[index].len; + if (Jtable[index].num) { + AV_COPY64U(dst, Jtable[index].val); + ret = Jtable[index].num; + } else { + int code = table[index].sym; + + if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { + SKIP_BITS(re, s, bits); + } else { + LAST_SKIP_BITS(re, s, bits); + UPDATE_CACHE(re, s); + } + + nb_bits = -n; + + index = SHOW_UBITS(re, s, nb_bits) + code; + code = table[index].sym; + n = table[index].len; + if (max_depth > 2 && n < 0) { + LAST_SKIP_BITS(re, s, nb_bits); + UPDATE_CACHE(re, s); + + nb_bits = -n; + + index = SHOW_UBITS(re, s, nb_bits) + code; + code = table[index].sym; + n = table[index].len; + } + dst[0] = code; + ret = n > 0; + } + LAST_SKIP_BITS(re, s, n); + + CLOSE_READER(re, s); + + return ret; } static inline int decode012(GetBitContext *gb) -- 2.17.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* [FFmpeg-devel] [PATCH 4/4] avcodec/magicyuv: Set UNCHECKED_BITSTREAM_READER 2023-10-24 15:04 [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 3/4] avcodec/get_bits: Implement get_vlc_multi() Michael Niedermayer @ 2023-10-24 15:04 ` Michael Niedermayer 2023-10-26 21:37 ` [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Andreas Rheinhardt 3 siblings, 0 replies; 9+ messages in thread From: Michael Niedermayer @ 2023-10-24 15:04 UTC (permalink / raw) To: FFmpeg development discussions and patches The code already checks for the end everywhere Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> --- libavcodec/magicyuv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/libavcodec/magicyuv.c b/libavcodec/magicyuv.c index 78d7f44cd65..2ab29df2f51 100644 --- a/libavcodec/magicyuv.c +++ b/libavcodec/magicyuv.c @@ -23,6 +23,7 @@ #include <string.h> #define CACHED_BITSTREAM_READER !ARCH_X86_32 +#define UNCHECKED_BITSTREAM_READER 1 #include "libavutil/pixdesc.h" -- 2.17.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits 2023-10-24 15:04 [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Michael Niedermayer ` (2 preceding siblings ...) 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 4/4] avcodec/magicyuv: Set UNCHECKED_BITSTREAM_READER Michael Niedermayer @ 2023-10-26 21:37 ` Andreas Rheinhardt 3 siblings, 0 replies; 9+ messages in thread From: Andreas Rheinhardt @ 2023-10-26 21:37 UTC (permalink / raw) To: ffmpeg-devel Michael Niedermayer: > This will permit further optimizations > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > --- > libavcodec/magicyuv.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/libavcodec/magicyuv.c b/libavcodec/magicyuv.c > index 4a143cdbbf7..78d7f44cd65 100644 > --- a/libavcodec/magicyuv.c > +++ b/libavcodec/magicyuv.c > @@ -124,13 +124,13 @@ static void magicyuv_median_pred16(uint16_t *dst, const uint16_t *src1, > x = 0; \ > for (; x < width-c && get_bits_left(&gb) > 0;) {\ > ret = get_vlc_multi(&gb, (uint8_t *)dst + x * b, multi, \ > - vlc, vlc_bits, 3); \ > + vlc, VLC_BITS, 3); \ > if (ret <= 0) \ > return AVERROR_INVALIDDATA; \ > x += ret; \ > } \ > for (; x < width && get_bits_left(&gb) > 0; x++) \ > - dst[x] = get_vlc2(&gb, vlc, vlc_bits, 3); \ > + dst[x] = get_vlc2(&gb, vlc, VLC_BITS, 3); \ > dst += stride; \ > } > > @@ -155,7 +155,6 @@ static int magy_decode_slice10(AVCodecContext *avctx, void *tdata, > ptrdiff_t stride = p->linesize[i] / 2; > const VLC_MULTI_ELEM *const multi = s->multi[i].table; > const VLCElem *const vlc = s->vlc[i].table; > - const int vlc_bits = s->vlc[i].bits; > int flags, pred; > int ret = init_get_bits8(&gb, s->buf + s->slices[i][j].start, > s->slices[i][j].size); The VLCs are created via "ff_vlc_init_multi_from_lengths(vlc, multi, FFMIN(he[0].len, VLC_BITS), nb_elems, nb_elems,", so the number of bits of the VLC is not a compile-time constant. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-10-31 0:25 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-10-24 15:04 [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Michael Niedermayer 2023-10-27 3:10 ` Andreas Rheinhardt 2023-10-27 18:38 ` Michael Niedermayer 2023-10-30 20:49 ` Andreas Rheinhardt 2023-10-31 0:25 ` Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 3/4] avcodec/get_bits: Implement get_vlc_multi() Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 4/4] avcodec/magicyuv: Set UNCHECKED_BITSTREAM_READER Michael Niedermayer 2023-10-26 21:37 ` [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Andreas Rheinhardt
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git