From: Michael Niedermayer <michael@niedermayer.cc> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Subject: Re: [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Date: Fri, 27 Oct 2023 20:38:14 +0200 Message-ID: <20231027183814.GW3543730@pb2> (raw) In-Reply-To: <AS8P250MB0744FA16EEE8F6C73B5E94B28FDCA@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM> [-- Attachment #1.1: Type: text/plain, Size: 5428 bytes --] On Fri, Oct 27, 2023 at 05:10:32AM +0200, Andreas Rheinhardt wrote: > Michael Niedermayer: > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > > --- > > libavcodec/get_bits.h | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h > > index cfcf97c021c..86cea00494a 100644 > > --- a/libavcodec/get_bits.h > > +++ b/libavcodec/get_bits.h > > @@ -581,8 +581,12 @@ static inline const uint8_t *align_get_bits(GetBitContext *s) > > n = table[index].len; \ > > \ > > if (max_depth > 1 && n < 0) { \ > > - LAST_SKIP_BITS(name, gb, bits); \ > > - UPDATE_CACHE(name, gb); \ > > + if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { \ > > + SKIP_BITS(name, gb, bits); \ > > + } else { \ > > + LAST_SKIP_BITS(name, gb, bits); \ > > + UPDATE_CACHE(name, gb); \ > > + } \ > > \ > > nb_bits = -n; \ > > \ > > This is problematic: The GET_VLC macro does not presume that > MIN_CACHE_BITS are available; there is code that directly uses GET_VLC > instead of get_vlc2(). > > I had the same idea when I made my VLC patchset, yet I wanted to first > apply it (which I forgot). While investigating the above issue, I found > out that all users of GET_VLC always call UPDATE_CACHE immediately > before GET_VLC, so UPDATE_CACHE should be moved into GET_VLC; > furthermore, no user of GET_VLC relies on the reloads inside of GET_VLC. > The patches for this are here: > https://github.com/mkver/FFmpeg/commits/vlc Shall I send them? > > Notice that making GET_VLC more standalone enables improvements over the > current approach; yet it will not lead to optimal code: E.g. the VLCs in > decode_alpha_block() in speedhqdec.c are so short that one could read > both VLCs with only one UPDATE_CACHE(); another example is mjpegdec.c > which currently does this: > > GET_VLC(code, re, &s->gb, s->vlcs[1][ac_index].table, 9, 2); > > i += ((unsigned)code) >> 4; > code &= 0xf; > if (code) { > if (code > MIN_CACHE_BITS - 16) > UPDATE_CACHE(re, &s->gb); > > { > int cache = GET_CACHE(re, &s->gb); > int sign = (~cache) >> 31; > level = (NEG_USR32(sign ^ cache,code) ^ sign) - sign; > } > > LAST_SKIP_BITS(re, &s->gb, code); > > Because of the reloads in GET_VLC, there will always be at least > MIN_CACHE_BITS - 9 (= 16) bits available after GET_VLC, so one can read > code (<= 15) bits without updating the cache at all (16 in > MIN_CACHE_BITS - 16 is the maximum length of a VLC code used here); this > will no longer be possible with this optimization. > Btw: I am surprised that there is a branch before UPDATE_CACHE instead > of an unconditional UPDATE_CACHE. I also do not really see why this uses > these macros directly and not the functions. > > Given my objection to your patch #1, magicyuv will not benefit from > this; a different approach (see > https://github.com/mkver/FFmpeg/commit/9b5a977957968c0718dea55a5b15f060ef6201dc) > is to add a get_vlc() that uses the nb of bits used to create the VLC > and a compile-time upper bound for the maximum length of a VLC code as > parameters instead of the maximum depth of the VLC. > > Reading VLCs for the cached bitstream reader can btw also be improved: > https://github.com/mkver/FFmpeg/commit/fba57506a9cf6be2f4aa5eeee7b10d54729fd92a speaking of that, i was wondering if the alternatives we had in get_bits.h A32_BITSTREAM_READER wouldnt be worth reinvestigating especially when extended to 64bit some of these readers might perform better There are then just more bits available and fewer reads and fewer mispredicted branches for cached ones It would be somewhat nice if we could avoid having 2 different APIs as we have now with the cached and normal reader. Also the normal one with 64bit would be interresting, more available bits so fewer reads also i was wondering about a vlc reader thats entirely free of conditional branches. Just a loop that in each iteration would step by 0-n symbols forward and update a pointer to which table to use next but i dont think i will have the time to try to implement this. I have alot more ideas than i have time to try sadly, if you wan/can/or did try any of above that would be interresting thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB "You are 36 times more likely to die in a bathtub than at the hands of a terrorist. Also, you are 2.5 times more likely to become a president and 2 times more likely to become an astronaut, than to die in a terrorist attack." -- Thoughty2 [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] [-- Attachment #2: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2023-10-27 18:38 UTC|newest] Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-10-24 15:04 [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 2/4] avcodec/get_bits: Avoid 2nd bitstream read in GET_VLC() if bits are known at build and small Michael Niedermayer 2023-10-27 3:10 ` Andreas Rheinhardt 2023-10-27 18:38 ` Michael Niedermayer [this message] 2023-10-30 20:49 ` Andreas Rheinhardt 2023-10-31 0:25 ` Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 3/4] avcodec/get_bits: Implement get_vlc_multi() Michael Niedermayer 2023-10-24 15:04 ` [FFmpeg-devel] [PATCH 4/4] avcodec/magicyuv: Set UNCHECKED_BITSTREAM_READER Michael Niedermayer 2023-10-26 21:37 ` [FFmpeg-devel] [PATCH 1/4] avcodec/magicyuv: Use a compile time constant for vlc_bits Andreas Rheinhardt
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20231027183814.GW3543730@pb2 \ --to=michael@niedermayer.cc \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git