From: Anton Khirnov <anton@khirnov.net>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 10/39] lavc/ffv1dec: move the bitreader to stack
Date: Thu, 18 Jul 2024 17:31:16 +0200
Message-ID: <172131667667.21847.10057209425663694866@lain.khirnov.net>
In-Reply-To: <20240718144806.GC4991@pb2>

Quoting Michael Niedermayer (2024-07-18 16:48:06)
> On Thu, Jul 18, 2024 at 11:08:59AM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2024-07-18 00:42:05)
> > > all the stuff should be put together close so it's efficiently
> > > using CPU caches
> >
> > Which is why it shares its cacheline with PutBitContext, because the
> > code benefits from having both in the cache, right? And the 4-byte
> > hole in PutBitContext is there presumably to aerate the cache for
> > smoother data streaming.
>
> thanks for spotting these, can you fix these?

I have no interest in optimizing the performance of this code. My
primary goal here is to remove FFV1-specific hacks from the frame
threading code for patch 33/39, which is in turn needed for 38/39.

As a public service, I also spent some effort on making the ffv1 code
easier to understand, but if you insist on keeping the code as it is I
can also just drop its non-compliant frame threading implementation.

> >
> > More seriously, this is not how caches work. Being close together
> > matters mainly so long as your data fits in a cacheline, beyond that
> > physical proximity matters little. On stack, the bitreader is likely to
> > share the cacheline with other data that is currently needed, thus
> > improving cache utilization.
>
> caches are complex, and being close does matter.
> having things in separate allocations risks hitting aliasing cases
> (that is things that cannot be in the cache at the same time)
> so when you have the bitstream, the frame buffer, the context already
> in 3 independent locations adding a few more increases the risk for hitting
> these.
> Also sequential memory access is faster than non sequential, it does
> make sense to put things together in few places than to scatter them
>
> It's years since I've done hardcore optimization stuff but I don't think
> the principles have changed that much that random access is faster than
> sequential and that caches work fundamentally differently

I don't see how any of these arguments are relevant - I am not moving
the bitreader to a new allocation, but to stack, which is already highly
likely to be in cache.

> >
> > Another factor that matters in efficient cache use is e.g. not having
> > multiple copies of the same constant data scattered around, which you're
> > objecting to in my other patches.
>
> copying the actually used small data together per slice
> where it's accessed per pixel should improve the speed per pixel while
> making the per slice code a little slower. now we have 4 slices maybe
> and millions of pixels. That's why this can give an overall gain

This all sounds like premature optimization, AKA the root of all evil.
As I said above, I intended to make this code more readable, not faster.
Yet somehow it became faster anyway, which suggests this code is not
very optimized. So then arguing whether this or that specific change
adds or removes a few cycles per frame seems like a waste of time to me.

-- 
Anton Khirnov
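The two layout points raised in this message, a padding hole inside a small context struct and whether an object fits in one cacheline, can be checked with a few lines of C. What follows is only a minimal sketch under stated assumptions: `DemoBitWriter` and every `demo_*` function are hypothetical names invented for illustration, the struct is not FFmpeg's actual PutBitContext, and the 64-byte line size used in the printout is an assumption about the target CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit-writer state for illustration only; this is NOT
 * FFmpeg's PutBitContext, and the field names are assumptions. */
typedef struct DemoBitWriter {
    uint64_t bit_buf;   /* pending bits */
    int32_t  bit_left;  /* free bits left in bit_buf */
    /* where pointers need 8-byte alignment, padding is inserted here */
    uint8_t *buf;
    uint8_t *buf_ptr;
    uint8_t *buf_end;
} DemoBitWriter;

/* Bytes of padding between bit_left and buf (depends on the ABI). */
size_t demo_hole_after_bit_left(void)
{
    return offsetof(DemoBitWriter, buf)
         - (offsetof(DemoBitWriter, bit_left) + sizeof(int32_t));
}

/* Number of cache lines of `line` bytes touched by an object of `size`
 * bytes that starts `offset` bytes from a line-aligned base address. */
size_t demo_cachelines_spanned(size_t offset, size_t size, size_t line)
{
    assert(line > 0);
    if (size == 0)
        return 0;
    return (offset + size - 1) / line - offset / line + 1;
}

/* Print the layout as seen by the current compiler and ABI. */
void demo_print_layout(void)
{
    printf("sizeof(DemoBitWriter) = %zu\n", sizeof(DemoBitWriter));
    printf("hole after bit_left   = %zu bytes\n", demo_hole_after_bit_left());
    printf("64-byte lines spanned at offset 0 = %zu\n",
           demo_cachelines_spanned(0, sizeof(DemoBitWriter), 64));
}
```

Calling `demo_print_layout()` from a `main()` shows the numbers for a given build. On a typical LP64 target one would expect a 4-byte hole after `bit_left`, but the result depends on the compiler and ABI, so it is better measured than assumed. A tool such as pahole reports the same information for real structs.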