From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 10/39] lavc/ffv1dec: move the bitreader to stack
Date: Thu, 18 Jul 2024 16:48:06 +0200
Message-ID: <20240718144806.GC4991@pb2>
In-Reply-To: <172129373901.21847.12392144255635795802@lain.khirnov.net>

On Thu, Jul 18, 2024 at 11:08:59AM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2024-07-18 00:42:05)
> > all the stuff should be put together close so its efficiently
> > using CPU caches
>
> Which is why it shares its cacheline with PutBitContext, because the
> code benefits from having the both in the cache, right? And the 4-byte
> hole in PutBitContext is there presumably to aerate the cache for
> smoother data streaming.

Thanks for spotting these, can you fix them?

>
> More seriously, this is not how caches work. Being close together
> matters mainly so long as your data fits in a cacheline, beyond that
> physical proximity matters little. On stack, the bitreader is likely to
> share the cacheline with other data that is currently needed, thus
> improving cache utilization.

Caches are complex, and being close does matter.
Having things in separate allocations risks hitting aliasing cases
(that is, things that cannot be in the cache at the same time),
so when you already have the bitstream, the frame buffer and the context
in 3 independent locations, adding a few more increases the risk of
hitting these.

Also, sequential memory access is faster than non-sequential access, so it
makes more sense to put things together in a few places than to scatter them.

It's been years since I've done hardcore optimization work, but I don't think
the principles have changed so much that random access is now faster than
sequential access or that caches work fundamentally differently.

>
> Another factor that matters in efficient cache use is e.g. not having
> multiple copies of the same constant data scattered around, which you're
> objecting to in my other patches.

Copying the small data that is actually used together, per slice, where it is
accessed per pixel should improve the per-pixel speed while making the
per-slice code a little slower. We have maybe 4 slices and millions of
pixels. That's why this can give an overall gain.

thx

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire
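To make the aliasing concern above concrete, here is a minimal sketch in C. It assumes a 32 KiB, 8-way set-associative cache with 64-byte lines; that geometry and the helper names (`cache_set_index`, `same_cache_line`, `may_alias`) are illustrative assumptions and do not come from the thread or from FFmpeg. Real caches may be indexed by physical rather than virtual address, so treat this as a model only.

```c
#include <stdint.h>

/* Assumed cache geometry (hypothetical, chosen for illustration only). */
enum {
    LINE_SIZE    = 64,                                /* bytes per cache line */
    NUM_WAYS     = 8,                                 /* lines per set        */
    CACHE_SIZE   = 32 * 1024,                         /* total bytes          */
    NUM_SETS     = CACHE_SIZE / (LINE_SIZE * NUM_WAYS), /* 64 sets            */
    ALIAS_STRIDE = LINE_SIZE * NUM_SETS               /* 4096 bytes           */
};

/* Set that an address maps to under the assumed geometry. */
static inline unsigned cache_set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/* True if both addresses fall into the same cache line. */
static inline int same_cache_line(uintptr_t a, uintptr_t b)
{
    return a / LINE_SIZE == b / LINE_SIZE;
}

/*
 * True if the two addresses sit in different lines that compete for the
 * same set. More than NUM_WAYS such lines cannot be cached simultaneously.
 */
static inline int may_alias(uintptr_t a, uintptr_t b)
{
    return !same_cache_line(a, b) && cache_set_index(a) == cache_set_index(b);
}
```

Under these assumptions, two addresses whose distance is a multiple of `ALIAS_STRIDE` (4096 bytes) compete for the same set, so a ninth independently allocated object landing on that stride would evict one of the other eight. Data packed into one contiguous block spreads across consecutive sets instead.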