Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: Anton Khirnov <anton@khirnov.net>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 10/39] lavc/ffv1dec: move the bitreader to stack
Date: Thu, 18 Jul 2024 17:31:16 +0200
Message-ID: <172131667667.21847.10057209425663694866@lain.khirnov.net> (raw)
In-Reply-To: <20240718144806.GC4991@pb2>

Quoting Michael Niedermayer (2024-07-18 16:48:06)
> On Thu, Jul 18, 2024 at 11:08:59AM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2024-07-18 00:42:05)
> > > all the stuff should be put together close so its efficiently
> > > using CPU caches
> > 
> > Which is why it shares its cacheline with PutBitContext, because the
> > code benefits from having the both in the cache, right? And the 4-byte
> > hole in PutBitContext is there presumably to aerate the cache for
> > smoother data streaming.
> 
> thanks for spoting these, can you fix these ?

I have no interest in optimizing the performance of this code. My
primary goal here is to remove FFV1-specific hacks from the frame
threading code for patch 33/39, which is in turn needed for 38/39.

As a public service, I also spent some effort on making the ffv1 code
easier to understand, but if you insist on keeping the code as it is I
can also just drop its non-compliant frame threading implementation.

> > 
> > More seriously, this is not how caches work. Being close together
> > matters mainly so long as your data fits in a cacheline, beyond that
> > physical proximity matters little. On stack, the bitreader is likely to
> > share the cacheline with other data that is currently needed, thus
> > improving cache utilization.
> 
> caches are complex, and being close does matter.
> having things in seperate allocations risks hitting aliassing cases
> (that is things that cannot be in the cache at the same time)
> so when you have the bitstream, the frame buffer, the context already
> in 3 independant locations adding a few more increases the risk for hitting
> these.
> Also sequential memory access is faster than non sequential, it does
> make sense to put things together in few places than to scatter them
> 
> Its years since ive done hardcore optimization stuff but i dont think
> the principles have changed that much that random access is faster than
> sequential and that caches work fundamentally differently

I don't see how any of these arguments are relevant - I am not moving
the bitreader to a new allocation, but to stack, which is already highly
likely to be in cache.

> > 
> > Another factor that matters in efficient cache use is e.g. not having
> > multiple copies of the same constant data scattered around, which you're
> > objecting to in my other patches.
> 
> copying the actually used small data together per slice
> where its accessed per pixel should improve teh speed per pixel while
> making the per slice code a little slower. now we have 4 slices maybe
> and millions of pixels. Thats why this can give an overall gain

This all sounds like premature optimization, AKA the root of all evil.
As I said above, I intended to make this code more readable, not faster.
Yet somehow it became faster anyway, which suggests this code is not
very optimized. So then arguing whether this or that specific change
adds or removes a few cycles per frame seems like a waste time to me.

-- 
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

  reply	other threads:[~2024-07-18 15:31 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-16 17:11 [FFmpeg-devel] [PATCH 01/39] tests/fate/vcodec: add vsynth tests for FFV1 version 2 Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 02/39] lavc/ffv1dec: declare loop variables in the loop where possible Anton Khirnov
2024-07-24 18:22   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 03/39] lavc/ffv1dec: simplify slice index calculation Anton Khirnov
2024-07-24 18:24   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 04/39] lavc/ffv1dec: drop FFV1Context.cur Anton Khirnov
2024-07-24 18:27   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 05/39] lavc/ffv1dec: drop a pointless variable in decode_slice() Anton Khirnov
2024-07-24 18:58   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 06/39] lavc/ffv1dec: move copy_fields() under HAVE_THREADS Anton Khirnov
2024-07-24 18:58   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 07/39] lavc/ffv1: add a per-slice context Anton Khirnov
2024-07-24 19:01   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 08/39] lavc/ffv1: move sample_buffer to the " Anton Khirnov
2024-07-24 19:04   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 09/39] lavc/ffv1: move run_index " Anton Khirnov
2024-07-17 22:49   ` Michael Niedermayer
2024-07-18 15:36     ` Anton Khirnov
2024-07-18 17:41       ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 10/39] lavc/ffv1dec: move the bitreader to stack Anton Khirnov
2024-07-17 22:42   ` Michael Niedermayer
2024-07-18  9:08     ` Anton Khirnov
2024-07-18 14:48       ` Michael Niedermayer
2024-07-18 15:31         ` Anton Khirnov [this message]
2024-07-18 15:35           ` Paul B Mahol
2024-07-18 18:18           ` Michael Niedermayer
2024-07-20 12:15             ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 11/39] lavc/ffv1enc: move bit writer to per-slice context Anton Khirnov
2024-07-24 19:07   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 12/39] lavc/ffv1: drop redundant FFV1Context.quant_table Anton Khirnov
2024-07-17 22:37   ` Michael Niedermayer
2024-07-17 23:24     ` James Almer
2024-07-18  8:22     ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 13/39] lavc/ffv1: drop redundant PlaneContext.quant_table Anton Khirnov
2024-07-17 22:32   ` Michael Niedermayer
2024-07-18  8:20     ` Anton Khirnov
2024-07-18 14:31       ` Michael Niedermayer
2024-07-18 15:14         ` Anton Khirnov
2024-07-18 17:03           ` Michael Niedermayer
2024-07-18 15:31       ` Paul B Mahol
2024-07-18 15:43         ` Anton Khirnov
2024-07-18 15:47           ` Paul B Mahol
2024-07-18 17:40       ` Michael Niedermayer
2024-07-20  9:22         ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 14/39] lavc/ffv1: drop write-only PlaneContext.interlace_bit_state Anton Khirnov
2024-07-24 19:12   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 15/39] lavc/ffv1: always use the main context values of plane_count/transparency Anton Khirnov
2024-07-24 19:15   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 16/39] lavc/ffv1: move FFV1Context.slice_{coding_mode, rct_.y_coef} to per-slice context Anton Khirnov
2024-07-24 19:16   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 17/39] lavc/ffv1: always use the main context values of ac Anton Khirnov
2024-07-24 19:23   ` Michael Niedermayer
2024-07-31  8:33     ` [FFmpeg-devel] [PATCH v2 " Anton Khirnov
2024-07-31 12:20       ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 18/39] lavc/ffv1: move FFV1Context.plane to per-slice context Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 19/39] lavc/ffv1: move RangeCoder " Anton Khirnov
2024-07-24 19:28   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 20/39] lavc/ffv1enc: store per-slice rc_stat(2?) in FFV1SliceContext Anton Khirnov
2024-07-24 19:30   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 21/39] lavc/ffv1: move ac_byte_count to per-slice context Anton Khirnov
2024-07-24 19:31   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 22/39] lavc/ffv1enc: stop using per-slice FFV1Context Anton Khirnov
2024-07-24 19:42   ` Michael Niedermayer
2024-07-31  8:50     ` [FFmpeg-devel] [PATCH v2 " Anton Khirnov
2024-07-31 12:32       ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 23/39] lavc/ffv1dec: move slice_reset_contexts to per-slice context Anton Khirnov
2024-07-24 19:44   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 24/39] lavc/ffv1dec: move slice_damaged " Anton Khirnov
2024-07-24 19:45   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 25/39] lavc/ffv1dec: stop using per-slice FFV1Context Anton Khirnov
2024-07-24 19:48   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 26/39] lavc/ffv1dec: inline copy_fields() into update_thread_context() Anton Khirnov
2024-07-24 19:48   ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 27/39] lavc/ffv1: change FFV1SliceContext.plane into a RefStruct object Anton Khirnov
2024-07-24 19:53   ` Michael Niedermayer
2024-08-01  8:17   ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 28/39] lavc/ffv1dec: fix races in accessing FFV1SliceContext.slice_damaged Anton Khirnov
2024-07-17 20:51   ` Michael Niedermayer
2024-07-22  9:43     ` [FFmpeg-devel] [PATCH 1/3] lavc/ffv1dec: drop code handling AV_PIX_FMT_FLAG_PAL Anton Khirnov
2024-07-22  9:43       ` [FFmpeg-devel] [PATCH 2/3] lavc/ffv1: move damage handling code to decode_slice() Anton Khirnov
2024-07-22 21:14         ` Michael Niedermayer
2024-07-23  6:52           ` Anton Khirnov
2024-07-23 20:14             ` Michael Niedermayer
2024-07-23 21:02               ` Anton Khirnov
2024-07-22  9:43       ` [FFmpeg-devel] [PATCH 3/3] lavc/ffv1dec: fix races in accessing FFV1SliceContext.slice_damaged Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 29/39] lavc/thread: move generic-layer API to avcodec_internal.h Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 30/39] lavc/internal: document the precise meaning of AVCodecInternal.draining Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 31/39] lavc/decode: wrap AV_FRAME_FLAG_DISCARD handling in a loop Anton Khirnov
2024-07-17 21:20   ` Michael Niedermayer
2024-07-18  8:14     ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 32/39] lavc/decode: reindent Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 33/39] lavc: convert frame threading to the receive_frame() pattern Anton Khirnov
2024-07-24 18:44   ` Michael Niedermayer
2024-07-31 11:26     ` Anton Khirnov
2024-07-31 12:59       ` Michael Niedermayer
2024-08-01 14:33         ` [FFmpeg-devel] [PATCH] lavc/ffv1dec: fix races in accessing FFV1SliceContext.slice_damaged Anton Khirnov
2024-08-06  4:39           ` Anton Khirnov
2024-08-09 21:26           ` Michael Niedermayer
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 34/39] lavc/decode: reindent after previous commit Anton Khirnov
2024-08-12 12:49   ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 35/39] lavc/hevcdec: switch to receive_frame() Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 36/39] lavc: add private container FIFO API Anton Khirnov
2024-08-10  0:09   ` Andreas Rheinhardt
2024-08-12 12:14     ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 37/39] lavc/hevcdec: use a ContainerFifo to hold frames scheduled for output Anton Khirnov
2024-08-09 23:52   ` Andreas Rheinhardt
2024-08-12 12:28     ` Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 38/39] lavc/hevcdec: simplify output logic Anton Khirnov
2024-07-16 17:11 ` [FFmpeg-devel] [PATCH 39/39] lavc/hevcdec: call ff_thread_finish_setup() even if hwaccel is in use Anton Khirnov
2024-07-24 18:20 ` [FFmpeg-devel] [PATCH 01/39] tests/fate/vcodec: add vsynth tests for FFV1 version 2 Michael Niedermayer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=172131667667.21847.10057209425663694866@lain.khirnov.net \
    --to=anton@khirnov.net \
    --cc=ffmpeg-devel@ffmpeg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git