From: Anton Khirnov <anton@khirnov.net>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 13/39] lavc/ffv1: drop redundant PlaneContext.quant_table
Date: Sat, 20 Jul 2024 11:22:43 +0200
Message-ID: <172146736349.21847.12937616463646199218@lain.khirnov.net>
In-Reply-To: <20240718174004.GE4991@pb2>

Quoting Michael Niedermayer (2024-07-18 19:40:04)
> On Thu, Jul 18, 2024 at 10:20:09AM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2024-07-18 00:32:38)
> > > the data for each decoder task should be together and not scattered around
> > > more than needed, reducing cache efficiency
> > >
> > > putting all this extra code in the inner per pixel loop is not ok
> > > especially not for the sake of avoiding a memcpy of a few hundred bytes multiple levels of loops outside
> >
> > A nice theory, but in practice this patchset makes single-threaded
> > decoding about 4% faster overall, on a 1920x1080 10bit sample. That's
> > just the ffv1 parts (up to patch 28), full set also improves frame
> > threading performance as follows:
> > threads   improvement
> > ---------------------------
> > 2         52% (yes really)
> > 4         16%
> > 8         12%
>
> I do want the speed improvements, yes.
>
> But
> you compare frame threading when slice threading performed
> much better than frame threading prior to the patch

If that were true in general, there'd be no reason for frame threading
support in ffv1, as it has a higher latency and uses more memory; higher
performance is its only advantage.

However you added frame threading in
a0c0900e470fde0d6db360e555620476c2323895 claiming it is faster, which I
can partially confirm even with current master - slice threading
saturates at thread count = slice count, while frame threading scales
beyond it.
Frame threading also improves significantly after this set:

threads | slice   | frame/before | frame/after
-----------------------------------------------
 2        22.6124   43.738         22.0354
 4        14.3367   15.115         13.1964
 6        14.3850   11.974         10.9745
 8        14.3472   9.7229         8.76617
10        14.3579   8.4638         8.6499
12        14.3665   8.4636         8.5735
16        14.2960   7.6926         7.1696
-----------------------------------------------
(values are total decode time in seconds)

Note that after this set frame threading is ALWAYS faster than slice
threading, for any thread count.

> also I'd like to see the individual changes which look like they should
> make the code slower, to be tested individually. If they make the code slower
> they should be dropped

I don't think it's meaningful to individually benchmark the patches
moving per-slice data into the new per-slice context. I split them to
simplify testing and review, but it only makes sense to apply all of
them or none, otherwise the code gets more complex.

-- 
Anton Khirnov