Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Devin Heitmueller <devin.heitmueller@ltnglobal.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance
Date: Fri, 12 May 2023 11:26:44 -0400
Message-ID: <CAHGibzF0gg0Tu0bpdOs71=y=5Nmg+MHJZya315B93TyQJ6SqDQ@mail.gmail.com> (raw)
In-Reply-To: <f3587e5-b42d-de1-e37-c16825991da0@passwd.hu>

On Thu, May 11, 2023 at 6:20 PM Marton Balint <cus@passwd.hu> wrote:
> Actually the cached bitstream reader was faster here than the manual
> approach:
>
> ./ffmpeg -stream_loop 128 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none -loglevel error
>
> Old code:
>
> 821050920 decicycles in bitpacked,       1 runs,      0 skips
> 815402160 decicycles in bitpacked,       2 runs,      0 skips
> 814108410 decicycles in bitpacked,       4 runs,      0 skips
> 814213800 decicycles in bitpacked,       8 runs,      0 skips
> 815048325 decicycles in bitpacked,      16 runs,      0 skips
> 812866713 decicycles in bitpacked,      32 runs,      0 skips
> 809186523 decicycles in bitpacked,      64 runs,      0 skips
> 808317601 decicycles in bitpacked,     128 runs,      0 skips
>
> With the patch:
>
> 379879920 decicycles in bitpacked,       1 runs,      0 skips
> 387491580 decicycles in bitpacked,       2 runs,      0 skips
> 397720260 decicycles in bitpacked,       4 runs,      0 skips
> 389581560 decicycles in bitpacked,       8 runs,      0 skips
> 381820635 decicycles in bitpacked,      16 runs,      0 skips
> 379791675 decicycles in bitpacked,      32 runs,      0 skips
> 379246303 decicycles in bitpacked,      64 runs,      0 skips
> 379221671 decicycles in bitpacked,     128 runs,      0 skips
>
> Old code and #defined CACHED_BITSTREAM_READER 1
>
> 345122280 decicycles in bitpacked,       1 runs,      0 skips
> 343663020 decicycles in bitpacked,       2 runs,      0 skips
> 343372680 decicycles in bitpacked,       4 runs,      0 skips
> 342554535 decicycles in bitpacked,       8 runs,      0 skips
> 340816522 decicycles in bitpacked,      16 runs,      0 skips
> 340225672 decicycles in bitpacked,      32 runs,      0 skips
> 340283520 decicycles in bitpacked,      64 runs,      0 skips
> 339643105 decicycles in bitpacked,     128 runs,      0 skips
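
For reference, the 128-run averages quoted above work out to a speedup of roughly 2.13x for the patch and roughly 2.38x for the old code with the cached reader, both relative to the old code. A small sketch of that arithmetic follows; the helper name speedup() and the constant names are made up for illustration, and the numbers are copied from the quoted output.

```c
#include <stdint.h>

/* 128-run averages copied from the quoted benchmark output (decicycles). */
static const uint64_t OLD_CODE_DECICYCLES      = 808317601ULL;
static const uint64_t PATCHED_DECICYCLES       = 379221671ULL;
static const uint64_t OLD_CACHED_DECICYCLES    = 339643105ULL;

/* Hypothetical helper: how many times faster `candidate` is than `baseline`.
 * Both arguments are cycle counts, so a smaller candidate gives a ratio > 1. */
static double speedup(uint64_t baseline, uint64_t candidate)
{
    return (double)baseline / (double)candidate;
}
```

Calling speedup(OLD_CODE_DECICYCLES, PATCHED_DECICYCLES) and speedup(OLD_CODE_DECICYCLES, OLD_CACHED_DECICYCLES) gives the two ratios mentioned above.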

I don't have a good explanation for this.  I could speculate that some
of it comes down to the processor architecture, how much onboard cache
it has, gcc version (and what sort of optimization/vectorization it
does, if any), etc.  In my case I was testing on Haswell and Skylake
(both with 12MB cache) with gcc 4.8.

I would welcome feedback from others.

Looking at the code in libavcodec/get_bits.h, it might also be worth
trying #define LONG_BITSTREAM_READER, as that might speed
things up as well for such large files.
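
To make the idea of a cached reader concrete, here is a rough, self-contained sketch of one unpacking 10-bit samples. This is not the code from get_bits.h or from the patch: the names CachedBits, cb_init, cb_refill, cb_read and unpack10 are hypothetical, and it assumes the samples are packed MSB-first with no padding between them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached bit reader, for illustration only. */
typedef struct {
    const uint8_t *buf;   /* input bytes */
    size_t size;          /* input length in bytes */
    size_t pos;           /* next byte to load into the cache */
    uint64_t cache;       /* unread bits, left-aligned (MSB first) */
    unsigned bits_left;   /* number of valid bits in the cache */
} CachedBits;

static void cb_init(CachedBits *s, const uint8_t *buf, size_t size)
{
    s->buf = buf;
    s->size = size;
    s->pos = 0;
    s->cache = 0;
    s->bits_left = 0;
}

/* Top up the cache a byte at a time while it has room for 8 more bits. */
static void cb_refill(CachedBits *s)
{
    while (s->bits_left <= 56 && s->pos < s->size) {
        s->cache |= (uint64_t)s->buf[s->pos++] << (56 - s->bits_left);
        s->bits_left += 8;
    }
}

/* Read n bits (1..32). Reads past the end of the input return zero bits. */
static uint32_t cb_read(CachedBits *s, unsigned n)
{
    uint32_t v;

    assert(n >= 1 && n <= 32);
    if (s->bits_left < n)
        cb_refill(s);
    v = (uint32_t)(s->cache >> (64 - n));
    s->cache <<= n;
    s->bits_left = s->bits_left > n ? s->bits_left - n : 0;
    return v;
}

/* Unpack `count` consecutive 10-bit samples into 16-bit words. */
static void unpack10(const uint8_t *src, size_t src_size,
                     uint16_t *dst, size_t count)
{
    CachedBits s;

    cb_init(&s, src, src_size);
    for (size_t i = 0; i < count; i++)
        dst[i] = (uint16_t)cb_read(&s, 10);
}
```

The point of the cache is that most reads only touch the 64-bit register, and memory is accessed in the refill step alone. Whether that beats a hand-unrolled unpack loop presumably depends on the compiler and CPU, which would be consistent with the differing results above.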

Devin

-- 
Devin Heitmueller, Senior Software Engineer
LTN Global Communications
o: +1 (301) 363-1001
w: https://ltnglobal.com  e: devin.heitmueller@ltnglobal.com

