Date: Thu, 28 Dec 2023 21:42:26 +0100 (CET)
From: Marton Balint
To: FFmpeg development discussions and patches
In-Reply-To: <20231213195820.21046-1-cus@passwd.hu>
Message-ID: <4f48be94-ee2-7455-4793-c3f6d6921b21@passwd.hu>
References: <20231213195820.21046-1-cus@passwd.hu>
Subject: Re: [FFmpeg-devel] [PATCH] avcodec/bitpacked_dec: optimize bitpacked_decode_yuv422p10

On Wed, 13 Dec 2023, Marton Balint wrote:

> From: Devin Heitmueller
>
> Rework the code a bit to speed up the 10-bit bitpacked decoding
> routine. This is probably about as fast as I can get it without
> switching to assembly language.
>
> Demonstratable with:
>
> ./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -f image2 -frames:v 1 source.yuv
> ./ffmpeg -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le out.yuv
>
> On my development system, it went from 80ms for a 2160p frame
> down to 20ms (i.e. a 4X speedup). Good enough for now, I hope...
>
> Comments from Marton:
>
> Originally on my system better performance could be achieved by simply
> switching to the cached bitstream reader, but for Devin it was slower than
> his direct byte operations.
>
> I changed the order of writing output from u/y/v/y to u/v/y/y, and that made
> the code faster than the cached bitstream reader on my system as well.
>
> TIMER measurement of the decode loop on Ryzen 5 3600 with command line:
>
> ./ffmpeg -stream_loop 256 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none -loglevel error
>
> Before: 823204127 decicycles in YUV, 256 runs, 0 skips
> After: 315070524 decicycles in YUV, 256 runs, 0 skips
>
> Signed-off-by: Devin Heitmueller
> Signed-off-by: Marton Balint
> ---
>  libavcodec/bitpacked_dec.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)

Will apply.

Regards,
Marton

>
> diff --git a/libavcodec/bitpacked_dec.c b/libavcodec/bitpacked_dec.c
> index c88f861993..54c008bd86 100644
> --- a/libavcodec/bitpacked_dec.c
> +++ b/libavcodec/bitpacked_dec.c
> @@ -28,7 +28,6 @@
>
>  #include "avcodec.h"
>  #include "codec_internal.h"
> -#include "get_bits.h"
>  #include "libavutil/imgutils.h"
>  #include "thread.h"
>
> @@ -65,7 +64,7 @@ static int bitpacked_decode_yuv422p10(AVCodecContext *avctx, AVFrame *frame,
>  {
>      uint64_t frame_size = (uint64_t)avctx->width * (uint64_t)avctx->height * 20;
>      uint64_t packet_size = (uint64_t)avpkt->size * 8;
> -    GetBitContext bc;
> +    uint8_t *src;
>      uint16_t *y, *u, *v;
>      int ret, i, j;
>
> @@ -79,20 +78,18 @@ static int bitpacked_decode_yuv422p10(AVCodecContext *avctx, AVFrame *frame,
>      if (avctx->width % 2)
>          return AVERROR_PATCHWELCOME;
>
> -    ret = init_get_bits(&bc, avpkt->data, avctx->width * avctx->height * 20);
> -    if (ret)
> -        return ret;
> -
> +    src = avpkt->data;
>      for (i = 0; i < avctx->height; i++) {
>          y = (uint16_t*)(frame->data[0] + i * frame->linesize[0]);
>          u = (uint16_t*)(frame->data[1] + i * frame->linesize[1]);
>          v = (uint16_t*)(frame->data[2] + i * frame->linesize[2]);
>
>          for (j = 0; j < avctx->width; j += 2) {
> -            *u++ = get_bits(&bc, 10);
> -            *y++ = get_bits(&bc, 10);
> -            *v++ = get_bits(&bc, 10);
> -            *y++ = get_bits(&bc, 10);
> +            *u++ = (src[0] << 2) | (src[1] >> 6);
> +            *v++ = ((src[2] << 6) | (src[3] >> 2)) & 0x3ff;
> +            *y++ = ((src[1] << 4) | (src[2] >> 4)) & 0x3ff;
> +            *y++ = ((src[3] << 8) | (src[4])) & 0x3ff;
> +            src += 5;
>          }
>      }
>
> --
> 2.35.3
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
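
For readers following the byte arithmetic in the quoted hunk: each pair of pixels occupies 5 bytes (40 bits) holding four 10-bit samples, MSB first, in the stream order U, Y0, V, Y1 that the removed get_bits() calls read. Below is a minimal standalone sketch of that layout. The shift-and-mask expressions mirror the patch, but the function names unpack_group and pack_group are hypothetical helpers for illustration only and are not part of FFmpeg; pack_group is simply an assumed inverse, added so the unpacking can be checked by a round trip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg API: unpack one 5-byte group into four
 * 10-bit samples. The bitstream order is U, Y0, V, Y1, MSB first. */
static void unpack_group(const uint8_t src[5],
                         uint16_t *u, uint16_t *y0,
                         uint16_t *v, uint16_t *y1)
{
    /* U: all 8 bits of byte 0 plus the top 2 bits of byte 1 (no mask needed). */
    *u  =  (src[0] << 2) | (src[1] >> 6);
    /* Y0: low 6 bits of byte 1 plus the top 4 bits of byte 2. */
    *y0 = ((src[1] << 4) | (src[2] >> 4)) & 0x3ff;
    /* V: low 4 bits of byte 2 plus the top 6 bits of byte 3. */
    *v  = ((src[2] << 6) | (src[3] >> 2)) & 0x3ff;
    /* Y1: low 2 bits of byte 3 plus all 8 bits of byte 4. */
    *y1 = ((src[3] << 8) |  src[4])       & 0x3ff;
}

/* Hypothetical inverse, for testing only: pack four 10-bit samples
 * into 5 bytes using the same layout. */
static void pack_group(uint8_t dst[5],
                       uint16_t u, uint16_t y0, uint16_t v, uint16_t y1)
{
    assert(u < 1024 && y0 < 1024 && v < 1024 && y1 < 1024);
    dst[0] = (uint8_t)(u >> 2);
    dst[1] = (uint8_t)(((u  & 0x03) << 6) | (y0 >> 4));
    dst[2] = (uint8_t)(((y0 & 0x0f) << 4) | (v  >> 6));
    dst[3] = (uint8_t)(((v  & 0x3f) << 2) | (y1 >> 8));
    dst[4] = (uint8_t)(y1 & 0xff);
}
```

The patch performs the same four extractions but stores U and V before the two Y samples, which only changes the order of the writes, not which bits each sample is built from. At 5 bytes per pixel pair, a 3840x2160 frame is 3840 * 2160 * 20 / 8 = 20736000 bytes of packed input.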