From mboxrd@z Thu Jan 1 00:00:00 1970
From: Devin Heitmueller
Date: Sat, 6 May 2023 08:13:44 -0400
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance
References: <1683323657-20687-1-git-send-email-dheitmueller@ltnglobal.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="000000000000e7910005fb055622"

--000000000000e7910005fb055622
Content-Type: text/plain; charset="UTF-8"

I added some instrumentation via the attached patch.
You can see the benefits here (each line shows the wall-clock time just
before and just after the decode):

Before patch:

Before=1683378057.243350 After 1683378057.264239
Before=1683378083.335424 After 1683378083.356440
Before=1683378089.675400 After 1683378089.696512
Before=1683378151.792324 After 1683378151.813579

21 ms per run

After patch:

Before=1683378222.167796 After 1683378222.175760
Before=1683378233.131416 After 1683378233.139326
Before=1683378243.591895 After 1683378243.599840

8 ms per run

Note: this is a different platform than the one I did the original
development on, and apparently the improvement on this particular box is
only about 2.5x rather than 4x.

Devin

On Sat, May 6, 2023 at 7:53 AM Paul B Mahol wrote:
>
> On Sat, May 6, 2023 at 1:32 PM Lance Wang wrote:
>
> > On Sat, May 6, 2023 at 4:58 AM Devin Heitmueller <
> > devin.heitmueller@ltnglobal.com> wrote:
> >
> > > Rework the code a bit to speed up the 10-bit bitpacked decoding
> > > routine. This is probably about as fast as I can get it without
> > > switching to assembly language.
> > >
> > > Demonstratable with:
> > >
> > > ./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -f image2
> > > -frames:v 1 source.yuv
> > > ./ffmpeg -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i
> > > source.yuv -pix_fmt yuv422p10le out.yuv
> > >
> > > On my development system, it went from 80ms for a 2160p frame
> > > down to 20ms (i.e. a 4X speedup). Good enough for now, I hope...
> > >
> > >
> > FYI, on my development system, I run two time for the original and modified
> > version and no obvious difference:
> > ./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -frames:v 25
> > source.yuv
> > time ./ffmpeg -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked
> > -i source.yuv -pix_fmt yuv422p10le out.yuv
> > frame= 25 fps=0.0 q=-0.0 Lsize= 810000kB time=00:00:00.96
> > bitrate=6912000.0kbits/s speed=1.13x
> >
> > real 0m0.961s
> > user 0m1.086s
> > sys 0m1.360s
> >
> > frame= 25 fps=0.0 q=-0.0 Lsize= 810000kB time=00:00:00.96
> > bitrate=6912000.0kbits/s speed=1.16x
> >
> > real 0m0.936s
> > user 0m1.358s
> > sys 0m1.350s
> >
> > after apply the patch:
> > frame= 25 fps=0.0 q=-0.0 Lsize= 810000kB time=00:00:00.96
> > bitrate=6912000.0kbits/s speed=1.14x
> >
> > real 0m0.953s
> > user 0m0.906s
> > sys 0m1.438s
> >
> > frame= 25 fps=0.0 q=-0.0 Lsize= 810000kB time=00:00:00.96
> > bitrate=6912000.0kbits/s speed=1.17x
> >
> > real 0m0.922s
> > user 0m0.926s
> > sys 0m1.066s
> >
>
> Only 25 frames?
> This is flawed.
>
> >
> >
> > >
> > > Signed-off-by: Devin Heitmueller
> > > ---
> > >  libavcodec/bitpacked_dec.c | 17 +++++++----------
> > >  1 file changed, 7 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/libavcodec/bitpacked_dec.c b/libavcodec/bitpacked_dec.c
> > > index a1ffef1..96aba27 100644
> > > --- a/libavcodec/bitpacked_dec.c
> > > +++ b/libavcodec/bitpacked_dec.c
> > > @@ -28,7 +28,6 @@
> > >
> > >  #include "avcodec.h"
> > >  #include "codec_internal.h"
> > > -#include "get_bits.h"
> > >  #include "libavutil/imgutils.h"
> > >  #include "thread.h"
> > >
> > > @@ -65,7 +64,7 @@ static int bitpacked_decode_yuv422p10(AVCodecContext *avctx, AVFrame *frame,
> > >  {
> > >      uint64_t frame_size = (uint64_t)avctx->width * (uint64_t)avctx->height * 20;
> > >      uint64_t packet_size = (uint64_t)avpkt->size * 8;
> > > -    GetBitContext bc;
> > > +    uint8_t *src;
> > >      uint16_t *y, *u, *v;
> > >      int ret, i, j;
> > >
> > > @@ -79,20 +78,18 @@ static int bitpacked_decode_yuv422p10(AVCodecContext *avctx, AVFrame *frame,
> > >      if (avctx->width % 2)
> > >          return AVERROR_PATCHWELCOME;
> > >
> > > -    ret = init_get_bits(&bc, avpkt->data, avctx->width * avctx->height * 20);
> > > -    if (ret)
> > > -        return ret;
> > > -
> > > +    src = avpkt->data;
> > >      for (i = 0; i < avctx->height; i++) {
> > >          y = (uint16_t*)(frame->data[0] + i * frame->linesize[0]);
> > >          u = (uint16_t*)(frame->data[1] + i * frame->linesize[1]);
> > >          v = (uint16_t*)(frame->data[2] + i * frame->linesize[2]);
> > >
> > >          for (j = 0; j < avctx->width; j += 2) {
> > > -            *u++ = get_bits(&bc, 10);
> > > -            *y++ = get_bits(&bc, 10);
> > > -            *v++ = get_bits(&bc, 10);
> > > -            *y++ = get_bits(&bc, 10);
> > > +            *u++ = (src[0] << 2) | (src[1] >> 6);
> > > +            *y++ = ((src[1] << 4) | (src[2] >> 4)) & 0x3ff;
> > > +            *v++ = ((src[2] << 6) | (src[3] >> 2)) & 0x3ff;
> > > +            *y++ = ((src[3] << 8) | (src[4])) & 0x3ff;
> > > +            src += 5;
> > >          }
> > >      }
> > >
> > > --
> > > 1.8.3.1
> > >
> > > _______________________________________________
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel@ffmpeg.org
> > > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > >
> > > To unsubscribe, visit link above, or email
> > > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

-- 
Devin Heitmueller, Senior Software Engineer
LTN Global Communications
o: +1 (301) 363-1001
w: https://ltnglobal.com
e: devin.heitmueller@ltnglobal.com

--000000000000e7910005fb055622
Content-Type: application/octet-stream; name="timing.patch"
Content-Disposition: attachment; filename="timing.patch"

diff --git a/libavcodec/bitpacked_dec.c b/libavcodec/bitpacked_dec.c
index 96aba27..b18bdac 100644
--- a/libavcodec/bitpacked_dec.c
+++ b/libavcodec/bitpacked_dec.c
@@ -30,6 +30,7 @@
 #include "codec_internal.h"
 #include "libavutil/imgutils.h"
 #include "thread.h"
+#include <sys/time.h>
 
 struct BitpackedContext {
     int (*decode)(AVCodecContext *avctx, AVFrame *frame,
@@ -67,11 +68,14 @@ static int bitpacked_decode_yuv422p10(AVCodecContext *avctx, AVFrame *frame,
     uint8_t *src;
     uint16_t *y, *u, *v;
     int ret, i, j;
+    struct timeval t1, t2;
 
     ret = ff_thread_get_buffer(avctx, frame, 0);
     if (ret < 0)
         return ret;
 
+    gettimeofday(&t1, NULL);
+
     if (frame_size > packet_size)
         return AVERROR_INVALIDDATA;
 
@@ -92,6 +96,9 @@ static int bitpacked_decode_yuv422p10(AVCodecContext *avctx, AVFrame *frame,
             src += 5;
         }
     }
+    gettimeofday(&t2, NULL);
+    fprintf(stderr, "Before=%d.%06d After %d.%06d\n",
+            t1.tv_sec, t1.tv_usec, t2.tv_sec, t2.tv_usec);
 
     return 0;
 }

--000000000000e7910005fb055622--