From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTP id CC2E344CC6
	for <ffmpegdev@gitmailbox.com>; Mon, 14 Nov 2022 21:08:04 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4F4EC68BB41;
	Mon, 14 Nov 2022 23:08:01 +0200 (EET)
Received: from relay7-d.mail.gandi.net (relay7-d.mail.gandi.net
 [217.70.183.200])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id CEB6668B96D
 for <ffmpeg-devel@ffmpeg.org>; Mon, 14 Nov 2022 23:07:54 +0200 (EET)
Received: (Authenticated sender: michael@niedermayer.cc)
 by mail.gandi.net (Postfix) with ESMTPSA id 073A520002
 for <ffmpeg-devel@ffmpeg.org>; Mon, 14 Nov 2022 21:07:53 +0000 (UTC)
Date: Mon, 14 Nov 2022 22:07:53 +0100
From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Message-ID: <20221114210753.GI1814017@pb2>
References: <20221103040010.1134-1-mindmark@gmail.com>
 <20221103040010.1134-2-mindmark@gmail.com>
 <20221113212453.GF1814017@pb2>
 <CA+anCRmx-hO38tJAmF84+6RbpZ9Vcm6tAMT4peP5TX7uEXLXqw@mail.gmail.com>
MIME-Version: 1.0
In-Reply-To: <CA+anCRmx-hO38tJAmF84+6RbpZ9Vcm6tAMT4peP5TX7uEXLXqw@mail.gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH v3 1/4] swscale/input: add rgbaf32 input
 support
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: multipart/mixed; boundary="===============6515138070383462294=="
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/20221114210753.GI1814017@pb2/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>


--===============6515138070383462294==
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="cjXvCArabh/jFWdZ"
Content-Disposition: inline


--cjXvCArabh/jFWdZ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Nov 13, 2022 at 05:50:37PM -0800, Mark Reid wrote:
> On Sun, Nov 13, 2022 at 1:25 PM Michael Niedermayer <michael@niedermayer.=
cc>
> wrote:
>=20
> > On Wed, Nov 02, 2022 at 09:00:07PM -0700, mindmark@gmail.com wrote:
> > > From: Mark Reid <mindmark@gmail.com>
> > >
> > > ---
> > >  libswscale/input.c | 172 +++++++++++++++++++++++++++++++++++++++++++=
++
> > >  libswscale/utils.c |   4 ++
> > >  2 files changed, 176 insertions(+)
> > >
> > > diff --git a/libswscale/input.c b/libswscale/input.c
> > > index 7ff7bfaa01..4683284b0b 100644
> > > --- a/libswscale/input.c
> > > +++ b/libswscale/input.c
> > > @@ -1284,6 +1284,136 @@ static void rgbaf16##endian_name##ToA_c(uint8=
_t
> > *_dst, const uint8_t *_src, cons
> > >  rgbaf16_funcs_endian(le, 0)
> > >  rgbaf16_funcs_endian(be, 1)
> > >
> > > +#define rdpx(src) (is_be ? av_int2float(AV_RB32(&src)):
> > av_int2float(AV_RL32(&src)))
> > > +
> > > +static av_always_inline void rgbaf32ToUV_half_endian(uint16_t *dstU,
> > uint16_t *dstV, int is_be,
> > > +                                                     const float *sr=
c,
> > int width,
> > > +                                                     int32_t *rgb2yu=
v,
> > int comp)
> > > +{
> > > +    int32_t ru =3D rgb2yuv[RU_IDX], gu =3D rgb2yuv[GU_IDX], bu =3D
> > rgb2yuv[BU_IDX];
> > > +    int32_t rv =3D rgb2yuv[RV_IDX], gv =3D rgb2yuv[GV_IDX], bv =3D
> > rgb2yuv[BV_IDX];
> > > +    int i;
> > > +    for (i =3D 0; i < width; i++) {
> >
> > > +        int r =3D (lrintf(av_clipf(65535.0f * rdpx(src[i*(comp*2)+0]=
),
> > 0.0f, 65535.0f)) +
> > > +                 lrintf(av_clipf(65535.0f * rdpx(src[i*(comp*2)+4]),
> > 0.0f, 65535.0f))) >> 1;
> > > +        int g =3D (lrintf(av_clipf(65535.0f * rdpx(src[i*(comp*2)+1]=
),
> > 0.0f, 65535.0f)) +
> > > +                 lrintf(av_clipf(65535.0f * rdpx(src[i*(comp*2)+5]),
> > 0.0f, 65535.0f))) >> 1;
> > > +        int b =3D (lrintf(av_clipf(65535.0f * rdpx(src[i*(comp*2)+2]=
),
> > 0.0f, 65535.0f)) +
> > > +                 lrintf(av_clipf(65535.0f * rdpx(src[i*(comp*2)+6]),
> > 0.0f, 65535.0f))) >> 1;
> > > +
> > > +        dstU[i] =3D (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1=
)))
> > >> RGB2YUV_SHIFT;
> > > +        dstV[i] =3D (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1=
)))
> > >> RGB2YUV_SHIFT;
> >
> > I would expect this sort of code to use 2 lrintf() and 2 av_clipf() not=
 6
> >
> >
> ya it is a bit excessive, I'll just remove the _half conversions for now,
> they aren't strictly necessary as far as I can tell.

do you see a problem with just factoring them out?
it shouldn't be hard to reorder the operations
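
A minimal sketch of the reordering suggested above, not the patch's code: the matrix multiply is done in float, so each output sample needs one clip and one float-to-int conversion (two of each per pixel pair instead of six). Everything named sketch_* or SKETCH_* is a hypothetical stand-in (for av_clipf(), lrintf() and RGB2YUV_SHIFT), the coefficient array layout is assumed, the endian handling of rdpx() is left out, and the second pixel is assumed to sit comp floats after the first. Results for out-of-range input will differ from the integer version, so bitexact tests would need checking.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SHIFT 15 /* assumed stand-in for RGB2YUV_SHIFT */

/* Hypothetical stand-in for av_clipf(). */
static float sketch_clipf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical stand-in for lrintf(); only valid for v >= 0,
 * which holds here because the input is clipped first. */
static int sketch_round(float v)
{
    return (int)(v + 0.5f);
}

/* coef holds { ru, gu, bu, rv, gv, bv } (assumed layout).
 * src holds native-endian floats, comp floats per pixel. */
static void rgbaf32_to_uv_half_sketch(uint16_t *dstU, uint16_t *dstV,
                                      const float *src, int width,
                                      const int32_t *coef, int comp)
{
    /* 65535 scales to 16 bit, /2 averages the pixel pair,
     * /(1 << SHIFT) undoes the fixed-point coefficient scale. */
    const float scale = 65535.0f / (2.0f * (1 << SKETCH_SHIFT));
    const float bias  = 32768.0f; /* chroma offset; rounding is done later */
    int i;

    assert(comp >= 3);
    for (i = 0; i < width; i++) {
        const float *p = src + i * comp * 2;
        /* Sum the two horizontally adjacent pixels, unclipped. */
        float r = p[0] + p[comp + 0];
        float g = p[1] + p[comp + 1];
        float b = p[2] + p[comp + 2];
        float u = (coef[0] * r + coef[1] * g + coef[2] * b) * scale + bias;
        float v = (coef[3] * r + coef[4] * g + coef[5] * b) * scale + bias;

        /* One clip and one conversion per output sample. */
        dstU[i] = (uint16_t)sketch_round(sketch_clipf(u, 0.0f, 65535.0f));
        dstV[i] = (uint16_t)sketch_round(sketch_clipf(v, 0.0f, 65535.0f));
    }
}
```

The trade-off is that clipping now happens on U/V rather than on R/G/B, so only in-range input is expected to give results close to the current code.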


>=20
>=20
> >
> > > +    }
> > > +}
> > > +
> > > +static av_always_inline void rgbaf32ToUV_endian(uint16_t *dstU,
> > uint16_t *dstV, int is_be,
> > > +                                                const float *src, int
> > width,
> > > +                                                int32_t *rgb2yuv, int
> > comp)
> > > +{
> > > +    int32_t ru =3D rgb2yuv[RU_IDX], gu =3D rgb2yuv[GU_IDX], bu =3D
> > rgb2yuv[BU_IDX];
> > > +    int32_t rv =3D rgb2yuv[RV_IDX], gv =3D rgb2yuv[GV_IDX], bv =3D
> > rgb2yuv[BV_IDX];
> > > +    int i;
> > > +    for (i =3D 0; i < width; i++) {
> > > +        int r =3D lrintf(av_clipf(65535.0f * rdpx(src[i*comp+0]), 0.=
0f,
> > 65535.0f));
> > > +        int g =3D lrintf(av_clipf(65535.0f * rdpx(src[i*comp+1]), 0.=
0f,
> > 65535.0f));
> > > +        int b =3D lrintf(av_clipf(65535.0f * rdpx(src[i*comp+2]), 0.=
0f,
> > 65535.0f));
> > > +
> > > +        dstU[i] =3D (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1=
)))
> > >> RGB2YUV_SHIFT;
> > > +        dstV[i] =3D (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1=
)))
> > >> RGB2YUV_SHIFT;
> > > +    }
> > > +}
> > > +
> >
> > > +static av_always_inline void rgbaf32ToY_endian(uint16_t *dst, const
> > float *src, int is_be,
> > > +                                               int width, int32_t
> > *rgb2yuv, int comp)
> > > +{
> > > +    int32_t ry =3D rgb2yuv[RY_IDX], gy =3D rgb2yuv[GY_IDX], by =3D
> > rgb2yuv[BY_IDX];
> > > +    int i;
> > > +    for (i =3D 0; i < width; i++) {
> > > +        int r =3D lrintf(av_clipf(65535.0f * rdpx(src[i*comp+0]), 0.=
0f,
> > 65535.0f));
> > > +        int g =3D lrintf(av_clipf(65535.0f * rdpx(src[i*comp+1]), 0.=
0f,
> > 65535.0f));
> > > +        int b =3D lrintf(av_clipf(65535.0f * rdpx(src[i*comp+2]), 0.=
0f,
> > 65535.0f));
> > > +
> >
> > > +        dst[i] =3D (ry*r + gy*g + by*b + (0x2001<<(RGB2YUV_SHIFT-1))=
) >>
> > RGB2YUV_SHIFT;
> >
> > there is one output so there should be only need for one clip and one
> > float->int
> >
>=20
> This is matching the f32 planar version. I think I was paranoid about
> things being bitexact for tests and that's why it's currently being done
> this way.
> I'll see what happens if I introduce more float operations, could I perha=
ps
> do this in a later patch? some asm might have to change too.

of course, that can be a separate patch in the set. Maybe f32 planar can be
changed at the same time
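
For reference, a rough sketch of the single-clip luma path discussed above, under the same caveats: it is not the patch's code, sketch_* and SKETCH_SHIFT are hypothetical stand-ins for av_clipf(), lrintf() and RGB2YUV_SHIFT, the coefficient layout is assumed, and input is taken as native-endian floats. The 4096 offset corresponds to the integer part of the (0x2001<<(RGB2YUV_SHIFT-1)) >> RGB2YUV_SHIFT term; the remaining half is the rounding, which the conversion handles here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SHIFT 15 /* assumed stand-in for RGB2YUV_SHIFT */

/* Hypothetical stand-in for av_clipf(). */
static float sketch_clipf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical stand-in for lrintf(); only valid for v >= 0,
 * which holds here because the input is clipped first. */
static int sketch_round(float v)
{
    return (int)(v + 0.5f);
}

/* coef holds { ry, gy, by } (assumed layout), comp floats per pixel. */
static void rgbaf32_to_y_sketch(uint16_t *dst, const float *src,
                                int width, const int32_t *coef, int comp)
{
    /* 65535 scales to 16 bit, /(1 << SHIFT) undoes the
     * fixed-point coefficient scale. */
    const float scale = 65535.0f / (float)(1 << SKETCH_SHIFT);
    const float bias  = 4096.0f; /* luma offset; rounding is done later */
    int i;

    assert(comp >= 3);
    for (i = 0; i < width; i++) {
        const float *p = src + i * comp;
        /* Weighted sum in float, with no per-channel clip or rounding. */
        float y = (coef[0] * p[0] + coef[1] * p[1] + coef[2] * p[2])
                  * scale + bias;

        /* One output, so one clip and one float->int conversion. */
        dst[i] = (uint16_t)sketch_round(sketch_clipf(y, 0.0f, 65535.0f));
    }
}
```

As with chroma, the output for out-of-range input is not identical to clipping each channel first, which is why the planar f32 path and any asm would have to change together for the tests to stay consistent.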

thx

[...]
--=20
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope

--cjXvCArabh/jFWdZ
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iF0EABEIAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCY3KuIgAKCRBhHseHBAsP
q0xeAKCS2YNU6MiLaf/SuVwGjIYaLMgjYACeKzGj4A1NVOVnb+D/DU0131Gg3DQ=
=1FKw
-----END PGP SIGNATURE-----

--cjXvCArabh/jFWdZ--

--===============6515138070383462294==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

--===============6515138070383462294==--