Date: Sat, 13 May 2023 16:55:15 +0200
From: Michael Niedermayer
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] swresample: misc improvements
Message-ID: <20230513145515.GH1391451@pb2>
References: <20230512233647.GE1391451@pb2>

On Sat, May 13, 2023 at 08:29:37AM +0200, Paul B Mahol wrote:
> On Sat, May 13, 2023 at 1:37 AM Michael Niedermayer wrote:
>
> > On Thu, May 11, 2023 at 07:13:19PM +0200, Paul B Mahol wrote:
> > > Attached.
> > [...]
> > > @@ -33,64 +33,86 @@
> > >
> > >
> > >  #define CONV_FUNC_NAME(dst_fmt, src_fmt) conv_ ## src_fmt ## _to_ ## dst_fmt
> > > +#define CONVP_FUNC_NAME(dst_fmt, src_fmt) convp_ ## src_fmt ## _to_ ## dst_fmt
> > >
> > >  //FIXME rounding ?
> > > -#define CONV_FUNC(ofmt, otype, ifmt, expr)\
> > > +#define CONV_FUNC(ofmt, otype, ifmt, itype, expr)\
> > > +    \
> > >  static void CONV_FUNC_NAME(ofmt, ifmt)(uint8_t *po, const uint8_t *pi, int is, int os, uint8_t *end)\
> > >  {\
> > >      uint8_t *end2 = end - 3*os;\
> > >      while(po < end2){\
> > > +        itype x = *(itype*)pi;\
> > >          *(otype*)po = expr; pi += is; po += os;\
> > > +        x = *(itype*)pi;\
> > >          *(otype*)po = expr; pi += is; po += os;\
> > > +        x = *(itype*)pi;\
> > >          *(otype*)po = expr; pi += is; po += os;\
> > > +        x = *(itype*)pi;\
> > >          *(otype*)po = expr; pi += is; po += os;\
> > >      }\
> > >      while(po < end){\
> > > +        itype x = *(itype*)pi;\
> > >          *(otype*)po = expr; pi += is; po += os;\
> > >      }\
> > > +}\
> > > +\
> > > +static void CONVP_FUNC_NAME(ofmt, ifmt)(uint8_t *ddst, const uint8_t *ssrc, int len)\
> > > +{\
> > > +    const itype *src = (const itype *)ssrc;\
> > > +    otype *dst = (otype *)ddst;\
> > > +    for (int n = 0; n < len; n++){\
> > > +        itype x = src[n];\
> > > +        dst[n] = expr;\
> > > +    }\
> > >  }
> > >
> > >  //FIXME put things below under ifdefs so we do not waste space for cases no codec will need
> > > -CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_U8 , *(const uint8_t*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_U8 , (*(const uint8_t*)pi - 0x80U)<<8)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_U8 , (*(const uint8_t*)pi - 0x80U)<<24)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_U8 , (uint64_t)((*(const uint8_t*)pi - 0x80U))<<56)
> > > -CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_U8 , (*(const uint8_t*)pi - 0x80)*(1.0f/ (1<<7)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_U8 , (*(const uint8_t*)pi - 0x80)*(1.0 / (1<<7)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_S16, (*(const int16_t*)pi>>8) + 0x80)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_S16, *(const int16_t*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_S16, *(const int16_t*)pi * (1 << 16))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_S16, (uint64_t)(*(const int16_t*)pi)<<48)
> > > -CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_S16, *(const int16_t*)pi*(1.0f/ (1<<15)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_S16, *(const int16_t*)pi*(1.0 / (1<<15)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_S32, (*(const int32_t*)pi>>24) + 0x80)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_S32, *(const int32_t*)pi>>16)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_S32, *(const int32_t*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_S32, (uint64_t)(*(const int32_t*)pi)<<32)
> > > -CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_S32, *(const int32_t*)pi*(1.0f/ (1U<<31)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_S32, *(const int32_t*)pi*(1.0 / (1U<<31)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_S64, (*(const int64_t*)pi>>56) + 0x80)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_S64, *(const int64_t*)pi>>48)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_S64, *(const int64_t*)pi>>32)
> > > -CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_S64, *(const int64_t*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_S64, *(const int64_t*)pi*(1.0f/ (UINT64_C(1)<<63)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_S64, *(const int64_t*)pi*(1.0 / (UINT64_C(1)<<63)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_FLT, av_clip_uint8( lrintf(*(const float*)pi * (1<<7)) + 0x80))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_FLT, av_clip_int16( lrintf(*(const float*)pi * (1<<15))))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_FLT, av_clipl_int32(llrintf(*(const float*)pi * (1U<<31))))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_FLT, llrintf(*(const float*)pi * (UINT64_C(1)<<63)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_FLT, *(const float*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_FLT, *(const float*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_DBL, av_clip_uint8( lrint(*(const double*)pi * (1<<7)) + 0x80))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_DBL, av_clip_int16( lrint(*(const double*)pi * (1<<15))))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_DBL, av_clipl_int32(llrint(*(const double*)pi * (1U<<31))))
> > > -CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_DBL, llrint(*(const double*)pi * (UINT64_C(1)<<63)))
> > > -CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_DBL, *(const double*)pi)
> > > -CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_DBL, *(const double*)pi)
> > > -
> > > -#define FMT_PAIR_FUNC(out, in) [(out) + AV_SAMPLE_FMT_NB*(in)] = CONV_FUNC_NAME(out, in)
> > > -
> > > -static conv_func_type * const fmt_pair_to_conv_functions[AV_SAMPLE_FMT_NB*AV_SAMPLE_FMT_NB] = {
> > > +CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_U8 , uint8_t, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_U8 , uint8_t, (x - 0x80U)<<8)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_U8 , uint8_t, (x - 0x80U)<<24)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_U8 , uint8_t, (uint64_t)(x - 0x80U)<<56)
> > > +CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_U8 , uint8_t, (x - 0x80)*(1.0f/ (1<<7)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_U8 , uint8_t, (x - 0x80)*(1.0 / (1<<7)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_S16, int16_t, (x>>8) + 0x80)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_S16, int16_t, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_S16, int16_t, x * (1 << 16))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_S16, int16_t, (uint64_t)(x)<<48)
> > > +CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_S16, int16_t, x*(1.0f/ (1<<15)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_S16, int16_t, x*(1.0 / (1<<15)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_S32, int32_t, (x>>24) + 0x80)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_S32, int32_t, x>>16)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_S32, int32_t, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_S32, int32_t, (uint64_t)(x)<<32)
> > > +CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_S32, int32_t, x*(1.0f/ (1U<<31)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_S32, int32_t, x*(1.0 / (1U<<31)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_S64, int64_t, (x>>56) + 0x80)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_S64, int64_t, x>>48)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_S64, int64_t, x>>32)
> > > +CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_S64, int64_t, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_S64, int64_t, x*(1.0f/ (UINT64_C(1)<<63)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_S64, int64_t, x*(1.0 / (UINT64_C(1)<<63)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_FLT, float, av_clip_uint8( lrintf(x * (1<<7)) + 0x80))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_FLT, float, av_clip_int16( lrintf(x * (1<<15))))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_FLT, float, av_clipl_int32(llrintf(x * (1U<<31))))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_FLT, float, llrintf(x * (UINT64_C(1)<<63)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_FLT, float, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_FLT, float, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_U8 , uint8_t, AV_SAMPLE_FMT_DBL, double, av_clip_uint8( lrint(x * (1<<7)) + 0x80))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S16, int16_t, AV_SAMPLE_FMT_DBL, double, av_clip_int16( lrint(x * (1<<15))))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S32, int32_t, AV_SAMPLE_FMT_DBL, double, av_clipl_int32(llrint(x * (1U<<31))))
> > > +CONV_FUNC(AV_SAMPLE_FMT_S64, int64_t, AV_SAMPLE_FMT_DBL, double, llrint(x * (UINT64_C(1)<<63)))
> > > +CONV_FUNC(AV_SAMPLE_FMT_FLT, float , AV_SAMPLE_FMT_DBL, double, x)
> > > +CONV_FUNC(AV_SAMPLE_FMT_DBL, double , AV_SAMPLE_FMT_DBL, double, x)
> > >
> >
> > i think the new cases are no longer const, is that intended ?
> > (it would cast const to non const)
> >
>
> You mean I removed const from old macro?

yes


> Can fix that if that is the case.

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an approximation.
Benchmarking OTOH is finding an approximation of the exact
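
To illustrate the const remark from the review: a minimal sketch, assuming a cut-down macro, of how the added loads could keep the qualifier by casting to const itype * instead of itype *. DEMO_CONV_FUNC and the demo_* function names are made up for this example and are not libswresample code; the unrolling from the patch is left out, and the U8 to S16 expression is written as a multiplication so the sketch stays free of shifts on negative values.

```c
#include <stdint.h>

/* Hypothetical, simplified variant of the CONV_FUNC macro from the patch.
 * It generates a strided converter (byte strides is/os, end = end of the
 * output) and a planar converter (len samples). Both read the input only
 * through pointers to const, so no qualifier is cast away. */
#define DEMO_CONV_FUNC(suffix, otype, itype, expr)                           \
static void demo_conv_##suffix(uint8_t *po, const uint8_t *pi,               \
                               int is, int os, uint8_t *end)                 \
{                                                                            \
    while (po < end) {                                                       \
        const itype x = *(const itype *)pi; /* const kept in the cast */     \
        *(otype *)po = (expr);                                               \
        pi += is;                                                            \
        po += os;                                                            \
    }                                                                        \
}                                                                            \
                                                                             \
static void demo_convp_##suffix(uint8_t *ddst, const uint8_t *ssrc, int len) \
{                                                                            \
    const itype *src = (const itype *)ssrc;                                  \
    otype *dst = (otype *)ddst;                                              \
    for (int n = 0; n < len; n++) {                                          \
        const itype x = src[n];                                              \
        dst[n] = (expr);                                                     \
    }                                                                        \
}

/* One instantiation: unsigned 8-bit samples to signed 16-bit samples. */
DEMO_CONV_FUNC(u8_to_s16, int16_t, uint8_t, (x - 0x80) * (1 << 8))
```

With a uint8_t input array and an int16_t output array, demo_convp_u8_to_s16((uint8_t *)out, in, n) and demo_conv_u8_to_s16((uint8_t *)out, in, 1, 2, (uint8_t *)(out + n)) should give the same result. Building with a warning such as GCC's -Wcast-qual is one way to catch a cast that drops the qualifier.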