From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTP id 7717943E0F for ; Wed, 14 Sep 2022 21:08:28 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CFB3A68BADF; Thu, 15 Sep 2022 00:08:25 +0300 (EEST) Received: from relay1-d.mail.gandi.net (relay1-d.mail.gandi.net [217.70.183.193]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 0CA1E68B9F9 for ; Thu, 15 Sep 2022 00:08:18 +0300 (EEST) Received: (Authenticated sender: michael@niedermayer.cc) by mail.gandi.net (Postfix) with ESMTPSA id 257E4240003 for ; Wed, 14 Sep 2022 21:08:17 +0000 (UTC) Date: Wed, 14 Sep 2022 23:08:16 +0200 From: Michael Niedermayer To: FFmpeg development discussions and patches Message-ID: <20220914210816.GP2088045@pb2> References: <20220910192330.30872-1-philipl@overt.org> MIME-Version: 1.0 In-Reply-To: <20220910192330.30872-1-philipl@overt.org> Subject: Re: [FFmpeg-devel] [PATCH] v2: lavu/pixdesc: favour formats where depth and subsampling exactly match X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Content-Type: multipart/mixed; boundary="===============5300968118452952001==" Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Archived-At: List-Archive: List-Post: --===============5300968118452952001== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="/wUBB1VsywAzobeC" Content-Disposition: inline --/wUBB1VsywAzobeC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Sep 10, 2022 at 12:23:30PM -0700, Philip Langdale wrote: > Since introducing the various packed formats used by VAAPI (and p012), > we've noticed that there's actually a gap in how > av_find_best_pix_fmt_of_2 works. It doesn't actually assign any value > to having the same bit depth as the source format, when comparing > against formats with a higher bit depth. This usually doesn't matter, > because av_get_padded_bits_per_pixel() will account for it. >=20 > However, as many of these formats use padding internally, we find that > av_get_padded_bits_per_pixel() actually returns the same value for the > 10 bit, 12 bit, 16 bit flavours, etc. In these tied situations, we end > up just picking the first of the two provided formats, even if the > second one should be preferred because it matches the actual bit depth. >=20 > This bug already existed if you tried to compare yuv420p10 against p016 > and p010, for example, but it simply hadn't come up before so we never > noticed. >=20 > But now, we actually got a situation in the VAAPI VP9 decoder where it > offers both p010 and p012 because Profile 3 could be either depth and > ends up picking p012 for 10 bit content due to the ordering of the > testing. >=20 > In addition, in the process of testing the fix, I realised we have the > same gap when it comes to chroma subsampling - we do not favour a > format that has exactly the same subsampling vs one with less > subsampling when all else is equal. >=20 > To fix this, I'm introducing a small score penalty if the bit depth or > subsampling doesn't exactly match the source format. This will break > the tie in favour of the format with the exact match, but not offset > any of the other scoring penalties we already have. >=20 > I have added a set of tests around these formats which will fail > without this fix. >=20 > v2: Rework penalty system to scale penalty based on how different the > two formats are, and add new loss categories for them. >=20 > Signed-off-by: Philip Langdale > --- > libavutil/pixdesc.c | 38 +++++++++++- > libavutil/pixdesc.h | 15 +++-- > libavutil/tests/pixfmt_best.c | 111 ++++++++++++++++++++++++++++------ > tests/ref/fate/pixfmt_best | 2 +- > 4 files changed, 139 insertions(+), 27 deletions(-) >=20 > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c > index d7c6ebfdc4..143c17e64a 100644 > --- a/libavutil/pixdesc.c > +++ b/libavutil/pixdesc.c > @@ -3004,6 +3004,11 @@ static int get_pix_fmt_score(enum AVPixelFormat ds= t_pix_fmt, > if ((ret =3D get_pix_fmt_depth(&dst_min_depth, &dst_max_depth, dst_p= ix_fmt)) < 0) > return -3; > =20 > + // Favour formats where bit depth exactly matches. If all other scor= ing is > + // equal, we'd rather use a lower bit depth that matches the source. > + if (src_min_depth !=3D dst_min_depth || src_max_depth !=3D dst_max_d= epth) > + score -=3D 64; why 64 ? > + > src_color =3D get_color_type(src_desc); > dst_color =3D get_color_type(dst_desc); > if (dst_pix_fmt =3D=3D AV_PIX_FMT_PAL8) > @@ -3013,9 +3018,16 @@ static int get_pix_fmt_score(enum AVPixelFormat ds= t_pix_fmt, > =20 > for (i =3D 0; i < nb_components; i++) { > int depth_minus1 =3D (dst_pix_fmt =3D=3D AV_PIX_FMT_PAL8) ? 7/nb= _components : (dst_desc->comp[i].depth - 1); > - if (src_desc->comp[i].depth - 1 > depth_minus1 && (consider & FF= _LOSS_DEPTH)) { > + int depth_delta =3D src_desc->comp[i].depth - 1 - depth_minus1; > + if (depth_delta > 0 && (consider & FF_LOSS_DEPTH)) { > loss |=3D FF_LOSS_DEPTH; > score -=3D 65536 >> depth_minus1; > + } else if (depth_delta < 0 && (consider & FF_LOSS_EXCESS_DEPTH))= { > + // Favour formats where bit depth exactly matches. If all ot= her > + // scoring is equal, we'd rather use the bit depth that most= closely > + // matches the source. ok > + loss |=3D FF_LOSS_EXCESS_DEPTH; > + score -=3D 1 << -depth_delta; but does that do that ? a 1bpp -> 16bpp has a considerable -depth_delta do we need the << at all ? > } > } > =20 > @@ -3024,10 +3036,12 @@ static int get_pix_fmt_score(enum AVPixelFormat d= st_pix_fmt, > loss |=3D FF_LOSS_RESOLUTION; > score -=3D 256 << dst_desc->log2_chroma_w; > } > + > if (dst_desc->log2_chroma_h > src_desc->log2_chroma_h) { > loss |=3D FF_LOSS_RESOLUTION; > score -=3D 256 << dst_desc->log2_chroma_h; > } > + > // don't favor 422 over 420 if downsampling is needed, because 4= 20 has much better support on the decoder side > if (dst_desc->log2_chroma_w =3D=3D 1 && src_desc->log2_chroma_w = =3D=3D 0 && > dst_desc->log2_chroma_h =3D=3D 1 && src_desc->log2_chroma_h = =3D=3D 0 ) { > @@ -3035,6 +3049,28 @@ static int get_pix_fmt_score(enum AVPixelFormat ds= t_pix_fmt, > } > } > =20 > + if (consider & FF_LOSS_EXCESS_RESOLUTION) { > + // Favour formats where chroma subsampling exactly matches. If a= ll other > + // scoring is equal, we'd rather use the subsampling that most c= losely > + // matches the source. > + if (dst_desc->log2_chroma_w < src_desc->log2_chroma_w) { > + loss |=3D FF_LOSS_EXCESS_RESOLUTION; > + score -=3D 32 << (src_desc->log2_chroma_w - dst_desc->log2_c= hroma_w); > + } > + > + if (dst_desc->log2_chroma_h < src_desc->log2_chroma_h) { > + loss |=3D FF_LOSS_EXCESS_RESOLUTION; > + score -=3D 32 << (src_desc->log2_chroma_h - dst_desc->log2_c= hroma_h); > + } > + > + // don't favour 411 over 420, because 420 has much better suppor= t on the > + // decoder side. > + if (dst_desc->log2_chroma_w =3D=3D 1 && src_desc->log2_chroma_w = =3D=3D 2 && > + dst_desc->log2_chroma_h =3D=3D 1 && src_desc->log2_chroma_h = =3D=3D 2) { > + score +=3D 128; > + } I was thinking more generically that more symmetric resolution (difference) may be preferrable but this is fine too theres an infinite number of ways to do this its probably best to move forward and wait for some bug report to provide real key points the heuristic can be tuned toward than inventing cases also keeping the heuristic simple should be a goal thx [...] --=20 Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB "I am not trying to be anyone's saviour, I'm trying to think about the future and not be sad" - Elon Musk --/wUBB1VsywAzobeC Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iF0EABEIAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCYyJCvQAKCRBhHseHBAsP q6w/AKCJpPHgCJ6MgtGvQPO9Rnd3vs6thgCgkGEERCNr3hrm+Pbey1QGAy3MYdg= =lXqu -----END PGP SIGNATURE----- --/wUBB1VsywAzobeC-- --===============5300968118452952001== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --===============5300968118452952001==--