From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 20 Feb 2025 22:08:20 +0100
From: Michael Niedermayer
To: FFmpeg development discussions and patches
Message-ID: <20250220210820.GK4991@pb2>
References: <20250206145817.GN4991@pb2> <20250212220343.GU4991@pb2> <8ba38ed99046ebac893d08e7e2fc7b42a83d26c1.camel@haerdin.se> <20250213120328.GV4991@pb2> <707df676aa8c92a0ffd72e095d52404466bdb7b4.camel@haerdin.se>
In-Reply-To: <707df676aa8c92a0ffd72e095d52404466bdb7b4.camel@haerdin.se>
Subject: Re: [FFmpeg-devel] [PATCH 8/8] Make mime-type award a bonus probe score
Reply-To: FFmpeg development discussions and patches
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6482996086943381402=="

--===============6482996086943381402==
Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="U64Xo+8Mv1yRwm2x"
Content-Disposition: inline

--U64Xo+8Mv1yRwm2x
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Feb 13, 2025 at 10:29:33PM +0100, Tomas Härdin wrote:
> Thu 2025-02-13 at 13:03 +0100, Michael Niedermayer wrote:
> > On Thu, Feb 13, 2025 at 12:40:24PM +0100, Tomas Härdin wrote:
> > > Wed 2025-02-12 at 23:03 +0100, Michael Niedermayer wrote:
> > > > On Wed, Feb 12, 2025 at 12:03:37PM +0100, Tomas Härdin wrote:
> > > > > Thu 2025-02-06 at 15:58 +0100, Michael Niedermayer wrote:
> > > > > > Hi Tomas
> > > > > >
> > > > > > On Wed, Feb 05, 2025 at 03:24:24PM +0100, Tomas Härdin wrote:
> > > > > > > Seems reasonable to me and passes FATE
> > > > > > >
> > > > > > > /Tomas
> > > > > >
> > > > > > >  avformat.h   |    2 +-
> > > > > > >  format.c     |    8 ++++----
> > > > > > >  libopenmpt.c |    2 +-
> > > > > > >  3 files changed, 6 insertions(+), 6 deletions(-)
> > > > > > > 01f04f79202640330d6be91b0215f92f14d1845a  0008-Make-mime-type-award-a-bonus-probe-score.patch
> > > > > > > From ecc3459990f2871fd907f96fe66362b8fea41bd8 Mon Sep 17 00:00:00 2001
> > > > > > > From: Peter Zebühr
> > > > > > > Date: Tue, 21 Nov 2023 14:16:49 +0100
> > > > > > > Subject: [PATCH 8/8] Make mime-type award a bonus probe score
> > > > > > >
> > > > > > > This changes the default behaviour of ffmpeg where content-type headers
> > > > > > > on an input gives an absolute probe score (of 75) to instead give a bonus
> > > > > > > score (of 30). This gives the probe a better chance to arrive at the
> > > > > > > correct format by (hopefully) giving a large enough bonus to push edge
> > > > > > > cases in the right direction (MPEG-PS vs MP3, I am looking at you) while
> > > > > > > also not adversely punishing clearer cases (raw ADTS marked as
> > > > > > > "audio/mpeg" for example).
> > > > > > >
> > > > > > > This patch was regression tested against 20 million recent podcast
> > > > > > > submissions (after content-type propagation was added to
> > > > > > > original-storage), and 50k Juno vodcasts submissions (ditto).
> > > > > > > No adverse effects observed (but the bonus may still need tweaking if
> > > > > > > other edge cases are detected in production).
> > > > > > > ---
> > > > > > >  libavformat/avformat.h   | 2 +-
> > > > > > >  libavformat/format.c     | 8 ++++----
> > > > > > >  libavformat/libopenmpt.c | 2 +-
> > > > > > >  3 files changed, 6 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > what is the score ?
> > > > > > a higher score means more likely but how much more ?
> > > > > > maybe we should come up with a more formal definition
> > > > > > like that score is the number of bits of entropy that were checked
> > > > > > or something like that.
> > > > > > in such a framework, adding 30 for a mime type match would probably
> > > > > > make sense
> > > > > >
> > > > > > without such a framework, adding 30 to an abstract score is hard to
> > > > > > review
> > > > > > beyond that, i don't see anything breaking from this but then i
> > > > > > don't think we have real tests for mime types
> > > > >
> > > > > We don't really have tests for the probe scores at all, which is a
> > > > > problem. Perhaps if we collected some tricky samples we could construct
> > > > > a test that demands a certain ordering of probe scores for them? For
> > > > > now scores are tested indirectly by the fact that most tests rely on
> > > > > correct probing
> > > >
> > > > we have
> > > > tools/probetest
> > > >
> > > > probetest [-f ] [ []]
> > >
> > > Yeah but that only tests with random data, not say an ordering of probe
> > > scores for actual test files.
> >
> > yes, it could/should be extended
> >
> > probetest as is is still quite useful though as it catches probe
> > functions which give high scores on random trash
>
> Might be better to leverage afl-fuzz since it is more wily in its
> tricks to provoke different program behavior. Then exit(1) whenever the
> test program probes something incorrectly. For example you could start
> with a small, valid MPEG-PS file and have afl-fuzz generate slightly
> different versions of it that don't probe as such

A real fuzzer will make every probe function probe incorrectly. Maybe I
misunderstood what you suggested.

What we want is that
1. Random binary, random ASCII, random UTF-8 and intermediates do not get
   detected as any format (that's what probetest does)
2. Format A is detected more as format A than as format B, where B != A.
   We and our users test this by simply using ffmpeg and FATE.

Testing that a "randomly damaged" A is still detected as A: I am not sure
this is actually generally useful. When such a damaged A doesn't exist in
practice, it would constrain our probing code for no gain. And I think real
world files are poorly modelled by randomly (bitwise) damaged files.

Having a really large corpus of real world odd files and testing probing on
them seems the "ideal" way to test probing to me.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

"You are 36 times more likely to die in a bathtub than at the hands of a
terrorist. Also, you are 2.5 times more likely to become a president and
2 times more likely to become an astronaut, than to die in a terrorist
attack."
-- Thoughty2

--U64Xo+8Mv1yRwm2x
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iF0EABEKAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCZ7eZwQAKCRBhHseHBAsP
q/7+AJwJvS78qIDITEGer0rb4okUcJMpAgCgk1lEKb7gYd9i/4UnATktgb+72VY=
=qsfH
-----END PGP SIGNATURE-----

--U64Xo+8Mv1yRwm2x--

--===============6482996086943381402==--
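
The two properties named in the reply (random input is not detected as any
format; a file of format A scores higher as A than as any B != A) can be
sketched as a small test harness. This is only a minimal sketch, not FFmpeg
code: the two probe functions, the 0..100 score range, DETECT_THRESHOLD and
the hand-built samples are hypothetical stand-ins. A real check would call
libavformat's probing and, as the mail argues, run over a large corpus of
real-world files.

```python
import random

# Hypothetical toy probes, NOT FFmpeg's real probe functions.
# Each takes a byte buffer and returns a score from 0 (no match) to 100.
def probe_wav(buf: bytes) -> int:
    # Strong evidence: 8 fixed bytes must match.
    if len(buf) >= 12 and buf[:4] == b"RIFF" and buf[8:12] == b"WAVE":
        return 100
    return 0

def probe_adts(buf: bytes) -> int:
    # Weak evidence: only a 12-bit sync word is checked, so the score stays low.
    if len(buf) >= 2 and buf[0] == 0xFF and (buf[1] & 0xF0) == 0xF0:
        return 25
    return 0

PROBES = {"wav": probe_wav, "adts": probe_adts}

# Assumed cut-off at or above which a buffer counts as "detected".
DETECT_THRESHOLD = 50

def count_random_false_positives(trials: int, size: int = 64, seed: int = 0) -> int:
    """Property 1: random buffers should not be detected as any format."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        buf = bytes(rng.getrandbits(8) for _ in range(size))
        if any(probe(buf) >= DETECT_THRESHOLD for probe in PROBES.values()):
            hits += 1
    return hits

def misdetected_samples(corpus):
    """Property 2: each sample must score strictly highest for its own format."""
    failures = []
    for expected, buf in corpus:
        scores = {name: probe(buf) for name, probe in PROBES.items()}
        own = scores[expected]
        # A tie (including 0 vs 0, i.e. "not detected at all") is a failure.
        if any(s >= own for name, s in scores.items() if name != expected):
            failures.append((expected, scores))
    return failures

# Hand-built stand-ins for a corpus of labelled real-world files.
CORPUS = [
    ("wav", b"RIFF\x24\x00\x00\x00WAVEfmt "),
    ("adts", b"\xff\xf1\x50\x80\x00\x1f\xfc"),
]

if __name__ == "__main__":
    print("random false positives:", count_random_false_positives(1000))
    print("misdetected samples:", misdetected_samples(CORPUS))
```

With a real corpus, each entry returned by misdetected_samples would carry the
full score table for the offending file, which is the ordering information
the thread notes that probetest alone does not give.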