From mboxrd@z Thu Jan 1 00:00:00 1970
To: "'FFmpeg development discussions and patches'"
References: <007d01db903a$ffe93ca0$ffbbb5e0$@gmail.com> <20250309191924.GK4991@pb2>
In-Reply-To: <20250309191924.GK4991@pb2>
Date: Sun, 9 Mar 2025 21:24:36 +0100
Message-ID: <00f001db9131$470fbb80$d52f3280$@gmail.com>
MIME-Version: 1.0
Subject: Re: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation
List-Id: FFmpeg development discussions and patches
Content-Type: text/plain;
 charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi Michael,

You are right. The workflow is that any classification above the
confidence threshold parameter (default 0.5) is written to the frame's
side data, then read by the avgclass filter and averaged.

If the threshold is set to 0.01 or lower, and one frame detects a cat
with 0.99 confidence while another detects it with 0.01 confidence, the
average is indeed 0.5. That is the same as two frames with 0.5
confidence each, even though the two cases represent very different
detection scenarios.

I think averaging classifications makes more sense when the goal is not
to classify specific objects in individual frames, but to identify
general characteristics of the entire video. For my project, I am aiming
to classify movies by their recording system, genre and content type.
I use CLIP/CLAP to capture the overall "vibe"/facts in the images or
audio, which is why I implemented category classification this way.

Example LLM-generated categories file for classifying recording system,
genre and content type:
https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification/blob/main/resources/labels/categories_clip.txt

In my testing, combined with scene classification, this approach works
reasonably well for my use case.

For the cat detection example, a higher confidence threshold would be
more appropriate, so that only frames that confidently show a cat
contribute to the average. I recognize there might be better approaches
for specific detection tasks, and I should probably add a new example to
the doc that better demonstrates the most useful application cases.

If we could guarantee that only a single animal type appears in the
entire video, this averaging approach would be effective.
However, this scenario is highly unrealistic outside of controlled
settings like Google Lens classifications, where users typically focus
the camera on just one specific subject at a time.

Kind regards

-----Original Message-----
From: ffmpeg-devel On Behalf Of Michael Niedermayer
Sent: Sunday, 9 March 2025 20:19
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation

Hi Maximilian

On Sat, Mar 08, 2025 at 04:01:40PM +0100, m.kaindl0208@gmail.com wrote:
> Try the new filters using my Github Repo https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification.
>
> Any Feedback is appreciated!
>
> Signed-off-by: MaximilianKaindl
> ---
>  doc/filters.texi | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index b6cccbacb6..bd75982d7d 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -30827,6 +30827,70 @@ ffplay -f lavfi 'amovie=input.mp3, asplit [a][out1];

[...]

> +@example
> +Classification averages:
> +Stream #0:
> +  Label: cat: Average probability 0.8765, Appeared 120 times
> +  Label: dog: Average probability 0.3421, Appeared 42 times
> +Stream #1:
> +  Label: music: Average probability 0.9823, Appeared 315 times
> +  Label: speech: Average probability 0.1245, Appeared 15 times
> +@end example

Nice!
How exactly does one interpret the average probability?
I mean, if one frame detects a cat with 0.99 and another with 0.01, does
that give an average of 0.5?
I am asking because that does not seem like the most useful metric: two
frames with 0.5 would be a much weaker indicator than one frame with
0.99 that there was at least one cat (if these behave like standard
probabilities).

thx

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say
it by chance, they would not even know they had done so.
-- Xenophanes
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
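
The arithmetic behind the averaging question in this thread can be checked with a small sketch. It is not code from the patch: the function names and score lists are hypothetical, and the "at least one cat" figure additionally assumes that per-frame scores are calibrated, independent probabilities, which is the "if these behave like standard probabilities" condition and may not hold for CLIP/CLAP outputs.

```python
# Illustrative only: hypothetical helpers, not part of the avgclass filter.

def mean_confidence(scores):
    """Arithmetic mean of per-frame confidences for one label."""
    return sum(scores) / len(scores)


def prob_at_least_once(scores):
    """Probability that the label is present in at least one frame.

    Assumes each score is a calibrated probability and that frames are
    independent; both are assumptions made for this sketch.
    """
    p_absent_everywhere = 1.0
    for s in scores:
        p_absent_everywhere *= 1.0 - s
    return 1.0 - p_absent_everywhere


# The two scenarios from the discussion (threshold assumed <= 0.01).
mixed = [0.99, 0.01]   # one confident frame, one near-zero frame
even = [0.50, 0.50]    # two uncertain frames

print("mean:", mean_confidence(mixed), mean_confidence(even))
print("at least once:", prob_at_least_once(mixed), prob_at_least_once(even))
```

Both lists have the same mean, while under the independence assumption the "at least once" value is about 0.99 for the first list and 0.75 for the second, which is the gap the plain average hides.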