From: <m.kaindl0208@gmail.com>
To: "'FFmpeg development discussions and patches'" <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation
Date: Sun, 9 Mar 2025 21:24:36 +0100
Message-ID: <00f001db9131$470fbb80$d52f3280$@gmail.com>
In-Reply-To: <20250309191924.GK4991@pb2>

Hi Michael,

You are right. The workflow is that any classification above the confidence value parameter (default 0.5) gets written to the Side data of the Frame, then read by the avgclass filter and averaged. Given the parameter was set to 0.01 or lower, if one frame detects a cat with 0.99 confidence and another with 0.01 confidence, the average would indeed be 0.5 - the same as two frames with 0.5 confidence each, despite these representing very different detection scenarios.

I think the average classification approach makes more sense when the goal is not to classify specific objects in individual frames, but rather to identify general characteristics about the entire video. For my project, I am aiming to classify movies by their Recording System, Genre and Content type. I use CLIP/CLAP to capture the overall "vibe"/facts in the images or audio, which is why I implemented category classification this way.

Example LLM generated categories file for classifying Recording System, Genre and Content type:
https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification/blob/main/resources/labels/categories_clip.txt

In my testing, combined with scene classification, this approach works reasonably well for my use case. For the cat detection example, setting a higher confidence threshold would be more appropriate to ensure it is detecting a cat. I recognize there might be better approaches for specific detection tasks, and I should probably create a new example in the doc that better demonstrates the most useful application cases.

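
To make the arithmetic above concrete, here is a minimal Python sketch. It is not the avgclass implementation: the function names `average_confidence` and `at_least_once` are hypothetical, the strict "greater than" threshold comparison is an assumption, and the second function assumes the scores behave like independent probabilities, which the discussion in this thread leaves open.

```python
from math import prod


def average_confidence(scores, threshold=0.5):
    """Average the scores strictly above `threshold` (assumed comparison).

    Returns (average, count); (0.0, 0) when nothing passes the threshold.
    """
    kept = [s for s in scores if s > threshold]
    if not kept:
        return 0.0, 0
    return sum(kept) / len(kept), len(kept)


def at_least_once(scores):
    """Probability that the label is present in at least one frame,
    ASSUMING the per-frame scores are independent probabilities."""
    return 1.0 - prod(1.0 - s for s in scores)


mixed = [0.99, 0.01]  # one strong and one very weak detection
even = [0.5, 0.5]     # two lukewarm detections

# With a threshold low enough to keep every score, both averages come out near 0.5.
print(average_confidence(mixed, threshold=0.0))
print(average_confidence(even, threshold=0.0))

# An "at least one frame" combination separates the two cases.
print(at_least_once(mixed))
print(at_least_once(even))
```

Under these assumptions the "at least one frame" value is roughly 0.99 for the mixed case and 0.75 for the even case, while with the default threshold of 0.5 only the 0.99 detection of the mixed case would be averaged at all.
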
If we could guarantee that only a single animal type appears in the entire video, this averaging approach would be effective. However, this scenario is highly unrealistic outside of controlled settings like Google Lens classifications, where users typically focus the camera on just one specific subject at a time.

Kind regards

-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Michael Niedermayer
Sent: Sunday, 9 March 2025 20:19
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation

Hi Maximilian

On Sat, Mar 08, 2025 at 04:01:40PM +0100, m.kaindl0208@gmail.com wrote:
> Try the new filters using my Github Repo https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification.
>
> Any Feedback is appreciated!
>
> Signed-off-by: MaximilianKaindl <m.kaindl0208@gmail.com>
> ---
>  doc/filters.texi | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index b6cccbacb6..bd75982d7d 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -30827,6 +30827,70 @@ ffplay -f lavfi 'amovie=input.mp3, asplit [a][out1];

[...]

> +@example
> +Classification averages:
> +Stream #0:
> +  Label: cat: Average probability 0.8765, Appeared 120 times
> +  Label: dog: Average probability 0.3421, Appeared 42 times
> +Stream #1:
> +  Label: music: Average probability 0.9823, Appeared 315 times
> +  Label: speech: Average probability 0.1245, Appeared 15 times
> +@end example

Nice!
How exactly does one interpret the average probability?
I mean, if one frame detects a cat with 0.99 and another with 0.01, does that give an average of 0.5?
I am asking because that does not seem the most useful metric: two frames with 0.5 would be a much weaker indicator than one frame with 0.99 that there was at least one cat (if these behave like standard probabilities).

thx

[...]
--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even known they had done so. -- Xenophanes

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".