Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: <m.kaindl0208@gmail.com>
To: <ffmpeg-devel@ffmpeg.org>
Subject: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation
Date: Sat, 8 Mar 2025 16:01:40 +0100
Message-ID: <007d01db903a$ffe93ca0$ffbbb5e0$@gmail.com>

Try the new filters using my GitHub repo: https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification

Any feedback is appreciated!

Signed-off-by: MaximilianKaindl <m.kaindl0208@gmail.com>
---
 doc/filters.texi | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index b6cccbacb6..bd75982d7d 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -30827,6 +30827,70 @@ ffplay -f lavfi 'amovie=input.mp3, asplit [a][out1];
 
 This filter supports the all above options as commands except options @code{size} and @code{rate}.
 
+@section avgclass
+
+Average classification probabilities across multiple frames for both audio and video streams.
+
+This filter reads classification results from frame side data (@code{AVDetectionBBox} bounding boxes) and computes the average confidence score for each label over the entire stream. The side data can come from the @code{dnn_classify} filter or from any other source that attaches @code{AVDetectionBBox} side data.
+
+At the end of the stream (or when triggered by a command), the filter prints the average probability of each detected class to the log and, if @option{output_file} is set, also writes it to a CSV file.
+
+@table @option
+@item output_file
+Path to a CSV file to which the average classification results are written. If not specified, the results are only printed to the log.
+
+@item v
+Set the number of input video streams. Default is 1.
+
+@item a
+Set the number of input audio streams. Default is 0.
+@end table
+
+This filter supports the following commands:
+
+@table @option
+@item writeinfo
+Immediately write current average classification results to the log and output file (if specified) without waiting for the stream to end.
+
+@item flush
+Force the filter to write results and flush all its internal state.
+@end table
+
+@subsection Examples
+
+Process a video with object detection and classification, then calculate average classification probabilities:
+@example
+ffmpeg -i input.mp4 -vf "dnn_detect=model=detection.xml:input=data:output=detection_out:confidence=0.5,dnn_classify=model=classification.pt:dnn_backend=torch:tokenizer=tokenizer.json:labels=labels.txt,avgclass=output_file=results.csv" -f null -
+@end example
+
+Average classification results for both a video and an audio stream:
+@example
+ffmpeg -i input.mkv -filter_complex "[0:v]dnn_classify[v0]; [0:a]aformat=sample_fmts=fltp,dnn_classify=dnn_backend=torch:model=clap_model.pt:is_audio=1:tokenizer=tokenizer.json:labels=audio_labels.txt[a0]; [v0][a0]avgclass=v=1:a=1:output_file=av_results.csv" -f null -
+@end example
+
+@subsection Output Format
+
+When the filter completes processing (or when the @code{writeinfo} command is sent), it outputs classification results in this format:
+
+@example
+Classification averages:
+Stream #0:
+  Label: cat: Average probability 0.8765, Appeared 120 times
+  Label: dog: Average probability 0.3421, Appeared 42 times
+Stream #1:
+  Label: music: Average probability 0.9823, Appeared 315 times
+  Label: speech: Average probability 0.1245, Appeared 15 times
+@end example
+
+If an output file is specified, the same data is written in CSV format:
+@example
+stream_id,label,avg_probability,count
+0,cat,0.8765,120
+0,dog,0.3421,42
+1,music,0.9823,315
+1,speech,0.1245,15
+@end example
+
 @section bench, abench
 
 Benchmark part of a filtergraph.
-- 
2.34.1

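The averaging described in the documentation above can be illustrated with a small Python sketch. This is not the filter's implementation, which is not part of this patch. It assumes that the reported value is a plain arithmetic mean of the per-frame confidences for each (stream, label) pair, and the names ClassAverager, add and averages are hypothetical.

```python
from collections import defaultdict


class ClassAverager:
    """Accumulate per-label confidences and report their arithmetic mean.

    Hypothetical helper for illustration only; not part of FFmpeg.
    """

    def __init__(self):
        # (stream_id, label) -> [sum of confidences, number of appearances]
        self._acc = defaultdict(lambda: [0.0, 0])

    def add(self, stream_id, label, confidence):
        """Record one classification result taken from one frame."""
        entry = self._acc[(stream_id, label)]
        entry[0] += confidence
        entry[1] += 1

    def averages(self):
        """Return {(stream_id, label): (avg_probability, count)}."""
        return {key: (total / count, count)
                for key, (total, count) in self._acc.items()}


# Three frames labelled "cat" and one labelled "dog" on stream 0.
averager = ClassAverager()
for conf in (0.9, 0.8, 0.7):
    averager.add(0, "cat", conf)
averager.add(0, "dog", 0.25)
result = averager.averages()
```

Keeping only a running sum and a count per label means the state stays small regardless of stream length, which is one plausible way to support reporting intermediate results at any time, as the writeinfo command does.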

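For anyone post-processing the CSV file, here is a minimal Python sketch that reads the format documented in the patch above and picks the label with the highest average probability per stream. It assumes the file has exactly the header shown in the documentation (stream_id,label,avg_probability,count); the function names read_avgclass_csv and top_label_per_stream are hypothetical, and the sample rows are copied from the documentation example.

```python
import csv
import io

# Sample content copied from the "Output Format" example in the patch.
SAMPLE = """stream_id,label,avg_probability,count
0,cat,0.8765,120
0,dog,0.3421,42
1,music,0.9823,315
1,speech,0.1245,15
"""


def read_avgclass_csv(fileobj):
    """Parse the CSV rows into dicts with typed values (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(fileobj):
        rows.append({
            "stream_id": int(row["stream_id"]),
            "label": row["label"],
            "avg_probability": float(row["avg_probability"]),
            "count": int(row["count"]),
        })
    return rows


def top_label_per_stream(rows):
    """Return {stream_id: label with the highest average probability}."""
    best = {}
    for r in rows:
        cur = best.get(r["stream_id"])
        if cur is None or r["avg_probability"] > cur["avg_probability"]:
            best[r["stream_id"]] = r
    return {sid: r["label"] for sid, r in best.items()}


rows = read_avgclass_csv(io.StringIO(SAMPLE))
top = top_label_per_stream(rows)
```

To read a real results file instead of the sample, pass an open file, for example open("results.csv", newline=""), to read_avgclass_csv.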