From: <m.kaindl0208@gmail.com>
To: <ffmpeg-devel@ffmpeg.org>
Subject: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation
Date: Sat, 8 Mar 2025 16:01:40 +0100
Message-ID: <007d01db903a$ffe93ca0$ffbbb5e0$@gmail.com>

Try the new filters using my Github Repo
https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification.

Any Feedback is appreciated!

Signed-off-by: MaximilianKaindl <m.kaindl0208@gmail.com>
---
 doc/filters.texi | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index b6cccbacb6..bd75982d7d 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -30827,6 +30827,70 @@ ffplay -f lavfi 'amovie=input.mp3, asplit [a][out1];
 This filter supports the all above options as commands except options
 @code{size} and @code{rate}.
 
+@section avgclass
+
+Average classification probabilities across multiple frames for both audio and video streams.
+
+This filter analyzes classification data from frame side data (bounding boxes) and calculates average confidence scores for each label. The filter processes classification metadata from the @code{dnn_classify} filter or other sources that generate AVDetectionBBox side data, computing averages over the entire stream.
+
+At the end of the stream (or when manually triggered), the filter outputs the average probability for each detected class, both to console logs and optionally to a CSV file.
+
+@table @option
+@item output_file
+Path to a CSV output file where average classification results will be written. If not specified, results are only printed to log output.
+
+@item v
+Specify the number of video streams (default: 1).
+
+@item a
+Specify the number of audio streams (default: 0).
+@end table
+
+This filter supports the following commands:
+
+@table @option
+@item writeinfo
+Immediately write current average classification results to the log and output file (if specified) without waiting for the stream to end.
+
+@item flush
+Force the filter to write results and flush all its internal state.
+@end table
+
+@subsection Examples
+
+Process a video with object detection and classification, then calculate average classification probabilities:
+@example
+ffmpeg -i input.mp4 -vf "dnn_detect=model=detection.xml:input=data:output=detection_out:confidence=0.5,dnn_classify=model=classification.pt:dnn_backend=torch:tokenizer=tokenizer.json:labels=labels.txt,avgclass=output_file=results.csv" -f null -
+@end example
+
+Process both audio and video classification:
+@example
+ffmpeg -i input.mkv -filter_complex "[0:v]dnn_classify[v0]; [0:a]aformat=sample_fmts=fltp,dnn_classify=dnn_backend=torch:model=clap_model.pt:is_audio=1:tokenizer=tokenizer.json:labels=audio_labels.txt[a0]; [v0][a0]avgclass=v=1:a=1:output_file=av_results.csv" -f null -
+@end example
+
+@subsection Output Format
+
+When the filter completes processing (or when the @code{writeinfo} command is sent), it outputs classification results in this format:
+
+@example
+Classification averages:
+Stream #0:
+  Label: cat: Average probability 0.8765, Appeared 120 times
+  Label: dog: Average probability 0.3421, Appeared 42 times
+Stream #1:
+  Label: music: Average probability 0.9823, Appeared 315 times
+  Label: speech: Average probability 0.1245, Appeared 15 times
+@end example
+
+If an output file is specified, the same data is written in CSV format:
+@example
+stream_id,label,avg_probability,count
+0,cat,0.8765,120
+0,dog,0.3421,42
+1,music,0.9823,315
+1,speech,0.1245,15
+@end example
+
 @section bench, abench
 
 Benchmark part of a filtergraph.
-- 
2.34.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
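For readers who want to see the averaging that the documentation describes, here is a minimal Python sketch. It is not the filter's implementation. It assumes each classification arrives as a (stream_id, label, probability) tuple and that the average is taken over the frames in which a label appeared, which is one reading of the "Appeared N times" wording. The function names average_classifications and to_csv are hypothetical, and the four-decimal formatting is inferred from the sample output only.

```python
import csv
import io
from collections import defaultdict


def average_classifications(detections):
    """Return {(stream_id, label): (average probability, count)}.

    `detections` is an iterable of (stream_id, label, probability) tuples.
    Assumption: a label is averaged only over the frames where it appeared.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stream_id, label, probability in detections:
        key = (stream_id, label)
        sums[key] += probability
        counts[key] += 1
    return {key: (sums[key] / counts[key], counts[key]) for key in sums}


def to_csv(averages):
    """Render averages with the CSV header shown in the patch's example."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["stream_id", "label", "avg_probability", "count"])
    for (stream_id, label), (avg, count) in sorted(averages.items()):
        # Four decimals mirrors the sample output; the real precision is an assumption.
        writer.writerow([stream_id, label, f"{avg:.4f}", count])
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [(0, "cat", 0.9), (0, "cat", 0.8), (1, "music", 1.0)]
    print(to_csv(average_classifications(sample)), end="")
```

Keeping a running sum and count per (stream, label) pair means intermediate results can be reported at any point, which is what a command such as writeinfo would need.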