From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 8 Mar 2025 16:01:40 +0100
Message-ID: <007d01db903a$ffe93ca0$ffbbb5e0$@gmail.com>
Subject: [FFmpeg-devel] [PATCH FFmpeg 11/15] doc: avgclass Filter Documentation
List-Id: FFmpeg development discussions and patches
Content-Type: text/plain; charset="us-ascii"

Try the new filters using my Github Repo
https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification.
Any feedback is appreciated!

Signed-off-by: MaximilianKaindl
---
 doc/filters.texi | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index b6cccbacb6..bd75982d7d 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -30827,6 +30827,70 @@ ffplay -f lavfi 'amovie=input.mp3, asplit [a][out1];
 This filter supports the all above options as commands except options
 @code{size} and @code{rate}.
 
+@section avgclass
+
+Average classification probabilities across multiple frames for both audio and video streams.
+
+This filter reads classification results from frame side data (detection bounding boxes) and calculates the average confidence score for each label. It accepts output from the @code{dnn_classify} filter or any other source that attaches @code{AVDetectionBBox} side data, and averages over the entire stream.
+
+At the end of the stream (or when triggered by a command), the filter writes the average probability for each detected class to the log and, optionally, to a CSV file.
+
+@table @option
+@item output_file
+Path to a CSV file to which the average classification results are written. If not specified, results are only printed to the log.
+
+@item v
+Specify the number of video streams (default: 1).
+
+@item a
+Specify the number of audio streams (default: 0).
+@end table
+
+This filter supports the following commands:
+
+@table @option
+@item writeinfo
+Immediately write the current average classification results to the log and to the output file (if specified) without waiting for the stream to end.
+
+@item flush
+Force the filter to write results and flush all its internal state.
+@end table
+
+@subsection Examples
+
+Process a video with object detection and classification, then calculate average classification probabilities:
+@example
+ffmpeg -i input.mp4 -vf "dnn_detect=model=detection.xml:input=data:output=detection_out:confidence=0.5,dnn_classify=model=classification.pt:dnn_backend=torch:tokenizer=tokenizer.json:labels=labels.txt,avgclass=output_file=results.csv" -f null -
+@end example
+
+Process both audio and video classification:
+@example
+ffmpeg -i input.mkv -filter_complex "[0:v]dnn_classify[v0]; [0:a]aformat=sample_fmts=fltp,dnn_classify=dnn_backend=torch:model=clap_model.pt:is_audio=1:tokenizer=tokenizer.json:labels=audio_labels.txt[a0]; [v0][a0]avgclass=v=1:a=1:output_file=av_results.csv" -f null -
+@end example
+
+@subsection Output Format
+
+When the filter completes processing (or when the @code{writeinfo} command is sent), it outputs classification results in this format:
+
+@example
+Classification averages:
+Stream #0:
+ Label: cat: Average probability 0.8765, Appeared 120 times
+ Label: dog: Average probability 0.3421, Appeared 42 times
+Stream #1:
+ Label: music: Average probability 0.9823, Appeared 315 times
+ Label: speech: Average probability 0.1245, Appeared 15 times
+@end example
+
+If an output file is specified, the same data is written in CSV format:
+@example
+stream_id,label,avg_probability,count
+0,cat,0.8765,120
+0,dog,0.3421,42
+1,music,0.9823,315
+1,speech,0.1245,15
+@end example
+
 @section bench, abench
 
 Benchmark part of a filtergraph.
-- 
2.34.1
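
For anyone who wants to sanity-check the numbers the filter reports, the averaging described in the documentation above can be sketched outside FFmpeg. The Python below is only an illustration and not the filter's C implementation. The function names `average_classifications` and `to_csv`, the `(stream_id, label, probability)` input tuples, and the sorted row order are assumptions made for this sketch. Only the CSV header and the four-decimal formatting come from the documented output.

```python
import csv
import io
from collections import defaultdict


def average_classifications(observations):
    """Average per-label probabilities for each stream.

    `observations` is an iterable of (stream_id, label, probability) tuples,
    a hypothetical stand-in for the per-frame classification side data.
    Returns {stream_id: {label: (avg_probability, count)}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for stream_id, label, probability in observations:
        sums[stream_id][label] += probability
        counts[stream_id][label] += 1
    return {
        stream_id: {
            label: (sums[stream_id][label] / counts[stream_id][label],
                    counts[stream_id][label])
            for label in sums[stream_id]
        }
        for stream_id in sums
    }


def to_csv(averages):
    """Render averages with the CSV header shown in the documentation.

    Row order (sorted by stream, then label) is an assumption of this sketch.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["stream_id", "label", "avg_probability", "count"])
    for stream_id in sorted(averages):
        for label in sorted(averages[stream_id]):
            avg, count = averages[stream_id][label]
            writer.writerow([stream_id, label, f"{avg:.4f}", count])
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up observations, purely for demonstration.
    sample = [(0, "cat", 0.9), (0, "cat", 0.8), (0, "dog", 0.3), (1, "music", 1.0)]
    print(to_csv(average_classifications(sample)), end="")
```

A label's count here is the number of observations that carried it, which mirrors the "Appeared N times" field in the log output. Whether the filter counts per frame or per bounding box is not stated in the documentation, so treat that detail as open.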