From: Lynne
Date: Sun, 9 Mar 2025 16:46:31 +0100
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH FFmpeg 1/15] libavutil: add detectionbbox util functions
In-Reply-To: <007301db903a$7ea1afb0$7be50f10$@gmail.com>
References: <007301db903a$7ea1afb0$7be50f10$@gmail.com>

On 08/03/2025 15:58, m.kaindl0208@gmail.com wrote:
> Those functions will be used by classify in the upcoming patches.
>
> Try the new filters using my Github Repo https://github.com/MaximilianKaindl/DeepFFMPEGVideoClassification.
>
> Any Feedback is appreciated!
>
> Signed-off-by: MaximilianKaindl
> ---
>  libavutil/detection_bbox.c | 54 ++++++++++++++++++++++++++++++++++++++
>  libavutil/detection_bbox.h | 31 ++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
>
> diff --git a/libavutil/detection_bbox.c b/libavutil/detection_bbox.c
> index cb157b355b..378233121d 100644
> --- a/libavutil/detection_bbox.c
> +++ b/libavutil/detection_bbox.c
> @@ -18,6 +18,7 @@
>
>  #include "detection_bbox.h"
>  #include "mem.h"
> +#include "libavutil/avstring.h"
>
>  AVDetectionBBoxHeader *av_detection_bbox_alloc(uint32_t nb_bboxes, size_t *out_size)
>  {
> @@ -71,3 +72,56 @@ AVDetectionBBoxHeader *av_detection_bbox_create_side_data(AVFrame *frame, uint32
>
>      return header;
>  }
> +
> +int av_detection_bbox_fill_with_best_labels(char **labels, float *probabilities, int num_labels, AVDetectionBBox *bbox, int max_classes_per_box, float confidence_threshold) {
> +    int i, j, minpos, ret;
> +    float min;
> +
> +    if (!labels || !probabilities || !bbox) {
> +        return AVERROR(EINVAL);
> +    }
> +
> +    for (i = 0; i < num_labels; i++) {
> +        if (probabilities[i] >= confidence_threshold) {
> +            if (bbox->classify_count >= max_classes_per_box) {
> +                // Find lowest probability classification
> +                min = av_q2d(bbox->classify_confidences[0]);
> +                minpos = 0;
> +                for (j = 1; j < bbox->classify_count; j++) {
> +                    float prob = av_q2d(bbox->classify_confidences[j]);
> +                    if (prob < min) {
> +                        min = prob;
> +                        minpos = j;
> +                    }
> +                }
> +
> +                if (probabilities[i] > min) {
> +                    ret = av_detection_bbox_set_content(bbox, labels[i], minpos, probabilities[i]);
> +                    if (ret < 0)
> +                        return ret;
> +                }
> +            } else {
> +                ret = av_detection_bbox_set_content(bbox, labels[i], bbox->classify_count, probabilities[i]);
> +                if (ret < 0)
> +                    return ret;
> +                bbox->classify_count++;
> +            }
> +        }
> +    }
> +    return 0;
> +}
> +
> +int av_detection_bbox_set_content(AVDetectionBBox *bbox, char *label, int index, float probability) {
> +    // Set probability
> +    bbox->classify_confidences[index] = av_make_q((int)(probability * 10000), 10000);
> +
> +    // Copy label with size checking
> +    if (av_strlcpy(bbox->classify_labels[index], label, AV_DETECTION_BBOX_LABEL_NAME_MAX_SIZE) >=
> +        AV_DETECTION_BBOX_LABEL_NAME_MAX_SIZE) {
> +        av_log(NULL, AV_LOG_WARNING, "Label truncated in set_prob_and_label_of_bbox\n");
> +    }
> +
> +    return 0;
> +}
> diff --git a/libavutil/detection_bbox.h b/libavutil/detection_bbox.h
> index 011988052c..27d749ad59 100644
> --- a/libavutil/detection_bbox.h
> +++ b/libavutil/detection_bbox.h
> @@ -105,4 +105,35 @@ AVDetectionBBoxHeader *av_detection_bbox_alloc(uint32_t nb_bboxes, size_t *out_s
>   * AV_FRAME_DATA_DETECTION_BBOXES and initializes the variables.
>   */
>  AVDetectionBBoxHeader *av_detection_bbox_create_side_data(AVFrame *frame, uint32_t nb_bboxes);
> +
> +/**
> + * Fills an AVDetectionBBox structure with the best labels based on probabilities.
> + *
> + * This function selects up to max_classes_per_box labels with the highest probabilities
> + * that exceed the given confidence threshold, and assigns them to the bounding box.
> + *
> + * @param labels Array of label strings
> + * @param probabilities Array of probability values corresponding to each label
> + * @param num_labels Number of elements in the labels and probabilities arrays
> + * @param bbox Pointer to the AVDetectionBBox structure to be filled
> + * @param max_classes_per_box Maximum number of classes to assign to the bounding box
> + * @param confidence_threshold Minimum probability value required for a label to be considered
> + * @return 0 on success, negative error code on failure
> + */
> +int av_detection_bbox_fill_with_best_labels(char **labels, float *probabilities, int num_labels, AVDetectionBBox *bbox, int max_classes_per_box, float confidence_threshold);
> +
> +/**
> + * Sets the content of an AVDetectionBBox at the specified index.
> + *
> + * This function assigns a label and its associated probability to the specified index
> + * in the bounding box's internal storage.
> + *
> + * @param bbox Pointer to the AVDetectionBBox structure to modify
> + * @param label The class label to assign (will be copied internally)
> + * @param index The index at which to store the label and probability
> + * @param probability The confidence score/probability for this label
> + * @return 0 on success
> + */
> +int av_detection_bbox_set_content(AVDetectionBBox *bbox, char *label, int index, float probability);

This is outside the scope of the file IMO. Not something that should be in the public API.
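
For reference, the selection logic in the quoted patch (keep up to max_classes_per_box labels at or above confidence_threshold, replacing the weakest kept entry once the box is full) can be sketched without any FFmpeg types, for instance as a file-local helper instead of public API. This is only a minimal sketch under stated assumptions, not the patch author's code: LabelSet, label_set_store, label_set_fill_best, MAX_CLASSES and LABEL_MAX are all hypothetical names, and plain floats stand in for the AVRational confidences.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CLASSES 4   /* hypothetical cap, stands in for the per-box array size */
#define LABEL_MAX   64  /* hypothetical label buffer size */

/* Hypothetical stand-in for the classification fields of a bounding box. */
typedef struct LabelSet {
    char  labels[MAX_CLASSES][LABEL_MAX];
    float scores[MAX_CLASSES];
    int   count;
} LabelSet;

/* Store one label/score pair in slot idx, truncating the label if needed. */
static void label_set_store(LabelSet *set, int idx, const char *label, float score)
{
    strncpy(set->labels[idx], label, LABEL_MAX - 1);
    set->labels[idx][LABEL_MAX - 1] = '\0';
    set->scores[idx] = score;
}

/*
 * Keep at most max_keep labels whose score is >= threshold, preferring
 * higher scores. Returns 0 on success, -1 on invalid arguments.
 */
static int label_set_fill_best(LabelSet *set, const char *const *labels,
                               const float *scores, int num_labels,
                               int max_keep, float threshold)
{
    if (!set || !labels || !scores || num_labels < 0 ||
        max_keep < 1 || max_keep > MAX_CLASSES ||
        set->count < 0 || set->count > MAX_CLASSES)
        return -1;

    for (int i = 0; i < num_labels; i++) {
        if (scores[i] < threshold)
            continue;

        if (set->count < max_keep) {
            /* Free slot available: append. */
            label_set_store(set, set->count, labels[i], scores[i]);
            set->count++;
            continue;
        }

        /* Set is full: find the weakest kept entry. */
        int minpos = 0;
        for (int j = 1; j < set->count; j++)
            if (set->scores[j] < set->scores[minpos])
                minpos = j;

        /* Replace it only if the candidate is strictly better. */
        if (scores[i] > set->scores[minpos])
            label_set_store(set, minpos, labels[i], scores[i]);
    }
    return 0;
}
```

Unlike the quoted patch, the sketch rejects a max_keep larger than the fixed array size, so a caller-supplied limit cannot index past the per-box storage.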