Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: "Guo, Yejun" <yejun.guo-at-intel.com@ffmpeg.org>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image classification using CLIP models
Date: Tue, 4 Mar 2025 11:28:35 +0000
Message-ID: <PH7PR11MB5957D3BC03D7C4C8369CEAD5F1C82@PH7PR11MB5957.namprd11.prod.outlook.com>
In-Reply-To: <382d01db81ff$06ac4b80$1404e280$@gmail.com>



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> m.kaindl0208@gmail.com
> Sent: Tuesday, February 18, 2025 8:17 PM
> To: 'FFmpeg development discussions and patches' <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image
> classification using CLIP models
> 
> The new backend is an extension of the existing Torch backend rather than a
> separate implementation.
In general, a new use case does not require adding another backend for libtorch.

> 
> Inference in CLIP differs from other models as it encodes (embeds) both
> images and tokenized text labels, then calculates the similarity between the
> encoded vectors. As a result, its forward pass takes two inputs and produces
> two outputs.
Is the tokenizer expected to be used by other DNN filters? If so, we could put it
in a shared file such as libavfilter/dnn_filter_common.c.
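
For readers unfamiliar with the scoring step described in the quoted paragraph, here is a minimal sketch of how similarity between an image embedding and several text-label embeddings can be turned into class probabilities. It is not taken from the patch: the function names, the toy 3-dimensional vectors, and the `logit_scale` value are assumptions for illustration only, and real embeddings would come from the model's image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probs(image_embedding, text_embeddings, logit_scale=100.0):
    """Return one probability per text label for a single image embedding.

    logit_scale is a hypothetical stand-in for the model's temperature;
    the actual value depends on the exported model.
    """
    img = l2_normalize(image_embedding)
    logits = []
    for text in text_embeddings:
        txt = l2_normalize(text)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(logit_scale * cosine)
    return softmax(logits)


# Toy embeddings (assumed values, not model output).
labels = ["cat", "dog"]
image = [0.9, 0.1, 0.0]
texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_probs(image, texts)
best = labels[probs.index(max(probs))]
print(best, probs)
```

In this sketch the tokenizer and both encoders are out of scope; only the comparison of the two encoded outputs is shown, which is the part that differs from a single-input, single-output classifier.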

> 
> To ensure clarity and modularity, I have created a separate
> dnn_torch_backend_clip file instead of expanding dnn_torch_backend. This
> keeps the main file manageable and allows for easy exclusion from the build
> when the tokenizer-cpp library is not included.
> 
> If preferred, I can implement a standalone tokenizer class, integrate it into e.g.
> libavutil, and move the remaining code to the backend.
> 
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Guo,
> Yejun
> Sent: Tuesday, 18 February 2025 11:09
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image
> classification using CLIP models
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > Maximilian Kaindl
> > Sent: Tuesday, February 18, 2025 12:29 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image
> > classification using CLIP models
> >
> > Hello Yejun Guo,
> >
> > Yes, I can do that and submit it in a separate patch. Do you also have
> > feedback on the CLIP backend? I have already made some small
> > changes (CUDA acceleration and new preprocessing) that I will submit
> > along with the other patch, but I would like to hear your thoughts.
> >
> Could you share why we need a new backend?
> 
> > Thanks
> >
> > ________________________________
> > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> on behalf of Guo,
> > Yejun <yejun.guo-at-intel.com@ffmpeg.org>
> > Sent: Sunday, February 16, 2025 7:09 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image
> > classification using CLIP models
> >
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> > > m.kaindl0208@gmail.com
> > > Sent: Thursday, January 30, 2025 4:33 AM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Subject: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image
> > > classification using CLIP models
> > >
> > > Add a new filter 'dnn_clip' that performs zero-shot image
> > > classification using CLIP (Contrastive Language-Image Pre-Training)
> > > models. The filter supports:
> >
> > For image classification with new DNN models, it would be better to add
> > the new model support to the existing dnn_classify filter:
> > https://ffmpeg.org/ffmpeg-filters.html#dnn_005fclassify
> >
> >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel@ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


Thread overview: 6+ messages
2025-01-29 20:33 m.kaindl0208
2025-02-16  6:08 ` Guo, Yejun
2025-02-17 16:28   ` Maximilian Kaindl
2025-02-18 10:09     ` Guo, Yejun
2025-02-18 12:17       ` m.kaindl0208
2025-03-04 11:28         ` Guo, Yejun [this message]
