From: Lynne <dev@lynne.ee>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] I've written a filter in Rust
Date: Fri, 21 Feb 2025 14:18:06 +0100
Message-ID: <eb5fa9b8-5ec6-45be-a4ec-ac4118cdaf5e@lynne.ee>
In-Reply-To: <418474f4-5b38-4a44-822a-8e3c367e673c@gmail.com>

On 20/02/2025 14:06, Leandro Santiago wrote:
> [insert meme here]
>
> (this will be a long e-mail)
>
> Dear FFmpeg devs,
>
> in the past days I've been experimenting hacking FFmpeg using Rust.
>
> As I am becoming more familiar with the libavfilter, and it is not a dependency for any other of the libav* libs, I decided this is a good candidate.
>
> It's also convenient as I use FFmpeg libs heavily in a commercial product, and one of the features I've been working on involves a basic multi object tracking.
>
> In my case, it does not need to be a "perfect" tracking algorithm, as I need to compromise quality of the result in exchange of performance executing in the CPU only, so most of the algorithms out there that need a GPU are out of my range.
>
> I then decided to use as a first experiment a filter called `track_sort` that implements the 2016 paper SIMPLE ONLINE AND REALTIME TRACKING WITH A DEEP ASSOCIATION METRIC, also known as SORT [1].
>
> The filter already works well based on the `master` branch, but the code itself is in very early stages and far from being "production ready", so please do not read the code assuming it's in its final form. It's ugly and needs lots of refactoring.
>
> I've created a PR on forgejo [4] to make it easier for others to track progress, although I use gitlab.com as my main forge.
>
> Here is a description of the filter:
>
> - It performs only object tracking, needing the object detection to be performed elsewhere. It feeds from the detection boxes generated by `dnn_detect`. That means that the quality of the tracking is closely related to the quality of the detection.

We've been looking to deprecate our native DNN stuff for years. It's bitrotten NIH code written long ago. It's not a good idea to base a new filter on it.

> - SORT is a simple algorithm that uses spatial data only, and it is not able to handle cases such as object occlusion. It's good enough for my use case, as I mentioned earlier.
>
> - The filter works with the default options, so you can pass it without any arguments. In this mode, it will try to track any objects from the boxes available. You can change this behaviour by specifying the list of labels to track, for example: `track_sort=labels=person|dog|cat`. Such labels come from the ML model you used in the detection filter. It also has the options `threshold`, `min_hits` and `max_age`, which control how the tracking algorithm works, but the default values should work well in most cases.
>
> - The filter will add the tracking information as a label on a new frame side data entry of type `AV_FRAME_DATA_DETECTION_BBOXES`. It **WILL NOT** override the side data from `dnn_detect`, meaning that the frame will have two side data entries of this type. I've created a PR that makes it possible to fetch such an entry [2].
>
> - The labels in the detection boxes have the format "track:<track_num>:<track_age>", and this is not the final format. I did it this way as a quick hack to have some visual information when drawing the boxes and labels with the `drawtext` and `drawbox` filters. I believe this can be improved by putting the tracking information as metadata of the `AVDetectionBBox`es, but this would be API and ABI breaking, so this is still an open question.

That's not a very useful filter. What this needs is a more complex and flexible interface than bounding boxes, one that can enable processing of individual objects and turning the tracking info into something useful, for example, splitting up a video into objects.

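To make the interim label format quoted above concrete, here is a minimal sketch of how a downstream consumer might pull the tracking fields out of a "track:<track_num>:<track_age>" label. It assumes both fields are unsigned decimal integers, which the original mail does not state. `TrackLabel` and `parse_track_label` are hypothetical names used only for illustration; they are not part of the filter or of FFmpeg.

```rust
/// Tracking info carried in a label such as "track:7:42".
/// Hypothetical type for illustration; not part of FFmpeg or the filter.
#[derive(Debug, PartialEq, Eq)]
struct TrackLabel {
    track_num: u64, // assumption: track number is an unsigned integer
    track_age: u64, // assumption: track age is an unsigned integer
}

/// Parses "track:<track_num>:<track_age>".
/// Returns None for any label that does not match, for example a plain
/// class label such as "person" coming from the detection model.
fn parse_track_label(label: &str) -> Option<TrackLabel> {
    // Require the literal "track:" prefix.
    let rest = label.strip_prefix("track:")?;
    // Split the remainder at the first ':' into the two fields.
    let (num, age) = rest.split_once(':')?;
    Some(TrackLabel {
        track_num: num.parse().ok()?,
        track_age: age.parse().ok()?,
    })
}
```

Calling `parse_track_label("track:7:42")` yields a `TrackLabel` with `track_num` 7 and `track_age` 42, while `parse_track_label("person")` yields `None`. Packing structured data into a string label forces every consumer to do this kind of parsing, which is one reason a richer interface than label text is worth considering.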
> What has not been done so far:
>
> I had quite a few goals in this task:
>
> - 1: get a working and efficient implementation of the SORT algorithm.
> - 2: start learning Rust again (it's been ~5 years since I used it)
> - 3: learn more about the libavfilter codebase
> - 4: evaluate whether Rust could work as a second language for hacking FFmpeg.

None of us knows Rust well enough to maintain and work on such code.

> Results:
>
> - 1: I managed to reuse lots of high quality code, available on crates (the repository of Rust packages), saving me from needing to write hairy math heavy code. I personally suck at maths, especially linear algebra. Using the paper and the reference implementation [3] was enough, although I do not understand all the math magic. For instance, I reused an existing crate for Kalman filters that I probably would need to implement by hand, as the alternative in C would probably be using the implementation that OpenCV offers. And I am aware that it's not practical to make OpenCV a dependency of FFmpeg.

Regardless of the language, I disagree with using crates in the context of FFmpeg, and with any use of cargo.

We used to link to OpenCV. The only reason we dropped it was that we used the C interface, which was removed, and no one was interested in rewriting it.

I also don't agree with a plugin interface. We've been over this a few times. A plugin interface would likely turn the project into gstreamer or dshow, with dozens of proprietary plugins, whose users would be asking us for support. We bump our ABI once a year, which is far too often for this as well.