Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Vittorio Palmisano <vpalmisano-at-gmail.com@ffmpeg.org>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] Whisper audio filter
Date: Fri, 11 Jul 2025 10:41:04 +0200
Message-ID: <CADv15W-W3=VkGcJfnnbD7mw5JdxhB7Vn+Dr_O4+4Pt47YSnHqg@mail.gmail.com> (raw)
In-Reply-To: <20250710122008.GP29660@pb2>

> > +
> > +    memcpy(wctx->audio_buffer, wctx->audio_buffer + end_pos,
> > +           end_pos * sizeof(float));
>
> sizeof(*wctx->audio_buffer) is more robust than float

But end_pos is not necessarily equal to the audio_buffer size; it
could be lower.
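
For reference, sizeof(*wctx->audio_buffer) is the size of a single element, not of the whole buffer, so it can be combined with any element count such as end_pos. Below is a minimal sketch of that point, assuming audio_buffer is a float pointer (as the sizeof(float) in the patch suggests). SketchContext, element_size() and shift_to_front() are hypothetical names used only for illustration, and the sketch uses memmove instead of the patch's memcpy so that it stays correct even if the two ranges overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the filter context; only the field used here. */
typedef struct SketchContext {
    float *audio_buffer;
} SketchContext;

/* Size in bytes of ONE element of audio_buffer.
 * sizeof does not evaluate its operand, so this does not depend on how
 * many samples the buffer holds, and it follows the element type
 * automatically if that type ever changes. */
static size_t element_size(const SketchContext *ctx)
{
    return sizeof(*ctx->audio_buffer);
}

/* Move `count` samples starting at index `from` to the front of the
 * buffer. memmove (an assumption of this sketch, not the patch's call)
 * is safe for overlapping source and destination ranges. */
static void shift_to_front(SketchContext *ctx, size_t from, size_t count)
{
    memmove(ctx->audio_buffer, ctx->audio_buffer + from,
            count * sizeof(*ctx->audio_buffer));
}
```

With this form the byte count is always "number of samples times element size", whatever value end_pos has.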

>
> not sure how others think of this, but i would ignore the 80 char limit and format this like:
>
> static const AVOption whisper_options[] = {
>     { "model"   , "Path to the whisper.cpp model file"                 , OFFSET(model_path), AV_OPT_TYPE_STRING,.flags = FLAGS },
>     { "language", "Language for transcription ('auto' for auto-detect)", OFFSET(language)  , AV_OPT_TYPE_STRING, {.str = "auto"},             .flags = FLAGS },

I've used `indent -i4 -kr -nut` to format the code.

>
> Also, it seems this is a lot slower than whisper-cli
>
> time whisper-cli  matrix.wav -m ~/whisper.cpp/models/ggml-base.en.bin  --output-srt
> real    0m16,283s
> user    1m3,644s
> sys     0m0,581s
>
>
> time ./ffmpeg -v 99 -i matrix.wav -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=/home/michael/whisper.cpp/models/ggml-base.en.bin:language=en:queue=3000:destination=output.srt:format=srt" -f null - 2> /tmp/log
> real    1m30,827s
> user    6m0,590s
> sys     0m0,756s
>

I tested with https://github.com/vpalmisano/webrtcperf/releases/download/videos-1.0/kt.mp4
(note that you need to increase the queue param to obtain a fair
comparison):

ffmpeg -loglevel info -i ~/Videos/kt.mp4 -vn \
  -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=../whisper.cpp/models/ggml-medium.bin:language=en:queue=60000:destination=/tmp/output.srt:format=srt" \
  -f null -
real    0m7.998s
user    0m7.552s
sys     0m0.776s

whisper-cli ~/Videos/kt.mp4 -m ../whisper.cpp/models/ggml-medium.bin --output-srt
real    0m8.067s
user    0m8.282s
sys     0m0.887s