From: Vittorio Palmisano
Date: Fri, 11 Jul 2025 10:41:04 +0200
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] Whisper audio filter
References: <20250710102543.1002696-1-vpalmisano@gmail.com> <20250710122008.GP29660@pb2>
In-Reply-To: <20250710122008.GP29660@pb2>

> > +
> > +    memcpy(wctx->audio_buffer, wctx->audio_buffer + end_pos,
> > +           end_pos * sizeof(float));
>
> sizeof(*wctx->audio_buffer) is more robust than float

But end_pos is not necessarily equal to the audio_buffer size; it could be lower.
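For reference, a minimal C sketch of the suggestion quoted above. In C, sizeof(*ptr) yields the size of one element that ptr points to, not the size of the whole buffer, so it does not depend on end_pos. The helper name shift_samples and its parameters are hypothetical and are not taken from the patch; memmove is used here only so that the sketch stays valid if the two ranges overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the patch): moves the `remaining` samples
 * that start at index `consumed` to the front of `buf`. */
static void shift_samples(float *buf, size_t consumed, size_t remaining)
{
    assert(buf != NULL);
    /* sizeof(*buf) is the size of ONE element (a float here), so the byte
     * count follows the element type if the buffer type ever changes. */
    memmove(buf, buf + consumed, remaining * sizeof(*buf));
}
```

Calling shift_samples(buf, 4, 2) on a six-element buffer copies the last two samples to positions 0 and 1.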
> >
> not sure how others think of this, but I would ignore the 80 char limit and format this like:
>
> static const AVOption whisper_options[] = {
>     { "model"   , "Path to the whisper.cpp model file"                 , OFFSET(model_path), AV_OPT_TYPE_STRING,.flags = FLAGS },
>     { "language", "Language for transcription ('auto' for auto-detect)", OFFSET(language)  , AV_OPT_TYPE_STRING, {.str = "auto"}, .flags = FLAGS },

I've used `indent -i4 -kr -nut` to format the code.

> Also, it seems this is a lot slower than whisper-cli
>
> time whisper-cli matrix.wav -m ~/whisper.cpp/models/ggml-base.en.bin --output-srt
> real    0m16,283s
> user    1m3,644s
> sys     0m0,581s
>
>
> time ./ffmpeg -v 99 -i matrix.wav -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=/home/michael/whisper.cpp/models/ggml-base.en.bin:language=en:queue=3000:destination=output.srt:format=srt" -f null - 2> /tmp/log
> real    1m30,827s
> user    6m0,590s
> sys     0m0,756s
>

Tested with https://github.com/vpalmisano/webrtcperf/releases/download/videos-1.0/kt.mp4
(note that you need to increase the queue param to obtain a fair comparison):

ffmpeg -loglevel info -i ~/Videos/kt.mp4 -vn -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=../whisper.cpp/models/ggml-medium.bin:language=en:queue=60000:destination=/tmp/output.srt:format=srt" -f null -

real    0m7.998s
user    0m7.552s
sys     0m0.776s

whisper-cli ~/Videos/kt.mp4 -m ../whisper.cpp/models/ggml-medium.bin --output-srt

real    0m8.067s
user    0m8.282s
sys     0m0.887s