From: Vittorio Palmisano
Date: Thu, 10 Jul 2025 10:34:50 +0200
To: Michael Niedermayer
Cc: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] Whisper audio filter
In-Reply-To: <20250709233746.GM29660@pb2>
References: <20250709072350.578693-1-vpalmisano@gmail.com> <20250709233746.GM29660@pb2>

Thanks Michael, I will try to answer your comments.

> > +ffmpeg -i input.mp4 -vn -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=
>
> Is there a reason why we convert to 16khz mono here ?

It is the only format supported by the whisper.cpp library.

> > +model=../whisper.cpp/models/ggml-base.en.bin\
>
> It would be nice if the models would be in a standard location, so the user
> just has to specify the model name and not the path

I think this functionality should be implemented inside the whisper.cpp
library, so that it can manage the exact model location and the download
process. I will propose a change.

> I tried this:
>
> ./ffmpeg -i matrixbench_mpeg2.mpg -vn -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=/home/michael/whisper.cpp/models/ggml-base.en.bin:language=en:queue=3000:destination=output.srt:format=srt" -f null -
>
> but the output.srt is empty (0 bytes)

Can you enable verbose logging?

> libavfilter/af_whisper.c:75:49: error: parameter name omitted
>    75 | static void cb_log_disable(enum ggml_log_level, const char *, void *) {}
>       |                                                 ^~~~~~~~~~~~
> libavfilter/af_whisper.c:75:63:

I don't see this error with GCC 13. Are you using a different compiler or
some other flags?

> > +    wctx->audio_buffer_fill_size = 0;
> > +
> > +    wctx->next_pts = AV_NOPTS_VALUE;
> > +
> > +    wctx->avio_context = NULL;
>
> aren't things already initialized to 0 ?

Yes. We should probably keep only the AV_NOPTS_VALUE assignment, since that
value is not zero.

-- 
/Vittorio Palmisano/
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
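
For reference, here is a minimal sketch of the two code points discussed above. It is not the patch itself, and it builds without FFmpeg or whisper.cpp. All `demo_`/`Demo`/`DEMO_` names are hypothetical stand-ins: the enum replaces `enum ggml_log_level`, the struct mirrors only the three fields quoted in the review, and `DEMO_NOPTS_VALUE` repeats the value that `libavutil/avutil.h` gives `AV_NOPTS_VALUE`. The callback names its parameters because C standards before C23 require a name for every parameter in a function definition; some compilers accept unnamed ones as an extension, which may be why the error shows up on one setup and not another. The init function assumes the context memory arrives zeroed, so only `next_pts` needs an explicit assignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for enum ggml_log_level from ggml.h. */
enum demo_log_level { DEMO_LOG_INFO };

/* Same value as AV_NOPTS_VALUE in libavutil/avutil.h, redefined here
 * so that the sketch compiles without the FFmpeg headers. */
#define DEMO_NOPTS_VALUE ((int64_t)UINT64_C(0x8000000000000000))

/* Hypothetical stand-in holding only the fields quoted in the review. */
typedef struct DemoWhisperContext {
    int     audio_buffer_fill_size;
    int64_t next_pts;
    void   *avio_context;
} DemoWhisperContext;

/* No-op log callback. Every parameter is named and then explicitly
 * marked unused, which is valid in all C standard versions. */
void demo_log_disable(enum demo_log_level level, const char *text,
                      void *user_data)
{
    (void)level;
    (void)text;
    (void)user_data;
}

/* Zero the context (mimicking a zeroed allocation), then set the one
 * field whose "unset" value is not zero. */
void demo_init(DemoWhisperContext *ctx)
{
    assert(ctx != NULL);
    memset(ctx, 0, sizeof(*ctx));
    ctx->next_pts = DEMO_NOPTS_VALUE;
}
```

With this layout, dropping the explicit `= 0` and `= NULL` assignments changes nothing, while dropping the `next_pts` assignment would leave it at 0, which is a valid timestamp rather than "no timestamp".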