* [FFmpeg-devel] [PATCH] avfilter: add showcwt multimedia filter
@ 2022-11-25 19:28 Paul B Mahol
2022-11-26 11:01 ` Paul B Mahol
0 siblings, 1 reply; 3+ messages in thread
From: Paul B Mahol @ 2022-11-25 19:28 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1: Type: text/plain, Size: 24 bytes --]
Hello,
Patch attached.
[-- Attachment #2: 0001-avfilter-add-showcwt-multimedia-filter.patch --]
[-- Type: text/x-patch, Size: 24368 bytes --]
From 68ef81098aebca9064f9c67e746476c39729e63b Mon Sep 17 00:00:00 2001
From: Paul B Mahol <onemda@gmail.com>
Date: Sat, 19 Nov 2022 19:01:23 +0100
Subject: [PATCH] avfilter: add showcwt multimedia filter
Signed-off-by: Paul B Mahol <onemda@gmail.com>
---
doc/filters.texi | 66 +++++
libavfilter/Makefile | 1 +
libavfilter/allfilters.c | 1 +
libavfilter/avf_showcwt.c | 558 ++++++++++++++++++++++++++++++++++++++
4 files changed, 626 insertions(+)
create mode 100644 libavfilter/avf_showcwt.c
diff --git a/doc/filters.texi b/doc/filters.texi
index ecf8dfa47a..5f35bd7e4e 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -29274,6 +29274,72 @@ axisfile=myaxis.png:basefreq=40:endfreq=10000
@end example
@end itemize
+@section showcwt
+
+Convert input audio to video output representing the frequency spectrum
+linearly or logarithmically, using the Continuous Wavelet Transform with
+a Morlet wavelet.
+
+The filter accepts the following options:
+
+@table @option
+@item size, s
+Specify the video size for the output. For the syntax of this option,
+check the @ref{video size syntax,,"Video size" section in the ffmpeg-utils manual,ffmpeg-utils}.
+Default value is @code{640x512}.
+
+@item rate, r
+Set the output frame rate. Default value is @code{25}.
+
+@item scale
+Set the frequency scale used. Can be @code{linear} or @code{log}.
+Default value is @code{linear}.
+
+@item min
+Set the minimum frequency that will be used in output.
+Default is @code{20} Hz.
+
+@item max
+Set the maximum frequency that will be used in output.
+Default is @code{20000} Hz. The effective upper limit depends on
+the input audio's sample rate: values greater than the Nyquist
+frequency are clamped to the Nyquist frequency.
+
+@item logb
+Set the logarithmic basis for brightness strength when
+mapping calculated magnitude values to pixel values.
+Allowed range is from @code{0} to @code{1}.
+Default value is @code{0.0001}.
+
+@item deviation
+Set the frequency deviation.
+Values lower than @code{1} are more frequency oriented,
+while values higher than @code{1} are more time oriented.
+Allowed range is from @code{0} to @code{10}.
+Default value is @code{1}.
+
+@item pps
+Set the number of pixels output per second in one row.
+Allowed range is from @code{1} to @code{1024}.
+Default value is @code{64}.
+
+@item mode
+Set the output visual mode. Allowed values are:
+
+@table @option
+@item magnitude
+Show magnitude.
+@item phase
+Show only phase.
+@item magphase
+Show combination of magnitude and phase.
+Magnitude is mapped to brightness and phase to color.
+@end table
+
+Default value is @code{magnitude}.
+
+@end table
+
@section showfreqs
Convert input audio to video output representing the audio power spectrum.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 66c754f1f5..2791b6a950 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -595,6 +595,7 @@ OBJS-$(CONFIG_APHASEMETER_FILTER) += avf_aphasemeter.o
OBJS-$(CONFIG_AVECTORSCOPE_FILTER) += avf_avectorscope.o
OBJS-$(CONFIG_CONCAT_FILTER) += avf_concat.o
OBJS-$(CONFIG_SHOWCQT_FILTER) += avf_showcqt.o lswsutils.o lavfutils.o
+OBJS-$(CONFIG_SHOWCWT_FILTER) += avf_showcwt.o
OBJS-$(CONFIG_SHOWFREQS_FILTER) += avf_showfreqs.o
OBJS-$(CONFIG_SHOWSPATIAL_FILTER) += avf_showspatial.o
OBJS-$(CONFIG_SHOWSPECTRUM_FILTER) += avf_showspectrum.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 4909732002..3ff20e76ce 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -560,6 +560,7 @@ extern const AVFilter ff_avf_aphasemeter;
extern const AVFilter ff_avf_avectorscope;
extern const AVFilter ff_avf_concat;
extern const AVFilter ff_avf_showcqt;
+extern const AVFilter ff_avf_showcwt;
extern const AVFilter ff_avf_showfreqs;
extern const AVFilter ff_avf_showspatial;
extern const AVFilter ff_avf_showspectrum;
diff --git a/libavfilter/avf_showcwt.c b/libavfilter/avf_showcwt.c
new file mode 100644
index 0000000000..c24efe3686
--- /dev/null
+++ b/libavfilter/avf_showcwt.c
@@ -0,0 +1,558 @@
+/*
+ * Copyright (c) 2022 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <float.h>
+#include <math.h>
+
+#include "libavutil/tx.h"
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/channel_layout.h"
+#include "libavutil/cpu.h"
+#include "libavutil/opt.h"
+#include "libavutil/parseutils.h"
+#include "audio.h"
+#include "video.h"
+#include "avfilter.h"
+#include "filters.h"
+#include "internal.h"
+
+typedef struct ShowCWTContext {
+ const AVClass *class;
+ int w, h;
+ int mode;
+ char *rate_str;
+ AVRational auto_frame_rate;
+ AVRational frame_rate;
+ AVTXContext *fft;
+ AVTXContext **ifft;
+ av_tx_fn tx_fn;
+ av_tx_fn itx_fn;
+ int fft_in_size;
+ int fft_out_size;
+ int ifft_in_size;
+ int ifft_out_size;
+ int xpos;
+ int in_nb_samples;
+ int64_t in_pts;
+ int64_t old_pts;
+ float *frequency_band;
+ AVFrame *kernel;
+ AVFrame *overlap;
+ AVFrame *outpicref;
+ AVFrame *fft_in;
+ AVFrame *fft_out;
+ AVFrame *ifft_in;
+ AVFrame *ifft_out;
+ int nb_threads;
+ int nb_consumed_samples;
+ int pps;
+ int hop_size;
+ int ihop_size;
+ int ihop_index;
+ int input_padding_size;
+ int input_sample_count;
+ int output_padding_size;
+ int output_sample_count;
+ int frequency_band_count;
+ float logarithmic_basis;
+ int frequency_scale;
+ float minimum_frequency;
+ float maximum_frequency;
+ float deviation;
+} ShowCWTContext;
+
+#define OFFSET(x) offsetof(ShowCWTContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
+
+static const AVOption showcwt_options[] = {
+ { "size", "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = "640x512"}, 0, 0, FLAGS },
+ { "s", "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = "640x512"}, 0, 0, FLAGS },
+ { "rate", "set video rate", OFFSET(rate_str), AV_OPT_TYPE_STRING, {.str = "25"}, 0, 0, FLAGS },
+ { "r", "set video rate", OFFSET(rate_str), AV_OPT_TYPE_STRING, {.str = "25"}, 0, 0, FLAGS },
+ { "scale", "set frequency scale", OFFSET(frequency_scale), AV_OPT_TYPE_INT, {.i64=0}, 0, 1, FLAGS, "scale" },
+ { "linear", "linear", 0, AV_OPT_TYPE_CONST,{.i64=0}, 0, 0, FLAGS, "scale" },
+ { "log", "logarithmic", 0, AV_OPT_TYPE_CONST,{.i64=1}, 0, 0, FLAGS, "scale" },
+ { "min", "set minimum frequency", OFFSET(minimum_frequency), AV_OPT_TYPE_FLOAT, {.dbl = 20.}, 1, 2000, FLAGS },
+ { "max", "set maximum frequency", OFFSET(maximum_frequency), AV_OPT_TYPE_FLOAT, {.dbl = 20000.}, 0, 192000, FLAGS },
+ { "logb", "set logarithmic basis", OFFSET(logarithmic_basis), AV_OPT_TYPE_FLOAT, {.dbl = 0.0001}, 0, 1, FLAGS },
+ { "deviation", "set frequency deviation", OFFSET(deviation), AV_OPT_TYPE_FLOAT, {.dbl = 1.}, 0, 10, FLAGS },
+ { "pps", "set pixels per second", OFFSET(pps), AV_OPT_TYPE_INT, {.i64 = 64}, 1, 1024, FLAGS },
+ { "mode", "set output mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=0}, 0, 2, FLAGS, "mode" },
+ { "magnitude", "magnitude", 0, AV_OPT_TYPE_CONST,{.i64=0}, 0, 0, FLAGS, "mode" },
+ { "phase", "phase", 0, AV_OPT_TYPE_CONST,{.i64=1}, 0, 0, FLAGS, "mode" },
+ { "magphase", "magnitude+phase", 0, AV_OPT_TYPE_CONST,{.i64=2}, 0, 0, FLAGS, "mode" },
+ { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(showcwt);
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+ ShowCWTContext *s = ctx->priv;
+
+ av_freep(&s->frequency_band);
+ av_frame_free(&s->kernel);
+ av_frame_free(&s->overlap);
+ av_frame_free(&s->outpicref);
+ av_frame_free(&s->fft_in);
+ av_frame_free(&s->fft_out);
+ av_frame_free(&s->ifft_in);
+ av_frame_free(&s->ifft_out);
+ av_tx_uninit(&s->fft);
+
+ if (s->ifft) {
+ for (int n = 0; n < s->nb_threads; n++)
+ av_tx_uninit(&s->ifft[n]);
+ }
+}
+
+static int query_formats(AVFilterContext *ctx)
+{
+ AVFilterFormats *formats = NULL;
+ AVFilterChannelLayouts *layouts = NULL;
+ AVFilterLink *inlink = ctx->inputs[0];
+ AVFilterLink *outlink = ctx->outputs[0];
+ static const enum AVSampleFormat sample_fmts[] = { AV_SAMPLE_FMT_FLTP, AV_SAMPLE_FMT_NONE };
+ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUVJ444P, AV_PIX_FMT_YUVA444P, AV_PIX_FMT_NONE };
+ int ret;
+
+ formats = ff_make_format_list(sample_fmts);
+ if ((ret = ff_formats_ref(formats, &inlink->outcfg.formats)) < 0)
+ return ret;
+
+ layouts = ff_all_channel_counts();
+ if ((ret = ff_channel_layouts_ref(layouts, &inlink->outcfg.channel_layouts)) < 0)
+ return ret;
+
+ formats = ff_all_samplerates();
+ if ((ret = ff_formats_ref(formats, &inlink->outcfg.samplerates)) < 0)
+ return ret;
+
+ formats = ff_make_format_list(pix_fmts);
+ if ((ret = ff_formats_ref(formats, &outlink->incfg.formats)) < 0)
+ return ret;
+
+ return 0;
+}
+
+static void frequency_band(float *frequency_band,
+ int frequency_band_count,
+ float frequency_range,
+ float frequency_offset,
+ int frequency_scale, float deviation)
+{
+ deviation *= sqrtf(1.f / (4.f * M_PI)); // Heisenberg Gabor Limit
+ for (int y = 0; y < frequency_band_count; y++) {
+ float frequency = frequency_range * (1.f - (float)y / frequency_band_count) + frequency_offset;
+ float frequency_derivative = frequency_range / frequency_band_count;
+
+ if (frequency_scale > 0) {
+ frequency = powf(2.f, frequency);
+ frequency_derivative *= logf(2.f) * frequency;
+ }
+
+ frequency_band[y*2 ] = frequency;
+ frequency_band[y*2+1] = frequency_derivative * deviation;
+ }
+}
+
+#define cmul(operator, index) { \
+ const float ff = kernel[index]; \
+ isrc[n].re operator ff*dst[index].re; \
+ isrc[n].im operator ff*dst[index].im; \
+}
+
+static float remap_log(float value, float log_factor)
+{
+ float sign = (0 < value) - (value < 0);
+
+ value = logf(value * sign) * log_factor;
+
+ return 1.f - av_clipf(value, 0.f, 1.f);
+}
+
+static int run_channel_cwt_prepare(AVFilterContext *ctx, void *arg, int ch)
+{
+ ShowCWTContext *s = ctx->priv;
+ AVFrame *fin = arg;
+ const float *input = (const float *)fin->extended_data[ch];
+ float *overlap = (float *)s->overlap->extended_data[ch];
+ AVComplexFloat *src = (AVComplexFloat *)s->fft_in->extended_data[ch];
+ AVComplexFloat *dst = (AVComplexFloat *)s->fft_out->extended_data[ch];
+ const int nb_consumed_samples = s->nb_consumed_samples;
+ const int input_padding_size = s->input_padding_size;
+ const int hop_size = s->hop_size;
+ const int offset = input_padding_size - hop_size;
+
+ memmove(overlap, &overlap[hop_size], offset * sizeof(float));
+ memcpy(&overlap[offset], input,
+ fin->nb_samples * sizeof(float));
+ memset(&overlap[offset + fin->nb_samples], 0,
+ (hop_size - fin->nb_samples) * sizeof(float));
+
+ for (int n = 0; n < nb_consumed_samples; n++) {
+ src[n].re = overlap[n];
+ src[n].im = 0.f;
+ }
+
+ s->tx_fn(s->fft, dst, src, sizeof(*src));
+
+ return 0;
+}
+
+static int run_channel_cwt(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
+{
+ ShowCWTContext *s = ctx->priv;
+ const int ch = *(int *)arg;
+ ptrdiff_t linesize = s->outpicref->linesize[0];
+ AVComplexFloat *dst = (AVComplexFloat *)s->fft_out->extended_data[ch];
+ const int output_sample_count = s->output_sample_count;
+ const float log_factor = 1.f/logf(s->logarithmic_basis);
+ const int input_padding_size = s->input_padding_size;
+ const int rest = input_padding_size % output_sample_count;
+ const int ihop_size = s->ihop_size;
+ const int ioffset = (s->output_padding_size - ihop_size) >> 1;
+ const int h = s->h;
+ const int start = (h * jobnr) / nb_jobs;
+ const int end = (h * (jobnr+1)) / nb_jobs;
+ const int mode = s->mode;
+ const int ihop_index = s->ihop_index;
+ const int i = ihop_index + ioffset;
+
+ for (int y = start; y < end; y++) {
+ AVComplexFloat *isrc = (AVComplexFloat *)s->ifft_in->extended_data[ch * h + y];
+ AVComplexFloat *idst = (AVComplexFloat *)s->ifft_out->extended_data[ch * h + y];
+ const float *kernel = (const float *)s->kernel->extended_data[y];
+ uint8_t *dstY = s->outpicref->data[0] + y * linesize;
+ uint8_t *dstU = s->outpicref->data[1] + y * linesize;
+ uint8_t *dstV = s->outpicref->data[2] + y * linesize;
+ int x = s->xpos;
+ float Y, U, V;
+
+ if (ihop_index > 0)
+ goto put_pixels;
+
+ for (int n = 0; n < output_sample_count; n++)
+ cmul(=, n);
+
+ if (output_sample_count < input_padding_size) {
+ const int cut_index = input_padding_size - rest;
+
+ for (int chunk_index = output_sample_count; chunk_index < cut_index; chunk_index += output_sample_count)
+ for (int n = 0; n < output_sample_count; n++)
+ cmul(+=, chunk_index + n);
+ for (int n = 0; n < rest; n++)
+ cmul(+=, cut_index + n);
+ }
+
+ s->itx_fn(s->ifft[jobnr], idst, isrc, sizeof(*isrc));
+
+put_pixels:
+ switch (mode) {
+ case 2:
+ Y = hypotf(idst[i].re, idst[i].im);
+ Y = remap_log(Y, log_factor);
+ U = atan2f(idst[i].im, idst[i].re);
+ U = 0.5f + 0.5f * U * Y / M_PI;
+ V = 1.f - U;
+
+ dstY[x] = av_clip_uint8(lrintf(Y * 255.f));
+ dstU[x] = av_clip_uint8(lrintf(U * 255.f));
+ dstV[x] = av_clip_uint8(lrintf(V * 255.f));
+ break;
+ case 1:
+ Y = atan2f(idst[i].im, idst[i].re);
+ Y = 0.5f + 0.5f * Y / M_PI;
+
+ dstY[x] = av_clip_uint8(lrintf(Y * 255.f));
+ break;
+ case 0:
+ Y = hypotf(idst[i].re, idst[i].im);
+ Y = remap_log(Y, log_factor);
+
+ dstY[x] = av_clip_uint8(lrintf(Y * 255.f));
+ break;
+ }
+ }
+
+ return 0;
+}
+
+static void compute_kernel(AVFilterContext *ctx)
+{
+ ShowCWTContext *s = ctx->priv;
+ const int size = s->input_sample_count;
+ const float scale_factor = 1.f/(float)size;
+ const int output_sample_count = s->output_sample_count;
+ const int fsize = s->frequency_band_count;
+
+ for (int y = 0; y < fsize; y++) {
+ float *kernel = (float *)s->kernel->extended_data[y];
+ float frequency = s->frequency_band[y*2];
+ float deviation = 1.f / (s->frequency_band[y*2+1] *
+ output_sample_count);
+
+ for (int n = 0; n < size; n++) {
+ float ff, f = fabsf(n-frequency);
+
+ f = size - fabsf(f - size);
+ ff = expf(-f*f*deviation) * scale_factor;
+ kernel[n] = ff;
+ }
+ }
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+ AVFilterContext *ctx = outlink->src;
+ AVFilterLink *inlink = ctx->inputs[0];
+ ShowCWTContext *s = ctx->priv;
+ float maximum_frequency = fminf(s->maximum_frequency, inlink->sample_rate * 0.5f);
+ float minimum_frequency = s->minimum_frequency;
+ float scale = 1.f;
+ int ret;
+
+ uninit(ctx);
+
+ s->nb_threads = FFMIN(s->h, ff_filter_get_nb_threads(ctx));
+ s->old_pts = AV_NOPTS_VALUE;
+ s->nb_consumed_samples = 65536;
+ s->frequency_band_count = s->h;
+ s->input_sample_count = s->nb_consumed_samples;
+ s->hop_size = s->nb_consumed_samples >> 1;
+ s->input_padding_size = 65536;
+ s->output_padding_size = FFMAX(16,s->input_padding_size * s->pps / inlink->sample_rate);
+
+ outlink->w = s->w;
+ outlink->h = s->h;
+ outlink->sample_aspect_ratio = (AVRational){1,1};
+
+ s->fft_in_size = FFALIGN(s->input_padding_size, av_cpu_max_align());
+ s->fft_out_size = FFALIGN(s->input_padding_size, av_cpu_max_align());
+
+ s->output_sample_count = s->output_padding_size;
+
+ s->ifft_in_size = FFALIGN(s->output_padding_size, av_cpu_max_align());
+ s->ifft_out_size = FFALIGN(s->output_padding_size, av_cpu_max_align());
+ s->ihop_size = s->output_padding_size >> 1;
+
+ ret = av_tx_init(&s->fft, &s->tx_fn, AV_TX_FLOAT_FFT, 0, s->input_padding_size, &scale, 0);
+ if (ret < 0)
+ return ret;
+
+ s->ifft = av_calloc(s->nb_threads, sizeof(*s->ifft));
+ if (!s->ifft)
+ return AVERROR(ENOMEM);
+
+ for (int n = 0; n < s->nb_threads; n++) {
+ ret = av_tx_init(&s->ifft[n], &s->itx_fn, AV_TX_FLOAT_FFT, 1, s->output_padding_size, &scale, 0);
+ if (ret < 0)
+ return ret;
+ }
+
+ s->frequency_band = av_calloc(s->frequency_band_count,
+ sizeof(*s->frequency_band) * 2);
+ s->outpicref = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+ s->fft_in = ff_get_audio_buffer(inlink, s->fft_in_size * 2);
+ s->fft_out = ff_get_audio_buffer(inlink, s->fft_out_size * 2);
+ s->overlap = ff_get_audio_buffer(inlink, s->input_padding_size);
+ s->ifft_in = av_frame_alloc();
+ s->ifft_out = av_frame_alloc();
+ s->kernel = av_frame_alloc();
+ if (!s->outpicref || !s->fft_in || !s->fft_out ||
+ !s->ifft_in || !s->ifft_out ||
+ !s->frequency_band || !s->kernel || !s->overlap)
+ return AVERROR(ENOMEM);
+
+ s->ifft_in->format = inlink->format;
+ s->ifft_in->nb_samples = s->ifft_in_size * 2;
+ s->ifft_in->ch_layout.nb_channels = s->h;
+ ret = av_frame_get_buffer(s->ifft_in, 0);
+ if (ret < 0)
+ return ret;
+
+ s->ifft_out->format = inlink->format;
+ s->ifft_out->nb_samples = s->ifft_out_size * 2;
+ s->ifft_out->ch_layout.nb_channels = s->h;
+ ret = av_frame_get_buffer(s->ifft_out, 0);
+ if (ret < 0)
+ return ret;
+
+ s->kernel->format = inlink->format;
+ s->kernel->nb_samples = s->input_padding_size;
+ s->kernel->ch_layout.nb_channels = s->frequency_band_count;
+ ret = av_frame_get_buffer(s->kernel, 0);
+ if (ret < 0)
+ return ret;
+
+ s->outpicref->sample_aspect_ratio = (AVRational){1,1};
+
+ for (int y = 0; y < outlink->h; y++) {
+ memset(s->outpicref->data[0] + y * s->outpicref->linesize[0], 0, outlink->w);
+ memset(s->outpicref->data[1] + y * s->outpicref->linesize[1], 128, outlink->w);
+ memset(s->outpicref->data[2] + y * s->outpicref->linesize[2], 128, outlink->w);
+ if (s->outpicref->data[3])
+ memset(s->outpicref->data[3] + y * s->outpicref->linesize[3], 0, outlink->w);
+ }
+
+ s->outpicref->color_range = AVCOL_RANGE_JPEG;
+
+ minimum_frequency *= s->nb_consumed_samples / (float)inlink->sample_rate;
+ maximum_frequency *= s->nb_consumed_samples / (float)inlink->sample_rate;
+ if (s->frequency_scale > 0) {
+ minimum_frequency = logf(minimum_frequency) / logf(2.f);
+ maximum_frequency = logf(maximum_frequency) / logf(2.f);
+ }
+
+ frequency_band(s->frequency_band,
+ s->frequency_band_count, maximum_frequency - minimum_frequency,
+ minimum_frequency, s->frequency_scale, s->deviation);
+
+ av_log(ctx, AV_LOG_DEBUG, "input_sample_count: %d\n", s->input_sample_count);
+ av_log(ctx, AV_LOG_DEBUG, "output_sample_count: %d\n", s->output_sample_count);
+
+ if (s->xpos >= s->w)
+ s->xpos = 0;
+
+ s->auto_frame_rate = av_make_q(inlink->sample_rate, s->hop_size);
+ if (strcmp(s->rate_str, "auto")) {
+ ret = av_parse_video_rate(&s->frame_rate, s->rate_str);
+ } else {
+ s->frame_rate = s->auto_frame_rate;
+ }
+ outlink->frame_rate = s->frame_rate;
+ outlink->time_base = av_inv_q(outlink->frame_rate);
+
+ compute_kernel(ctx);
+
+ return 0;
+}
+
+static int activate(AVFilterContext *ctx)
+{
+ AVFilterLink *inlink = ctx->inputs[0];
+ AVFilterLink *outlink = ctx->outputs[0];
+ ShowCWTContext *s = ctx->priv;
+ int ret = 0, status;
+ int64_t pts;
+
+ FF_FILTER_FORWARD_STATUS_BACK(outlink, inlink);
+
+ if (s->outpicref) {
+ const int ch = 0;
+ AVFrame *fin;
+
+ if (s->ihop_index == 0) {
+ ret = ff_inlink_consume_samples(inlink, s->hop_size, s->hop_size, &fin);
+ if (ret < 0)
+ return ret;
+ if (ret > 0) {
+ run_channel_cwt_prepare(ctx, fin, ch);
+ s->in_pts = fin->pts;
+ s->in_nb_samples = fin->nb_samples;
+ av_frame_free(&fin);
+ }
+ }
+
+ if (ret > 0 || s->ihop_index > 0) {
+ int64_t pts_offset;
+
+ ff_filter_execute(ctx, run_channel_cwt, (void *)&ch, NULL,
+ s->nb_threads);
+
+ pts_offset = av_rescale_q(s->ihop_index, av_make_q(1, s->ihop_size), av_make_q(1, s->in_nb_samples));
+ s->outpicref->pts = av_rescale_q(s->in_pts + pts_offset, inlink->time_base, outlink->time_base);
+
+ s->ihop_index++;
+ s->xpos++;
+ if (s->xpos >= s->w)
+ s->xpos = 0;
+ if (s->ihop_index >= s->ihop_size)
+ s->ihop_index = 0;
+
+ if (s->old_pts < s->outpicref->pts) {
+ AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+ if (!out)
+ return AVERROR(ENOMEM);
+ ret = av_frame_copy_props(out, s->outpicref);
+ if (ret < 0)
+ goto fail;
+ ret = av_frame_copy(out, s->outpicref);
+ if (ret < 0)
+ goto fail;
+ s->old_pts = s->outpicref->pts;
+ ret = ff_filter_frame(outlink, out);
+ if (ret <= 0)
+ return ret;
+fail:
+ av_frame_free(&out);
+ return ret;
+ }
+ }
+ }
+
+ if (ff_inlink_acknowledge_status(inlink, &status, &pts)) {
+ if (status == AVERROR_EOF) {
+ ff_outlink_set_status(outlink, status, pts);
+ return 0;
+ }
+ }
+
+ if (ff_inlink_queued_samples(inlink) >= s->hop_size || s->ihop_index) {
+ ff_filter_set_ready(ctx, 10);
+ return 0;
+ }
+
+ if (ff_outlink_frame_wanted(outlink)) {
+ ff_inlink_request_frame(inlink);
+ return 0;
+ }
+
+ return FFERROR_NOT_READY;
+}
+
+static const AVFilterPad showcwt_inputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_AUDIO,
+ },
+};
+
+static const AVFilterPad showcwt_outputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_VIDEO,
+ .config_props = config_output,
+ },
+};
+
+const AVFilter ff_avf_showcwt = {
+ .name = "showcwt",
+ .description = NULL_IF_CONFIG_SMALL("Convert input audio to a CWT (Continuous Wavelet Transform) video output."),
+ .uninit = uninit,
+ .priv_size = sizeof(ShowCWTContext),
+ FILTER_INPUTS(showcwt_inputs),
+ FILTER_OUTPUTS(showcwt_outputs),
+ FILTER_QUERY_FUNC(query_formats),
+ .activate = activate,
+ .priv_class = &showcwt_class,
+ .flags = AVFILTER_FLAG_SLICE_THREADS,
+};
--
2.37.2
[-- Attachment #3: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [PATCH] avfilter: add showcwt multimedia filter
2022-11-25 19:28 [FFmpeg-devel] [PATCH] avfilter: add showcwt multimedia filter Paul B Mahol
@ 2022-11-26 11:01 ` Paul B Mahol
2022-11-27 21:04 ` Paul B Mahol
0 siblings, 1 reply; 3+ messages in thread
From: Paul B Mahol @ 2022-11-26 11:01 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1: Type: text/plain, Size: 143 bytes --]
On 11/25/22, Paul B Mahol <onemda@gmail.com> wrote:
> Hello,
>
> Patch attached.
>
Improved patch attached.
Added slide & direction options.
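As a rough illustration of the difference between the two slide methods, here is a minimal sketch for a single row in the left-to-right case. It assumes `replace` keeps the wrap-around write cursor of the first patch (`xpos`) and that `scroll` behaves like the option of the same name in showspectrum; the names `SlideSketch` and `put_column` are hypothetical and this is not the code from the attached patch.

```c
#include <string.h>

enum SlideSketch { SKETCH_REPLACE, SKETCH_SCROLL };

/* Store one new pixel value in a row of `width` pixels.
 * `pos` is the write cursor used by replace mode.
 * Returns the cursor to use for the next call. */
static int put_column(unsigned char *row, int width, int pos,
                      unsigned char value, enum SlideSketch slide)
{
    if (slide == SKETCH_SCROLL) {
        /* Shift the whole row one pixel to the left and append the
         * new value at the right edge; the cursor is not used. */
        memmove(row, row + 1, (size_t)(width - 1));
        row[width - 1] = value;
        return pos;
    }

    /* Replace: overwrite at the cursor and wrap at the right edge. */
    row[pos] = value;
    return (pos + 1) % width;
}
```

In this sketch `replace` leaves older columns in place until the cursor comes around again, while `scroll` always shows the newest column at the right edge.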
[-- Attachment #2: 0001-avfilter-add-showcwt-multimedia-filter.patch --]
[-- Type: text/x-patch, Size: 30195 bytes --]
From ceecc1537b96f5639b98b490cd07a0f0a18439c3 Mon Sep 17 00:00:00 2001
From: Paul B Mahol <onemda@gmail.com>
Date: Sat, 19 Nov 2022 19:01:23 +0100
Subject: [PATCH] avfilter: add showcwt multimedia filter
Signed-off-by: Paul B Mahol <onemda@gmail.com>
---
doc/filters.texi | 87 +++++
libavfilter/Makefile | 1 +
libavfilter/allfilters.c | 1 +
libavfilter/avf_showcwt.c | 718 ++++++++++++++++++++++++++++++++++++++
4 files changed, 807 insertions(+)
create mode 100644 libavfilter/avf_showcwt.c
diff --git a/doc/filters.texi b/doc/filters.texi
index ecf8dfa47a..94445cdf91 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -29274,6 +29274,93 @@ axisfile=myaxis.png:basefreq=40:endfreq=10000
@end example
@end itemize
+@section showcwt
+
+Convert input audio to video output representing the frequency spectrum
+linearly or logarithmically, using the Continuous Wavelet Transform with
+a Morlet wavelet.
+
+The filter accepts the following options:
+
+@table @option
+@item size, s
+Specify the video size for the output. For the syntax of this option,
+check the @ref{video size syntax,,"Video size" section in the ffmpeg-utils manual,ffmpeg-utils}.
+Default value is @code{640x512}.
+
+@item rate, r
+Set the output frame rate. Default value is @code{25}.
+
+@item scale
+Set the frequency scale used. Can be @code{linear} or @code{log}.
+Default value is @code{linear}.
+
+@item min
+Set the minimum frequency that will be used in output.
+Default is @code{20} Hz.
+
+@item max
+Set the maximum frequency that will be used in output.
+Default is @code{20000} Hz. The effective upper limit depends on
+the input audio's sample rate: values greater than the Nyquist
+frequency are clamped to the Nyquist frequency.
+
+@item logb
+Set the logarithmic basis for brightness strength when
+mapping calculated magnitude values to pixel values.
+Allowed range is from @code{0} to @code{1}.
+Default value is @code{0.0001}.
+
+@item deviation
+Set the frequency deviation.
+Values lower than @code{1} are more frequency oriented,
+while values higher than @code{1} are more time oriented.
+Allowed range is from @code{0} to @code{10}.
+Default value is @code{1}.
+
+@item pps
+Set the number of pixels output per second in one row.
+Allowed range is from @code{1} to @code{1024}.
+Default value is @code{64}.
+
+@item mode
+Set the output visual mode. Allowed values are:
+
+@table @option
+@item magnitude
+Show magnitude.
+@item phase
+Show only phase.
+@item magphase
+Show combination of magnitude and phase.
+Magnitude is mapped to brightness and phase to color.
+@end table
+
+Default value is @code{magnitude}.
+
+@item slide
+Set the output slide method. Allowed values are:
+
+@table @option
+@item replace
+@item scroll
+@end table
+
+@item direction
+Set the direction for the output slide method. Allowed values are:
+
+@table @option
+@item lr
+Direction from left to right.
+@item rl
+Direction from right to left.
+@item ud
+Direction from up to down.
+@item du
+Direction from down to up.
+@end table
+@end table
+
@section showfreqs
Convert input audio to video output representing the audio power spectrum.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 66c754f1f5..2791b6a950 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -595,6 +595,7 @@ OBJS-$(CONFIG_APHASEMETER_FILTER) += avf_aphasemeter.o
OBJS-$(CONFIG_AVECTORSCOPE_FILTER) += avf_avectorscope.o
OBJS-$(CONFIG_CONCAT_FILTER) += avf_concat.o
OBJS-$(CONFIG_SHOWCQT_FILTER) += avf_showcqt.o lswsutils.o lavfutils.o
+OBJS-$(CONFIG_SHOWCWT_FILTER) += avf_showcwt.o
OBJS-$(CONFIG_SHOWFREQS_FILTER) += avf_showfreqs.o
OBJS-$(CONFIG_SHOWSPATIAL_FILTER) += avf_showspatial.o
OBJS-$(CONFIG_SHOWSPECTRUM_FILTER) += avf_showspectrum.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 4909732002..3ff20e76ce 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -560,6 +560,7 @@ extern const AVFilter ff_avf_aphasemeter;
extern const AVFilter ff_avf_avectorscope;
extern const AVFilter ff_avf_concat;
extern const AVFilter ff_avf_showcqt;
+extern const AVFilter ff_avf_showcwt;
extern const AVFilter ff_avf_showfreqs;
extern const AVFilter ff_avf_showspatial;
extern const AVFilter ff_avf_showspectrum;
diff --git a/libavfilter/avf_showcwt.c b/libavfilter/avf_showcwt.c
new file mode 100644
index 0000000000..2b1838316f
--- /dev/null
+++ b/libavfilter/avf_showcwt.c
@@ -0,0 +1,718 @@
+/*
+ * Copyright (c) 2022 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <float.h>
+#include <math.h>
+
+#include "libavutil/tx.h"
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/channel_layout.h"
+#include "libavutil/cpu.h"
+#include "libavutil/opt.h"
+#include "libavutil/parseutils.h"
+#include "audio.h"
+#include "video.h"
+#include "avfilter.h"
+#include "filters.h"
+#include "internal.h"
+
+enum DirectionMode {
+ DIRECTION_LR,
+ DIRECTION_RL,
+ DIRECTION_UD,
+ DIRECTION_DU,
+ NB_DIRECTION,
+};
+
+enum SlideMode {
+ SLIDE_REPLACE,
+ SLIDE_SCROLL,
+ NB_SLIDE,
+};
+
+typedef struct ShowCWTContext {
+ const AVClass *class;
+ int w, h;
+ int mode;
+ char *rate_str;
+ AVRational auto_frame_rate;
+ AVRational frame_rate;
+ AVTXContext *fft;
+ AVTXContext **ifft;
+ av_tx_fn tx_fn;
+ av_tx_fn itx_fn;
+ int fft_in_size;
+ int fft_out_size;
+ int ifft_in_size;
+ int ifft_out_size;
+ int pos;
+ int in_nb_samples;
+ int64_t in_pts;
+ int64_t old_pts;
+ float *frequency_band;
+ AVFrame *kernel;
+ AVFrame *overlap;
+ AVFrame *outpicref;
+ AVFrame *fft_in;
+ AVFrame *fft_out;
+ AVFrame *ifft_in;
+ AVFrame *ifft_out;
+ int nb_threads;
+ int nb_consumed_samples;
+ int pps;
+ int slide;
+ int direction;
+ int hop_size;
+ int ihop_size;
+ int ihop_index;
+ int input_padding_size;
+ int input_sample_count;
+ int output_padding_size;
+ int output_sample_count;
+ int frequency_band_count;
+ float logarithmic_basis;
+ int frequency_scale;
+ float minimum_frequency;
+ float maximum_frequency;
+ float deviation;
+} ShowCWTContext;
+
+#define OFFSET(x) offsetof(ShowCWTContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
+
+static const AVOption showcwt_options[] = {
+ { "size", "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = "640x512"}, 0, 0, FLAGS },
+ { "s", "set video size", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = "640x512"}, 0, 0, FLAGS },
+ { "rate", "set video rate", OFFSET(rate_str), AV_OPT_TYPE_STRING, {.str = "25"}, 0, 0, FLAGS },
+ { "r", "set video rate", OFFSET(rate_str), AV_OPT_TYPE_STRING, {.str = "25"}, 0, 0, FLAGS },
+ { "scale", "set frequency scale", OFFSET(frequency_scale), AV_OPT_TYPE_INT, {.i64=0}, 0, 1, FLAGS, "scale" },
+ { "linear", "linear", 0, AV_OPT_TYPE_CONST,{.i64=0}, 0, 0, FLAGS, "scale" },
+ { "log", "logarithmic", 0, AV_OPT_TYPE_CONST,{.i64=1}, 0, 0, FLAGS, "scale" },
+ { "min", "set minimum frequency", OFFSET(minimum_frequency), AV_OPT_TYPE_FLOAT, {.dbl = 20.}, 1, 2000, FLAGS },
+ { "max", "set maximum frequency", OFFSET(maximum_frequency), AV_OPT_TYPE_FLOAT, {.dbl = 20000.}, 0, 192000, FLAGS },
+ { "logb", "set logarithmic basis", OFFSET(logarithmic_basis), AV_OPT_TYPE_FLOAT, {.dbl = 0.0001}, 0, 1, FLAGS },
+ { "deviation", "set frequency deviation", OFFSET(deviation), AV_OPT_TYPE_FLOAT, {.dbl = 1.}, 0, 10, FLAGS },
+ { "pps", "set pixels per second", OFFSET(pps), AV_OPT_TYPE_INT, {.i64 = 64}, 1, 1024, FLAGS },
+ { "mode", "set output mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=0}, 0, 2, FLAGS, "mode" },
+ { "magnitude", "magnitude", 0, AV_OPT_TYPE_CONST,{.i64=0}, 0, 0, FLAGS, "mode" },
+ { "phase", "phase", 0, AV_OPT_TYPE_CONST,{.i64=1}, 0, 0, FLAGS, "mode" },
+ { "magphase", "magnitude+phase", 0, AV_OPT_TYPE_CONST,{.i64=2}, 0, 0, FLAGS, "mode" },
+ { "slide", "set slide mode", OFFSET(slide), AV_OPT_TYPE_INT, {.i64=0}, 0, NB_SLIDE-1, FLAGS, "slide" },
+ { "replace", "replace", 0, AV_OPT_TYPE_CONST,{.i64=SLIDE_REPLACE},0, 0, FLAGS, "slide" },
+ { "scroll", "scroll", 0, AV_OPT_TYPE_CONST,{.i64=SLIDE_SCROLL}, 0, 0, FLAGS, "slide" },
+ { "direction", "set direction mode", OFFSET(direction), AV_OPT_TYPE_INT, {.i64=0}, 0, NB_DIRECTION-1, FLAGS, "direction" },
+ { "lr", "left to right", 0, AV_OPT_TYPE_CONST,{.i64=DIRECTION_LR}, 0, 0, FLAGS, "direction" },
+ { "rl", "right to left", 0, AV_OPT_TYPE_CONST,{.i64=DIRECTION_RL}, 0, 0, FLAGS, "direction" },
+ { "ud", "up to down", 0, AV_OPT_TYPE_CONST,{.i64=DIRECTION_UD}, 0, 0, FLAGS, "direction" },
+ { "du", "down to up", 0, AV_OPT_TYPE_CONST,{.i64=DIRECTION_DU}, 0, 0, FLAGS, "direction" },
+ { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(showcwt);
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+ ShowCWTContext *s = ctx->priv;
+
+ av_freep(&s->frequency_band);
+ av_frame_free(&s->kernel);
+ av_frame_free(&s->overlap);
+ av_frame_free(&s->outpicref);
+ av_frame_free(&s->fft_in);
+ av_frame_free(&s->fft_out);
+ av_frame_free(&s->ifft_in);
+ av_frame_free(&s->ifft_out);
+ av_tx_uninit(&s->fft);
+
+ if (s->ifft) {
+ for (int n = 0; n < s->nb_threads; n++)
+ av_tx_uninit(&s->ifft[n]);
+ av_freep(&s->ifft); /* also release the per-thread context array */
+ }
+}
+
+static int query_formats(AVFilterContext *ctx)
+{
+ AVFilterFormats *formats = NULL;
+ AVFilterChannelLayouts *layouts = NULL;
+ AVFilterLink *inlink = ctx->inputs[0];
+ AVFilterLink *outlink = ctx->outputs[0];
+ static const enum AVSampleFormat sample_fmts[] = { AV_SAMPLE_FMT_FLTP, AV_SAMPLE_FMT_NONE };
+ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUVJ444P, AV_PIX_FMT_YUVA444P, AV_PIX_FMT_NONE };
+ int ret;
+
+ formats = ff_make_format_list(sample_fmts);
+ if ((ret = ff_formats_ref(formats, &inlink->outcfg.formats)) < 0)
+ return ret;
+
+ layouts = ff_all_channel_counts();
+ if ((ret = ff_channel_layouts_ref(layouts, &inlink->outcfg.channel_layouts)) < 0)
+ return ret;
+
+ formats = ff_all_samplerates();
+ if ((ret = ff_formats_ref(formats, &inlink->outcfg.samplerates)) < 0)
+ return ret;
+
+ formats = ff_make_format_list(pix_fmts);
+ if ((ret = ff_formats_ref(formats, &outlink->incfg.formats)) < 0)
+ return ret;
+
+ return 0;
+}
+
+static void frequency_band(float *frequency_band,
+ int frequency_band_count,
+ float frequency_range,
+ float frequency_offset,
+ int frequency_scale, float deviation)
+{
+ deviation *= sqrtf(1.f / (4.f * M_PI)); // Heisenberg Gabor Limit
+ for (int y = 0; y < frequency_band_count; y++) {
+ float frequency = frequency_range * (1.f - (float)y / frequency_band_count) + frequency_offset;
+ float frequency_derivative = frequency_range / frequency_band_count;
+
+ if (frequency_scale > 0) {
+ frequency = powf(2.f, frequency);
+ frequency_derivative *= logf(2.f) * frequency;
+ }
+
+ frequency_band[y*2 ] = frequency;
+ frequency_band[y*2+1] = frequency_derivative * deviation;
+ }
+}
+
+#define cmul(operator, index) { \
+ const float ff = kernel[index]; \
+ isrc[n].re operator ff*dst[index].re; \
+ isrc[n].im operator ff*dst[index].im; \
+}
+
+static float remap_log(float value, float log_factor)
+{
+ float sign = (0 < value) - (value < 0);
+
+ value = logf(value * sign) * log_factor;
+
+ return 1.f - av_clipf(value, 0.f, 1.f);
+}
+
+static int run_channel_cwt_prepare(AVFilterContext *ctx, void *arg, int ch)
+{
+ ShowCWTContext *s = ctx->priv;
+ AVFrame *fin = arg;
+ const float *input = (const float *)fin->extended_data[ch];
+ float *overlap = (float *)s->overlap->extended_data[ch];
+ AVComplexFloat *src = (AVComplexFloat *)s->fft_in->extended_data[ch];
+ AVComplexFloat *dst = (AVComplexFloat *)s->fft_out->extended_data[ch];
+ const int nb_consumed_samples = s->nb_consumed_samples;
+ const int input_padding_size = s->input_padding_size;
+ const int hop_size = s->hop_size;
+ const int offset = input_padding_size - hop_size;
+
+ memmove(overlap, &overlap[hop_size], offset * sizeof(float));
+ memcpy(&overlap[offset], input,
+ fin->nb_samples * sizeof(float));
+ memset(&overlap[offset + fin->nb_samples], 0,
+ (hop_size - fin->nb_samples) * sizeof(float));
+
+ for (int n = 0; n < nb_consumed_samples; n++) {
+ src[n].re = overlap[n];
+ src[n].im = 0.f;
+ }
+
+ s->tx_fn(s->fft, dst, src, sizeof(*src));
+
+ return 0;
+}
+
+static int run_channel_cwt(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
+{
+ ShowCWTContext *s = ctx->priv;
+ const int ch = *(int *)arg;
+ const ptrdiff_t ylinesize = s->outpicref->linesize[0];
+ const ptrdiff_t ulinesize = s->outpicref->linesize[1];
+ const ptrdiff_t vlinesize = s->outpicref->linesize[2];
+ AVComplexFloat *dst = (AVComplexFloat *)s->fft_out->extended_data[ch];
+ const int output_sample_count = s->output_sample_count;
+ const float log_factor = 1.f/logf(s->logarithmic_basis);
+ const int input_padding_size = s->input_padding_size;
+ const int rest = input_padding_size % output_sample_count;
+ const int ihop_size = s->ihop_size;
+ const int ioffset = (s->output_padding_size - ihop_size) >> 1;
+ const int count = s->frequency_band_count;
+ const int start = (count * jobnr) / nb_jobs;
+ const int end = (count * (jobnr+1)) / nb_jobs;
+ const int mode = s->mode;
+ const int ihop_index = s->ihop_index;
+ const int i = ihop_index + ioffset;
+ const int w = s->w;
+
+ for (int y = start; y < end; y++) {
+ AVComplexFloat *isrc = (AVComplexFloat *)s->ifft_in->extended_data[ch * count + y];
+ AVComplexFloat *idst = (AVComplexFloat *)s->ifft_out->extended_data[ch * count + y];
+ const float *kernel = (const float *)s->kernel->extended_data[y];
+ uint8_t *dstY, *dstU, *dstV;
+ float Y, U, V;
+ int x;
+
+ if (ihop_index > 0)
+ goto put_pixels;
+
+ for (int n = 0; n < output_sample_count; n++)
+ cmul(=, n);
+
+ if (output_sample_count < input_padding_size) {
+ const int cut_index = input_padding_size - rest;
+
+ for (int chunk_index = output_sample_count; chunk_index < cut_index; chunk_index += output_sample_count)
+ for (int n = 0; n < output_sample_count; n++)
+ cmul(+=, chunk_index + n);
+ for (int n = 0; n < rest; n++)
+ cmul(+=, cut_index + n);
+ }
+
+ s->itx_fn(s->ifft[jobnr], idst, isrc, sizeof(*isrc));
+
+put_pixels:
+ x = s->pos;
+ switch (s->direction) {
+ case DIRECTION_LR:
+ case DIRECTION_RL:
+ dstY = s->outpicref->data[0] + y * ylinesize;
+ dstU = s->outpicref->data[1] + y * ulinesize;
+ dstV = s->outpicref->data[2] + y * vlinesize;
+ break;
+ case DIRECTION_UD:
+ case DIRECTION_DU:
+ dstY = s->outpicref->data[0] + x * ylinesize + w - y - 1;
+ dstU = s->outpicref->data[1] + x * ulinesize + w - y - 1;
+ dstV = s->outpicref->data[2] + x * vlinesize + w - y - 1;
+ break;
+ }
+
+ switch (s->slide) {
+ case SLIDE_REPLACE:
+ /* nothing to do here */
+ break;
+ case SLIDE_SCROLL:
+ switch (s->direction) {
+ case DIRECTION_RL:
+ memmove(dstY, dstY + 1, s->w - 1);
+ memmove(dstU, dstU + 1, s->w - 1);
+ memmove(dstV, dstV + 1, s->w - 1);
+ break;
+ case DIRECTION_LR:
+ memmove(dstY + 1, dstY, s->w - 1);
+ memmove(dstU + 1, dstU, s->w - 1);
+ memmove(dstV + 1, dstV, s->w - 1);
+ break;
+ }
+ break;
+ }
+
+ if (s->direction == DIRECTION_RL ||
+ s->direction == DIRECTION_LR) {
+ dstY += x;
+ dstU += x;
+ dstV += x;
+ }
+
+ switch (mode) {
+ case 2:
+ Y = hypotf(idst[i].re, idst[i].im);
+ Y = remap_log(Y, log_factor);
+ U = atan2f(idst[i].im, idst[i].re);
+ U = 0.5f + 0.5f * U * Y / M_PI;
+ V = 1.f - U;
+
+ dstY[0] = av_clip_uint8(lrintf(Y * 255.f));
+ dstU[0] = av_clip_uint8(lrintf(U * 255.f));
+ dstV[0] = av_clip_uint8(lrintf(V * 255.f));
+ break;
+ case 1:
+ Y = atan2f(idst[i].im, idst[i].re);
+ Y = 0.5f + 0.5f * Y / M_PI;
+
+ dstY[0] = av_clip_uint8(lrintf(Y * 255.f));
+ break;
+ case 0:
+ Y = hypotf(idst[i].re, idst[i].im);
+ Y = remap_log(Y, log_factor);
+
+ dstY[0] = av_clip_uint8(lrintf(Y * 255.f));
+ break;
+ }
+ }
+
+ return 0;
+}
+
+static void compute_kernel(AVFilterContext *ctx)
+{
+ ShowCWTContext *s = ctx->priv;
+ const int size = s->input_sample_count;
+ const float scale_factor = 1.f/(float)size;
+ const int output_sample_count = s->output_sample_count;
+ const int fsize = s->frequency_band_count;
+
+ for (int y = 0; y < fsize; y++) {
+ float *kernel = (float *)s->kernel->extended_data[y];
+ float frequency = s->frequency_band[y*2];
+ float deviation = 1.f / (s->frequency_band[y*2+1] *
+ output_sample_count);
+
+ for (int n = 0; n < size; n++) {
+ float ff, f = fabsf(n-frequency);
+
+ f = size - fabsf(f - size);
+ ff = expf(-f*f*deviation) * scale_factor;
+ kernel[n] = ff;
+ }
+ }
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+ AVFilterContext *ctx = outlink->src;
+ AVFilterLink *inlink = ctx->inputs[0];
+ ShowCWTContext *s = ctx->priv;
+ float maximum_frequency = fminf(s->maximum_frequency, inlink->sample_rate * 0.5f);
+ float minimum_frequency = s->minimum_frequency;
+ float scale = 1.f;
+ int ret;
+
+ uninit(ctx);
+
+ switch (s->direction) {
+ case DIRECTION_LR:
+ case DIRECTION_RL:
+ s->frequency_band_count = s->h;
+ break;
+ case DIRECTION_UD:
+ case DIRECTION_DU:
+ s->frequency_band_count = s->w;
+ break;
+ }
+
+ s->nb_threads = FFMIN(s->frequency_band_count, ff_filter_get_nb_threads(ctx));
+ s->old_pts = AV_NOPTS_VALUE;
+ s->nb_consumed_samples = 65536;
+
+ s->input_sample_count = s->nb_consumed_samples;
+ s->hop_size = s->nb_consumed_samples >> 1;
+ s->input_padding_size = 65536;
+ s->output_padding_size = FFMAX(16,s->input_padding_size * s->pps / inlink->sample_rate);
+
+ outlink->w = s->w;
+ outlink->h = s->h;
+ outlink->sample_aspect_ratio = (AVRational){1,1};
+
+ s->fft_in_size = FFALIGN(s->input_padding_size, av_cpu_max_align());
+ s->fft_out_size = FFALIGN(s->input_padding_size, av_cpu_max_align());
+
+ s->output_sample_count = s->output_padding_size;
+
+ s->ifft_in_size = FFALIGN(s->output_padding_size, av_cpu_max_align());
+ s->ifft_out_size = FFALIGN(s->output_padding_size, av_cpu_max_align());
+ s->ihop_size = s->output_padding_size >> 1;
+
+ ret = av_tx_init(&s->fft, &s->tx_fn, AV_TX_FLOAT_FFT, 0, s->input_padding_size, &scale, 0);
+ if (ret < 0)
+ return ret;
+
+ s->ifft = av_calloc(s->nb_threads, sizeof(*s->ifft));
+ if (!s->ifft)
+ return AVERROR(ENOMEM);
+
+ for (int n = 0; n < s->nb_threads; n++) {
+ ret = av_tx_init(&s->ifft[n], &s->itx_fn, AV_TX_FLOAT_FFT, 1, s->output_padding_size, &scale, 0);
+ if (ret < 0)
+ return ret;
+ }
+
+ s->frequency_band = av_calloc(s->frequency_band_count,
+ sizeof(*s->frequency_band) * 2);
+ s->outpicref = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+ s->fft_in = ff_get_audio_buffer(inlink, s->fft_in_size * 2);
+ s->fft_out = ff_get_audio_buffer(inlink, s->fft_out_size * 2);
+ s->overlap = ff_get_audio_buffer(inlink, s->input_padding_size);
+ s->ifft_in = av_frame_alloc();
+ s->ifft_out = av_frame_alloc();
+ s->kernel = av_frame_alloc();
+ if (!s->outpicref || !s->fft_in || !s->fft_out ||
+ !s->ifft_in || !s->ifft_out ||
+ !s->frequency_band || !s->kernel || !s->overlap)
+ return AVERROR(ENOMEM);
+
+ s->ifft_in->format = inlink->format;
+ s->ifft_in->nb_samples = s->ifft_in_size * 2;
+ s->ifft_in->ch_layout.nb_channels = s->frequency_band_count;
+ ret = av_frame_get_buffer(s->ifft_in, 0);
+ if (ret < 0)
+ return ret;
+
+ s->ifft_out->format = inlink->format;
+ s->ifft_out->nb_samples = s->ifft_out_size * 2;
+ s->ifft_out->ch_layout.nb_channels = s->frequency_band_count;
+ ret = av_frame_get_buffer(s->ifft_out, 0);
+ if (ret < 0)
+ return ret;
+
+ s->kernel->format = inlink->format;
+ s->kernel->nb_samples = s->input_padding_size;
+ s->kernel->ch_layout.nb_channels = s->frequency_band_count;
+ ret = av_frame_get_buffer(s->kernel, 0);
+ if (ret < 0)
+ return ret;
+
+ s->outpicref->sample_aspect_ratio = (AVRational){1,1};
+
+ for (int y = 0; y < outlink->h; y++) {
+ memset(s->outpicref->data[0] + y * s->outpicref->linesize[0], 0, outlink->w);
+ memset(s->outpicref->data[1] + y * s->outpicref->linesize[1], 128, outlink->w);
+ memset(s->outpicref->data[2] + y * s->outpicref->linesize[2], 128, outlink->w);
+ if (s->outpicref->data[3])
+ memset(s->outpicref->data[3] + y * s->outpicref->linesize[3], 0, outlink->w);
+ }
+
+ s->outpicref->color_range = AVCOL_RANGE_JPEG;
+
+ minimum_frequency *= s->nb_consumed_samples / (float)inlink->sample_rate;
+ maximum_frequency *= s->nb_consumed_samples / (float)inlink->sample_rate;
+ if (s->frequency_scale > 0) {
+ minimum_frequency = logf(minimum_frequency) / logf(2.f);
+ maximum_frequency = logf(maximum_frequency) / logf(2.f);
+ }
+
+ frequency_band(s->frequency_band,
+ s->frequency_band_count, maximum_frequency - minimum_frequency,
+ minimum_frequency, s->frequency_scale, s->deviation);
+
+ av_log(ctx, AV_LOG_DEBUG, "input_sample_count: %d\n", s->input_sample_count);
+ av_log(ctx, AV_LOG_DEBUG, "output_sample_count: %d\n", s->output_sample_count);
+
+ switch (s->direction) {
+ case DIRECTION_LR:
+ s->pos = 0;
+ break;
+ case DIRECTION_RL:
+ s->pos = s->w - 1;
+ break;
+ case DIRECTION_UD:
+ s->pos = 0;
+ break;
+ case DIRECTION_DU:
+ s->pos = s->h - 1;
+ break;
+ }
+
+ s->auto_frame_rate = av_make_q(inlink->sample_rate, s->hop_size);
+ if (strcmp(s->rate_str, "auto")) {
+ ret = av_parse_video_rate(&s->frame_rate, s->rate_str);
+ if (ret < 0) /* reject an unparsable rate instead of continuing */
+ return ret;
+ } else {
+ s->frame_rate = s->auto_frame_rate;
+ }
+ outlink->frame_rate = s->frame_rate;
+ outlink->time_base = av_inv_q(outlink->frame_rate);
+
+ compute_kernel(ctx);
+
+ return 0;
+}
+
+static int activate(AVFilterContext *ctx)
+{
+ AVFilterLink *inlink = ctx->inputs[0];
+ AVFilterLink *outlink = ctx->outputs[0];
+ ShowCWTContext *s = ctx->priv;
+ int ret = 0, status;
+ int64_t pts;
+
+ FF_FILTER_FORWARD_STATUS_BACK(outlink, inlink);
+
+ if (s->outpicref) {
+ const int ch = 0;
+ AVFrame *fin;
+
+ if (s->ihop_index == 0) {
+ ret = ff_inlink_consume_samples(inlink, s->hop_size, s->hop_size, &fin);
+ if (ret < 0)
+ return ret;
+ if (ret > 0) {
+ run_channel_cwt_prepare(ctx, fin, ch);
+ s->in_pts = fin->pts;
+ s->in_nb_samples = fin->nb_samples;
+ av_frame_free(&fin);
+ }
+ }
+
+ if (ret > 0 || s->ihop_index > 0) {
+ int64_t pts_offset;
+
+ switch (s->slide) {
+ case SLIDE_SCROLL:
+ switch (s->direction) {
+ case DIRECTION_UD:
+ for (int p = 0; p < 3; p++) {
+ ptrdiff_t linesize = s->outpicref->linesize[p];
+
+ for (int y = s->h - 1; y > 0; y--) {
+ uint8_t *dst = s->outpicref->data[p] + y * linesize;
+
+ memmove(dst, dst - linesize, s->w);
+ }
+ }
+ break;
+ case DIRECTION_DU:
+ for (int p = 0; p < 3; p++) {
+ ptrdiff_t linesize = s->outpicref->linesize[p];
+
+ for (int y = 0; y < s->h - 1; y++) {
+ uint8_t *dst = s->outpicref->data[p] + y * linesize;
+
+ memmove(dst, dst + linesize, s->w);
+ }
+ }
+ break;
+ }
+ break;
+ }
+
+ ff_filter_execute(ctx, run_channel_cwt, (void *)&ch, NULL,
+ s->nb_threads);
+
+ pts_offset = av_rescale_q(s->ihop_index, av_make_q(1, s->ihop_size), av_make_q(1, s->in_nb_samples));
+ s->outpicref->pts = av_rescale_q(s->in_pts + pts_offset, inlink->time_base, outlink->time_base);
+
+ s->ihop_index++;
+ if (s->ihop_index >= s->ihop_size)
+ s->ihop_index = 0;
+
+ switch (s->slide) {
+ case SLIDE_REPLACE:
+ switch (s->direction) {
+ case DIRECTION_LR:
+ s->pos++;
+ if (s->pos >= s->w)
+ s->pos = 0;
+ break;
+ case DIRECTION_RL:
+ s->pos--;
+ if (s->pos < 0)
+ s->pos = s->w - 1;
+ break;
+ case DIRECTION_UD:
+ s->pos++;
+ if (s->pos >= s->h)
+ s->pos = 0;
+ break;
+ case DIRECTION_DU:
+ s->pos--;
+ if (s->pos < 0)
+ s->pos = s->h - 1;
+ break;
+ }
+ break;
+ case SLIDE_SCROLL:
+ switch (s->direction) {
+ case DIRECTION_LR:
+ s->pos = 0;
+ break;
+ case DIRECTION_RL:
+ s->pos = s->w - 1;
+ break;
+ case DIRECTION_UD:
+ s->pos = 0;
+ break;
+ case DIRECTION_DU:
+ s->pos = s->h - 1;
+ break;
+ }
+ break;
+ }
+
+ if (s->old_pts < s->outpicref->pts) {
+ AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
+ if (!out)
+ return AVERROR(ENOMEM);
+ ret = av_frame_copy_props(out, s->outpicref);
+ if (ret < 0)
+ goto fail;
+ ret = av_frame_copy(out, s->outpicref);
+ if (ret < 0)
+ goto fail;
+ s->old_pts = s->outpicref->pts;
+ ret = ff_filter_frame(outlink, out);
+ if (ret <= 0)
+ return ret;
+fail:
+ av_frame_free(&out);
+ return ret;
+ }
+ }
+ }
+
+ if (ff_inlink_acknowledge_status(inlink, &status, &pts)) {
+ if (status == AVERROR_EOF) {
+ ff_outlink_set_status(outlink, status, pts);
+ return 0;
+ }
+ }
+
+ if (ff_inlink_queued_samples(inlink) >= s->hop_size || s->ihop_index) {
+ ff_filter_set_ready(ctx, 10);
+ return 0;
+ }
+
+ if (ff_outlink_frame_wanted(outlink)) {
+ ff_inlink_request_frame(inlink);
+ return 0;
+ }
+
+ return FFERROR_NOT_READY;
+}
+
+static const AVFilterPad showcwt_inputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_AUDIO,
+ },
+};
+
+static const AVFilterPad showcwt_outputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_VIDEO,
+ .config_props = config_output,
+ },
+};
+
+const AVFilter ff_avf_showcwt = {
+ .name = "showcwt",
+ .description = NULL_IF_CONFIG_SMALL("Convert input audio to a CWT (Continuous Wavelet Transform) spectrum video output."),
+ .uninit = uninit,
+ .priv_size = sizeof(ShowCWTContext),
+ FILTER_INPUTS(showcwt_inputs),
+ FILTER_OUTPUTS(showcwt_outputs),
+ FILTER_QUERY_FUNC(query_formats),
+ .activate = activate,
+ .priv_class = &showcwt_class,
+ .flags = AVFILTER_FLAG_SLICE_THREADS,
+};
--
2.37.2
[-- Attachment #3: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
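
A note on the cmul() loops in run_channel_cwt(): taken together, they leave each of the output_sample_count inverse-FFT input bins holding the sum of all kernel-weighted forward-FFT bins whose index is congruent to it modulo output_sample_count. The sketch below shows only that folding step, on real values and without the kernel weighting. fold_spectrum() is a hypothetical name for illustration, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Fold in[0..n) down to m bins: out[k] receives the sum of every
 * in[i] with i % m == k. This mirrors the chunked "=" / "+=" loops
 * in the patch, including the shorter final chunk when m does not
 * divide n. */
static void fold_spectrum(const float *in, size_t n, float *out, size_t m)
{
    for (size_t k = 0; k < m; k++)
        out[k] = 0.f;
    for (size_t i = 0; i < n; i++)
        out[i % m] += in[i];
}
```

When m divides n, folding a spectrum like this is the frequency-domain counterpart of keeping every (n/m)-th time sample, which is presumably why a short inverse transform per band is enough to produce the pixel columns.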
* Re: [FFmpeg-devel] [PATCH] avfilter: add showcwt multimedia filter
2022-11-26 11:01 ` Paul B Mahol
@ 2022-11-27 21:04 ` Paul B Mahol
0 siblings, 0 replies; 3+ messages in thread
From: Paul B Mahol @ 2022-11-27 21:04 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 11/26/22, Paul B Mahol <onemda@gmail.com> wrote:
> On 11/25/22, Paul B Mahol <onemda@gmail.com> wrote:
>> Hello,
>>
>> Patch attached.
>>
>
> Improved patch attached.
>
> Added slide & direction options.
>
Gonna push soon.
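
For reference, the slide and direction options mentioned above interact through the write position s->pos that activate() updates after every output column or row. A minimal sketch of that update rule, assuming the enum order shown here (the real definitions sit earlier in avf_showcwt.c and may be ordered differently); next_pos() is a hypothetical helper, not code from the patch.

```c
#include <assert.h>

/* Assumed ordering, for illustration only. */
enum Slide     { SLIDE_REPLACE, SLIDE_SCROLL };
enum Direction { DIRECTION_LR, DIRECTION_RL, DIRECTION_UD, DIRECTION_DU };

/* Return the next write position for a w x h picture. */
static int next_pos(int slide, int direction, int pos, int w, int h)
{
    const int horizontal = direction == DIRECTION_LR || direction == DIRECTION_RL;
    const int forward    = direction == DIRECTION_LR || direction == DIRECTION_UD;
    const int extent     = horizontal ? w : h;

    /* scroll: the picture moves, the write position stays pinned to one edge */
    if (slide == SLIDE_SCROLL)
        return forward ? 0 : extent - 1;

    /* replace: the write position walks across the picture and wraps */
    pos += forward ? 1 : -1;
    if (pos >= extent)
        return 0;
    if (pos < 0)
        return extent - 1;
    return pos;
}
```

So with slide=replace the cursor sweeps over a fixed picture and wraps at the far edge, while with slide=scroll new data is always written at the same edge and the existing picture content is shifted with memmove().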