* [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio
@ 2024-07-02 1:33 Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 1/5] avfilter/af_volumedetect.c: Move logdb function Yigithan Yigit
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 1:33 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: thilo.borgmann, yigithanyigitdevel
Changes since v1:
- Defined a callback that filter_frame calls, assigned according to planar/packed and float/int.
- Fixed rounding the same value twice
- Subnormal values are supported
- Replaced the division by the squared full-scale value with ldexp
Yigithan Yigit (5):
avfilter/af_volumedetect.c: Move logdb function
avfilter/af_volumedetect.c: Add 32bit float audio support
avfilter/af_volumedetect.c: Added functions for int/float and
planar/packed
avfilter/af_volumedetect.c: reindent after last commit
Replace division with ldexp
libavfilter/af_volumedetect.c | 255 ++++++++++++++++++++++++++--------
1 file changed, 198 insertions(+), 57 deletions(-)
--
2.45.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH v3 1/5] avfilter/af_volumedetect.c: Move logdb function
2024-07-02 1:33 [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio Yigithan Yigit
@ 2024-07-02 1:33 ` Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support Yigithan Yigit
` (3 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 1:33 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: thilo.borgmann, yigithanyigitdevel
---
libavfilter/af_volumedetect.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
index 8b001d1cf2..327801a7f9 100644
--- a/libavfilter/af_volumedetect.c
+++ b/libavfilter/af_volumedetect.c
@@ -24,6 +24,8 @@
#include "avfilter.h"
#include "internal.h"
+#define MAX_DB 91
+
typedef struct VolDetectContext {
/**
* Number of samples at each PCM value.
@@ -33,6 +35,14 @@ typedef struct VolDetectContext {
uint64_t histogram[0x10001];
} VolDetectContext;
+static inline double logdb(uint64_t v)
+{
+ double d = v / (double)(0x8000 * 0x8000);
+ if (!v)
+ return MAX_DB;
+ return -log10(d) * 10;
+}
+
static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
{
AVFilterContext *ctx = inlink->dst;
@@ -56,16 +66,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
return ff_filter_frame(inlink->dst->outputs[0], samples);
}
-#define MAX_DB 91
-
-static inline double logdb(uint64_t v)
-{
- double d = v / (double)(0x8000 * 0x8000);
- if (!v)
- return MAX_DB;
- return -log10(d) * 10;
-}
-
static void print_stats(AVFilterContext *ctx)
{
VolDetectContext *vd = ctx->priv;
--
2.45.2
* [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support
2024-07-02 1:33 [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 1/5] avfilter/af_volumedetect.c: Move logdb function Yigithan Yigit
@ 2024-07-02 1:33 ` Yigithan Yigit
2024-07-02 5:46 ` Rémi Denis-Courmont
2024-07-02 5:51 ` Rémi Denis-Courmont
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 3/5] avfilter/af_volumedetect.c: Added functions for int/float and planar/packed Yigithan Yigit
` (2 subsequent siblings)
4 siblings, 2 replies; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 1:33 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: thilo.borgmann, yigithanyigitdevel
---
libavfilter/af_volumedetect.c | 139 ++++++++++++++++++++++++++--------
1 file changed, 107 insertions(+), 32 deletions(-)
diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
index 327801a7f9..edd2d56f7a 100644
--- a/libavfilter/af_volumedetect.c
+++ b/libavfilter/af_volumedetect.c
@@ -1,5 +1,6 @@
/*
* Copyright (c) 2012 Nicolas George
+ * Copyright (c) 2024 Yigithan Yigit - 32 Bit Float Audio Support
*
* This file is part of FFmpeg.
*
@@ -20,48 +21,62 @@
#include "libavutil/channel_layout.h"
#include "libavutil/avassert.h"
+#include "libavutil/mem.h"
#include "audio.h"
#include "avfilter.h"
#include "internal.h"
+#define MAX_DB_FLT 1024
#define MAX_DB 91
+#define HISTOGRAM_SIZE 0x10000
+#define HISTOGRAM_SIZE_FLT (MAX_DB_FLT*2)
+
+typedef struct VolDetectContext VolDetectContext;
typedef struct VolDetectContext {
- /**
- * Number of samples at each PCM value.
- * histogram[0x8000 + i] is the number of samples at value i.
- * The extra element is there for symmetry.
- */
- uint64_t histogram[0x10001];
+ uint64_t* histogram; ///< for integer number of samples at each PCM value, for float number of samples at each dB
+ uint64_t nb_samples; ///< number of samples
+ double sum2; ///< sum of the squares of the samples
+ double max; ///< maximum sample value
+ int is_float; ///< true if the input is in floating point
+ void (*process_samples)(VolDetectContext *vd, AVFrame *samples);
} VolDetectContext;
-static inline double logdb(uint64_t v)
+static inline double logdb(double v, enum AVSampleFormat sample_fmt)
{
- double d = v / (double)(0x8000 * 0x8000);
- if (!v)
- return MAX_DB;
- return -log10(d) * 10;
+ if (sample_fmt == AV_SAMPLE_FMT_FLT) {
+ if (!v)
+ return MAX_DB_FLT;
+ return -log10(v) * 10;
+ } else {
+ double d = v / (double)(0x8000 * 0x8000);
+ if (!v)
+ return MAX_DB;
+ return -log10(d) * 10;
+ }
+}
+
+static void update_float_stats(VolDetectContext *vd, float *audio_data)
+{
+ double sample;
+ int idx;
+ if(!isfinite(*audio_data) || isnan(*audio_data))
+ return;
+ sample = fabsf(*audio_data);
+ if (sample > vd->max)
+ vd->max = sample;
+ vd->sum2 += sample * sample;
+ idx = (int)floorf(logdb(sample * sample, AV_SAMPLE_FMT_FLT)) + MAX_DB_FLT;
+ vd->histogram[idx]++;
+ vd->nb_samples++;
}
static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
{
AVFilterContext *ctx = inlink->dst;
VolDetectContext *vd = ctx->priv;
- int nb_samples = samples->nb_samples;
- int nb_channels = samples->ch_layout.nb_channels;
- int nb_planes = nb_channels;
- int plane, i;
- int16_t *pcm;
-
- if (!av_sample_fmt_is_planar(samples->format)) {
- nb_samples *= nb_channels;
- nb_planes = 1;
- }
- for (plane = 0; plane < nb_planes; plane++) {
- pcm = (int16_t *)samples->extended_data[plane];
- for (i = 0; i < nb_samples; i++)
- vd->histogram[pcm[i] + 0x8000]++;
- }
+
+ vd->process_samples(vd, samples);
return ff_filter_frame(inlink->dst->outputs[0], samples);
}
@@ -73,6 +88,20 @@ static void print_stats(AVFilterContext *ctx)
uint64_t nb_samples = 0, power = 0, nb_samples_shift = 0, sum = 0;
uint64_t histdb[MAX_DB + 1] = { 0 };
+ if (!vd->nb_samples)
+ return;
+ if (vd->is_float) {
+ av_log(ctx, AV_LOG_INFO, "n_samples: %" PRId64 "\n", vd->nb_samples);
+ av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb(vd->sum2 / vd->nb_samples, AV_SAMPLE_FMT_FLT));
+ av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -2.0*logdb(vd->max, AV_SAMPLE_FMT_FLT));
+ for (i = 0; i < HISTOGRAM_SIZE_FLT && !vd->histogram[i]; i++);
+ for (; i >= 0 && sum < vd->nb_samples / 1000; i++) {
+ if (!vd->histogram[i])
+ continue;
+ av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %" PRId64 "\n", MAX_DB_FLT - i, vd->histogram[i]);
+ sum += vd->histogram[i];
+ }
+ } else {
for (i = 0; i < 0x10000; i++)
nb_samples += vd->histogram[i];
av_log(ctx, AV_LOG_INFO, "n_samples: %"PRId64"\n", nb_samples);
@@ -92,26 +121,61 @@ static void print_stats(AVFilterContext *ctx)
return;
power = (power + nb_samples_shift / 2) / nb_samples_shift;
av_assert0(power <= 0x8000 * 0x8000);
- av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb(power));
+ av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb((double)power, AV_SAMPLE_FMT_S16));
max_volume = 0x8000;
while (max_volume > 0 && !vd->histogram[0x8000 + max_volume] &&
!vd->histogram[0x8000 - max_volume])
max_volume--;
- av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb(max_volume * max_volume));
+ av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb((double)(max_volume * max_volume), AV_SAMPLE_FMT_S16));
for (i = 0; i < 0x10000; i++)
- histdb[(int)logdb((i - 0x8000) * (i - 0x8000))] += vd->histogram[i];
+ histdb[(int)logdb((double)(i - 0x8000) * (i - 0x8000), AV_SAMPLE_FMT_S16)] += vd->histogram[i];
for (i = 0; i <= MAX_DB && !histdb[i]; i++);
for (; i <= MAX_DB && sum < nb_samples / 1000; i++) {
- av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", i, histdb[i]);
+ av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", -i, histdb[i]);
sum += histdb[i];
}
+ }
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+ AVFilterContext *ctx = outlink->src;
+ VolDetectContext *vd = ctx->priv;
+ size_t histogram_size;
+
+ vd->is_float = outlink->format == AV_SAMPLE_FMT_FLT ||
+ outlink->format == AV_SAMPLE_FMT_FLTP;
+
+ if (!vd->is_float) {
+ /*
+ * Number of samples at each PCM value.
+ * Only used for integer formats.
+ * For 16 bit signed PCM there are 65536.
+ * histogram[0x8000 + i] is the number of samples at value i.
+ * The extra element is there for symmetry.
+ */
+ histogram_size = HISTOGRAM_SIZE + 1;
+ } else {
+ /*
+ * The histogram is used to store the number of samples at each dB
+ * instead of the number of samples at each PCM value.
+ */
+ histogram_size = HISTOGRAM_SIZE_FLT + 1;
+ }
+ vd->histogram = av_calloc(histogram_size, sizeof(uint64_t));
+ if (!vd->histogram)
+ return AVERROR(ENOMEM);
+ return 0;
}
static av_cold void uninit(AVFilterContext *ctx)
{
+ VolDetectContext *vd = ctx->priv;
print_stats(ctx);
+ if (vd->histogram)
+ av_freep(&vd->histogram);
}
static const AVFilterPad volumedetect_inputs[] = {
@@ -122,6 +186,14 @@ static const AVFilterPad volumedetect_inputs[] = {
},
};
+static const AVFilterPad volumedetect_outputs[] = {
+ {
+ .name = "default",
+ .type = AVMEDIA_TYPE_AUDIO,
+ .config_props = config_output,
+ },
+};
+
const AVFilter ff_af_volumedetect = {
.name = "volumedetect",
.description = NULL_IF_CONFIG_SMALL("Detect audio volume."),
@@ -129,6 +201,9 @@ const AVFilter ff_af_volumedetect = {
.uninit = uninit,
.flags = AVFILTER_FLAG_METADATA_ONLY,
FILTER_INPUTS(volumedetect_inputs),
- FILTER_OUTPUTS(ff_audio_default_filterpad),
- FILTER_SAMPLEFMTS(AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S16P),
+ FILTER_OUTPUTS(volumedetect_outputs),
+ FILTER_SAMPLEFMTS(AV_SAMPLE_FMT_S16,
+ AV_SAMPLE_FMT_S16P,
+ AV_SAMPLE_FMT_FLT,
+ AV_SAMPLE_FMT_FLTP),
};
--
2.45.2
* [FFmpeg-devel] [PATCH v3 3/5] avfilter/af_volumedetect.c: Added functions for int/float and planar/packed
2024-07-02 1:33 [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 1/5] avfilter/af_volumedetect.c: Move logdb function Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support Yigithan Yigit
@ 2024-07-02 1:33 ` Yigithan Yigit
2024-07-02 7:50 ` Anton Khirnov
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 4/5] avfilter/af_volumedetect.c: reindent after last commit Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 5/5] Replace division with ldexp Yigithan Yigit
4 siblings, 1 reply; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 1:33 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: thilo.borgmann, yigithanyigitdevel
---
libavfilter/af_volumedetect.c | 66 +++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
index edd2d56f7a..778f0cac6c 100644
--- a/libavfilter/af_volumedetect.c
+++ b/libavfilter/af_volumedetect.c
@@ -71,6 +71,64 @@ static void update_float_stats(VolDetectContext *vd, float *audio_data)
vd->nb_samples++;
}
+static void process_float_planar_samples(VolDetectContext *vd, AVFrame *samples)
+{
+ int plane, i;
+ int nb_channels = samples->ch_layout.nb_channels;
+ int nb_samples = samples->nb_samples;
+ float *audio_data;
+ for (plane = 0; plane < nb_channels; plane++) {
+ audio_data = (float *)samples->extended_data[plane];
+ for (i = 0; i < nb_samples; i++) {
+ update_float_stats(vd, &audio_data[i]);
+ }
+ }
+}
+
+static void process_float_packed_samples(VolDetectContext *vd, AVFrame *samples)
+{
+ int i, j;
+ int nb_channels = samples->ch_layout.nb_channels;
+ int nb_samples = samples->nb_samples;
+ float *audio_data;
+ for (i = 0; i < nb_samples; i++) {
+ audio_data = (float *)samples->extended_data[0];
+ for (j = 0; j < nb_channels; j++) {
+ update_float_stats(vd, &audio_data[i * nb_channels + j]);
+ }
+ }
+}
+
+static void process_int_planar_samples(VolDetectContext *vd, AVFrame *samples)
+{
+ int plane, i;
+ int nb_channels = samples->ch_layout.nb_channels;
+ int nb_samples = samples->nb_samples;
+ int16_t *pcm;
+ for (plane = 0; plane < nb_channels; plane++) {
+ pcm = (int16_t *)samples->extended_data[plane];
+ for (i = 0; i < nb_samples; i++) {
+ vd->histogram[pcm[i] + 0x8000]++;
+ vd->nb_samples++;
+ }
+ }
+}
+
+static void process_int_packed_samples(VolDetectContext *vd, AVFrame *samples)
+{
+ int i, j;
+ int nb_channels = samples->ch_layout.nb_channels;
+ int nb_samples = samples->nb_samples;
+ int16_t *pcm;
+ for (i = 0; i < nb_samples; i++) {
+ pcm = (int16_t *)samples->extended_data[0];
+ for (j = 0; j < nb_channels; j++) {
+ vd->histogram[pcm[i * nb_channels + j] + 0x8000]++;
+ vd->nb_samples++;
+ }
+ }
+}
+
static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
{
AVFilterContext *ctx = inlink->dst;
@@ -157,12 +215,20 @@ static int config_output(AVFilterLink *outlink)
* The extra element is there for symmetry.
*/
histogram_size = HISTOGRAM_SIZE + 1;
+ if (av_sample_fmt_is_planar(outlink->format))
+ vd->process_samples = process_int_planar_samples;
+ else
+ vd->process_samples = process_int_packed_samples;
} else {
/*
* The histogram is used to store the number of samples at each dB
* instead of the number of samples at each PCM value.
*/
histogram_size = HISTOGRAM_SIZE_FLT + 1;
+ if (av_sample_fmt_is_planar(outlink->format))
+ vd->process_samples = process_float_planar_samples;
+ else
+ vd->process_samples = process_float_packed_samples;
}
vd->histogram = av_calloc(histogram_size, sizeof(uint64_t));
if (!vd->histogram)
--
2.45.2
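The four callbacks in this patch differ only in how they walk the frame's buffers. As a rough illustration without any FFmpeg types, the sketch below accumulates the sum of squares for both layouts; the function names and the plain float arrays are assumptions made for the example. In the planar layout there is one buffer per channel, while in the packed layout a single buffer holds the channels interleaved.

```c
/* Planar: planes[ch] points to nb_samples values for channel ch. */
static double sum_squares_planar(const float *const *planes,
                                 int nb_channels, int nb_samples)
{
    double sum2 = 0.0;
    for (int ch = 0; ch < nb_channels; ch++)
        for (int i = 0; i < nb_samples; i++)
            sum2 += (double)planes[ch][i] * planes[ch][i];
    return sum2;
}

/* Packed: data holds nb_samples * nb_channels interleaved values, so a
 * single flat loop visits every sample of every channel. */
static double sum_squares_packed(const float *data,
                                 int nb_channels, int nb_samples)
{
    double sum2 = 0.0;
    for (int i = 0; i < nb_samples * nb_channels; i++)
        sum2 += (double)data[i] * data[i];
    return sum2;
}
```

Both functions visit the same set of samples, so statistics that do not depend on channel order (sum of squares, maximum, histogram counts) come out the same for either layout.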
* [FFmpeg-devel] [PATCH v3 4/5] avfilter/af_volumedetect.c: reindent after last commit
2024-07-02 1:33 [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio Yigithan Yigit
` (2 preceding siblings ...)
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 3/5] avfilter/af_volumedetect.c: Added functions for int/float and planar/packed Yigithan Yigit
@ 2024-07-02 1:33 ` Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 5/5] Replace division with ldexp Yigithan Yigit
4 siblings, 0 replies; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 1:33 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: thilo.borgmann, yigithanyigitdevel
---
libavfilter/af_volumedetect.c | 62 +++++++++++++++++------------------
1 file changed, 31 insertions(+), 31 deletions(-)
diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
index 778f0cac6c..a53212015d 100644
--- a/libavfilter/af_volumedetect.c
+++ b/libavfilter/af_volumedetect.c
@@ -160,40 +160,40 @@ static void print_stats(AVFilterContext *ctx)
sum += vd->histogram[i];
}
} else {
- for (i = 0; i < 0x10000; i++)
- nb_samples += vd->histogram[i];
- av_log(ctx, AV_LOG_INFO, "n_samples: %"PRId64"\n", nb_samples);
- if (!nb_samples)
- return;
+ for (i = 0; i < 0x10000; i++)
+ nb_samples += vd->histogram[i];
+ av_log(ctx, AV_LOG_INFO, "n_samples: %"PRId64"\n", nb_samples);
+ if (!nb_samples)
+ return;
- /* If nb_samples > 1<<34, there is a risk of overflow in the
- multiplication or the sum: shift all histogram values to avoid that.
- The total number of samples must be recomputed to avoid rounding
- errors. */
- shift = av_log2(nb_samples >> 33);
- for (i = 0; i < 0x10000; i++) {
- nb_samples_shift += vd->histogram[i] >> shift;
- power += (i - 0x8000) * (i - 0x8000) * (vd->histogram[i] >> shift);
- }
- if (!nb_samples_shift)
- return;
- power = (power + nb_samples_shift / 2) / nb_samples_shift;
- av_assert0(power <= 0x8000 * 0x8000);
- av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb((double)power, AV_SAMPLE_FMT_S16));
+ /* If nb_samples > 1<<34, there is a risk of overflow in the
+ multiplication or the sum: shift all histogram values to avoid that.
+ The total number of samples must be recomputed to avoid rounding
+ errors. */
+ shift = av_log2(nb_samples >> 33);
+ for (i = 0; i < 0x10000; i++) {
+ nb_samples_shift += vd->histogram[i] >> shift;
+ power += (i - 0x8000) * (i - 0x8000) * (vd->histogram[i] >> shift);
+ }
+ if (!nb_samples_shift)
+ return;
+ power = (power + nb_samples_shift / 2) / nb_samples_shift;
+ av_assert0(power <= 0x8000 * 0x8000);
+ av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb((double)power, AV_SAMPLE_FMT_S16));
- max_volume = 0x8000;
- while (max_volume > 0 && !vd->histogram[0x8000 + max_volume] &&
- !vd->histogram[0x8000 - max_volume])
- max_volume--;
- av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb((double)(max_volume * max_volume), AV_SAMPLE_FMT_S16));
+ max_volume = 0x8000;
+ while (max_volume > 0 && !vd->histogram[0x8000 + max_volume] &&
+ !vd->histogram[0x8000 - max_volume])
+ max_volume--;
+ av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb((double)(max_volume * max_volume), AV_SAMPLE_FMT_S16));
- for (i = 0; i < 0x10000; i++)
- histdb[(int)logdb((double)(i - 0x8000) * (i - 0x8000), AV_SAMPLE_FMT_S16)] += vd->histogram[i];
- for (i = 0; i <= MAX_DB && !histdb[i]; i++);
- for (; i <= MAX_DB && sum < nb_samples / 1000; i++) {
- av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", -i, histdb[i]);
- sum += histdb[i];
- }
+ for (i = 0; i < 0x10000; i++)
+ histdb[(int)logdb((double)(i - 0x8000) * (i - 0x8000), AV_SAMPLE_FMT_S16)] += vd->histogram[i];
+ for (i = 0; i <= MAX_DB && !histdb[i]; i++);
+ for (; i <= MAX_DB && sum < nb_samples / 1000; i++) {
+ av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", -i, histdb[i]);
+ sum += histdb[i];
+ }
}
}
--
2.45.2
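The reindented integer branch keeps the existing overflow guard: when the total sample count is large, every histogram count is shifted right before being multiplied by the squared sample value. The block below is a standalone sketch of that computation only. floor_log2 is a stand-in written for this example in place of FFmpeg's av_log2(), and mean_power is a hypothetical name; the loop bounds and rounding follow the code in the patch.

```c
#include <stdint.h>

/* Stand-in for av_log2(): floor(log2(v)), with 0 mapped to 0. */
static int floor_log2(uint64_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Mean squared sample value, where hist[0x8000 + s] counts samples of
 * value s. hist must have at least 0x10000 entries. */
static uint64_t mean_power(const uint64_t *hist)
{
    uint64_t nb_samples = 0, nb_shifted = 0, power = 0;
    int i, shift;

    for (i = 0; i < 0x10000; i++)
        nb_samples += hist[i];
    if (!nb_samples)
        return 0;

    /* Shift counts down so that sum(s * s * count) fits in 64 bits. */
    shift = floor_log2(nb_samples >> 33);
    for (i = 0; i < 0x10000; i++) {
        uint64_t count = hist[i] >> shift;
        int64_t s = i - 0x8000;
        nb_shifted += count;            /* recomputed to avoid rounding bias */
        power += (uint64_t)(s * s) * count;
    }
    if (!nb_shifted)
        return 0;
    return (power + nb_shifted / 2) / nb_shifted;   /* rounded mean */
}
```

For small inputs the shift is zero and the result is the plain rounded mean of the squares; the shift only takes effect once the total count reaches 2^34.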
* [FFmpeg-devel] [PATCH v3 5/5] Replace division with ldexp
2024-07-02 1:33 [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio Yigithan Yigit
` (3 preceding siblings ...)
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 4/5] avfilter/af_volumedetect.c: reindent after last commit Yigithan Yigit
@ 2024-07-02 1:33 ` Yigithan Yigit
4 siblings, 0 replies; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 1:33 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: thilo.borgmann, yigithanyigitdevel
---
libavfilter/af_volumedetect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
index a53212015d..856d5a295d 100644
--- a/libavfilter/af_volumedetect.c
+++ b/libavfilter/af_volumedetect.c
@@ -49,7 +49,7 @@ static inline double logdb(double v, enum AVSampleFormat sample_fmt)
return MAX_DB_FLT;
return -log10(v) * 10;
} else {
- double d = v / (double)(0x8000 * 0x8000);
+ double d = ldexp(v, -30);
if (!v)
return MAX_DB;
return -log10(d) * 10;
--
2.45.2
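Since 0x8000 * 0x8000 is 2^30, dividing by it and calling ldexp(v, -30) both scale by an exact power of two and should give identical results for the values logdb() sees. The sketch below puts the two forms side by side; the function names are made up for the comparison and do not appear in the patch.

```c
#include <math.h>

/* Form used before this patch: divide by full scale squared (2^30). */
static double scale_div(double v)
{
    return v / (double)(0x8000 * 0x8000);
}

/* Form used after this patch: adjust the exponent by -30 directly. */
static double scale_ldexp(double v)
{
    return ldexp(v, -30);
}
```

Both are exact for inputs in the range used here, so the change affects how the scaling is expressed and not the reported values.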
* Re: [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support Yigithan Yigit
@ 2024-07-02 5:46 ` Rémi Denis-Courmont
2024-07-02 5:49 ` Rémi Denis-Courmont
2024-07-02 5:51 ` Rémi Denis-Courmont
1 sibling, 1 reply; 11+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-02 5:46 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 2 July 2024 04:33:51 GMT+03:00, Yigithan Yigit <yigithanyigitdevel@gmail.com> wrote:
>---
> libavfilter/af_volumedetect.c | 139 ++++++++++++++++++++++++++--------
> 1 file changed, 107 insertions(+), 32 deletions(-)
Did you try to compile this patch?
* Re: [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support
2024-07-02 5:46 ` Rémi Denis-Courmont
@ 2024-07-02 5:49 ` Rémi Denis-Courmont
0 siblings, 0 replies; 11+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-02 5:49 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 2 July 2024 08:46:53 GMT+03:00, "Rémi Denis-Courmont" <remi@remlab.net> wrote:
>
>
>On 2 July 2024 04:33:51 GMT+03:00, Yigithan Yigit <yigithanyigitdevel@gmail.com> wrote:
>>---
>> libavfilter/af_volumedetect.c | 139 ++++++++++++++++++++++++++--------
>> 1 file changed, 107 insertions(+), 32 deletions(-)
>
>Did you try to compile this patch?
Never mind, I misread.
* Re: [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support Yigithan Yigit
2024-07-02 5:46 ` Rémi Denis-Courmont
@ 2024-07-02 5:51 ` Rémi Denis-Courmont
2024-07-02 11:46 ` Yigithan Yigit
1 sibling, 1 reply; 11+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-02 5:51 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 2 July 2024 04:33:51 GMT+03:00, Yigithan Yigit <yigithanyigitdevel@gmail.com> wrote:
>---
> libavfilter/af_volumedetect.c | 139 ++++++++++++++++++++++++++--------
> 1 file changed, 107 insertions(+), 32 deletions(-)
>
>diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
>index 327801a7f9..edd2d56f7a 100644
>--- a/libavfilter/af_volumedetect.c
>+++ b/libavfilter/af_volumedetect.c
>@@ -1,5 +1,6 @@
> /*
> * Copyright (c) 2012 Nicolas George
>+ * Copyright (c) 2024 Yigithan Yigit - 32 Bit Float Audio Support
> *
> * This file is part of FFmpeg.
> *
>@@ -20,48 +21,62 @@
>
> #include "libavutil/channel_layout.h"
> #include "libavutil/avassert.h"
>+#include "libavutil/mem.h"
> #include "audio.h"
> #include "avfilter.h"
> #include "internal.h"
>
>+#define MAX_DB_FLT 1024
> #define MAX_DB 91
>+#define HISTOGRAM_SIZE 0x10000
>+#define HISTOGRAM_SIZE_FLT (MAX_DB_FLT*2)
>+
>+typedef struct VolDetectContext VolDetectContext;
>
> typedef struct VolDetectContext {
>- /**
>- * Number of samples at each PCM value.
>- * histogram[0x8000 + i] is the number of samples at value i.
>- * The extra element is there for symmetry.
>- */
>- uint64_t histogram[0x10001];
>+ uint64_t* histogram; ///< for integer number of samples at each PCM value, for float number of samples at each dB
>+ uint64_t nb_samples; ///< number of samples
>+ double sum2; ///< sum of the squares of the samples
>+ double max; ///< maximum sample value
>+ int is_float; ///< true if the input is in floating point
>+ void (*process_samples)(VolDetectContext *vd, AVFrame *samples);
> } VolDetectContext;
>
>-static inline double logdb(uint64_t v)
>+static inline double logdb(double v, enum AVSampleFormat sample_fmt)
> {
>- double d = v / (double)(0x8000 * 0x8000);
>- if (!v)
>- return MAX_DB;
>- return -log10(d) * 10;
>+ if (sample_fmt == AV_SAMPLE_FMT_FLT) {

There's no point in doing this. You've already up-converted to double precision and do all the calculations in double precision. Maybe that's fine or maybe not, but either way, this doesn't look sensible.

>+ if (!v)
>+ return MAX_DB_FLT;
>+ return -log10(v) * 10;
>+ } else {
>+ double d = v / (double)(0x8000 * 0x8000);
>+ if (!v)
>+ return MAX_DB;
>+ return -log10(d) * 10;
>+ }
>+}
>+
>+static void update_float_stats(VolDetectContext *vd, float *audio_data)
>+{
>+ double sample;
>+ int idx;
>+ if(!isfinite(*audio_data) || isnan(*audio_data))
>+ return;
>+ sample = fabsf(*audio_data);
>+ if (sample > vd->max)
>+ vd->max = sample;
>+ vd->sum2 += sample * sample;
>+ idx = (int)floorf(logdb(sample * sample, AV_SAMPLE_FMT_FLT)) + MAX_DB_FLT;
>+ vd->histogram[idx]++;
>+ vd->nb_samples++;
> }
>
> static int filter_frame(AVFilterLink *inlink, AVFrame *samples)
> {
> AVFilterContext *ctx = inlink->dst;
> VolDetectContext *vd = ctx->priv;
>- int nb_samples = samples->nb_samples;
>- int nb_channels = samples->ch_layout.nb_channels;
>- int nb_planes = nb_channels;
>- int plane, i;
>- int16_t *pcm;
>-
>- if (!av_sample_fmt_is_planar(samples->format)) {
>- nb_samples *= nb_channels;
>- nb_planes = 1;
>- }
>- for (plane = 0; plane < nb_planes; plane++) {
>- pcm = (int16_t *)samples->extended_data[plane];
>- for (i = 0; i < nb_samples; i++)
>- vd->histogram[pcm[i] + 0x8000]++;
>- }
>+
>+ vd->process_samples(vd, samples);
>
> return ff_filter_frame(inlink->dst->outputs[0], samples);
> }
>@@ -73,6 +88,20 @@ static void print_stats(AVFilterContext *ctx)
> uint64_t nb_samples = 0, power = 0, nb_samples_shift = 0, sum = 0;
> uint64_t histdb[MAX_DB + 1] = { 0 };
>
>+ if (!vd->nb_samples)
>+ return;
>+ if (vd->is_float) {
>+ av_log(ctx, AV_LOG_INFO, "n_samples: %" PRId64 "\n", vd->nb_samples);
>+ av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb(vd->sum2 / vd->nb_samples, AV_SAMPLE_FMT_FLT));
>+ av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -2.0*logdb(vd->max, AV_SAMPLE_FMT_FLT));
>+ for (i = 0; i < HISTOGRAM_SIZE_FLT && !vd->histogram[i]; i++);
>+ for (; i >= 0 && sum < vd->nb_samples / 1000; i++) {
>+ if (!vd->histogram[i])
>+ continue;
>+ av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %" PRId64 "\n", MAX_DB_FLT - i, vd->histogram[i]);
>+ sum += vd->histogram[i];
>+ }
>+ } else {
> for (i = 0; i < 0x10000; i++)
> nb_samples += vd->histogram[i];
> av_log(ctx, AV_LOG_INFO, "n_samples: %"PRId64"\n", nb_samples);
>@@ -92,26 +121,61 @@ static void print_stats(AVFilterContext *ctx)
> return;
> power = (power + nb_samples_shift / 2) / nb_samples_shift;
> av_assert0(power <= 0x8000 * 0x8000);
>- av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb(power));
>+ av_log(ctx, AV_LOG_INFO, "mean_volume: %.1f dB\n", -logdb((double)power, AV_SAMPLE_FMT_S16));
>
> max_volume = 0x8000;
> while (max_volume > 0 && !vd->histogram[0x8000 + max_volume] &&
> !vd->histogram[0x8000 - max_volume])
> max_volume--;
>- av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb(max_volume * max_volume));
>+ av_log(ctx, AV_LOG_INFO, "max_volume: %.1f dB\n", -logdb((double)(max_volume * max_volume), AV_SAMPLE_FMT_S16));
>
> for (i = 0; i < 0x10000; i++)
>- histdb[(int)logdb((i - 0x8000) * (i - 0x8000))] += vd->histogram[i];
>+ histdb[(int)logdb((double)(i - 0x8000) * (i - 0x8000), AV_SAMPLE_FMT_S16)] += vd->histogram[i];
> for (i = 0; i <= MAX_DB && !histdb[i]; i++);
> for (; i <= MAX_DB && sum < nb_samples / 1000; i++) {
>- av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", i, histdb[i]);
>+ av_log(ctx, AV_LOG_INFO, "histogram_%ddb: %"PRId64"\n", -i, histdb[i]);
> sum += histdb[i];
> }
>+ }
>+}
>+
>+static int config_output(AVFilterLink *outlink)
>+{
>+ AVFilterContext *ctx = outlink->src;
>+ VolDetectContext *vd = ctx->priv;
>+ size_t histogram_size;
>+
>+ vd->is_float = outlink->format == AV_SAMPLE_FMT_FLT ||
>+ outlink->format == AV_SAMPLE_FMT_FLTP;
>+
>+ if (!vd->is_float) {
>+ /*
>+ * Number of samples at each PCM value.
>+ * Only used for integer formats.
>+ * For 16 bit signed PCM there are 65536.
>+ * histogram[0x8000 + i] is the number of samples at value i.
>+ * The extra element is there for symmetry.
>+ */
>+ histogram_size = HISTOGRAM_SIZE + 1;
>+ } else {
>+ /*
>+ * The histogram is used to store the number of samples at each dB
>+ * instead of the number of samples at each PCM value.
>+ */
>+ histogram_size = HISTOGRAM_SIZE_FLT + 1;
>+ }
>+ vd->histogram = av_calloc(histogram_size, sizeof(uint64_t));
>+ if (!vd->histogram)
>+ return AVERROR(ENOMEM);
>+ return 0;
> }
>
> static av_cold void uninit(AVFilterContext *ctx)
> {
>+ VolDetectContext *vd = ctx->priv;
> print_stats(ctx);
>+ if (vd->histogram)
>+ av_freep(&vd->histogram);
> }
>
> static const AVFilterPad volumedetect_inputs[] = {
>@@ -122,6 +186,14 @@ static const AVFilterPad volumedetect_inputs[] = {
> },
> };
>
>+static const AVFilterPad volumedetect_outputs[] = {
>+ {
>+ .name = "default",
>+ .type = AVMEDIA_TYPE_AUDIO,
>+ .config_props = config_output,
>+ },
>+};
>+
> const AVFilter ff_af_volumedetect = {
> .name = "volumedetect",
> .description = NULL_IF_CONFIG_SMALL("Detect audio volume."),
>@@ -129,6 +201,9 @@ const AVFilter ff_af_volumedetect = {
> .uninit = uninit,
> .flags = AVFILTER_FLAG_METADATA_ONLY,
> FILTER_INPUTS(volumedetect_inputs),
>- FILTER_OUTPUTS(ff_audio_default_filterpad),
>- FILTER_SAMPLEFMTS(AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S16P),
>+ FILTER_OUTPUTS(volumedetect_outputs),
>+ FILTER_SAMPLEFMTS(AV_SAMPLE_FMT_S16,
>+ AV_SAMPLE_FMT_S16P,
>+ AV_SAMPLE_FMT_FLT,
>+ AV_SAMPLE_FMT_FLTP),
> };
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3 3/5] avfilter/af_volumedetect.c: Added functions for int/float and planar/packed
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 3/5] avfilter/af_volumedetect.c: Added functions for int/float and planar/packed Yigithan Yigit
@ 2024-07-02 7:50 ` Anton Khirnov
0 siblings, 0 replies; 11+ messages in thread
From: Anton Khirnov @ 2024-07-02 7:50 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Cc: thilo.borgmann, yigithanyigitdevel
Why did you ignore my comment from the last iteration?
--
Anton Khirnov
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support
2024-07-02 5:51 ` Rémi Denis-Courmont
@ 2024-07-02 11:46 ` Yigithan Yigit
0 siblings, 0 replies; 11+ messages in thread
From: Yigithan Yigit @ 2024-07-02 11:46 UTC (permalink / raw)
To: FFmpeg development discussions and patches
> On Jul 2, 2024, at 8:51 AM, Rémi Denis-Courmont <remi@remlab.net> wrote:
>
>
>
> On 2 July 2024 04:33:51 GMT+03:00, Yigithan Yigit <yigithanyigitdevel@gmail.com> wrote:
>> ---
>> libavfilter/af_volumedetect.c | 139 ++++++++++++++++++++++++++--------
>> 1 file changed, 107 insertions(+), 32 deletions(-)
>>
>> diff --git a/libavfilter/af_volumedetect.c b/libavfilter/af_volumedetect.c
>> index 327801a7f9..edd2d56f7a 100644
>> --- a/libavfilter/af_volumedetect.c
>> +++ b/libavfilter/af_volumedetect.c
>> @@ -1,5 +1,6 @@
>> /*
>> * Copyright (c) 2012 Nicolas George
>> + * Copyright (c) 2024 Yigithan Yigit - 32 Bit Float Audio Support
>> *
>> * This file is part of FFmpeg.
>> *
>> @@ -20,48 +21,62 @@
>>
>> #include "libavutil/channel_layout.h"
>> #include "libavutil/avassert.h"
>> +#include "libavutil/mem.h"
>> #include "audio.h"
>> #include "avfilter.h"
>> #include "internal.h"
>>
>> +#define MAX_DB_FLT 1024
>> #define MAX_DB 91
>> +#define HISTOGRAM_SIZE 0x10000
>> +#define HISTOGRAM_SIZE_FLT (MAX_DB_FLT*2)
>> +
>> +typedef struct VolDetectContext VolDetectContext;
>>
>> typedef struct VolDetectContext {
>> - /**
>> - * Number of samples at each PCM value.
>> - * histogram[0x8000 + i] is the number of samples at value i.
>> - * The extra element is there for symmetry.
>> - */
>> - uint64_t histogram[0x10001];
>> + uint64_t* histogram; ///< for integer number of samples at each PCM value, for float number of samples at each dB
>> + uint64_t nb_samples; ///< number of samples
>> + double sum2; ///< sum of the squares of the samples
>> + double max; ///< maximum sample value
>> + int is_float; ///< true if the input is in floating point
>> + void (*process_samples)(VolDetectContext *vd, AVFrame *samples);
>> } VolDetectContext;
>>
>> -static inline double logdb(uint64_t v)
>> +static inline double logdb(double v, enum AVSampleFormat sample_fmt)
>> {
>> - double d = v / (double)(0x8000 * 0x8000);
>> - if (!v)
>> - return MAX_DB;
>> - return -log10(d) * 10;
>> + if (sample_fmt == AV_SAMPLE_FMT_FLT) {
>
> There's no point in doing this. You've already up-converted to double precision and do all the calculations in double precision. Maybe that's fine or maybe not, but either way, this doesn't look sensible.
>
>> + if (!v)
>> + return MAX_DB_FLT;
>> + return -log10(v) * 10;
>> + } else {
>> + double d = v / (double)(0x8000 * 0x8000);
>> + if (!v)
>> + return MAX_DB;
>> + return -log10(d) * 10;
>> + }
>> +}
>> +
If I understand your concern correctly, we should have a function like this:
> static inline double logdb(double v, enum AVSampleFormat sample_fmt)
> {
> if (!v)
> return sample_fmt == AV_SAMPLE_FMT_FLT ? MAX_DB_FLT : MAX_DB;
>
> if (sample_fmt == AV_SAMPLE_FMT_S16)
> v = ldexp(v, -30);
>
> return -log10(v) * 10;
> }
What do you think about that?
Thanks for the feedback.
Yigithan
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-07-02 11:47 UTC | newest]
Thread overview: 11+ messages
2024-07-02 1:33 [FFmpeg-devel] [PATCH v3 0/5] avfilter/af_volumedetect.c: Add 32bit float audio Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 1/5] avfilter/af_volumedetect.c: Move logdb function Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 2/5] avfilter/af_volumedetect.c: Add 32bit float audio support Yigithan Yigit
2024-07-02 5:46 ` Rémi Denis-Courmont
2024-07-02 5:49 ` Rémi Denis-Courmont
2024-07-02 5:51 ` Rémi Denis-Courmont
2024-07-02 11:46 ` Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 3/5] avfilter/af_volumedetect.c: Added functions for int/float and planar/packed Yigithan Yigit
2024-07-02 7:50 ` Anton Khirnov
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 4/5] avfilter/af_volumedetect.c: reindent after last commit Yigithan Yigit
2024-07-02 1:33 ` [FFmpeg-devel] [PATCH v3 5/5] Replace division with ldexp Yigithan Yigit