From: Paul B Mahol <onemda@gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] avfilter: merge loudnorm filter functionality into f_ebur128.c
Date: Wed, 15 Nov 2023 21:46:54 +0100
Message-ID: <CAPYw7P5MGP2Hg9Ca3qbKv3MnGYJH2y3T9Y=Dy251EWPLTpDtYg@mail.gmail.com>
In-Reply-To: <CAPYw7P7AE0pZSzCOKXOF5DjABa8fJbOGLtReBBW5mB1D5=X9aQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 10 bytes --]

Attached.

[-- Attachment #2: 0001-avfilter-merge-loudnorm-filter-functionality-into-f_.patch --]
[-- Type: text/x-patch, Size: 118899 bytes --]

From 433296d8aaf2de06a34a1f7deaaca72c25c8962b Mon Sep 17 00:00:00 2001
From: Paul B Mahol <onemda@gmail.com>
Date: Fri, 29 Sep 2023 20:53:51 +0200
Subject: [PATCH] avfilter: merge loudnorm filter functionality into f_ebur128.c

Signed-off-by: Paul B Mahol <onemda@gmail.com>
---
 libavfilter/Makefile      |   1 -
 libavfilter/af_loudnorm.c | 941 --------------------------------------
 libavfilter/ebur128.c     | 725 -----------------------------
 libavfilter/ebur128.h     | 229 ----------
 libavfilter/f_ebur128.c   | 914 +++++++++++++++++++++++++++++-------
 5 files changed, 752 insertions(+), 2058 deletions(-)
 delete mode 100644 libavfilter/af_loudnorm.c
 delete mode 100644 libavfilter/ebur128.c
 delete mode 100644 libavfilter/ebur128.h

diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index abdc430723..26325603ab 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -150,7 +150,6 @@ OBJS-$(CONFIG_HIGHPASS_FILTER) += af_biquads.o
 OBJS-$(CONFIG_HIGHSHELF_FILTER) += af_biquads.o
 OBJS-$(CONFIG_JOIN_FILTER) += af_join.o
 OBJS-$(CONFIG_LADSPA_FILTER) += af_ladspa.o
-OBJS-$(CONFIG_LOUDNORM_FILTER) += af_loudnorm.o ebur128.o
 OBJS-$(CONFIG_LOWPASS_FILTER) += af_biquads.o
 OBJS-$(CONFIG_LOWSHELF_FILTER) += af_biquads.o
 OBJS-$(CONFIG_LV2_FILTER) += af_lv2.o
diff --git a/libavfilter/af_loudnorm.c b/libavfilter/af_loudnorm.c
deleted file mode 100644
index d83398ae2a..0000000000
---
a/libavfilter/af_loudnorm.c +++ /dev/null @@ -1,941 +0,0 @@ -/* - * Copyright (c) 2016 Kyle Swanson <k@ylo.ph>. - * - * This file is part of FFmpeg. - * - * FFmpeg is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public - * License as published by the Free Software Foundation; either - * version 2.1 of the License, or (at your option) any later version. - * - * FFmpeg is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with FFmpeg; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA - */ - -/* http://k.ylo.ph/2016/04/04/loudnorm.html */ - -#include "libavutil/opt.h" -#include "avfilter.h" -#include "filters.h" -#include "formats.h" -#include "internal.h" -#include "audio.h" -#include "ebur128.h" - -enum FrameType { - FIRST_FRAME, - INNER_FRAME, - FINAL_FRAME, - LINEAR_MODE, - FRAME_NB -}; - -enum LimiterState { - OUT, - ATTACK, - SUSTAIN, - RELEASE, - STATE_NB -}; - -enum PrintFormat { - NONE, - JSON, - SUMMARY, - PF_NB -}; - -typedef struct LoudNormContext { - const AVClass *class; - double target_i; - double target_lra; - double target_tp; - double measured_i; - double measured_lra; - double measured_tp; - double measured_thresh; - double offset; - int linear; - int dual_mono; - enum PrintFormat print_format; - - double *buf; - int buf_size; - int buf_index; - int prev_buf_index; - - double delta[30]; - double weights[21]; - double prev_delta; - int index; - - double gain_reduction[2]; - double *limiter_buf; - double *prev_smp; - int limiter_buf_index; - int limiter_buf_size; - enum LimiterState limiter_state; - int peak_index; - int env_index; - int env_cnt; - int 
attack_length; - int release_length; - - int64_t pts[30]; - enum FrameType frame_type; - int above_threshold; - int prev_nb_samples; - int channels; - - FFEBUR128State *r128_in; - FFEBUR128State *r128_out; -} LoudNormContext; - -#define OFFSET(x) offsetof(LoudNormContext, x) -#define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_FILTERING_PARAM - -static const AVOption loudnorm_options[] = { - { "I", "set integrated loudness target", OFFSET(target_i), AV_OPT_TYPE_DOUBLE, {.dbl = -24.}, -70., -5., FLAGS }, - { "i", "set integrated loudness target", OFFSET(target_i), AV_OPT_TYPE_DOUBLE, {.dbl = -24.}, -70., -5., FLAGS }, - { "LRA", "set loudness range target", OFFSET(target_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 7.}, 1., 50., FLAGS }, - { "lra", "set loudness range target", OFFSET(target_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 7.}, 1., 50., FLAGS }, - { "TP", "set maximum true peak", OFFSET(target_tp), AV_OPT_TYPE_DOUBLE, {.dbl = -2.}, -9., 0., FLAGS }, - { "tp", "set maximum true peak", OFFSET(target_tp), AV_OPT_TYPE_DOUBLE, {.dbl = -2.}, -9., 0., FLAGS }, - { "measured_I", "measured IL of input file", OFFSET(measured_i), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, -99., 0., FLAGS }, - { "measured_i", "measured IL of input file", OFFSET(measured_i), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, -99., 0., FLAGS }, - { "measured_LRA", "measured LRA of input file", OFFSET(measured_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, 0., 99., FLAGS }, - { "measured_lra", "measured LRA of input file", OFFSET(measured_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, 0., 99., FLAGS }, - { "measured_TP", "measured true peak of input file", OFFSET(measured_tp), AV_OPT_TYPE_DOUBLE, {.dbl = 99.}, -99., 99., FLAGS }, - { "measured_tp", "measured true peak of input file", OFFSET(measured_tp), AV_OPT_TYPE_DOUBLE, {.dbl = 99.}, -99., 99., FLAGS }, - { "measured_thresh", "measured threshold of input file", OFFSET(measured_thresh), AV_OPT_TYPE_DOUBLE, {.dbl = -70.}, -99., 0., FLAGS }, - { "offset", "set offset gain", OFFSET(offset), 
AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, -99., 99., FLAGS }, - { "linear", "normalize linearly if possible", OFFSET(linear), AV_OPT_TYPE_BOOL, {.i64 = 1}, 0, 1, FLAGS }, - { "dual_mono", "treat mono input as dual-mono", OFFSET(dual_mono), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, FLAGS }, - { "print_format", "set print format for stats", OFFSET(print_format), AV_OPT_TYPE_INT, {.i64 = NONE}, NONE, PF_NB -1, FLAGS, "print_format" }, - { "none", 0, 0, AV_OPT_TYPE_CONST, {.i64 = NONE}, 0, 0, FLAGS, "print_format" }, - { "json", 0, 0, AV_OPT_TYPE_CONST, {.i64 = JSON}, 0, 0, FLAGS, "print_format" }, - { "summary", 0, 0, AV_OPT_TYPE_CONST, {.i64 = SUMMARY}, 0, 0, FLAGS, "print_format" }, - { NULL } -}; - -AVFILTER_DEFINE_CLASS(loudnorm); - -static inline int frame_size(int sample_rate, int frame_len_msec) -{ - const int frame_size = round((double)sample_rate * (frame_len_msec / 1000.0)); - return frame_size + (frame_size % 2); -} - -static void init_gaussian_filter(LoudNormContext *s) -{ - double total_weight = 0.0; - const double sigma = 3.5; - double adjust; - int i; - - const int offset = 21 / 2; - const double c1 = 1.0 / (sigma * sqrt(2.0 * M_PI)); - const double c2 = 2.0 * pow(sigma, 2.0); - - for (i = 0; i < 21; i++) { - const int x = i - offset; - s->weights[i] = c1 * exp(-(pow(x, 2.0) / c2)); - total_weight += s->weights[i]; - } - - adjust = 1.0 / total_weight; - for (i = 0; i < 21; i++) - s->weights[i] *= adjust; -} - -static double gaussian_filter(LoudNormContext *s, int index) -{ - double result = 0.; - int i; - - index = index - 10 > 0 ? index - 10 : index + 20; - for (i = 0; i < 21; i++) - result += s->delta[((index + i) < 30) ? 
(index + i) : (index + i - 30)] * s->weights[i]; - - return result; -} - -static void detect_peak(LoudNormContext *s, int offset, int nb_samples, int channels, int *peak_delta, double *peak_value) -{ - int n, c, i, index; - double ceiling; - double *buf; - - *peak_delta = -1; - buf = s->limiter_buf; - ceiling = s->target_tp; - - index = s->limiter_buf_index + (offset * channels) + (1920 * channels); - if (index >= s->limiter_buf_size) - index -= s->limiter_buf_size; - - if (s->frame_type == FIRST_FRAME) { - for (c = 0; c < channels; c++) - s->prev_smp[c] = fabs(buf[index + c - channels]); - } - - for (n = 0; n < nb_samples; n++) { - for (c = 0; c < channels; c++) { - double this, next, max_peak; - - this = fabs(buf[(index + c) < s->limiter_buf_size ? (index + c) : (index + c - s->limiter_buf_size)]); - next = fabs(buf[(index + c + channels) < s->limiter_buf_size ? (index + c + channels) : (index + c + channels - s->limiter_buf_size)]); - - if ((s->prev_smp[c] <= this) && (next <= this) && (this > ceiling) && (n > 0)) { - int detected; - - detected = 1; - for (i = 2; i < 12; i++) { - next = fabs(buf[(index + c + (i * channels)) < s->limiter_buf_size ? (index + c + (i * channels)) : (index + c + (i * channels) - s->limiter_buf_size)]); - if (next > this) { - detected = 0; - break; - } - } - - if (!detected) - continue; - - for (c = 0; c < channels; c++) { - if (c == 0 || fabs(buf[index + c]) > max_peak) - max_peak = fabs(buf[index + c]); - - s->prev_smp[c] = fabs(buf[(index + c) < s->limiter_buf_size ? 
(index + c) : (index + c - s->limiter_buf_size)]); - } - - *peak_delta = n; - s->peak_index = index; - *peak_value = max_peak; - return; - } - - s->prev_smp[c] = this; - } - - index += channels; - if (index >= s->limiter_buf_size) - index -= s->limiter_buf_size; - } -} - -static void true_peak_limiter(LoudNormContext *s, double *out, int nb_samples, int channels) -{ - int n, c, index, peak_delta, smp_cnt; - double ceiling, peak_value; - double *buf; - - buf = s->limiter_buf; - ceiling = s->target_tp; - index = s->limiter_buf_index; - smp_cnt = 0; - - if (s->frame_type == FIRST_FRAME) { - double max; - - max = 0.; - for (n = 0; n < 1920; n++) { - for (c = 0; c < channels; c++) { - max = fabs(buf[c]) > max ? fabs(buf[c]) : max; - } - buf += channels; - } - - if (max > ceiling) { - s->gain_reduction[1] = ceiling / max; - s->limiter_state = SUSTAIN; - buf = s->limiter_buf; - - for (n = 0; n < 1920; n++) { - for (c = 0; c < channels; c++) { - double env; - env = s->gain_reduction[1]; - buf[c] *= env; - } - buf += channels; - } - } - - buf = s->limiter_buf; - } - - do { - - switch(s->limiter_state) { - case OUT: - detect_peak(s, smp_cnt, nb_samples - smp_cnt, channels, &peak_delta, &peak_value); - if (peak_delta != -1) { - s->env_cnt = 0; - smp_cnt += (peak_delta - s->attack_length); - s->gain_reduction[0] = 1.; - s->gain_reduction[1] = ceiling / peak_value; - s->limiter_state = ATTACK; - - s->env_index = s->peak_index - (s->attack_length * channels); - if (s->env_index < 0) - s->env_index += s->limiter_buf_size; - - s->env_index += (s->env_cnt * channels); - if (s->env_index > s->limiter_buf_size) - s->env_index -= s->limiter_buf_size; - - } else { - smp_cnt = nb_samples; - } - break; - - case ATTACK: - for (; s->env_cnt < s->attack_length; s->env_cnt++) { - for (c = 0; c < channels; c++) { - double env; - env = s->gain_reduction[0] - ((double) s->env_cnt / (s->attack_length - 1) * (s->gain_reduction[0] - s->gain_reduction[1])); - buf[s->env_index + c] *= env; - } - - 
s->env_index += channels; - if (s->env_index >= s->limiter_buf_size) - s->env_index -= s->limiter_buf_size; - - smp_cnt++; - if (smp_cnt >= nb_samples) { - s->env_cnt++; - break; - } - } - - if (smp_cnt < nb_samples) { - s->env_cnt = 0; - s->attack_length = 1920; - s->limiter_state = SUSTAIN; - } - break; - - case SUSTAIN: - detect_peak(s, smp_cnt, nb_samples, channels, &peak_delta, &peak_value); - if (peak_delta == -1) { - s->limiter_state = RELEASE; - s->gain_reduction[0] = s->gain_reduction[1]; - s->gain_reduction[1] = 1.; - s->env_cnt = 0; - break; - } else { - double gain_reduction; - gain_reduction = ceiling / peak_value; - - if (gain_reduction < s->gain_reduction[1]) { - s->limiter_state = ATTACK; - - s->attack_length = peak_delta; - if (s->attack_length <= 1) - s->attack_length = 2; - - s->gain_reduction[0] = s->gain_reduction[1]; - s->gain_reduction[1] = gain_reduction; - s->env_cnt = 0; - break; - } - - for (s->env_cnt = 0; s->env_cnt < peak_delta; s->env_cnt++) { - for (c = 0; c < channels; c++) { - double env; - env = s->gain_reduction[1]; - buf[s->env_index + c] *= env; - } - - s->env_index += channels; - if (s->env_index >= s->limiter_buf_size) - s->env_index -= s->limiter_buf_size; - - smp_cnt++; - if (smp_cnt >= nb_samples) { - s->env_cnt++; - break; - } - } - } - break; - - case RELEASE: - for (; s->env_cnt < s->release_length; s->env_cnt++) { - for (c = 0; c < channels; c++) { - double env; - env = s->gain_reduction[0] + (((double) s->env_cnt / (s->release_length - 1)) * (s->gain_reduction[1] - s->gain_reduction[0])); - buf[s->env_index + c] *= env; - } - - s->env_index += channels; - if (s->env_index >= s->limiter_buf_size) - s->env_index -= s->limiter_buf_size; - - smp_cnt++; - if (smp_cnt >= nb_samples) { - s->env_cnt++; - break; - } - } - - if (smp_cnt < nb_samples) { - s->env_cnt = 0; - s->limiter_state = OUT; - } - - break; - } - - } while (smp_cnt < nb_samples); - - for (n = 0; n < nb_samples; n++) { - for (c = 0; c < channels; c++) { - 
out[c] = buf[index + c]; - if (fabs(out[c]) > ceiling) { - out[c] = ceiling * (out[c] < 0 ? -1 : 1); - } - } - out += channels; - index += channels; - if (index >= s->limiter_buf_size) - index -= s->limiter_buf_size; - } -} - -static int filter_frame(AVFilterLink *inlink, AVFrame *in) -{ - AVFilterContext *ctx = inlink->dst; - LoudNormContext *s = ctx->priv; - AVFilterLink *outlink = ctx->outputs[0]; - AVFrame *out; - const double *src; - double *dst; - double *buf; - double *limiter_buf; - int i, n, c, subframe_length, src_index; - double gain, gain_next, env_global, env_shortterm, - global, shortterm, lra, relative_threshold; - - if (av_frame_is_writable(in)) { - out = in; - } else { - out = ff_get_audio_buffer(outlink, in->nb_samples); - if (!out) { - av_frame_free(&in); - return AVERROR(ENOMEM); - } - av_frame_copy_props(out, in); - } - - out->pts = s->pts[0]; - memmove(s->pts, &s->pts[1], (FF_ARRAY_ELEMS(s->pts) - 1) * sizeof(s->pts[0])); - - src = (const double *)in->data[0]; - dst = (double *)out->data[0]; - buf = s->buf; - limiter_buf = s->limiter_buf; - - ff_ebur128_add_frames_double(s->r128_in, src, in->nb_samples); - - if (s->frame_type == FIRST_FRAME && in->nb_samples < frame_size(inlink->sample_rate, 3000)) { - double offset, offset_tp, true_peak; - - ff_ebur128_loudness_global(s->r128_in, &global); - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - double tmp; - ff_ebur128_sample_peak(s->r128_in, c, &tmp); - if (c == 0 || tmp > true_peak) - true_peak = tmp; - } - - offset = pow(10., (s->target_i - global) / 20.); - offset_tp = true_peak * offset; - s->offset = offset_tp < s->target_tp ? 
offset : s->target_tp / true_peak; - s->frame_type = LINEAR_MODE; - } - - switch (s->frame_type) { - case FIRST_FRAME: - for (n = 0; n < in->nb_samples; n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - buf[s->buf_index + c] = src[c]; - } - src += inlink->ch_layout.nb_channels; - s->buf_index += inlink->ch_layout.nb_channels; - } - - ff_ebur128_loudness_shortterm(s->r128_in, &shortterm); - - if (shortterm < s->measured_thresh) { - s->above_threshold = 0; - env_shortterm = shortterm <= -70. ? 0. : s->target_i - s->measured_i; - } else { - s->above_threshold = 1; - env_shortterm = shortterm <= -70. ? 0. : s->target_i - shortterm; - } - - for (n = 0; n < 30; n++) - s->delta[n] = pow(10., env_shortterm / 20.); - s->prev_delta = s->delta[s->index]; - - s->buf_index = - s->limiter_buf_index = 0; - - for (n = 0; n < (s->limiter_buf_size / inlink->ch_layout.nb_channels); n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - limiter_buf[s->limiter_buf_index + c] = buf[s->buf_index + c] * s->delta[s->index] * s->offset; - } - s->limiter_buf_index += inlink->ch_layout.nb_channels; - if (s->limiter_buf_index >= s->limiter_buf_size) - s->limiter_buf_index -= s->limiter_buf_size; - - s->buf_index += inlink->ch_layout.nb_channels; - } - - subframe_length = frame_size(inlink->sample_rate, 100); - true_peak_limiter(s, dst, subframe_length, inlink->ch_layout.nb_channels); - ff_ebur128_add_frames_double(s->r128_out, dst, subframe_length); - - out->nb_samples = subframe_length; - - s->frame_type = INNER_FRAME; - break; - - case INNER_FRAME: - gain = gaussian_filter(s, s->index + 10 < 30 ? s->index + 10 : s->index + 10 - 30); - gain_next = gaussian_filter(s, s->index + 11 < 30 ? 
s->index + 11 : s->index + 11 - 30); - - for (n = 0; n < in->nb_samples; n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - buf[s->prev_buf_index + c] = src[c]; - limiter_buf[s->limiter_buf_index + c] = buf[s->buf_index + c] * (gain + (((double) n / in->nb_samples) * (gain_next - gain))) * s->offset; - } - src += inlink->ch_layout.nb_channels; - - s->limiter_buf_index += inlink->ch_layout.nb_channels; - if (s->limiter_buf_index >= s->limiter_buf_size) - s->limiter_buf_index -= s->limiter_buf_size; - - s->prev_buf_index += inlink->ch_layout.nb_channels; - if (s->prev_buf_index >= s->buf_size) - s->prev_buf_index -= s->buf_size; - - s->buf_index += inlink->ch_layout.nb_channels; - if (s->buf_index >= s->buf_size) - s->buf_index -= s->buf_size; - } - - subframe_length = (frame_size(inlink->sample_rate, 100) - in->nb_samples) * inlink->ch_layout.nb_channels; - s->limiter_buf_index = s->limiter_buf_index + subframe_length < s->limiter_buf_size ? s->limiter_buf_index + subframe_length : s->limiter_buf_index + subframe_length - s->limiter_buf_size; - - true_peak_limiter(s, dst, in->nb_samples, inlink->ch_layout.nb_channels); - ff_ebur128_add_frames_double(s->r128_out, dst, in->nb_samples); - - ff_ebur128_loudness_range(s->r128_in, &lra); - ff_ebur128_loudness_global(s->r128_in, &global); - ff_ebur128_loudness_shortterm(s->r128_in, &shortterm); - ff_ebur128_relative_threshold(s->r128_in, &relative_threshold); - - if (s->above_threshold == 0) { - double shortterm_out; - - if (shortterm > s->measured_thresh) - s->prev_delta *= 1.0058; - - ff_ebur128_loudness_shortterm(s->r128_out, &shortterm_out); - if (shortterm_out >= s->target_i) - s->above_threshold = 1; - } - - if (shortterm < relative_threshold || shortterm <= -70. || s->above_threshold == 0) { - s->delta[s->index] = s->prev_delta; - } else { - env_global = fabs(shortterm - global) < (s->target_lra / 2.) ? shortterm - global : (s->target_lra / 2.) * ((shortterm - global) < 0 ? 
-1 : 1); - env_shortterm = s->target_i - shortterm; - s->delta[s->index] = pow(10., (env_global + env_shortterm) / 20.); - } - - s->prev_delta = s->delta[s->index]; - s->index++; - if (s->index >= 30) - s->index -= 30; - s->prev_nb_samples = in->nb_samples; - break; - - case FINAL_FRAME: - gain = gaussian_filter(s, s->index + 10 < 30 ? s->index + 10 : s->index + 10 - 30); - s->limiter_buf_index = 0; - src_index = 0; - - for (n = 0; n < s->limiter_buf_size / inlink->ch_layout.nb_channels; n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - s->limiter_buf[s->limiter_buf_index + c] = src[src_index + c] * gain * s->offset; - } - src_index += inlink->ch_layout.nb_channels; - - s->limiter_buf_index += inlink->ch_layout.nb_channels; - if (s->limiter_buf_index >= s->limiter_buf_size) - s->limiter_buf_index -= s->limiter_buf_size; - } - - subframe_length = frame_size(inlink->sample_rate, 100); - for (i = 0; i < in->nb_samples / subframe_length; i++) { - true_peak_limiter(s, dst, subframe_length, inlink->ch_layout.nb_channels); - - for (n = 0; n < subframe_length; n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - if (src_index < (in->nb_samples * inlink->ch_layout.nb_channels)) { - limiter_buf[s->limiter_buf_index + c] = src[src_index + c] * gain * s->offset; - } else { - limiter_buf[s->limiter_buf_index + c] = 0.; - } - } - - if (src_index < (in->nb_samples * inlink->ch_layout.nb_channels)) - src_index += inlink->ch_layout.nb_channels; - - s->limiter_buf_index += inlink->ch_layout.nb_channels; - if (s->limiter_buf_index >= s->limiter_buf_size) - s->limiter_buf_index -= s->limiter_buf_size; - } - - dst += (subframe_length * inlink->ch_layout.nb_channels); - } - - dst = (double *)out->data[0]; - ff_ebur128_add_frames_double(s->r128_out, dst, in->nb_samples); - break; - - case LINEAR_MODE: - for (n = 0; n < in->nb_samples; n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - dst[c] = src[c] * s->offset; - } - src += 
inlink->ch_layout.nb_channels; - dst += inlink->ch_layout.nb_channels; - } - - dst = (double *)out->data[0]; - ff_ebur128_add_frames_double(s->r128_out, dst, in->nb_samples); - break; - } - - if (in != out) - av_frame_free(&in); - return ff_filter_frame(outlink, out); -} - -static int flush_frame(AVFilterLink *outlink) -{ - AVFilterContext *ctx = outlink->src; - AVFilterLink *inlink = ctx->inputs[0]; - LoudNormContext *s = ctx->priv; - int ret = 0; - - if (s->frame_type == INNER_FRAME) { - double *src; - double *buf; - int nb_samples, n, c, offset; - AVFrame *frame; - - nb_samples = (s->buf_size / inlink->ch_layout.nb_channels) - s->prev_nb_samples; - nb_samples -= (frame_size(inlink->sample_rate, 100) - s->prev_nb_samples); - - frame = ff_get_audio_buffer(outlink, nb_samples); - if (!frame) - return AVERROR(ENOMEM); - frame->nb_samples = nb_samples; - - buf = s->buf; - src = (double *)frame->data[0]; - - offset = ((s->limiter_buf_size / inlink->ch_layout.nb_channels) - s->prev_nb_samples) * inlink->ch_layout.nb_channels; - offset -= (frame_size(inlink->sample_rate, 100) - s->prev_nb_samples) * inlink->ch_layout.nb_channels; - s->buf_index = s->buf_index - offset < 0 ? 
s->buf_index - offset + s->buf_size : s->buf_index - offset; - - for (n = 0; n < nb_samples; n++) { - for (c = 0; c < inlink->ch_layout.nb_channels; c++) { - src[c] = buf[s->buf_index + c]; - } - src += inlink->ch_layout.nb_channels; - s->buf_index += inlink->ch_layout.nb_channels; - if (s->buf_index >= s->buf_size) - s->buf_index -= s->buf_size; - } - - s->frame_type = FINAL_FRAME; - ret = filter_frame(inlink, frame); - } - return ret; -} - -static int activate(AVFilterContext *ctx) -{ - AVFilterLink *inlink = ctx->inputs[0]; - AVFilterLink *outlink = ctx->outputs[0]; - LoudNormContext *s = ctx->priv; - AVFrame *in = NULL; - int ret = 0, status; - int64_t pts; - - FF_FILTER_FORWARD_STATUS_BACK(outlink, inlink); - - if (s->frame_type != LINEAR_MODE) { - int nb_samples; - - if (s->frame_type == FIRST_FRAME) { - nb_samples = frame_size(inlink->sample_rate, 3000); - } else { - nb_samples = frame_size(inlink->sample_rate, 100); - } - - ret = ff_inlink_consume_samples(inlink, nb_samples, nb_samples, &in); - } else { - ret = ff_inlink_consume_frame(inlink, &in); - } - - if (ret < 0) - return ret; - if (ret > 0) { - if (s->frame_type == FIRST_FRAME) { - const int nb_samples = frame_size(inlink->sample_rate, 100); - - for (int i = 0; i < FF_ARRAY_ELEMS(s->pts); i++) - s->pts[i] = in->pts + i * nb_samples; - } else if (s->frame_type == LINEAR_MODE) { - s->pts[0] = in->pts; - } else { - s->pts[FF_ARRAY_ELEMS(s->pts) - 1] = in->pts; - } - ret = filter_frame(inlink, in); - } - if (ret < 0) - return ret; - - if (ff_inlink_acknowledge_status(inlink, &status, &pts)) { - ff_outlink_set_status(outlink, status, pts); - return flush_frame(outlink); - } - - FF_FILTER_FORWARD_WANTED(outlink, inlink); - - return FFERROR_NOT_READY; -} - -static int query_formats(AVFilterContext *ctx) -{ - LoudNormContext *s = ctx->priv; - static const int input_srate[] = {192000, -1}; - static const enum AVSampleFormat sample_fmts[] = { - AV_SAMPLE_FMT_DBL, - AV_SAMPLE_FMT_NONE - }; - int ret = 
ff_set_common_all_channel_counts(ctx); - if (ret < 0) - return ret; - - ret = ff_set_common_formats_from_list(ctx, sample_fmts); - if (ret < 0) - return ret; - - if (s->frame_type == LINEAR_MODE) { - return ff_set_common_all_samplerates(ctx); - } else { - return ff_set_common_samplerates_from_list(ctx, input_srate); - } -} - -static int config_input(AVFilterLink *inlink) -{ - AVFilterContext *ctx = inlink->dst; - LoudNormContext *s = ctx->priv; - - s->r128_in = ff_ebur128_init(inlink->ch_layout.nb_channels, inlink->sample_rate, 0, FF_EBUR128_MODE_I | FF_EBUR128_MODE_S | FF_EBUR128_MODE_LRA | FF_EBUR128_MODE_SAMPLE_PEAK); - if (!s->r128_in) - return AVERROR(ENOMEM); - - s->r128_out = ff_ebur128_init(inlink->ch_layout.nb_channels, inlink->sample_rate, 0, FF_EBUR128_MODE_I | FF_EBUR128_MODE_S | FF_EBUR128_MODE_LRA | FF_EBUR128_MODE_SAMPLE_PEAK); - if (!s->r128_out) - return AVERROR(ENOMEM); - - if (inlink->ch_layout.nb_channels == 1 && s->dual_mono) { - ff_ebur128_set_channel(s->r128_in, 0, FF_EBUR128_DUAL_MONO); - ff_ebur128_set_channel(s->r128_out, 0, FF_EBUR128_DUAL_MONO); - } - - s->buf_size = frame_size(inlink->sample_rate, 3000) * inlink->ch_layout.nb_channels; - s->buf = av_malloc_array(s->buf_size, sizeof(*s->buf)); - if (!s->buf) - return AVERROR(ENOMEM); - - s->limiter_buf_size = frame_size(inlink->sample_rate, 210) * inlink->ch_layout.nb_channels; - s->limiter_buf = av_malloc_array(s->buf_size, sizeof(*s->limiter_buf)); - if (!s->limiter_buf) - return AVERROR(ENOMEM); - - s->prev_smp = av_malloc_array(inlink->ch_layout.nb_channels, sizeof(*s->prev_smp)); - if (!s->prev_smp) - return AVERROR(ENOMEM); - - init_gaussian_filter(s); - - s->buf_index = - s->prev_buf_index = - s->limiter_buf_index = 0; - s->channels = inlink->ch_layout.nb_channels; - s->index = 1; - s->limiter_state = OUT; - s->offset = pow(10., s->offset / 20.); - s->target_tp = pow(10., s->target_tp / 20.); - s->attack_length = frame_size(inlink->sample_rate, 10); - s->release_length = 
frame_size(inlink->sample_rate, 100); - - return 0; -} - -static av_cold int init(AVFilterContext *ctx) -{ - LoudNormContext *s = ctx->priv; - s->frame_type = FIRST_FRAME; - - if (s->linear) { - double offset, offset_tp; - offset = s->target_i - s->measured_i; - offset_tp = s->measured_tp + offset; - - if (s->measured_tp != 99 && s->measured_thresh != -70 && s->measured_lra != 0 && s->measured_i != 0) { - if ((offset_tp <= s->target_tp) && (s->measured_lra <= s->target_lra)) { - s->frame_type = LINEAR_MODE; - s->offset = offset; - } - } - } - - return 0; -} - -static av_cold void uninit(AVFilterContext *ctx) -{ - LoudNormContext *s = ctx->priv; - double i_in, i_out, lra_in, lra_out, thresh_in, thresh_out, tp_in, tp_out; - int c; - - if (!s->r128_in || !s->r128_out) - goto end; - - ff_ebur128_loudness_range(s->r128_in, &lra_in); - ff_ebur128_loudness_global(s->r128_in, &i_in); - ff_ebur128_relative_threshold(s->r128_in, &thresh_in); - for (c = 0; c < s->channels; c++) { - double tmp; - ff_ebur128_sample_peak(s->r128_in, c, &tmp); - if ((c == 0) || (tmp > tp_in)) - tp_in = tmp; - } - - ff_ebur128_loudness_range(s->r128_out, &lra_out); - ff_ebur128_loudness_global(s->r128_out, &i_out); - ff_ebur128_relative_threshold(s->r128_out, &thresh_out); - for (c = 0; c < s->channels; c++) { - double tmp; - ff_ebur128_sample_peak(s->r128_out, c, &tmp); - if ((c == 0) || (tmp > tp_out)) - tp_out = tmp; - } - - switch(s->print_format) { - case NONE: - break; - - case JSON: - av_log(ctx, AV_LOG_INFO, - "\n{\n" - "\t\"input_i\" : \"%.2f\",\n" - "\t\"input_tp\" : \"%.2f\",\n" - "\t\"input_lra\" : \"%.2f\",\n" - "\t\"input_thresh\" : \"%.2f\",\n" - "\t\"output_i\" : \"%.2f\",\n" - "\t\"output_tp\" : \"%+.2f\",\n" - "\t\"output_lra\" : \"%.2f\",\n" - "\t\"output_thresh\" : \"%.2f\",\n" - "\t\"normalization_type\" : \"%s\",\n" - "\t\"target_offset\" : \"%.2f\"\n" - "}\n", - i_in, - 20. * log10(tp_in), - lra_in, - thresh_in, - i_out, - 20. 
* log10(tp_out), - lra_out, - thresh_out, - s->frame_type == LINEAR_MODE ? "linear" : "dynamic", - s->target_i - i_out - ); - break; - - case SUMMARY: - av_log(ctx, AV_LOG_INFO, - "\n" - "Input Integrated: %+6.1f LUFS\n" - "Input True Peak: %+6.1f dBTP\n" - "Input LRA: %6.1f LU\n" - "Input Threshold: %+6.1f LUFS\n" - "\n" - "Output Integrated: %+6.1f LUFS\n" - "Output True Peak: %+6.1f dBTP\n" - "Output LRA: %6.1f LU\n" - "Output Threshold: %+6.1f LUFS\n" - "\n" - "Normalization Type: %s\n" - "Target Offset: %+6.1f LU\n", - i_in, - 20. * log10(tp_in), - lra_in, - thresh_in, - i_out, - 20. * log10(tp_out), - lra_out, - thresh_out, - s->frame_type == LINEAR_MODE ? "Linear" : "Dynamic", - s->target_i - i_out - ); - break; - } - -end: - if (s->r128_in) - ff_ebur128_destroy(&s->r128_in); - if (s->r128_out) - ff_ebur128_destroy(&s->r128_out); - av_freep(&s->limiter_buf); - av_freep(&s->prev_smp); - av_freep(&s->buf); -} - -static const AVFilterPad avfilter_af_loudnorm_inputs[] = { - { - .name = "default", - .type = AVMEDIA_TYPE_AUDIO, - .config_props = config_input, - }, -}; - -const AVFilter ff_af_loudnorm = { - .name = "loudnorm", - .description = NULL_IF_CONFIG_SMALL("EBU R128 loudness normalization"), - .priv_size = sizeof(LoudNormContext), - .priv_class = &loudnorm_class, - .init = init, - .activate = activate, - .uninit = uninit, - FILTER_INPUTS(avfilter_af_loudnorm_inputs), - FILTER_OUTPUTS(ff_audio_default_filterpad), - FILTER_QUERY_FUNC(query_formats), -}; diff --git a/libavfilter/ebur128.c b/libavfilter/ebur128.c deleted file mode 100644 index 062099e206..0000000000 --- a/libavfilter/ebur128.c +++ /dev/null @@ -1,725 +0,0 @@ -/* - * Copyright (c) 2011 Jan Kokemüller - * - * This file is part of FFmpeg. - * - * FFmpeg is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public - * License as published by the Free Software Foundation; either - * version 2.1 of the License, or (at your option) any later version. 
- * - * FFmpeg is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with FFmpeg; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA - * - * This file is based on libebur128 which is available at - * https://github.com/jiixyj/libebur128/ - * - * Libebur128 has the following copyright: - * - * Permission is hereby granted, free of charge, to any person obtaining a copy - * of this software and associated documentation files (the "Software"), to deal - * in the Software without restriction, including without limitation the rights - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell - * copies of the Software, and to permit persons to whom the Software is - * furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN - * THE SOFTWARE. 
-*/
-
-#include "ebur128.h"
-
-#include <float.h>
-#include <limits.h>
-#include <math.h> /* You may have to define _USE_MATH_DEFINES if you use MSVC */
-
-#include "libavutil/error.h"
-#include "libavutil/macros.h"
-#include "libavutil/mem.h"
-#include "libavutil/mem_internal.h"
-#include "libavutil/thread.h"
-
-#define CHECK_ERROR(condition, errorcode, goto_point) \
-    if ((condition)) { \
-        errcode = (errorcode); \
-        goto goto_point; \
-    }
-
-#define ALMOST_ZERO 0.000001
-
-#define RELATIVE_GATE (-10.0)
-#define RELATIVE_GATE_FACTOR pow(10.0, RELATIVE_GATE / 10.0)
-#define MINUS_20DB pow(10.0, -20.0 / 10.0)
-
-struct FFEBUR128StateInternal {
-    /** Filtered audio data (used as ring buffer). */
-    double *audio_data;
-    /** Size of audio_data array. */
-    size_t audio_data_frames;
-    /** Current index for audio_data. */
-    size_t audio_data_index;
-    /** How many frames are needed for a gating block. Will correspond to 400ms
-     * of audio at initialization, and 100ms after the first block (75% overlap
-     * as specified in the 2011 revision of BS1770). */
-    unsigned long needed_frames;
-    /** The channel map. Has as many elements as there are channels. */
-    int *channel_map;
-    /** How many samples fit in 100ms (rounded). */
-    unsigned long samples_in_100ms;
-    /** BS.1770 filter coefficients (nominator). */
-    double b[5];
-    /** BS.1770 filter coefficients (denominator). */
-    double a[5];
-    /** BS.1770 filter state. */
-    double v[5][5];
-    /** Histograms, used to calculate LRA. */
-    unsigned long *block_energy_histogram;
-    unsigned long *short_term_block_energy_histogram;
-    /** Keeps track of when a new short term block is needed. */
-    size_t short_term_frame_counter;
-    /** Maximum sample peak, one per channel */
-    double *sample_peak;
-    /** The maximum window duration in ms. */
-    unsigned long window;
-    /** Data pointer array for interleaved data */
-    void **data_ptrs;
-};
-
-static AVOnce histogram_init = AV_ONCE_INIT;
-static DECLARE_ALIGNED(32, double, histogram_energies)[1000];
-static DECLARE_ALIGNED(32, double, histogram_energy_boundaries)[1001];
-
-static void ebur128_init_filter(FFEBUR128State * st)
-{
-    int i, j;
-
-    double f0 = 1681.974450955533;
-    double G = 3.999843853973347;
-    double Q = 0.7071752369554196;
-
-    double K = tan(M_PI * f0 / (double) st->samplerate);
-    double Vh = pow(10.0, G / 20.0);
-    double Vb = pow(Vh, 0.4996667741545416);
-
-    double pb[3] = { 0.0, 0.0, 0.0 };
-    double pa[3] = { 1.0, 0.0, 0.0 };
-    double rb[3] = { 1.0, -2.0, 1.0 };
-    double ra[3] = { 1.0, 0.0, 0.0 };
-
-    double a0 = 1.0 + K / Q + K * K;
-    pb[0] = (Vh + Vb * K / Q + K * K) / a0;
-    pb[1] = 2.0 * (K * K - Vh) / a0;
-    pb[2] = (Vh - Vb * K / Q + K * K) / a0;
-    pa[1] = 2.0 * (K * K - 1.0) / a0;
-    pa[2] = (1.0 - K / Q + K * K) / a0;
-
-    f0 = 38.13547087602444;
-    Q = 0.5003270373238773;
-    K = tan(M_PI * f0 / (double) st->samplerate);
-
-    ra[1] = 2.0 * (K * K - 1.0) / (1.0 + K / Q + K * K);
-    ra[2] = (1.0 - K / Q + K * K) / (1.0 + K / Q + K * K);
-
-    st->d->b[0] = pb[0] * rb[0];
-    st->d->b[1] = pb[0] * rb[1] + pb[1] * rb[0];
-    st->d->b[2] = pb[0] * rb[2] + pb[1] * rb[1] + pb[2] * rb[0];
-    st->d->b[3] = pb[1] * rb[2] + pb[2] * rb[1];
-    st->d->b[4] = pb[2] * rb[2];
-
-    st->d->a[0] = pa[0] * ra[0];
-    st->d->a[1] = pa[0] * ra[1] + pa[1] * ra[0];
-    st->d->a[2] = pa[0] * ra[2] + pa[1] * ra[1] + pa[2] * ra[0];
-    st->d->a[3] = pa[1] * ra[2] + pa[2] * ra[1];
-    st->d->a[4] = pa[2] * ra[2];
-
-    for (i = 0; i < 5; ++i) {
-        for (j = 0; j < 5; ++j) {
-            st->d->v[i][j] = 0.0;
-        }
-    }
-}
-
-static int ebur128_init_channel_map(FFEBUR128State * st)
-{
-    size_t i;
-    st->d->channel_map =
-        (int *) av_malloc_array(st->channels, sizeof(*st->d->channel_map));
-    if (!st->d->channel_map)
-        return AVERROR(ENOMEM);
-    if (st->channels == 4) {
-        st->d->channel_map[0] = FF_EBUR128_LEFT;
-        st->d->channel_map[1] = FF_EBUR128_RIGHT;
-        st->d->channel_map[2] = FF_EBUR128_LEFT_SURROUND;
-        st->d->channel_map[3] = FF_EBUR128_RIGHT_SURROUND;
-    } else if (st->channels == 5) {
-        st->d->channel_map[0] = FF_EBUR128_LEFT;
-        st->d->channel_map[1] = FF_EBUR128_RIGHT;
-        st->d->channel_map[2] = FF_EBUR128_CENTER;
-        st->d->channel_map[3] = FF_EBUR128_LEFT_SURROUND;
-        st->d->channel_map[4] = FF_EBUR128_RIGHT_SURROUND;
-    } else {
-        for (i = 0; i < st->channels; ++i) {
-            switch (i) {
-            case 0:
-                st->d->channel_map[i] = FF_EBUR128_LEFT;
-                break;
-            case 1:
-                st->d->channel_map[i] = FF_EBUR128_RIGHT;
-                break;
-            case 2:
-                st->d->channel_map[i] = FF_EBUR128_CENTER;
-                break;
-            case 3:
-                st->d->channel_map[i] = FF_EBUR128_UNUSED;
-                break;
-            case 4:
-                st->d->channel_map[i] = FF_EBUR128_LEFT_SURROUND;
-                break;
-            case 5:
-                st->d->channel_map[i] = FF_EBUR128_RIGHT_SURROUND;
-                break;
-            default:
-                st->d->channel_map[i] = FF_EBUR128_UNUSED;
-                break;
-            }
-        }
-    }
-    return 0;
-}
-
-static inline void init_histogram(void)
-{
-    int i;
-    /* initialize static constants */
-    histogram_energy_boundaries[0] = pow(10.0, (-70.0 + 0.691) / 10.0);
-    for (i = 0; i < 1000; ++i) {
-        histogram_energies[i] =
-            pow(10.0, ((double) i / 10.0 - 69.95 + 0.691) / 10.0);
-    }
-    for (i = 1; i < 1001; ++i) {
-        histogram_energy_boundaries[i] =
-            pow(10.0, ((double) i / 10.0 - 70.0 + 0.691) / 10.0);
-    }
-}
-
-FFEBUR128State *ff_ebur128_init(unsigned int channels,
-                                unsigned long samplerate,
-                                unsigned long window, int mode)
-{
-    int errcode;
-    FFEBUR128State *st;
-
-    st = (FFEBUR128State *) av_malloc(sizeof(*st));
-    CHECK_ERROR(!st, 0, exit)
-    st->d = (struct FFEBUR128StateInternal *)
-        av_malloc(sizeof(*st->d));
-    CHECK_ERROR(!st->d, 0, free_state)
-    st->channels = channels;
-    errcode = ebur128_init_channel_map(st);
-    CHECK_ERROR(errcode, 0, free_internal)
-
-    st->d->sample_peak =
-        (double *) av_calloc(channels, sizeof(*st->d->sample_peak));
-    CHECK_ERROR(!st->d->sample_peak, 0, free_channel_map)
-
-    st->samplerate = samplerate;
-    st->d->samples_in_100ms = (st->samplerate + 5) / 10;
-    st->mode = mode;
-    if ((mode & FF_EBUR128_MODE_S) == FF_EBUR128_MODE_S) {
-        st->d->window = FFMAX(window, 3000);
-    } else if ((mode & FF_EBUR128_MODE_M) == FF_EBUR128_MODE_M) {
-        st->d->window = FFMAX(window, 400);
-    } else {
-        goto free_sample_peak;
-    }
-    st->d->audio_data_frames = st->samplerate * st->d->window / 1000;
-    if (st->d->audio_data_frames % st->d->samples_in_100ms) {
-        /* round up to multiple of samples_in_100ms */
-        st->d->audio_data_frames = st->d->audio_data_frames
-            + st->d->samples_in_100ms
-            - (st->d->audio_data_frames % st->d->samples_in_100ms);
-    }
-    st->d->audio_data =
-        (double *) av_calloc(st->d->audio_data_frames,
-                             st->channels * sizeof(*st->d->audio_data));
-    CHECK_ERROR(!st->d->audio_data, 0, free_sample_peak)
-
-    ebur128_init_filter(st);
-
-    st->d->block_energy_histogram =
-        av_mallocz(1000 * sizeof(*st->d->block_energy_histogram));
-    CHECK_ERROR(!st->d->block_energy_histogram, 0, free_audio_data)
-    st->d->short_term_block_energy_histogram =
-        av_mallocz(1000 * sizeof(*st->d->short_term_block_energy_histogram));
-    CHECK_ERROR(!st->d->short_term_block_energy_histogram, 0,
-                free_block_energy_histogram)
-    st->d->short_term_frame_counter = 0;
-
-    /* the first block needs 400ms of audio data */
-    st->d->needed_frames = st->d->samples_in_100ms * 4;
-    /* start at the beginning of the buffer */
-    st->d->audio_data_index = 0;
-
-    if (ff_thread_once(&histogram_init, &init_histogram) != 0)
-        goto free_short_term_block_energy_histogram;
-
-    st->d->data_ptrs = av_malloc_array(channels, sizeof(*st->d->data_ptrs));
-    CHECK_ERROR(!st->d->data_ptrs, 0,
-                free_short_term_block_energy_histogram);
-
-    return st;
-
-free_short_term_block_energy_histogram:
-    av_free(st->d->short_term_block_energy_histogram);
-free_block_energy_histogram:
-    av_free(st->d->block_energy_histogram);
-free_audio_data:
-    av_free(st->d->audio_data);
-free_sample_peak:
-    av_free(st->d->sample_peak);
-free_channel_map:
-    av_free(st->d->channel_map);
-free_internal:
-    av_free(st->d);
-free_state:
-    av_free(st);
-exit:
-    return NULL;
-}
-
-void ff_ebur128_destroy(FFEBUR128State ** st)
-{
-    av_free((*st)->d->block_energy_histogram);
-    av_free((*st)->d->short_term_block_energy_histogram);
-    av_free((*st)->d->audio_data);
-    av_free((*st)->d->channel_map);
-    av_free((*st)->d->sample_peak);
-    av_free((*st)->d->data_ptrs);
-    av_free((*st)->d);
-    av_free(*st);
-    *st = NULL;
-}
-
-#define EBUR128_FILTER(type, scaling_factor) \
-static void ebur128_filter_##type(FFEBUR128State* st, const type** srcs, \
-                                  size_t src_index, size_t frames, \
-                                  int stride) { \
-    double* audio_data = st->d->audio_data + st->d->audio_data_index; \
-    size_t i, c; \
- \
-    if ((st->mode & FF_EBUR128_MODE_SAMPLE_PEAK) == FF_EBUR128_MODE_SAMPLE_PEAK) { \
-        for (c = 0; c < st->channels; ++c) { \
-            double max = 0.0; \
-            for (i = 0; i < frames; ++i) { \
-                type v = srcs[c][src_index + i * stride]; \
-                if (v > max) { \
-                    max = v; \
-                } else if (-v > max) { \
-                    max = -1.0 * v; \
-                } \
-            } \
-            max /= scaling_factor; \
-            if (max > st->d->sample_peak[c]) st->d->sample_peak[c] = max; \
-        } \
-    } \
-    for (c = 0; c < st->channels; ++c) { \
-        int ci = st->d->channel_map[c] - 1; \
-        if (ci < 0) continue; \
-        else if (ci == FF_EBUR128_DUAL_MONO - 1) ci = 0; /*dual mono */ \
-        for (i = 0; i < frames; ++i) { \
-            st->d->v[ci][0] = (double) (srcs[c][src_index + i * stride] / scaling_factor) \
-                         - st->d->a[1] * st->d->v[ci][1] \
-                         - st->d->a[2] * st->d->v[ci][2] \
-                         - st->d->a[3] * st->d->v[ci][3] \
-                         - st->d->a[4] * st->d->v[ci][4]; \
-            audio_data[i * st->channels + c] = \
-                           st->d->b[0] * st->d->v[ci][0] \
-                         + st->d->b[1] * st->d->v[ci][1] \
-                         + st->d->b[2] * st->d->v[ci][2] \
-                         + st->d->b[3] * st->d->v[ci][3] \
-                         + st->d->b[4] * st->d->v[ci][4]; \
-            st->d->v[ci][4] = st->d->v[ci][3]; \
-            st->d->v[ci][3] = st->d->v[ci][2]; \
-            st->d->v[ci][2] = st->d->v[ci][1]; \
-            st->d->v[ci][1] = st->d->v[ci][0]; \
-        } \
-        st->d->v[ci][4] = fabs(st->d->v[ci][4]) < DBL_MIN ? 0.0 : st->d->v[ci][4]; \
-        st->d->v[ci][3] = fabs(st->d->v[ci][3]) < DBL_MIN ? 0.0 : st->d->v[ci][3]; \
-        st->d->v[ci][2] = fabs(st->d->v[ci][2]) < DBL_MIN ? 0.0 : st->d->v[ci][2]; \
-        st->d->v[ci][1] = fabs(st->d->v[ci][1]) < DBL_MIN ? 0.0 : st->d->v[ci][1]; \
-    } \
-}
-EBUR128_FILTER(double, 1.0)
-
-static double ebur128_energy_to_loudness(double energy)
-{
-    return 10 * log10(energy) - 0.691;
-}
-
-static size_t find_histogram_index(double energy)
-{
-    size_t index_min = 0;
-    size_t index_max = 1000;
-    size_t index_mid;
-
-    do {
-        index_mid = (index_min + index_max) / 2;
-        if (energy >= histogram_energy_boundaries[index_mid]) {
-            index_min = index_mid;
-        } else {
-            index_max = index_mid;
-        }
-    } while (index_max - index_min != 1);
-
-    return index_min;
-}
-
-static void ebur128_calc_gating_block(FFEBUR128State * st,
-                                      size_t frames_per_block,
-                                      double *optional_output)
-{
-    size_t i, c;
-    double sum = 0.0;
-    double channel_sum;
-    for (c = 0; c < st->channels; ++c) {
-        if (st->d->channel_map[c] == FF_EBUR128_UNUSED)
-            continue;
-        channel_sum = 0.0;
-        if (st->d->audio_data_index < frames_per_block * st->channels) {
-            for (i = 0; i < st->d->audio_data_index / st->channels; ++i) {
-                channel_sum += st->d->audio_data[i * st->channels + c] *
-                    st->d->audio_data[i * st->channels + c];
-            }
-            for (i = st->d->audio_data_frames -
-                 (frames_per_block -
-                  st->d->audio_data_index / st->channels);
-                 i < st->d->audio_data_frames; ++i) {
-                channel_sum += st->d->audio_data[i * st->channels + c] *
-                    st->d->audio_data[i * st->channels + c];
-            }
-        } else {
-            for (i =
-                 st->d->audio_data_index / st->channels - frames_per_block;
-                 i < st->d->audio_data_index / st->channels; ++i) {
-                channel_sum +=
-                    st->d->audio_data[i * st->channels +
-                                      c] * st->d->audio_data[i *
-                                                             st->channels +
-                                                             c];
-            }
-        }
-        if (st->d->channel_map[c] == FF_EBUR128_Mp110 ||
-            st->d->channel_map[c] == FF_EBUR128_Mm110 ||
-            st->d->channel_map[c] == FF_EBUR128_Mp060 ||
-            st->d->channel_map[c] == FF_EBUR128_Mm060 ||
-            st->d->channel_map[c] == FF_EBUR128_Mp090 ||
-            st->d->channel_map[c] == FF_EBUR128_Mm090) {
-            channel_sum *= 1.41;
-        } else if (st->d->channel_map[c] == FF_EBUR128_DUAL_MONO) {
-            channel_sum *= 2.0;
-        }
-        sum += channel_sum;
-    }
-    sum /= (double) frames_per_block;
-    if (optional_output) {
-        *optional_output = sum;
-    } else if (sum >= histogram_energy_boundaries[0]) {
-        ++st->d->block_energy_histogram[find_histogram_index(sum)];
-    }
-}
-
-int ff_ebur128_set_channel(FFEBUR128State * st,
-                           unsigned int channel_number, int value)
-{
-    if (channel_number >= st->channels) {
-        return 1;
-    }
-    if (value == FF_EBUR128_DUAL_MONO &&
-        (st->channels != 1 || channel_number != 0)) {
-        return 1;
-    }
-    st->d->channel_map[channel_number] = value;
-    return 0;
-}
-
-static int ebur128_energy_shortterm(FFEBUR128State * st, double *out);
-#define EBUR128_ADD_FRAMES_PLANAR(type) \
-static void ebur128_add_frames_planar_##type(FFEBUR128State* st, const type** srcs, \
-                                             size_t frames, int stride) { \
-    size_t src_index = 0; \
-    while (frames > 0) { \
-        if (frames >= st->d->needed_frames) { \
-            ebur128_filter_##type(st, srcs, src_index, st->d->needed_frames, stride); \
-            src_index += st->d->needed_frames * stride; \
-            frames -= st->d->needed_frames; \
-            st->d->audio_data_index += st->d->needed_frames * st->channels; \
-            /* calculate the new gating block */ \
-            if ((st->mode & FF_EBUR128_MODE_I) == FF_EBUR128_MODE_I) { \
-                ebur128_calc_gating_block(st, st->d->samples_in_100ms * 4, NULL); \
-            } \
-            if ((st->mode & FF_EBUR128_MODE_LRA) == FF_EBUR128_MODE_LRA) { \
-                st->d->short_term_frame_counter += st->d->needed_frames; \
-                if (st->d->short_term_frame_counter == st->d->samples_in_100ms * 30) { \
-                    double st_energy; \
-                    ebur128_energy_shortterm(st, &st_energy); \
-                    if (st_energy >= histogram_energy_boundaries[0]) { \
-                        ++st->d->short_term_block_energy_histogram[ \
-                            find_histogram_index(st_energy)]; \
-                    } \
-                    st->d->short_term_frame_counter = st->d->samples_in_100ms * 20; \
-                } \
-            } \
-            /* 100ms are needed for all blocks besides the first one */ \
-            st->d->needed_frames = st->d->samples_in_100ms; \
-            /* reset audio_data_index when buffer full */ \
-            if (st->d->audio_data_index == st->d->audio_data_frames * st->channels) { \
-                st->d->audio_data_index = 0; \
-            } \
-        } else { \
-            ebur128_filter_##type(st, srcs, src_index, frames, stride); \
-            st->d->audio_data_index += frames * st->channels; \
-            if ((st->mode & FF_EBUR128_MODE_LRA) == FF_EBUR128_MODE_LRA) { \
-                st->d->short_term_frame_counter += frames; \
-            } \
-            st->d->needed_frames -= frames; \
-            frames = 0; \
-        } \
-    } \
-}
-EBUR128_ADD_FRAMES_PLANAR(double)
-#define FF_EBUR128_ADD_FRAMES(type) \
-void ff_ebur128_add_frames_##type(FFEBUR128State* st, const type* src, \
-                                  size_t frames) { \
-    int i; \
-    const type **buf = (const type**)st->d->data_ptrs; \
-    for (i = 0; i < st->channels; i++) \
-        buf[i] = src + i; \
-    ebur128_add_frames_planar_##type(st, buf, frames, st->channels); \
-}
-FF_EBUR128_ADD_FRAMES(double)
-
-static int ebur128_calc_relative_threshold(FFEBUR128State **sts, size_t size,
-                                           double *relative_threshold)
-{
-    size_t i, j;
-    int above_thresh_counter = 0;
-    *relative_threshold = 0.0;
-
-    for (i = 0; i < size; i++) {
-        unsigned long *block_energy_histogram = sts[i]->d->block_energy_histogram;
-        for (j = 0; j < 1000; ++j) {
-            *relative_threshold += block_energy_histogram[j] * histogram_energies[j];
-            above_thresh_counter += block_energy_histogram[j];
-        }
-    }
-
-    if (above_thresh_counter != 0) {
-        *relative_threshold /= (double)above_thresh_counter;
-        *relative_threshold *= RELATIVE_GATE_FACTOR;
-    }
-
-    return above_thresh_counter;
-}
-
-static int ebur128_gated_loudness(FFEBUR128State ** sts, size_t size,
-                                  double *out)
-{
-    double gated_loudness = 0.0;
-    double relative_threshold;
-    size_t above_thresh_counter;
-    size_t i, j, start_index;
-
-    for (i = 0; i < size; i++)
-        if ((sts[i]->mode & FF_EBUR128_MODE_I) != FF_EBUR128_MODE_I)
-            return AVERROR(EINVAL);
-
-    if (!ebur128_calc_relative_threshold(sts, size, &relative_threshold)) {
-        *out = -HUGE_VAL;
-        return 0;
-    }
-
-    above_thresh_counter = 0;
-    if (relative_threshold < histogram_energy_boundaries[0]) {
-        start_index = 0;
-    } else {
-        start_index = find_histogram_index(relative_threshold);
-        if (relative_threshold > histogram_energies[start_index]) {
-            ++start_index;
-        }
-    }
-    for (i = 0; i < size; i++) {
-        for (j = start_index; j < 1000; ++j) {
-            gated_loudness += sts[i]->d->block_energy_histogram[j] *
-                histogram_energies[j];
-            above_thresh_counter += sts[i]->d->block_energy_histogram[j];
-        }
-    }
-    if (!above_thresh_counter) {
-        *out = -HUGE_VAL;
-        return 0;
-    }
-    gated_loudness /= (double) above_thresh_counter;
-    *out = ebur128_energy_to_loudness(gated_loudness);
-    return 0;
-}
-
-int ff_ebur128_relative_threshold(FFEBUR128State * st, double *out)
-{
-    double relative_threshold;
-
-    if ((st->mode & FF_EBUR128_MODE_I) != FF_EBUR128_MODE_I)
-        return AVERROR(EINVAL);
-
-    if (!ebur128_calc_relative_threshold(&st, 1, &relative_threshold)) {
-        *out = -70.0;
-        return 0;
-    }
-
-    *out = ebur128_energy_to_loudness(relative_threshold);
-    return 0;
-}
-
-int ff_ebur128_loudness_global(FFEBUR128State * st, double *out)
-{
-    return ebur128_gated_loudness(&st, 1, out);
-}
-
-static int ebur128_energy_in_interval(FFEBUR128State * st,
-                                      size_t interval_frames, double *out)
-{
-    if (interval_frames > st->d->audio_data_frames) {
-        return AVERROR(EINVAL);
-    }
-    ebur128_calc_gating_block(st, interval_frames, out);
-    return 0;
-}
-
-static int ebur128_energy_shortterm(FFEBUR128State * st, double *out)
-{
-    return ebur128_energy_in_interval(st, st->d->samples_in_100ms * 30,
-                                      out);
-}
-
-int ff_ebur128_loudness_shortterm(FFEBUR128State * st, double *out)
-{
-    double energy;
-    int error = ebur128_energy_shortterm(st, &energy);
-    if (error) {
-        return error;
-    } else if (energy <= 0.0) {
-        *out = -HUGE_VAL;
-        return 0;
-    }
-    *out = ebur128_energy_to_loudness(energy);
-    return 0;
-}
-
-/* EBU - TECH 3342 */
-int ff_ebur128_loudness_range_multiple(FFEBUR128State ** sts, size_t size,
-                                       double *out)
-{
-    size_t i, j;
-    size_t stl_size;
-    double stl_power, stl_integrated;
-    /* High and low percentile energy */
-    double h_en, l_en;
-    unsigned long hist[1000] = { 0 };
-    size_t percentile_low, percentile_high;
-    size_t index;
-
-    for (i = 0; i < size; ++i) {
-        if (sts[i]) {
-            if ((sts[i]->mode & FF_EBUR128_MODE_LRA) !=
-                FF_EBUR128_MODE_LRA) {
-                return AVERROR(EINVAL);
-            }
-        }
-    }
-
-    stl_size = 0;
-    stl_power = 0.0;
-    for (i = 0; i < size; ++i) {
-        if (!sts[i])
-            continue;
-        for (j = 0; j < 1000; ++j) {
-            hist[j] += sts[i]->d->short_term_block_energy_histogram[j];
-            stl_size += sts[i]->d->short_term_block_energy_histogram[j];
-            stl_power += sts[i]->d->short_term_block_energy_histogram[j]
-                * histogram_energies[j];
-        }
-    }
-    if (!stl_size) {
-        *out = 0.0;
-        return 0;
-    }
-
-    stl_power /= stl_size;
-    stl_integrated = MINUS_20DB * stl_power;
-
-    if (stl_integrated < histogram_energy_boundaries[0]) {
-        index = 0;
-    } else {
-        index = find_histogram_index(stl_integrated);
-        if (stl_integrated > histogram_energies[index]) {
-            ++index;
-        }
-    }
-    stl_size = 0;
-    for (j = index; j < 1000; ++j) {
-        stl_size += hist[j];
-    }
-    if (!stl_size) {
-        *out = 0.0;
-        return 0;
-    }
-
-    percentile_low = (size_t) ((stl_size - 1) * 0.1 + 0.5);
-    percentile_high = (size_t) ((stl_size - 1) * 0.95 + 0.5);
-
-    stl_size = 0;
-    j = index;
-    while (stl_size <= percentile_low) {
-        stl_size += hist[j++];
-    }
-    l_en = histogram_energies[j - 1];
-    while (stl_size <= percentile_high) {
-        stl_size += hist[j++];
-    }
-    h_en = histogram_energies[j - 1];
-    *out =
-        ebur128_energy_to_loudness(h_en) -
-        ebur128_energy_to_loudness(l_en);
-    return 0;
-}
-
-int ff_ebur128_loudness_range(FFEBUR128State * st, double *out)
-{
-    return ff_ebur128_loudness_range_multiple(&st, 1, out);
-}
-
-int ff_ebur128_sample_peak(FFEBUR128State * st,
-                           unsigned int channel_number, double *out)
-{
-    if ((st->mode & FF_EBUR128_MODE_SAMPLE_PEAK) !=
-        FF_EBUR128_MODE_SAMPLE_PEAK) {
-        return AVERROR(EINVAL);
-    } else if (channel_number >= st->channels) {
-        return AVERROR(EINVAL);
-    }
-    *out = st->d->sample_peak[channel_number];
-    return 0;
-}
diff --git a/libavfilter/ebur128.h b/libavfilter/ebur128.h
deleted file mode 100644
index 8e7385e044..0000000000
--- a/libavfilter/ebur128.h
+++ /dev/null
@@ -1,229 +0,0 @@
-/*
- * Copyright (c) 2011 Jan Kokemüller
- *
- * This file is part of FFmpeg.
- *
- * FFmpeg is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * FFmpeg is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with FFmpeg; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
- *
- * This file is based on libebur128 which is available at
- * https://github.com/jiixyj/libebur128/
- *
-*/
-
-#ifndef AVFILTER_EBUR128_H
-#define AVFILTER_EBUR128_H
-
-/** \file ebur128.h
- * \brief libebur128 - a library for loudness measurement according to
- * the EBU R128 standard.
- */
-
-#include <stddef.h> /* for size_t */
-
-/** \enum channel
- * Use these values when setting the channel map with ebur128_set_channel().
- * See definitions in ITU R-REC-BS 1770-4
- */
-enum channel {
-    FF_EBUR128_UNUSED = 0, /**< unused channel (for example LFE channel) */
-    FF_EBUR128_LEFT,
-    FF_EBUR128_Mp030 = 1, /**< itu M+030 */
-    FF_EBUR128_RIGHT,
-    FF_EBUR128_Mm030 = 2, /**< itu M-030 */
-    FF_EBUR128_CENTER,
-    FF_EBUR128_Mp000 = 3, /**< itu M+000 */
-    FF_EBUR128_LEFT_SURROUND,
-    FF_EBUR128_Mp110 = 4, /**< itu M+110 */
-    FF_EBUR128_RIGHT_SURROUND,
-    FF_EBUR128_Mm110 = 5, /**< itu M-110 */
-    FF_EBUR128_DUAL_MONO, /**< a channel that is counted twice */
-    FF_EBUR128_MpSC, /**< itu M+SC */
-    FF_EBUR128_MmSC, /**< itu M-SC */
-    FF_EBUR128_Mp060, /**< itu M+060 */
-    FF_EBUR128_Mm060, /**< itu M-060 */
-    FF_EBUR128_Mp090, /**< itu M+090 */
-    FF_EBUR128_Mm090, /**< itu M-090 */
-    FF_EBUR128_Mp135, /**< itu M+135 */
-    FF_EBUR128_Mm135, /**< itu M-135 */
-    FF_EBUR128_Mp180, /**< itu M+180 */
-    FF_EBUR128_Up000, /**< itu U+000 */
-    FF_EBUR128_Up030, /**< itu U+030 */
-    FF_EBUR128_Um030, /**< itu U-030 */
-    FF_EBUR128_Up045, /**< itu U+045 */
-    FF_EBUR128_Um045, /**< itu U-030 */
-    FF_EBUR128_Up090, /**< itu U+090 */
-    FF_EBUR128_Um090, /**< itu U-090 */
-    FF_EBUR128_Up110, /**< itu U+110 */
-    FF_EBUR128_Um110, /**< itu U-110 */
-    FF_EBUR128_Up135, /**< itu U+135 */
-    FF_EBUR128_Um135, /**< itu U-135 */
-    FF_EBUR128_Up180, /**< itu U+180 */
-    FF_EBUR128_Tp000, /**< itu T+000 */
-    FF_EBUR128_Bp000, /**< itu B+000 */
-    FF_EBUR128_Bp045, /**< itu B+045 */
-    FF_EBUR128_Bm045 /**< itu B-045 */
-};
-
-/** \enum mode
- * Use these values in ebur128_init (or'ed). Try to use the lowest possible
- * modes that suit your needs, as performance will be better.
- */
-enum mode {
-    /** can resurrrect and call ff_ebur128_loudness_momentary */
-    FF_EBUR128_MODE_M = (1 << 0),
-    /** can call ff_ebur128_loudness_shortterm */
-    FF_EBUR128_MODE_S = (1 << 1) | FF_EBUR128_MODE_M,
-    /** can call ff_ebur128_loudness_global_* and ff_ebur128_relative_threshold */
-    FF_EBUR128_MODE_I = (1 << 2) | FF_EBUR128_MODE_M,
-    /** can call ff_ebur128_loudness_range */
-    FF_EBUR128_MODE_LRA = (1 << 3) | FF_EBUR128_MODE_S,
-    /** can call ff_ebur128_sample_peak */
-    FF_EBUR128_MODE_SAMPLE_PEAK = (1 << 4) | FF_EBUR128_MODE_M,
-};
-
-/** forward declaration of FFEBUR128StateInternal */
-struct FFEBUR128StateInternal;
-
-/** \brief Contains information about the state of a loudness measurement.
- *
- * You should not need to modify this struct directly.
- */
-typedef struct FFEBUR128State {
-    int mode; /**< The current mode. */
-    unsigned int channels; /**< The number of channels. */
-    unsigned long samplerate; /**< The sample rate. */
-    struct FFEBUR128StateInternal *d; /**< Internal state. */
-} FFEBUR128State;
-
-/** \brief Initialize library state.
- *
- * @param channels the number of channels.
- * @param samplerate the sample rate.
- * @param window set the maximum window size in ms, set to 0 for auto.
- * @param mode see the mode enum for possible values.
- * @return an initialized library state.
- */
-FFEBUR128State *ff_ebur128_init(unsigned int channels,
-                                unsigned long samplerate,
-                                unsigned long window, int mode);
-
-/** \brief Destroy library state.
- *
- * @param st pointer to a library state.
- */
-void ff_ebur128_destroy(FFEBUR128State ** st);
-
-/** \brief Set channel type.
- *
- * The default is:
- * - 0 -> FF_EBUR128_LEFT
- * - 1 -> FF_EBUR128_RIGHT
- * - 2 -> FF_EBUR128_CENTER
- * - 3 -> FF_EBUR128_UNUSED
- * - 4 -> FF_EBUR128_LEFT_SURROUND
- * - 5 -> FF_EBUR128_RIGHT_SURROUND
- *
- * @param st library state.
- * @param channel_number zero based channel index.
- * @param value channel type from the "channel" enum.
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if invalid channel index.
- */
-int ff_ebur128_set_channel(FFEBUR128State * st,
-                           unsigned int channel_number, int value);
-
-/** \brief Add frames to be processed.
- *
- * @param st library state.
- * @param src array of source frames. Channels must be interleaved.
- * @param frames number of frames. Not number of samples!
- */
-void ff_ebur128_add_frames_double(FFEBUR128State * st,
-                                  const double *src, size_t frames);
-
-/** \brief Get global integrated loudness in LUFS.
- *
- * @param st library state.
- * @param out integrated loudness in LUFS. -HUGE_VAL if result is negative
- * infinity.
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if mode "FF_EBUR128_MODE_I" has not been set.
- */
-int ff_ebur128_loudness_global(FFEBUR128State * st, double *out);
-
-/** \brief Get short-term loudness (last 3s) in LUFS.
- *
- * @param st library state.
- * @param out short-term loudness in LUFS. -HUGE_VAL if result is negative
- * infinity.
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if mode "FF_EBUR128_MODE_S" has not been set.
- */
-int ff_ebur128_loudness_shortterm(FFEBUR128State * st, double *out);
-
-/** \brief Get loudness range (LRA) of programme in LU.
- *
- * Calculates loudness range according to EBU 3342.
- *
- * @param st library state.
- * @param out loudness range (LRA) in LU. Will not be changed in case of
- * error. AVERROR(EINVAL) will be returned in this case.
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if mode "FF_EBUR128_MODE_LRA" has not been set.
- */
-int ff_ebur128_loudness_range(FFEBUR128State * st, double *out);
-/** \brief Get loudness range (LRA) in LU across multiple instances.
- *
- * Calculates loudness range according to EBU 3342.
- *
- * @param sts array of library states.
- * @param size length of sts
- * @param out loudness range (LRA) in LU. Will not be changed in case of
- * error. AVERROR(EINVAL) will be returned in this case.
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if mode "FF_EBUR128_MODE_LRA" has not been set.
- */
-int ff_ebur128_loudness_range_multiple(FFEBUR128State ** sts,
-                                       size_t size, double *out);
-
-/** \brief Get maximum sample peak of selected channel in float format.
- *
- * @param st library state
- * @param channel_number channel to analyse
- * @param out maximum sample peak in float format (1.0 is 0 dBFS)
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if mode "FF_EBUR128_MODE_SAMPLE_PEAK" has not been set.
- * - AVERROR(EINVAL) if invalid channel index.
- */
-int ff_ebur128_sample_peak(FFEBUR128State * st,
-                           unsigned int channel_number, double *out);
-
-/** \brief Get relative threshold in LUFS.
- *
- * @param st library state
- * @param out relative threshold in LUFS.
- * @return
- * - 0 on success.
- * - AVERROR(EINVAL) if mode "FF_EBUR128_MODE_I" has not been set.
- */
-int ff_ebur128_relative_threshold(FFEBUR128State * st, double *out);
-
-#endif /* AVFILTER_EBUR128_H */
diff --git a/libavfilter/f_ebur128.c b/libavfilter/f_ebur128.c
index a921602b44..aa4426d4da 100644
--- a/libavfilter/f_ebur128.c
+++ b/libavfilter/f_ebur128.c
@@ -75,6 +75,13 @@ struct integrator {
     struct hist_entry *histogram; ///< histogram of the powers, used to compute LRA and I
 };

+enum PrintFormat {
+    NONE,
+    JSON,
+    SUMMARY,
+    PF_NB
+};
+
 struct rect { int x, y, w, h; };

 typedef struct EBUR128Context {
@@ -85,8 +92,10 @@ typedef struct EBUR128Context {
     double true_peak; ///< global true peak
     double *true_peaks; ///< true peaks per channel
     double sample_peak; ///< global sample peak
+    double frame_sample_peak; ///< frame sample peak
     double *sample_peaks; ///< sample peaks per channel
     double *true_peaks_per_frame; ///< true peaks in a frame per channel
+    double *sample_peaks_per_frame; ///< sample peaks in a frame per channel
 #if CONFIG_SWRESAMPLE
     SwrContext *swr_ctx; ///< over-sampling context for true peak metering
     double *swr_buf; ///< resampled audio data for true peak metering
@@ -389,11 +398,8 @@ static int config_video_output(AVFilterLink *outlink)
     return 0;
 }

-static int config_audio_input(AVFilterLink *inlink)
+static int config_audio_in(AVFilterLink *inlink, EBUR128Context *ebur128)
 {
-    AVFilterContext *ctx = inlink->dst;
-    EBUR128Context *ebur128 = ctx->priv;
-
     /* Unofficial reversed parametrization of PRE
      * and RLB from 48kHz */

@@ -434,11 +440,16 @@ static int config_audio_input(AVFilterLink *inlink)
     return 0;
 }

-static int config_audio_output(AVFilterLink *outlink)
+static int config_audio_input(AVFilterLink *inlink)
 {
-    int i;
-    AVFilterContext *ctx = outlink->src;
+    AVFilterContext *ctx = inlink->dst;
     EBUR128Context *ebur128 = ctx->priv;
+    return config_audio_in(inlink, ebur128);
+}
+
+static int config_audio_out(AVFilterLink *outlink, EBUR128Context *ebur128)
+{
+    int i;
     const int nb_channels = outlink->ch_layout.nb_channels;

 #define BACK_MASK (AV_CH_BACK_LEFT |AV_CH_BACK_CENTER |AV_CH_BACK_RIGHT| \
@@ -515,14 +526,22 @@ static int config_audio_output(AVFilterLink *outlink)
 #endif

     if (ebur128->peak_mode & PEAK_MODE_SAMPLES_PEAKS) {
+        ebur128->sample_peaks_per_frame = av_calloc(nb_channels, sizeof(*ebur128->sample_peaks_per_frame));
         ebur128->sample_peaks = av_calloc(nb_channels, sizeof(*ebur128->sample_peaks));
-        if (!ebur128->sample_peaks)
+        if (!ebur128->sample_peaks || !ebur128->sample_peaks_per_frame)
             return AVERROR(ENOMEM);
     }

     return 0;
 }

+static int config_audio_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    EBUR128Context *ebur128 = ctx->priv;
+    return config_audio_out(outlink, ebur128);
+}
+
 #define ENERGY(loudness) (ff_exp10(((loudness) + 0.691) / 10.))
 #define LOUDNESS(energy) (-0.691 + 10 * log10(energy))
 #define DBFS(energy) (20 * log10(energy))
@@ -541,9 +560,8 @@ static struct hist_entry *get_histogram(void)
     return h;
 }

-static av_cold int init(AVFilterContext *ctx)
+static av_cold int init_ebur128(AVFilterContext *ctx, EBUR128Context *ebur128)
 {
-    EBUR128Context *ebur128 = ctx->priv;
     AVFilterPad pad;
     int ret;

@@ -574,6 +592,9 @@ static av_cold int init(AVFilterContext *ctx)
     ebur128->integrated_loudness = ABS_THRES;
     ebur128->loudness_range = 0;

+    if (strcmp(ctx->filter->name, "ebur128"))
+        return 0;
+
     /* insert output pads */
     if (ebur128->do_video) {
         pad = (AVFilterPad){
@@ -600,6 +621,12 @@ static av_cold int init(AVFilterContext *ctx)
     return 0;
 }

+static av_cold int init(AVFilterContext *ctx)
+{
+    EBUR128Context *ebur128 = ctx->priv;
+    return init_ebur128(ctx, ebur128);
+}
+
 #define HIST_POS(power) (int)(((power) - ABS_THRES) * HIST_GRAIN)

 /* loudness and power should be set such as loudness = -0.691 +
@@ -627,27 +654,21 @@ static int gate_update(struct integrator *integ, double power,
     return gate_hist_pos;
 }

-static int filter_frame(AVFilterLink *inlink, AVFrame *insamples)
+static int true_peaks_ebur128(EBUR128Context *ebur128, const uint8_t **samples,
+                              int nb_samples)
 {
-    int i, ch, idx_insample, ret;
-    AVFilterContext *ctx = inlink->dst;
-    EBUR128Context *ebur128 = ctx->priv;
-    const int nb_channels = ebur128->nb_channels;
-    const int nb_samples = insamples->nb_samples;
-    const double *samples = (double *)insamples->data[0];
-    AVFrame *pic;
-
 #if CONFIG_SWRESAMPLE
     if (ebur128->peak_mode & PEAK_MODE_TRUE_PEAKS && ebur128->idx_insample == 0) {
         const double *swr_samples = ebur128->swr_buf;
+        const int nb_channels = ebur128->nb_channels;
         int ret = swr_convert(ebur128->swr_ctx, (uint8_t**)&ebur128->swr_buf, 19200,
-                              (const uint8_t **)insamples->data, nb_samples);
+                              samples, nb_samples);
         if (ret < 0)
             return ret;
-        for (ch = 0; ch < nb_channels; ch++)
+        for (int ch = 0; ch < nb_channels; ch++)
             ebur128->true_peaks_per_frame[ch] = 0.0;
-        for (idx_insample = 0; idx_insample < ret; idx_insample++) {
-            for (ch = 0; ch < nb_channels; ch++) {
+        for (int idx_insample = 0; idx_insample < ret; idx_insample++) {
+            for (int ch = 0; ch < nb_channels; ch++) {
                 ebur128->true_peaks[ch] = FFMAX(ebur128->true_peaks[ch], fabs(*swr_samples));
                 ebur128->true_peaks_per_frame[ch] = FFMAX(ebur128->true_peaks_per_frame[ch],
                                                           fabs(*swr_samples));
@@ -656,10 +677,14 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *insamples)
         }
     }
 #endif
+    return 0;
+}

-    for (idx_insample = ebur128->idx_insample; idx_insample < nb_samples; idx_insample++) {
-        const int bin_id_400 = ebur128->i400.cache_pos;
-        const int bin_id_3000 = ebur128->i3000.cache_pos;
+static void process_ebur128(EBUR128Context *ebur128, const double *samples)
+{
+    const int nb_channels = ebur128->nb_channels;
+    const int bin_id_400 = ebur128->i400.cache_pos;
+    const int bin_id_3000 = ebur128->i3000.cache_pos;

 #define MOVE_TO_NEXT_CACHED_ENTRY(time) do { \
     ebur128->i##time.cache_pos++; \
@@ -670,47 +695,49 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *insamples)
     } \
 } while (0)

-        MOVE_TO_NEXT_CACHED_ENTRY(400);
-        MOVE_TO_NEXT_CACHED_ENTRY(3000);
+    MOVE_TO_NEXT_CACHED_ENTRY(400);
+    MOVE_TO_NEXT_CACHED_ENTRY(3000);

-        for (ch = 0; ch < nb_channels; ch++) {
-            double bin;
+    for (int ch = 0; ch < nb_channels; ch++) {
+        double bin;

-            if (ebur128->peak_mode & PEAK_MODE_SAMPLES_PEAKS)
-                ebur128->sample_peaks[ch] = FFMAX(ebur128->sample_peaks[ch], fabs(samples[idx_insample * nb_channels + ch]));
+        if (ebur128->peak_mode & PEAK_MODE_SAMPLES_PEAKS) {
+            ebur128->sample_peaks[ch] = FFMAX(ebur128->sample_peaks[ch], fabs(samples[ch]));
+            ebur128->sample_peaks_per_frame[ch] = FFMAX(ebur128->sample_peaks_per_frame[ch], fabs(samples[ch]));
+        }

-            ebur128->x[ch * 3] = samples[idx_insample * nb_channels + ch]; // set X[i]
+        ebur128->x[ch * 3] = samples[ch]; // set X[i]

-            if (!ebur128->ch_weighting[ch])
-                continue;
+        if (!ebur128->ch_weighting[ch])
+            continue;

-            /* Y[i] = X[i]*b0 + X[i-1]*b1 + X[i-2]*b2 - Y[i-1]*a1 - Y[i-2]*a2 */
-#define FILTER(Y, X, NUM, DEN) do { \
-            double *dst = ebur128->Y + ch*3; \
-            double *src = ebur128->X + ch*3; \
-            dst[2] = dst[1]; \
-            dst[1] = dst[0]; \
-            dst[0] = src[0]*NUM[0] + src[1]*NUM[1] + src[2]*NUM[2] \
-                                   - dst[1]*DEN[1] - dst[2]*DEN[2]; \
+        /* Y[i] = X[i]*b0 + X[i-1]*b1 + X[i-2]*b2 - Y[i-1]*a1 - Y[i-2]*a2 */
+#define FILTER(Y, X, NUM, DEN) do { \
+        double *dst = ebur128->Y + ch*3; \
+        double *src = ebur128->X + ch*3; \
+        dst[2] = dst[1]; \
+        dst[1] = dst[0]; \
+        dst[0] = src[0]*NUM[0] + src[1]*NUM[1] + src[2]*NUM[2] \
+                               - dst[1]*DEN[1] - dst[2]*DEN[2]; \
 } while (0)

-            // TODO: merge both filters in one?
-            FILTER(y, x, ebur128->pre_b, ebur128->pre_a); // apply pre-filter
-            ebur128->x[ch * 3 + 2] = ebur128->x[ch * 3 + 1];
-            ebur128->x[ch * 3 + 1] = ebur128->x[ch * 3 ];
-            FILTER(z, y, ebur128->rlb_b, ebur128->rlb_a); // apply RLB-filter
+        // TODO: merge both filters in one?
+        FILTER(y, x, ebur128->pre_b, ebur128->pre_a); // apply pre-filter
+        ebur128->x[ch * 3 + 2] = ebur128->x[ch * 3 + 1];
+        ebur128->x[ch * 3 + 1] = ebur128->x[ch * 3 ];
+        FILTER(z, y, ebur128->rlb_b, ebur128->rlb_a); // apply RLB-filter

-            bin = ebur128->z[ch * 3] * ebur128->z[ch * 3];
+        bin = ebur128->z[ch * 3] * ebur128->z[ch * 3];

-            /* add the new value, and limit the sum to the cache size (400ms or 3s)
-             * by removing the oldest one */
-            ebur128->i400.sum [ch] = ebur128->i400.sum [ch] + bin - ebur128->i400.cache [ch][bin_id_400];
-            ebur128->i3000.sum[ch] = ebur128->i3000.sum[ch] + bin - ebur128->i3000.cache[ch][bin_id_3000];
+        /* add the new value, and limit the sum to the cache size (400ms or 3s)
+         * by removing the oldest one */
+        ebur128->i400.sum [ch] = ebur128->i400.sum [ch] + bin - ebur128->i400.cache [ch][bin_id_400];
+        ebur128->i3000.sum[ch] = ebur128->i3000.sum[ch] + bin - ebur128->i3000.cache[ch][bin_id_3000];

-            /* override old cache entry with the new value */
-            ebur128->i400.cache [ch][bin_id_400 ] = bin;
-            ebur128->i3000.cache[ch][bin_id_3000] = bin;
-        }
+        /* override old cache entry with the new value */
+        ebur128->i400.cache [ch][bin_id_400 ] = bin;
+        ebur128->i3000.cache[ch][bin_id_3000] = bin;
+    }

 #define FIND_PEAK(global, sp, ptype) do { \
     int ch; \
@@ -723,110 +750,149 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *insamples)
     } \
 } while (0)

-        FIND_PEAK(ebur128->sample_peak, ebur128->sample_peaks, SAMPLES);
-        FIND_PEAK(ebur128->true_peak, ebur128->true_peaks, TRUE);
+    FIND_PEAK(ebur128->frame_sample_peak, ebur128->sample_peaks_per_frame, SAMPLES);
+    FIND_PEAK(ebur128->sample_peak, ebur128->sample_peaks, SAMPLES);
+    FIND_PEAK(ebur128->true_peak, ebur128->true_peaks, TRUE);
+}

-        /* For integrated loudness, gating blocks are 400ms long with 75%
-         * overlap (see BS.1770-2 p5), so a re-computation is needed each 100ms
-         * (4800 samples at 48kHz). */
-        if (++ebur128->sample_count == inlink->sample_rate / 10) {
-            double loudness_400, loudness_3000;
-            double power_400 = 1e-12, power_3000 = 1e-12;
-            AVFilterLink *outlink = ctx->outputs[0];
-            const int64_t pts = insamples->pts +
-                av_rescale_q(idx_insample, (AVRational){ 1, inlink->sample_rate },
-                             ctx->outputs[ebur128->do_video]->time_base);
+static void ebur128_loudness(AVFilterLink *inlink,
+                             EBUR128Context *ebur128,
+                             double *l400, double *l3000, double *integrated, double *peak)
+{
+    const int nb_channels = ebur128->nb_channels;
+    double power_400 = 1e-12, power_3000 = 1e-12;
+    double loudness_400, loudness_3000;

-            ebur128->sample_count = 0;
+    ebur128->sample_count = 0;

 #define COMPUTE_LOUDNESS(m, time) do { \
     if (ebur128->i##time.filled) { \
         /* weighting sum of the last <time> ms */ \
-        for (ch = 0; ch < nb_channels; ch++) \
+        for (int ch = 0; ch < nb_channels; ch++) \
             power_##time += ebur128->ch_weighting[ch] * ebur128->i##time.sum[ch]; \
         power_##time /= I##time##_BINS(inlink->sample_rate); \
     } \
     loudness_##time = LOUDNESS(power_##time); \
 } while (0)

-            COMPUTE_LOUDNESS(M, 400);
-            COMPUTE_LOUDNESS(S, 3000);
+    COMPUTE_LOUDNESS(M, 400);
+    COMPUTE_LOUDNESS(S, 3000);

-            /* Integrated loudness */
+    /* Integrated loudness */
 #define I_GATE_THRES -10 // initially defined to -8 LU in the first EBU standard

-            if (loudness_400 >= ABS_THRES) {
-                double integrated_sum = 0.0;
-                uint64_t
nb_integrated = 0; - int gate_hist_pos = gate_update(&ebur128->i400, power_400, - loudness_400, I_GATE_THRES); - - /* compute integrated loudness by summing the histogram values - * above the relative threshold */ - for (i = gate_hist_pos; i < HIST_SIZE; i++) { - const unsigned nb_v = ebur128->i400.histogram[i].count; - nb_integrated += nb_v; - integrated_sum += nb_v * ebur128->i400.histogram[i].energy; - } - if (nb_integrated) { - ebur128->integrated_loudness = LOUDNESS(integrated_sum / nb_integrated); - /* dual-mono correction */ - if (nb_channels == 1 && ebur128->dual_mono) { - ebur128->integrated_loudness -= ebur128->pan_law; - } - } + if (loudness_400 >= ABS_THRES) { + double integrated_sum = 0.0; + uint64_t nb_integrated = 0; + int gate_hist_pos = gate_update(&ebur128->i400, power_400, + loudness_400, I_GATE_THRES); + + /* compute integrated loudness by summing the histogram values + * above the relative threshold */ + for (int i = gate_hist_pos; i < HIST_SIZE; i++) { + const unsigned nb_v = ebur128->i400.histogram[i].count; + nb_integrated += nb_v; + integrated_sum += nb_v * ebur128->i400.histogram[i].energy; + } + if (nb_integrated) { + ebur128->integrated_loudness = LOUDNESS(integrated_sum / nb_integrated); + /* dual-mono correction */ + if (nb_channels == 1 && ebur128->dual_mono) { + ebur128->integrated_loudness -= ebur128->pan_law; } + } + } - /* LRA */ + /* LRA */ #define LRA_GATE_THRES -20 #define LRA_LOWER_PRC 10 #define LRA_HIGHER_PRC 95 - /* XXX: example code in EBU 3342 is ">=" but formula in BS.1770 - * specs is ">" */ - if (loudness_3000 >= ABS_THRES) { - uint64_t nb_powers = 0; - int gate_hist_pos = gate_update(&ebur128->i3000, power_3000, - loudness_3000, LRA_GATE_THRES); - - for (i = gate_hist_pos; i < HIST_SIZE; i++) - nb_powers += ebur128->i3000.histogram[i].count; - if (nb_powers) { - uint64_t n, nb_pow; - - /* get lower loudness to consider */ - n = 0; - nb_pow = LRA_LOWER_PRC * nb_powers * 0.01 + 0.5; - for (i = gate_hist_pos; i < 
HIST_SIZE; i++) { - n += ebur128->i3000.histogram[i].count; - if (n >= nb_pow) { - ebur128->lra_low = ebur128->i3000.histogram[i].loudness; - break; - } - } - - /* get higher loudness to consider */ - n = nb_powers; - nb_pow = LRA_HIGHER_PRC * nb_powers * 0.01 + 0.5; - for (i = HIST_SIZE - 1; i >= 0; i--) { - n -= FFMIN(n, ebur128->i3000.histogram[i].count); - if (n < nb_pow) { - ebur128->lra_high = ebur128->i3000.histogram[i].loudness; - break; - } - } - - // XXX: show low & high on the graph? - ebur128->loudness_range = ebur128->lra_high - ebur128->lra_low; + /* XXX: example code in EBU 3342 is ">=" but formula in BS.1770 + * specs is ">" */ + if (loudness_3000 >= ABS_THRES) { + uint64_t nb_powers = 0; + int gate_hist_pos = gate_update(&ebur128->i3000, power_3000, + loudness_3000, LRA_GATE_THRES); + + for (int i = gate_hist_pos; i < HIST_SIZE; i++) + nb_powers += ebur128->i3000.histogram[i].count; + if (nb_powers) { + uint64_t n, nb_pow; + + /* get lower loudness to consider */ + n = 0; + nb_pow = LRA_LOWER_PRC * nb_powers * 0.01 + 0.5; + for (int i = gate_hist_pos; i < HIST_SIZE; i++) { + n += ebur128->i3000.histogram[i].count; + if (n >= nb_pow) { + ebur128->lra_low = ebur128->i3000.histogram[i].loudness; + break; } } - /* dual-mono correction */ - if (nb_channels == 1 && ebur128->dual_mono) { - loudness_400 -= ebur128->pan_law; - loudness_3000 -= ebur128->pan_law; + /* get higher loudness to consider */ + n = nb_powers; + nb_pow = LRA_HIGHER_PRC * nb_powers * 0.01 + 0.5; + for (int i = HIST_SIZE - 1; i >= 0; i--) { + n -= FFMIN(n, ebur128->i3000.histogram[i].count); + if (n < nb_pow) { + ebur128->lra_high = ebur128->i3000.histogram[i].loudness; + break; + } } + // XXX: show low & high on the graph? 
+ ebur128->loudness_range = ebur128->lra_high - ebur128->lra_low; + } + } + + /* dual-mono correction */ + if (nb_channels == 1 && ebur128->dual_mono) { + loudness_400 -= ebur128->pan_law; + loudness_3000 -= ebur128->pan_law; + } + + if (ebur128->peak_mode & PEAK_MODE_SAMPLES_PEAKS) { + for (int ch = 0; ch < nb_channels; ch++) + ebur128->sample_peaks_per_frame[ch] = 0.0; + } + + *l400 = loudness_400; + *l3000 = loudness_3000; + *integrated = ebur128->integrated_loudness; + *peak = ebur128->frame_sample_peak; +} + +static int filter_frame(AVFilterLink *inlink, AVFrame *insamples) +{ + int idx_insample, ret; + AVFilterContext *ctx = inlink->dst; + EBUR128Context *ebur128 = ctx->priv; + const int nb_channels = ebur128->nb_channels; + const int nb_samples = insamples->nb_samples; + const double *samples = (const double *)insamples->data[0]; + AVFrame *pic; + + ret = true_peaks_ebur128(ebur128, (const uint8_t **)insamples->data, nb_samples); + if (ret < 0) + return ret; + + for (idx_insample = ebur128->idx_insample; idx_insample < nb_samples; idx_insample++) { + process_ebur128(ebur128, samples + idx_insample * nb_channels); + + /* For integrated loudness, gating blocks are 400ms long with 75% + * overlap (see BS.1770-2 p5), so a re-computation is needed each 100ms + * (4800 samples at 48kHz). 
*/ + if (++ebur128->sample_count == inlink->sample_rate / 10) { + double loudness_400, loudness_3000, loudness_integrated, peak; + AVFilterLink *outlink = ctx->outputs[0]; + const int64_t pts = insamples->pts + + av_rescale_q(idx_insample, (AVRational){ 1, inlink->sample_rate }, + ctx->outputs[ebur128->do_video]->time_base); + + ebur128_loudness(inlink, ebur128, &loudness_400, &loudness_3000, &loudness_integrated, &peak); + #define LOG_FMT "TARGET:%d LUFS M:%6.1f S:%6.1f I:%6.1f %s LRA:%6.1f LU" /* push one video frame */ @@ -910,7 +976,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *insamples) if (ebur128->peak_mode & PEAK_MODE_ ## ptype ## _PEAKS) { \ double max_peak = 0.0; \ char key[64]; \ - for (ch = 0; ch < nb_channels; ch++) { \ + for (int ch = 0; ch < nb_channels; ch++) { \ snprintf(key, sizeof(key), \ META_PREFIX AV_STRINGIFY(name) "_peaks_ch%d", ch); \ max_peak = fmax(max_peak, ebur128->name##_peaks[ch]); \ @@ -949,7 +1015,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *insamples) #define PRINT_PEAKS(str, sp, ptype) do { \ if (ebur128->peak_mode & PEAK_MODE_ ## ptype ## _PEAKS) { \ av_log(ctx, ebur128->loglevel, " " str ":"); \ - for (ch = 0; ch < nb_channels; ch++) \ + for (int ch = 0; ch < nb_channels; ch++) \ av_log(ctx, ebur128->loglevel, " %5.1f", DBFS(sp[ch])); \ av_log(ctx, ebur128->loglevel, " dBFS"); \ } \ @@ -1047,6 +1113,36 @@ static int query_formats(AVFilterContext *ctx) return 0; } +static av_cold void uninit_ebur128(AVFilterContext *ctx, EBUR128Context *ebur128) +{ + av_freep(&ebur128->y_line_ref); + av_freep(&ebur128->x); + av_freep(&ebur128->y); + av_freep(&ebur128->z); + av_freep(&ebur128->ch_weighting); + av_freep(&ebur128->true_peaks); + av_freep(&ebur128->sample_peaks); + av_freep(&ebur128->true_peaks_per_frame); + av_freep(&ebur128->sample_peaks_per_frame); + av_freep(&ebur128->i400.sum); + av_freep(&ebur128->i3000.sum); + av_freep(&ebur128->i400.histogram); + av_freep(&ebur128->i3000.histogram); + for (int i 
= 0; i < ebur128->nb_channels; i++) { + if (ebur128->i400.cache) + av_freep(&ebur128->i400.cache[i]); + if (ebur128->i3000.cache) + av_freep(&ebur128->i3000.cache[i]); + } + av_freep(&ebur128->i400.cache); + av_freep(&ebur128->i3000.cache); + av_frame_free(&ebur128->outpicref); +#if CONFIG_SWRESAMPLE + av_freep(&ebur128->swr_buf); + swr_free(&ebur128->swr_ctx); +#endif +} + static av_cold void uninit(AVFilterContext *ctx) { EBUR128Context *ebur128 = ctx->priv; @@ -1085,31 +1181,7 @@ static av_cold void uninit(AVFilterContext *ctx) av_log(ctx, AV_LOG_INFO, "\n"); } - av_freep(&ebur128->y_line_ref); - av_freep(&ebur128->x); - av_freep(&ebur128->y); - av_freep(&ebur128->z); - av_freep(&ebur128->ch_weighting); - av_freep(&ebur128->true_peaks); - av_freep(&ebur128->sample_peaks); - av_freep(&ebur128->true_peaks_per_frame); - av_freep(&ebur128->i400.sum); - av_freep(&ebur128->i3000.sum); - av_freep(&ebur128->i400.histogram); - av_freep(&ebur128->i3000.histogram); - for (int i = 0; i < ebur128->nb_channels; i++) { - if (ebur128->i400.cache) - av_freep(&ebur128->i400.cache[i]); - if (ebur128->i3000.cache) - av_freep(&ebur128->i3000.cache[i]); - } - av_freep(&ebur128->i400.cache); - av_freep(&ebur128->i3000.cache); - av_frame_free(&ebur128->outpicref); -#if CONFIG_SWRESAMPLE - av_freep(&ebur128->swr_buf); - swr_free(&ebur128->swr_ctx); -#endif + uninit_ebur128(ctx, ebur128); } static const AVFilterPad ebur128_inputs[] = { @@ -1133,3 +1205,521 @@ const AVFilter ff_af_ebur128 = { .priv_class = &ebur128_class, .flags = AVFILTER_FLAG_DYNAMIC_OUTPUTS, }; + +#define FIFO_SIZE 30 + +enum DynamicMode { + DM_MOMENTARY = 1 << 0, + DM_SHORTTERM = 1 << 1, + DM_INTEGRATED = 1 << 2, +}; + +enum MeanMode { + MM_ARITHMETIC, + MM_HARMONIC, + MM_GEOMETRIC, + MM_MAXIMUM, + MM_MODES +}; + +typedef struct LoudNormContext { + const AVClass *class; + double target_i; + double target_lra; + double target_tp; + double measured_i; + double measured_lra; + double measured_tp; + double 
measured_thresh; + double offset; + int linear_mode; + int dual_mono; + enum PrintFormat print_format; + int dynamic_mode; + int mean_mode; + double rangeup; + double rangedown; + double attack; + double release; + double attack_coeff; + double release_coeff; + + int eof; + int64_t eof_pts; + int nb_channels; + int nb_samples; + + AVFrame *insamples; + AVFrame *frames[FIFO_SIZE]; + double i400; + double i3000; + double integrated; + double peaks[FIFO_SIZE]; + double prev_offset; + + EBUR128Context r128_in; + EBUR128Context r128_out; +} LoudNormContext; + +static av_cold int loudnorm_init(AVFilterContext *ctx) +{ + LoudNormContext *s = ctx->priv; + int ret; + + ret = init_ebur128(ctx, &s->r128_in); + if (ret < 0) + return ret; + ret = init_ebur128(ctx, &s->r128_out); + if (ret < 0) + return ret; + + s->r128_in.dual_mono = s->dual_mono; + s->r128_out.dual_mono = s->dual_mono; + + if (s->linear_mode) { + double offset, offset_tp; + offset = s->target_i - s->measured_i; + offset_tp = s->measured_tp + offset; + + if (s->measured_tp != 99 && s->measured_thresh != -70 && s->measured_lra != 0 && s->measured_i != 0) { + if ((offset_tp <= s->target_tp) && (s->measured_lra <= s->target_lra)) { + s->offset = pow(10., offset / 20.); + } + } else { + s->linear_mode = 0; + } + } + + return 0; +} + +static int loudnorm_query_formats(AVFilterContext *ctx) +{ + LoudNormContext *s = ctx->priv; + static const int input_srate[] = {192000, -1}; + static const enum AVSampleFormat sample_fmts[] = { + AV_SAMPLE_FMT_DBL, + AV_SAMPLE_FMT_NONE + }; + int ret = ff_set_common_all_channel_counts(ctx); + if (ret < 0) + return ret; + + ret = ff_set_common_formats_from_list(ctx, sample_fmts); + if (ret < 0) + return ret; + + if (s->linear_mode) { + return ff_set_common_all_samplerates(ctx); + } else { + return ff_set_common_samplerates_from_list(ctx, input_srate); + } +} + +static av_cold void loudnorm_uninit(AVFilterContext *ctx) +{ + LoudNormContext *s = ctx->priv; + EBUR128Context *r128_out = 
&s->r128_out; + EBUR128Context *r128_in = &s->r128_in; + + if (s->nb_channels > 0) { + switch (s->print_format) { + case NONE: + break; + + case JSON: + av_log(ctx, AV_LOG_INFO, + "\n{\n" + "\t\"input_i\" : \"%.2f\",\n" + "\t\"input_tp\" : \"%.2f\",\n" + "\t\"input_lra\" : \"%.2f\",\n" + "\t\"input_thresh\" : \"%.2f\",\n" + "\t\"output_i\" : \"%.2f\",\n" + "\t\"output_tp\" : \"%+.2f\",\n" + "\t\"output_lra\" : \"%.2f\",\n" + "\t\"output_thresh\" : \"%.2f\",\n" + "\t\"normalization_type\" : \"%s\",\n" + "\t\"target_offset\" : \"%.2f\"\n" + "}\n", + r128_in->integrated_loudness, + r128_in->true_peak, + r128_in->loudness_range, + r128_in->i3000.rel_threshold, + r128_out->integrated_loudness, + r128_out->true_peak, + r128_out->loudness_range, + r128_out->i3000.rel_threshold, + s->linear_mode ? "linear" : "dynamic", + s->target_i - r128_out->integrated_loudness + ); + break; + + case SUMMARY: + av_log(ctx, AV_LOG_INFO, + "\n" + "Input Integrated: %+6.1f LUFS\n" + "Input True Peak: %+6.1f dBTP\n" + "Input LRA: %6.1f LU\n" + "Input Threshold: %+6.1f LUFS\n" + "\n" + "Output Integrated: %+6.1f LUFS\n" + "Output True Peak: %+6.1f dBTP\n" + "Output LRA: %6.1f LU\n" + "Output Threshold: %+6.1f LUFS\n" + "\n" + "Normalization Type: %s\n" + "Target Offset: %+6.1f LU\n", + r128_in->integrated_loudness, + r128_in->true_peak, + r128_in->loudness_range, + r128_in->i3000.rel_threshold, + r128_out->integrated_loudness, + r128_out->true_peak, + r128_out->loudness_range, + r128_out->i3000.rel_threshold, + s->linear_mode ? 
"Linear" : "Dynamic", + s->target_i - r128_out->integrated_loudness + ); + break; + } + } + + for (int i = 0; i < FIFO_SIZE; i++) + av_frame_free(&s->frames[i]); + + uninit_ebur128(ctx, &s->r128_in); + uninit_ebur128(ctx, &s->r128_out); +} + +static int loudnorm_config_input(AVFilterLink *inlink) +{ + AVFilterContext *ctx = inlink->dst; + LoudNormContext *s = ctx->priv; + int ret; + + s->nb_samples = FFMAX(inlink->sample_rate / 10, 1); + + ret = config_audio_in(inlink, &s->r128_in); + if (ret < 0) + return ret; + + ret = config_audio_in(inlink, &s->r128_out); + if (ret < 0) + return ret; + + return 0; +} + +static double get_coeff(double x, double sr) +{ + return 1.0 - exp(-1.0 / (0.001 * x * sr)); +} + +static int loudnorm_config_output(AVFilterLink *outlink) +{ + AVFilterContext *ctx = outlink->src; + LoudNormContext *s = ctx->priv; + int ret; + + s->attack_coeff = get_coeff(s->attack, outlink->sample_rate); + s->release_coeff = get_coeff(s->release, outlink->sample_rate); + s->prev_offset = 1.0; + + s->nb_channels = outlink->ch_layout.nb_channels; + s->r128_out.peak_mode = PEAK_MODE_TRUE_PEAKS|PEAK_MODE_SAMPLES_PEAKS; + s->r128_in.peak_mode = PEAK_MODE_TRUE_PEAKS|PEAK_MODE_SAMPLES_PEAKS; + + ret = config_audio_out(outlink, &s->r128_in); + if (ret < 0) + return ret; + + ret = config_audio_out(outlink, &s->r128_out); + if (ret < 0) + return ret; + + return 0; +} + +static double add_item(int mode, double offset, double item) +{ + switch (mode) { + case MM_MAXIMUM: + offset = fmax(offset, item); + break; + case MM_GEOMETRIC: + offset *= item; + break; + case MM_HARMONIC: + offset += 1.0 / item; + break; + case MM_ARITHMETIC: + offset += item; + break; + } + + return offset; +} + +static double get_loudness(LoudNormContext *s, + int dynamic_mode, + int mean_mode) +{ + double offset, p = 0.0; + + if (mean_mode == MM_GEOMETRIC) + offset = 1.0; + else if (mean_mode == MM_MAXIMUM) + offset = -70.0; + else + offset = 0.0; + + if (dynamic_mode & DM_INTEGRATED) { + offset 
= add_item(mean_mode, offset, s->integrated); + p += 1.0; + } + + if (dynamic_mode & DM_SHORTTERM) { + offset = add_item(mean_mode, offset, s->i3000); + p += 1.0; + } + + if (s->dynamic_mode & DM_MOMENTARY) { + offset = add_item(mean_mode, offset, s->i400); + p += 1.0; + } + + switch (s->mean_mode) { + case MM_MAXIMUM: + break; + case MM_GEOMETRIC: + offset = -pow(fabs(offset), 1.0 / p); + break; + case MM_HARMONIC: + offset = p / offset; + break; + case MM_ARITHMETIC: + offset /= p; + break; + } + + return offset; +} + +static int loudnorm_filter_frame(AVFilterLink *inlink, AVFrame *in) +{ + AVFilterContext *ctx = inlink->dst; + AVFilterLink *outlink = ctx->outputs[0]; + LoudNormContext *s = ctx->priv; + EBUR128Context *r128_out = &s->r128_out; + EBUR128Context *r128_in = &s->r128_in; + const int nb_channels = s->nb_channels; + int nb_samples = in ? in->nb_samples : 0; + const double *samples = in ? (const double *)in->data[0] : NULL; + AVFrame *out; + int ret; + + if (in) { + ret = true_peaks_ebur128(r128_in, (const uint8_t **)in->data, nb_samples); + if (ret < 0) + return ret; + } + + for (int idx_insample = r128_in->idx_insample; idx_insample < nb_samples; idx_insample++) { + process_ebur128(r128_in, samples + idx_insample * nb_channels); + if (++r128_in->sample_count == inlink->sample_rate / 10) { + double peak; + + ebur128_loudness(inlink, r128_in, &s->i400, &s->i3000, &s->integrated, &peak); + memmove(&s->peaks[0], &s->peaks[1], sizeof(s->peaks) - sizeof(s->peaks[0])); + s->peaks[FIFO_SIZE-1] = peak; + } + } + + r128_in->idx_insample = 0; + s->insamples = NULL; + + if (s->linear_mode) { + out = ff_get_audio_buffer(outlink, nb_samples); + if (!out) { + if (s->linear_mode) + av_frame_free(&in); + return AVERROR(ENOMEM); + } else { + const double *src = (const double *)in->data[0]; + double *dst = (double *)out->data[0]; + const double offset = s->offset; + + for (int n = 0; n < nb_samples * nb_channels; n++) + dst[n] = src[n] * offset; + } + } else { + 
av_frame_free(&s->frames[0]); + memmove(&s->frames[0], &s->frames[1], sizeof(s->frames) - sizeof(s->frames[0])); + s->frames[FIFO_SIZE-1] = in; + in = s->frames[0]; + if (in) { + nb_samples = in->nb_samples; + out = ff_get_audio_buffer(outlink, nb_samples); + if (!out) { + return AVERROR(ENOMEM); + } else { + const double release = s->release_coeff; + const double attack = s->attack_coeff; + const double *src = (const double *)in->data[0]; + double *dst = (double *)out->data[0]; + const double measured = get_loudness(s, s->dynamic_mode, s->mean_mode); + const double limit = s->target_tp - fmax(s->peaks[0], s->peaks[1]); + const double rangemin = -s->rangedown; + const double rangemax = fmin(s->rangeup, limit); + const double target = av_clipd(s->target_i - measured, rangemin, rangemax); + const double new_offset = pow(10., target / 20.); + double prev_offset = s->prev_offset; + + for (int n = 0; n < nb_samples * nb_channels; n += nb_channels) { + const double f = (new_offset > prev_offset) * attack + (new_offset <= prev_offset) * release; + const double offset = f * new_offset + (1.0 - f) * prev_offset; + for (int c = n; c < n + nb_channels; c++) + dst[c] = src[c] * offset; + prev_offset = offset; + } + + s->prev_offset = prev_offset; + } + } else { + ff_filter_set_ready(ctx, 100); + return 0; + } + } + + ret = true_peaks_ebur128(r128_out, (const uint8_t **)out->data, nb_samples); + if (ret < 0) + return ret; + + samples = (const double *)out->data[0]; + for (int idx_insample = r128_out->idx_insample; idx_insample < nb_samples; idx_insample++) { + process_ebur128(r128_out, samples + idx_insample * nb_channels); + if (++r128_out->sample_count == inlink->sample_rate / 10) { + double loudness_400, loudness_3000, loudness_integrated, peak; + ebur128_loudness(inlink, r128_out, &loudness_400, &loudness_3000, &loudness_integrated, &peak); + } + } + + r128_out->idx_insample = 0; + av_frame_copy_props(out, in); + if (s->linear_mode) + av_frame_free(&in); + + return 
ff_filter_frame(outlink, out); +} + +static int loudnorm_activate(AVFilterContext *ctx) +{ + AVFilterLink *outlink = ctx->outputs[0]; + AVFilterLink *inlink = ctx->inputs[0]; + LoudNormContext *s = ctx->priv; + int ret, status; + + FF_FILTER_FORWARD_STATUS_BACK(outlink, inlink); + + if (!s->insamples && !s->eof) { + AVFrame *in; + + ret = ff_inlink_consume_samples(inlink, s->nb_samples, s->nb_samples, &in); + if (ret < 0) + return ret; + if (ret > 0) + s->insamples = in; + } + + if (s->insamples) + return loudnorm_filter_frame(inlink, s->insamples); + + if (!s->eof && ff_inlink_acknowledge_status(inlink, &status, &s->eof_pts)) { + if (status == AVERROR_EOF) + s->eof = 1; + } + + if (s->eof && !s->frames[0]) + ff_outlink_set_status(outlink, AVERROR_EOF, s->eof_pts); + + if (s->eof) + return loudnorm_filter_frame(inlink, NULL); + + FF_FILTER_FORWARD_WANTED(outlink, inlink); + + return ret; +} + +#undef OFFSET +#undef FLAGS + +#define OFFSET(x) offsetof(LoudNormContext, x) +#define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_FILTERING_PARAM + +static const AVOption loudnorm_options[] = { + { "I", "set integrated loudness target", OFFSET(target_i), AV_OPT_TYPE_DOUBLE, {.dbl = -24.}, -70., -5., FLAGS }, + { "i", "set integrated loudness target", OFFSET(target_i), AV_OPT_TYPE_DOUBLE, {.dbl = -24.}, -70., -5., FLAGS }, + { "LRA", "set loudness range target", OFFSET(target_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 7.}, 1., 50., FLAGS }, + { "lra", "set loudness range target", OFFSET(target_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 7.}, 1., 50., FLAGS }, + { "TP", "set maximum true peak", OFFSET(target_tp), AV_OPT_TYPE_DOUBLE, {.dbl = -2.}, -9., 0., FLAGS }, + { "tp", "set maximum true peak", OFFSET(target_tp), AV_OPT_TYPE_DOUBLE, {.dbl = -2.}, -9., 0., FLAGS }, + { "measured_I", "measured IL of input file", OFFSET(measured_i), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, -99., 0., FLAGS }, + { "measured_i", "measured IL of input file", OFFSET(measured_i), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, -99., 0., 
FLAGS }, + { "measured_LRA", "measured LRA of input file", OFFSET(measured_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, 0., 99., FLAGS }, + { "measured_lra", "measured LRA of input file", OFFSET(measured_lra), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, 0., 99., FLAGS }, + { "measured_TP", "measured true peak of input file", OFFSET(measured_tp), AV_OPT_TYPE_DOUBLE, {.dbl = 99.}, -99., 99., FLAGS }, + { "measured_tp", "measured true peak of input file", OFFSET(measured_tp), AV_OPT_TYPE_DOUBLE, {.dbl = 99.}, -99., 99., FLAGS }, + { "measured_thresh", "measured threshold of input file", OFFSET(measured_thresh), AV_OPT_TYPE_DOUBLE, {.dbl = -70.}, -99., 0., FLAGS }, + { "offset", "set offset gain", OFFSET(offset), AV_OPT_TYPE_DOUBLE, {.dbl = 0.}, -99., 99., FLAGS }, + { "linear", "normalize linearly if possible", OFFSET(linear_mode), AV_OPT_TYPE_BOOL, {.i64 = 1}, 0, 1, FLAGS }, + { "dual_mono", "treat mono input as dual-mono", OFFSET(dual_mono), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, FLAGS }, + { "print_format", "set print format for stats", OFFSET(print_format), AV_OPT_TYPE_INT, {.i64 = NONE}, NONE, PF_NB -1, FLAGS, "print_format" }, + { "none", 0, 0, AV_OPT_TYPE_CONST, {.i64 = NONE}, 0, 0, FLAGS, "print_format" }, + { "json", 0, 0, AV_OPT_TYPE_CONST, {.i64 = JSON}, 0, 0, FLAGS, "print_format" }, + { "summary", 0, 0, AV_OPT_TYPE_CONST, {.i64 = SUMMARY}, 0, 0, FLAGS, "print_format" }, + { "dynamic_mode", "set dynamic mode", OFFSET(dynamic_mode), AV_OPT_TYPE_FLAGS, {.i64 = DM_INTEGRATED|DM_SHORTTERM},0,INT32_MAX,FLAGS,"dynamic_mode" }, + { "i", "integrated", 0, AV_OPT_TYPE_CONST, {.i64 = DM_INTEGRATED}, 0, 0, FLAGS, "dynamic_mode" }, + { "m", "momentary", 0, AV_OPT_TYPE_CONST, {.i64 = DM_MOMENTARY}, 0, 0, FLAGS, "dynamic_mode" }, + { "s", "shortterm", 0, AV_OPT_TYPE_CONST, {.i64 = DM_SHORTTERM}, 0, 0, FLAGS, "dynamic_mode" }, + { "mean_mode", "set mean mode", OFFSET(mean_mode), AV_OPT_TYPE_INT, {.i64 = MM_GEOMETRIC}, 0, MM_MODES-1, FLAGS,"mean_mode" }, + { "a", "arithmetic", 0, 
AV_OPT_TYPE_CONST, {.i64 = MM_ARITHMETIC}, 0, 0, FLAGS, "mean_mode" }, + { "h", "harmonic", 0, AV_OPT_TYPE_CONST, {.i64 = MM_HARMONIC}, 0, 0, FLAGS, "mean_mode" }, + { "g", "geometric", 0, AV_OPT_TYPE_CONST, {.i64 = MM_GEOMETRIC}, 0, 0, FLAGS, "mean_mode" }, + { "m", "maximum", 0, AV_OPT_TYPE_CONST, {.i64 = MM_MAXIMUM}, 0, 0, FLAGS, "mean_mode" }, + { "rangeup", "set max expansion", OFFSET(rangeup), AV_OPT_TYPE_DOUBLE, {.dbl = 0}, 0, 70, FLAGS }, + { "rangedown", "set max compression", OFFSET(rangedown), AV_OPT_TYPE_DOUBLE, {.dbl = 70}, 0, 70, FLAGS }, + { "attack", "set attack", OFFSET(attack), AV_OPT_TYPE_DOUBLE, {.dbl = 1}, 1, 2000, FLAGS }, + { "release", "set release", OFFSET(release), AV_OPT_TYPE_DOUBLE, {.dbl = 1}, 1, 2000, FLAGS }, + { NULL } +}; + +AVFILTER_DEFINE_CLASS(loudnorm); + +static const AVFilterPad loudnorm_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_AUDIO, + .config_props = loudnorm_config_input, + }, +}; + +static const AVFilterPad loudnorm_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_AUDIO, + .config_props = loudnorm_config_output, + }, +}; + +const AVFilter ff_af_loudnorm = { + .name = "loudnorm", + .description = NULL_IF_CONFIG_SMALL("EBU R128 loudness normalization"), + .priv_size = sizeof(LoudNormContext), + .priv_class = &loudnorm_class, + .init = loudnorm_init, + .activate = loudnorm_activate, + .uninit = loudnorm_uninit, + FILTER_INPUTS(loudnorm_inputs), + FILTER_OUTPUTS(loudnorm_outputs), + FILTER_QUERY_FUNC(loudnorm_query_formats), +}; -- 2.42.0 [-- Attachment #3: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".