From: Marton Balint <cus@passwd.hu>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] avfilter: merge loudnorm filter functionality into f_ebur128.c
Date: Sun, 19 Nov 2023 22:55:09 +0100 (CET)
Message-ID: <f09a26eb-c825-5f4-20a2-2788558182bf@passwd.hu>
In-Reply-To: <CAPYw7P4JMu6J6dWMo=fy4S95=46m_j1SO_aKMzTh1CzQj3E0Sw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2569 bytes --]

On Sun, 19 Nov 2023, Paul B Mahol wrote:

> On Fri, Nov 17, 2023 at 7:38 AM Kyle Swanson <k@ylo.ph> wrote:
>
>> Hi,
>>
>> On Wed, Nov 15, 2023 at 12:39 PM Paul B Mahol <onemda@gmail.com> wrote:
>> >
>> > Attached.
>>
>> Only had a few minutes to look at this. Seems like more than just
>> merging two filters, I see a bunch of new filter options for example.
>> Can you explain?
>>
>
> The linear mode and scanning, both input to filter and filter output itself
> should give similar results.
> The dynamic mode now actually can be configured how aggressively it will
> expand / compress audio.
> Because current state of filter have numerous issues:
>
> - using unmaintained old libebur128 module, when same functionality is
> already available in existing filter.
> - code duplication and functionality duplication due the above
> - buggy limiter - causing clipped samples randomly
> - buggy first and final frame filtering
> - over-complicated flow path for dynamic code in filter
> - excessive compressing of audio dynamic range, causing extreme smaller
> LRU from output audio
> - and probably more that I forgot
>
> Some options from this patch can be probably removed, like attack/release
> options, and just use defaults as currently in patch.

Previously ebur128 functionality was decoupled from specific filters, so
there was a chance that multiple filters could use it. Unfortunately
f_ebur128.c was never migrated to use the internal ebur128 lib; as far as
I remember, the maintainer rejected the idea for some reason back then.

IMHO having some generic ebur128 functionality would be preferable. I have
an old patch, for example, which adds EBUR128 mode to af_dynaudnorm; see
attached for reference. It looks much cleaner than af_loudnorm, which was
always a bit overcomplicated and slightly buggy, as you mentioned.

So please consider two things:

- Can you keep some generic ebur128 functionality which can easily be
  reused by multiple filters? I don't mind if it is the old code from
  ebur128 lib or the current code from f_ebur128, but it should be reusable
  internal ff_ functions.

- Does it make sense to maintain a separate loudnorm filter for EBUR128
  loudness, or can it be integrated into af_dynaudnorm? I kind of think
  that having this as an option of af_dynaudnorm would be cleaner, at least
  for the dynamic normalization functionality. For the linear mode, well,
  we already have compressor filters, so I am not sure if that mode is
  worth keeping. But maybe it is easier for the end user, I don't know.

Thanks,
Marton

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch; name=af_dynaudnorm-loudness-poc.patch, Size: 6849 bytes --]

commit df4e283d7b2aa4b4de6e405e5dcbbae38d053b9f
Author: Marton Balint <cus@passwd.hu>
Date:   Sun Oct 16 20:45:51 2016 +0200

    lavfi/af_dynaudnorm: add support for momentary loudness based normalization

    Signed-off-by: Marton Balint <cus@passwd.hu>

diff --git a/doc/filters.texi b/doc/filters.texi
index 604e44d569..9d05d7db94 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -3212,6 +3212,22 @@ factor is defined as the factor that would result in exactly that RMS value.
 Note, however, that the maximum local gain factor is still restricted by the
 frame's highest magnitude sample, in order to prevent clipping.
 
+@item l
+Set the target loudness in LUFS. In range from -70.0 to 0. Default is 0.0 -
+disabled. By default, the Dynamic Audio Normalizer performs "peak"
+normalization. This means that the maximum local gain factor for each frame is
+defined (only) by the frame's highest magnitude sample. This way, the samples
+can be amplified as much as possible without exceeding the maximum signal
+level, i.e. without clipping. Optionally, however, the Dynamic Audio Normalizer
+can also take into account the frame's perceived momentary loudness which is
+measured based on the EBU R128 recommendation. Consequently, by adjusting all
+frames to a constant loudness value, a uniform "perceived loudness" can be
+established. Note, however, that loudness is measured without any kind of
+gating, therefore the integrated loudness as defined by EBU R128 will be
+usually less than the target level, depending on your content. Also note, that
+the maximum local gain factor is still restricted by the frame's highest
+magnitude sample, in order to prevent clipping.
+
 @item n
 Enable channels coupling. By default is enabled.
 By default, the Dynamic Audio Normalizer will amplify all channels by the same
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 455c809b15..7c3238edd3 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -103,7 +103,7 @@ OBJS-$(CONFIG_CRYSTALIZER_FILTER) += af_crystalizer.o
 OBJS-$(CONFIG_DCSHIFT_FILTER) += af_dcshift.o
 OBJS-$(CONFIG_DEESSER_FILTER) += af_deesser.o
 OBJS-$(CONFIG_DRMETER_FILTER) += af_drmeter.o
-OBJS-$(CONFIG_DYNAUDNORM_FILTER) += af_dynaudnorm.o
+OBJS-$(CONFIG_DYNAUDNORM_FILTER) += af_dynaudnorm.o ebur128.o
 OBJS-$(CONFIG_EARWAX_FILTER) += af_earwax.o
 OBJS-$(CONFIG_EBUR128_FILTER) += f_ebur128.o
 OBJS-$(CONFIG_EQUALIZER_FILTER) += af_biquads.o
diff --git a/libavfilter/af_dynaudnorm.c b/libavfilter/af_dynaudnorm.c
index fd430884d7..67db1dcfc2 100644
--- a/libavfilter/af_dynaudnorm.c
+++ b/libavfilter/af_dynaudnorm.c
@@ -37,6 +37,8 @@
 #include "filters.h"
 #include "internal.h"
 
+#include "ebur128.h"
+
 typedef struct cqueue {
     double *elements;
     int size;
@@ -59,6 +61,7 @@ typedef struct DynamicAudioNormalizerContext {
     double peak_value;
     double max_amplification;
     double target_rms;
+    double target_lufs;
     double compress_factor;
     double *prev_amplification_factor;
     double *dc_correction_value;
@@ -76,6 +79,8 @@ typedef struct DynamicAudioNormalizerContext {
     cqueue **gain_history_smoothed;
 
     cqueue *is_enabled;
+    FFEBUR128State **r128;
+    int nb_r128;
 } DynamicAudioNormalizerContext;
 
 #define OFFSET(x) offsetof(DynamicAudioNormalizerContext, x)
@@ -87,6 +92,7 @@ static const AVOption dynaudnorm_options[] = {
     { "p", "set the peak value", OFFSET(peak_value), AV_OPT_TYPE_DOUBLE, {.dbl = 0.95}, 0.0, 1.0, FLAGS },
     { "m", "set the max amplification", OFFSET(max_amplification), AV_OPT_TYPE_DOUBLE, {.dbl = 10.0}, 1.0, 100.0, FLAGS },
     { "r", "set the target RMS", OFFSET(target_rms), AV_OPT_TYPE_DOUBLE, {.dbl = 0.0}, 0.0, 1.0, FLAGS },
+    { "l", "set the target LUFS", OFFSET(target_lufs), AV_OPT_TYPE_DOUBLE, {.dbl = 0.0},-70.0, 0.0, FLAGS },
     { "n", "set channel coupling", OFFSET(channels_coupled), AV_OPT_TYPE_BOOL, {.i64 = 1}, 0, 1, FLAGS },
     { "c", "set DC correction", OFFSET(dc_correction), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, FLAGS },
     { "b", "set alternative boundary mode", OFFSET(alt_boundary_mode), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, FLAGS },
@@ -290,6 +296,10 @@ static av_cold void uninit(AVFilterContext *ctx)
     av_freep(&s->weights);
 
     ff_bufqueue_discard_all(&s->queue);
+
+    for (c = 0; c < s->nb_r128; c++)
+        ff_ebur128_destroy(&s->r128[c]);
+    av_freep(&s->r128);
 }
 
 static int config_input(AVFilterLink *inlink)
@@ -338,6 +348,20 @@ static int config_input(AVFilterLink *inlink)
     s->channels = inlink->channels;
     s->delay = s->filter_size;
 
+    if (s->target_lufs < -DBL_EPSILON) {
+        s->nb_r128 = s->channels_coupled ? 1 : inlink->channels;
+        s->r128 = av_mallocz_array(s->nb_r128, sizeof(*s->r128));
+        if (!s->r128) {
+            s->nb_r128 = 0;
+            return AVERROR(ENOMEM);
+        }
+        for (c = 0; c < s->nb_r128; c++) {
+            s->r128[c] = ff_ebur128_init(s->channels_coupled ? inlink->channels : 1, inlink->sample_rate, s->frame_len_msec, FF_EBUR128_MODE_M);
+            if (!s->r128[c])
+                return AVERROR(ENOMEM);
+        }
+    }
+
     return 0;
 }
 
@@ -380,6 +404,17 @@ static double find_peak_magnitude(AVFrame *frame, int channel)
     return max;
 }
 
+static double compute_frame_lufs_gain(DynamicAudioNormalizerContext *s, AVFrame *frame, int channel)
+{
+    double lufs;
+
+    channel = FFMAX(0, channel);
+    ff_ebur128_add_frames_planar_double(s->r128[channel], (const double **)frame->extended_data + channel, frame->nb_samples, 1);
+    ff_ebur128_loudness_window(s->r128[channel], s->frame_len_msec, &lufs);
+
+    return pow(10.0, (s->target_lufs - lufs) / 20.0);
+}
+
 static double compute_frame_rms(AVFrame *frame, int channel)
 {
     double rms_value = 0.0;
@@ -412,7 +447,8 @@ static double get_max_local_gain(DynamicAudioNormalizerContext *s, AVFrame *fram
 {
     const double maximum_gain = s->peak_value / find_peak_magnitude(frame, channel);
     const double rms_gain = s->target_rms > DBL_EPSILON ? (s->target_rms / compute_frame_rms(frame, channel)) : DBL_MAX;
-    return bound(s->max_amplification, FFMIN(maximum_gain, rms_gain));
+    const double lufs_gain = s->target_lufs < -DBL_EPSILON ? compute_frame_lufs_gain(s, frame, channel) : DBL_MAX;
+    return bound(s->max_amplification, FFMIN(maximum_gain, FFMIN(lufs_gain, rms_gain)));
 }
 
 static double minimum_filter(cqueue *q)

[-- Attachment #3: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".