From: realies via ffmpeg-devel
To: ffmpeg-devel@ffmpeg.org
Cc: realies
Date: Tue, 13 Jan 2026 14:21:06 -0000
Message-ID: <176831406712.25.7167881297817721570@4457048688e7>
Subject: [FFmpeg-devel] [PR] avfilter/af_afade: add ring buffer for memory-efficient crossfade (PR #21448)

PR #21448 opened by realies
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21448
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21448.patch

This patch implements a ring-buffer approach for acrossfade that processes crossfades incrementally, addressing memory-efficiency concerns for long crossfades.

## Key changes

- Added a ring buffer that holds exactly nb_samples (the crossfade duration)
- Process the crossfade frame by frame instead of buffering everything and outputting one giant frame
- For overlap mode: delay output by nb_samples to ensure correct timing
- Memory no longer scales with INPUT length - only with crossfade duration
- Removed the 60-second duration limit

## Memory profiling results

With a 60 s crossfade duration:

| Input Length | Old Memory | New Memory |
|--------------|------------|------------|
| 2 minutes    | ~80 MB     | 34 MB      |
| 10 minutes   | ~200 MB    | 34 MB      |
| 30 minutes   | ~500 MB    | 34 MB      |

Memory is now constant regardless of input length.

Note: Memory still scales with the crossfade duration itself (~0.1 MB per second of crossfade). This seems unavoidable with the current EOF-triggered design - we need to buffer the last nb_samples from input 0 because we don't know which samples are "last" until EOF.

From 4f8d569c72643d6047f01e8bf8a39badf6af9dd1 Mon Sep 17 00:00:00 2001
From: realies
Date: Mon, 12 Jan 2026 21:15:49 +0200
Subject: [PATCH] avfilter/af_afade: add ring buffer for memory-efficient crossfade

Implement lazy crossfade processing using a ring buffer to cap memory
usage at O(nb_samples) regardless of total audio length. Previously,
memory scaled with the total input length.
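For illustration only (this is not the patch's code or the FFmpeg API; mono float, hypothetical names such as Ring and ring_push), the delay-by-nb_samples buffering used in overlap mode can be sketched as a standalone program:

#include <stdio.h>

#define N 4            /* crossfade length in samples (tiny, for illustration) */

typedef struct {
    float buf[N];      /* holds at most the last N input samples */
    long  write_pos;   /* total samples written so far */
    long  filled;      /* valid samples currently held (<= N) */
} Ring;

/* Push one sample. Once the ring already holds N samples, the oldest sample
 * is evicted into *out: that is the "excess" that is safe to emit right away,
 * delayed by exactly N samples. Returns 1 if *out was set. */
static int ring_push(Ring *r, float in, float *out)
{
    int emit = 0;
    if (r->filled == N) {
        *out = r->buf[r->write_pos % N];   /* oldest buffered sample */
        emit = 1;
    } else {
        r->filled++;
    }
    r->buf[r->write_pos % N] = in;
    r->write_pos++;
    return emit;
}

int main(void)
{
    Ring r = {0};
    float out;

    /* Feed 10 input samples: output begins only after N samples are buffered,
     * so each emitted sample lags its arrival by exactly N samples. */
    for (int i = 0; i < 10; i++)
        if (ring_push(&r, (float)i, &out))
            printf("in=%d out=%g\n", i, out);

    /* At end of input the ring still holds exactly the last N samples
     * (6..9 here), which is what the crossfade against input 1 consumes. */
    for (long i = 0; i < r.filled; i++)
        printf("tail=%g\n", r.buf[(r.write_pos - r.filled + i) % N]);
    return 0;
}

With N = 4 and 10 input samples this prints samples 0..5 during input and leaves 6..9 buffered, mirroring how the filter keeps exactly nb_samples of input 0 available for the crossfade while passing the rest through with a fixed delay.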
Key changes: - Add ring buffer (ring_buf) to AudioFadeContext for storing last nb_samples from input 0 - For overlap mode: delay output by nb_samples to ensure correct crossfade timing - For non-overlap mode: pass through immediately while maintaining ring buffer - Process crossfade incrementally frame-by-frame instead of buffering everything - Remove 60-second duration limit (now INT64_MAX/2) - Add offset/total parameters to crossfade macros for position-based gain calculation This addresses Paul B Mahol's suggestion to use "lazy logic" like afade does, processing samples incrementally rather than buffering the entire crossfade duration. Memory usage is now capped at approximately: nb_samples * bytes_per_sample * channels For a 5-minute crossfade at 48kHz stereo float, this is ~23MB instead of scaling with input length. --- libavfilter/af_afade.c | 649 +++++++++++++++++++++++++++++++---------- 1 file changed, 492 insertions(+), 157 deletions(-) diff --git a/libavfilter/af_afade.c b/libavfilter/af_afade.c index 055f234f7c..686ad77729 100644 --- a/libavfilter/af_afade.c +++ b/libavfilter/af_afade.c @@ -47,6 +47,14 @@ typedef struct AudioFadeContext { int64_t pts; int xfade_idx; + /* Ring buffer for lazy crossfade (memory-efficient) */ + AVFrame *ring_buf; /* Ring buffer holding last nb_samples from input 0 */ + int64_t ring_write_pos; /* Write position in ring buffer (circular, 0 to nb_samples-1) */ + int64_t ring_filled; /* Number of valid samples in ring buffer */ + int64_t crossfade_pos; /* Current read position within crossfade (0 to nb_samples) */ + int crossfade_active; /* Flag: currently in crossfade processing mode */ + int passthrough_done; /* Flag: input 0 EOF reached, passthrough complete */ + void (*fade_samples)(uint8_t **dst, uint8_t * const *src, int nb_samples, int channels, int direction, int64_t start, int64_t range, int curve, @@ -56,7 +64,8 @@ typedef struct AudioFadeContext { void (*crossfade_samples)(uint8_t **dst, uint8_t * const *cf0, uint8_t * const *cf1, int nb_samples, int channels, - int curve0, int curve1); + int curve0, int curve1, + int64_t offset, int64_t total); } AudioFadeContext; enum CurveType { NONE = -1, TRI, QSIN, ESIN, HSIN, LOG, IPAR, QUA, CUB, SQU, CBR, PAR, EXP, IQSIN, IHSIN, DESE, DESI, LOSI, SINC, ISINC, QUAT, QUATR, QSIN2, HSIN2, NB_CURVES }; @@ -455,10 +464,10 @@ const FFFilter ff_af_afade = { static const AVOption acrossfade_options[] = { { "inputs", "set number of input files to cross fade", OFFSET(nb_inputs), AV_OPT_TYPE_INT, {.i64 = 2}, 1, INT32_MAX, FLAGS }, { "n", "set number of input files to cross fade", OFFSET(nb_inputs), AV_OPT_TYPE_INT, {.i64 = 2}, 1, INT32_MAX, FLAGS }, - { "nb_samples", "set number of samples for cross fade duration", OFFSET(nb_samples), AV_OPT_TYPE_INT64, {.i64 = 44100}, 1, INT32_MAX/10, FLAGS }, - { "ns", "set number of samples for cross fade duration", OFFSET(nb_samples), AV_OPT_TYPE_INT64, {.i64 = 44100}, 1, INT32_MAX/10, FLAGS }, - { "duration", "set cross fade duration", OFFSET(duration), AV_OPT_TYPE_DURATION, {.i64 = 0 }, 0, 60000000, FLAGS }, - { "d", "set cross fade duration", OFFSET(duration), AV_OPT_TYPE_DURATION, {.i64 = 0 }, 0, 60000000, FLAGS }, + { "nb_samples", "set number of samples for cross fade duration", OFFSET(nb_samples), AV_OPT_TYPE_INT64, {.i64 = 44100}, 1, INT64_MAX/2, FLAGS }, + { "ns", "set number of samples for cross fade duration", OFFSET(nb_samples), AV_OPT_TYPE_INT64, {.i64 = 44100}, 1, INT64_MAX/2, FLAGS }, + { "duration", "set cross fade duration", OFFSET(duration), 
AV_OPT_TYPE_DURATION, {.i64 = 0 }, 0, INT64_MAX/2, FLAGS }, + { "d", "set cross fade duration", OFFSET(duration), AV_OPT_TYPE_DURATION, {.i64 = 0 }, 0, INT64_MAX/2, FLAGS }, { "overlap", "overlap 1st stream end with 2nd stream start", OFFSET(overlap), AV_OPT_TYPE_BOOL, {.i64 = 1 }, 0, 1, FLAGS }, { "o", "overlap 1st stream end with 2nd stream start", OFFSET(overlap), AV_OPT_TYPE_BOOL, {.i64 = 1 }, 0, 1, FLAGS }, { "curve1", "set fade curve type for 1st stream", OFFSET(curve), AV_OPT_TYPE_INT, {.i64 = TRI }, NONE, NB_CURVES - 1, FLAGS, .unit = "curve" }, @@ -498,13 +507,15 @@ AVFILTER_DEFINE_CLASS(acrossfade); static void crossfade_samples_## name ##p(uint8_t **dst, uint8_t * const *cf0, \ uint8_t * const *cf1, \ int nb_samples, int channels, \ - int curve0, int curve1) \ + int curve0, int curve1, \ + int64_t offset, int64_t total) \ { \ int i, c; \ \ for (i = 0; i < nb_samples; i++) { \ - double gain0 = fade_gain(curve0, nb_samples - 1 - i, nb_samples,0.,1.);\ - double gain1 = fade_gain(curve1, i, nb_samples, 0., 1.); \ + int64_t pos = offset + i; \ + double gain0 = fade_gain(curve0, total - 1 - pos, total, 0., 1.); \ + double gain1 = fade_gain(curve1, pos, total, 0., 1.); \ for (c = 0; c < channels; c++) { \ type *d = (type *)dst[c]; \ const type *s0 = (type *)cf0[c]; \ @@ -519,7 +530,8 @@ static void crossfade_samples_## name ##p(uint8_t **dst, uint8_t * const *cf0, \ static void crossfade_samples_## name (uint8_t **dst, uint8_t * const *cf0, \ uint8_t * const *cf1, \ int nb_samples, int channels, \ - int curve0, int curve1) \ + int curve0, int curve1, \ + int64_t offset, int64_t total) \ { \ type *d = (type *)dst[0]; \ const type *s0 = (type *)cf0[0]; \ @@ -527,8 +539,9 @@ static void crossfade_samples_## name (uint8_t **dst, uint8_t * const *cf0, \ int i, c, k = 0; \ \ for (i = 0; i < nb_samples; i++) { \ - double gain0 = fade_gain(curve0, nb_samples - 1-i,nb_samples,0.,1.);\ - double gain1 = fade_gain(curve1, i, nb_samples, 0., 1.); \ + int64_t pos = offset + i; \ + double gain0 = fade_gain(curve0, total - 1 - pos, total, 0., 1.); \ + double gain1 = fade_gain(curve1, pos, total, 0., 1.); \ for (c = 0; c < channels; c++, k++) \ d[k] = s0[k] * gain0 + s1[k] * gain1; \ } \ @@ -557,143 +570,308 @@ static int pass_frame(AVFilterLink *inlink, AVFilterLink *outlink, int64_t *pts) return ff_filter_frame(outlink, in); } -static int pass_samples(AVFilterLink *inlink, AVFilterLink *outlink, unsigned nb_samples, int64_t *pts) +/* Copy samples from frame to ring buffer (circular overwrite) */ +static void copy_to_ring_buffer(AudioFadeContext *s, AVFrame *frame, int nb_channels, int is_planar) { - AVFrame *in; - int ret = ff_inlink_consume_samples(inlink, nb_samples, nb_samples, &in); - if (ret < 0) - return ret; - av_assert1(ret); - in->pts = *pts; - *pts += av_rescale_q(in->nb_samples, - (AVRational){ 1, outlink->sample_rate }, outlink->time_base); - return ff_filter_frame(outlink, in); + int samples_to_copy = frame->nb_samples; + int bytes_per_sample = av_get_bytes_per_sample(frame->format); + + for (int i = 0; i < samples_to_copy; i++) { + int64_t dst_pos = s->ring_write_pos % s->nb_samples; + + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(s->ring_buf->extended_data[c] + dst_pos * bytes_per_sample, + frame->extended_data[c] + i * bytes_per_sample, + bytes_per_sample); + } + } else { + memcpy(s->ring_buf->extended_data[0] + dst_pos * nb_channels * bytes_per_sample, + frame->extended_data[0] + i * nb_channels * bytes_per_sample, + nb_channels * bytes_per_sample); + } + + 
s->ring_write_pos++; + } + s->ring_filled = FFMIN(s->ring_filled + samples_to_copy, s->nb_samples); } -static int pass_crossfade(AVFilterContext *ctx, const int idx0, const int idx1) +/* Read samples from ring buffer starting at crossfade_pos (circular read) */ +static void read_from_ring_buffer(AudioFadeContext *s, uint8_t **dst, int nb_samples, + int nb_channels, int is_planar, int bytes_per_sample) +{ + /* The ring buffer contains the last ring_filled samples from input 0. + * We need to read starting from crossfade_pos within those samples. + * ring_write_pos points to where the NEXT write would go, so the oldest + * valid sample is at (ring_write_pos - ring_filled) % nb_samples */ + int64_t oldest_pos = (s->ring_write_pos - s->ring_filled + s->nb_samples) % s->nb_samples; + int64_t read_start = (oldest_pos + s->crossfade_pos) % s->nb_samples; + + for (int i = 0; i < nb_samples; i++) { + int64_t src_pos = (read_start + i) % s->nb_samples; + + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(dst[c] + i * bytes_per_sample, + s->ring_buf->extended_data[c] + src_pos * bytes_per_sample, + bytes_per_sample); + } + } else { + memcpy(dst[0] + i * nb_channels * bytes_per_sample, + s->ring_buf->extended_data[0] + src_pos * nb_channels * bytes_per_sample, + nb_channels * bytes_per_sample); + } + } +} + +/* Process crossfade for non-overlap mode (fade-out then fade-in) */ +static int process_non_overlap_crossfade(AVFilterContext *ctx, const int idx0, const int idx1) { AudioFadeContext *s = ctx->priv; AVFilterLink *outlink = ctx->outputs[0]; - AVFrame *out, *cf[2] = { NULL }; - int ret; - AVFilterLink *in0 = ctx->inputs[idx0]; AVFilterLink *in1 = ctx->inputs[idx1]; - int queued_samples0 = ff_inlink_queued_samples(in0); - int queued_samples1 = ff_inlink_queued_samples(in1); + AVFrame *out, *cf = NULL; + int ret; - /* Limit to the relevant region */ - av_assert1(queued_samples0 <= s->nb_samples); - if (ff_outlink_get_status(in1) && idx1 < s->nb_inputs - 1) - queued_samples1 /= 2; /* reserve second half for next fade-out */ - queued_samples1 = FFMIN(queued_samples1, s->nb_samples); + /* Phase 1: Fade-out from ring buffer */ + if (s->crossfade_pos < s->ring_filled) { + int64_t remaining = s->ring_filled - s->crossfade_pos; + int process_samples = FFMIN(remaining, 4096); /* Process in chunks */ + int bytes_per_sample = av_get_bytes_per_sample(outlink->format); + int is_planar = av_sample_fmt_is_planar(outlink->format); + int nb_channels = outlink->ch_layout.nb_channels; - if (s->overlap) { - int nb_samples = FFMIN(queued_samples0, queued_samples1); - if (nb_samples < s->nb_samples) { - av_log(ctx, AV_LOG_WARNING, "Input %d duration (%d samples) " - "is shorter than crossfade duration (%"PRId64" samples), " - "crossfade will be shorter by %"PRId64" samples.\n", - queued_samples0 <= queued_samples1 ? 
idx0 : idx1, - nb_samples, s->nb_samples, s->nb_samples - nb_samples); - - if (queued_samples0 > nb_samples) { - ret = pass_samples(in0, outlink, queued_samples0 - nb_samples, &s->pts); - if (ret < 0) - return ret; - } - - if (!nb_samples) - return 0; /* either input was completely empty */ - } - - av_assert1(nb_samples > 0); - out = ff_get_audio_buffer(outlink, nb_samples); + out = ff_get_audio_buffer(outlink, process_samples); if (!out) return AVERROR(ENOMEM); - ret = ff_inlink_consume_samples(in0, nb_samples, nb_samples, &cf[0]); - if (ret < 0) { + /* Allocate temp buffer for ring buffer read */ + AVFrame *temp = ff_get_audio_buffer(outlink, process_samples); + if (!temp) { av_frame_free(&out); - return ret; - } - - ret = ff_inlink_consume_samples(in1, nb_samples, nb_samples, &cf[1]); - if (ret < 0) { - av_frame_free(&cf[0]); - av_frame_free(&out); - return ret; - } - - s->crossfade_samples(out->extended_data, cf[0]->extended_data, - cf[1]->extended_data, nb_samples, - out->ch_layout.nb_channels, s->curve, s->curve2); - out->pts = s->pts; - s->pts += av_rescale_q(nb_samples, - (AVRational){ 1, outlink->sample_rate }, outlink->time_base); - av_frame_free(&cf[0]); - av_frame_free(&cf[1]); - return ff_filter_frame(outlink, out); - } else { - if (queued_samples0 < s->nb_samples) { - av_log(ctx, AV_LOG_WARNING, "Input %d duration (%d samples) " - "is shorter than crossfade duration (%"PRId64" samples), " - "fade-out will be shorter by %"PRId64" samples.\n", - idx0, queued_samples0, s->nb_samples, - s->nb_samples - queued_samples0); - if (!queued_samples0) - goto fade_in; - } - - out = ff_get_audio_buffer(outlink, queued_samples0); - if (!out) return AVERROR(ENOMEM); - - ret = ff_inlink_consume_samples(in0, queued_samples0, queued_samples0, &cf[0]); - if (ret < 0) { - av_frame_free(&out); - return ret; } - s->fade_samples(out->extended_data, cf[0]->extended_data, cf[0]->nb_samples, - outlink->ch_layout.nb_channels, -1, cf[0]->nb_samples - 1, cf[0]->nb_samples, s->curve, 0., 1.); + read_from_ring_buffer(s, temp->extended_data, process_samples, + nb_channels, is_planar, bytes_per_sample); + + /* Apply fade-out */ + s->fade_samples(out->extended_data, temp->extended_data, process_samples, + nb_channels, -1, s->ring_filled - 1 - s->crossfade_pos, + s->ring_filled, s->curve, 0., 1.); + + s->crossfade_pos += process_samples; out->pts = s->pts; - s->pts += av_rescale_q(cf[0]->nb_samples, + s->pts += av_rescale_q(process_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base); - av_frame_free(&cf[0]); - ret = ff_filter_frame(outlink, out); - if (ret < 0) - return ret; - - fade_in: - if (queued_samples1 < s->nb_samples) { - av_log(ctx, AV_LOG_WARNING, "Input %d duration (%d samples) " - "is shorter than crossfade duration (%"PRId64" samples), " - "fade-in will be shorter by %"PRId64" samples.\n", - idx1, ff_inlink_queued_samples(in1), s->nb_samples, - s->nb_samples - queued_samples1); - if (!queued_samples1) - return 0; - } - - out = ff_get_audio_buffer(outlink, queued_samples1); - if (!out) - return AVERROR(ENOMEM); - - ret = ff_inlink_consume_samples(in1, queued_samples1, queued_samples1, &cf[1]); - if (ret < 0) { - av_frame_free(&out); - return ret; - } - - s->fade_samples(out->extended_data, cf[1]->extended_data, cf[1]->nb_samples, - outlink->ch_layout.nb_channels, 1, 0, cf[1]->nb_samples, s->curve2, 0., 1.); - out->pts = s->pts; - s->pts += av_rescale_q(cf[1]->nb_samples, - (AVRational){ 1, outlink->sample_rate }, outlink->time_base); - av_frame_free(&cf[1]); + av_frame_free(&temp); 
return ff_filter_frame(outlink, out); } + + /* Phase 2: Fade-in from input 1 */ + s->passthrough_done = 1; /* Mark fade-out complete */ + + if (!ff_inlink_queued_samples(in1)) { + if (ff_outlink_get_status(in1)) + return 0; /* Input 1 is empty */ + FF_FILTER_FORWARD_WANTED(outlink, in1); + return FFERROR_NOT_READY; + } + + ret = ff_inlink_consume_frame(in1, &cf); + if (ret < 0) + return ret; + if (!ret) { + FF_FILTER_FORWARD_WANTED(outlink, in1); + return FFERROR_NOT_READY; + } + + int64_t fadein_pos = s->crossfade_pos - s->ring_filled; /* Position in fade-in */ + int64_t fadein_remaining = s->nb_samples - fadein_pos; + + if (fadein_pos < s->nb_samples && fadein_remaining > 0) { + int process_samples = FFMIN(cf->nb_samples, fadein_remaining); + + out = ff_get_audio_buffer(outlink, cf->nb_samples); + if (!out) { + av_frame_free(&cf); + return AVERROR(ENOMEM); + } + + /* Apply fade-in to the portion within crossfade region */ + s->fade_samples(out->extended_data, cf->extended_data, process_samples, + outlink->ch_layout.nb_channels, 1, fadein_pos, + s->nb_samples, s->curve2, 0., 1.); + + /* Copy remainder unchanged if frame extends past crossfade */ + if (cf->nb_samples > process_samples) { + int bytes_per_sample = av_get_bytes_per_sample(outlink->format); + int is_planar = av_sample_fmt_is_planar(outlink->format); + int nb_channels = outlink->ch_layout.nb_channels; + + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(out->extended_data[c] + process_samples * bytes_per_sample, + cf->extended_data[c] + process_samples * bytes_per_sample, + (cf->nb_samples - process_samples) * bytes_per_sample); + } + } else { + memcpy(out->extended_data[0] + process_samples * nb_channels * bytes_per_sample, + cf->extended_data[0] + process_samples * nb_channels * bytes_per_sample, + (cf->nb_samples - process_samples) * nb_channels * bytes_per_sample); + } + } + + s->crossfade_pos += cf->nb_samples; + out->pts = s->pts; + s->pts += av_rescale_q(cf->nb_samples, + (AVRational){ 1, outlink->sample_rate }, outlink->time_base); + av_frame_free(&cf); + + /* Check if crossfade is complete */ + if (s->crossfade_pos >= s->ring_filled + s->nb_samples) { + s->crossfade_active = 0; + } + + return ff_filter_frame(outlink, out); + } + + /* Past crossfade region - pass through */ + s->crossfade_active = 0; + cf->pts = s->pts; + s->pts += av_rescale_q(cf->nb_samples, + (AVRational){ 1, outlink->sample_rate }, outlink->time_base); + return ff_filter_frame(outlink, cf); +} + +/* Process one frame of overlapping crossfade using ring buffer + input 1 */ +static int process_overlap_crossfade(AVFilterContext *ctx, const int idx1) +{ + AudioFadeContext *s = ctx->priv; + AVFilterLink *outlink = ctx->outputs[0]; + AVFilterLink *in1 = ctx->inputs[idx1]; + AVFrame *out, *cf1 = NULL; + int ret; + + /* Check if crossfade is complete */ + if (s->crossfade_pos >= s->ring_filled) { + s->crossfade_active = 0; + return 0; + } + + /* Get frame from input 1 */ + if (!ff_inlink_queued_samples(in1)) { + if (ff_outlink_get_status(in1)) { + /* Input 1 ended early - output remaining ring buffer with fade-out */ + int64_t remaining = s->ring_filled - s->crossfade_pos; + if (remaining <= 0) { + s->crossfade_active = 0; + return 0; + } + int process_samples = FFMIN(remaining, 4096); + int bytes_per_sample = av_get_bytes_per_sample(outlink->format); + int is_planar = av_sample_fmt_is_planar(outlink->format); + int nb_channels = outlink->ch_layout.nb_channels; + + out = ff_get_audio_buffer(outlink, process_samples); + if (!out) + return 
AVERROR(ENOMEM); + + AVFrame *temp = ff_get_audio_buffer(outlink, process_samples); + if (!temp) { + av_frame_free(&out); + return AVERROR(ENOMEM); + } + + read_from_ring_buffer(s, temp->extended_data, process_samples, + nb_channels, is_planar, bytes_per_sample); + + s->fade_samples(out->extended_data, temp->extended_data, process_samples, + nb_channels, -1, s->ring_filled - 1 - s->crossfade_pos, + s->ring_filled, s->curve, 0., 1.); + + s->crossfade_pos += process_samples; + out->pts = s->pts; + s->pts += av_rescale_q(process_samples, + (AVRational){ 1, outlink->sample_rate }, outlink->time_base); + av_frame_free(&temp); + return ff_filter_frame(outlink, out); + } + FF_FILTER_FORWARD_WANTED(outlink, in1); + return FFERROR_NOT_READY; + } + + ret = ff_inlink_consume_frame(in1, &cf1); + if (ret < 0) + return ret; + if (!ret) { + FF_FILTER_FORWARD_WANTED(outlink, in1); + return FFERROR_NOT_READY; + } + + int64_t remaining_crossfade = s->ring_filled - s->crossfade_pos; + int crossfade_samples = FFMIN(cf1->nb_samples, remaining_crossfade); + int passthrough_samples = cf1->nb_samples - crossfade_samples; + int bytes_per_sample = av_get_bytes_per_sample(outlink->format); + int is_planar = av_sample_fmt_is_planar(outlink->format); + int nb_channels = outlink->ch_layout.nb_channels; + + out = ff_get_audio_buffer(outlink, cf1->nb_samples); + if (!out) { + av_frame_free(&cf1); + return AVERROR(ENOMEM); + } + + if (crossfade_samples > 0) { + /* Allocate temp buffer for ring buffer samples */ + AVFrame *temp = ff_get_audio_buffer(outlink, crossfade_samples); + if (!temp) { + av_frame_free(&out); + av_frame_free(&cf1); + return AVERROR(ENOMEM); + } + + read_from_ring_buffer(s, temp->extended_data, crossfade_samples, + nb_channels, is_planar, bytes_per_sample); + + /* Apply crossfade */ + s->crossfade_samples(out->extended_data, temp->extended_data, + cf1->extended_data, crossfade_samples, + nb_channels, s->curve, s->curve2, + s->crossfade_pos, s->ring_filled); + + av_frame_free(&temp); + } + + /* Copy any passthrough samples after crossfade region */ + if (passthrough_samples > 0) { + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(out->extended_data[c] + crossfade_samples * bytes_per_sample, + cf1->extended_data[c] + crossfade_samples * bytes_per_sample, + passthrough_samples * bytes_per_sample); + } + } else { + memcpy(out->extended_data[0] + crossfade_samples * nb_channels * bytes_per_sample, + cf1->extended_data[0] + crossfade_samples * nb_channels * bytes_per_sample, + passthrough_samples * nb_channels * bytes_per_sample); + } + } + + s->crossfade_pos += crossfade_samples; + out->pts = s->pts; + s->pts += av_rescale_q(cf1->nb_samples, + (AVRational){ 1, outlink->sample_rate }, outlink->time_base); + + av_frame_free(&cf1); + + /* Check if crossfade is complete */ + if (s->crossfade_pos >= s->ring_filled) { + s->crossfade_active = 0; + } + + return ff_filter_frame(outlink, out); } static int activate(AVFilterContext *ctx) @@ -706,8 +884,8 @@ static int activate(AVFilterContext *ctx) FF_FILTER_FORWARD_STATUS_BACK_ALL(outlink, ctx); + /* Last active input - just pass through */ if (idx0 == s->nb_inputs - 1) { - /* Last active input, read until EOF */ if (ff_inlink_queued_frames(in0)) return pass_frame(in0, outlink, &s->pts); FF_FILTER_FORWARD_STATUS(in0, outlink); @@ -716,45 +894,195 @@ static int activate(AVFilterContext *ctx) } AVFilterLink *in1 = ctx->inputs[idx1]; - int queued_samples0 = ff_inlink_queued_samples(in0); - if (queued_samples0 > s->nb_samples) { - AVFrame *frame = 
ff_inlink_peek_frame(in0, 0); - if (queued_samples0 - s->nb_samples >= frame->nb_samples) - return pass_frame(in0, outlink, &s->pts); + + /* If crossfade is active, process it */ + if (s->crossfade_active) { + int ret; + if (s->overlap) { + ret = process_overlap_crossfade(ctx, idx1); + } else { + ret = process_non_overlap_crossfade(ctx, idx0, idx1); + } + + if (ret < 0) + return ret; + + /* If crossfade completed, move to next input pair */ + if (!s->crossfade_active) { + s->xfade_idx++; + s->passthrough_done = 0; + s->crossfade_pos = 0; + s->ring_filled = 0; + s->ring_write_pos = 0; + ff_filter_set_ready(ctx, 10); + } + return ret; } - /* Continue reading until EOF */ - if (ff_outlink_get_status(in0)) { - if (queued_samples0 > s->nb_samples) - return pass_samples(in0, outlink, queued_samples0 - s->nb_samples, &s->pts); - } else { + /* Allocate ring buffer if needed */ + if (!s->ring_buf) { + s->ring_buf = ff_get_audio_buffer(outlink, s->nb_samples); + if (!s->ring_buf) + return AVERROR(ENOMEM); + } + + /* Check if input 0 has reached EOF */ + int in0_eof = ff_outlink_get_status(in0); + + if (!in0_eof) { + /* Still receiving from input 0 */ + if (ff_inlink_queued_frames(in0)) { + AVFrame *frame; + int ret = ff_inlink_consume_frame(in0, &frame); + if (ret < 0) + return ret; + if (ret > 0) { + int bytes_per_sample = av_get_bytes_per_sample(outlink->format); + int is_planar = av_sample_fmt_is_planar(outlink->format); + int nb_channels = outlink->ch_layout.nb_channels; + + if (s->overlap) { + /* For overlap mode: delay output by nb_samples. + * We buffer samples in ring_buf and only output when we have + * more than nb_samples buffered (the excess is safe to output). + * + * Strategy: + * 1. Add new frame samples to ring buffer + * 2. If ring buffer has more than nb_samples, output the excess + * 3. 
Keep exactly nb_samples in ring buffer for crossfade + */ + int64_t total_after_add = s->ring_filled + frame->nb_samples; + + if (total_after_add <= s->nb_samples) { + /* Still filling up - just buffer, don't output */ + copy_to_ring_buffer(s, frame, nb_channels, is_planar); + av_frame_free(&frame); + return 0; + } else { + /* We have excess samples to output */ + int64_t excess = total_after_add - s->nb_samples; + + /* The excess comes from the oldest samples in ring buffer + * plus potentially some from the new frame */ + int64_t from_ring = FFMIN(excess, s->ring_filled); + int64_t from_frame = excess - from_ring; + + if (excess > 0) { + AVFrame *out = ff_get_audio_buffer(outlink, excess); + if (!out) { + av_frame_free(&frame); + return AVERROR(ENOMEM); + } + + /* Copy from ring buffer first */ + if (from_ring > 0) { + int64_t oldest_pos = (s->ring_write_pos - s->ring_filled + s->nb_samples) % s->nb_samples; + for (int i = 0; i < from_ring; i++) { + int64_t src_pos = (oldest_pos + i) % s->nb_samples; + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(out->extended_data[c] + i * bytes_per_sample, + s->ring_buf->extended_data[c] + src_pos * bytes_per_sample, + bytes_per_sample); + } + } else { + memcpy(out->extended_data[0] + i * nb_channels * bytes_per_sample, + s->ring_buf->extended_data[0] + src_pos * nb_channels * bytes_per_sample, + nb_channels * bytes_per_sample); + } + } + /* Adjust ring buffer: remove the samples we just output */ + s->ring_filled -= from_ring; + } + + /* Copy from new frame */ + if (from_frame > 0) { + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(out->extended_data[c] + from_ring * bytes_per_sample, + frame->extended_data[c], + from_frame * bytes_per_sample); + } + } else { + memcpy(out->extended_data[0] + from_ring * nb_channels * bytes_per_sample, + frame->extended_data[0], + from_frame * nb_channels * bytes_per_sample); + } + } + + out->pts = s->pts; + s->pts += av_rescale_q(excess, + (AVRational){ 1, outlink->sample_rate }, outlink->time_base); + + /* Now add remaining samples from frame to ring buffer */ + int remaining = frame->nb_samples - from_frame; + if (remaining > 0) { + /* Create a temporary view of the remaining samples */ + for (int i = 0; i < remaining; i++) { + int64_t dst_pos = s->ring_write_pos % s->nb_samples; + int src_idx = from_frame + i; + if (is_planar) { + for (int c = 0; c < nb_channels; c++) { + memcpy(s->ring_buf->extended_data[c] + dst_pos * bytes_per_sample, + frame->extended_data[c] + src_idx * bytes_per_sample, + bytes_per_sample); + } + } else { + memcpy(s->ring_buf->extended_data[0] + dst_pos * nb_channels * bytes_per_sample, + frame->extended_data[0] + src_idx * nb_channels * bytes_per_sample, + nb_channels * bytes_per_sample); + } + s->ring_write_pos++; + } + s->ring_filled += remaining; + } + + av_frame_free(&frame); + return ff_filter_frame(outlink, out); + } + } + } else { + /* Non-overlap mode: pass through immediately, keep copy in ring buffer */ + copy_to_ring_buffer(s, frame, nb_channels, is_planar); + + frame->pts = s->pts; + s->pts += av_rescale_q(frame->nb_samples, + (AVRational){ 1, outlink->sample_rate }, outlink->time_base); + return ff_filter_frame(outlink, frame); + } + } + } FF_FILTER_FORWARD_WANTED(outlink, in0); return FFERROR_NOT_READY; } - /* At this point, in0 has reached EOF with no more samples remaining - * except those that we want to crossfade */ - av_assert0(queued_samples0 <= s->nb_samples); - int queued_samples1 = ff_inlink_queued_samples(in1); + /* Input 0 
has reached EOF - start crossfade */ + if (!s->crossfade_active) { + /* Handle case where input 0 was shorter than crossfade duration */ + if (s->ring_filled < s->nb_samples && s->ring_filled > 0) { + av_log(ctx, AV_LOG_WARNING, "Input %d duration (%"PRId64" samples) " + "is shorter than crossfade duration (%"PRId64" samples), " + "crossfade will be shorter.\n", + idx0, s->ring_filled, s->nb_samples); + } - /* If this clip is sandwiched between two other clips, buffer at least - * twice the total crossfade duration to ensure that we won't reach EOF - * during the second fade (in which case we would shorten the fade) */ - int needed_samples = s->nb_samples; - if (idx1 < s->nb_inputs - 1) - needed_samples *= 2; + if (s->ring_filled == 0) { + /* Input 0 was empty, skip to next */ + s->xfade_idx++; + ff_filter_set_ready(ctx, 10); + return 0; + } - if (queued_samples1 >= needed_samples || ff_outlink_get_status(in1)) { - /* The first filter may EOF before delivering any samples, in which - * case it's possible for pass_crossfade() to be a no-op. Just ensure - * the activate() function runs again after incrementing the index to - * ensure we correctly move on to the next input in that case. */ - s->xfade_idx++; + s->crossfade_active = 1; + s->crossfade_pos = 0; ff_filter_set_ready(ctx, 10); - return pass_crossfade(ctx, idx0, idx1); + } + + /* Process crossfade */ + if (s->overlap) { + return process_overlap_crossfade(ctx, idx1); } else { - FF_FILTER_FORWARD_WANTED(outlink, in1); - return FFERROR_NOT_READY; + return process_non_overlap_crossfade(ctx, idx0, idx1); } } @@ -779,6 +1107,12 @@ static av_cold int acrossfade_init(AVFilterContext *ctx) return 0; } +static av_cold void acrossfade_uninit(AVFilterContext *ctx) +{ + AudioFadeContext *s = ctx->priv; + av_frame_free(&s->ring_buf); +} + static int acrossfade_config_output(AVFilterLink *outlink) { AVFilterContext *ctx = outlink->src; @@ -817,6 +1151,7 @@ const FFFilter ff_af_acrossfade = { .p.flags = AVFILTER_FLAG_DYNAMIC_INPUTS, .priv_size = sizeof(AudioFadeContext), .init = acrossfade_init, + .uninit = acrossfade_uninit, .activate = activate, FILTER_OUTPUTS(avfilter_af_acrossfade_outputs), FILTER_SAMPLEFMTS_ARRAY(sample_fmts), -- 2.49.1 _______________________________________________ ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org