* [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-21 12:37 [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas Martin Storsjö
@ 2023-03-21 12:37 ` Martin Storsjö
2023-03-21 19:29 ` Marton Balint
2023-03-24 21:49 ` [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas Marton Balint
2023-03-25 0:37 ` Michael Niedermayer
2 siblings, 1 reply; 14+ messages in thread
From: Martin Storsjö @ 2023-03-21 12:37 UTC (permalink / raw)
To: ffmpeg-devel
Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
worth of data into the buffer, slowly filling the buffer until it
is full.
Previously, when the buffer was full, fill_buffer would start over
from the start, effectively discarding all the previously buffered
data.
For files that are read linearly, the previous behaviour was fine.
For files that exhibit some amount of nonlinear read patterns,
especially mov files (where ff_configure_buffers_for_index
increases the buffer size to accomodate for the nonlinear reading!)
we would mostly be able to seek within the buffer - but whenever
we've hit the maximum buffer size, we'd discard most of the buffer
and start over with a very small buffer, so the next seek backwards
would end up outside of the buffer.
Keep one fourth of the buffered data, moving it to the start of
the buffer, freeing the rest to be refilled with future data.
For mov files with nonlinear read patterns, this almost entirely
avoids doing seeks on the lower IO level, where we previously would
end up doing seeks occasionally.
Signed-off-by: Martin Storsjö <martin@martin.st>
---
I'm open to suggestions on whether 1/4 of the buffer is a reasonable
amount to keep. It does of course incur some amount of overhead
for well behaved linear files, but is a decent improvement for
nonlinear mov files.
Alternatively we could trigger this behaviour only after we've
observed a couple seeks backwards?
---
libavformat/aviobuf.c | 46 +++++++++++++++++++++++++++++++++++++------
1 file changed, 40 insertions(+), 6 deletions(-)
diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
index 4ad734a3c3..dfc3e77016 100644
--- a/libavformat/aviobuf.c
+++ b/libavformat/aviobuf.c
@@ -534,8 +534,7 @@ static void fill_buffer(AVIOContext *s)
FFIOContext *const ctx = (FFIOContext *)s;
int max_buffer_size = s->max_packet_size ?
s->max_packet_size : IO_BUFFER_SIZE;
- uint8_t *dst = s->buf_end - s->buffer + max_buffer_size <= s->buffer_size ?
- s->buf_end : s->buffer;
+ uint8_t *dst = s->buf_end;
int len = s->buffer_size - (dst - s->buffer);
/* can't fill the buffer without read_packet, just set EOF if appropriate */
@@ -546,11 +545,46 @@ static void fill_buffer(AVIOContext *s)
if (s->eof_reached)
return;
- if (s->update_checksum && dst == s->buffer) {
- if (s->buf_end > s->checksum_ptr)
+ if (len < max_buffer_size && s->buffer_size > max_buffer_size) {
+ /* If the buffer is almost full and we're not trying to read
+ one whole buffer worth of data at once; keep some amount of
+ the currently buffered data, but move it to the start of the
+ buffer, to allow filling the buffer with more data. */
+ int keep = (s->buf_end - s->buffer)/4;
+ int shift = s->buf_end - keep - s->buffer;
+
+ if (s->update_checksum && s->checksum_ptr - s->buffer < shift) {
+ /* Checksum up to the buffer + shift position (that we're
+ shifting out of the buffer. */
s->checksum = s->update_checksum(s->checksum, s->checksum_ptr,
- s->buf_end - s->checksum_ptr);
- s->checksum_ptr = s->buffer;
+ s->buffer + shift - s->checksum_ptr);
+ }
+
+ memmove(s->buffer, s->buf_end - keep, keep);
+ s->buf_end -= shift;
+ s->buf_ptr -= shift;
+ if (s->update_checksum) {
+ if (s->checksum_ptr - s->buffer < shift)
+ s->checksum_ptr = s->buffer;
+ else
+ s->checksum_ptr -= shift;
+ }
+
+ dst = s->buf_end;
+ len = s->buffer_size - (dst - s->buffer);
+ } else if (len < max_buffer_size) {
+ /* If the buffer is full so we can't fit a whole write of max_buffer_size,
+ just restart the pointers from the start of the buffer. */
+ dst = s->buffer;
+ len = s->buffer_size;
+
+ if (s->update_checksum) {
+ /* Checksum all data that gets shifted out of the buffer. */
+ if (s->buf_end > s->checksum_ptr)
+ s->checksum = s->update_checksum(s->checksum, s->checksum_ptr,
+ s->buf_end - s->checksum_ptr);
+ s->checksum_ptr = s->buffer;
+ }
}
/* make buffer smaller in case it ended up large after probing */
--
2.37.1 (Apple Git-137.1)
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-21 12:37 ` [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer Martin Storsjö
@ 2023-03-21 19:29 ` Marton Balint
2023-03-21 20:24 ` Martin Storsjö
0 siblings, 1 reply; 14+ messages in thread
From: Marton Balint @ 2023-03-21 19:29 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Tue, 21 Mar 2023, Martin Storsjö wrote:
> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
> worth of data into the buffer, slowly filling the buffer until it
> is full.
>
> Previously, when the buffer was full, fill_buffer would start over
> from the start, effectively discarding all the previously buffered
> data.
>
> For files that are read linearly, the previous behaviour was fine.
>
> For files that exhibit some amount of nonlinear read patterns,
> especially mov files (where ff_configure_buffers_for_index
> increases the buffer size to accomodate for the nonlinear reading!)
> we would mostly be able to seek within the buffer - but whenever
> we've hit the maximum buffer size, we'd discard most of the buffer
> and start over with a very small buffer, so the next seek backwards
> would end up outside of the buffer.
>
> Keep one fourth of the buffered data, moving it to the start of
> the buffer, freeing the rest to be refilled with future data.
>
> For mov files with nonlinear read patterns, this almost entirely
> avoids doing seeks on the lower IO level, where we previously would
> end up doing seeks occasionally.
Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
that a seekback will happen? Unconditional memmove of even fourth of all
data does not seem like a good idea.
Regards,
Marton
>
> Signed-off-by: Martin Storsjö <martin@martin.st>
> ---
> I'm open to suggestions on whether 1/4 of the buffer is a reasonable
> amount to keep. It does of course incur some amount of overhead
> for well behaved linear files, but is a decent improvement for
> nonlinear mov files.
>
> Alternatively we could trigger this behaviour only after we've
> observed a couple seeks backwards?
> ---
> libavformat/aviobuf.c | 46 +++++++++++++++++++++++++++++++++++++------
> 1 file changed, 40 insertions(+), 6 deletions(-)
>
> diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
> index 4ad734a3c3..dfc3e77016 100644
> --- a/libavformat/aviobuf.c
> +++ b/libavformat/aviobuf.c
> @@ -534,8 +534,7 @@ static void fill_buffer(AVIOContext *s)
> FFIOContext *const ctx = (FFIOContext *)s;
> int max_buffer_size = s->max_packet_size ?
> s->max_packet_size : IO_BUFFER_SIZE;
> - uint8_t *dst = s->buf_end - s->buffer + max_buffer_size <= s->buffer_size ?
> - s->buf_end : s->buffer;
> + uint8_t *dst = s->buf_end;
> int len = s->buffer_size - (dst - s->buffer);
>
> /* can't fill the buffer without read_packet, just set EOF if appropriate */
> @@ -546,11 +545,46 @@ static void fill_buffer(AVIOContext *s)
> if (s->eof_reached)
> return;
>
> - if (s->update_checksum && dst == s->buffer) {
> - if (s->buf_end > s->checksum_ptr)
> + if (len < max_buffer_size && s->buffer_size > max_buffer_size) {
> + /* If the buffer is almost full and we're not trying to read
> + one whole buffer worth of data at once; keep some amount of
> + the currently buffered data, but move it to the start of the
> + buffer, to allow filling the buffer with more data. */
> + int keep = (s->buf_end - s->buffer)/4;
> + int shift = s->buf_end - keep - s->buffer;
> +
> + if (s->update_checksum && s->checksum_ptr - s->buffer < shift) {
> + /* Checksum up to the buffer + shift position (that we're
> + shifting out of the buffer. */
> s->checksum = s->update_checksum(s->checksum, s->checksum_ptr,
> - s->buf_end - s->checksum_ptr);
> - s->checksum_ptr = s->buffer;
> + s->buffer + shift - s->checksum_ptr);
> + }
> +
> + memmove(s->buffer, s->buf_end - keep, keep);
> + s->buf_end -= shift;
> + s->buf_ptr -= shift;
> + if (s->update_checksum) {
> + if (s->checksum_ptr - s->buffer < shift)
> + s->checksum_ptr = s->buffer;
> + else
> + s->checksum_ptr -= shift;
> + }
> +
> + dst = s->buf_end;
> + len = s->buffer_size - (dst - s->buffer);
> + } else if (len < max_buffer_size) {
> + /* If the buffer is full so we can't fit a whole write of max_buffer_size,
> + just restart the pointers from the start of the buffer. */
> + dst = s->buffer;
> + len = s->buffer_size;
> +
> + if (s->update_checksum) {
> + /* Checksum all data that gets shifted out of the buffer. */
> + if (s->buf_end > s->checksum_ptr)
> + s->checksum = s->update_checksum(s->checksum, s->checksum_ptr,
> + s->buf_end - s->checksum_ptr);
> + s->checksum_ptr = s->buffer;
> + }
> }
>
> /* make buffer smaller in case it ended up large after probing */
> --
> 2.37.1 (Apple Git-137.1)
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-21 19:29 ` Marton Balint
@ 2023-03-21 20:24 ` Martin Storsjö
2023-03-24 11:11 ` Anton Khirnov
0 siblings, 1 reply; 14+ messages in thread
From: Martin Storsjö @ 2023-03-21 20:24 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Tue, 21 Mar 2023, Marton Balint wrote:
>
>
> On Tue, 21 Mar 2023, Martin Storsjö wrote:
>
>> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
>> worth of data into the buffer, slowly filling the buffer until it
>> is full.
>>
>> Previously, when the buffer was full, fill_buffer would start over
>> from the start, effectively discarding all the previously buffered
>> data.
>>
>> For files that are read linearly, the previous behaviour was fine.
>>
>> For files that exhibit some amount of nonlinear read patterns,
>> especially mov files (where ff_configure_buffers_for_index
>> increases the buffer size to accomodate for the nonlinear reading!)
>> we would mostly be able to seek within the buffer - but whenever
>> we've hit the maximum buffer size, we'd discard most of the buffer
>> and start over with a very small buffer, so the next seek backwards
>> would end up outside of the buffer.
>>
>> Keep one fourth of the buffered data, moving it to the start of
>> the buffer, freeing the rest to be refilled with future data.
>>
>> For mov files with nonlinear read patterns, this almost entirely
>> avoids doing seeks on the lower IO level, where we previously would
>> end up doing seeks occasionally.
>
> Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
> that a seekback will happen? Unconditional memmove of even fourth of all
> data does not seem like a good idea.
Right, it's probably not ideal to do this unconditionally.
However, it's not that the demuxer really knows that a seekback _will_
happen - unless we make it inspect the next couple index entries. And I
don't think we should make the demuxer pre-analyze the next access
locations, but keep optimization like this on the separate layer. That
way, it works as expected as long as the seeks are short enough within the
expected tolerance, and falls back graciously on regular seeking for the
accesses that are weirder than that.
If we'd use ffio_ensure_seekback(), we'd make it mandatory for the aviobuf
layer to cache the data for any insane accesses.
Some stats on the file I'm dealing with: The file is >2 GB, and is not
exactly interleaved like the mov demuxer reads it, but roughly - when
demuxing, the mov demuxer mostly jumps back/forward within a maybe ~2 MB
range. But at the start and end of the file, there's a couple samples that
are way out of order, causing it to do seeks from one end of the file to
the other and back. So in that case, if we'd do ffio_ensure_seekback(),
we'd end up allocating a 2 GB seekback buffer.
Currently, ff_configure_buffers_for_index() correctly measures that it
needs a large buffer to avoid seeks in this file. (The function finds a
huge >2 GB pos_delta when inspecting all sample combinations in the file,
but setting it to the maximum of 16 MB already helps a whole lot, see
patch 1/2.)
So maybe we could have ff_configure_buffers_for_index set some more flags
to opt into behaviour like this?
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-21 20:24 ` Martin Storsjö
@ 2023-03-24 11:11 ` Anton Khirnov
2023-03-24 11:25 ` Martin Storsjö
0 siblings, 1 reply; 14+ messages in thread
From: Anton Khirnov @ 2023-03-24 11:11 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Quoting Martin Storsjö (2023-03-21 21:24:25)
> On Tue, 21 Mar 2023, Marton Balint wrote:
>
> >
> >
> > On Tue, 21 Mar 2023, Martin Storsjö wrote:
> >
> >> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
> >> worth of data into the buffer, slowly filling the buffer until it
> >> is full.
> >>
> >> Previously, when the buffer was full, fill_buffer would start over
> >> from the start, effectively discarding all the previously buffered
> >> data.
> >>
> >> For files that are read linearly, the previous behaviour was fine.
> >>
> >> For files that exhibit some amount of nonlinear read patterns,
> >> especially mov files (where ff_configure_buffers_for_index
> >> increases the buffer size to accomodate for the nonlinear reading!)
> >> we would mostly be able to seek within the buffer - but whenever
> >> we've hit the maximum buffer size, we'd discard most of the buffer
> >> and start over with a very small buffer, so the next seek backwards
> >> would end up outside of the buffer.
> >>
> >> Keep one fourth of the buffered data, moving it to the start of
> >> the buffer, freeing the rest to be refilled with future data.
> >>
> >> For mov files with nonlinear read patterns, this almost entirely
> >> avoids doing seeks on the lower IO level, where we previously would
> >> end up doing seeks occasionally.
> >
> > Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
> > that a seekback will happen? Unconditional memmove of even fourth of all
> > data does not seem like a good idea.
>
> Right, it's probably not ideal to do this unconditionally.
>
> However, it's not that the demuxer really knows that a seekback _will_
> happen - unless we make it inspect the next couple index entries. And I
> don't think we should make the demuxer pre-analyze the next access
> locations, but keep optimization like this on the separate layer. That
> way, it works as expected as long as the seeks are short enough within the
> expected tolerance, and falls back graciously on regular seeking for the
> accesses that are weirder than that.
I suppose changing the buffer into a ring buffer so you don't have to
move the data is not feasible?
--
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-24 11:11 ` Anton Khirnov
@ 2023-03-24 11:25 ` Martin Storsjö
2023-03-24 11:55 ` Anton Khirnov
0 siblings, 1 reply; 14+ messages in thread
From: Martin Storsjö @ 2023-03-24 11:25 UTC (permalink / raw)
To: FFmpeg development discussions and patches,
FFmpeg development discussions and patches
On Fri, 24 Mar 2023, Anton Khirnov wrote:
> Quoting Martin Storsjö (2023-03-21 21:24:25)
>> On Tue, 21 Mar 2023, Marton Balint wrote:
>>
>> >
>> >
>> > On Tue, 21 Mar 2023, Martin Storsjö wrote:
>> >
>> >> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
>> >> worth of data into the buffer, slowly filling the buffer until it
>> >> is full.
>> >>
>> >> Previously, when the buffer was full, fill_buffer would start over
>> >> from the start, effectively discarding all the previously buffered
>> >> data.
>> >>
>> >> For files that are read linearly, the previous behaviour was fine.
>> >>
>> >> For files that exhibit some amount of nonlinear read patterns,
>> >> especially mov files (where ff_configure_buffers_for_index
>> >> increases the buffer size to accomodate for the nonlinear reading!)
>> >> we would mostly be able to seek within the buffer - but whenever
>> >> we've hit the maximum buffer size, we'd discard most of the buffer
>> >> and start over with a very small buffer, so the next seek backwards
>> >> would end up outside of the buffer.
>> >>
>> >> Keep one fourth of the buffered data, moving it to the start of
>> >> the buffer, freeing the rest to be refilled with future data.
>> >>
>> >> For mov files with nonlinear read patterns, this almost entirely
>> >> avoids doing seeks on the lower IO level, where we previously would
>> >> end up doing seeks occasionally.
>> >
>> > Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
>> > that a seekback will happen? Unconditional memmove of even fourth of all
>> > data does not seem like a good idea.
>>
>> Right, it's probably not ideal to do this unconditionally.
>>
>> However, it's not that the demuxer really knows that a seekback _will_
>> happen - unless we make it inspect the next couple index entries. And I
>> don't think we should make the demuxer pre-analyze the next access
>> locations, but keep optimization like this on the separate layer. That
>> way, it works as expected as long as the seeks are short enough within the
>> expected tolerance, and falls back graciously on regular seeking for the
>> accesses that are weirder than that.
>
> I suppose changing the buffer into a ring buffer so you don't have to
> move the data is not feasible?
Something like that would probably be ideal, yes - so we'd have a
constantly sliding window of data available behind the current position.
I think that would be more work than I'm able to invest in the issue at
the moment, though. (That doesn't mean I think everyone should suffer the
overhead of this patch in this form, but I'm more interested in looking at
heuristic based solutions for triggering this case rather than a full
rewrite.)
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-24 11:25 ` Martin Storsjö
@ 2023-03-24 11:55 ` Anton Khirnov
2023-03-24 20:45 ` Marton Balint
0 siblings, 1 reply; 14+ messages in thread
From: Anton Khirnov @ 2023-03-24 11:55 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Quoting Martin Storsjö (2023-03-24 12:25:37)
> On Fri, 24 Mar 2023, Anton Khirnov wrote:
>
> > Quoting Martin Storsjö (2023-03-21 21:24:25)
> >> On Tue, 21 Mar 2023, Marton Balint wrote:
> >>
> >> >
> >> >
> >> > On Tue, 21 Mar 2023, Martin Storsjö wrote:
> >> >
> >> >> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
> >> >> worth of data into the buffer, slowly filling the buffer until it
> >> >> is full.
> >> >>
> >> >> Previously, when the buffer was full, fill_buffer would start over
> >> >> from the start, effectively discarding all the previously buffered
> >> >> data.
> >> >>
> >> >> For files that are read linearly, the previous behaviour was fine.
> >> >>
> >> >> For files that exhibit some amount of nonlinear read patterns,
> >> >> especially mov files (where ff_configure_buffers_for_index
> >> >> increases the buffer size to accomodate for the nonlinear reading!)
> >> >> we would mostly be able to seek within the buffer - but whenever
> >> >> we've hit the maximum buffer size, we'd discard most of the buffer
> >> >> and start over with a very small buffer, so the next seek backwards
> >> >> would end up outside of the buffer.
> >> >>
> >> >> Keep one fourth of the buffered data, moving it to the start of
> >> >> the buffer, freeing the rest to be refilled with future data.
> >> >>
> >> >> For mov files with nonlinear read patterns, this almost entirely
> >> >> avoids doing seeks on the lower IO level, where we previously would
> >> >> end up doing seeks occasionally.
> >> >
> >> > Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
> >> > that a seekback will happen? Unconditional memmove of even fourth of all
> >> > data does not seem like a good idea.
> >>
> >> Right, it's probably not ideal to do this unconditionally.
> >>
> >> However, it's not that the demuxer really knows that a seekback _will_
> >> happen - unless we make it inspect the next couple index entries. And I
> >> don't think we should make the demuxer pre-analyze the next access
> >> locations, but keep optimization like this on the separate layer. That
> >> way, it works as expected as long as the seeks are short enough within the
> >> expected tolerance, and falls back graciously on regular seeking for the
> >> accesses that are weirder than that.
> >
> > I suppose changing the buffer into a ring buffer so you don't have to
> > move the data is not feasible?
>
> Something like that would probably be ideal, yes - so we'd have a
> constantly sliding window of data available behind the current position.
>
> I think that would be more work than I'm able to invest in the issue at
> the moment, though. (That doesn't mean I think everyone should suffer the
> overhead of this patch in this form, but I'm more interested in looking at
> heuristic based solutions for triggering this case rather than a full
> rewrite.)
As a (hopefully) temporary heuristic, triggering this after observing a
few backward seeks under buffer size sounds reasonable to me.
--
Anton Khirnov
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-24 11:55 ` Anton Khirnov
@ 2023-03-24 20:45 ` Marton Balint
2023-03-24 21:05 ` Martin Storsjö
0 siblings, 1 reply; 14+ messages in thread
From: Marton Balint @ 2023-03-24 20:45 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, 24 Mar 2023, Anton Khirnov wrote:
> Quoting Martin Storsjö (2023-03-24 12:25:37)
>> On Fri, 24 Mar 2023, Anton Khirnov wrote:
>>
>> > Quoting Martin Storsjö (2023-03-21 21:24:25)
>> >> On Tue, 21 Mar 2023, Marton Balint wrote:
>> >>
>> >> >
>> >> >
>> >> > On Tue, 21 Mar 2023, Martin Storsjö wrote:
>> >> >
>> >> >> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
>> >> >> worth of data into the buffer, slowly filling the buffer until it
>> >> >> is full.
>> >> >>
>> >> >> Previously, when the buffer was full, fill_buffer would start over
>> >> >> from the start, effectively discarding all the previously buffered
>> >> >> data.
>> >> >>
>> >> >> For files that are read linearly, the previous behaviour was fine.
>> >> >>
>> >> >> For files that exhibit some amount of nonlinear read patterns,
>> >> >> especially mov files (where ff_configure_buffers_for_index
>> >> >> increases the buffer size to accomodate for the nonlinear reading!)
>> >> >> we would mostly be able to seek within the buffer - but whenever
>> >> >> we've hit the maximum buffer size, we'd discard most of the buffer
>> >> >> and start over with a very small buffer, so the next seek backwards
>> >> >> would end up outside of the buffer.
>> >> >>
>> >> >> Keep one fourth of the buffered data, moving it to the start of
>> >> >> the buffer, freeing the rest to be refilled with future data.
>> >> >>
>> >> >> For mov files with nonlinear read patterns, this almost entirely
>> >> >> avoids doing seeks on the lower IO level, where we previously would
>> >> >> end up doing seeks occasionally.
>> >> >
>> >> > Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
>> >> > that a seekback will happen? Unconditional memmove of even fourth of all
>> >> > data does not seem like a good idea.
>> >>
>> >> Right, it's probably not ideal to do this unconditionally.
>> >>
>> >> However, it's not that the demuxer really knows that a seekback _will_
>> >> happen - unless we make it inspect the next couple index entries. And I
>> >> don't think we should make the demuxer pre-analyze the next access
>> >> locations, but keep optimization like this on the separate layer. That
>> >> way, it works as expected as long as the seeks are short enough within the
>> >> expected tolerance, and falls back graciously on regular seeking for the
>> >> accesses that are weirder than that.
>> >
>> > I suppose changing the buffer into a ring buffer so you don't have to
>> > move the data is not feasible?
>>
>> Something like that would probably be ideal, yes - so we'd have a
>> constantly sliding window of data available behind the current position.
>>
>> I think that would be more work than I'm able to invest in the issue at
>> the moment, though. (That doesn't mean I think everyone should suffer the
>> overhead of this patch in this form, but I'm more interested in looking at
>> heuristic based solutions for triggering this case rather than a full
>> rewrite.)
>
> As a (hopefully) temporary heuristic, triggering this after observing a
> few backward seeks under buffer size sounds reasonable to me.
I am uneasy about complicating an already complicated and
hard-to-follow AVIO layer with heuristics which activate on magic
behaviour. And we all know how long temporary solutions last :)
I guess we could add some new parameter to AVIOContext end enable this
data-shifting behaviour explicitly when you reconfigure the buffer size
for index in the MOV demuxer. But is it worth it? How significant is the
"improvement" this patch provides over the previous one in the series?
Thanks,
Marton
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-24 20:45 ` Marton Balint
@ 2023-03-24 21:05 ` Martin Storsjö
2023-03-24 21:35 ` Marton Balint
0 siblings, 1 reply; 14+ messages in thread
From: Martin Storsjö @ 2023-03-24 21:05 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, 24 Mar 2023, Marton Balint wrote:
> I am uneasy about complicating an already complicated and
> hard-to-follow AVIO layer with heuristics which activate on magic
> behaviour. And we all know how long temporary solutions last :)
>
> I guess we could add some new parameter to AVIOContext end enable this
> data-shifting behaviour explicitly when you reconfigure the buffer size
> for index in the MOV demuxer. But is it worth it? How significant is the
> "improvement" this patch provides over the previous one in the series?
With the 2.6 GB, 40 minute mov file I'm looking at, originally, due to the
issue fixed in patch 1/2, the buffer size was never increased from the
original 32 KB, so when reading the file linearly, we would do many tens
of thousands of seek requests, giving absolutely abysmal performance. (I
saw a server side log number saying 120 000 requests.)
With patch 1/2 applied, while reading the bulk of the file, it does ~170
seeks. So nothing terrible, but it still feels unnecessarily inefficient
to do >4 seeks per minute due to the fact that the aviobuf layer is
throwing away good data that it already had buffered.
In this case, it used a buffer size of 16 MB, and calculating 2.6 GB / 16
MB ends up very near 170. So every time the 16 MB aviobuf buffer gets full
and aviobuf clears it, we end up doing a seek backwards.
With patch 2/2 applied, we no longer do any seeks while reading the bulk
of the file (at the start/end of the file there are still a bunch of
scattered seeks though).
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-24 21:05 ` Martin Storsjö
@ 2023-03-24 21:35 ` Marton Balint
2023-03-24 21:41 ` Martin Storsjö
0 siblings, 1 reply; 14+ messages in thread
From: Marton Balint @ 2023-03-24 21:35 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, 24 Mar 2023, Martin Storsjö wrote:
> On Fri, 24 Mar 2023, Marton Balint wrote:
>
>> I am uneasy about complicating an already complicated and hard-to-follow
>> AVIO layer with heuristics which activate on magic behaviour. And we all
>> know how long temporary solutions last :)
>>
>> I guess we could add some new parameter to AVIOContext end enable this
>> data-shifting behaviour explicitly when you reconfigure the buffer size
>> for index in the MOV demuxer. But is it worth it? How significant is the
>> "improvement" this patch provides over the previous one in the series?
>
> With the 2.6 GB, 40 minute mov file I'm looking at, originally, due to the
> issue fixed in patch 1/2, the buffer size was never increased from the
> original 32 KB, so when reading the file linearly, we would do many tens of
> thousands of seek requests, giving absolutely abysmal performance. (I saw a
> server side log number saying 120 000 requests.)
>
> With patch 1/2 applied, while reading the bulk of the file, it does ~170
> seeks. So nothing terrible, but it still feels unnecessarily inefficient to
> do >4 seeks per minute due to the fact that the aviobuf layer is throwing
> away good data that it already had buffered.
>
> In this case, it used a buffer size of 16 MB, and calculating 2.6 GB / 16 MB
> ends up very near 170. So every time the 16 MB aviobuf buffer gets full and
> aviobuf clears it, we end up doing a seek backwards.
Thanks for the details. Patch/1 already made the significant improvement,
so yeah, I am not sure about Patch/2 knowing it is not the "right" way.
Regards,
Marton
>
> With patch 2/2 applied, we no longer do any seeks while reading the bulk of
> the file (at the start/end of the file there are still a bunch of scattered
> seeks though).
>
> // Martin
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer
2023-03-24 21:35 ` Marton Balint
@ 2023-03-24 21:41 ` Martin Storsjö
0 siblings, 0 replies; 14+ messages in thread
From: Martin Storsjö @ 2023-03-24 21:41 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, 24 Mar 2023, Marton Balint wrote:
>
>
> On Fri, 24 Mar 2023, Martin Storsjö wrote:
>
>> On Fri, 24 Mar 2023, Marton Balint wrote:
>>
>>> I am uneasy about complicating an already complicated and hard-to-follow
>>> AVIO layer with heuristics which activate on magic behaviour. And we all
>>> know how long temporary solutions last :)
>>>
>>> I guess we could add some new parameter to AVIOContext end enable this
>>> data-shifting behaviour explicitly when you reconfigure the buffer size
>>> for index in the MOV demuxer. But is it worth it? How significant is the
>>> "improvement" this patch provides over the previous one in the series?
>>
>> With the 2.6 GB, 40 minute mov file I'm looking at, originally, due to the
>> issue fixed in patch 1/2, the buffer size was never increased from the
>> original 32 KB, so when reading the file linearly, we would do many tens of
>> thousands of seek requests, giving absolutely abysmal performance. (I saw a
>> server side log number saying 120 000 requests.)
>>
>> With patch 1/2 applied, while reading the bulk of the file, it does ~170
>> seeks. So nothing terrible, but it still feels unnecessarily inefficient to
>> do >4 seeks per minute due to the fact that the aviobuf layer is throwing
>> away good data that it already had buffered.
>>
>> In this case, it used a buffer size of 16 MB, and calculating 2.6 GB / 16
> MB
>> ends up very near 170. So every time the 16 MB aviobuf buffer gets full and
>> aviobuf clears it, we end up doing a seek backwards.
>
> Thanks for the details. Patch/1 already made the significant improvement,
Yeah - can I get someone to approve that one too? :-)
> so yeah, I am not sure about Patch/2 knowing it is not the "right" way.
Yeah, I'm a bit on the fence myself. On one hand, it's been a pet peeve of
mine for years, that we throw away the buffered data regularly like this,
but the implementation maybe isn't the best. So for the practical
performance issue it's probably not essential - it's just an annoying wart
that is left :-)
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas
2023-03-21 12:37 [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas Martin Storsjö
2023-03-21 12:37 ` [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer Martin Storsjö
@ 2023-03-24 21:49 ` Marton Balint
2023-03-25 0:37 ` Michael Niedermayer
2 siblings, 0 replies; 14+ messages in thread
From: Marton Balint @ 2023-03-24 21:49 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Tue, 21 Mar 2023, Martin Storsjö wrote:
> Previously, the ff_configure_buffers_for_index function had
> upper sanity limits of 16 MB (1<<24) for buffer_size and
> 8 MB (1<<23) for short_seek_threshold.
>
> However, if the index analysis showed a need for even larger
> buffer sizes (for a really badly interleaved file), over the
> sanity limits, we previously didn't increase the buffer sizes
> at all.
>
> Instead, if the file shows a need for really large buffers and
> short_seek_threshold, just set them to the maximum sanity limit;
> while it might not be enough for all cases in the file, it might
> be enough for most of it.
>
> This can happen e.g. with a mov file with some tracks containing
> some samples that belong in the start of the file, at the end of
> the mdat, while the rest of the file is mostly reasonably interleaved;
> previously those samples caused the maximum pos_delta to skyrocket,
> skipping any buffer size enlargement.
>
> Signed-off-by: Martin Storsjö <martin@martin.st>
> ---
> libavformat/seek.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/libavformat/seek.c b/libavformat/seek.c
> index a236e285c0..818549dfef 100644
> --- a/libavformat/seek.c
> +++ b/libavformat/seek.c
> @@ -220,7 +220,8 @@ void ff_configure_buffers_for_index(AVFormatContext *s, int64_t time_tolerance)
> pos_delta *= 2;
> ctx = ffiocontext(s->pb);
> /* XXX This could be adjusted depending on protocol*/
> - if (s->pb->buffer_size < pos_delta && pos_delta < (1<<24)) {
> + pos_delta = FFMIN(pos_delta, 1<<24);
> + if (s->pb->buffer_size < pos_delta) {
> av_log(s, AV_LOG_VERBOSE, "Reconfiguring buffers to size %"PRId64"\n", pos_delta);
>
> /* realloc the buffer and the original data will be retained */
> @@ -232,9 +233,8 @@ void ff_configure_buffers_for_index(AVFormatContext *s, int64_t time_tolerance)
> ctx->short_seek_threshold = FFMAX(ctx->short_seek_threshold, pos_delta/2);
> }
>
> - if (skip < (1<<23)) {
> - ctx->short_seek_threshold = FFMAX(ctx->short_seek_threshold, skip);
> - }
> + skip = FFMIN(skip, 1<<23);
> + ctx->short_seek_threshold = FFMAX(ctx->short_seek_threshold, skip);
> }
>
> int av_index_search_timestamp(AVStream *st, int64_t wanted_timestamp, int flags)
> --
> 2.37.1 (Apple Git-137.1)
LGTM, thanks.
Marton
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas
2023-03-21 12:37 [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas Martin Storsjö
2023-03-21 12:37 ` [FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer Martin Storsjö
2023-03-24 21:49 ` [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas Marton Balint
@ 2023-03-25 0:37 ` Michael Niedermayer
2023-03-25 22:17 ` Martin Storsjö
2 siblings, 1 reply; 14+ messages in thread
From: Michael Niedermayer @ 2023-03-25 0:37 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 1552 bytes --]
On Tue, Mar 21, 2023 at 02:37:28PM +0200, Martin Storsjö wrote:
> Previously, the ff_configure_buffers_for_index function had
> upper sanity limits of 16 MB (1<<24) for buffer_size and
> 8 MB (1<<23) for short_seek_threshold.
>
> However, if the index analysis showed a need for even larger
> buffer sizes (for a really badly interleaved file), over the
> sanity limits, we previously didn't increase the buffer sizes
> at all.
>
> Instead, if the file shows a need for really large buffers and
> short_seek_threshold, just set them to the maximum sanity limit;
> while it might not be enough for all cases in the file, it might
> be enough for most of it.
>
> This can happen e.g. with a mov file with some tracks containing
> some samples that belong in the start of the file, at the end of
> the mdat, while the rest of the file is mostly reasonably interleaved;
> previously those samples caused the maximum pos_delta to skyrocket,
> skipping any buffer size enlargement.
>
> Signed-off-by: Martin Storsjö <martin@martin.st>
> ---
> libavformat/seek.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
I think, if iam not too tired ATM that
instead of taking the maximum it would be better to ignore the cases
which are above the maximum, so the value would be optimal for the
cases that it can help
[...]
thx
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
It is a danger to trust the dream we wish for rather than
the science we have, -- Dr. Kenneth Brown
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/2] libavformat: Improve ff_configure_buffers_for_index for excessive deltas
2023-03-25 0:37 ` Michael Niedermayer
@ 2023-03-25 22:17 ` Martin Storsjö
0 siblings, 0 replies; 14+ messages in thread
From: Martin Storsjö @ 2023-03-25 22:17 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, 25 Mar 2023, Michael Niedermayer wrote:
> On Tue, Mar 21, 2023 at 02:37:28PM +0200, Martin Storsjö wrote:
>> Previously, the ff_configure_buffers_for_index function had
>> upper sanity limits of 16 MB (1<<24) for buffer_size and
>> 8 MB (1<<23) for short_seek_threshold.
>>
>> However, if the index analysis showed a need for even larger
>> buffer sizes (for a really badly interleaved file), over the
>> sanity limits, we previously didn't increase the buffer sizes
>> at all.
>>
>> Instead, if the file shows a need for really large buffers and
>> short_seek_threshold, just set them to the maximum sanity limit;
>> while it might not be enough for all cases in the file, it might
>> be enough for most of it.
>>
>> This can happen e.g. with a mov file with some tracks containing
>> some samples that belong in the start of the file, at the end of
>> the mdat, while the rest of the file is mostly reasonably interleaved;
>> previously those samples caused the maximum pos_delta to skyrocket,
>> skipping any buffer size enlargement.
>>
>> Signed-off-by: Martin Storsjö <martin@martin.st>
>> ---
>> libavformat/seek.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> I think, if iam not too tired ATM that
> instead of taking the maximum it would be better to ignore the cases
> which are above the maximum, so the value would be optimal for the
> cases that it can help
Sure, I guess that could be reasonable too - I can update the patch with
that.
For the tricky file in my case, this algorithm currently observes mostly
deltas around 2-32 MB here, but with a few outliers around 2 GB. So the 2
GB outliers clearly are irrelevant regardless of whether we keep the limit
at the current 16 MB or raise it a little bit.
This heuristic algorithm currently does observe some differences over 32
MB - but even with the buffer size set to the maximum of 16 MB, it's quite
enough to avoid most of the seeking.
One thing I noted is that this function is called with time_tolerance set
to 1 second - but shouldn't it be quite enough to just have time_tolerance
set to 0? I.e. check exactly the next sample in the other tracks (since
that's what's going to be read in most cases), instead of checking a
sample 1 second in the future in other tracks?
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread