* [FFmpeg-devel] [PATCH v2] avformat/dhav: fix backward scanning for get_duration and optimize seeking
@ 2025-05-20 15:27 Derek Buitenhuis
2025-05-20 19:18 ` Andreas Rheinhardt
0 siblings, 1 reply; 3+ messages in thread
From: Derek Buitenhuis @ 2025-05-20 15:27 UTC (permalink / raw)
To: ffmpeg-devel
From: Justin Ruggles <justinr@vimeo.com>
The backwards scanning done for incomplete final packets should not
assume a specific alignment at the end of the file. Truncated files
result in hundreds of thousands of seeks if the final packet does not
fall on a specific byte boundary, which can be extremely slow.
For example, with HTTP, each backwards seek results in a separate
HTTP request.
This changes the scanning to check for the end tag 1 byte at a time
and buffers the last 1 MiB using ffio_ensure_seekback to avoid additional
seek operations.
Signed-off-by: Justin Ruggles <justinr@vimeo.com>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
---
Addresses Andreas' comments from the previous round.
---
libavformat/dhav.c | 48 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 37 insertions(+), 11 deletions(-)
diff --git a/libavformat/dhav.c b/libavformat/dhav.c
index b2ead99609..27e2f4db23 100644
--- a/libavformat/dhav.c
+++ b/libavformat/dhav.c
@@ -22,6 +22,7 @@
#include <time.h>
+#include "libavutil/intreadwrite.h"
#include "libavutil/mem.h"
#include "libavutil/parseutils.h"
#include "avio_internal.h"
@@ -232,34 +233,59 @@ static void get_timeinfo(unsigned date, struct tm *timeinfo)
timeinfo->tm_sec = sec;
}
+#define MAX_DURATION_BUFFER_SIZE (1024*1024)
+
static int64_t get_duration(AVFormatContext *s)
{
DHAVContext *dhav = s->priv_data;
int64_t start_pos = avio_tell(s->pb);
+ int64_t end_pos = -1;
int64_t start = 0, end = 0;
struct tm timeinfo;
- int max_interations = 100000;
+ uint8_t *end_buffer;
+ int64_t end_buffer_size;
+ int64_t end_buffer_pos;
+ int64_t offset;
+ int ret;
if (!s->pb->seekable)
return 0;
- avio_seek(s->pb, avio_size(s->pb) - 8, SEEK_SET);
- while (avio_tell(s->pb) > 12 && max_interations--) {
- if (avio_rl32(s->pb) == MKTAG('d','h','a','v')) {
- int64_t seek_back = avio_rl32(s->pb);
+ end_buffer_size = FFMIN(MAX_DURATION_BUFFER_SIZE, avio_size(s->pb));
+ end_buffer = av_malloc(end_buffer_size);
+ if (!end_buffer)
+ return 0;
+ end_buffer_pos = avio_size(s->pb) - end_buffer_size;
+ avio_seek(s->pb, end_buffer_pos, SEEK_SET);
+ ret = ffio_ensure_seekback(s->pb, end_buffer_size);
+ if (ret < 0)
+ return 0;
+ avio_read(s->pb, end_buffer, end_buffer_size);
- avio_seek(s->pb, -seek_back, SEEK_CUR);
- read_chunk(s);
- get_timeinfo(dhav->date, &timeinfo);
- end = av_timegm(&timeinfo) * 1000LL;
+ offset = end_buffer_size - 8;
+ while (offset > 0) {
+ if (AV_RL32(end_buffer + offset) == MKTAG('d','h','a','v')) {
+ int64_t seek_back = AV_RL32(end_buffer + offset + 4);
+ end_pos = end_buffer_pos + offset - seek_back + 8;
break;
} else {
- avio_seek(s->pb, -12, SEEK_CUR);
+ offset -= 9;
}
}
- avio_seek(s->pb, start_pos, SEEK_SET);
+ av_freep(&end_buffer);
+
+ if (end_pos < 0) {
+ avio_seek(s->pb, start_pos, SEEK_SET);
+ return 0;
+ }
+
+ avio_seek(s->pb, end_pos, SEEK_SET);
+ read_chunk(s);
+ get_timeinfo(dhav->date, &timeinfo);
+ end = av_timegm(&timeinfo) * 1000LL;
+ avio_seek(s->pb, start_pos, SEEK_SET);
read_chunk(s);
get_timeinfo(dhav->date, &timeinfo);
start = av_timegm(&timeinfo) * 1000LL;
--
2.49.0
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [FFmpeg-devel] [PATCH v2] avformat/dhav: fix backward scanning for get_duration and optimize seeking
2025-05-20 15:27 [FFmpeg-devel] [PATCH v2] avformat/dhav: fix backward scanning for get_duration and optimize seeking Derek Buitenhuis
@ 2025-05-20 19:18 ` Andreas Rheinhardt
2025-05-21 13:23 ` Derek Buitenhuis
0 siblings, 1 reply; 3+ messages in thread
From: Andreas Rheinhardt @ 2025-05-20 19:18 UTC (permalink / raw)
To: ffmpeg-devel
Derek Buitenhuis:
> From: Justin Ruggles <justinr@vimeo.com>
>
> The backwards scanning done for incomplete final packets should not
> assume a specific alignment at the end of the file. Truncated files
> result in hundreds of thousands of seeks if the final packet does not
> fall on a specific byte boundary, which can be extremely slow.
> For example, with HTTP, each backwards seek results in a separate
> HTTP request.
>
> This changes the scanning to check for the end tag 1 byte at a time
> and buffers the last 1 MiB using ffio_ensure_seekback to avoid additional
> seek operations.
>
> Signed-off-by: Justin Ruggles <justinr@vimeo.com>
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
> Addresses Andreas' comments from the previous round.
> ---
> libavformat/dhav.c | 48 +++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 37 insertions(+), 11 deletions(-)
>
> diff --git a/libavformat/dhav.c b/libavformat/dhav.c
> index b2ead99609..27e2f4db23 100644
> --- a/libavformat/dhav.c
> +++ b/libavformat/dhav.c
> @@ -22,6 +22,7 @@
>
> #include <time.h>
>
> +#include "libavutil/intreadwrite.h"
> #include "libavutil/mem.h"
> #include "libavutil/parseutils.h"
> #include "avio_internal.h"
> @@ -232,34 +233,59 @@ static void get_timeinfo(unsigned date, struct tm *timeinfo)
> timeinfo->tm_sec = sec;
> }
>
> +#define MAX_DURATION_BUFFER_SIZE (1024*1024)
> +
> static int64_t get_duration(AVFormatContext *s)
> {
> DHAVContext *dhav = s->priv_data;
> int64_t start_pos = avio_tell(s->pb);
> + int64_t end_pos = -1;
> int64_t start = 0, end = 0;
> struct tm timeinfo;
> - int max_interations = 100000;
> + uint8_t *end_buffer;
> + int64_t end_buffer_size;
> + int64_t end_buffer_pos;
> + int64_t offset;
> + int ret;
>
> if (!s->pb->seekable)
> return 0;
>
> - avio_seek(s->pb, avio_size(s->pb) - 8, SEEK_SET);
> - while (avio_tell(s->pb) > 12 && max_interations--) {
> - if (avio_rl32(s->pb) == MKTAG('d','h','a','v')) {
> - int64_t seek_back = avio_rl32(s->pb);
> + end_buffer_size = FFMIN(MAX_DURATION_BUFFER_SIZE, avio_size(s->pb));
> + end_buffer = av_malloc(end_buffer_size);
> + if (!end_buffer)
> + return 0;
> + end_buffer_pos = avio_size(s->pb) - end_buffer_size;
> + avio_seek(s->pb, end_buffer_pos, SEEK_SET);
> + ret = ffio_ensure_seekback(s->pb, end_buffer_size);
> + if (ret < 0)
> + return 0;
> + avio_read(s->pb, end_buffer, end_buffer_size);
>
> - avio_seek(s->pb, -seek_back, SEEK_CUR);
> - read_chunk(s);
> - get_timeinfo(dhav->date, &timeinfo);
> - end = av_timegm(&timeinfo) * 1000LL;
> + offset = end_buffer_size - 8;
> + while (offset > 0) {
> + if (AV_RL32(end_buffer + offset) == MKTAG('d','h','a','v')) {
> + int64_t seek_back = AV_RL32(end_buffer + offset + 4);
> + end_pos = end_buffer_pos + offset - seek_back + 8;
> break;
> } else {
> - avio_seek(s->pb, -12, SEEK_CUR);
> + offset -= 9;
> }
> }
>
> - avio_seek(s->pb, start_pos, SEEK_SET);
> + av_freep(&end_buffer);
> +
> + if (end_pos < 0) {
> + avio_seek(s->pb, start_pos, SEEK_SET);
> + return 0;
> + }
> +
> + avio_seek(s->pb, end_pos, SEEK_SET);
> + read_chunk(s);
> + get_timeinfo(dhav->date, &timeinfo);
> + end = av_timegm(&timeinfo) * 1000LL;
The only thing that you want from parsing the end is the end timestamp
and this is given by data alone. date is a 32bit le number at offset 16
from the start of the dhav tag; it can be read directly from the buffer,
so you do not need to seek again and use read_chunk to get the end.
It also means that your ffio_ensure_seekback() is unnecessary (unless
you wanted to ensure that the seek back to start_pos works (which is
currently not guaranteed at all and not ensured by your code and would
probably also be bad in general given that this would cache the whole
file in memory).
(The above is based on the presumption that we are not really interested
in what parse_ext() may parse in the chunk at the end; I don't know
whether this is true or not.)
- Andreas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [FFmpeg-devel] [PATCH v2] avformat/dhav: fix backward scanning for get_duration and optimize seeking
2025-05-20 19:18 ` Andreas Rheinhardt
@ 2025-05-21 13:23 ` Derek Buitenhuis
0 siblings, 0 replies; 3+ messages in thread
From: Derek Buitenhuis @ 2025-05-21 13:23 UTC (permalink / raw)
To: ffmpeg-devel
On 5/20/2025 8:18 PM, Andreas Rheinhardt wrote:
> The only thing that you want from parsing the end is the end timestamp
> and this is given by data alone. date is a 32bit le number at offset 16
> from the start of the dhav tag; it can be read directly from the buffer,
> so you do not need to seek again and use read_chunk to get the end.
> It also means that your ffio_ensure_seekback() is unnecessary (unless
> you wanted to ensure that the seek back to start_pos works (which is
> currently not guaranteed at all and not ensured by your code and would
> probably also be bad in general given that this would cache the whole
> file in memory).
Thanks for the review, v3 sent.
> (The above is based on the presumption that we are not really interested
> in what parse_ext() may parse in the chunk at the end; I don't know
> whether this is true or not.)
AFAICT, we are not.
Cheers,
- Derek
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-05-21 13:23 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-20 15:27 [FFmpeg-devel] [PATCH v2] avformat/dhav: fix backward scanning for get_duration and optimize seeking Derek Buitenhuis
2025-05-20 19:18 ` Andreas Rheinhardt
2025-05-21 13:23 ` Derek Buitenhuis
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This inbox may be cloned and mirrored by anyone:
git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git
# If you have public-inbox 1.1+ installed, you may
# initialize and index your mirror using the following commands:
public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
ffmpegdev@gitmailbox.com
public-inbox-index ffmpegdev
Example config snippet for mirrors.
AGPL code for this site: git clone https://public-inbox.org/public-inbox.git