From: Allan Cady via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Cc: Allan Cady <allancady@yahoo.com> Subject: Re: [FFmpeg-devel] [libavutil/timestamp.h: Fix loss of precision in timestamps for silencedetect on long files] Date: Mon, 11 Mar 2024 23:46:00 +0000 (UTC) Message-ID: <1773989202.1734553.1710200760704@mail.yahoo.com> (raw) In-Reply-To: <D86B9AE2-B398-4A01-9F80-5D42BB8AD99B@gmail.com> > On Monday, March 11, 2024 at 12:50:11 PM PDT, <epirat07@gmail.com> wrote: > On 11 Mar 2024, at 15:26, Andreas Rheinhardt wrote: >> Andreas Rheinhardt: >>> Allan Cady via ffmpeg-devel: >>>> From: "Allan Cady" <allancady@yahoo.com> >>>> >>>> I propose changing the format to "%.6f", which will >>>> give microsecond precision for all timestamps, regardless of >>>> offset. Trailing zeros can be trimmed from the fraction, without >>>> losing precision. If the length of the fixed-precision formatted >>>> timestamp exceeds the length of the allocated buffer >>>> (AV_TS_MAX_STRING_SIZE, currently 32, less one for the >>>> terminating null), then we can fall back to scientific notation, though >>>> this seems almost certain to never occur, because 32 characters would >>>> allow a maximum timestamp value of (32 - 1 - 6 - 1) = 24 characters. >>>> By my calculation, 10^24 seconds is about six orders of magnitude >>>> greater than the age of the universe. >>>> >>>> The fix proposed here follows the following logic: >>>> >>>> 1) Try formatting the number of seconds using "%.6f". This will >>>> always result in a string with six decimal digits in the fraction, >>>> possibly including trailing zeros. (e.g. "897234.73200"). >>>> >>>> 2) Check if that string would overflow the buffer. If it would, then >>>> format it using scientific notation ("%.8g"). >>>> >>>> 3) If the original fixed-point format fits, then trim any trailing >>>> zeros and decimal point, and return that result. >>>> >>>> Making this change broke two fate tests, filter-metadata-scdet, >>>> and filter-metadata-silencedetect. To correct this, I've modified >>>> tests/ref/fate/filter-metadata-scdet and >>>> tests/ref/fate/filter-metadata-silencedetect to match the >>>> new output. >>>> >>>> Signed-off-by: Allan Cady <allancady@yahoo.com> >>>> --- >>>> libavutil/timestamp.h | 53 +++++++++++++++++++- >>>> tests/ref/fate/filter-metadata-scdet | 12 ++--- >>>> tests/ref/fate/filter-metadata-silencedetect | 2 +- >>>> 3 files changed, 58 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/libavutil/timestamp.h b/libavutil/timestamp.h >>>> index 2b37781eba..2f04f9bb2b 100644 >>>> --- a/libavutil/timestamp.h >>>> +++ b/libavutil/timestamp.h >>>> @@ -25,6 +25,7 @@ >>>> #define AVUTIL_TIMESTAMP_H >>>> >>>> #include "avutil.h" >>>> +#include <stdbool.h> >>>> >>>> #if defined(__cplusplus) && !defined(__STDC_FORMAT_MACROS) && !defined(PRId64) >>>> #error missing -D__STDC_FORMAT_MACROS / #define __STDC_FORMAT_MACROS >>>> @@ -53,6 +54,32 @@ static inline char *av_ts_make_string(char *buf, int64_t ts) >>>> */ >>>> #define av_ts2str(ts) av_ts_make_string((char[AV_TS_MAX_STRING_SIZE]){0}, ts) >>>> >>>> +/** >>>> + * Strip trailing zeros and decimal point from a string. Performed >>>> + * in-place on input buffer. For local use only by av_ts_make_time_string. >>>> + * >>>> + * e.g.: >>>> + * "752.378000" -> "752.378" >>>> + * "4.0" -> "4" >>>> + * "97300" -> "97300" >>>> + */ >>>> +static inline void av_ts_strip_trailing_zeros_and_decimal_point(char *str) { >>>> + if (strchr(str, '.')) >>>> + { >>>> + int i; >>>> + for (i = strlen(str) - 1; i >= 0 && str[i] == '0'; i--) { >>>> + str[i] = '\0'; >>>> + } >>>> + >>>> + // Remove decimal point if it's the last character >>>> + if (i >= 0 && str[i] == '.') { >>>> + str[i] = '\0'; >>>> + } >>>> + >>>> + // String was modified in place; no need for return value. >>>> + } >>>> +} >>>> + >>>> /** >>>> * Fill the provided buffer with a string containing a timestamp time >>>> * representation. >>>> @@ -65,8 +92,30 @@ static inline char *av_ts_make_string(char *buf, int64_t ts) >>>> static inline char *av_ts_make_time_string(char *buf, int64_t ts, >>>> const AVRational *tb) >>>> { >>>> - if (ts == AV_NOPTS_VALUE) snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS"); >>>> - else snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6g", av_q2d(*tb) * ts); >>>> + if (ts == AV_NOPTS_VALUE) >>>> + { >>>> + snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS"); >>>> + } >>>> + else >>>> + { >>>> + const int max_fraction_digits = 6; >>>> + >>>> + // Convert 64-bit timestamp to double, using rational timebase >>>> + double seconds = av_q2d(*tb) * ts; >>>> + >>>> + int length = snprintf(NULL, 0, "%.*f", max_fraction_digits, seconds); >>>> + if (length <= AV_TS_MAX_STRING_SIZE - 1) >>>> + { >>>> + snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.*f", max_fraction_digits, seconds); >>>> + av_ts_strip_trailing_zeros_and_decimal_point(buf); >>>> + } >>>> + else >>>> + { >>>> + snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.8g", seconds); >>>> + } >>>> + >>>> + } >>>> + >>>> return buf; >>>> } >>>> >>> >>> 1. What makes you believe that all users want the new format and that it >>> does not cause undesired behaviour for some (maybe a lot) of them? I definitely do not know what other users would want. I would think maybe some would like the change, others wouldn't, and most would never know. >>> The >>> number of characters written by the earlier code stayed pretty constant >>> even when the times became big (in this case, it just printed 8 chars if >>> ts>=0), yet your code will really make use of the whole buffer. It's true that my change will increase the potential length of the output beyond 8 significant digits. The issue I was having that brought this up was, I have some very long audio files (up to 50 hours long), which I was wanting to split into smaller pieces. I wrote some scripts that use silencedetect to get the locations of breaks and then split the files at the breaks, but I discovered that for segments near the end of the file, silencedetect was returning whole-number timestamps, which was causing undesirable results for me. Thinking functionally, it seems like timestamps further out in a file ought to have the same precision as those near the beginning. So this seems to me like a minor oversight in the original design, that might warrant fixing. >>> Granted, we could tell our users that they have no right to complain >>> about this, given that we always had a "right" to use the full buffer, >>> but I consider this a violation of the principle of least surprise. I definitely agree with you there. >>> Why don't you just change silencedetect or add another function? I actually started out taking that approach in my submission a few weeks ago. Marton Balint suggested (in a message on 20 Feb) that we make the change in av_ts_make_time_string, so I did that for this submission. I'm open to whatever approach you all consider is best. >>> 2. For very small timestamps (< 10^-4), the new code will print a lot of >>> useless leading zeros (after the decimal point). In fact, it can be so >>> many that the new code has less precision than the old code, despite >>> using the fill buffer. I don't understand. Leading zeros after the decimal point are far from useless -- they are part of the value. Maybe what you're saying is that six digits is more precision than necessary? That may be so. I could personally do fine with just two digits (hundredths), as long as it's consistent through the length of the file. >>> 2. This is way too much code for an inline function. No disagreement there. >>> 3. Anyway, your placement of {} on their own lines does not match the >>> project coding style. I'm happy to conform with project coding style. >> In addition to this, there is another issue here: Your >> av_ts_strip_trailing_zeros_and_decimal_point() presumes that the >> "decimal-point character" is always '.', but this can be changed via >> setlocale(). Excellent point, which I hadn't considered. I have no experience with how locale is handled in C. I would welcome advice on the best way to handle this. > True, though I would consider this a more general bug. We should be > consistent and not generate files that are locale-dependent and then > not parseable anymore with a different one… That’s just a huge mess. > > Also in general FFmpeg is completely broken if you use any locale that > does not use . as decimal separator. (This never shows for most users > currently as most people use FFmpeg CLI which does not respect the > users locale) I'll leave that conversation to the experts here. Thanks for giving my code a look. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2024-03-11 23:46 UTC|newest] Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <20240221073915.26055-1-allancady.ref@yahoo.com> 2024-02-21 7:39 ` [FFmpeg-devel] [PATCH] libavutil/timestamp.h: Fix loss of precision in timestamps for silencedetect on long files Allan Cady via ffmpeg-devel 2024-02-21 20:25 ` Marton Balint 2024-02-22 1:21 ` Allan Cady via ffmpeg-devel 2024-02-22 9:16 ` Marton Balint 2024-02-23 21:44 ` Allan Cady via ffmpeg-devel 2024-02-23 21:48 ` Allan Cady via ffmpeg-devel 2024-02-23 22:47 ` Marton Balint 2024-03-11 2:56 ` [FFmpeg-devel] [PATCH] When the silencedetect filter is run against long files, the output timestamps gradually lose precision as the scan proceeds further into the file. This is because the output is formatted (in libavutil/timestamp.h) as "%.6g", which limits the total field length. Eventually, for offsets greater than 100000 seconds (about 28 hours), fractions of a second disappear altogether, and the timestamps are logged as whole seconds Allan Cady via ffmpeg-devel 2024-03-11 13:43 ` Andreas Rheinhardt 2024-03-11 14:26 ` Andreas Rheinhardt 2024-03-11 19:49 ` epirat07 2024-03-11 23:46 ` Allan Cady via ffmpeg-devel [this message] 2024-03-11 19:11 ` [FFmpeg-devel] [PATCH] change av_ts_make_time_string precision Marton Balint 2024-03-12 2:57 ` Allan Cady via ffmpeg-devel 2024-03-12 21:24 ` Marton Balint 2024-03-13 6:03 ` Allan Cady via ffmpeg-devel 2024-03-17 23:40 ` [FFmpeg-devel] [PATCH] avutil/timestamp: keep microsecond precision in av_ts_make_time_string Marton Balint 2024-03-19 21:14 ` Marton Balint 2024-03-19 23:38 ` Andreas Rheinhardt 2024-03-20 12:44 ` Andreas Rheinhardt 2024-03-20 19:51 ` Marton Balint 2024-03-22 21:50 ` Marton Balint 2024-03-22 21:58 ` Andreas Rheinhardt 2024-03-23 11:14 ` [FFmpeg-devel] [PATCH v2 1/2] avutil/timestamp: introduce av_ts_make_time_string2 for better precision Marton Balint 2024-03-23 11:14 ` [FFmpeg-devel] [PATCH v2 2/2] avutil/timestamp: change precision of av_ts_make_time_string() Marton Balint 2024-03-25 18:27 ` [FFmpeg-devel] [PATCH v2 1/2] avutil/timestamp: introduce av_ts_make_time_string2 for better precision Marton Balint 2024-03-25 22:13 ` Allan Cady via ffmpeg-devel 2024-06-17 20:08 ` Hendrik Leppkes 2024-06-17 20:33 ` [FFmpeg-devel] [PATCH] avutil/timestamp: avoid possible FPE when 0 is passed to av_ts_make_time_string2() Marton Balint 2024-06-18 7:55 ` Rémi Denis-Courmont 2024-06-18 19:07 ` Marton Balint 2024-06-29 8:24 ` Marton Balint 2024-03-17 23:43 ` [FFmpeg-devel] [PATCH] change av_ts_make_time_string precision Marton Balint 2024-03-19 22:46 ` Allan Cady via ffmpeg-devel 2024-03-13 6:09 ` Allan Cady via ffmpeg-devel 2024-03-11 3:57 ` [FFmpeg-devel] [PATCH] libavutil/timestamp.h: Fix loss of precision in timestamps for silencedetect on long files Allan Cady via ffmpeg-devel 2024-03-11 4:13 ` Allan Cady via ffmpeg-devel
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1773989202.1734553.1710200760704@mail.yahoo.com \ --to=ffmpeg-devel@ffmpeg.org \ --cc=allancady@yahoo.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git