* [FFmpeg-devel] ffmpeg appears to be wasting CPU time
@ 2023-03-30 8:20 Davy Durham
2023-04-07 4:03 ` Davy Durham
0 siblings, 1 reply; 4+ messages in thread
From: Davy Durham @ 2023-03-30 8:20 UTC (permalink / raw)
To: ffmpeg-devel
Greetings,
I'm seeing something curious when using ffmpeg and a readrate (using
-re specifically). *The cumulative CPU usage when reading at a lower
rate is significantly higher than when letting it run as fast as possible.*
Evidence:
Using master HEAD, using the /time/ command, I am converting a 22.61s
.wav file to .mp3.
Running at *full speed* the results are:
time ./ffmpeg -i input.wav output.mp3
size= 355kB time=00:00:22.62 bitrate= 128.4kbits/s speed=52.1x
video:0kB audio:354kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: 0.121834%
real 0m0.464s
*user 0m0.487s**
**sys 0m0.036s*
Running at *realtime**speed* (adding -re) the results are:
time ./ffmpeg -i input.wav output.mp3
real 0m22.664s
*user 0m2.253s**
**sys 0m0.156s*
This works out to the realtime speed run consuming *~10.6%* of the CPU
(which about what I see on top as well) when it should only consume
*~2.31%*, based on how much CPU was actually needed for the conversion
in the fullspeed run. (i.e. (0.487s+0.036s) * 100% / 22.61s == ~2.31%).
This means it consumes almost 5x the CPU when reading at the slower
rate.
Experiment:
I started digging and noticed that when readrates are enabled, the
transcoding loop in ffmpeg sleeps arbitrarily for 10ms then wakes up and
does more reading, however, the read function may decide it still isn't
time to return anything (EAGAIN) and so things may go back to sleep for
another trip through the loop.
I thought this might be the inefficiency, but it turns out not to be the
case.
If I run it without the -re flag, but introduce an unconditional call to
usleep(250ms) in the loop (simply to slow it down in a similar fashion,
but avoiding the code paths that -re introduces), then one would expect
it to consume about the same amount of cumulative CPU as the full sped
run above, only it should just take more overall time while it mostly
sleeps.
i.e. this simple change
diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
index 438bee8fef..32a2087640 100644
--- a/fftools/ffmpeg.c
+++ b/fftools/ffmpeg.c
@@ -3936,6 +3936,8 @@ static int transcode_step(void)
av_log(NULL, AV_LOG_VERBOSE, "No more inputs to read from, finishing.\n");
return AVERROR_EOF;
}
+
+ av_usleep(250000/*250ms*/);
if (ost->filter && !ost->filter->graph->graph) {
if (ifilter_has_all_input_formats(ost->filter->graph)) {
This gives the same surprising result:
time ./ffmpeg -i input.wav output.mp3
real 6m10.163s***user 0m2.304s****sys 0m0.174s*
And so again, that real time is expected, but the user+sys /should/ be
about the same as it was in the first run (i.e. ~0.5s)..
Conclusion?
*Can anyone guess as to why this may be happening?* I'd like to get to
the bottom of it and submit a patch.
I suspected something with the use of threads, and maybe threads aren't
efficiently pausing but are spinning when queues start to fill or empty
out? But if I introduce a much longer sleep, I don't see anything
register on top while the main thread is in that sleep state.
And BTW, the large discrepancy exists for much longer files too
(involved in my actual use case).
I see the same behavior in v5.1.2 which I'm using in production.
If I perform the same experimental code change on v2.2.1, I do not get
the unexpected difference. I just picked some old version, and would be
happy to figure out more precisely when it was introduced if that would
help.
Any advice would be appreciated. Thank You
P.S. You might wonder why I need to run ffmpeg in realtime, and it's
because my actual use case involves HLS and requires converting at the
realtime rate, but this simple mp3 conversion demonstrates the problem
I'm seeing there.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [FFmpeg-devel] ffmpeg appears to be wasting CPU time
2023-03-30 8:20 [FFmpeg-devel] ffmpeg appears to be wasting CPU time Davy Durham
@ 2023-04-07 4:03 ` Davy Durham
2023-04-07 16:15 ` Kieran Kunhya
0 siblings, 1 reply; 4+ messages in thread
From: Davy Durham @ 2023-04-07 4:03 UTC (permalink / raw)
To: ffmpeg-devel
After furiously digging through the ffmpeg code (particularly in older
versions that also exhibit the problem, but where the code is simpler) I
have found no reason for there to be wasted CPU time.
In my travels, I frustratingly noticed that sometime the problem would
mysteriously vanish. And, long story short, I believe I've been chasing
a phantom. If, on my local machine w/ 8 cores, I start a process that
simply spins one CPU, the problem vanishes--that is, now whether running
ffmpeg at full speed or with a read rate limit, the cumulative CPU
reported by 'time' is nearly** the same, as one would expect it to be.
I believe this is a manifestation of some oddball linux kernel behavior,
or perhaps of glibc, where the sleep operation is sometimes just a CPU
spin rather than a context switch since the machine doesn't have
anything else to do--kind of like a spin lock improves efficiency when
locks are expected to be short.
I also noticed that under valgrind's callgrind tool, I could get no
different results based on full speed or rate limited. Probably because
that intensive analysis tool spins the CPU regardless.
Though, I've tried looking into the massive and hard-to-follow glibc and
kernel sources, I haven't spotted a culprit, but nevertheless this seems
to be the observed behavior, and it's not apparently caused by ffmpeg
itself.
So, I'll go on faith that in the presence of other actual tasks in a
production environment, the ffmpeg won't be unduly consuming CPU.
If anyone else has an insight, I'd still be interested in trying to
understand the situation, but I think I'm good for now.
----
**- "nearly" When read rate limited, still the the CPU is about 20%
higher than it should be, and I tracked this down to the frequency at
which the read function gets repeatedly called only to return EAGAIN
over and over. The call to av_usleep after receiving EAGAIN is 10ms.
This is quite short, causing a lot of wasted wakeups, when it's likely
that the next frame is going to be returned more like 250ms in the
future. When I change this sleep to 250ms, the cumulative CPU becomes
virtually identical.
I believe this can be improved by having the read function return the
future time (instead of EAGAIN) at which more data should be read (based
on the timestamps in the read buffers), and then it can sleep precisely
the needed amount instead of blindly waking up often only to be told to
go try again later (how much later, it doesn't know, so it just tries
again in 10ms). I'm working on a patch that makes this improvement.
On 3/30/23 03:20, Davy Durham wrote:
>
> Greetings,
>
> I'm seeing something curious when using ffmpeg and a readrate (using
> -re specifically). *The cumulative CPU usage when reading at a lower
> rate is significantly higher than when letting it run as fast as
> possible.*
>
>
>
> Evidence:
>
> Using master HEAD, using the /time/ command, I am converting a 22.61s
> .wav file to .mp3.
>
> Running at *full speed* the results are:
>
> time ./ffmpeg -i input.wav output.mp3
>
> size= 355kB time=00:00:22.62 bitrate= 128.4kbits/s speed=52.1x
> video:0kB audio:354kB subtitle:0kB other streams:0kB global
> headers:0kB muxing overhead: 0.121834%
>
> real 0m0.464s
> *user 0m0.487s**
> **sys 0m0.036s*
>
> Running at *realtime**speed* (adding -re) the results are:
>
> time ./ffmpeg -i input.wav output.mp3
>
> real 0m22.664s
> *user 0m2.253s**
> **sys 0m0.156s*
>
> This works out to the realtime speed run consuming *~10.6%* of the CPU
> (which about what I see on top as well) when it should only consume
> *~2.31%*, based on how much CPU was actually needed for the conversion
> in the fullspeed run. (i.e. (0.487s+0.036s) * 100% / 22.61s ==
> ~2.31%). This means it consumes almost 5x the CPU when reading at
> the slower rate.
>
>
> Experiment:
>
> I started digging and noticed that when readrates are enabled, the
> transcoding loop in ffmpeg sleeps arbitrarily for 10ms then wakes up
> and does more reading, however, the read function may decide it still
> isn't time to return anything (EAGAIN) and so things may go back to
> sleep for another trip through the loop.
>
> I thought this might be the inefficiency, but it turns out not to be
> the case.
>
> If I run it without the -re flag, but introduce an unconditional call
> to usleep(250ms) in the loop (simply to slow it down in a similar
> fashion, but avoiding the code paths that -re introduces), then one
> would expect it to consume about the same amount of cumulative CPU as
> the full sped run above, only it should just take more overall time
> while it mostly sleeps.
>
> i.e. this simple change
>
> diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
> index 438bee8fef..32a2087640 100644
> --- a/fftools/ffmpeg.c
> +++ b/fftools/ffmpeg.c
> @@ -3936,6 +3936,8 @@ static int transcode_step(void)
> av_log(NULL, AV_LOG_VERBOSE, "No more inputs to read from, finishing.\n");
> return AVERROR_EOF;
> }
> +
> + av_usleep(250000/*250ms*/);
>
> if (ost->filter && !ost->filter->graph->graph) {
> if (ifilter_has_all_input_formats(ost->filter->graph)) {
>
> This gives the same surprising result:
>
> time ./ffmpeg -i input.wav output.mp3
>
> real 6m10.163s***user 0m2.304s****sys 0m0.174s*
>
>
> And so again, that real time is expected, but the user+sys /should/ be
> about the same as it was in the first run (i.e. ~0.5s)..
>
>
>
> Conclusion?
>
> *Can anyone guess as to why this may be happening?* I'd like to get
> to the bottom of it and submit a patch.
>
> I suspected something with the use of threads, and maybe threads
> aren't efficiently pausing but are spinning when queues start to fill
> or empty out? But if I introduce a much longer sleep, I don't see
> anything register on top while the main thread is in that sleep state.
>
> And BTW, the large discrepancy exists for much longer files too
> (involved in my actual use case).
>
> I see the same behavior in v5.1.2 which I'm using in production.
>
> If I perform the same experimental code change on v2.2.1, I do not get
> the unexpected difference. I just picked some old version, and would
> be happy to figure out more precisely when it was introduced if that
> would help.
>
> Any advice would be appreciated. Thank You
>
>
> P.S. You might wonder why I need to run ffmpeg in realtime, and it's
> because my actual use case involves HLS and requires converting at the
> realtime rate, but this simple mp3 conversion demonstrates the problem
> I'm seeing there.
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [FFmpeg-devel] ffmpeg appears to be wasting CPU time
2023-04-07 4:03 ` Davy Durham
@ 2023-04-07 16:15 ` Kieran Kunhya
2023-04-07 20:26 ` Davy Durham
0 siblings, 1 reply; 4+ messages in thread
From: Kieran Kunhya @ 2023-04-07 16:15 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, 7 Apr 2023 at 05:03, Davy Durham <ddurham@davyandbeth.com> wrote:
> After furiously digging through the ffmpeg code (particularly in older
> versions that also exhibit the problem, but where the code is simpler) I
> have found no reason for there to be wasted CPU time.
>
> In my travels, I frustratingly noticed that sometime the problem would
> mysteriously vanish. And, long story short, I believe I've been chasing
> a phantom. If, on my local machine w/ 8 cores, I start a process that
> simply spins one CPU, the problem vanishes--that is, now whether running
> ffmpeg at full speed or with a read rate limit, the cumulative CPU
> reported by 'time' is nearly** the same, as one would expect it to be.
>
Are you sure this isn't just your CPU governor and/or the effects of
frequency scaling/C-states?
Kieran
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [FFmpeg-devel] ffmpeg appears to be wasting CPU time
2023-04-07 16:15 ` Kieran Kunhya
@ 2023-04-07 20:26 ` Davy Durham
0 siblings, 0 replies; 4+ messages in thread
From: Davy Durham @ 2023-04-07 20:26 UTC (permalink / raw)
To: ffmpeg-devel
Ah, you appear to be correct. I learned about cpufreq and adjusted it
to not scale and the problem also disappears then too. I had not even
considered that. Thank you
On 4/7/23 11:15, Kieran Kunhya wrote:
> On Fri, 7 Apr 2023 at 05:03, Davy Durham <ddurham@davyandbeth.com> wrote:
>
>> After furiously digging through the ffmpeg code (particularly in older
>> versions that also exhibit the problem, but where the code is simpler) I
>> have found no reason for there to be wasted CPU time.
>>
>> In my travels, I frustratingly noticed that sometime the problem would
>> mysteriously vanish. And, long story short, I believe I've been chasing
>> a phantom. If, on my local machine w/ 8 cores, I start a process that
>> simply spins one CPU, the problem vanishes--that is, now whether running
>> ffmpeg at full speed or with a read rate limit, the cumulative CPU
>> reported by 'time' is nearly** the same, as one would expect it to be.
>>
> Are you sure this isn't just your CPU governor and/or the effects of
> frequency scaling/C-states?
>
> Kieran
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-04-07 20:26 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-30 8:20 [FFmpeg-devel] ffmpeg appears to be wasting CPU time Davy Durham
2023-04-07 4:03 ` Davy Durham
2023-04-07 16:15 ` Kieran Kunhya
2023-04-07 20:26 ` Davy Durham
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This inbox may be cloned and mirrored by anyone:
git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git
# If you have public-inbox 1.1+ installed, you may
# initialize and index your mirror using the following commands:
public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
ffmpegdev@gitmailbox.com
public-inbox-index ffmpegdev
Example config snippet for mirrors.
AGPL code for this site: git clone https://public-inbox.org/public-inbox.git