Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
* [FFmpeg-devel] [PATCH] libavcodec/cuviddec.c: increase CUVID_DEFAULT_NUM_SURFACES
@ 2025-02-20 20:37 Scott Theisen
  2025-02-21 13:26 ` Timo Rothenpieler
  0 siblings, 1 reply; 3+ messages in thread
From: Scott Theisen @ 2025-02-20 20:37 UTC (permalink / raw)
  To: ffmpeg-devel

The default value of CuvidContext::nb_surfaces was reduced from 25 to 5 (as
(CUVID_MAX_DISPLAY_DELAY + 1)) in 402d98c9d467dff6931d906ebb732b9a00334e0b.

In cuvid_is_buffer_full() delay can be 2 * CUVID_MAX_DISPLAY_DELAY with double
rate deinterlacing.  ctx->nb_surfaces is CUVID_DEFAULT_NUM_SURFACES =
(CUVID_MAX_DISPLAY_DELAY + 1) by default, in which case cuvid_is_buffer_full()
will always return true and cuvid_output_frame() will never read any data since
it will not call ff_decode_get_packet().
---

I think part of the problem might be that cuvid_is_buffer_full() does not know
how many frames are actually in the driver's queue and assumes it is the
maximum, even if none have yet been added.

This was preventing any frames from being decoded using NVDEC with MythTV for
some streams.  See https://github.com/MythTV/mythtv/issues/1039

---
 libavcodec/cuviddec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 67076a1752..05dcafab6e 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -120,7 +120,7 @@ typedef struct CuvidParsedFrame
 #define CUVID_MAX_DISPLAY_DELAY (4)
 
 // Actual pool size will be determined by parser.
-#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
+#define CUVID_DEFAULT_NUM_SURFACES ((2 * CUVID_MAX_DISPLAY_DELAY) + 1)
 
 static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
 {
-- 
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [FFmpeg-devel] [PATCH] libavcodec/cuviddec.c: increase CUVID_DEFAULT_NUM_SURFACES
  2025-02-20 20:37 [FFmpeg-devel] [PATCH] libavcodec/cuviddec.c: increase CUVID_DEFAULT_NUM_SURFACES Scott Theisen
@ 2025-02-21 13:26 ` Timo Rothenpieler
  2025-02-22  2:52   ` Scott Theisen
  0 siblings, 1 reply; 3+ messages in thread
From: Timo Rothenpieler @ 2025-02-21 13:26 UTC (permalink / raw)
  To: ffmpeg-devel

On 20.02.2025 21:37, Scott Theisen wrote:
> The default value of CuvidContext::nb_surfaces was reduced from 25 to 5 (as
> (CUVID_MAX_DISPLAY_DELAY + 1)) in 402d98c9d467dff6931d906ebb732b9a00334e0b.
> 
> In cuvid_is_buffer_full() delay can be 2 * CUVID_MAX_DISPLAY_DELAY with double
> rate deinterlacing.  ctx->nb_surfaces is CUVID_DEFAULT_NUM_SURFACES =
> (CUVID_MAX_DISPLAY_DELAY + 1) by default, in which case cuvid_is_buffer_full()
> will always return true and cuvid_output_frame() will never read any data since
> it will not call ff_decode_get_packet().

It's been way too long since I looked at all that code, and I didn't 
even write most of the code involved:
> https://github.com/FFmpeg/FFmpeg/commit/bddb2343b6e594e312dadb5d21b408702929ae04
> https://github.com/FFmpeg/FFmpeg/commit/402d98c9d467dff6931d906ebb732b9a00334e0b

But doesn't this instead mean that the logic in cuvid_is_buffer_full is 
flawed somehow?
Just increasing the default number of surfaces does not seem like the 
correct fix or sensible, since it will increase VRAM usage by 
potentially quite a bit for all users.

 From looking at this a bit, the issue will only happen when 
deinterlacing, the logic in cuvid_is_buffer_full becomes stuck then, and 
will always claim the buffer is full.
And from my understanding, it's correct in making that claim. Due to the 
display delay, it could in theory happen that the moment cuvid starts 
outputting frames, there will be more output available than what fits 
into ctx->frame_queue, since it delayed by 4 frames, which results in 8 
surfaces, but the queue only fits 5.

So to me it looks like that the correct fix would be to double the size 
of the frame_queue when deinterlacing, not unconditionally.

nb_surfaces is also used to determine the size of the key_frame array, 
which would then also be pointlessly doubled. But not like a handful of 
extra ints would hurt that much though.
Alternatively the size-doubling could not be reflected in nb_surfaces, 
but that would make the logic in various other places be more complicated.

> ---
> 
> I think part of the problem might be that cuvid_is_buffer_full() does not know
> how many frames are actually in the driver's queue and assumes it is the
> maximum, even if none have yet been added.
> 
> This was preventing any frames from being decoded using NVDEC with MythTV for
> some streams.  See https://github.com/MythTV/mythtv/issues/1039

I'd highly recommend to not use cuviddec anymore, but instead use nvdec.
cuviddec only still exists to sanity-check nvdec against it at times.

> ---
>   libavcodec/cuviddec.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
> index 67076a1752..05dcafab6e 100644
> --- a/libavcodec/cuviddec.c
> +++ b/libavcodec/cuviddec.c
> @@ -120,7 +120,7 @@ typedef struct CuvidParsedFrame
>   #define CUVID_MAX_DISPLAY_DELAY (4)
>   
>   // Actual pool size will be determined by parser.
> -#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
> +#define CUVID_DEFAULT_NUM_SURFACES ((2 * CUVID_MAX_DISPLAY_DELAY) + 1)
>   
>   static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
>   {

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [FFmpeg-devel] [PATCH] libavcodec/cuviddec.c: increase CUVID_DEFAULT_NUM_SURFACES
  2025-02-21 13:26 ` Timo Rothenpieler
@ 2025-02-22  2:52   ` Scott Theisen
  0 siblings, 0 replies; 3+ messages in thread
From: Scott Theisen @ 2025-02-22  2:52 UTC (permalink / raw)
  To: ffmpeg-devel

On 2/21/25 08:26, Timo Rothenpieler wrote:
> On 20.02.2025 21:37, Scott Theisen wrote:
>> The default value of CuvidContext::nb_surfaces was reduced from 25 to 
>> 5 (as
>> (CUVID_MAX_DISPLAY_DELAY + 1)) in 
>> 402d98c9d467dff6931d906ebb732b9a00334e0b.
>>
>> In cuvid_is_buffer_full() delay can be 2 * CUVID_MAX_DISPLAY_DELAY 
>> with double
>> rate deinterlacing.  ctx->nb_surfaces is CUVID_DEFAULT_NUM_SURFACES =
>> (CUVID_MAX_DISPLAY_DELAY + 1) by default, in which case 
>> cuvid_is_buffer_full()
>> will always return true and cuvid_output_frame() will never read any 
>> data since
>> it will not call ff_decode_get_packet().
>
> It's been way too long since I looked at all that code, and I didn't 
> even write most of the code involved:
>> https://github.com/FFmpeg/FFmpeg/commit/bddb2343b6e594e312dadb5d21b408702929ae04 
>>
>> https://github.com/FFmpeg/FFmpeg/commit/402d98c9d467dff6931d906ebb732b9a00334e0b 
>>
>
> But doesn't this instead mean that the logic in cuvid_is_buffer_full 
> is flawed somehow?

I think it is the number of frames ready to send to the driver + the 
number of frames in queue in the driver >= the number of decoded frame 
buffers.  However, it doesn't actually know how many frames are in queue 
in the driver and assumes the maximum.

> Just increasing the default number of surfaces does not seem like the 
> correct fix or sensible, since it will increase VRAM usage by 
> potentially quite a bit for all users.
>

The changes to cuvid_handle_video_sequence() from 
402d98c9d467dff6931d906ebb732b9a00334e0b will increase nb_surfaces once 
data has been read.

> From looking at this a bit, the issue will only happen when 
> deinterlacing, the logic in cuvid_is_buffer_full becomes stuck then, 
> and will always claim the buffer is full.
> And from my understanding, it's correct in making that claim. Due to 
> the display delay, it could in theory happen that the moment cuvid 
> starts outputting frames, there will be more output available than 
> what fits into ctx->frame_queue, since it delayed by 4 frames, which 
> results in 8 surfaces, but the queue only fits 5.
>
> So to me it looks like that the correct fix would be to double the 
> size of the frame_queue when deinterlacing, not unconditionally.

There is nothing stopping deint_mode or drop_second_field from being 
changed after cuvid_decode_init() is called, so it doesn't necessarily 
know it will deinterlace.

Regardless, 402d98c9d467dff6931d906ebb732b9a00334e0b reduced 
CUVID_DEFAULT_NUM_SURFACES from 25 to *only 5* to not break playback 
entirely.  I don't think the intention was to break playback for double 
rate deinterlacing while allowing playback for only single rate 
deinterlacing.

Also, if AV_CODEC_FLAG_LOW_DELAY is set, then only one output surface is 
needed, but there are still 5.

>
> nb_surfaces is also used to determine the size of the key_frame array, 
> which would then also be pointlessly doubled. But not like a handful 
> of extra ints would hurt that much though.
> Alternatively the size-doubling could not be reflected in nb_surfaces, 
> but that would make the logic in various other places be more 
> complicated.
>
>> ---
>>
>> I think part of the problem might be that cuvid_is_buffer_full() does 
>> not know
>> how many frames are actually in the driver's queue and assumes it is the
>> maximum, even if none have yet been added.
>>
>> This was preventing any frames from being decoded using NVDEC with 
>> MythTV for
>> some streams.  See https://github.com/MythTV/mythtv/issues/1039
>
> I'd highly recommend to not use cuviddec anymore, but instead use nvdec.
> cuviddec only still exists to sanity-check nvdec against it at times.

.*_cuvid are FFCodecs while .*_nvdec are FFHWAccels.  I don't know what 
would be required to change to .*_nvdec and the .*_cuvid FFCodecs work 
fine with this change.

>
>> ---
>>   libavcodec/cuviddec.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
>> index 67076a1752..05dcafab6e 100644
>> --- a/libavcodec/cuviddec.c
>> +++ b/libavcodec/cuviddec.c
>> @@ -120,7 +120,7 @@ typedef struct CuvidParsedFrame
>>   #define CUVID_MAX_DISPLAY_DELAY (4)
>>     // Actual pool size will be determined by parser.
>> -#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
>> +#define CUVID_DEFAULT_NUM_SURFACES ((2 * CUVID_MAX_DISPLAY_DELAY) + 1)
>>     static int CUDAAPI cuvid_handle_video_sequence(void *opaque, 
>> CUVIDEOFORMAT* format)
>>   {
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-02-22  2:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-20 20:37 [FFmpeg-devel] [PATCH] libavcodec/cuviddec.c: increase CUVID_DEFAULT_NUM_SURFACES Scott Theisen
2025-02-21 13:26 ` Timo Rothenpieler
2025-02-22  2:52   ` Scott Theisen

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git