Re: [FFmpeg-devel] [PATCH v4 1/4] lavc/vaapi_encode_h265: Add GPB frame support for hevc_vaapi

From: Mark Thompson <sw@jkqxz.net>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v4 1/4] lavc/vaapi_encode_h265: Add GPB frame support for hevc_vaapi
Date: Tue, 15 Mar 2022 00:24:10 +0000
Message-ID: <79eb995f-9064-8334-3544-105a83c10f86@jkqxz.net> (raw)
In-Reply-To: <PH0PR11MB503013295A4B8F6A663A5072CD0F9@PH0PR11MB5030.namprd11.prod.outlook.com>

On 14/03/2022 11:07, Wang, Fei W wrote:
>> -----Original Message-----
>> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of Xiang,
>> Haihao
>> Sent: Monday, March 14, 2022 10:15 AM
>> To: ffmpeg-devel@ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [PATCH v4 1/4] lavc/vaapi_encode_h265: Add GPB
>> frame support for hevc_vaapi
>>
>> On Sun, 2022-03-13 at 20:45 +0000, Mark Thompson wrote:
>>> On 11/03/2022 09:00, Fei Wang wrote:
>>>> From: Linjie Fu <linjie.fu@intel.com>
>>>>
>>>> Use GPB frames to replace regular P/B frames if backend driver does
>>>> not support it.
>>>>
>>>> - GPB:
>>>>       Generalized P and B picture. Regular P/B frames replaced by B
>>>>       frames with previous-predict only, L0 == L1. Normal B frames
>>>>       still have 2 different ref_lists and allow bi-prediction
>>>>
>>>> Signed-off-by: Linjie Fu <linjie.fu@intel.com>
>>>> Signed-off-by: Fei Wang <fei.w.wang@intel.com>
>>>> ---
>>>> update:
>>>> 1. Add b to gpb.
>>>> 2. Optimise debug message.
>>>>
>>>>    libavcodec/vaapi_encode.c      | 74 +++++++++++++++++++++++++++++++--
>> -
>>>>    libavcodec/vaapi_encode.h      |  2 +
>>>>    libavcodec/vaapi_encode_h265.c | 24 ++++++++++-
>>>>    3 files changed, 93 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
>>>> index 3bf379b1a0..bdba9726b2 100644
>>>> --- a/libavcodec/vaapi_encode.c
>>>> +++ b/libavcodec/vaapi_encode.c
>>>> @@ -848,9 +848,13 @@ static void
>>>> vaapi_encode_set_b_pictures(AVCodecContext
>>>> *avctx,
>>>>                pic->b_depth = current_depth;
>>>>
>>>>                vaapi_encode_add_ref(avctx, pic, start, 1, 1, 0);
>>>> -            vaapi_encode_add_ref(avctx, pic, end,   1, 1, 0);
>>>>                vaapi_encode_add_ref(avctx, pic, prev,  0, 0, 1);
>>>>
>>>> +            if (!ctx->b_to_gpb)
>>>> +                vaapi_encode_add_ref(avctx, pic, end, 1, 1, 0);
>>>> +            else
>>>> +                vaapi_encode_add_ref(avctx, pic, end, 0, 1, 0);
>>>
>>> This is doing something extremely dubious.  If b-to-gpb is set, then
>>> don't use the future reference?
>>
>> According to
>> https://github.com/intel/media-
>> driver/blob/master/media_driver/agnostic/common/codec/hal/codechal_vdenc
>> _hevc.cpp#L3072-L3087
>> , L0 and L1 should be the same for vdenc hevc on some platforms, so user can't
>> use past and future reference together, which is why you experienced the failure
>> after applying version 2
>>
>> Thanks
>> Haihao
>>
>>
>>> That means these pictures will only have the past reference, and the
>>> coding efficiency will often be much worse.
>>>
>>> E.g. if you have the default structure (max_b_frames = 2, max_b_depth
>>> = 1) then in a sequence of four pictures you get:
>>>
>>> 1 referring to something previous
>>> 4 referring to 1
>>> 2 referring to 1 and 4
>>> 3 referring to 1 and 4
>>>
>>> and this change means you lose the 2 -> 4 and 3 -> 4 references.
>>> Therefore, a change in the picture between 1 and 2 will end up coded
>>> three times in 2, 3 and 4 rather than just being coded in 4 and then referred to
>> by the others.
> 
> If driver doesn't support B frames with two different reference lists, use gpb instead
> of regular B is a best way. In V3, I turned B frames to P, but this will break gop structure.
> If user set I/P/B frames, while the output only contains I/P frames will be much confuse.

If the driver requires that RefPicList0 and RefPicList1 in B slices are the same then isn't the obvious solution to concatenate the expected lists and put the result in both actual lists (assuming it fits)?

That continues to support B frames in the obvious way (and transparently to the user).  The hardware will have to do a bit more motion search, but the fake P slices already require that anyway so I assume it is sensibly optimised.

>>>> +
>>>>                for (ref = end->refs[1]; ref; ref = ref->refs[1])
>>>>                    vaapi_encode_add_ref(avctx, pic, ref, 0, 1, 0);
>>>>            }
>>>> @@ -871,8 +875,11 @@ static void
>>>> vaapi_encode_set_b_pictures(AVCodecContext
>>>> *avctx,
>>>>
>>>>            vaapi_encode_add_ref(avctx, pic, pic,   0, 1, 0);
>>>>            vaapi_encode_add_ref(avctx, pic, start, 1, 1, 0);
>>>> -        vaapi_encode_add_ref(avctx, pic, end,   1, 1, 0);
>>>>            vaapi_encode_add_ref(avctx, pic, prev,  0, 0, 1);
>>>> +        if (!ctx->b_to_gpb)
>>>> +            vaapi_encode_add_ref(avctx, pic, end, 1, 1, 0);
>>>> +        else
>>>> +            vaapi_encode_add_ref(avctx, pic, end, 0, 1, 0);
>>>>
>>>>            for (ref = end->refs[1]; ref; ref = ref->refs[1])
>>>>                vaapi_encode_add_ref(avctx, pic, ref, 0, 1, 0); @@
>>>> -1845,6 +1852,51 @@ static av_cold int
>>>> vaapi_encode_init_gop_structure(AVCodecContext *avctx)
>>>>            ref_l1 = attr.value >> 16 & 0xffff;
>>>>        }
>>>>
>>>> +    ctx->p_to_gpb = 0;
>>>> +    ctx->b_to_gpb = 0;
>>>> +
>>>> +#if VA_CHECK_VERSION(1, 9, 0)
>>>> +    if (!(ctx->codec->flags & FLAG_INTRA_ONLY ||
>>>> +        avctx->gop_size <= 1)) {
>>>> +        attr = (VAConfigAttrib) { VAConfigAttribPredictionDirection };
>>>> +        vas = vaGetConfigAttributes(ctx->hwctx->display,
>>>> +                                    ctx->va_profile,
>>>> +                                    ctx->va_entrypoint,
>>>> +                                    &attr, 1);
>>>> +        if (vas != VA_STATUS_SUCCESS) {
>>>> +            av_log(avctx, AV_LOG_WARNING, "Failed to query
>>>> +prediction
>>>> direction "
>>>> +                   "attribute: %d (%s).\n", vas, vaErrorStr(vas));
>>>> +            return AVERROR_EXTERNAL;
>>>> +        } else if (attr.value == VA_ATTRIB_NOT_SUPPORTED) {
>>>> +            av_log(avctx, AV_LOG_VERBOSE, "Driver does not report
>>>> + any
>>>> additional "
>>>> +                   "prediction constraints.\n");
>>>> +        } else {
>>>> +            if (((ref_l0 > 0 || ref_l1 > 0) && !(attr.value &
>>>> VA_PREDICTION_DIRECTION_PREVIOUS)) ||
>>>> +                ((ref_l1 == 0) && (attr.value &
>>>> (VA_PREDICTION_DIRECTION_FUTURE |
>>>> VA_PREDICTION_DIRECTION_BI_NOT_EMPTY)))) {
>>>> +                av_log(avctx, AV_LOG_ERROR, "Driver report
>>>> + incorrect
>>>> prediction "
>>>> +                       "direction attribute.\n");
>>>> +                return AVERROR_EXTERNAL;
>>>> +            }
>>>> +
>>>> +            if (!(attr.value & VA_PREDICTION_DIRECTION_FUTURE)) {
>>>> +                if (ref_l0 > 0 && ref_l1 > 0) {
>>>> +                    ctx->b_to_gpb = 1;
>>>> +                    av_log(avctx, AV_LOG_VERBOSE, "Driver support
>>>> + previous
>>>> prediction "
>>>> +                           "only for B-frames.\n");
>>>> +                }
>>>> +            }
>>>
>>> Please note that the PREDICTION_DIRECTION_FUTURE/PREVIOUS options
>>> don't mean anything for H.265.
>>>
>>> The driver isn't told which direction the prediction is in, it's only
>>> told about some reference frames and how to refer to them.  Whether
>>> those frames are in the past or future is unspecified and irrelevant -
>>> a P frame can refer to a single future frame if it wants.
>>>
>>> (I tried to argue this when it was added, but given that they are
>>> meaningless I didn't argue very hard.)
>>>
>>> I suspect you are trying to use this as a test of something else.
>>> Perhaps you could explain what the test you actually want is?
> 
> VA_PREDICTION_DIRECTION_PREVIOUS/ VA_PREDICTION_DIRECTION_FUTURE/ VA_PREDICTION_DIRECTION_BI_NOT_EMPTY
>   are used to indicate if driver has the limitation on how to set regular P and B frame's reference list when the queried max
> reference list ref_l0 and ref_l1 both are not zero.
> 
> If queried value is VA_PREDICTION_DIRECTION_PREVIOUS only, this means driver doesn't support B frame with different l0/l1,
> need to set l1 = 10.
> VA_PREDICTION_DIRECTION_PREVIOUS | VA_PREDICTION_DIRECTION_FUTURE means different l0/l1 is support for B frame.
> 
> And if queried value is VA_PREDICTION_DIRECTION_BI_NOT_EMPTY, this means driver doesn't support P frame with l0 only.

Hmm.  So if we ignore both the names and the documentation as unhelpful then the meanings you are intending are:

Attribute unset -> no restrictions.
Attribute set -> PREVIOUS bit must be set: P and B slices are both supported but RefPicList0 and RefPicList1 must in B slices must be equal.

Then additional bits modify that:

FUTURE bit set -> RefPicList0 and RefPicList1 in B slices may differ.
BI_NOT_EMPTY bit set -> P slices are not supported.

(If this is actually the intended meaning then maybe someone could add this to the documentation in libva?)

The real Intel hardware we've then got for H.265 is:

Gen9, Gen9 vdenc, Gen11, Gen12: unset -> no restrictions.
Gen11 vdenc: PREVIOUS | BI_NOT_EMPTY -> P slices are not supported, B slices must have RefPicList0 equal to RefPicList1.
Gen12 vdenc: PREVIOUS | FUTURE | BI_NOT_EMPTY -> P slices are not supported, B slices can do whatever.

For Gen12 vdenc you fake up P frames by using B slices with two equal lists, and make B frames normally.  That looks sensible.

For Gen11 vdenc you are faking up P frames in the same way, but then with B frames your patch pretends they are also P frames and ignores all but the first reference, which IMO is not sensible because it will multiply bitrate by several times at a given quality.  Hence suggestion above?

> And in debug message, maybe use "Driver only support same reference list for B-frames\n" will be more clear.

Yeah, given the trickiness of this it's probably a good idea if the messages are very clear about what is going on.

>>>> +
>>>> +            if (attr.value & VA_PREDICTION_DIRECTION_BI_NOT_EMPTY) {
>>>> +                if (ref_l0 > 0 && ref_l1 > 0) {
>>>> +                    ctx->p_to_gpb = 1;
>>>> +                    av_log(avctx, AV_LOG_VERBOSE, "Driver does not
>>>> + support
>>>> P-frames, "
>>>> +                           "replacing them with previous prediction
>>>> + only B-
>>>> frames.\n");
>>>> +                }
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +#endif
>>>> ...

- Mark
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".