From: "Wang, Fei W" <fei.w.wang-at-intel.com@ffmpeg.org> To: "ffmpeg-devel@ffmpeg.org" <ffmpeg-devel@ffmpeg.org> Subject: Re: [FFmpeg-devel] [PATCH v3 6/6] lavc/vaapi_encode: Add VAAPI AV1 encoder Date: Wed, 16 Aug 2023 07:54:51 +0000 Message-ID: <eb5ee057aa524af26e6732ba8cea2c8e6b545b3e.camel@intel.com> (raw) In-Reply-To: <1acd766d-e0dc-69f6-56e5-0d84e62f83c8@jkqxz.net> On Sun, 2023-08-13 at 22:43 +0100, Mark Thompson wrote: > On 10/08/2023 03:54, Wang, Fei W wrote: > > On Mon, 2023-08-07 at 22:21 +0100, Mark Thompson wrote: > > > On 03/08/2023 07:01, fei.w.wang-at-intel.com@ffmpeg.org wrote: > > > > From: Fei Wang <fei.w.wang@intel.com> > > > > > > > > Signed-off-by: Fei Wang <fei.w.wang@intel.com> > > > > --- > > > > Changelog | 1 + > > > > configure | 3 + > > > > doc/encoders.texi | 13 + > > > > libavcodec/Makefile | 1 + > > > > libavcodec/allcodecs.c | 1 + > > > > libavcodec/vaapi_encode.c | 125 +++- > > > > libavcodec/vaapi_encode.h | 12 + > > > > libavcodec/vaapi_encode_av1.c | 1229 > > > > +++++++++++++++++++++++++++++++++ > > > > libavcodec/version.h | 2 +- > > > > 9 files changed, 1368 insertions(+), 19 deletions(-) > > > > create mode 100644 libavcodec/vaapi_encode_av1.c > > > > > > I assume this is tested on Intel hardware. Is it tested on AMD > > > as > > > well? (Apparently working in recent Mesa.) > > > > AMD tested the patchset months ago: > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585 > > > > This patch changed a little compare with the version in Cartwheel, > > @Ruijing, Could you help to review this version in ML? To avoid the > > diffs break you. Thanks. > > > > > > ... > > > > @@ -669,6 +669,15 @@ static int > > > > vaapi_encode_set_output_timestamp(AVCodecContext *avctx, > > > > { > > > > VAAPIEncodeContext *ctx = avctx->priv_data; > > > > > > > > + // AV1 packs P frame and next B frame into one pkt, and > > > > uses > > > > the other > > > > + // repeat frame header pkt at the display order position > > > > of > > > > the P frame > > > > + // to indicate its frame index. Each frame has a > > > > corresponding > > > > pkt in its > > > > + // display order position. So don't need to consider delay > > > > for > > > > AV1 timestamp. > > > > + if (avctx->codec_id == AV_CODEC_ID_AV1) { > > > > + pkt->dts = pkt->pts - ctx->dts_pts_diff; > > > > + return 0; > > > > + } > > > > > > This doesn't get you the right result, though? The PTS and DTS > > > on > > > every AV1 packet want to be the same. > > > > The result tested good which can be played normally. Just aligned > > with > > other vaapi encoders that the 1st frame start with 0/-1 of PTS/DTS > > if > > have B frame. Set PTS/DTS to same also looks good. > > DTS != PTS should only be true when the stream has externally-visible > frame reordering in, which AV1 doesn't. > > The other two AV1 encoders both output DTS == PTS; I think you should > match that. > > > > In any case, please don't put tests for codec ID in the common > > > code. I suggest that the timestamp behaviour wants to be a new > > > FLAG_*. > > > > > > > + > > > > ... > > > > @@ -1128,9 +1182,19 @@ static int > > > > vaapi_encode_pick_next(AVCodecContext *avctx, > > > > > > > > vaapi_encode_add_ref(avctx, pic, pic, 0, 1, 0); > > > > if (pic->type != PICTURE_TYPE_IDR) { > > > > - vaapi_encode_add_ref(avctx, pic, start, > > > > - pic->type == PICTURE_TYPE_P, > > > > - b_counter > 0, 0); > > > > + // TODO: apply both previous and forward multi > > > > reference > > > > for all vaapi encoders. > > > > + // And L0/L1 reference frame number can be set > > > > dynamically > > > > through query > > > > + // VAConfigAttribEncMaxRefFrames attribute. > > > > + if (avctx->codec_id == AV_CODEC_ID_AV1) { > > > > + for (i = 0; i < ctx->nb_next_prev; i++) > > > > + vaapi_encode_add_ref(avctx, pic, ctx- > > > > > next_prev[i], > > > > + pic->type == > > > > PICTURE_TYPE_P, > > > > + b_counter > 0, 0); > > > > > > So an undisclosed aim of the previous patch was to make this > > > extra > > > list of the most recent N frames for this to use with P frames > > > only? > > > > Currently, yes. As the TODO comment says, I will apply it to all > > other > > codecs and B frame next, it will need lots of test on different > > hardware. Add this for AV1 here is to align with VPL which use 2 > > previous reference for P frame. Base on our test, 2 reference frame > > for > > P will improve ~5% BD-rate for quality, so that can make it > > possible to > > let user get same quality with VPL after adjusting and aligning > > encoder > > options. > > Hmm. My Intel test machine says it can take 8 L0 and 2 L1 references > in H.264. I might try this, seems like it should be easy to test. Yes, if L0/L1 number queried from driver are fully supported, need to increase MAX_PICTURE_REFERENCES and init size of surface pool. > > > > Why this partcular structure for P frames only, though? Seems > > > like > > > if you support extra references then other codecs could also make > > > use > > > of them in either direction, so we would be better off hinting > > > that > > > the codec would like up to N references and then using the > > > contents > > > of the existing DPB, plus keep extras if there is space. > > > > > > > ... > > > > + > > > > +static av_cold int vaapi_encode_av1_configure(AVCodecContext > > > > *avctx) > > > > +{ > > > > + VAAPIEncodeContext *ctx = avctx->priv_data; > > > > + VAAPIEncodeAV1Context *priv = avctx->priv_data; > > > > + int ret; > > > > + > > > > + ret = ff_cbs_init(&priv->cbc, AV_CODEC_ID_AV1, avctx); > > > > + if (ret < 0) > > > > + return ret; > > > > + > > > > + if (ctx->rc_mode->quality) { > > > > + priv->q_idx_p = av_clip(ctx->rc_quality, 0, > > > > AV1_MAX_QUANT); > > > > + if (fabs(avctx->i_quant_factor) > 0.0) > > > > + priv->q_idx_idr = > > > > + av_clip((fabs(avctx->i_quant_factor) * priv- > > > > > q_idx_p + > > > > + avctx->i_quant_offset) + 0.5, > > > > + 0, AV1_MAX_QUANT); > > > > + else > > > > + priv->q_idx_idr = priv->q_idx_p; > > > > + > > > > + if (fabs(avctx->b_quant_factor) > 0.0) > > > > + priv->q_idx_b = > > > > + av_clip((fabs(avctx->b_quant_factor) * priv- > > > > > q_idx_p + > > > > + avctx->b_quant_offset) + 0.5, > > > > + 0, AV1_MAX_QUANT); > > > > + else > > > > + priv->q_idx_b = priv->q_idx_p; > > > > + } else { > > > > + /** Arbitrary value */ > > > > + priv->q_idx_idr = priv->q_idx_p = priv->q_idx_b = 128; > > > > + } > > > > > > Are there any alignment requirements here? (If I gave it an > > > input > > > which is, say, 1361x1, would that work?) > > > > I didn't get you. Which parameter here need to be align? Max/Min > > resolution HW supported will be checked somewhere else. If use > > 1361x1 > > there will be the error like: > > > > "Hardware does not support encoding at size 1376x16 (constraints: > > width > > 32-8192 height 32-8192)." > > I mean of padding for the reference surfaces. I was expecting this > to be rounded up to a multiple of 8 (to match MiCols/MiRows) - see > this function in other codecs. > > > > > ... > > > > + > > > > + /** update obu size in bitstream */ > > > > + if (fh_obu->header.obu_has_size_field) { > > > > + obu_size_len = priv- > > > > >attr_ext2.bits.obu_size_bytes_minus1 > > > > + 1; > > > > + for (i = 0; i < obu_size_len; i++) { > > > > + byte = obu_size >> (7 * i) & 0x7f; > > > > + if (i < obu_size_len - 1) > > > > + byte |= 0x80; > > > > + put_bits(&pbc_tmp, 8, byte); > > > > + } > > > > + flush_put_bits(&pbc_tmp); > > > > + memmove(pbc_tmp.buf_ptr, pbc_tmp.buf_ptr + (8 - > > > > obu_size_len), obu_size); > > > > + *data_len -= (8 - obu_size_len) * 8; > > > > + } > > > > > > Why is there an incomplete duplicate of the cbs_av1 header > > > writing > > > code here? > > > > To record some position/size in bitstream that needed for VAAPI. > > Like > > qp_index/loopfilter/cdef offset and cdef parameters size in bit. > > It's > > not reasonable to add the specific parameters into CBS. > > How about with < > https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2023-August/313228.html> > ;? How to pass position info out of .trace_write_callback? If define own write_callback function in vaapi_encode_av1.c, and it can easily get the positions of each syntax element, but can't pass them back to VAAPI AV1 encoder. A possible way is according to CodedBitstreamContext.priv_data, but that will need to add lots of xxx_offset into CodedBitstreamAV1Context. > > > > > ... > > > > + > > > > +static int vaapi_encode_av1_set_tile(AVCodecContext *avctx) > > > > +{ > > > > + VAAPIEncodeAV1Context *priv = avctx->priv_data; > > > > + int mi_cols, mi_rows, sb_shift, sb_size; > > > > + int max_tile_area_sb, max_tile_area_sb_varied; > > > > + int tile_width_sb, tile_height_sb, widest_tile_sb; > > > > + int min_log2_tiles; > > > > + int tile_rows_tmp, i; > > > > + > > > > + if (priv->tile_cols > AV1_MAX_TILE_COLS || > > > > + priv->tile_rows > AV1_MAX_TILE_ROWS) { > > > > + av_log(avctx, AV_LOG_ERROR, "Invalid tile number > > > > %dx%d, > > > > should less than %dx%d.\n", > > > > + priv->tile_cols, priv->tile_rows, > > > > AV1_MAX_TILE_COLS, AV1_MAX_TILE_ROWS); > > > > > > Can't this be a constraint on the option rather than a special > > > check > > > here? > > > > Tiles set with IMAGE_SIZE option type which aligns with HEVC. It > > doesn't support Min/MAX check. > > Hmm, yeah - fair enough. > > > > > ... > > > > + > > > > + sh->color_config = (AV1RawColorConfig) { > > > > + .high_bitdepth = desc->comp[0].depth > > > > == 8 > > > > ? 0 : 1, > > > > + .color_primaries = avctx- > > > > >color_primaries, > > > > + .transfer_characteristics = avctx->color_trc, > > > > + .matrix_coefficients = avctx->colorspace, > > > > + .color_description_present_flag = (avctx- > > > > >color_primaries > > > > != AVCOL_PRI_UNSPECIFIED || > > > > + avctx- > > > > > color_trc != AVCOL_TRC_UNSPECIFIED || > > > > + avctx- > > > > > colorspace != AVCOL_SPC_UNSPECIFIED), > > > > + .color_range = avctx->color_range > > > > == > > > > AVCOL_RANGE_JPEG, > > > > + .subsampling_x = desc->log2_chroma_w, > > > > + .subsampling_y = desc->log2_chroma_h, > > > > > > .chroma_sample_position = some function of > > > chroma_sample_location. > > > > There is no chroma_sample_position supported in VAAPI yet. > > VAAPI won't care. It should be marked correctly in the headers so > that the output stream has the right value. > > (CFL not having any ability to deal with the sample location is a > known deficiency of AV1.) > > > > > ... > > > > + fh->tile_cols = priv->tile_cols; > > > > + fh->tile_rows = priv->tile_rows; > > > > + fh->tile_cols_log2 = priv->tile_cols_log2; > > > > + fh->tile_rows_log2 = priv->tile_rows_log2; > > > > + fh->uniform_tile_spacing_flag = priv->uniform_tile; > > > > + fh->tile_size_bytes_minus1 = priv- > > > > > attr_ext2.bits.tile_size_bytes_minus1; > > > > + fh->reduced_tx_set = 1; > > > > > > This not being present in the capabilities feels like an omission > > > in > > > the API. That's a large loss for an implementation which does > > > support the full transform set. > > > > Personally understanding, "not being present" means it must be > > supported. If any exception driver, then they can ask to add new > > capability to libva. > > reduced_tx_set is a negative property - you want to set it to zero to > allow all transforms, unless the implementation can't cope with that. > > Does the Intel implementation require it to be set? (That is, does > not support the full transform set.) > That make sense. Intel HW doesn't require it to be set as 1. Will change it to 0. > > > > ... > > > > + > > > > + for (i = 0; i < fh->tile_cols; i++) > > > > + fh->width_in_sbs_minus_1[i] = vpic- > > > > > width_in_sbs_minus_1[i] = priv->width_in_sbs_minus_1[i]; > > > > + > > > > + for (i = 0; i < fh->tile_rows; i++) > > > > + fh->height_in_sbs_minus_1[i] = vpic- > > > > > height_in_sbs_minus_1[i] = priv->height_in_sbs_minus_1[i]; > > > > > > Er, what? Doesn't this fail the inference on the last tile > > > width/height if you don't split exactly? > > > > What's the meaning of "split exactly"? All the tile w/h will be > > split > > in vaapi_encode_av1_set_tile include last tile. Just make sure the > > info > > of tiles w/h in frame header are same with the info passed to VAAPI > > should be fine. > > Sorry, I misread this. The values are set elsewhere, it's not > setting all tiles to have the same SB width/height. > > > > > ... > > > > + > > > > + vpic->base_qindex = fh->base_q_idx; > > > > + vpic->frame_width_minus_1 = fh->frame_width_minus_1; > > > > + vpic->frame_height_minus_1 = fh->frame_height_minus_1; > > > > + vpic->primary_ref_frame = fh->primary_ref_frame; > > > > + vpic->reconstructed_frame = pic->recon_surface; > > > > + vpic->coded_buf = pic->output_buffer; > > > > + vpic->tile_cols = fh->tile_cols; > > > > + vpic->tile_rows = fh->tile_rows; > > > > + vpic->order_hint = fh->order_hint; > > > > +#if VA_CHECK_VERSION(1, 15, 0) > > > > + vpic->refresh_frame_flags = fh->refresh_frame_flags; > > > > +#endif > > > > > > What will the difference be between running in the two > > > versions? (Is > > > it worth supporting the older one? Cf. bit_depth on VP9.) > > > > I'd prefer to support older one. In case of someone using old > > version > > and works well with ffmpeg-vpl(as vpl already support av1 encode > > long > > time ago), otherwise he must be confuse why ffmpeg-vaapi need > > higher > > VAAPI version but VPL doesn't. > > Ok. So the driver doesn't use the value of refresh_frame_flags if it > doesn't affect the result? (Seems like it shouldn't, not sure why it > would be added to the API.) Media-driver doesn't use it, but MS Windows driver does, it may need maintain DPB and pack frame header itself: https://github.com/intel/cartwheel-ffmpeg/issues/244 Thanks > > > > > ... > > Thanks, > > - Mark > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2023-08-16 7:55 UTC|newest] Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-08-03 6:01 [FFmpeg-devel] [PATCH v3 1/6] avcodec/cbs_av1: Add tx mode enum values fei.w.wang-at-intel.com 2023-08-03 6:01 ` [FFmpeg-devel] [PATCH v3 2/6] lavc/av1: Add common code and unit test for level handling fei.w.wang-at-intel.com 2023-08-07 12:40 ` James Almer 2023-08-07 15:45 ` Andreas Rheinhardt 2023-08-03 6:01 ` [FFmpeg-devel] [PATCH v3 3/6] lavc/vaapi_encode: Init pic at the beginning of API fei.w.wang-at-intel.com 2023-08-03 6:01 ` [FFmpeg-devel] [PATCH v3 4/6] lavc/vaapi_encode: Extract set output pkt timestamp function fei.w.wang-at-intel.com 2023-08-07 20:22 ` Mark Thompson 2023-08-03 6:01 ` [FFmpeg-devel] [PATCH v3 5/6] lavc/vaapi_encode: Separate reference frame into previous/future list fei.w.wang-at-intel.com 2023-08-07 20:28 ` Mark Thompson 2023-08-03 6:01 ` [FFmpeg-devel] [PATCH v3 6/6] lavc/vaapi_encode: Add VAAPI AV1 encoder fei.w.wang-at-intel.com 2023-08-07 21:21 ` Mark Thompson 2023-08-10 2:54 ` Wang, Fei W 2023-08-13 21:43 ` Mark Thompson 2023-08-16 7:54 ` Wang, Fei W [this message] 2023-08-16 20:45 ` Mark Thompson 2023-08-15 13:59 ` Dong, Ruijing 2023-08-15 16:49 ` Neal Gompa 2023-08-07 7:51 ` [FFmpeg-devel] [PATCH v3 1/6] avcodec/cbs_av1: Add tx mode enum values Xiang, Haihao
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=eb5ee057aa524af26e6732ba8cea2c8e6b545b3e.camel@intel.com \ --to=fei.w.wang-at-intel.com@ffmpeg.org \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git