From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTP id B7D3349F95 for ; Sun, 17 Mar 2024 15:36:31 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4A34268D105; Sun, 17 Mar 2024 17:36:28 +0200 (EET) Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 2442368C678 for ; Sun, 17 Mar 2024 17:36:22 +0200 (EET) Received: by mail-wr1-f54.google.com with SMTP id ffacd0b85a97d-33e162b1b71so3268437f8f.1 for ; Sun, 17 Mar 2024 08:36:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jkqxz-net.20230601.gappssmtp.com; s=20230601; t=1710689781; x=1711294581; darn=ffmpeg.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:user-agent:mime-version:date:message-id:from :to:cc:subject:date:message-id:reply-to; bh=a3yj+gOWOdp9oloIgDT1s7e0U5pCLPTUOyJcmtxLx3Y=; b=IyNP4UtQ6jrInzeKGb7i4t/0BBRwuh+m14/y2HHTqLfVCkhNtTAvltQtfaq69N4ACN zwvJuaX5H1YwVQXMzofUdQZhF4wtuP6RG/LLarRWYdLXJjXlv65m17ZYhlM0ikXFUYi1 G6Y5M9IM4hMCHtiUYU6uYJy6Q5ulHciznwhMNM8EWXUE+/ohz1/AIiOQCUwElZfsldBp NML+FZ2L5QE6L07RWRgbyprLrs2y5EOWlp+l+8yspMCGTAxJLNWJ8krp/oBSX1L3cdH4 /zbBovD6cOgSgUVoRNizgw6FsaD60VQ+lznR+Yw6k3G0BQhLurBEjZt6FTPJ3iBxFTBs nmlQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710689781; x=1711294581; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=a3yj+gOWOdp9oloIgDT1s7e0U5pCLPTUOyJcmtxLx3Y=; b=DHqJQ3ZCYIRwZckmJ8T3PUplaayYAiWRSa3bJb1qmMOwHZsvQzuKhpChr0oEmDkA8I KtCKHwbwDjVqvheTRvxWRGPUeMNgjNONyFarD3OYaMXITtCBDQp+lo7zJEFRN3r/sXWz +xIinDfXjKAPOXqKGLI9A34/jUQAC0uJh9h75ic8+UCvoEyDoX0yEZBP/7Rhs6T3suYP ND+RMTVNqsqFAzqBizNhJmCohWpz63ialLEAG5q28HeoVPitq0LpvIZbtdqiMDCsQb5/ Tbt3rEac0A/N2QqY/4ABPx+re12G4b7bMxQxTiFtt+dEcMYNXJRLccQp68IK/8OGu4fs M6Kg== X-Gm-Message-State: AOJu0YxXjphMK7NjcJ57kII2mcejobHRT0IvHhtA9lS8azMYOBVaOq1f dP0lIG8ZPY+qwM/OsHSxrHwTPRtsTI+uMbtp5l8SH4Q+Noi6bQICnF2KkyJkgNQbRhmnazjOW9u o X-Google-Smtp-Source: AGHT+IGC/rma6USOxm//MfDM9YE8rCYgpiXX46s7gES0/aP01NM0EBNI1N5nSXFmCXWrxaesV5YwqQ== X-Received: by 2002:a5d:44c5:0:b0:33d:269e:132a with SMTP id z5-20020a5d44c5000000b0033d269e132amr7236876wrr.15.1710689780977; Sun, 17 Mar 2024 08:36:20 -0700 (PDT) Received: from [192.168.0.15] (cpc92302-cmbg19-2-0-cust1183.5-4.cable.virginm.net. [82.1.212.160]) by smtp.gmail.com with ESMTPSA id n2-20020a5d4002000000b0033e93e00f68sm7699226wrp.61.2024.03.17.08.36.20 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 17 Mar 2024 08:36:20 -0700 (PDT) Message-ID: Date: Sun, 17 Mar 2024 15:36:53 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird To: ffmpeg-devel@ffmpeg.org References: Content-Language: en-US From: Mark Thompson In-Reply-To: Subject: Re: [FFmpeg-devel] [PATCH v3 2/2] lavc/vulkan_av1: port to the new stable API X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Archived-At: List-Archive: List-Post: On 13/03/2024 16:38, Lynne wrote: > Tested by multiple users on multiple operating systems, > driver implementations and test samples to work. > > Co-Authored-by: Dave Airlie > > From e2d31951f46fcc10e1263b8603e486e111e9578c Mon Sep 17 00:00:00 2001 > From: Lynne > Date: Fri, 19 Jan 2024 10:49:02 +1000 > Subject: [PATCH v3 2/2] lavc/vulkan_av1: port to the new stable API > > Co-Authored-by: Dave Airlie > --- > configure | 4 +- > libavcodec/Makefile | 5 +- > libavcodec/av1.h | 2 + > libavcodec/vulkan_av1.c | 502 ++++++++++-------- > libavcodec/vulkan_decode.c | 31 +- > libavcodec/vulkan_decode.h | 2 +- > libavcodec/vulkan_video.h | 2 - > .../vulkan_video_codec_av1std_decode_mesa.h | 36 -- > libavcodec/vulkan_video_codec_av1std_mesa.h | 403 -------------- > libavutil/hwcontext_vulkan.c | 2 +- > libavutil/vulkan_functions.h | 2 +- > libavutil/vulkan_loader.h | 2 +- > 12 files changed, 300 insertions(+), 693 deletions(-) > delete mode 100644 libavcodec/vulkan_video_codec_av1std_decode_mesa.h > delete mode 100644 libavcodec/vulkan_video_codec_av1std_mesa.h > > diff --git a/configure b/configure > index 05f8283af9..b07a742a74 100755 > --- a/configure > +++ b/configure > @@ -7229,8 +7229,8 @@ enabled vdpau && > check_lib vdpau_x11 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau -lX11 > > if enabled vulkan; then > - check_pkg_config_header_only vulkan "vulkan >= 1.3.255" "vulkan/vulkan.h" "defined VK_VERSION_1_3" || > - check_cpp_condition vulkan "vulkan/vulkan.h" "defined(VK_VERSION_1_4) || (defined(VK_VERSION_1_3) && VK_HEADER_VERSION >= 255)" > + check_pkg_config_header_only vulkan "vulkan >= 1.3.277" "vulkan/vulkan.h" "defined VK_VERSION_1_3" || > + check_cpp_condition vulkan "vulkan/vulkan.h" "defined(VK_VERSION_1_4) || (defined(VK_VERSION_1_3) && VK_HEADER_VERSION >= 277)" This bumping the requirement to the latest version needs to stop at some point. Vulkan is not an experimental thing any more, it should be usable in released distributions. > fi > > if disabled vulkan; then > diff --git a/libavcodec/Makefile b/libavcodec/Makefile > index 708434ac76..c552f034b7 100644 > --- a/libavcodec/Makefile > +++ b/libavcodec/Makefile > @@ -1258,8 +1258,7 @@ SKIPHEADERS += %_tablegen.h \ > aacenc_quantization.h \ > aacenc_quantization_misc.h \ > bitstream_template.h \ > - vulkan_video_codec_av1std_mesa.h \ > - $(ARCH)/vpx_arith.h \ > + $(ARCH)/vpx_arith.h \ > > SKIPHEADERS-$(CONFIG_AMF) += amfenc.h > SKIPHEADERS-$(CONFIG_D3D11VA) += d3d11va.h dxva2_internal.h > @@ -1280,7 +1279,7 @@ SKIPHEADERS-$(CONFIG_QSVENC) += qsvenc.h > SKIPHEADERS-$(CONFIG_VAAPI) += vaapi_decode.h vaapi_hevc.h vaapi_encode.h > SKIPHEADERS-$(CONFIG_VDPAU) += vdpau.h vdpau_internal.h > SKIPHEADERS-$(CONFIG_VIDEOTOOLBOX) += videotoolbox.h vt_internal.h > -SKIPHEADERS-$(CONFIG_VULKAN) += vulkan.h vulkan_video.h vulkan_decode.h vulkan_video_codec_av1std_decode_mesa.h > +SKIPHEADERS-$(CONFIG_VULKAN) += vulkan.h vulkan_video.h vulkan_decode.h > SKIPHEADERS-$(CONFIG_V4L2_M2M) += v4l2_buffers.h v4l2_context.h v4l2_m2m.h > SKIPHEADERS-$(CONFIG_ZLIB) += zlib_wrapper.h > > diff --git a/libavcodec/av1.h b/libavcodec/av1.h > index 8704bc41c1..18d5fa9e7f 100644 > --- a/libavcodec/av1.h > +++ b/libavcodec/av1.h > @@ -121,6 +121,8 @@ enum { > AV1_DIV_LUT_NUM = 257, > > AV1_MAX_LOOP_FILTER = 63, > + > + AV1_RESTORATION_TILESIZE_MAX = 256, ? This isn't used anywhere. > }; > > > diff --git a/libavcodec/vulkan_av1.c b/libavcodec/vulkan_av1.c > index 5afd5353cc..dc71ecf1fa 100644 > --- a/libavcodec/vulkan_av1.c > +++ b/libavcodec/vulkan_av1.c > @@ -26,7 +26,7 @@ > const FFVulkanDecodeDescriptor ff_vk_dec_av1_desc = { > .codec_id = AV_CODEC_ID_AV1, > .decode_extension = FF_VK_EXT_VIDEO_DECODE_AV1, > - .decode_op = 0x01000000, /* TODO fix this */ > + .decode_op = VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, > .ext_props = { > .extensionName = VK_STD_VULKAN_VIDEO_CODEC_AV1_DECODE_EXTENSION_NAME, > .specVersion = VK_STD_VULKAN_VIDEO_CODEC_AV1_DECODE_SPEC_VERSION, > @@ -36,22 +36,34 @@ const FFVulkanDecodeDescriptor ff_vk_dec_av1_desc = { > typedef struct AV1VulkanDecodePicture { > FFVulkanDecodePicture vp; > > - /* Workaround for a spec issue. > - *Can be removed once no longer needed, and threading can be enabled. */ > + /* TODO: investigate if this can be removed to make decoding completely > + * independent. */ > FFVulkanDecodeContext *dec; Can you explain what the id_alloc_mask thing is doing which needs this? (The 32 size in particular seems suspicious.) > > - StdVideoAV1MESATile tiles[MAX_TILES]; > - StdVideoAV1MESATileList tile_list; > - const uint32_t *tile_offsets; > + uint32_t tile_sizes[MAX_TILES]; > > /* Current picture */ > - VkVideoDecodeAV1DpbSlotInfoMESA vkav1_ref; > - StdVideoAV1MESAFrameHeader av1_frame_header; > - VkVideoDecodeAV1PictureInfoMESA av1_pic_info; > + StdVideoDecodeAV1ReferenceInfo std_ref; > + VkVideoDecodeAV1DpbSlotInfoKHR vkav1_ref; > + uint16_t width_in_sbs_minus1[64]; > + uint16_t height_in_sbs_minus1[64]; > + uint16_t mi_col_starts[64]; > + uint16_t mi_row_starts[64]; > + StdVideoAV1TileInfo tile_info; > + StdVideoAV1Quantization quantization; > + StdVideoAV1Segmentation segmentation; > + StdVideoAV1LoopFilter loop_filter; > + StdVideoAV1CDEF cdef; > + StdVideoAV1LoopRestoration loop_restoration; > + StdVideoAV1GlobalMotion global_motion; > + StdVideoAV1FilmGrain film_grain; > + StdVideoDecodeAV1PictureInfo std_pic_info; > + VkVideoDecodeAV1PictureInfoKHR av1_pic_info; > > /* Picture refs */ > const AV1Frame *ref_src [AV1_NUM_REF_FRAMES]; > - VkVideoDecodeAV1DpbSlotInfoMESA vkav1_refs[AV1_NUM_REF_FRAMES]; > + StdVideoDecodeAV1ReferenceInfo std_ref_info[AV1_NUM_REF_FRAMES]; > + VkVideoDecodeAV1DpbSlotInfoKHR vkav1_refs[AV1_NUM_REF_FRAMES]; > > uint8_t frame_id_set; > uint8_t frame_id; > @@ -60,44 +72,60 @@ typedef struct AV1VulkanDecodePicture { > static int vk_av1_fill_pict(AVCodecContext *avctx, const AV1Frame **ref_src, > VkVideoReferenceSlotInfoKHR *ref_slot, /* Main structure */ > VkVideoPictureResourceInfoKHR *ref, /* Goes in ^ */ > - VkVideoDecodeAV1DpbSlotInfoMESA *vkav1_ref, /* Goes in ^ */ > - const AV1Frame *pic, int is_current, int has_grain, > - int dpb_slot_index) > + StdVideoDecodeAV1ReferenceInfo *vkav1_std_ref, > + VkVideoDecodeAV1DpbSlotInfoKHR *vkav1_ref, /* Goes in ^ */ > + const AV1Frame *pic, int is_current, int has_grain) > { > FFVulkanDecodeContext *dec = avctx->internal->hwaccel_priv_data; > AV1VulkanDecodePicture *hp = pic->hwaccel_picture_private; > FFVulkanDecodePicture *vkpic = &hp->vp; > + AV1DecContext *s = avctx->priv_data; > + CodedBitstreamAV1Context *cbs_ctx; > > int err = ff_vk_decode_prepare_frame(dec, pic->f, vkpic, is_current, > has_grain || dec->dedicated_dpb); > if (err < 0) > return err; > > - *vkav1_ref = (VkVideoDecodeAV1DpbSlotInfoMESA) { > - .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_DPB_SLOT_INFO_MESA, > - .frameIdx = hp->frame_id, > + cbs_ctx = (CodedBitstreamAV1Context *)(s->cbc->priv_data); > + > + *vkav1_std_ref = (StdVideoDecodeAV1ReferenceInfo) { > + .flags = (StdVideoDecodeAV1ReferenceInfoFlags) { > + .disable_frame_end_update_cdf = pic->raw_frame_header->disable_frame_end_update_cdf, > + .segmentation_enabled = pic->raw_frame_header->segmentation_enabled, > + }, > + .frame_type = pic->raw_frame_header->frame_type, > + .OrderHint = pic->raw_frame_header->order_hint, > + .RefFrameSignBias = 0x0, > }; > > - for (unsigned i = 0; i < 7; i++) { > - const int idx = pic->raw_frame_header->ref_frame_idx[i]; > - vkav1_ref->ref_order_hint[i] = pic->raw_frame_header->ref_order_hint[idx]; > + for (int i = 0; i < AV1_TOTAL_REFS_PER_FRAME; i++) { > + vkav1_std_ref->SavedOrderHints[i] = cbs_ctx->order_hints[i]; This isn't right. The values you're writing here aren't SavedOrderHints, rather they are the order hints relative to the current picture (which will then be written identically on every picture in the DPB). > + vkav1_std_ref->RefFrameSignBias |= cbs_ctx->ref_frame_sign_bias[i] << i; Again, this looks like it asking for the value relative to the reference picture rather than copying the current-frame value identically for every reference? (Probably need to store that in AV1ReferenceFrameState.) > } > > - vkav1_ref->disable_frame_end_update_cdf = pic->raw_frame_header->disable_frame_end_update_cdf; > + *vkav1_ref = (VkVideoDecodeAV1DpbSlotInfoKHR) { > + .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_DPB_SLOT_INFO_KHR, > + .pStdReferenceInfo = vkav1_std_ref, > + }; > + > + vkav1_std_ref->flags.disable_frame_end_update_cdf = pic->raw_frame_header->disable_frame_end_update_cdf; > + vkav1_std_ref->flags.segmentation_enabled = pic->raw_frame_header->segmentation_enabled; > + vkav1_std_ref->frame_type = pic->raw_frame_header->frame_type; > > *ref = (VkVideoPictureResourceInfoKHR) { > .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR, > .codedOffset = (VkOffset2D){ 0, 0 }, > .codedExtent = (VkExtent2D){ pic->f->width, pic->f->height }, > .baseArrayLayer = ((has_grain || dec->dedicated_dpb) && dec->layered_dpb) ? > - dpb_slot_index : 0, > + hp->frame_id : 0, > .imageViewBinding = vkpic->img_view_ref, > }; > > *ref_slot = (VkVideoReferenceSlotInfoKHR) { > .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR, > .pNext = vkav1_ref, > - .slotIndex = dpb_slot_index, > + .slotIndex = hp->frame_id, > .pPictureResource = ref, > }; > > ... > @@ -236,12 +269,24 @@ static int vk_av1_start_frame(AVCodecContext *avctx, > /* Fill in references */ > for (int i = 0; i < AV1_NUM_REF_FRAMES; i++) { > const AV1Frame *ref_frame = &s->ref[i]; > + AV1VulkanDecodePicture *hp = ref_frame->hwaccel_picture_private; > + int found = 0; > + > if (s->ref[i].f->pict_type == AV_PICTURE_TYPE_NONE) > continue; > > - err = vk_av1_fill_pict(avctx, &ap->ref_src[i], &vp->ref_slots[i], > - &vp->refs[i], &ap->vkav1_refs[i], > - ref_frame, 0, 0, i); > + for (int j = 0; j < ref_count; j++) { > + if (vp->ref_slots[j].slotIndex == hp->frame_id) { > + found = 1; > + break; > + } > + } > + if (found) > + continue; > + > + err = vk_av1_fill_pict(avctx, &ap->ref_src[ref_count], &vp->ref_slots[ref_count], > + &vp->refs[ref_count], &ap->std_ref_info[ref_count], &ap->vkav1_refs[ref_count], > + ref_frame, 0, 0); > if (err < 0) > return err; > > @@ -249,20 +294,33 @@ static int vk_av1_start_frame(AVCodecContext *avctx, > } > > err = vk_av1_fill_pict(avctx, NULL, &vp->ref_slot, &vp->ref, > + &ap->std_ref, > &ap->vkav1_ref, > - pic, 1, apply_grain, 8); > + pic, 1, apply_grain); > if (err < 0) > return err; > > - ap->tile_list.nb_tiles = 0; > - ap->tile_list.tile_list = ap->tiles; > - > - ap->av1_pic_info = (VkVideoDecodeAV1PictureInfoMESA) { > - .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_PICTURE_INFO_MESA, > - .frame_header = &ap->av1_frame_header, > - .tile_list = &ap->tile_list, > + ap->av1_pic_info = (VkVideoDecodeAV1PictureInfoKHR) { > + .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_PICTURE_INFO_KHR, > + .pStdPictureInfo = &ap->std_pic_info, > + .frameHeaderOffset = 0, > + .tileCount = 0, > + .pTileOffsets = NULL, > + .pTileSizes = ap->tile_sizes, > }; > > + for (int i = 0; i < STD_VIDEO_AV1_REFS_PER_FRAME; i++) { > + const int idx = pic->raw_frame_header->ref_frame_idx[i]; > + const AV1Frame *ref_frame = &s->ref[idx]; > + AV1VulkanDecodePicture *hp = ref_frame->hwaccel_picture_private; > + > + ap->av1_pic_info.referenceNameSlotIndices[i] = -1; Suggest adding an AV1_REF_FRAME_NONE constant as in 6.10.24 and using it here. > + if (s->ref[i].f->pict_type == AV_PICTURE_TYPE_NONE) > + continue; > + > + ap->av1_pic_info.referenceNameSlotIndices[i] = hp->frame_id; Maybe if (s->ref[i].f->pict_type == AV_PICTURE_TYPE_NONE) ap->av1_pic_info.referenceNameSlotIndices[i] = AV1_REF_FRAME_NONE; else ap->av1_pic_info.referenceNameSlotIndices[i] = hp->frame_id; would be a clearer way to write this. > + } > + > vp->decode_info = (VkVideoDecodeInfoKHR) { > .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_INFO_KHR, > .pNext = &ap->av1_pic_info, > @@ -279,9 +337,94 @@ static int vk_av1_start_frame(AVCodecContext *avctx, > }, > }; > > + ap->tile_info = (StdVideoAV1TileInfo) { > + .flags = (StdVideoAV1TileInfoFlags) { > + .uniform_tile_spacing_flag = frame_header->uniform_tile_spacing_flag, > + }, > + .TileCols = frame_header->tile_cols, > + .TileRows = frame_header->tile_rows, > + .context_update_tile_id = frame_header->context_update_tile_id, > + .tile_size_bytes_minus_1 = frame_header->tile_size_bytes_minus1, > + .pWidthInSbsMinus1 = ap->width_in_sbs_minus1, > + .pHeightInSbsMinus1 = ap->height_in_sbs_minus1, > + .pMiColStarts = ap->mi_col_starts, > + .pMiRowStarts = ap->mi_row_starts, > + }; > + > + ap->quantization = (StdVideoAV1Quantization) { > + .flags.using_qmatrix = frame_header->using_qmatrix, > + .flags.diff_uv_delta = frame_header->diff_uv_delta, > + .base_q_idx = frame_header->base_q_idx, > + .DeltaQYDc = frame_header->delta_q_y_dc, > + .DeltaQUDc = frame_header->delta_q_u_dc, > + .DeltaQUAc = frame_header->delta_q_u_ac, > + .DeltaQVDc = frame_header->delta_q_v_dc, > + .DeltaQVAc = frame_header->delta_q_v_ac, > + .qm_y = frame_header->qm_y, > + .qm_u = frame_header->qm_u, > + .qm_v = frame_header->qm_v, > + }; > + > + for (int i = 0; i < STD_VIDEO_AV1_MAX_SEGMENTS; i++) { > + for (int j = 0; j < STD_VIDEO_AV1_SEG_LVL_MAX; j++) { > + ap->segmentation.FeatureEnabled[i] |= frame_header->feature_enabled[i][j] << j; > + ap->segmentation.FeatureData[i][j] = frame_header->feature_value[i][j]; > + } > + } Setting the segmentation fields is repeated (below as well). > + > + ap->loop_filter = (StdVideoAV1LoopFilter) { > + .flags = (StdVideoAV1LoopFilterFlags) { > + .loop_filter_delta_enabled = frame_header->loop_filter_delta_enabled, > + .loop_filter_delta_update = frame_header->loop_filter_delta_update, > + }, > + .loop_filter_sharpness = frame_header->loop_filter_sharpness, > + }; > + > + for (int i = 0; i < STD_VIDEO_AV1_MAX_LOOP_FILTER_STRENGTHS; i++) > + ap->loop_filter.loop_filter_level[i] = frame_header->loop_filter_level[i]; > + for (int i = 0; i < STD_VIDEO_AV1_LOOP_FILTER_ADJUSTMENTS; i++) > + ap->loop_filter.loop_filter_mode_deltas[i] = frame_header->loop_filter_mode_deltas[i]; > + > + ap->cdef = (StdVideoAV1CDEF) { > + .cdef_damping_minus_3 = frame_header->cdef_damping_minus_3, > + .cdef_bits = frame_header->cdef_bits, > + }; > + > + ap->loop_restoration = (StdVideoAV1LoopRestoration) { > + .FrameRestorationType[0] = remap_lr_type[frame_header->lr_type[0]], > + .FrameRestorationType[1] = remap_lr_type[frame_header->lr_type[1]], > + .FrameRestorationType[2] = remap_lr_type[frame_header->lr_type[2]], > + .LoopRestorationSize[0] = 1 + frame_header->lr_unit_shift, > + .LoopRestorationSize[1] = 1 + frame_header->lr_unit_shift - frame_header->lr_uv_shift, > + .LoopRestorationSize[2] = 1 + frame_header->lr_unit_shift - frame_header->lr_uv_shift, "the StdVideoAV1LoopRestoration structure pointed to by pLoopRestoration is interpreted as defined in section 6.10.15 of the AV1 Specification;" 6.10.15 defines LoopRestorationSize as the size in samples, but this is writing it with log2 of that. > + }; > + > + ap->film_grain = (StdVideoAV1FilmGrain) { > + .flags = (StdVideoAV1FilmGrainFlags) { > + .chroma_scaling_from_luma = film_grain->chroma_scaling_from_luma, > + .overlap_flag = film_grain->overlap_flag, > + .clip_to_restricted_range = film_grain->clip_to_restricted_range, > + }, > + .grain_scaling_minus_8 = film_grain->grain_scaling_minus_8, > + .ar_coeff_lag = film_grain->ar_coeff_lag, > + .ar_coeff_shift_minus_6 = film_grain->ar_coeff_shift_minus_6, > + .grain_scale_shift = film_grain->grain_scale_shift, > + .grain_seed = film_grain->grain_seed, > + .film_grain_params_ref_idx = film_grain->film_grain_params_ref_idx, > + .num_y_points = film_grain->num_y_points, > + .num_cb_points = film_grain->num_cb_points, > + .num_cr_points = film_grain->num_cr_points, > + .cb_mult = film_grain->cb_mult, > + .cb_luma_mult = film_grain->cb_luma_mult, > + .cb_offset = film_grain->cb_offset, > + .cr_mult = film_grain->cr_mult, > + .cr_luma_mult = film_grain->cr_luma_mult, > + .cr_offset = film_grain->cr_offset, > + }; > + > /* Setup frame header */ > - ap->av1_frame_header = (StdVideoAV1MESAFrameHeader) { > - .flags = (StdVideoAV1MESAFrameHeaderFlags) { > + ap->std_pic_info = (StdVideoDecodeAV1PictureInfo) { > + .flags = (StdVideoDecodeAV1PictureInfoFlags) { > .error_resilient_mode = frame_header->error_resilient_mode, > .disable_cdf_update = frame_header->disable_cdf_update, > .use_superres = frame_header->use_superres, > @@ -302,174 +445,88 @@ static int vk_av1_start_frame(AVCodecContext *avctx, > .reference_select = frame_header->reference_select, > .skip_mode_present = frame_header->skip_mode_present, > .delta_q_present = frame_header->delta_q_present, > + .delta_lf_present = frame_header->delta_lf_present, > + .delta_lf_multi = frame_header->delta_lf_multi, > + .segmentation_enabled = frame_header->segmentation_enabled, > + .segmentation_update_map = frame_header->segmentation_update_map, > + .segmentation_temporal_update = frame_header->segmentation_temporal_update, > + .segmentation_update_data = frame_header->segmentation_update_data, > + .UsesLr = frame_header->lr_type[0] || frame_header->lr_type[1] || frame_header->lr_type[2], > + .apply_grain = apply_grain, > }, > - .frame_to_show_map_idx = frame_header->frame_to_show_map_idx, > - .frame_presentation_time = frame_header->frame_presentation_time, > - .display_frame_id = frame_header->display_frame_id, > .frame_type = frame_header->frame_type, > .current_frame_id = frame_header->current_frame_id, > - .order_hint = frame_header->order_hint, > + .OrderHint = frame_header->order_hint, > .primary_ref_frame = frame_header->primary_ref_frame, > - .frame_width_minus_1 = frame_header->frame_width_minus_1, > - .frame_height_minus_1 = frame_header->frame_height_minus_1, > - .coded_denom = frame_header->coded_denom, > - .render_width_minus_1 = frame_header->render_width_minus_1, > - .render_height_minus_1 = frame_header->render_height_minus_1, > .refresh_frame_flags = frame_header->refresh_frame_flags, > .interpolation_filter = frame_header->interpolation_filter, > - .tx_mode = frame_header->tx_mode, > - .tiling = (StdVideoAV1MESATileInfo) { > - .flags = (StdVideoAV1MESATileInfoFlags) { > - .uniform_tile_spacing_flag = frame_header->uniform_tile_spacing_flag, > - }, > - .tile_cols = frame_header->tile_cols, > - .tile_rows = frame_header->tile_rows, > - .context_update_tile_id = frame_header->context_update_tile_id, > - .tile_size_bytes_minus1 = frame_header->tile_size_bytes_minus1, > - }, > - .quantization = (StdVideoAV1MESAQuantization) { > - .flags.using_qmatrix = frame_header->using_qmatrix, > - .base_q_idx = frame_header->base_q_idx, > - .delta_q_y_dc = frame_header->delta_q_y_dc, > - .diff_uv_delta = frame_header->diff_uv_delta, > - .delta_q_u_dc = frame_header->delta_q_u_dc, > - .delta_q_u_ac = frame_header->delta_q_u_ac, > - .delta_q_v_dc = frame_header->delta_q_v_dc, > - .delta_q_v_ac = frame_header->delta_q_v_ac, > - .qm_y = frame_header->qm_y, > - .qm_u = frame_header->qm_u, > - .qm_v = frame_header->qm_v, > - }, > - .delta_q = (StdVideoAV1MESADeltaQ) { > - .flags = (StdVideoAV1MESADeltaQFlags) { > - .delta_lf_present = frame_header->delta_lf_present, > - .delta_lf_multi = frame_header->delta_lf_multi, > - }, > - .delta_q_res = frame_header->delta_q_res, > - .delta_lf_res = frame_header->delta_lf_res, > - }, > - .loop_filter = (StdVideoAV1MESALoopFilter) { > - .flags = (StdVideoAV1MESALoopFilterFlags) { > - .delta_enabled = frame_header->loop_filter_delta_enabled, > - .delta_update = frame_header->loop_filter_delta_update, > - }, > - .level = { > - frame_header->loop_filter_level[0], frame_header->loop_filter_level[1], > - frame_header->loop_filter_level[2], frame_header->loop_filter_level[3], > - }, > - .sharpness = frame_header->loop_filter_sharpness, > - .mode_deltas = { > - frame_header->loop_filter_mode_deltas[0], frame_header->loop_filter_mode_deltas[1], > - }, > - }, > - .cdef = (StdVideoAV1MESACDEF) { > - .damping_minus_3 = frame_header->cdef_damping_minus_3, > - .bits = frame_header->cdef_bits, > - }, > - .lr = (StdVideoAV1MESALoopRestoration) { > - .lr_unit_shift = frame_header->lr_unit_shift, > - .lr_uv_shift = frame_header->lr_uv_shift, > - .lr_type = { frame_header->lr_type[0], frame_header->lr_type[1], frame_header->lr_type[2] }, > - }, > - .segmentation = (StdVideoAV1MESASegmentation) { > - .flags = (StdVideoAV1MESASegmentationFlags) { > - .enabled = frame_header->segmentation_enabled, > - .update_map = frame_header->segmentation_update_map, > - .temporal_update = frame_header->segmentation_temporal_update, > - .update_data = frame_header->segmentation_update_data, > - }, > - }, > - .film_grain = (StdVideoAV1MESAFilmGrainParameters) { > - .flags = (StdVideoAV1MESAFilmGrainFlags) { > - .apply_grain = apply_grain, > - .chroma_scaling_from_luma = film_grain->chroma_scaling_from_luma, > - .overlap_flag = film_grain->overlap_flag, > - .clip_to_restricted_range = film_grain->clip_to_restricted_range, > - }, > - .grain_scaling_minus_8 = film_grain->grain_scaling_minus_8, > - .ar_coeff_lag = film_grain->ar_coeff_lag, > - .ar_coeff_shift_minus_6 = film_grain->ar_coeff_shift_minus_6, > - .grain_scale_shift = film_grain->grain_scale_shift, > - .grain_seed = film_grain->grain_seed, > - .num_y_points = film_grain->num_y_points, > - .num_cb_points = film_grain->num_cb_points, > - .num_cr_points = film_grain->num_cr_points, > - .cb_mult = film_grain->cb_mult, > - .cb_luma_mult = film_grain->cb_luma_mult, > - .cb_offset = film_grain->cb_offset, > - .cr_mult = film_grain->cr_mult, > - .cr_luma_mult = film_grain->cr_luma_mult, > - .cr_offset = film_grain->cr_offset, > - }, > + .TxMode = frame_header->tx_mode, > + .delta_q_res = frame_header->delta_q_res, > + .delta_lf_res = frame_header->delta_lf_res, > + .SkipModeFrame[0] = s->cur_frame.skip_mode_frame_idx[0], > + .SkipModeFrame[1] = s->cur_frame.skip_mode_frame_idx[1], > + .coded_denom = frame_header->coded_denom, > + .pTileInfo = &ap->tile_info, > + .pQuantization = &ap->quantization, > + .pSegmentation = &ap->segmentation, > + .pLoopFilter = &ap->loop_filter, > + .pCDEF = &ap->cdef, > + .pLoopRestoration = &ap->loop_restoration, > + .pGlobalMotion = &ap->global_motion, > + .pFilmGrain = apply_grain ? &ap->film_grain : NULL, > }; > > for (int i = 0; i < 64; i++) { > - ap->av1_frame_header.tiling.width_in_sbs_minus_1[i] = frame_header->width_in_sbs_minus_1[i]; > - ap->av1_frame_header.tiling.height_in_sbs_minus_1[i] = frame_header->height_in_sbs_minus_1[i]; > - ap->av1_frame_header.tiling.tile_start_col_sb[i] = frame_header->tile_start_col_sb[i]; > - ap->av1_frame_header.tiling.tile_start_row_sb[i] = frame_header->tile_start_row_sb[i]; > + ap->width_in_sbs_minus1[i] = frame_header->width_in_sbs_minus_1[i]; > + ap->height_in_sbs_minus1[i] = frame_header->height_in_sbs_minus_1[i]; > + ap->mi_col_starts[i] = frame_header->tile_start_col_sb[i]; > + ap->mi_row_starts[i] = frame_header->tile_start_row_sb[i]; > } > > for (int i = 0; i < 8; i++) { This is using the magic constant 8 as at least three different things (AV1_MAX_SEGMENTS, AV1_NUM_REF_FRAMES, cdef_bits). Suggest splitting the loop. > - ap->av1_frame_header.segmentation.feature_enabled_bits[i] = 0; > - for (int j = 0; j < 8; j++) { > - ap->av1_frame_header.segmentation.feature_enabled_bits[i] |= (frame_header->feature_enabled[i][j] << j); > - ap->av1_frame_header.segmentation.feature_data[i][j] = frame_header->feature_value[i][j]; > + ap->segmentation.FeatureEnabled[i] = 0x0; > + for (int j = 0; j < STD_VIDEO_AV1_SEG_LVL_MAX; j++) { > + ap->segmentation.FeatureEnabled[i] |= (frame_header->feature_enabled[i][j] << j); > + ap->segmentation.FeatureData[i][j] = frame_header->feature_value[i][j]; > } > > - ap->av1_frame_header.loop_filter.ref_deltas[i] = frame_header->loop_filter_ref_deltas[i]; > - > - ap->av1_frame_header.cdef.y_pri_strength[i] = frame_header->cdef_y_pri_strength[i]; > - ap->av1_frame_header.cdef.y_sec_strength[i] = frame_header->cdef_y_sec_strength[i]; > - ap->av1_frame_header.cdef.uv_pri_strength[i] = frame_header->cdef_uv_pri_strength[i]; > - ap->av1_frame_header.cdef.uv_sec_strength[i] = frame_header->cdef_uv_sec_strength[i]; > - > - ap->av1_frame_header.ref_order_hint[i] = frame_header->ref_order_hint[i]; > - ap->av1_frame_header.global_motion[i] = (StdVideoAV1MESAGlobalMotion) { > - .flags = (StdVideoAV1MESAGlobalMotionFlags) { > - .gm_invalid = s->cur_frame.gm_invalid[i], > - }, > - .gm_type = s->cur_frame.gm_type[i], > - .gm_params = { > - s->cur_frame.gm_params[i][0], s->cur_frame.gm_params[i][1], > - s->cur_frame.gm_params[i][2], s->cur_frame.gm_params[i][3], > - s->cur_frame.gm_params[i][4], s->cur_frame.gm_params[i][5], > - }, > - }; > - } > + ap->loop_filter.loop_filter_ref_deltas[i] = frame_header->loop_filter_ref_deltas[i]; > > - for (int i = 0; i < 7; i++) { > - ap->av1_frame_header.ref_frame_idx[i] = frame_header->ref_frame_idx[i]; > - ap->av1_frame_header.delta_frame_id_minus1[i] = frame_header->delta_frame_id_minus1[i]; > - } > + ap->cdef.cdef_y_pri_strength[i] = frame_header->cdef_y_pri_strength[i]; > + ap->cdef.cdef_y_sec_strength[i] = frame_header->cdef_y_sec_strength[i]; > + ap->cdef.cdef_uv_pri_strength[i] = frame_header->cdef_uv_pri_strength[i]; > + ap->cdef.cdef_uv_sec_strength[i] = frame_header->cdef_uv_sec_strength[i]; > > - ap->av1_pic_info.skip_mode_frame_idx[0] = s->cur_frame.skip_mode_frame_idx[0]; > - ap->av1_pic_info.skip_mode_frame_idx[1] = s->cur_frame.skip_mode_frame_idx[1]; > + /* Reference frames */ > + ap->std_pic_info.OrderHints[i] = frame_header->ref_order_hint[i]; > + ap->global_motion.GmType[i] = s->cur_frame.gm_type[i]; > + for (int j = 0; j < STD_VIDEO_AV1_GLOBAL_MOTION_PARAMS; j++) { > + ap->global_motion.gm_params[i][j] = s->cur_frame.gm_params[i][j]; > + } > + } > > if (apply_grain) { > - for (int i = 0; i < 14; i++) { > - ap->av1_frame_header.film_grain.point_y_value[i] = film_grain->point_y_value[i]; > - ap->av1_frame_header.film_grain.point_y_scaling[i] = film_grain->point_y_scaling[i]; > + for (int i = 0; i < STD_VIDEO_AV1_MAX_NUM_Y_POINTS; i++) { > + ap->film_grain.point_y_value[i] = film_grain->point_y_value[i]; > + ap->film_grain.point_y_scaling[i] = film_grain->point_y_scaling[i]; > } > > - for (int i = 0; i < 10; i++) { > - ap->av1_frame_header.film_grain.point_cb_value[i] = film_grain->point_cb_value[i]; > - ap->av1_frame_header.film_grain.point_cb_scaling[i] = film_grain->point_cb_scaling[i]; > - ap->av1_frame_header.film_grain.point_cr_value[i] = film_grain->point_cr_value[i]; > - ap->av1_frame_header.film_grain.point_cr_scaling[i] = film_grain->point_cr_scaling[i]; > + for (int i = 0; i < STD_VIDEO_AV1_MAX_NUM_CB_POINTS; i++) { > + ap->film_grain.point_cb_value[i] = film_grain->point_cb_value[i]; > + ap->film_grain.point_cb_scaling[i] = film_grain->point_cb_scaling[i]; > + ap->film_grain.point_cr_value[i] = film_grain->point_cr_value[i]; > + ap->film_grain.point_cr_scaling[i] = film_grain->point_cr_scaling[i]; > } > > - for (int i = 0; i < 24; i++) { > - ap->av1_frame_header.film_grain.ar_coeffs_y_plus_128[i] = film_grain->ar_coeffs_y_plus_128[i]; > - ap->av1_frame_header.film_grain.ar_coeffs_cb_plus_128[i] = film_grain->ar_coeffs_cb_plus_128[i]; > - ap->av1_frame_header.film_grain.ar_coeffs_cr_plus_128[i] = film_grain->ar_coeffs_cr_plus_128[i]; > - } > + for (int i = 0; i < STD_VIDEO_AV1_MAX_NUM_POS_LUMA; i++) > + ap->film_grain.ar_coeffs_y_plus_128[i] = film_grain->ar_coeffs_y_plus_128[i]; > > - ap->av1_frame_header.film_grain.ar_coeffs_cb_plus_128[24] = film_grain->ar_coeffs_cb_plus_128[24]; > - ap->av1_frame_header.film_grain.ar_coeffs_cr_plus_128[24] = film_grain->ar_coeffs_cr_plus_128[24]; > + for (int i = 0; i < STD_VIDEO_AV1_MAX_NUM_POS_CHROMA; i++) { > + ap->film_grain.ar_coeffs_cb_plus_128[i] = film_grain->ar_coeffs_cb_plus_128[i]; > + ap->film_grain.ar_coeffs_cr_plus_128[i] = film_grain->ar_coeffs_cr_plus_128[i]; > + } > } > > - /* Workaround for a spec issue. */ > ap->dec = dec; > > return 0; > @@ -484,25 +541,20 @@ static int vk_av1_decode_slice(AVCodecContext *avctx, > AV1VulkanDecodePicture *ap = s->cur_frame.hwaccel_picture_private; > FFVulkanDecodePicture *vp = &ap->vp; > > + /* Too many tiles, exceeding all defined levels in the AV1 spec */ > + if (ap->av1_pic_info.tileCount > MAX_TILES) > + return AVERROR(ENOSYS); This seems to be checking how many tiles have already been decoded at the beginning of the tile group? (Was expecting something like "tileCount + tg_end - tg_start" instead?) > + > for (int i = s->tg_start; i <= s->tg_end; i++) { > - ap->tiles[ap->tile_list.nb_tiles] = (StdVideoAV1MESATile) { > - .size = s->tile_group_info[i].tile_size, > - .offset = s->tile_group_info[i].tile_offset, > - .row = s->tile_group_info[i].tile_row, > - .column = s->tile_group_info[i].tile_column, > - .tg_start = s->tg_start, > - .tg_end = s->tg_end, > - }; > + ap->tile_sizes[ap->av1_pic_info.tileCount] = s->tile_group_info[i].tile_size; > > err = ff_vk_decode_add_slice(avctx, vp, > data + s->tile_group_info[i].tile_offset, > s->tile_group_info[i].tile_size, 0, > - &ap->tile_list.nb_tiles, > - &ap->tile_offsets); > + &ap->av1_pic_info.tileCount, > + &ap->av1_pic_info.pTileOffsets); > if (err < 0) > return err; > - > - ap->tiles[ap->tile_list.nb_tiles - 1].offset = ap->tile_offsets[ap->tile_list.nb_tiles - 1]; > } > > return 0; > @@ -518,7 +570,7 @@ static int vk_av1_end_frame(AVCodecContext *avctx) > FFVulkanDecodePicture *rvp[AV1_NUM_REF_FRAMES] = { 0 }; > AVFrame *rav[AV1_NUM_REF_FRAMES] = { 0 }; > > - if (!ap->tile_list.nb_tiles) > + if (!ap->av1_pic_info.tileCount) > return 0; > > if (!dec->session_params) { > @@ -536,7 +588,7 @@ static int vk_av1_end_frame(AVCodecContext *avctx) > } > > av_log(avctx, AV_LOG_VERBOSE, "Decoding frame, %"SIZE_SPECIFIER" bytes, %i tiles\n", > - vp->slices_size, ap->tile_list.nb_tiles); > + vp->slices_size, ap->av1_pic_info.tileCount); > > return ff_vk_decode_frame(avctx, pic->f, vp, rav, rvp); > } > @@ -580,8 +632,6 @@ const FFHWAccel ff_av1_vulkan_hwaccel = { > * flexibility, this index cannot be present anywhere. > * The current implementation tracks the index for the driver and submits it > * as necessary information. Due to needing to modify the decoding context, > - * which is not thread-safe, on frame free, threading is disabled. > - * In the future, once this is fixed in the spec, the workarounds may be removed > - * and threading enabled. */ > + * which is not thread-safe, on frame free, threading is disabled. */ > .caps_internal = HWACCEL_CAP_ASYNC_SAFE, > }; > ... Thanks, - Mark _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".