From: Lynne <dev@lynne.ee>
To: ffmpeg-devel@ffmpeg.org
Cc: Lynne <dev@lynne.ee>
Subject: [FFmpeg-devel] [PATCH v2 08/12] ffv1enc_vulkan: refactor shaders slightly to support sharing
Date: Mon, 24 Feb 2025 09:04:21 +0100
Message-ID: <20250224080434.5632-8-dev@lynne.ee>
In-Reply-To: <20250224080434.5632-1-dev@lynne.ee>

The shaders were written to support sharing, but needed slight
tweaking.
---
 libavcodec/Makefile                   |   2 +-
 libavcodec/ffv1_vulkan.c              | 123 ++++++++++++++
 libavcodec/ffv1_vulkan.h              |  60 +++++++
 libavcodec/ffv1enc_vulkan.c           | 234 +++++++++-----------------
 libavcodec/vulkan/ffv1_common.comp    |  24 ++-
 libavcodec/vulkan/ffv1_enc_setup.comp |  18 +-
 libavcodec/vulkan/ffv1_reset.comp     |   3 +-
 libavcodec/vulkan/rangecoder.comp     |  27 +--
 8 files changed, 302 insertions(+), 189 deletions(-)
 create mode 100644 libavcodec/ffv1_vulkan.c
 create mode 100644 libavcodec/ffv1_vulkan.h

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 9630074205..0e96b33ef3 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -371,7 +371,7 @@ OBJS-$(CONFIG_EXR_ENCODER) += exrenc.o float2half.o
 OBJS-$(CONFIG_FASTAUDIO_DECODER) += fastaudio.o
 OBJS-$(CONFIG_FFV1_DECODER) += ffv1dec.o ffv1_parse.o ffv1.o
 OBJS-$(CONFIG_FFV1_ENCODER) += ffv1enc.o ffv1_parse.o ffv1.o
-OBJS-$(CONFIG_FFV1_VULKAN_ENCODER) += ffv1enc.o ffv1.o ffv1enc_vulkan.o
+OBJS-$(CONFIG_FFV1_VULKAN_ENCODER) += ffv1enc.o ffv1.o ffv1_vulkan.o ffv1enc_vulkan.o
 OBJS-$(CONFIG_FFWAVESYNTH_DECODER) += ffwavesynth.o
 OBJS-$(CONFIG_FIC_DECODER) += fic.o
 OBJS-$(CONFIG_FITS_DECODER) += fitsdec.o fits.o
diff --git a/libavcodec/ffv1_vulkan.c b/libavcodec/ffv1_vulkan.c
new file mode 100644
index 0000000000..6f49e2ebb1
--- /dev/null
+++ b/libavcodec/ffv1_vulkan.c
@@ -0,0 +1,123 @@
+/*
+ * Copyright (c) 2025 Lynne <dev@lynne.ee>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "ffv1_vulkan.h"
+#include "libavutil/crc.h"
+
+int ff_ffv1_vk_update_state_transition_data(FFVulkanContext *s,
+                                            FFVkBuffer *vkb, FFV1Context *f)
+{
+    int err;
+    uint8_t *buf_mapped;
+
+    RET(ff_vk_map_buffer(s, vkb, &buf_mapped, 0));
+
+    for (int i = 1; i < 256; i++) {
+        buf_mapped[256 + i] = f->state_transition[i];
+        buf_mapped[256 - i] = 256 - (int)f->state_transition[i];
+    }
+
+    RET(ff_vk_unmap_buffer(s, vkb, 1));
+
+fail:
+    return err;
+}
+
+static int init_state_transition_data(FFVulkanContext *s,
+                                      FFVkBuffer *vkb, FFV1Context *f,
+                                      int (*write_data)(FFVulkanContext *s,
+                                                        FFVkBuffer *vkb, FFV1Context *f))
+{
+    int err;
+    size_t buf_len = 512*sizeof(uint8_t);
+
+    RET(ff_vk_create_buf(s, vkb,
+                         buf_len,
+                         NULL, NULL,
+                         VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
+                         VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
+                         VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
+                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
+
+    write_data(s, vkb, f);
+
+fail:
+    return err;
+}
+
+int ff_ffv1_vk_init_state_transition_data(FFVulkanContext *s,
+                                          FFVkBuffer *vkb, FFV1Context *f)
+{
+    return init_state_transition_data(s, vkb, f,
+                                      ff_ffv1_vk_update_state_transition_data);
+}
+
+int ff_ffv1_vk_init_quant_table_data(FFVulkanContext *s,
+                                     FFVkBuffer *vkb, FFV1Context *f)
+{
+    int err;
+
+    int16_t *buf_mapped;
+    size_t buf_len = MAX_QUANT_TABLES*
+                     MAX_CONTEXT_INPUTS*
+                     MAX_QUANT_TABLE_SIZE*sizeof(int16_t);
+
+    RET(ff_vk_create_buf(s, vkb,
+                         buf_len,
+                         NULL, NULL,
+                         VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
+                         VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
+                         VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
+                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
+    RET(ff_vk_map_buffer(s, vkb, (void *)&buf_mapped, 0));
+
+    memcpy(buf_mapped, f->quant_tables,
+           sizeof(f->quant_tables));
+
+    RET(ff_vk_unmap_buffer(s, vkb, 1));
+
+fail:
+    return err;
+}
+
+int ff_ffv1_vk_init_crc_table_data(FFVulkanContext *s,
+                                   FFVkBuffer *vkb, FFV1Context *f)
+{
+    int err;
+
+    uint32_t *buf_mapped;
+    size_t buf_len = 256*sizeof(int32_t);
+
+    RET(ff_vk_create_buf(s, vkb,
+                         buf_len,
+                         NULL, NULL,
+                         VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
+                         VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
+                         VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
+                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
+    RET(ff_vk_map_buffer(s, vkb, (void *)&buf_mapped, 0));
+
+    memcpy(buf_mapped, av_crc_get_table(AV_CRC_32_IEEE), buf_len);
+
+    RET(ff_vk_unmap_buffer(s, vkb, 1));
+
+fail:
+    return err;
+}
diff --git a/libavcodec/ffv1_vulkan.h b/libavcodec/ffv1_vulkan.h
new file mode 100644
index 0000000000..0da6dc2d33
--- /dev/null
+++ b/libavcodec/ffv1_vulkan.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright (c) 2024 Lynne <dev@lynne.ee>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_FFV1_VULKAN_H
+#define AVCODEC_FFV1_VULKAN_H
+
+#include "libavutil/vulkan.h"
+#include "ffv1.h"
+
+int ff_ffv1_vk_update_state_transition_data(FFVulkanContext *s,
+                                            FFVkBuffer *vkb, FFV1Context *f);
+
+int ff_ffv1_vk_init_state_transition_data(FFVulkanContext *s,
+                                          FFVkBuffer *vkb, FFV1Context *f);
+
+int ff_ffv1_vk_init_quant_table_data(FFVulkanContext *s,
+                                     FFVkBuffer *vkb, FFV1Context *f);
+
+int ff_ffv1_vk_init_crc_table_data(FFVulkanContext *s,
+                                   FFVkBuffer *vkb, FFV1Context *f);
+
+typedef struct FFv1VkRCTParameters {
+    int offset;
+    uint8_t bits;
+    uint8_t planar_rgb;
+    uint8_t transparency;
+    uint8_t version;
+    uint8_t micro_version;
+    uint8_t padding[3];
+} FFv1VkRCTParameters;
+
+typedef struct FFv1VkResetParameters {
+    VkDeviceAddress slice_state;
+    uint32_t plane_state_size;
+    uint32_t context_count;
+    uint8_t codec_planes;
+    uint8_t key_frame;
+    uint8_t version;
+    uint8_t micro_version;
+    uint8_t padding[1];
+} FFv1VkResetParameters;
+
+#endif /* AVCODEC_FFV1_VULKAN_H */
diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
index 6a12ee2055..88801ca8e6 100644
--- a/libavcodec/ffv1enc_vulkan.c
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -18,7 +18,6 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-#include "libavutil/crc.h"
 #include "libavutil/mem.h"
 #include "libavutil/vulkan.h"
 #include "libavutil/vulkan_spirv.h"
@@ -32,6 +31,7 @@
 
 #include "ffv1.h"
 #include "ffv1enc.h"
+#include "ffv1_vulkan.h"
 
 /* Parallel Golomb alignment */
 #define LG_ALIGN_W 32
@@ -122,28 +122,10 @@ extern const char *ff_source_ffv1_enc_setup_comp;
 extern const char *ff_source_ffv1_enc_comp;
 extern const char *ff_source_ffv1_enc_rgb_comp;
 
-typedef struct FFv1VkRCTParameters {
-    int offset;
-    uint8_t bits;
-    uint8_t planar_rgb;
-    uint8_t transparency;
-    uint8_t padding[1];
-} FFv1VkRCTParameters;
-
-typedef struct FFv1VkResetParameters {
-    VkDeviceAddress slice_state;
-    uint32_t plane_state_size;
-    uint32_t context_count;
-    uint8_t codec_planes;
-    uint8_t key_frame;
-    uint8_t padding[3];
-} FFv1VkResetParameters;
-
 typedef struct FFv1VkParameters {
     VkDeviceAddress slice_state;
     VkDeviceAddress scratch_data;
     VkDeviceAddress out_data;
-    uint64_t slice_size_max;
 
     int32_t sar[2];
     uint32_t chroma_shift[2];
@@ -151,6 +133,7 @@ typedef struct FFv1VkParameters {
     uint32_t plane_state_size;
     uint32_t context_count;
     uint32_t crcref;
+    uint32_t slice_size_max;
 
     uint8_t bits_per_raw_sample;
     uint8_t context_model;
@@ -175,7 +158,6 @@ static void add_push_data(FFVulkanShader *shd)
     GLSLC(1, u8buf slice_state; );
     GLSLC(1, u8buf scratch_data; );
     GLSLC(1, u8buf out_data; );
-    GLSLC(1, uint64_t slice_size_max; );
     GLSLC(0, );
     GLSLC(1, ivec2 sar; );
     GLSLC(1, uvec2 chroma_shift; );
@@ -183,6 +165,7 @@ static void add_push_data(FFVulkanShader *shd)
     GLSLC(1, uint plane_state_size; );
     GLSLC(1, uint context_count; );
     GLSLC(1, uint32_t crcref; );
+    GLSLC(1, uint32_t slice_size_max; );
     GLSLC(0, );
     GLSLC(1, uint8_t bits_per_raw_sample; );
     GLSLC(1, uint8_t context_model; );
@@ -492,7 +475,6 @@ static int vulkan_encode_ffv1_submit_frame(AVCodecContext *avctx,
         .slice_state = slice_data_buf->address + f->slice_count*256,
         .scratch_data = tmp_data_buf->address,
         .out_data = out_data_buf->address,
-        .slice_size_max = out_data_buf->size / f->slice_count,
         .bits_per_raw_sample = f->bits_per_raw_sample,
         .sar[0] = pict->sample_aspect_ratio.num,
         .sar[1] = pict->sample_aspect_ratio.den,
@@ -501,6 +483,7 @@ static int vulkan_encode_ffv1_submit_frame(AVCodecContext *avctx,
         .plane_state_size = plane_state_size,
         .context_count = context_count,
         .crcref = f->crcref,
+        .slice_size_max = out_data_buf->size / f->slice_count,
         .context_model = fv->ctx.context_model,
         .version = f->version,
         .micro_version = f->micro_version,
@@ -966,7 +949,6 @@ static void define_shared_code(AVCodecContext *avctx, FFVulkanShader *shd)
     GLSLF(0, #define TYPE int%i_t ,smp_bits);
     GLSLF(0, #define VTYPE2 i%ivec2 ,smp_bits);
     GLSLF(0, #define VTYPE3 i%ivec3 ,smp_bits);
-    GLSLD(ff_source_common_comp);
     GLSLD(ff_source_rangecoder_comp);
 
     if (f->ac == AC_GOLOMB_RICE)
@@ -993,6 +975,10 @@ static int init_setup_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
                           1, 1, 1,
                           0));
 
+    /* Common codec header */
+    GLSLD(ff_source_common_comp);
+    add_push_data(shd);
+
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLES %i\n", MAX_QUANT_TABLES);
     av_bprintf(&shd->src, "#define MAX_CONTEXT_INPUTS %i\n", MAX_CONTEXT_INPUTS);
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLE_SIZE %i\n", MAX_QUANT_TABLE_SIZE);
@@ -1038,8 +1024,6 @@ static int init_setup_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
     };
     RET(ff_vk_shader_add_descriptor_set(&fv->s, shd, desc_set, 2, 0, 0));
 
-    add_push_data(shd);
-
     GLSLD(ff_source_ffv1_enc_setup_comp);
 
     RET(spv->compile_shader(&fv->s, spv, shd, &spv_data, &spv_len, "main",
@@ -1074,6 +1058,22 @@ static int init_reset_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
                           wg_dim, 1, 1,
                           0));
 
+    /* Common codec header */
+    GLSLD(ff_source_common_comp);
+
+    GLSLC(0, layout(push_constant, scalar) uniform pushConstants { );
+    GLSLC(1, u8buf slice_state; );
+    GLSLC(1, uint plane_state_size; );
+    GLSLC(1, uint context_count; );
+    GLSLC(1, uint8_t codec_planes; );
+    GLSLC(1, uint8_t key_frame; );
+    GLSLC(1, uint8_t version; );
+    GLSLC(1, uint8_t micro_version; );
+    GLSLC(1, uint8_t padding[1]; );
+    GLSLC(0, }; );
+    ff_vk_shader_add_push_const(shd, 0, sizeof(FFv1VkResetParameters),
+                                VK_SHADER_STAGE_COMPUTE_BIT);
+
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLES %i\n", MAX_QUANT_TABLES);
     av_bprintf(&shd->src, "#define MAX_CONTEXT_INPUTS %i\n", MAX_CONTEXT_INPUTS);
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLE_SIZE %i\n", MAX_QUANT_TABLE_SIZE);
@@ -1110,17 +1110,6 @@ static int init_reset_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
     };
     RET(ff_vk_shader_add_descriptor_set(&fv->s, shd, desc_set, 1, 0, 0));
 
-    GLSLC(0, layout(push_constant, scalar) uniform pushConstants { );
-    GLSLC(1, u8buf slice_state; );
-    GLSLC(1, uint plane_state_size; );
-    GLSLC(1, uint context_count; );
-    GLSLC(1, uint8_t codec_planes; );
-    GLSLC(1, uint8_t key_frame; );
-    GLSLC(1, uint8_t padding[3]; );
-    GLSLC(0, }; );
-    ff_vk_shader_add_push_const(shd, 0, sizeof(FFv1VkResetParameters),
-                                VK_SHADER_STAGE_COMPUTE_BIT);
-
     GLSLD(ff_source_ffv1_reset_comp);
 
     RET(spv->compile_shader(&fv->s, spv, shd, &spv_data, &spv_len, "main",
@@ -1164,6 +1153,21 @@ static int init_rct_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
                           wg_count, wg_count, 1,
                           0));
 
+    /* Common codec header */
+    GLSLD(ff_source_common_comp);
+
+    GLSLC(0, layout(push_constant, scalar) uniform pushConstants { );
+    GLSLC(1, int offset; );
+    GLSLC(1, uint8_t bits; );
+    GLSLC(1, uint8_t planar_rgb; );
+    GLSLC(1, uint8_t transparency; );
+    GLSLC(1, uint8_t version; );
+    GLSLC(1, uint8_t micro_version; );
+    GLSLC(1, uint8_t padding[3]; );
+    GLSLC(0, }; );
+    ff_vk_shader_add_push_const(shd, 0, sizeof(FFv1VkRCTParameters),
+                                VK_SHADER_STAGE_COMPUTE_BIT);
+
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLES %i\n", MAX_QUANT_TABLES);
     av_bprintf(&shd->src, "#define MAX_CONTEXT_INPUTS %i\n", MAX_CONTEXT_INPUTS);
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLE_SIZE %i\n", MAX_QUANT_TABLE_SIZE);
@@ -1220,16 +1224,6 @@ static int init_rct_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
     };
     RET(ff_vk_shader_add_descriptor_set(&fv->s, shd, desc_set, 3, 0, 0));
 
-    GLSLC(0, layout(push_constant, scalar) uniform pushConstants { );
-    GLSLC(1, int offset; );
-    GLSLC(1, uint8_t bits; );
-    GLSLC(1, uint8_t planar_rgb; );
-    GLSLC(1, uint8_t transparency; );
-    GLSLC(1, uint8_t padding[1]; );
-    GLSLC(0, }; );
-    ff_vk_shader_add_push_const(shd, 0, sizeof(FFv1VkRCTParameters),
-                                VK_SHADER_STAGE_COMPUTE_BIT);
-
     GLSLD(ff_source_ffv1_enc_rct_comp);
 
     RET(spv->compile_shader(&fv->s, spv, shd, &spv_data, &spv_len, "main",
@@ -1268,6 +1262,11 @@ static int init_encode_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
                           1, 1, 1,
                           0));
 
+    /* Common codec header */
+    GLSLD(ff_source_common_comp);
+
+    add_push_data(shd);
+
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLES %i\n", MAX_QUANT_TABLES);
     av_bprintf(&shd->src, "#define MAX_CONTEXT_INPUTS %i\n", MAX_CONTEXT_INPUTS);
     av_bprintf(&shd->src, "#define MAX_QUANT_TABLE_SIZE %i\n", MAX_QUANT_TABLE_SIZE);
@@ -1328,8 +1327,6 @@ static int init_encode_shader(AVCodecContext *avctx, FFVkSPIRVCompiler *spv)
     };
     RET(ff_vk_shader_add_descriptor_set(&fv->s, shd, desc_set, 3, 0, 0));
 
-    add_push_data(shd);
-
     /* Assemble the shader body */
     GLSLD(ff_source_ffv1_enc_common_comp);
 
@@ -1356,110 +1353,6 @@ fail:
     return err;
 }
 
-static int init_state_transition_data(AVCodecContext *avctx)
-{
-    int err;
-    VulkanEncodeFFv1Context *fv = avctx->priv_data;
-
-    uint8_t *buf_mapped;
-    size_t buf_len = 512*sizeof(uint8_t);
-
-    RET(ff_vk_create_buf(&fv->s, &fv->rangecoder_static_buf,
-                         buf_len,
-                         NULL, NULL,
-                         VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
-                         VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
-                         VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
-                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
-    RET(ff_vk_map_buffer(&fv->s, &fv->rangecoder_static_buf,
-                         &buf_mapped, 0));
-
-    for (int i = 1; i < 256; i++) {
-        buf_mapped[256 + i] = fv->ctx.state_transition[i];
-        buf_mapped[256 - i] = 256 - (int)fv->ctx.state_transition[i];
-    }
-
-    RET(ff_vk_unmap_buffer(&fv->s, &fv->rangecoder_static_buf, 1));
-
-    /* Update descriptors */
-    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
-                                        &fv->setup, 0, 0, 0,
-                                        &fv->rangecoder_static_buf,
-                                        0, fv->rangecoder_static_buf.size,
-                                        VK_FORMAT_UNDEFINED));
-    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
-                                        &fv->enc, 0, 0, 0,
-                                        &fv->rangecoder_static_buf,
-                                        0, fv->rangecoder_static_buf.size,
-                                        VK_FORMAT_UNDEFINED));
-
-fail:
-    return err;
-}
-
-static int init_quant_table_data(AVCodecContext *avctx)
-{
-    int err;
-    VulkanEncodeFFv1Context *fv = avctx->priv_data;
-
-    int16_t *buf_mapped;
-    size_t buf_len = MAX_QUANT_TABLES*
-                     MAX_CONTEXT_INPUTS*
-                     MAX_QUANT_TABLE_SIZE*sizeof(int16_t);
-
-    RET(ff_vk_create_buf(&fv->s, &fv->quant_buf,
-                         buf_len,
-                         NULL, NULL,
-                         VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
-                         VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
-                         VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
-                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
-    RET(ff_vk_map_buffer(&fv->s, &fv->quant_buf, (void *)&buf_mapped, 0));
-
-    memcpy(buf_mapped, fv->ctx.quant_tables,
-           sizeof(fv->ctx.quant_tables));
-
-    RET(ff_vk_unmap_buffer(&fv->s, &fv->quant_buf, 1));
-    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
-                                        &fv->enc, 0, 1, 0,
-                                        &fv->quant_buf,
-                                        0, fv->quant_buf.size,
-                                        VK_FORMAT_UNDEFINED));
-
-fail:
-    return err;
-}
-
-static int init_crc_table_data(AVCodecContext *avctx)
-{
-    int err;
-    VulkanEncodeFFv1Context *fv = avctx->priv_data;
-
-    uint32_t *buf_mapped;
-    size_t buf_len = 256*sizeof(int32_t);
-
-    RET(ff_vk_create_buf(&fv->s, &fv->crc_tab_buf,
-                         buf_len,
-                         NULL, NULL,
-                         VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT |
-                         VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
-                         VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
-                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT));
-    RET(ff_vk_map_buffer(&fv->s, &fv->crc_tab_buf, (void *)&buf_mapped, 0));
-
-    memcpy(buf_mapped, av_crc_get_table(AV_CRC_32_IEEE), buf_len);
-
-    RET(ff_vk_unmap_buffer(&fv->s, &fv->crc_tab_buf, 1));
-    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
-                                        &fv->enc, 0, 2, 0,
-                                        &fv->crc_tab_buf,
-                                        0, fv->crc_tab_buf.size,
-                                        VK_FORMAT_UNDEFINED));
-
-fail:
-    return err;
-}
-
 static av_cold int vulkan_encode_ffv1_init(AVCodecContext *avctx)
 {
     int err;
@@ -1703,20 +1596,50 @@ static av_cold int vulkan_encode_ffv1_init(AVCodecContext *avctx)
     spv->uninit(&spv);
 
     /* Range coder data */
-    err = init_state_transition_data(avctx);
+    err = ff_ffv1_vk_init_state_transition_data(&fv->s,
+                                                &fv->rangecoder_static_buf,
+                                                f);
     if (err < 0)
         return err;
 
     /* Quantization table data */
-    err = init_quant_table_data(avctx);
+    err = ff_ffv1_vk_init_quant_table_data(&fv->s,
+                                           &fv->quant_buf,
+                                           f);
     if (err < 0)
         return err;
 
     /* CRC table buffer */
-    err = init_crc_table_data(avctx);
+    err = ff_ffv1_vk_init_crc_table_data(&fv->s,
+                                         &fv->crc_tab_buf,
+                                         f);
     if (err < 0)
         return err;
 
+    /* Update setup global descriptors */
+    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
+                                        &fv->setup, 0, 0, 0,
+                                        &fv->rangecoder_static_buf,
+                                        0, fv->rangecoder_static_buf.size,
+                                        VK_FORMAT_UNDEFINED));
+
+    /* Update encode global descriptors */
+    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
+                                        &fv->enc, 0, 0, 0,
+                                        &fv->rangecoder_static_buf,
+                                        0, fv->rangecoder_static_buf.size,
+                                        VK_FORMAT_UNDEFINED));
+    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
+                                        &fv->enc, 0, 1, 0,
+                                        &fv->quant_buf,
+                                        0, fv->quant_buf.size,
+                                        VK_FORMAT_UNDEFINED));
+    RET(ff_vk_shader_update_desc_buffer(&fv->s, &fv->exec_pool.contexts[0],
+                                        &fv->enc, 0, 2, 0,
+                                        &fv->crc_tab_buf,
+                                        0, fv->crc_tab_buf.size,
+                                        VK_FORMAT_UNDEFINED));
+
     /* Temporary frame */
     fv->frame = av_frame_alloc();
     if (!fv->frame)
@@ -1735,7 +1658,8 @@ static av_cold int vulkan_encode_ffv1_init(AVCodecContext *avctx)
     if (!fv->buf_regions)
         return AVERROR(ENOMEM);
 
-    return 0;
+fail:
+    return err;
 }
 
 static av_cold int vulkan_encode_ffv1_close(AVCodecContext *avctx)
diff --git a/libavcodec/vulkan/ffv1_common.comp b/libavcodec/vulkan/ffv1_common.comp
index 5b4a882367..604d03b2de 100644
--- a/libavcodec/vulkan/ffv1_common.comp
+++ b/libavcodec/vulkan/ffv1_common.comp
@@ -22,17 +22,18 @@
 
 struct SliceContext {
     RangeCoder c;
-
-#ifdef GOLOMB
     PutBitContext pb; /* 8*8 bytes */
-#endif
 
     ivec2 slice_dim;
     ivec2 slice_pos;
     ivec2 slice_rct_coef;
+    u8vec4 quant_table_idx;
+    uint context_count;
 
     uint hdr_len; // only used for golomb
-    int slice_coding_mode;
+
+    uint slice_coding_mode;
+    bool slice_reset_contexts;
 };
 
 /* -1, { -1, 0 } */
@@ -72,3 +73,18 @@ const uint32_t log2_run[41] = {
     16, 17, 18, 19, 20, 21, 22, 23,
     24,
 };
+
+uint slice_coord(uint width, uint sx, uint num_h_slices, uint chroma_shift)
+{
+    uint mpw = 1 << chroma_shift;
+    uint awidth = align(width, mpw);
+
+    if ((version < 4) || ((version == 4) && (micro_version < 3)))
+        return width * sx / num_h_slices;
+
+    sx = (2 * awidth * sx + num_h_slices * mpw) / (2 * num_h_slices * mpw) * mpw;
+    if (sx == awidth)
+        sx = width;
+
+    return sx;
+}
diff --git a/libavcodec/vulkan/ffv1_enc_setup.comp b/libavcodec/vulkan/ffv1_enc_setup.comp
index b861e25f74..23f09b2af6 100644
--- a/libavcodec/vulkan/ffv1_enc_setup.comp
+++ b/libavcodec/vulkan/ffv1_enc_setup.comp
@@ -20,21 +20,6 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-uint slice_coord(uint width, uint sx, uint num_h_slices, uint chroma_shift)
-{
-    uint mpw = 1 << chroma_shift;
-    uint awidth = align(width, mpw);
-
-    if ((version < 4) || ((version == 4) && (micro_version < 3)))
-        return width * sx / num_h_slices;
-
-    sx = (2 * awidth * sx + num_h_slices * mpw) / (2 * num_h_slices * mpw) * mpw;
-    if (sx == awidth)
-        sx = width;
-
-    return sx;
-}
-
 void init_slice(out SliceContext sc, const uint slice_idx)
 {
     /* Set coordinates */
@@ -52,6 +37,7 @@ void init_slice(out SliceContext sc, const uint slice_idx)
     sc.slice_dim = ivec2(sxe - sxs, sye - sys);
     sc.slice_rct_coef = ivec2(1, 1);
     sc.slice_coding_mode = int(force_pcm == 1);
+    sc.slice_reset_contexts = sc.slice_coding_mode == 1;
 
     rac_init(sc.c,
              OFFBUF(u8buf, out_data, slice_idx * slice_size_max),
@@ -105,7 +91,7 @@ void write_slice_header(inout SliceContext sc, uint64_t state)
     put_symbol_unsigned(sc.c, state, sar.y);
 
     if (version >= 4) {
-        put_rac_full(sc.c, state, sc.slice_coding_mode == 1);
+        put_rac_full(sc.c, state, sc.slice_reset_contexts);
         put_symbol_unsigned(sc.c, state, sc.slice_coding_mode);
         if (sc.slice_coding_mode != 1 && colorspace == 1) {
             put_symbol_unsigned(sc.c, state, sc.slice_rct_coef.y);
diff --git a/libavcodec/vulkan/ffv1_reset.comp b/libavcodec/vulkan/ffv1_reset.comp
index c7c7962850..1b87ca754e 100644
--- a/libavcodec/vulkan/ffv1_reset.comp
+++ b/libavcodec/vulkan/ffv1_reset.comp
@@ -24,7 +24,8 @@ void main(void)
 {
     const uint slice_idx = gl_WorkGroupID.y*gl_NumWorkGroups.x + gl_WorkGroupID.x;
 
-    if (slice_ctx[slice_idx].slice_coding_mode == 0 && key_frame == 0)
+    if (key_frame == 0 &&
+        slice_ctx[slice_idx].slice_reset_contexts == false)
         return;
 
     uint64_t slice_state_off = uint64_t(slice_state) +
diff --git a/libavcodec/vulkan/rangecoder.comp b/libavcodec/vulkan/rangecoder.comp
index 848a056fb1..6e3b9c1238 100644
--- a/libavcodec/vulkan/rangecoder.comp
+++ b/libavcodec/vulkan/rangecoder.comp
@@ -21,8 +21,9 @@
  */
 
 struct RangeCoder {
-    u8buf bytestream_start;
-    u8buf bytestream;
+    uint64_t bytestream_start;
+    uint64_t bytestream;
+    uint64_t bytestream_end;
 
     uint low;
     uint16_t range;
@@ -34,28 +35,29 @@ struct RangeCoder {
 void renorm_encoder_full(inout RangeCoder c)
 {
     int bs_cnt = 0;
+    u8buf bytestream = u8buf(c.bytestream);
 
     if (c.outstanding_byte == 0xFF) {
         c.outstanding_byte = uint8_t(c.low >> 8);
     } else if (c.low <= 0xFF00) {
-        c.bytestream[bs_cnt++].v = c.outstanding_byte;
+        bytestream[bs_cnt++].v = c.outstanding_byte;
         uint16_t cnt = c.outstanding_count;
         for (; cnt > 0; cnt--)
-            c.bytestream[bs_cnt++].v = uint8_t(0xFF);
+            bytestream[bs_cnt++].v = uint8_t(0xFF);
         c.outstanding_count = uint16_t(0);
         c.outstanding_byte = uint8_t(c.low >> 8);
     } else if (c.low >= 0x10000) {
-        c.bytestream[bs_cnt++].v = c.outstanding_byte + uint8_t(1);
+        bytestream[bs_cnt++].v = c.outstanding_byte + uint8_t(1);
         uint16_t cnt = c.outstanding_count;
         for (; cnt > 0; cnt--)
-            c.bytestream[bs_cnt++].v = uint8_t(0x00);
+            bytestream[bs_cnt++].v = uint8_t(0x00);
         c.outstanding_count = uint16_t(0);
         c.outstanding_byte = uint8_t(bitfieldExtract(c.low, 8, 8));
     } else {
         c.outstanding_count++;
     }
 
-    c.bytestream = OFFBUF(u8buf, c.bytestream, bs_cnt);
+    c.bytestream += bs_cnt;
     c.range <<= 8;
     c.low = bitfieldInsert(0, c.low, 8, 8);
 }
@@ -74,10 +76,10 @@ void renorm_encoder(inout RangeCoder c)
         return;
     }
 
-    u8buf bs = c.bytestream;
+    u8buf bs = u8buf(c.bytestream);
     uint8_t outstanding_byte = c.outstanding_byte;
 
-    c.bytestream = OFFBUF(u8buf, bs, oc);
+    c.bytestream = uint64_t(bs) + oc;
     c.outstanding_count = uint16_t(0);
     c.outstanding_byte = uint8_t(low >> 8);
 
@@ -179,10 +181,11 @@ uint32_t rac_terminate(inout RangeCoder c)
     return uint32_t(uint64_t(c.bytestream) - uint64_t(c.bytestream_start));
 }
 
-void rac_init(out RangeCoder r, u8buf data, uint64_t buf_size)
+void rac_init(out RangeCoder r, u8buf data, uint buf_size)
 {
-    r.bytestream_start = data;
-    r.bytestream = data;
+    r.bytestream_start = uint64_t(data);
+    r.bytestream = uint64_t(data);
+    r.bytestream_end = uint64_t(data) + buf_size;
     r.low = 0;
     r.range = uint16_t(0xFF00);
     r.outstanding_count = uint16_t(0);
-- 
2.47.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
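
Note (not part of the patch): the slice_coord() helper that this patch moves from ffv1_enc_setup.comp into ffv1_common.comp can be checked on the host with a plain C port. The sketch below is illustrative only. It assumes the shader's align() rounds up to a multiple of a power of two, and it takes version and micro_version as explicit arguments, whereas the shader reads them from the push constants. The name align_up is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the shader's align(); rounds x up to a multiple of a,
 * where a is a power of two. */
static uint32_t align_up(uint32_t x, uint32_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/* Host-side port of slice_coord() from the patch. version and micro_version
 * are parameters here; in the shader they come from the push constants. */
static uint32_t slice_coord(uint32_t width, uint32_t sx,
                            uint32_t num_h_slices, uint32_t chroma_shift,
                            int version, int micro_version)
{
    uint32_t mpw    = 1u << chroma_shift;
    uint32_t awidth = align_up(width, mpw);

    assert(num_h_slices > 0);

    /* Older bitstreams: plain proportional split. */
    if (version < 4 || (version == 4 && micro_version < 3))
        return width * sx / num_h_slices;

    /* Newer bitstreams: round to the nearest multiple of mpw,
     * then clamp the last boundary back to the real width. */
    sx = (2 * awidth * sx + num_h_slices * mpw) / (2 * num_h_slices * mpw) * mpw;
    if (sx == awidth)
        sx = width;

    return sx;
}
```

For example, with width 1919, four horizontal slices and chroma_shift 1, the first interior boundary is 480 for version 4.3 and 479 for version 3, and the final boundary is 1919 in both cases.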
Thread overview: 12+ messages
2025-02-24  8:04 [FFmpeg-devel] [PATCH v2 01/12] ffv1enc_vulkan: disable autodetection of async_depth Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 02/12] vulkan: add ff_vk_create_imageview Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 03/12] vulkan: copy host-mapping buffer code from hwcontext Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 04/12] vulkan_decode: support software-defined decoders Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 05/12] vulkan_decode: support multiple image views Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 06/12] hwcontext_vulkan: enable read/write without storage Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 07/12] vulkan: workaround BGR storage image undefined behaviour Lynne
2025-02-24  8:04 ` Lynne [this message]
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 09/12] vulkan: unify handling of BGR and simplify ffv1_rct Lynne
2025-02-24  8:04 ` [FFmpeg-devel] [PATCH v2 10/12] ffv1dec: add support for hwaccels Lynne
2025-02-24  8:05 ` [FFmpeg-devel] [PATCH v2 11/12] ffv1dec: reference the current packet into the main context Lynne
2025-02-24  8:05 ` [FFmpeg-devel] [PATCH v2 12/12] ffv1dec_vulkan: add a Vulkan compute-based hardware decoding implementation Lynne