From: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v6 14/14] vvcdec: add full vvc decoder Date: Sat, 9 Dec 2023 06:14:57 +0100 Message-ID: <DU0P250MB074757365CDF9D12E95F5D188F89A@DU0P250MB0747.EURP250.PROD.OUTLOOK.COM> (raw) In-Reply-To: <CAFXK13fKBgbSZ54HyqA0zHd6DdXEcu4+j6KtnXmQ04ueKhJLfw@mail.gmail.com> Nuo Mi: > Hi Andreas, > thank you for the review. > On Fri, Dec 8, 2023 at 8:17 PM Andreas Rheinhardt < > andreas.rheinhardt@outlook.com> wrote: > >> >>> + >>> +static int min_pu_arrays_init(VVCFrameContext *fc, const int >> pic_size_in_min_pu) >>> +{ >>> + if (fc->tab.pic_size_in_min_pu != pic_size_in_min_pu) { >>> + min_pu_arrays_free(fc); >>> + fc->tab.msf = av_mallocz(pic_size_in_min_pu); >>> + fc->tab.iaf = av_mallocz(pic_size_in_min_pu); >>> + fc->tab.mmi = av_mallocz(pic_size_in_min_pu); >>> + fc->tab.mvf = av_mallocz(pic_size_in_min_pu * >> sizeof(*fc->tab.mvf)); >> >> Do these have to be separate allocations? If there were allocated >> jointly, one memset below would suffice. >> > They are separate flags, if we combine them. We can't use memset to set > flags for a block. > I disagree: You would still be able to use different pointers for different parts of the large allocated block, it is just that you also save some unnecessary allocations (and frees and errors checks for the allocations) and also gain the ability to memset them via one memset call in case one wants to set them to the same value. >> >>> + >>> +static int init_slice_context(SliceContext *sc, VVCFrameContext *fc, >> const H2645NAL *nal, const CodedBitstreamUnit *unit) >>> +{ >>> + const VVCSH *sh = &sc->sh; >>> + const H266RawSlice *slice = (const H266RawSlice *)unit->content; >> >> Please no pointless casts. Also, why is there unnecessary whitespace in >> front of '='? >> > Fix here and serval other places > The whitespace will make all = in a col. > But there is nothing that needs that much whitespace. >>> + >>> +static av_cold int frame_context_init(VVCFrameContext *fc, >> AVCodecContext *avctx) >>> +{ >>> + >>> + fc->avctx = av_memdup(avctx, sizeof(*avctx)); >> >> When I read this, I presumed you are using multiple AVCodecContexts to >> store the ever changing state of the AVCodecContext fields similarly to >> update_context_from_thread() in pthread_frame.c. But it seems you don't. >> These contexts are only used as a) logcontexts (where the actual >> user-facing AVCodecContext should be used, so that the user can make >> sense of the logmessages!), b) in ff_thread_get_buffer() and c) in >> export_frame_params() where only some basic fields >> (dimension-related+pix_fmt) is set. Presumably c) is done for b). >> > I remember if i did not use a local AVCodecContext it would trigger some > assert when resolution changed. > Can you be more specific about what assert has been triggered? And have you set the AVFrame fields directly before ff_thread_get_buffer()? >> >> But the user is allowed to change the provided callbacks in the master >> context at any time. E.g. the call to ff_thread_get_buffer() in >> vvc_refs.c currently uses the VVCFrameContext and therefore uses the >> get_buffer2 callback in place now (during av_memdup()). This is wrong. >> > This will not happen. av_memdup only happens in vvc_decode_init. > Nobody will call ff_thread_get_buffer at this time > You missed the point: If the user changes the get_buffer2 callback after init, the new callback will not be used at all. >> >> I think you can just remove VVCFrameContext.avctx and use the >> user-facing AVCodecContext if you set the AVFrame properties that are >> normally derived from the AVCodecContext directly on the AVFrame before >> ff_thread_get_buffer(). > > Could you explain more about how to create a user-facing AVCodecContext? > You do not create a user-facing AVCodecContext, the user does (and calls avcodec_send_packet()/avcodec_receive_frame() with it). >> >>> + >>> +static int decode_nal_unit(VVCContext *s, VVCFrameContext *fc, const >> H2645NAL *nal, const CodedBitstreamUnit *unit) >>> +{ >>> + int ret; >>> + >>> + s->temporal_id = nal->temporal_id; >>> + >>> + switch (unit->type) { >>> + case VVC_VPS_NUT: >>> + case VVC_SPS_NUT: >>> + case VVC_PPS_NUT: >>> + /* vps, sps, sps cached by s->cbc */ >>> + break; >>> + case VVC_TRAIL_NUT: >>> + case VVC_STSA_NUT: >>> + case VVC_RADL_NUT: >>> + case VVC_RASL_NUT: >>> + case VVC_IDR_W_RADL: >>> + case VVC_IDR_N_LP: >>> + case VVC_CRA_NUT: >>> + case VVC_GDR_NUT: >>> + ret = decode_slice(s, fc, nal, unit); >>> + if (ret < 0) >>> + goto fail; >>> + break; >>> + case VVC_PREFIX_APS_NUT: >>> + case VVC_SUFFIX_APS_NUT: >>> + ret = ff_vvc_decode_aps(&s->ps, unit); >>> + if (ret < 0) >>> + goto fail; >>> + break; >>> + default: >>> + av_log(s->avctx, AV_LOG_INFO, >>> + "Skipping NAL unit %d\n", unit->type); >> >> This will probably be very noisy (and warn for every SEI). I don't think >> it is even needed, as h2645_parse.c already contains debug log messages >> to display the unit type. >> > It's copied from hevcdec. It means we did not handle the nal diffrent than > h2645_parser.c messages > 1. The message is unnecessary, because a user who wants to know which NAL units have been handled or not can get the info about which units are present from h2645_parse.c and then look up in this list whether this type is processed. 2. hevcdec.c does "handle" quite a lot more NAL units; e.g. it actually handles SEI messages and it ignores e.g. Access unit delimiters as well as HEVC_NAL_UNSPEC62. Whereas you do not. > A fail that is only "return ret" is pointless (not only here). >> > At someday if we need to add some cleanup code. we do not need to change > all returns to goto. > IMO a goto fail should be added if and when it is actually beneficial. >>> + >>> + if (output) { >>> + while (s->nb_delayed) { >>> + wait_delayed_frame(s, output, &got_output); >>> + if (got_output) { >>> + av_frame_unref(output); >>> + } >>> + } >>> + av_frame_free(&output); >>> + } >>> } >>> >>> static av_cold int vvc_decode_free(AVCodecContext *avctx) >>> { >>> + VVCContext *s = avctx->priv_data; >>> + int i; >>> + >>> + ff_cbs_fragment_free(&s->current_frame); >> >> Is it sure that the fragment is not in use (given that other threads may >> be running now before vvc_decode_flush())? >> > Do you mean the executor threads? If they want to use some data, they will > take their own hip. > see ff_refstruct_replace(&sps->r, rsps); > "hip"? I have now noticed that the SliceContexts contain a reference to the packet's data via VVCSH->H266RawSliceHeader, which in reality points to a H266RawSlice which contains a reference to the actual data, so this point is moot. But using H266RawSliceHeader* for a H266RawSlice* is not nice. >> >>> + >>> + s->nb_fcs = (avctx->flags & AV_CODEC_FLAG_LOW_DELAY) ? 1 : >> FFMIN(av_cpu_count(), VVC_MAX_FRMAE_DELAY); >> >> This may evaluate av_cpu_count() multiple times. Furthermore I don't >> know why this define is used here at all: With frame threading, the >> number of frame threads is not limited by the delay/number of reordering >> frames at all (we even have frame-threading for decoders without >> frame-reordering at all). >> > vvc_decode_frame only allows 1 frame in 1 frame out. We can remove the > delay if we switch to FFCodec->receive_frame, > I do not get how this is supposed to address my point. >> >> But worst of this is that you do not check avctx->thread_count at all. >> > we do not use avctx->thread_count. we use the executor to manage threads. > This is complete nonsense: It is the user who specifies how many threads to use, regardless of which mechanism is used to manage threads. >> >>> + s->fcs = av_calloc(s->nb_fcs, sizeof(*s->fcs)); >>> + if (!s->fcs) >>> + goto fail; >>> + >>> + for (int i = 0; i < s->nb_fcs; i++) { >>> + VVCFrameContext *fc = s->fcs + i; >>> + ret = frame_context_init(fc, avctx); >>> + if (ret < 0) >>> + goto fail; >>> + } >>> + >>> + s->executor = ff_vvc_executor_alloc(s, s->nb_fcs); >>> + if (!s->executor) >>> + goto fail; >>> + >>> + s->eos = 1; >>> + GDR_SET_RECOVERED(s); >>> + memset(&ff_vvc_default_scale_m, 16, sizeof(ff_vvc_default_scale_m)); >> >> This needs to be done once (i.e. protected by an AVOnce) and not every >> time a decoder is set up. Otherwise there might be data races. >> > It's not read and set, it will no data races:), I can change it to AVOnce > . This is wrong: It is set here and presumably it will be read somewhere, so there absolutely can be data races. If you believe that there is no data race because every memset sets it to the same value, then you should be aware that the C specification disagrees with you (all references are to the C11 spec): a) 5.1.2.4 25: "The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior." b) 5.1.2.4 4: "Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location." c) Note 2 in 3.1 (to the definition of "access"): "‘‘Modify’’ includes the case where the new value being stored is the same as the previous value." With the current code data races will happen if a) two different decoder instances are initialized without synchronisation (given that lavc does not serialize initialization of codecs (except in rare cases based upon a flag which this decoder does not set), this synchronization would have to be performed by the user, but we do not require our users to do this) or b) a decoder is initialized while another decoder runs and reads from ff_vvc_default_scale_m: Because the accesses performed by the initing thread are always modifications according to c), the accesses by the different threads conflict by definition b). memset() is not required to perform atomic modifications (and according to the standard atomic modifications can only happen with atomic objects, which ff_vvc_default_scale_m is not) and by our assumption there is no synchronisation between these actions, so it is a data race according to a). And data races are undefined behaviour. This clause allows compilers to optimize lots of code as if the program were single-threaded (because concurrent accesses were UB, so presumably they don't happen). In particular, speculative writes are legal (and happen sometimes, but probably not in memset). In fact, it would be legal for memset to always zero the memory it is supposed to set and then overwrite it with the final value. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2023-12-09 5:13 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <20231205144519.7334-1-nuomi2021@gmail.com> 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 02/14] vvcdec: add vvc_data Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 03/14] vvcdec: add parameter parser for sps, pps, ph, sh Nuo Mi 2023-12-08 12:26 ` Andreas Rheinhardt 2023-12-08 16:20 ` Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 04/14] vvcdec: add cabac decoder Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 05/14] vvcdec: add reference management Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 06/14] vvcdec: add motion vector decoder Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 07/14] vvcdec: add inter prediction Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 08/14] vvcdec: add inv transform 1d Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 09/14] vvcdec: add intra prediction Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 10/14] vvcdec: add LMCS, Deblocking, SAO, and ALF filters Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 11/14] vvcdec: add dsp init and inv transform Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 12/14] vvcdec: add CTU parser Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 13/14] vvcdec: add CTU thread logical Nuo Mi 2023-12-05 14:45 ` [FFmpeg-devel] [PATCH v6 14/14] vvcdec: add full vvc decoder Nuo Mi 2023-12-05 15:20 ` Nuo Mi 2023-12-08 12:19 ` Andreas Rheinhardt 2023-12-08 16:20 ` Nuo Mi 2023-12-09 5:14 ` Andreas Rheinhardt [this message] 2023-12-10 12:54 ` Nuo Mi
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=DU0P250MB074757365CDF9D12E95F5D188F89A@DU0P250MB0747.EURP250.PROD.OUTLOOK.COM \ --to=andreas.rheinhardt@outlook.com \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git