Date: Mon, 25 Jul 2022 21:44:20 +0200
From: Michael Niedermayer
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 14/18] avcodec/hevcdec: Don't allocate redundant HEVCContexts
Message-ID: <20220725194420.GW2088045@pb2>

On Sun, Jul 24, 2022 at 11:26:37PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> >
> > On Sat, Jul 23, 2022 at 11:42:23PM +0200, Andreas Rheinhardt wrote:
> >> Michael Niedermayer:
> >>> On Sat, Jul 23, 2022 at 07:44:40AM +0200, Andreas Rheinhardt wrote:
> >>>> Andreas Rheinhardt:
> >>>>> Michael Niedermayer:
> >>>>>> On Sat, Jul 02, 2022 at 08:32:06AM +0200, Andreas Rheinhardt wrote:
> >>>>>>> Michael Niedermayer:
> >>>>>>>> On Fri, Jul 01, 2022 at 12:29:45AM +0200, Andreas Rheinhardt wrote:
> >>>>>>>>> The HEVC decoder has both HEVCContext and HEVCLocalContext
> >>>>>>>>> structures. The latter is supposed to be the structure
> >>>>>>>>> containing the per-slicethread state.
> >>>>>>>>>
> >>>>>>>>> Yet up until now that is not how it is handled in practice:
> >>>>>>>>> Each HEVCLocalContext has a unique HEVCContext allocated for it
> >>>>>>>>> and each of these coincides except in exactly one field: The
> >>>>>>>>> corresponding HEVCLocalContext. This makes it possible to pass
> >>>>>>>>> the HEVCContext everywhere where logically a HEVCLocalContext
> >>>>>>>>> should be used. And up until recently, this is how it has been done.
> >>>>>>>>>
> >>>>>>>>> Yet the preceding patches changed this, making it possible
> >>>>>>>>> to avoid allocating redundant HEVCContexts.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Andreas Rheinhardt
> >>>>>>>>> ---
> >>>>>>>>>  libavcodec/hevcdec.c | 40 ++++++++++++++++------------------------
> >>>>>>>>>  libavcodec/hevcdec.h |  2 --
> >>>>>>>>>  2 files changed, 16 insertions(+), 26 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
> >>>>>>>>> index 9d1241f293..048fcc76b4 100644
> >>>>>>>>> --- a/libavcodec/hevcdec.c
> >>>>>>>>> +++ b/libavcodec/hevcdec.c
> >>>>>>>>> @@ -2548,13 +2548,12 @@ static int hls_decode_entry_wpp(AVCodecContext *avctxt, void *hevc_lclist,
> >>>>>>>>>  {
> >>>>>>>>>      HEVCLocalContext *lc = ((HEVCLocalContext**)hevc_lclist)[self_id];
> >>>>>>>>>      const HEVCContext *const s = lc->parent;
> >>>>>>>>> -    HEVCContext *s1 = avctxt->priv_data;
> >>>>>>>>> -    int ctb_size = 1<< s1->ps.sps->log2_ctb_size;
> >>>>>>>>> +    int ctb_size = 1 << s->ps.sps->log2_ctb_size;
> >>>>>>>>>      int more_data = 1;
> >>>>>>>>>      int ctb_row = job;
> >>>>>>>>> -    int ctb_addr_rs = s1->sh.slice_ctb_addr_rs + ctb_row * ((s1->ps.sps->width + ctb_size - 1) >> s1->ps.sps->log2_ctb_size);
> >>>>>>>>> -    int ctb_addr_ts = s1->ps.pps->ctb_addr_rs_to_ts[ctb_addr_rs];
> >>>>>>>>> -    int thread = ctb_row % s1->threads_number;
> >>>>>>>>> +    int ctb_addr_rs = s->sh.slice_ctb_addr_rs + ctb_row * ((s->ps.sps->width + ctb_size - 1) >> s->ps.sps->log2_ctb_size);
> >>>>>>>>> +    int ctb_addr_ts = s->ps.pps->ctb_addr_rs_to_ts[ctb_addr_rs];
> >>>>>>>>> +    int thread = ctb_row % s->threads_number;
> >>>>>>>>>      int ret;
> >>>>>>>>>
> >>>>>>>>>      if(ctb_row) {
> >>>>>>>>> @@ -2572,7 +2571,7 @@ static int hls_decode_entry_wpp(AVCodecContext *avctxt, void *hevc_lclist,
> >>>>>>>>>
> >>>>>>>>>          ff_thread_await_progress2(s->avctx, ctb_row, thread, SHIFT_CTB_WPP);
> >>>>>>>>>
> >>>>>>>>> -        if (atomic_load(&s1->wpp_err)) {
> >>>>>>>>> +        if (atomic_load(&s->wpp_err)) {
> >>>>>>>>>              ff_thread_report_progress2(s->avctx, ctb_row , thread, SHIFT_CTB_WPP);
> >>>>>>>>
> >>>>>>>> the consts in "const HEVCContext *const " make clang version 6.0.0-1ubuntu2 unhappy
> >>>>>>>> (this was building shared libs)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> CC libavcodec/hevcdec.o
> >>>>>>>> src/libavcodec/hevcdec.c:2574:13: error: address argument to atomic operation must be a pointer to non-const _Atomic type ('const atomic_int *' (aka 'const _Atomic(int) *') invalid)
> >>>>>>>>         if (atomic_load(&s->wpp_err)) {
> >>>>>>>>             ^           ~~~~~~~~~~~
> >>>>>>>> /usr/lib/llvm-6.0/lib/clang/6.0.0/include/stdatomic.h:134:29: note: expanded from macro 'atomic_load'
> >>>>>>>> #define atomic_load(object) __c11_atomic_load(object, __ATOMIC_SEQ_CST)
> >>>>>>>>                             ^                 ~~~~~~
> >>>>>>>> 1 error generated.
> >>>>>>>> src/ffbuild/common.mak:81: recipe for target 'libavcodec/hevcdec.o' failed
> >>>>>>>> make: *** [libavcodec/hevcdec.o] Error 1
> >>>>>>>>
> >>>>>>>> thx
> >>>>>>>>
> >>>>>>>
> >>>>>>> Thanks for testing this. atomic_load is indeed declared without const in
> >>>>>>> 7.17.7.2:
> >>>>>>>
> >>>>>>> C atomic_load(volatile A *object);
> >>>>>>>
> >>>>>>> Upon reflection this makes sense, because if atomics are implemented via
> >>>>>>> mutexes, even a read may involve a preceding write. So I'll cast const
> >>>>>>> away here, too, and add a comment. (It works when casting const away,
> >>>>>>> doesn't it?)
> >>>>>>
> >>>>>> This doesnt feel "right". These pointers should not be coming from a const
> >>>>>> if they are written to
> >>>>>>
> >>>>>
> >>>>> The HEVCContext is not const because the underlying object is const; the
> >>>>> HEVCContext is const when accessed from any part of the code that may be
> >>>>> run from slice threads, because if a slice thread modifies it, you have
> >>>>> a data race in case any of the other slice threads reads this field or
> >>>>> modifies it itself. But this is by definition not true for atomic
> >>>>> operations, so casting const away for them is fine.
> >>>>>
> >>>>>> The compiler accepts it with an explicit cast though. With an implicit cast
> >>>>>> it produces a warning
> >>>>>>
> >>>>
> >>>> Did the above explanation satisfy you? Or do you want something else?
> >>>
> >>> sure, ok
> >>>
> >>> [...]
> >>>
> >>
> >> Good to hear. This patchset (namely patch 11/18: "avcodec/hevcpred: Pass
> >> HEVCLocalContext when slice-threading") includes modifications to mips
> >> code that I created blindly. Can you please test it? Here is a branch of
> >> this rebased on top of current git master:
> >> https://github.com/mkver/FFmpeg/commits/hevc_wpp
> >> (Said branch actually contains a bit of further work which also modifies
> >> mips code (in particular,
> >> https://github.com/mkver/FFmpeg/commit/cf441e559b8d4bf2c05c29483ccf49e82fc6b863
> >> does so); you may also test this.)
> >
> > what exact tests do we need ?
> > simple fate ? any specific thread type count ?
> > also note i can only test qemu mips not real hw.
> > I stopped maintaining the MIPS hw and sofar noone volunteered to take its
> > maintaince over
> >
>
> Compiling (in particular being on the lookout for new warnings) and a
> simple fate should be enough.
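
For readers following the const discussion quoted above, here is a minimal
sketch of the pattern being debated. It is not the FFmpeg code: SharedCtx and
both helper functions are hypothetical stand-ins, and only the wpp_err and
threads_number field names are borrowed from the patch. It assumes the
underlying object was never defined const, so casting the qualifier away for
the atomic access is well-defined.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the shared decoder context; only the
 * atomic error flag matters for this sketch. */
typedef struct SharedCtx {
    int        threads_number; /* plain field: read-only for slice threads */
    atomic_int wpp_err;        /* atomic field: may be touched by any thread */
} SharedCtx;

/* Slice threads receive a pointer-to-const so that plain fields cannot be
 * modified by accident. C11 declares atomic_load() with a pointer to
 * non-const, so the qualifier is cast away for this single access. */
static int shared_ctx_has_error(const SharedCtx *s)
{
    assert(s);
    return atomic_load((atomic_int *)&s->wpp_err);
}

/* Write side of the same idea: an atomic store does not race with the
 * other slice threads, so it is the one modification allowed through
 * the const pointer. */
static void shared_ctx_set_error(const SharedCtx *s)
{
    assert(s);
    atomic_store((atomic_int *)&s->wpp_err, 1);
}
```

The explicit cast keeps the const pointer as the default view for slice
threads and makes every exception visible at the call site, which appears to
be the trade-off argued for in the quoted explanation.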

I see this: (NOT sanity checked and I also had some other patches from today applied)
Probably these are all unrelated nonsense

--- fatelist-oldw 2022-07-25 21:19:24.555885524 +0200
+++ fatelistw 2022-07-25 21:19:30.715956081 +0200
@@ -1,3 +1,5 @@
+src/tests/checkasm/synth_filter.c:66:42: warning: unused variable ‘offset_b’ [-Wunused-variable]
+src/tests/checkasm/sbrdsp.c:180:14: warning: unused variable ‘bw’ [-Wunused-variable]
 src/libavfilter/dnn/dnn_backend_common.c:94:11: warning: unused variable ‘status’ [-Wunused-variable]
 src/libavfilter/dnn/dnn_backend_common.c:114:11: warning: unused variable ‘status’ [-Wunused-variable]
 src/libavfilter/dnn/dnn_backend_common.c:80:14: warning: ‘async_thread_routine’ defined but not used [-Wunused-function]
@@ -7,20 +9,20 @@
 src/libavformat/dashenc.c:1960:63: warning: ‘%s’ directive output may be truncated writing up to 1023 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
 src/libavformat/dashenc.c:492:49: warning: ‘media_’ directive output may be truncated writing 6 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
 src/libavformat/dashenc.c:2257:59: warning: ‘%s’ directive output may be truncated writing up to 1023 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
-src/libavformat/matroskaenc.c:3074:58: warning: ‘%012.9f’ directive output may be truncated writing between 12 and 320 bytes into a region of size between 8 and 14 [-Wformat-truncation=]
 src/libavformat/mlvdec.c:361:63: warning: ‘__builtin___snprintf_chk’ output may be truncated before the last format character [-Wformat-truncation=]
-src/libavformat/movenc.c:1126:8: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false [-Wstrict-overflow]
+src/libavformat/matroskaenc.c:3074:58: warning: ‘%012.9f’ directive output may be truncated writing between 12 and 320 bytes into a region of size between 8 and 14 [-Wformat-truncation=]
 src/libavformat/smoothstreamingenc.c:509:49: warning: ‘/temp’ directive output may be truncated writing 5 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
 src/libavformat/smoothstreamingenc.c:544:63: warning: ‘/FragmentInfo(’ directive output may be truncated writing 14 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
 src/libavformat/smoothstreamingenc.c:545:63: warning: ‘/Fragments(’ directive output may be truncated writing 11 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
 src/libavformat/smoothstreamingenc.c:537:53: warning: ‘/temp’ directive output may be truncated writing 5 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
 src/libavformat/vorbiscomment.c:103:63: warning: ‘%03d’ directive output may be truncated writing between 3 and 10 bytes into a region of size 4 [-Wformat-truncation=]
 src/libavformat/vorbiscomment.c:104:69: warning: ‘%02d’ directive output may be truncated writing between 2 and 3 bytes into a region of size between 1 and 7 [-Wformat-truncation=]
+src/libavformat/movenc.c:1126:8: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false [-Wstrict-overflow]
 src/libavcodec/mips/aacsbr_mips.h:62:13: warning: ‘sbr_qmf_analysis_mips’ defined but not used [-Wunused-function]
 src/libavcodec/dvenc.c:786:81: warning: array subscript is above array bounds [-Warray-bounds]
 src/libavcodec/dvenc.c:786:81: warning: array subscript is above array bounds [-Warray-bounds]
 src/libavcodec/ffv1dec.c:999:13: warning: ‘copy_fields’ defined but not used [-Wunused-function]
-src/libavcodec/hevcdec.c:3540:12: warning: ‘hevc_ref_frame’ defined but not used [-Wunused-function]
+src/libavcodec/hevcdec.c:3529:12: warning: ‘hevc_ref_frame’ defined but not used [-Wunused-function]
 src/libavcodec/mobiclip.c:471:24: warning: array subscript is above array bounds [-Warray-bounds]
 src/libavcodec/mobiclip.c:475:16: warning: array subscript is above array bounds [-Warray-bounds]
 src/libavcodec/mobiclip.c:471:24: warning: array subscript is above array bounds [-Warray-bounds]
@@ -41,11 +43,11 @@
 src/libavcodec/pcm-bluray.c:190:49: warning: passing argument 2 of ‘bytestream2_get_buffer’ from incompatible pointer type [-Wincompatible-pointer-types]
 src/libavcodec/pcm-dvd.c:160:37: warning: passing argument 2 of ‘bytestream2_get_buffer’ from incompatible pointer type [-Wincompatible-pointer-types]
 src/libavcodec/qdm2.c:1002:47: warning: array subscript is above array bounds [-Warray-bounds]
-src/libavcodec/vp8.c:2373:30: warning: variable ‘next_td’ set but not used [-Wunused-but-set-variable]
-src/libavcodec/vp8.c:2586:37: warning: unused variable ‘prev_td’ [-Wunused-variable]
-src/libavcodec/vp8.c:2586:20: warning: unused variable ‘next_td’ [-Wunused-variable]
-src/libavcodec/vp8.c:109:12: warning: ‘vp8_ref_frame’ defined but not used [-Wunused-function]
-src/libavcodec/vp9.c:1795:9: warning: unused variable ‘ret’ [-Wunused-variable]
 src/libavutil/timecode.c:123:60: warning: ‘%0*d’ directive output may be truncated writing between 1 and 10 bytes into a region of size between 2 and 14 [-Wformat-truncation=]
+src/libavcodec/vp9.c:1795:9: warning: unused variable ‘ret’ [-Wunused-variable]
 src/fftools/ffprobe.c:333:11: warning: unused variable ‘new_log_buffer’ [-Wunused-variable]
 src/fftools/ffprobe.c:329:14: warning: unused variable ‘avc’ [-Wunused-variable]
+src/libavcodec/vp8.c:2373:30: warning: variable ‘next_td’ set but not used [-Wunused-but-set-variable]
+src/libavcodec/vp8.c:2587:37: warning: unused variable ‘prev_td’ [-Wunused-variable]
+src/libavcodec/vp8.c:2587:20: warning: unused variable ‘next_td’ [-Wunused-variable]
+src/libavcodec/vp8.c:109:12: warning: ‘vp8_ref_frame’ defined but not used [-Wunused-function]

There is also the fate-filter-metadata-signalstats-yuv420p10 test
which fails before and afterwards

-pts=0|tag:lavfi.signalstats.UBITDEPTH=2|tag:lavfi.signalstats.YMIN=943|tag:lavfi.signalstats.YLOW=943|tag:lavfi.signalstats.YAVG=943|tag:lavfi.signalstats.YHIGH=943|tag:lavfi.signalstats.YMAX=943|tag:lavfi.signalstats.UMIN=514|tag:lavfi.signalstats.ULOW=514|tag:lavfi.signalstats.UAVG=514|tag:lavfi.signalstats.UHIGH=514|tag:lavfi.signalstats.UMAX=514|tag:lavfi.signalstats.VMIN=514|tag:lavfi.signalstats.VLOW=514|tag:lavfi.signalstats.VAVG=514|tag:lavfi.signalstats.VHIGH=514|tag:lavfi.signalstats.VMAX=514|tag:lavfi.signalstats.SATMIN=2|tag:lavfi.signalstats.SATLOW=2|tag:lavfi.signalstats.SATAVG=2|tag:lavfi.signalstats.SATHIGH=2|tag:lavfi.signalstats.SATMAX=2|tag:lavfi.signalstats.HUEMED=225|tag:lavfi.signalstats.HUEAVG=225|tag:lavfi.signalstats.YDIF=0|tag:lavfi.signalstats.UDIF=0|tag:lavfi.signalstats.VDIF=0|tag:lavfi.signalstats.YBITDEPTH=8|tag:lavfi.signalstats.VBITDEPTH=2
+pts=0|tag:lavfi.signalstats.UBITDEPTH=1|tag:lavfi.signalstats.YMIN=940|tag:lavfi.signalstats.YLOW=940|tag:lavfi.signalstats.YAVG=940|tag:lavfi.signalstats.YHIGH=940|tag:lavfi.signalstats.YMAX=940|tag:lavfi.signalstats.UMIN=512|tag:lavfi.signalstats.ULOW=512|tag:lavfi.signalstats.UAVG=512|tag:lavfi.signalstats.UHIGH=512|tag:lavfi.signalstats.UMAX=512|tag:lavfi.signalstats.VMIN=512|tag:lavfi.signalstats.VLOW=512|tag:lavfi.signalstats.VAVG=512|tag:lavfi.signalstats.VHIGH=512|tag:lavfi.signalstats.VMAX=512|tag:lavfi.signalstats.SATMIN=0|tag:lavfi.signalstats.SATLOW=0|tag:lavfi.signalstats.SATAVG=0|tag:lavfi.signalstats.SATHIGH=0|tag:lavfi.signalstats.SATMAX=0|tag:lavfi.signalstats.HUEMED=180|tag:lavfi.signalstats.HUEAVG=180|tag:lavfi.signalstats.YDIF=0|tag:lavfi.signalstats.UDIF=0|tag:lavfi.signalstats.VDIF=0|tag:lavfi.signalstats.YBITDEPTH=6|tag:lavfi.signalstats.VBITDEPTH=1
Test filter-metadata-signalstats-yuv420p10 failed. Look at tests/data/fate/filter-metadata-signalstats-yuv420p10.err for details.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you drop bombs on a foreign country and kill a hundred thousand
innocent people, expect your government to call the consequence
"unprovoked inhuman terrorist attacks" and use it to justify dropping
more bombs and killing more people. The technology changed, the idea is old.