From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTPS id 54FA34C7AA for ; Sat, 8 Feb 2025 03:50:35 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id AA95B68B98F; Sat, 8 Feb 2025 05:50:30 +0200 (EET) Received: from xmbghk7.mail.qq.com (xmbghk7.mail.qq.com [43.163.128.44]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 42AED68B89A for ; Sat, 8 Feb 2025 05:50:19 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1738986614; bh=82KBLNN1TsSBHY+NyIsykxArlAfvMXtSyarL/jaQJIg=; h=From:Subject:Date:References:To:In-Reply-To; b=oKlMAeksuHY82cA4So8p7/momH7tM7vz/0CpjvjbX/U88t0DzVu86M8KojWrRFzv9 zRwsD69uXxdSeEr1b1xCplIz+cUchZBNTkebgTQ/hAcXBW/u4xcQqQ+flc+RczNJ20 4iPkjtGk9TPEyY/IG05ec9PuvriIo3liqYANczZg= Received: from smtpclient.apple ([119.147.10.242]) by newxmesmtplogicsvrszb16-1.qq.com (NewEsmtp) with SMTP id C8DB2A6D; Sat, 08 Feb 2025 11:50:13 +0800 X-QQ-mid: xmsmtpt1738986613to47va74a Message-ID: X-QQ-XMAILINFO: OGZxhFXqN7PJVL+bS6EiismBD/NixggkS3FgcSx2Di4iFWIiqXuDDlClgWCofS fyLBEPoSkNfttvCni9tT8poSx5NK2LjEnvhS1EwCCKatL+BQKnA1kXFO6cFpiiKPzcNRDR/CRxsR m1A/xARzVAnolEycK4/D2LvmQZ/8HiqARRMa6uopTSUUB+J2GdPvuGI3xSbjBYt77u9E1EjhVPrh A1r0gSCCpDd+i0XRizhNeVLACE3qGI1gU1q5xOvpaLrWwaZ1sGiTOiGj3ddzVqrUxArETEY5JaV8 QB2+dU+A9al/yFCDf+CwWrNgTja7cMQ1gY0/nH5DZsO00jugDFFiE45d8kpEEsT+dlNhVpVLMXe6 5wpiNPJgdX9PaWvN/kYDm2QfL0svvJw2FisUIMZ62lQFHBSw7Ab94mNwqT9PvPMPrXSf+tACsJj9 wc/563hnWvy6XaPs9HZGzclxCYpaNanqAF4Ce5DrPm9yCDkfTWnmITY+9mJPYDCfrFCFVKun+Ahl 6vJRHy14rslxm4+CsqmvLQT8RHWqbjHafPsv5Xu1s6tXkpgvx+GrMOB5+D15IrMrc57WqIGCI3Gc 68opezNTCWsmW3BT+DUkqrRy+U571h78xmqgBGpZrPmT2aX0Pp9OovFzORlwZvVBlt3uMBk9pZ/q KfrVzpjDTf6PPM5UFW3eK4cJc2e3IvhHgn3bIRuzgJEQXqA4vCHodxB4ip5MOGdu3gb2LmPV9BXu 1mT+pDSVOim0sjFepqv6/aCw6BuNpZ9C7r8Vx5EcL0R+9XTFdHcu3igY4cj/fGS7OgbIDUL4lv+C kfddgkVM3mRVL6o/UtRbFSnoc6PoHR4ZT0XLibw23HCEH+bP3Rw0Rk87/pVcwZAVvK7JkwdQV6GG 0Q09kZJ6w+bJtNyNiBNXozsqDJ8/1XgZYxqgvHMnCoDxuy/0SqT2dfM7x2QtBJULw6vu1SwNtgyV WZ2A2PtTcNQEdny2IPW7xy4/EKbcThYeDcRbJzsVHGie3XhX2F2eN/wVlpNeioSWEFyyTBu5s= X-QQ-XMRINFO: OD9hHCdaPRBwq3WW+NvGbIU= From: Zhao Zhili Content-Type: multipart/mixed; boundary="Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3818.100.11.1.3\)) Date: Sat, 8 Feb 2025 11:50:03 +0800 References: <20250206214257.37158-1-rsbultje@gmail.com> To: FFmpeg development discussions and patches In-Reply-To: X-OQ-MSGID: <2BB15410-E16C-4BD0-9D84-76833E90CEFA@foxmail.com> X-Mailer: Apple Mail (2.3818.100.11.1.3) Subject: Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Archived-At: List-Archive: List-Post: --Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Feb 8, 2025, at 00:05, Ronald S. Bultje wrote: >=20 > Hi, >=20 > On Fri, Feb 7, 2025 at 8:44=E2=80=AFAM Zhao Zhili < > quinkblack-at-foxmail.com@ffmpeg.org> wrote: >=20 >>> On Feb 7, 2025, at 21:26, Ronald S. Bultje = wrote: >>> On Fri, Feb 7, 2025 at 6:22=E2=80=AFAM Andreas Rheinhardt < >>> andreas.rheinhardt@outlook.com> wrote: >>>> Ronald S. Bultje: >>>>> Fixes #11456. >>>>> --- >>>>> libavcodec/threadprogress.c | 3 +-- >>>>> 1 file changed, 1 insertion(+), 2 deletions(-) >>>>>=20 >>>>> diff --git a/libavcodec/threadprogress.c = b/libavcodec/threadprogress.c >>>>> index 62c4fd898b..aa72ff80e7 100644 >>>>> --- a/libavcodec/threadprogress.c >>>>> +++ b/libavcodec/threadprogress.c >>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress = *pro, >>>> int n) >>>>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) = >=3D >> n) >>>>> return; >>>>>=20 >>>>> - atomic_store_explicit(&pro->progress, n, = memory_order_release); >>>>> - >>>>> ff_mutex_lock(&pro->progress_mutex); >>>>> + atomic_store_explicit(&pro->progress, n, = memory_order_release); >>>>> ff_cond_broadcast(&pro->progress_cond); >>>>> ff_mutex_unlock(&pro->progress_mutex); >>>>> } >>>>=20 >>>> I don't really understand why this is supposed to fix a race; after = all, >>>> the synchronisation of ff_thread_progress_(report|await) is not = supposed >>>> to be provided by the mutex (which is avoided altogether in the = fast >>>> path in ff_thread_report_await()), but by storing and loading the >>>> progress variable. >>>> That's also the reason why I moved this outside of the mutex = (compared >>>> to ff_thread_report_progress(). (This way it is possible for a = consumer >>>> thread to see the new progress value earlier and possibly avoid the >>>> mutex altogether.) >>>=20 >>>=20 >>> The consumer thread already checks the value without the lock. so = the >>> significance of that last point seems minor to me. This would be >> different >>> if the wait() counterpart had no lockless path. Or am I missing >> something? >>=20 >> What Andreas says is atomic_store before mutex_lock makes the first >> atomic_load in progress_wait has a higher chance to succeed. The = earlier >> progress is set, the higher chance of progress_wait go into the fast = path. >=20 >=20 > I understand that is true in theory - but I have doubts on whether = this is > in any way significant in practice if wait() already has behaviour to > pre-empty locklessly >=20 > I measured this in the most un-scientific way possible by decoding > gizmo.webm (from Firefox' bug report) 10x before and after my patch, = taking > the average and standard deviation, and comparing these with each = other. I > repeated this a couple of times. The values (before vs after avg +/- > stddev) are obviously never exactly the same, but they swarm around = each > other like a random noise generator. Or to say it differently: in my = highly > unscientific test, I see no performance difference. >=20 > So ... Is this really worth it? I did another test by measure fast_path / (fast_path + slow_path) on = macOS of hevc decoding with 10 threads. 1. Before the patch, it=E2=80=99s 99.741%. 2. With the patch, it=E2=80=99s 99.743%. 3. With while (atomic_load_explicit(&pro->progress, = memory_order_acquire) < n), it=E2=80=99s 99.741%. So, it doesn=E2=80=99t matter for performance. Current patch is the most = elegant solution in my opinion. --Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616 Content-Disposition: attachment; filename=threadprogress.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="threadprogress.patch" Content-Transfer-Encoding: 7bit diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c index 62c4fd898b..dcea99cb4c 100644 --- a/libavcodec/threadprogress.c +++ b/libavcodec/threadprogress.c @@ -33,6 +33,8 @@ DEFINE_OFFSET_ARRAY(ThreadProgress, thread_progress, init, av_cold int ff_thread_progress_init(ThreadProgress *pro, int init_mode) { atomic_init(&pro->progress, init_mode ? -1 : INT_MAX); + atomic_init(&pro->fast_path, 0); + atomic_init(&pro->slow_path, 0); #if HAVE_THREADS if (init_mode) return ff_pthread_init(pro, thread_progress_offsets); @@ -48,6 +50,10 @@ av_cold void ff_thread_progress_destroy(ThreadProgress *pro) #else pro->init = 0; #endif + int fast = atomic_load_explicit(&pro->fast_path, memory_order_relaxed); + int slow = atomic_load_explicit(&pro->slow_path, memory_order_relaxed); + av_log(NULL, AV_LOG_ERROR, "fast path %d, slow path %d, fast rate %f\n", + fast, slow, (float)fast/(slow + fast)); } void ff_thread_progress_report(ThreadProgress *pro, int n) @@ -69,11 +75,14 @@ void ff_thread_progress_await(const ThreadProgress *pro_c, int n) * as it had at the beginning. */ ThreadProgress *pro = (ThreadProgress*)pro_c; - if (atomic_load_explicit(&pro->progress, memory_order_acquire) >= n) + if (atomic_load_explicit(&pro->progress, memory_order_acquire) >= n) { + atomic_fetch_add_explicit(&pro->fast_path, 1, memory_order_release); return; + } ff_mutex_lock(&pro->progress_mutex); while (atomic_load_explicit(&pro->progress, memory_order_relaxed) < n) ff_cond_wait(&pro->progress_cond, &pro->progress_mutex); ff_mutex_unlock(&pro->progress_mutex); + atomic_fetch_add_explicit(&pro->slow_path, 1, memory_order_release); } diff --git a/libavcodec/threadprogress.h b/libavcodec/threadprogress.h index cc3414c2ce..6799ad6cb4 100644 --- a/libavcodec/threadprogress.h +++ b/libavcodec/threadprogress.h @@ -42,6 +42,8 @@ */ typedef struct ThreadProgress { atomic_int progress; + atomic_int fast_path; + atomic_int slow_path; unsigned init; AVMutex progress_mutex; AVCond progress_cond; --Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii > > Ronald > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616--