From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTPS id 54FA34C7AA
	for <ffmpegdev@gitmailbox.com>; Sat,  8 Feb 2025 03:50:35 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id AA95B68B98F;
	Sat,  8 Feb 2025 05:50:30 +0200 (EET)
Received: from xmbghk7.mail.qq.com (xmbghk7.mail.qq.com [43.163.128.44])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 42AED68B89A
 for <ffmpeg-devel@ffmpeg.org>; Sat,  8 Feb 2025 05:50:19 +0200 (EET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com;
 s=s201512; t=1738986614;
 bh=82KBLNN1TsSBHY+NyIsykxArlAfvMXtSyarL/jaQJIg=;
 h=From:Subject:Date:References:To:In-Reply-To;
 b=oKlMAeksuHY82cA4So8p7/momH7tM7vz/0CpjvjbX/U88t0DzVu86M8KojWrRFzv9
 zRwsD69uXxdSeEr1b1xCplIz+cUchZBNTkebgTQ/hAcXBW/u4xcQqQ+flc+RczNJ20
 4iPkjtGk9TPEyY/IG05ec9PuvriIo3liqYANczZg=
Received: from smtpclient.apple ([119.147.10.242])
 by newxmesmtplogicsvrszb16-1.qq.com (NewEsmtp) with SMTP
 id C8DB2A6D; Sat, 08 Feb 2025 11:50:13 +0800
X-QQ-mid: xmsmtpt1738986613to47va74a
Message-ID: <tencent_504F3FB7566F9C2AE0125BC8357B80C03509@qq.com>
X-QQ-XMAILINFO: OGZxhFXqN7PJVL+bS6EiismBD/NixggkS3FgcSx2Di4iFWIiqXuDDlClgWCofS
 fyLBEPoSkNfttvCni9tT8poSx5NK2LjEnvhS1EwCCKatL+BQKnA1kXFO6cFpiiKPzcNRDR/CRxsR
 m1A/xARzVAnolEycK4/D2LvmQZ/8HiqARRMa6uopTSUUB+J2GdPvuGI3xSbjBYt77u9E1EjhVPrh
 A1r0gSCCpDd+i0XRizhNeVLACE3qGI1gU1q5xOvpaLrWwaZ1sGiTOiGj3ddzVqrUxArETEY5JaV8
 QB2+dU+A9al/yFCDf+CwWrNgTja7cMQ1gY0/nH5DZsO00jugDFFiE45d8kpEEsT+dlNhVpVLMXe6
 5wpiNPJgdX9PaWvN/kYDm2QfL0svvJw2FisUIMZ62lQFHBSw7Ab94mNwqT9PvPMPrXSf+tACsJj9
 wc/563hnWvy6XaPs9HZGzclxCYpaNanqAF4Ce5DrPm9yCDkfTWnmITY+9mJPYDCfrFCFVKun+Ahl
 6vJRHy14rslxm4+CsqmvLQT8RHWqbjHafPsv5Xu1s6tXkpgvx+GrMOB5+D15IrMrc57WqIGCI3Gc
 68opezNTCWsmW3BT+DUkqrRy+U571h78xmqgBGpZrPmT2aX0Pp9OovFzORlwZvVBlt3uMBk9pZ/q
 KfrVzpjDTf6PPM5UFW3eK4cJc2e3IvhHgn3bIRuzgJEQXqA4vCHodxB4ip5MOGdu3gb2LmPV9BXu
 1mT+pDSVOim0sjFepqv6/aCw6BuNpZ9C7r8Vx5EcL0R+9XTFdHcu3igY4cj/fGS7OgbIDUL4lv+C
 kfddgkVM3mRVL6o/UtRbFSnoc6PoHR4ZT0XLibw23HCEH+bP3Rw0Rk87/pVcwZAVvK7JkwdQV6GG
 0Q09kZJ6w+bJtNyNiBNXozsqDJ8/1XgZYxqgvHMnCoDxuy/0SqT2dfM7x2QtBJULw6vu1SwNtgyV
 WZ2A2PtTcNQEdny2IPW7xy4/EKbcThYeDcRbJzsVHGie3XhX2F2eN/wVlpNeioSWEFyyTBu5s=
X-QQ-XMRINFO: OD9hHCdaPRBwq3WW+NvGbIU=
From: Zhao Zhili <quinkblack-at-foxmail.com@ffmpeg.org>
Content-Type: multipart/mixed;
 boundary="Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3818.100.11.1.3\))
Date: Sat, 8 Feb 2025 11:50:03 +0800
References: <20250206214257.37158-1-rsbultje@gmail.com>
 <AS8P250MB0744018A4EB21C01AD77467C8FF12@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM>
 <CAEEMt2kvkHcj3Lke8uioMRiNCb2nNZmo9o37=MOBPWhJZ=3nPA@mail.gmail.com>
 <tencent_D2337475C3491C2EAD65489307A85DF3E205@qq.com>
 <CAEEMt2mnjdYzh=cpCiG6_rOuOYE2ZzQS_W-8fi1WsCDBs67YRg@mail.gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
In-Reply-To: <CAEEMt2mnjdYzh=cpCiG6_rOuOYE2ZzQS_W-8fi1WsCDBs67YRg@mail.gmail.com>
X-OQ-MSGID: <2BB15410-E16C-4BD0-9D84-76833E90CEFA@foxmail.com>
X-Mailer: Apple Mail (2.3818.100.11.1.3)
Subject: Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to
 silence tsan warning.
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/tencent_504F3FB7566F9C2AE0125BC8357B80C03509@qq.com/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>


--Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8


> On Feb 8, 2025, at 00:05, Ronald S. Bultje <rsbultje@gmail.com> wrote:
>=20
> Hi,
>=20
> On Fri, Feb 7, 2025 at 8:44=E2=80=AFAM Zhao Zhili <
> quinkblack-at-foxmail.com@ffmpeg.org> wrote:
>=20
>>> On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> =
wrote:
>>> On Fri, Feb 7, 2025 at 6:22=E2=80=AFAM Andreas Rheinhardt <
>>> andreas.rheinhardt@outlook.com> wrote:
>>>> Ronald S. Bultje:
>>>>> Fixes #11456.
>>>>> ---
>>>>> libavcodec/threadprogress.c | 3 +--
>>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>>>=20
>>>>> diff --git a/libavcodec/threadprogress.c =
b/libavcodec/threadprogress.c
>>>>> index 62c4fd898b..aa72ff80e7 100644
>>>>> --- a/libavcodec/threadprogress.c
>>>>> +++ b/libavcodec/threadprogress.c
>>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress =
*pro,
>>>> int n)
>>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) =
>=3D
>> n)
>>>>>        return;
>>>>>=20
>>>>> -    atomic_store_explicit(&pro->progress, n, =
memory_order_release);
>>>>> -
>>>>>    ff_mutex_lock(&pro->progress_mutex);
>>>>> +    atomic_store_explicit(&pro->progress, n, =
memory_order_release);
>>>>>    ff_cond_broadcast(&pro->progress_cond);
>>>>>    ff_mutex_unlock(&pro->progress_mutex);
>>>>> }
>>>>=20
>>>> I don't really understand why this is supposed to fix a race; after =
all,
>>>> the synchronisation of ff_thread_progress_(report|await) is not =
supposed
>>>> to be provided by the mutex (which is avoided altogether in the =
fast
>>>> path in ff_thread_report_await()), but by storing and loading the
>>>> progress variable.
>>>> That's also the reason why I moved this outside of the mutex =
(compared
>>>> to ff_thread_report_progress(). (This way it is possible for a =
consumer
>>>> thread to see the new progress value earlier and possibly avoid the
>>>> mutex altogether.)
>>>=20
>>>=20
>>> The consumer thread already checks the value without the lock. so =
the
>>> significance of that last point seems minor to me. This would be
>> different
>>> if the wait() counterpart had no lockless path. Or am I missing
>> something?
>>=20
>> What Andreas says is atomic_store before mutex_lock makes the first
>> atomic_load in progress_wait has a higher chance to succeed. The =
earlier
>> progress is set, the higher chance of progress_wait go into the fast =
path.
>=20
>=20
> I understand that is true in theory - but I have doubts on whether =
this is
> in any way significant in practice if wait() already has behaviour to
> pre-empty locklessly
>=20
> I measured this in the most un-scientific way possible by decoding
> gizmo.webm (from Firefox' bug report) 10x before and after my patch, =
taking
> the average and standard deviation, and comparing these with each =
other. I
> repeated this a couple of times. The values (before vs after avg +/-
> stddev) are obviously never exactly the same, but they swarm around =
each
> other like a random noise generator. Or to say it differently: in my =
highly
> unscientific test, I see no performance difference.
>=20
> So ... Is this really worth it?

I did another test by measure fast_path / (fast_path + slow_path) on =
macOS of hevc decoding with 10 threads.

1. Before the patch, it=E2=80=99s 99.741%.
2. With the patch, it=E2=80=99s 99.743%.
3. With while (atomic_load_explicit(&pro->progress, =
memory_order_acquire) < n), it=E2=80=99s 99.741%.

So, it doesn=E2=80=99t matter for performance. Current patch is the most =
elegant solution in my opinion.


--Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616
Content-Disposition: attachment;
	filename=threadprogress.patch
Content-Type: application/octet-stream;
	x-unix-mode=0644;
	name="threadprogress.patch"
Content-Transfer-Encoding: 7bit

diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
index 62c4fd898b..dcea99cb4c 100644
--- a/libavcodec/threadprogress.c
+++ b/libavcodec/threadprogress.c
@@ -33,6 +33,8 @@ DEFINE_OFFSET_ARRAY(ThreadProgress, thread_progress, init,
 av_cold int ff_thread_progress_init(ThreadProgress *pro, int init_mode)
 {
     atomic_init(&pro->progress, init_mode ? -1 : INT_MAX);
+    atomic_init(&pro->fast_path, 0);
+    atomic_init(&pro->slow_path, 0);
 #if HAVE_THREADS
     if (init_mode)
         return ff_pthread_init(pro, thread_progress_offsets);
@@ -48,6 +50,10 @@ av_cold void ff_thread_progress_destroy(ThreadProgress *pro)
 #else
     pro->init = 0;
 #endif
+    int fast = atomic_load_explicit(&pro->fast_path, memory_order_relaxed);
+    int slow = atomic_load_explicit(&pro->slow_path, memory_order_relaxed);
+    av_log(NULL, AV_LOG_ERROR, "fast path %d, slow path %d, fast rate %f\n",
+           fast, slow, (float)fast/(slow + fast));
 }
 
 void ff_thread_progress_report(ThreadProgress *pro, int n)
@@ -69,11 +75,14 @@ void ff_thread_progress_await(const ThreadProgress *pro_c, int n)
      * as it had at the beginning. */
     ThreadProgress *pro = (ThreadProgress*)pro_c;
 
-    if (atomic_load_explicit(&pro->progress, memory_order_acquire) >= n)
+    if (atomic_load_explicit(&pro->progress, memory_order_acquire) >= n) {
+        atomic_fetch_add_explicit(&pro->fast_path, 1, memory_order_release);
         return;
+    }
 
     ff_mutex_lock(&pro->progress_mutex);
     while (atomic_load_explicit(&pro->progress, memory_order_relaxed) < n)
         ff_cond_wait(&pro->progress_cond, &pro->progress_mutex);
     ff_mutex_unlock(&pro->progress_mutex);
+    atomic_fetch_add_explicit(&pro->slow_path, 1, memory_order_release);
 }
diff --git a/libavcodec/threadprogress.h b/libavcodec/threadprogress.h
index cc3414c2ce..6799ad6cb4 100644
--- a/libavcodec/threadprogress.h
+++ b/libavcodec/threadprogress.h
@@ -42,6 +42,8 @@
  */
 typedef struct ThreadProgress {
     atomic_int progress;
+    atomic_int fast_path;
+    atomic_int slow_path;
     unsigned   init;
     AVMutex progress_mutex;
     AVCond  progress_cond;

--Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
	charset=us-ascii


> 
> Ronald
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


--Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

--Apple-Mail=_FEF7B783-A79E-47B1-A1CC-6FD39EE8D616--