[FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed

* [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
@ 2025-02-06 21:42 Ronald S. Bultje
  2025-02-07  5:13 ` Zhao Zhili
  2025-02-07 11:22 ` Andreas Rheinhardt
  0 siblings, 2 replies; 15+ messages in thread
From: Ronald S. Bultje @ 2025-02-06 21:42 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Ronald S. Bultje

Fixes #11456.
---
 libavcodec/threadprogress.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
index 62c4fd898b..aa72ff80e7 100644
--- a/libavcodec/threadprogress.c
+++ b/libavcodec/threadprogress.c
@@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
         return;
 
-    atomic_store_explicit(&pro->progress, n, memory_order_release);
-
     ff_mutex_lock(&pro->progress_mutex);
+    atomic_store_explicit(&pro->progress, n, memory_order_release);
     ff_cond_broadcast(&pro->progress_cond);
     ff_mutex_unlock(&pro->progress_mutex);
 }
-- 
2.48.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-06 21:42 [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning Ronald S. Bultje
@ 2025-02-07  5:13 ` Zhao Zhili
  2025-02-07 11:22 ` Andreas Rheinhardt
  1 sibling, 0 replies; 15+ messages in thread
From: Zhao Zhili @ 2025-02-07  5:13 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



> On Feb 7, 2025, at 05:42, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> 
> Fixes #11456.
> ---
> libavcodec/threadprogress.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
> index 62c4fd898b..aa72ff80e7 100644
> --- a/libavcodec/threadprogress.c
> +++ b/libavcodec/threadprogress.c
> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>         return;
> 
> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
> -
>     ff_mutex_lock(&pro->progress_mutex);
> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>     ff_cond_broadcast(&pro->progress_cond);
>     ff_mutex_unlock(&pro->progress_mutex);
> }
> -- 
> 2.48.0

LGTM.

> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-06 21:42 [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning Ronald S. Bultje
  2025-02-07  5:13 ` Zhao Zhili
@ 2025-02-07 11:22 ` Andreas Rheinhardt
  2025-02-07 11:38   ` Zhao Zhili
                     ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Andreas Rheinhardt @ 2025-02-07 11:22 UTC (permalink / raw)
  To: ffmpeg-devel

Ronald S. Bultje:
> Fixes #11456.
> ---
>  libavcodec/threadprogress.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
> index 62c4fd898b..aa72ff80e7 100644
> --- a/libavcodec/threadprogress.c
> +++ b/libavcodec/threadprogress.c
> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>          return;
>  
> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
> -
>      ff_mutex_lock(&pro->progress_mutex);
> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>      ff_cond_broadcast(&pro->progress_cond);
>      ff_mutex_unlock(&pro->progress_mutex);
>  }

I don't really understand why this is supposed to fix a race; after all,
the synchronisation of ff_thread_progress_(report|await) is not supposed
to be provided by the mutex (which is avoided altogether in the fast
path in ff_thread_report_await()), but by storing and loading the
progress variable.
That's also the reason why I moved this outside of the mutex (compared
to ff_thread_report_progress(). (This way it is possible for a consumer
thread to see the new progress value earlier and possibly avoid the
mutex altogether.)

- Andreas

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:22 ` Andreas Rheinhardt
@ 2025-02-07 11:38   ` Zhao Zhili
  2025-02-07 11:47     ` Andreas Rheinhardt
  2025-02-07 11:39   ` Andreas Rheinhardt
  2025-02-07 13:26   ` Ronald S. Bultje
  2 siblings, 1 reply; 15+ messages in thread
From: Zhao Zhili @ 2025-02-07 11:38 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



> On Feb 7, 2025, at 19:22, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> 
> Ronald S. Bultje:
>> Fixes #11456.
>> ---
>> libavcodec/threadprogress.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>> index 62c4fd898b..aa72ff80e7 100644
>> --- a/libavcodec/threadprogress.c
>> +++ b/libavcodec/threadprogress.c
>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>         return;
>> 
>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>> -
>>     ff_mutex_lock(&pro->progress_mutex);
>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>     ff_cond_broadcast(&pro->progress_cond);
>>     ff_mutex_unlock(&pro->progress_mutex);
>> }
> 
> I don't really understand why this is supposed to fix a race; after all,
> the synchronisation of ff_thread_progress_(report|await) is not supposed
> to be provided by the mutex (which is avoided altogether in the fast
> path in ff_thread_report_await()), but by storing and loading the
> progress variable.
> That's also the reason why I moved this outside of the mutex (compared
> to ff_thread_report_progress(). (This way it is possible for a consumer
> thread to see the new progress value earlier and possibly avoid the
> mutex altogether.)

As I understand it, there is no real race condition, that’s why the patch
says “silence tsan warning”.

I have considered another idea to keep tsan clean and keep the benefit
of set progress earlier: use another non-atomic progress together with
mutex/cond, so atomic and mutex/cond are used separately. Not sure
whether it’s worth the complexity.

> 
> - Andreas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:22 ` Andreas Rheinhardt
  2025-02-07 11:38   ` Zhao Zhili
@ 2025-02-07 11:39   ` Andreas Rheinhardt
  2025-02-07 11:46     ` Zhao Zhili
  2025-02-07 13:26   ` Ronald S. Bultje
  2 siblings, 1 reply; 15+ messages in thread
From: Andreas Rheinhardt @ 2025-02-07 11:39 UTC (permalink / raw)
  To: ffmpeg-devel

Andreas Rheinhardt:
> Ronald S. Bultje:
>> Fixes #11456.
>> ---
>>  libavcodec/threadprogress.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>> index 62c4fd898b..aa72ff80e7 100644
>> --- a/libavcodec/threadprogress.c
>> +++ b/libavcodec/threadprogress.c
>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>          return;
>>  
>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>> -
>>      ff_mutex_lock(&pro->progress_mutex);
>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>      ff_cond_broadcast(&pro->progress_cond);
>>      ff_mutex_unlock(&pro->progress_mutex);
>>  }
> 
> I don't really understand why this is supposed to fix a race; after all,
> the synchronisation of ff_thread_progress_(report|await) is not supposed
> to be provided by the mutex (which is avoided altogether in the fast
> path in ff_thread_report_await()), but by storing and loading the
> progress variable.
> That's also the reason why I moved this outside of the mutex (compared
> to ff_thread_report_progress(). (This way it is possible for a consumer
> thread to see the new progress value earlier and possibly avoid the
> mutex altogether.)
> 

Damn, this optimization works, but only if the progress variable is
always read with acquire-semantics; it is currently read via
memory_order_relaxed inside the mutex (just like in
ff_thread_await_progress()).

According to my understanding, this is what happens:
Consumer thread waits for progress and finds that it is insufficient
(fast path fails)
Producer thread updates progress variable
Consumer thread acquires the mutex and reads new progress via
memory_order_relaxed
Producer thread acquires mutex and broadcasts the new progress

I'd prefer to change these semantics so that we always perform
synchronisation via the atomic progress variable (unless you know of a
performance impact -- I only know that on x86, both memory_order_relaxed
and memory_order_acquire are ordinary loads).

Thanks for looking into this.

- Andreas

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:39   ` Andreas Rheinhardt
@ 2025-02-07 11:46     ` Zhao Zhili
  2025-02-07 11:53       ` Zhao Zhili
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Zhili @ 2025-02-07 11:46 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



> On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> 
> Andreas Rheinhardt:
>> Ronald S. Bultje:
>>> Fixes #11456.
>>> ---
>>> libavcodec/threadprogress.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>> 
>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>> index 62c4fd898b..aa72ff80e7 100644
>>> --- a/libavcodec/threadprogress.c
>>> +++ b/libavcodec/threadprogress.c
>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>>     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>         return;
>>> 
>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>> -
>>>     ff_mutex_lock(&pro->progress_mutex);
>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>     ff_cond_broadcast(&pro->progress_cond);
>>>     ff_mutex_unlock(&pro->progress_mutex);
>>> }
>> 
>> I don't really understand why this is supposed to fix a race; after all,
>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>> to be provided by the mutex (which is avoided altogether in the fast
>> path in ff_thread_report_await()), but by storing and loading the
>> progress variable.
>> That's also the reason why I moved this outside of the mutex (compared
>> to ff_thread_report_progress(). (This way it is possible for a consumer
>> thread to see the new progress value earlier and possibly avoid the
>> mutex altogether.)
>> 
> 
> Damn, this optimization works, but only if the progress variable is
> always read with acquire-semantics; it is currently read via
> memory_order_relaxed inside the mutex (just like in
> ff_thread_await_progress()).
> 
> According to my understanding, this is what happens:
> Consumer thread waits for progress and finds that it is insufficient
> (fast path fails)
> Producer thread updates progress variable
> Consumer thread acquires the mutex and reads new progress via
> memory_order_relaxed
> Producer thread acquires mutex and broadcasts the new progress
> 
> I'd prefer to change these semantics so that we always perform
> synchronisation via the atomic progress variable (unless you know of a
> performance impact -- I only know that on x86, both memory_order_relaxed
> and memory_order_acquire are ordinary loads).

I have considered the solution too, by always use memory_order_acquire
in wait progress. memory_order_relaxed is normal load on ARM, while
memory_order_acquire isn’t. So there is real difference.

https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64-

Now it’s weird to use memory_order_acquire inside mutex lock.

> 
> Thanks for looking into this.
> 
> - Andreas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:38   ` Zhao Zhili
@ 2025-02-07 11:47     ` Andreas Rheinhardt
  0 siblings, 0 replies; 15+ messages in thread
From: Andreas Rheinhardt @ 2025-02-07 11:47 UTC (permalink / raw)
  To: ffmpeg-devel

Zhao Zhili:
> 
> 
>> On Feb 7, 2025, at 19:22, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>
>> Ronald S. Bultje:
>>> Fixes #11456.
>>> ---
>>> libavcodec/threadprogress.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>> index 62c4fd898b..aa72ff80e7 100644
>>> --- a/libavcodec/threadprogress.c
>>> +++ b/libavcodec/threadprogress.c
>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>>     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>         return;
>>>
>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>> -
>>>     ff_mutex_lock(&pro->progress_mutex);
>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>     ff_cond_broadcast(&pro->progress_cond);
>>>     ff_mutex_unlock(&pro->progress_mutex);
>>> }
>>
>> I don't really understand why this is supposed to fix a race; after all,
>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>> to be provided by the mutex (which is avoided altogether in the fast
>> path in ff_thread_report_await()), but by storing and loading the
>> progress variable.
>> That's also the reason why I moved this outside of the mutex (compared
>> to ff_thread_report_progress(). (This way it is possible for a consumer
>> thread to see the new progress value earlier and possibly avoid the
>> mutex altogether.)
> 
> As I understand it, there is no real race condition, that’s why the patch
> says “silence tsan warning”.

There is a race (in fact, both a data race and a race condition (the
latter doesn't happen on x86 with its strong memory model though); see
my other mail for an explanation.

> 
> I have considered another idea to keep tsan clean and keep the benefit
> of set progress earlier: use another non-atomic progress together with
> mutex/cond, so atomic and mutex/cond are used separately. Not sure
> whether it’s worth the complexity.
> 

So we would have one atomic and one non-atomic progress variable and a
mutex? And the atomic progress is set as now and only read for the fast
path and the nonatomic variable is set and read inside the mutex? I
don't really think this is better than any of the alternatives.

- Andreas

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:46     ` Zhao Zhili
@ 2025-02-07 11:53       ` Zhao Zhili
  2025-02-07 12:20         ` Rémi Denis-Courmont
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Zhili @ 2025-02-07 11:53 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Rémi Denis-Courmont



> On Feb 7, 2025, at 19:46, Zhao Zhili <quinkblack-at-foxmail.com@ffmpeg.org> wrote:
> 
> 
> 
>> On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>> 
>> Andreas Rheinhardt:
>>> Ronald S. Bultje:
>>>> Fixes #11456.
>>>> ---
>>>> libavcodec/threadprogress.c | 3 +--
>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>> 
>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>>> index 62c4fd898b..aa72ff80e7 100644
>>>> --- a/libavcodec/threadprogress.c
>>>> +++ b/libavcodec/threadprogress.c
>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>>        return;
>>>> 
>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>> -
>>>>    ff_mutex_lock(&pro->progress_mutex);
>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>    ff_cond_broadcast(&pro->progress_cond);
>>>>    ff_mutex_unlock(&pro->progress_mutex);
>>>> }
>>> 
>>> I don't really understand why this is supposed to fix a race; after all,
>>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>>> to be provided by the mutex (which is avoided altogether in the fast
>>> path in ff_thread_report_await()), but by storing and loading the
>>> progress variable.
>>> That's also the reason why I moved this outside of the mutex (compared
>>> to ff_thread_report_progress(). (This way it is possible for a consumer
>>> thread to see the new progress value earlier and possibly avoid the
>>> mutex altogether.)
>>> 
>> 
>> Damn, this optimization works, but only if the progress variable is
>> always read with acquire-semantics; it is currently read via
>> memory_order_relaxed inside the mutex (just like in
>> ff_thread_await_progress()).
>> 
>> According to my understanding, this is what happens:
>> Consumer thread waits for progress and finds that it is insufficient
>> (fast path fails)
>> Producer thread updates progress variable
>> Consumer thread acquires the mutex and reads new progress via
>> memory_order_relaxed
>> Producer thread acquires mutex and broadcasts the new progress
>> 
>> I'd prefer to change these semantics so that we always perform
>> synchronisation via the atomic progress variable (unless you know of a
>> performance impact -- I only know that on x86, both memory_order_relaxed
>> and memory_order_acquire are ordinary loads).
> 
> I have considered the solution too, by always use memory_order_acquire
> in wait progress. memory_order_relaxed is normal load on ARM, while
> memory_order_acquire isn’t. So there is real difference.
> 
> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64-
> 
> Now it’s weird to use memory_order_acquire inside mutex lock.

cc Remi, who have written VLC atomic_wait and mutex from sketch.

> 
>> 
>> Thanks for looking into this.
>> 
>> - Andreas
>> 
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>> 
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:53       ` Zhao Zhili
@ 2025-02-07 12:20         ` Rémi Denis-Courmont
  0 siblings, 0 replies; 15+ messages in thread
From: Rémi Denis-Courmont @ 2025-02-07 12:20 UTC (permalink / raw)
  To: Zhao Zhili, FFmpeg development discussions and patches



Le 7 février 2025 13:53:22 GMT+02:00, Zhao Zhili <quinkblack@foxmail.com> a écrit :
>
>
>> On Feb 7, 2025, at 19:46, Zhao Zhili <quinkblack-at-foxmail.com@ffmpeg.org> wrote:
>> 
>> 
>> 
>>> On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>> 
>>> Andreas Rheinhardt:
>>>> Ronald S. Bultje:
>>>>> Fixes #11456.
>>>>> ---
>>>>> libavcodec/threadprogress.c | 3 +--
>>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>>> 
>>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>>>> index 62c4fd898b..aa72ff80e7 100644
>>>>> --- a/libavcodec/threadprogress.c
>>>>> +++ b/libavcodec/threadprogress.c
>>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>>>        return;
>>>>> 
>>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>> -
>>>>>    ff_mutex_lock(&pro->progress_mutex);
>>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>>    ff_cond_broadcast(&pro->progress_cond);
>>>>>    ff_mutex_unlock(&pro->progress_mutex);
>>>>> }
>>>> 
>>>> I don't really understand why this is supposed to fix a race; after all,
>>>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>>>> to be provided by the mutex (which is avoided altogether in the fast
>>>> path in ff_thread_report_await()), but by storing and loading the
>>>> progress variable.
>>>> That's also the reason why I moved this outside of the mutex (compared
>>>> to ff_thread_report_progress(). (This way it is possible for a consumer
>>>> thread to see the new progress value earlier and possibly avoid the
>>>> mutex altogether.)
>>>> 
>>> 
>>> Damn, this optimization works, but only if the progress variable is
>>> always read with acquire-semantics; it is currently read via
>>> memory_order_relaxed inside the mutex (just like in
>>> ff_thread_await_progress()).
>>> 
>>> According to my understanding, this is what happens:
>>> Consumer thread waits for progress and finds that it is insufficient
>>> (fast path fails)
>>> Producer thread updates progress variable
>>> Consumer thread acquires the mutex and reads new progress via
>>> memory_order_relaxed
>>> Producer thread acquires mutex and broadcasts the new progress
>>> 
>>> I'd prefer to change these semantics so that we always perform
>>> synchronisation via the atomic progress variable (unless you know of a
>>> performance impact -- I only know that on x86, both memory_order_relaxed
>>> and memory_order_acquire are ordinary loads).
>> 
>> I have considered the solution too, by always use memory_order_acquire
>> in wait progress. memory_order_relaxed is normal load on ARM, while
>> memory_order_acquire isn’t. So there is real difference.
>> 
>> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64-
>> 
>> Now it’s weird to use memory_order_acquire inside mutex lock.
>
>cc Remi, who have written VLC atomic_wait and mutex from sketch.

It always gets weird when you mix atomics and CVs. CVs must nominally have some sort of associated state that is modified under the same lock. But in the preexisting code there is no such thing.

So yeah, if you need the performance properties (or the infallible initialisation, or the implicit clean-up) of atomics, then you really should use futeces, not CVs. Otherwise don't use atomics.

In this particular case, whether this is a false positive of TSan or a real bug depends on the behaviour of other code paths (which I have not had time to review as of yet).
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 11:22 ` Andreas Rheinhardt
  2025-02-07 11:38   ` Zhao Zhili
  2025-02-07 11:39   ` Andreas Rheinhardt
@ 2025-02-07 13:26   ` Ronald S. Bultje
  2025-02-07 13:43     ` Zhao Zhili
  2 siblings, 1 reply; 15+ messages in thread
From: Ronald S. Bultje @ 2025-02-07 13:26 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,

On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt <
andreas.rheinhardt@outlook.com> wrote:

> Ronald S. Bultje:
> > Fixes #11456.
> > ---
> >  libavcodec/threadprogress.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
> > index 62c4fd898b..aa72ff80e7 100644
> > --- a/libavcodec/threadprogress.c
> > +++ b/libavcodec/threadprogress.c
> > @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro,
> int n)
> >      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
> >          return;
> >
> > -    atomic_store_explicit(&pro->progress, n, memory_order_release);
> > -
> >      ff_mutex_lock(&pro->progress_mutex);
> > +    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >      ff_cond_broadcast(&pro->progress_cond);
> >      ff_mutex_unlock(&pro->progress_mutex);
> >  }
>
> I don't really understand why this is supposed to fix a race; after all,
> the synchronisation of ff_thread_progress_(report|await) is not supposed
> to be provided by the mutex (which is avoided altogether in the fast
> path in ff_thread_report_await()), but by storing and loading the
> progress variable.
> That's also the reason why I moved this outside of the mutex (compared
> to ff_thread_report_progress(). (This way it is possible for a consumer
> thread to see the new progress value earlier and possibly avoid the
> mutex altogether.)


The consumer thread already checks the value without the lock. so the
significance of that last point seems minor to me. This would be different
if the wait() counterpart had no lockless path. Or am I missing something?

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 13:26   ` Ronald S. Bultje
@ 2025-02-07 13:43     ` Zhao Zhili
  2025-02-07 16:05       ` Ronald S. Bultje
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Zhili @ 2025-02-07 13:43 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



> On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> 
> Hi,
> 
> On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt <
> andreas.rheinhardt@outlook.com> wrote:
> 
>> Ronald S. Bultje:
>>> Fixes #11456.
>>> ---
>>> libavcodec/threadprogress.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>> 
>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>> index 62c4fd898b..aa72ff80e7 100644
>>> --- a/libavcodec/threadprogress.c
>>> +++ b/libavcodec/threadprogress.c
>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro,
>> int n)
>>>     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>         return;
>>> 
>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>> -
>>>     ff_mutex_lock(&pro->progress_mutex);
>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>     ff_cond_broadcast(&pro->progress_cond);
>>>     ff_mutex_unlock(&pro->progress_mutex);
>>> }
>> 
>> I don't really understand why this is supposed to fix a race; after all,
>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>> to be provided by the mutex (which is avoided altogether in the fast
>> path in ff_thread_report_await()), but by storing and loading the
>> progress variable.
>> That's also the reason why I moved this outside of the mutex (compared
>> to ff_thread_report_progress(). (This way it is possible for a consumer
>> thread to see the new progress value earlier and possibly avoid the
>> mutex altogether.)
> 
> 
> The consumer thread already checks the value without the lock. so the
> significance of that last point seems minor to me. This would be different
> if the wait() counterpart had no lockless path. Or am I missing something?

What Andreas says is atomic_store before mutex_lock makes the first
atomic_load in progress_wait has a higher chance to succeed. The earlier
progress is set, the higher chance of progress_wait go into the fast path.

> 
> Ronald
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 13:43     ` Zhao Zhili
@ 2025-02-07 16:05       ` Ronald S. Bultje
  2025-02-08  3:50         ` Zhao Zhili
  0 siblings, 1 reply; 15+ messages in thread
From: Ronald S. Bultje @ 2025-02-07 16:05 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,

On Fri, Feb 7, 2025 at 8:44 AM Zhao Zhili <
quinkblack-at-foxmail.com@ffmpeg.org> wrote:

> > On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> > On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt <
> > andreas.rheinhardt@outlook.com> wrote:
> >> Ronald S. Bultje:
> >>> Fixes #11456.
> >>> ---
> >>> libavcodec/threadprogress.c | 3 +--
> >>> 1 file changed, 1 insertion(+), 2 deletions(-)
> >>>
> >>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
> >>> index 62c4fd898b..aa72ff80e7 100644
> >>> --- a/libavcodec/threadprogress.c
> >>> +++ b/libavcodec/threadprogress.c
> >>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro,
> >> int n)
> >>>     if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >=
> n)
> >>>         return;
> >>>
> >>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >>> -
> >>>     ff_mutex_lock(&pro->progress_mutex);
> >>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >>>     ff_cond_broadcast(&pro->progress_cond);
> >>>     ff_mutex_unlock(&pro->progress_mutex);
> >>> }
> >>
> >> I don't really understand why this is supposed to fix a race; after all,
> >> the synchronisation of ff_thread_progress_(report|await) is not supposed
> >> to be provided by the mutex (which is avoided altogether in the fast
> >> path in ff_thread_report_await()), but by storing and loading the
> >> progress variable.
> >> That's also the reason why I moved this outside of the mutex (compared
> >> to ff_thread_report_progress(). (This way it is possible for a consumer
> >> thread to see the new progress value earlier and possibly avoid the
> >> mutex altogether.)
> >
> >
> > The consumer thread already checks the value without the lock. so the
> > significance of that last point seems minor to me. This would be
> different
> > if the wait() counterpart had no lockless path. Or am I missing
> something?
>
> What Andreas says is atomic_store before mutex_lock makes the first
> atomic_load in progress_wait has a higher chance to succeed. The earlier
> progress is set, the higher chance of progress_wait go into the fast path.


I understand that is true in theory - but I have doubts on whether this is
in any way significant in practice if wait() already has behaviour to
pre-empty locklessly

I measured this in the most un-scientific way possible by decoding
gizmo.webm (from Firefox' bug report) 10x before and after my patch, taking
the average and standard deviation, and comparing these with each other. I
repeated this a couple of times. The values (before vs after avg +/-
stddev) are obviously never exactly the same, but they swarm around each
other like a random noise generator. Or to say it differently: in my highly
unscientific test, I see no performance difference.

So ... Is this really worth it?

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-07 16:05       ` Ronald S. Bultje
@ 2025-02-08  3:50         ` Zhao Zhili
  2025-02-08 19:31           ` Ronald S. Bultje
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Zhili @ 2025-02-08  3:50 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

[-- Attachment #1: Type: text/plain, Size: 3459 bytes --]



> On Feb 8, 2025, at 00:05, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> 
> Hi,
> 
> On Fri, Feb 7, 2025 at 8:44 AM Zhao Zhili <
> quinkblack-at-foxmail.com@ffmpeg.org> wrote:
> 
>>> On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> wrote:
>>> On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt <
>>> andreas.rheinhardt@outlook.com> wrote:
>>>> Ronald S. Bultje:
>>>>> Fixes #11456.
>>>>> ---
>>>>> libavcodec/threadprogress.c | 3 +--
>>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>>> 
>>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>>>> index 62c4fd898b..aa72ff80e7 100644
>>>>> --- a/libavcodec/threadprogress.c
>>>>> +++ b/libavcodec/threadprogress.c
>>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro,
>>>> int n)
>>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >=
>> n)
>>>>>        return;
>>>>> 
>>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>> -
>>>>>    ff_mutex_lock(&pro->progress_mutex);
>>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>>    ff_cond_broadcast(&pro->progress_cond);
>>>>>    ff_mutex_unlock(&pro->progress_mutex);
>>>>> }
>>>> 
>>>> I don't really understand why this is supposed to fix a race; after all,
>>>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>>>> to be provided by the mutex (which is avoided altogether in the fast
>>>> path in ff_thread_report_await()), but by storing and loading the
>>>> progress variable.
>>>> That's also the reason why I moved this outside of the mutex (compared
>>>> to ff_thread_report_progress(). (This way it is possible for a consumer
>>>> thread to see the new progress value earlier and possibly avoid the
>>>> mutex altogether.)
>>> 
>>> 
>>> The consumer thread already checks the value without the lock. so the
>>> significance of that last point seems minor to me. This would be
>> different
>>> if the wait() counterpart had no lockless path. Or am I missing
>> something?
>> 
>> What Andreas says is atomic_store before mutex_lock makes the first
>> atomic_load in progress_wait has a higher chance to succeed. The earlier
>> progress is set, the higher chance of progress_wait go into the fast path.
> 
> 
> I understand that is true in theory - but I have doubts on whether this is
> in any way significant in practice if wait() already has behaviour to
> pre-empty locklessly
> 
> I measured this in the most un-scientific way possible by decoding
> gizmo.webm (from Firefox' bug report) 10x before and after my patch, taking
> the average and standard deviation, and comparing these with each other. I
> repeated this a couple of times. The values (before vs after avg +/-
> stddev) are obviously never exactly the same, but they swarm around each
> other like a random noise generator. Or to say it differently: in my highly
> unscientific test, I see no performance difference.
> 
> So ... Is this really worth it?

I did another test by measure fast_path / (fast_path + slow_path) on macOS of hevc decoding with 10 threads.

1. Before the patch, it’s 99.741%.
2. With the patch, it’s 99.743%.
3. With while (atomic_load_explicit(&pro->progress, memory_order_acquire) < n), it’s 99.741%.

So, it doesn’t matter for performance. Current patch is the most elegant solution in my opinion.


[-- Attachment #2: threadprogress.patch --]
[-- Type: application/octet-stream, Size: 2156 bytes --]

[-- Attachment #3: Type: text/plain, Size: 279 bytes --]


> 
> Ronald
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


[-- Attachment #4: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-08  3:50         ` Zhao Zhili
@ 2025-02-08 19:31           ` Ronald S. Bultje
  2025-02-10 16:02             ` Ronald S. Bultje
  0 siblings, 1 reply; 15+ messages in thread
From: Ronald S. Bultje @ 2025-02-08 19:31 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,

On Fri, Feb 7, 2025 at 10:50 PM Zhao Zhili <
quinkblack-at-foxmail.com@ffmpeg.org> wrote:

>
>
> > On Feb 8, 2025, at 00:05, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> >
> > Hi,
> >
> > On Fri, Feb 7, 2025 at 8:44 AM Zhao Zhili <
> > quinkblack-at-foxmail.com@ffmpeg.org> wrote:
> >
> >>> On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> >>> On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt <
> >>> andreas.rheinhardt@outlook.com> wrote:
> >>>> Ronald S. Bultje:
> >>>>> Fixes #11456.
> >>>>> ---
> >>>>> libavcodec/threadprogress.c | 3 +--
> >>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/libavcodec/threadprogress.c
> b/libavcodec/threadprogress.c
> >>>>> index 62c4fd898b..aa72ff80e7 100644
> >>>>> --- a/libavcodec/threadprogress.c
> >>>>> +++ b/libavcodec/threadprogress.c
> >>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro,
> >>>> int n)
> >>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >=
> >> n)
> >>>>>        return;
> >>>>>
> >>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >>>>> -
> >>>>>    ff_mutex_lock(&pro->progress_mutex);
> >>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >>>>>    ff_cond_broadcast(&pro->progress_cond);
> >>>>>    ff_mutex_unlock(&pro->progress_mutex);
> >>>>> }
> >>>>
> >>>> I don't really understand why this is supposed to fix a race; after
> all,
> >>>> the synchronisation of ff_thread_progress_(report|await) is not
> supposed
> >>>> to be provided by the mutex (which is avoided altogether in the fast
> >>>> path in ff_thread_report_await()), but by storing and loading the
> >>>> progress variable.
> >>>> That's also the reason why I moved this outside of the mutex (compared
> >>>> to ff_thread_report_progress(). (This way it is possible for a
> consumer
> >>>> thread to see the new progress value earlier and possibly avoid the
> >>>> mutex altogether.)
> >>>
> >>>
> >>> The consumer thread already checks the value without the lock. so the
> >>> significance of that last point seems minor to me. This would be
> >> different
> >>> if the wait() counterpart had no lockless path. Or am I missing
> >> something?
> >>
> >> What Andreas says is atomic_store before mutex_lock makes the first
> >> atomic_load in progress_wait has a higher chance to succeed. The earlier
> >> progress is set, the higher chance of progress_wait go into the fast
> path.
> >
> >
> > I understand that is true in theory - but I have doubts on whether this
> is
> > in any way significant in practice if wait() already has behaviour to
> > pre-empty locklessly
> >
> > I measured this in the most un-scientific way possible by decoding
> > gizmo.webm (from Firefox' bug report) 10x before and after my patch,
> taking
> > the average and standard deviation, and comparing these with each other.
> I
> > repeated this a couple of times. The values (before vs after avg +/-
> > stddev) are obviously never exactly the same, but they swarm around each
> > other like a random noise generator. Or to say it differently: in my
> highly
> > unscientific test, I see no performance difference.
> >
> > So ... Is this really worth it?
>
> I did another test by measure fast_path / (fast_path + slow_path) on macOS
> of hevc decoding with 10 threads.
>
> 1. Before the patch, it’s 99.741%.
> 2. With the patch, it’s 99.743%.
> 3. With while (atomic_load_explicit(&pro->progress, memory_order_acquire)
> < n), it’s 99.741%.
>
> So, it doesn’t matter for performance. Current patch is the most elegant
> solution in my opinion.


Thanks for testing. Andreas, any further thoughts?

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning.
  2025-02-08 19:31           ` Ronald S. Bultje
@ 2025-02-10 16:02             ` Ronald S. Bultje
  0 siblings, 0 replies; 15+ messages in thread
From: Ronald S. Bultje @ 2025-02-10 16:02 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,

On Sat, Feb 8, 2025 at 2:31 PM Ronald S. Bultje <rsbultje@gmail.com> wrote:

> Hi,
>
> On Fri, Feb 7, 2025 at 10:50 PM Zhao Zhili <
> quinkblack-at-foxmail.com@ffmpeg.org> wrote:
>
>>
>>
>> > On Feb 8, 2025, at 00:05, Ronald S. Bultje <rsbultje@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > On Fri, Feb 7, 2025 at 8:44 AM Zhao Zhili <
>> > quinkblack-at-foxmail.com@ffmpeg.org> wrote:
>> >
>> >>> On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com>
>> wrote:
>> >>> On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt <
>> >>> andreas.rheinhardt@outlook.com> wrote:
>> >>>> Ronald S. Bultje:
>> >>>>> Fixes #11456.
>> >>>>> ---
>> >>>>> libavcodec/threadprogress.c | 3 +--
>> >>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>> >>>>>
>> >>>>> diff --git a/libavcodec/threadprogress.c
>> b/libavcodec/threadprogress.c
>> >>>>> index 62c4fd898b..aa72ff80e7 100644
>> >>>>> --- a/libavcodec/threadprogress.c
>> >>>>> +++ b/libavcodec/threadprogress.c
>> >>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress
>> *pro,
>> >>>> int n)
>> >>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >=
>> >> n)
>> >>>>>        return;
>> >>>>>
>> >>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>> >>>>> -
>> >>>>>    ff_mutex_lock(&pro->progress_mutex);
>> >>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>> >>>>>    ff_cond_broadcast(&pro->progress_cond);
>> >>>>>    ff_mutex_unlock(&pro->progress_mutex);
>> >>>>> }
>> >>>>
>> >>>> I don't really understand why this is supposed to fix a race; after
>> all,
>> >>>> the synchronisation of ff_thread_progress_(report|await) is not
>> supposed
>> >>>> to be provided by the mutex (which is avoided altogether in the fast
>> >>>> path in ff_thread_report_await()), but by storing and loading the
>> >>>> progress variable.
>> >>>> That's also the reason why I moved this outside of the mutex
>> (compared
>> >>>> to ff_thread_report_progress(). (This way it is possible for a
>> consumer
>> >>>> thread to see the new progress value earlier and possibly avoid the
>> >>>> mutex altogether.)
>> >>>
>> >>>
>> >>> The consumer thread already checks the value without the lock. so the
>> >>> significance of that last point seems minor to me. This would be
>> >> different
>> >>> if the wait() counterpart had no lockless path. Or am I missing
>> >> something?
>> >>
>> >> What Andreas says is atomic_store before mutex_lock makes the first
>> >> atomic_load in progress_wait has a higher chance to succeed. The
>> earlier
>> >> progress is set, the higher chance of progress_wait go into the fast
>> path.
>> >
>> >
>> > I understand that is true in theory - but I have doubts on whether this
>> is
>> > in any way significant in practice if wait() already has behaviour to
>> > pre-empty locklessly
>> >
>> > I measured this in the most un-scientific way possible by decoding
>> > gizmo.webm (from Firefox' bug report) 10x before and after my patch,
>> taking
>> > the average and standard deviation, and comparing these with each
>> other. I
>> > repeated this a couple of times. The values (before vs after avg +/-
>> > stddev) are obviously never exactly the same, but they swarm around each
>> > other like a random noise generator. Or to say it differently: in my
>> highly
>> > unscientific test, I see no performance difference.
>> >
>> > So ... Is this really worth it?
>>
>> I did another test by measure fast_path / (fast_path + slow_path) on
>> macOS of hevc decoding with 10 threads.
>>
>> 1. Before the patch, it’s 99.741%.
>> 2. With the patch, it’s 99.743%.
>> 3. With while (atomic_load_explicit(&pro->progress, memory_order_acquire)
>> < n), it’s 99.741%.
>>
>> So, it doesn’t matter for performance. Current patch is the most elegant
>> solution in my opinion.
>
>
> Thanks for testing. Andreas, any further thoughts?
>

After approval from Andreas on IRC, pushed with a slightly improved commit
message ("silence tsan warning" -> "fix race"), as suggested by Andreas.

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-02-10 16:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-06 21:42 [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning Ronald S. Bultje
2025-02-07  5:13 ` Zhao Zhili
2025-02-07 11:22 ` Andreas Rheinhardt
2025-02-07 11:38   ` Zhao Zhili
2025-02-07 11:47     ` Andreas Rheinhardt
2025-02-07 11:39   ` Andreas Rheinhardt
2025-02-07 11:46     ` Zhao Zhili
2025-02-07 11:53       ` Zhao Zhili
2025-02-07 12:20         ` Rémi Denis-Courmont
2025-02-07 13:26   ` Ronald S. Bultje
2025-02-07 13:43     ` Zhao Zhili
2025-02-07 16:05       ` Ronald S. Bultje
2025-02-08  3:50         ` Zhao Zhili
2025-02-08 19:31           ` Ronald S. Bultje
2025-02-10 16:02             ` Ronald S. Bultje

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git