* [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. @ 2025-02-06 21:42 Ronald S. Bultje 2025-02-07 5:13 ` Zhao Zhili 2025-02-07 11:22 ` Andreas Rheinhardt 0 siblings, 2 replies; 12+ messages in thread From: Ronald S. Bultje @ 2025-02-06 21:42 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Ronald S. Bultje Fixes #11456. --- libavcodec/threadprogress.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c index 62c4fd898b..aa72ff80e7 100644 --- a/libavcodec/threadprogress.c +++ b/libavcodec/threadprogress.c @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) return; - atomic_store_explicit(&pro->progress, n, memory_order_release); - ff_mutex_lock(&pro->progress_mutex); + atomic_store_explicit(&pro->progress, n, memory_order_release); ff_cond_broadcast(&pro->progress_cond); ff_mutex_unlock(&pro->progress_mutex); } -- 2.48.0 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-06 21:42 [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning Ronald S. Bultje @ 2025-02-07 5:13 ` Zhao Zhili 2025-02-07 11:22 ` Andreas Rheinhardt 1 sibling, 0 replies; 12+ messages in thread From: Zhao Zhili @ 2025-02-07 5:13 UTC (permalink / raw) To: FFmpeg development discussions and patches > On Feb 7, 2025, at 05:42, Ronald S. Bultje <rsbultje@gmail.com> wrote: > > Fixes #11456. > --- > libavcodec/threadprogress.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c > index 62c4fd898b..aa72ff80e7 100644 > --- a/libavcodec/threadprogress.c > +++ b/libavcodec/threadprogress.c > @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) > if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) > return; > > - atomic_store_explicit(&pro->progress, n, memory_order_release); > - > ff_mutex_lock(&pro->progress_mutex); > + atomic_store_explicit(&pro->progress, n, memory_order_release); > ff_cond_broadcast(&pro->progress_cond); > ff_mutex_unlock(&pro->progress_mutex); > } > -- > 2.48.0 LGTM. > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-06 21:42 [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning Ronald S. Bultje 2025-02-07 5:13 ` Zhao Zhili @ 2025-02-07 11:22 ` Andreas Rheinhardt 2025-02-07 11:38 ` Zhao Zhili ` (2 more replies) 1 sibling, 3 replies; 12+ messages in thread From: Andreas Rheinhardt @ 2025-02-07 11:22 UTC (permalink / raw) To: ffmpeg-devel Ronald S. Bultje: > Fixes #11456. > --- > libavcodec/threadprogress.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c > index 62c4fd898b..aa72ff80e7 100644 > --- a/libavcodec/threadprogress.c > +++ b/libavcodec/threadprogress.c > @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) > if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) > return; > > - atomic_store_explicit(&pro->progress, n, memory_order_release); > - > ff_mutex_lock(&pro->progress_mutex); > + atomic_store_explicit(&pro->progress, n, memory_order_release); > ff_cond_broadcast(&pro->progress_cond); > ff_mutex_unlock(&pro->progress_mutex); > } I don't really understand why this is supposed to fix a race; after all, the synchronisation of ff_thread_progress_(report|await) is not supposed to be provided by the mutex (which is avoided altogether in the fast path in ff_thread_report_await()), but by storing and loading the progress variable. That's also the reason why I moved this outside of the mutex (compared to ff_thread_report_progress(). (This way it is possible for a consumer thread to see the new progress value earlier and possibly avoid the mutex altogether.) - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:22 ` Andreas Rheinhardt @ 2025-02-07 11:38 ` Zhao Zhili 2025-02-07 11:47 ` Andreas Rheinhardt 2025-02-07 11:39 ` Andreas Rheinhardt 2025-02-07 13:26 ` Ronald S. Bultje 2 siblings, 1 reply; 12+ messages in thread From: Zhao Zhili @ 2025-02-07 11:38 UTC (permalink / raw) To: FFmpeg development discussions and patches > On Feb 7, 2025, at 19:22, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: > > Ronald S. Bultje: >> Fixes #11456. >> --- >> libavcodec/threadprogress.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >> index 62c4fd898b..aa72ff80e7 100644 >> --- a/libavcodec/threadprogress.c >> +++ b/libavcodec/threadprogress.c >> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) >> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >> return; >> >> - atomic_store_explicit(&pro->progress, n, memory_order_release); >> - >> ff_mutex_lock(&pro->progress_mutex); >> + atomic_store_explicit(&pro->progress, n, memory_order_release); >> ff_cond_broadcast(&pro->progress_cond); >> ff_mutex_unlock(&pro->progress_mutex); >> } > > I don't really understand why this is supposed to fix a race; after all, > the synchronisation of ff_thread_progress_(report|await) is not supposed > to be provided by the mutex (which is avoided altogether in the fast > path in ff_thread_report_await()), but by storing and loading the > progress variable. > That's also the reason why I moved this outside of the mutex (compared > to ff_thread_report_progress(). (This way it is possible for a consumer > thread to see the new progress value earlier and possibly avoid the > mutex altogether.) As I understand it, there is no real race condition, that’s why the patch says “silence tsan warning”. I have considered another idea to keep tsan clean and keep the benefit of set progress earlier: use another non-atomic progress together with mutex/cond, so atomic and mutex/cond are used separately. Not sure whether it’s worth the complexity. > > - Andreas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:38 ` Zhao Zhili @ 2025-02-07 11:47 ` Andreas Rheinhardt 0 siblings, 0 replies; 12+ messages in thread From: Andreas Rheinhardt @ 2025-02-07 11:47 UTC (permalink / raw) To: ffmpeg-devel Zhao Zhili: > > >> On Feb 7, 2025, at 19:22, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >> >> Ronald S. Bultje: >>> Fixes #11456. >>> --- >>> libavcodec/threadprogress.c | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >>> index 62c4fd898b..aa72ff80e7 100644 >>> --- a/libavcodec/threadprogress.c >>> +++ b/libavcodec/threadprogress.c >>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) >>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >>> return; >>> >>> - atomic_store_explicit(&pro->progress, n, memory_order_release); >>> - >>> ff_mutex_lock(&pro->progress_mutex); >>> + atomic_store_explicit(&pro->progress, n, memory_order_release); >>> ff_cond_broadcast(&pro->progress_cond); >>> ff_mutex_unlock(&pro->progress_mutex); >>> } >> >> I don't really understand why this is supposed to fix a race; after all, >> the synchronisation of ff_thread_progress_(report|await) is not supposed >> to be provided by the mutex (which is avoided altogether in the fast >> path in ff_thread_report_await()), but by storing and loading the >> progress variable. >> That's also the reason why I moved this outside of the mutex (compared >> to ff_thread_report_progress(). (This way it is possible for a consumer >> thread to see the new progress value earlier and possibly avoid the >> mutex altogether.) > > As I understand it, there is no real race condition, that’s why the patch > says “silence tsan warning”. There is a race (in fact, both a data race and a race condition (the latter doesn't happen on x86 with its strong memory model though); see my other mail for an explanation. > > I have considered another idea to keep tsan clean and keep the benefit > of set progress earlier: use another non-atomic progress together with > mutex/cond, so atomic and mutex/cond are used separately. Not sure > whether it’s worth the complexity. > So we would have one atomic and one non-atomic progress variable and a mutex? And the atomic progress is set as now and only read for the fast path and the nonatomic variable is set and read inside the mutex? I don't really think this is better than any of the alternatives. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:22 ` Andreas Rheinhardt 2025-02-07 11:38 ` Zhao Zhili @ 2025-02-07 11:39 ` Andreas Rheinhardt 2025-02-07 11:46 ` Zhao Zhili 2025-02-07 13:26 ` Ronald S. Bultje 2 siblings, 1 reply; 12+ messages in thread From: Andreas Rheinhardt @ 2025-02-07 11:39 UTC (permalink / raw) To: ffmpeg-devel Andreas Rheinhardt: > Ronald S. Bultje: >> Fixes #11456. >> --- >> libavcodec/threadprogress.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >> index 62c4fd898b..aa72ff80e7 100644 >> --- a/libavcodec/threadprogress.c >> +++ b/libavcodec/threadprogress.c >> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) >> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >> return; >> >> - atomic_store_explicit(&pro->progress, n, memory_order_release); >> - >> ff_mutex_lock(&pro->progress_mutex); >> + atomic_store_explicit(&pro->progress, n, memory_order_release); >> ff_cond_broadcast(&pro->progress_cond); >> ff_mutex_unlock(&pro->progress_mutex); >> } > > I don't really understand why this is supposed to fix a race; after all, > the synchronisation of ff_thread_progress_(report|await) is not supposed > to be provided by the mutex (which is avoided altogether in the fast > path in ff_thread_report_await()), but by storing and loading the > progress variable. > That's also the reason why I moved this outside of the mutex (compared > to ff_thread_report_progress(). (This way it is possible for a consumer > thread to see the new progress value earlier and possibly avoid the > mutex altogether.) > Damn, this optimization works, but only if the progress variable is always read with acquire-semantics; it is currently read via memory_order_relaxed inside the mutex (just like in ff_thread_await_progress()). According to my understanding, this is what happens: Consumer thread waits for progress and finds that it is insufficient (fast path fails) Producer thread updates progress variable Consumer thread acquires the mutex and reads new progress via memory_order_relaxed Producer thread acquires mutex and broadcasts the new progress I'd prefer to change these semantics so that we always perform synchronisation via the atomic progress variable (unless you know of a performance impact -- I only know that on x86, both memory_order_relaxed and memory_order_acquire are ordinary loads). Thanks for looking into this. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:39 ` Andreas Rheinhardt @ 2025-02-07 11:46 ` Zhao Zhili 2025-02-07 11:53 ` Zhao Zhili 0 siblings, 1 reply; 12+ messages in thread From: Zhao Zhili @ 2025-02-07 11:46 UTC (permalink / raw) To: FFmpeg development discussions and patches > On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: > > Andreas Rheinhardt: >> Ronald S. Bultje: >>> Fixes #11456. >>> --- >>> libavcodec/threadprogress.c | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >>> index 62c4fd898b..aa72ff80e7 100644 >>> --- a/libavcodec/threadprogress.c >>> +++ b/libavcodec/threadprogress.c >>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) >>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >>> return; >>> >>> - atomic_store_explicit(&pro->progress, n, memory_order_release); >>> - >>> ff_mutex_lock(&pro->progress_mutex); >>> + atomic_store_explicit(&pro->progress, n, memory_order_release); >>> ff_cond_broadcast(&pro->progress_cond); >>> ff_mutex_unlock(&pro->progress_mutex); >>> } >> >> I don't really understand why this is supposed to fix a race; after all, >> the synchronisation of ff_thread_progress_(report|await) is not supposed >> to be provided by the mutex (which is avoided altogether in the fast >> path in ff_thread_report_await()), but by storing and loading the >> progress variable. >> That's also the reason why I moved this outside of the mutex (compared >> to ff_thread_report_progress(). (This way it is possible for a consumer >> thread to see the new progress value earlier and possibly avoid the >> mutex altogether.) >> > > Damn, this optimization works, but only if the progress variable is > always read with acquire-semantics; it is currently read via > memory_order_relaxed inside the mutex (just like in > ff_thread_await_progress()). > > According to my understanding, this is what happens: > Consumer thread waits for progress and finds that it is insufficient > (fast path fails) > Producer thread updates progress variable > Consumer thread acquires the mutex and reads new progress via > memory_order_relaxed > Producer thread acquires mutex and broadcasts the new progress > > I'd prefer to change these semantics so that we always perform > synchronisation via the atomic progress variable (unless you know of a > performance impact -- I only know that on x86, both memory_order_relaxed > and memory_order_acquire are ordinary loads). I have considered the solution too, by always use memory_order_acquire in wait progress. memory_order_relaxed is normal load on ARM, while memory_order_acquire isn’t. So there is real difference. https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64- Now it’s weird to use memory_order_acquire inside mutex lock. > > Thanks for looking into this. > > - Andreas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:46 ` Zhao Zhili @ 2025-02-07 11:53 ` Zhao Zhili 2025-02-07 12:20 ` Rémi Denis-Courmont 0 siblings, 1 reply; 12+ messages in thread From: Zhao Zhili @ 2025-02-07 11:53 UTC (permalink / raw) To: FFmpeg development discussions and patches; +Cc: Rémi Denis-Courmont > On Feb 7, 2025, at 19:46, Zhao Zhili <quinkblack-at-foxmail.com@ffmpeg.org> wrote: > > > >> On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >> >> Andreas Rheinhardt: >>> Ronald S. Bultje: >>>> Fixes #11456. >>>> --- >>>> libavcodec/threadprogress.c | 3 +-- >>>> 1 file changed, 1 insertion(+), 2 deletions(-) >>>> >>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >>>> index 62c4fd898b..aa72ff80e7 100644 >>>> --- a/libavcodec/threadprogress.c >>>> +++ b/libavcodec/threadprogress.c >>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) >>>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >>>> return; >>>> >>>> - atomic_store_explicit(&pro->progress, n, memory_order_release); >>>> - >>>> ff_mutex_lock(&pro->progress_mutex); >>>> + atomic_store_explicit(&pro->progress, n, memory_order_release); >>>> ff_cond_broadcast(&pro->progress_cond); >>>> ff_mutex_unlock(&pro->progress_mutex); >>>> } >>> >>> I don't really understand why this is supposed to fix a race; after all, >>> the synchronisation of ff_thread_progress_(report|await) is not supposed >>> to be provided by the mutex (which is avoided altogether in the fast >>> path in ff_thread_report_await()), but by storing and loading the >>> progress variable. >>> That's also the reason why I moved this outside of the mutex (compared >>> to ff_thread_report_progress(). (This way it is possible for a consumer >>> thread to see the new progress value earlier and possibly avoid the >>> mutex altogether.) >>> >> >> Damn, this optimization works, but only if the progress variable is >> always read with acquire-semantics; it is currently read via >> memory_order_relaxed inside the mutex (just like in >> ff_thread_await_progress()). >> >> According to my understanding, this is what happens: >> Consumer thread waits for progress and finds that it is insufficient >> (fast path fails) >> Producer thread updates progress variable >> Consumer thread acquires the mutex and reads new progress via >> memory_order_relaxed >> Producer thread acquires mutex and broadcasts the new progress >> >> I'd prefer to change these semantics so that we always perform >> synchronisation via the atomic progress variable (unless you know of a >> performance impact -- I only know that on x86, both memory_order_relaxed >> and memory_order_acquire are ordinary loads). > > I have considered the solution too, by always use memory_order_acquire > in wait progress. memory_order_relaxed is normal load on ARM, while > memory_order_acquire isn’t. So there is real difference. > > https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64- > > Now it’s weird to use memory_order_acquire inside mutex lock. cc Remi, who have written VLC atomic_wait and mutex from sketch. > >> >> Thanks for looking into this. >> >> - Andreas >> >> _______________________________________________ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> To unsubscribe, visit link above, or email >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:53 ` Zhao Zhili @ 2025-02-07 12:20 ` Rémi Denis-Courmont 0 siblings, 0 replies; 12+ messages in thread From: Rémi Denis-Courmont @ 2025-02-07 12:20 UTC (permalink / raw) To: Zhao Zhili, FFmpeg development discussions and patches Le 7 février 2025 13:53:22 GMT+02:00, Zhao Zhili <quinkblack@foxmail.com> a écrit : > > >> On Feb 7, 2025, at 19:46, Zhao Zhili <quinkblack-at-foxmail.com@ffmpeg.org> wrote: >> >> >> >>> On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >>> >>> Andreas Rheinhardt: >>>> Ronald S. Bultje: >>>>> Fixes #11456. >>>>> --- >>>>> libavcodec/threadprogress.c | 3 +-- >>>>> 1 file changed, 1 insertion(+), 2 deletions(-) >>>>> >>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >>>>> index 62c4fd898b..aa72ff80e7 100644 >>>>> --- a/libavcodec/threadprogress.c >>>>> +++ b/libavcodec/threadprogress.c >>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n) >>>>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >>>>> return; >>>>> >>>>> - atomic_store_explicit(&pro->progress, n, memory_order_release); >>>>> - >>>>> ff_mutex_lock(&pro->progress_mutex); >>>>> + atomic_store_explicit(&pro->progress, n, memory_order_release); >>>>> ff_cond_broadcast(&pro->progress_cond); >>>>> ff_mutex_unlock(&pro->progress_mutex); >>>>> } >>>> >>>> I don't really understand why this is supposed to fix a race; after all, >>>> the synchronisation of ff_thread_progress_(report|await) is not supposed >>>> to be provided by the mutex (which is avoided altogether in the fast >>>> path in ff_thread_report_await()), but by storing and loading the >>>> progress variable. >>>> That's also the reason why I moved this outside of the mutex (compared >>>> to ff_thread_report_progress(). (This way it is possible for a consumer >>>> thread to see the new progress value earlier and possibly avoid the >>>> mutex altogether.) >>>> >>> >>> Damn, this optimization works, but only if the progress variable is >>> always read with acquire-semantics; it is currently read via >>> memory_order_relaxed inside the mutex (just like in >>> ff_thread_await_progress()). >>> >>> According to my understanding, this is what happens: >>> Consumer thread waits for progress and finds that it is insufficient >>> (fast path fails) >>> Producer thread updates progress variable >>> Consumer thread acquires the mutex and reads new progress via >>> memory_order_relaxed >>> Producer thread acquires mutex and broadcasts the new progress >>> >>> I'd prefer to change these semantics so that we always perform >>> synchronisation via the atomic progress variable (unless you know of a >>> performance impact -- I only know that on x86, both memory_order_relaxed >>> and memory_order_acquire are ordinary loads). >> >> I have considered the solution too, by always use memory_order_acquire >> in wait progress. memory_order_relaxed is normal load on ARM, while >> memory_order_acquire isn’t. So there is real difference. >> >> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64- >> >> Now it’s weird to use memory_order_acquire inside mutex lock. > >cc Remi, who have written VLC atomic_wait and mutex from sketch. It always gets weird when you mix atomics and CVs. CVs must nominally have some sort of associated state that is modified under the same lock. But in the preexisting code there is no such thing. So yeah, if you need the performance properties (or the infallible initialisation, or the implicit clean-up) of atomics, then you really should use futeces, not CVs. Otherwise don't use atomics. In this particular case, whether this is a false positive of TSan or a real bug depends on the behaviour of other code paths (which I have not had time to review as of yet). _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 11:22 ` Andreas Rheinhardt 2025-02-07 11:38 ` Zhao Zhili 2025-02-07 11:39 ` Andreas Rheinhardt @ 2025-02-07 13:26 ` Ronald S. Bultje 2025-02-07 13:43 ` Zhao Zhili 2 siblings, 1 reply; 12+ messages in thread From: Ronald S. Bultje @ 2025-02-07 13:26 UTC (permalink / raw) To: FFmpeg development discussions and patches Hi, On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt < andreas.rheinhardt@outlook.com> wrote: > Ronald S. Bultje: > > Fixes #11456. > > --- > > libavcodec/threadprogress.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c > > index 62c4fd898b..aa72ff80e7 100644 > > --- a/libavcodec/threadprogress.c > > +++ b/libavcodec/threadprogress.c > > @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, > int n) > > if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) > > return; > > > > - atomic_store_explicit(&pro->progress, n, memory_order_release); > > - > > ff_mutex_lock(&pro->progress_mutex); > > + atomic_store_explicit(&pro->progress, n, memory_order_release); > > ff_cond_broadcast(&pro->progress_cond); > > ff_mutex_unlock(&pro->progress_mutex); > > } > > I don't really understand why this is supposed to fix a race; after all, > the synchronisation of ff_thread_progress_(report|await) is not supposed > to be provided by the mutex (which is avoided altogether in the fast > path in ff_thread_report_await()), but by storing and loading the > progress variable. > That's also the reason why I moved this outside of the mutex (compared > to ff_thread_report_progress(). (This way it is possible for a consumer > thread to see the new progress value earlier and possibly avoid the > mutex altogether.) The consumer thread already checks the value without the lock. so the significance of that last point seems minor to me. This would be different if the wait() counterpart had no lockless path. Or am I missing something? Ronald _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 13:26 ` Ronald S. Bultje @ 2025-02-07 13:43 ` Zhao Zhili 2025-02-07 16:05 ` Ronald S. Bultje 0 siblings, 1 reply; 12+ messages in thread From: Zhao Zhili @ 2025-02-07 13:43 UTC (permalink / raw) To: FFmpeg development discussions and patches > On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> wrote: > > Hi, > > On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt < > andreas.rheinhardt@outlook.com> wrote: > >> Ronald S. Bultje: >>> Fixes #11456. >>> --- >>> libavcodec/threadprogress.c | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c >>> index 62c4fd898b..aa72ff80e7 100644 >>> --- a/libavcodec/threadprogress.c >>> +++ b/libavcodec/threadprogress.c >>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, >> int n) >>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n) >>> return; >>> >>> - atomic_store_explicit(&pro->progress, n, memory_order_release); >>> - >>> ff_mutex_lock(&pro->progress_mutex); >>> + atomic_store_explicit(&pro->progress, n, memory_order_release); >>> ff_cond_broadcast(&pro->progress_cond); >>> ff_mutex_unlock(&pro->progress_mutex); >>> } >> >> I don't really understand why this is supposed to fix a race; after all, >> the synchronisation of ff_thread_progress_(report|await) is not supposed >> to be provided by the mutex (which is avoided altogether in the fast >> path in ff_thread_report_await()), but by storing and loading the >> progress variable. >> That's also the reason why I moved this outside of the mutex (compared >> to ff_thread_report_progress(). (This way it is possible for a consumer >> thread to see the new progress value earlier and possibly avoid the >> mutex altogether.) > > > The consumer thread already checks the value without the lock. so the > significance of that last point seems minor to me. This would be different > if the wait() counterpart had no lockless path. Or am I missing something? What Andreas says is atomic_store before mutex_lock makes the first atomic_load in progress_wait has a higher chance to succeed. The earlier progress is set, the higher chance of progress_wait go into the fast path. > > Ronald > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning. 2025-02-07 13:43 ` Zhao Zhili @ 2025-02-07 16:05 ` Ronald S. Bultje 0 siblings, 0 replies; 12+ messages in thread From: Ronald S. Bultje @ 2025-02-07 16:05 UTC (permalink / raw) To: FFmpeg development discussions and patches Hi, On Fri, Feb 7, 2025 at 8:44 AM Zhao Zhili < quinkblack-at-foxmail.com@ffmpeg.org> wrote: > > On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbultje@gmail.com> wrote: > > On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt < > > andreas.rheinhardt@outlook.com> wrote: > >> Ronald S. Bultje: > >>> Fixes #11456. > >>> --- > >>> libavcodec/threadprogress.c | 3 +-- > >>> 1 file changed, 1 insertion(+), 2 deletions(-) > >>> > >>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c > >>> index 62c4fd898b..aa72ff80e7 100644 > >>> --- a/libavcodec/threadprogress.c > >>> +++ b/libavcodec/threadprogress.c > >>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, > >> int n) > >>> if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= > n) > >>> return; > >>> > >>> - atomic_store_explicit(&pro->progress, n, memory_order_release); > >>> - > >>> ff_mutex_lock(&pro->progress_mutex); > >>> + atomic_store_explicit(&pro->progress, n, memory_order_release); > >>> ff_cond_broadcast(&pro->progress_cond); > >>> ff_mutex_unlock(&pro->progress_mutex); > >>> } > >> > >> I don't really understand why this is supposed to fix a race; after all, > >> the synchronisation of ff_thread_progress_(report|await) is not supposed > >> to be provided by the mutex (which is avoided altogether in the fast > >> path in ff_thread_report_await()), but by storing and loading the > >> progress variable. > >> That's also the reason why I moved this outside of the mutex (compared > >> to ff_thread_report_progress(). (This way it is possible for a consumer > >> thread to see the new progress value earlier and possibly avoid the > >> mutex altogether.) > > > > > > The consumer thread already checks the value without the lock. so the > > significance of that last point seems minor to me. This would be > different > > if the wait() counterpart had no lockless path. Or am I missing > something? > > What Andreas says is atomic_store before mutex_lock makes the first > atomic_load in progress_wait has a higher chance to succeed. The earlier > progress is set, the higher chance of progress_wait go into the fast path. I understand that is true in theory - but I have doubts on whether this is in any way significant in practice if wait() already has behaviour to pre-empty locklessly I measured this in the most un-scientific way possible by decoding gizmo.webm (from Firefox' bug report) 10x before and after my patch, taking the average and standard deviation, and comparing these with each other. I repeated this a couple of times. The values (before vs after avg +/- stddev) are obviously never exactly the same, but they swarm around each other like a random noise generator. Or to say it differently: in my highly unscientific test, I see no performance difference. So ... Is this really worth it? Ronald _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-02-07 16:06 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-02-06 21:42 [FFmpeg-devel] [PATCH] threadprogress: reorder instructions to silence tsan warning Ronald S. Bultje 2025-02-07 5:13 ` Zhao Zhili 2025-02-07 11:22 ` Andreas Rheinhardt 2025-02-07 11:38 ` Zhao Zhili 2025-02-07 11:47 ` Andreas Rheinhardt 2025-02-07 11:39 ` Andreas Rheinhardt 2025-02-07 11:46 ` Zhao Zhili 2025-02-07 11:53 ` Zhao Zhili 2025-02-07 12:20 ` Rémi Denis-Courmont 2025-02-07 13:26 ` Ronald S. Bultje 2025-02-07 13:43 ` Zhao Zhili 2025-02-07 16:05 ` Ronald S. Bultje
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git