* [FFmpeg-devel] [PATCH] Replace br return with ret
@ 2023-07-27 10:26 Casey Smalley
2023-07-27 13:55 ` Rémi Denis-Courmont
0 siblings, 1 reply; 7+ messages in thread
From: Casey Smalley @ 2023-07-27 10:26 UTC
To: ffmpeg-devel
This patch changes the return instruction in the
tr_32x4 macro from br to ret.
On devices that support BTI, a landing pad is
required when branching with br; alternatively, the
instruction can be replaced with ret.
The change fixes fate-hevc-hdr-vivid-metadata when
running on hardware with BTI support.
Signed-off-by: Casey Smalley <casey.smalley@arm.com>
---
libavcodec/aarch64/hevcdsp_idct_neon.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S
index b7f23386a4..eab2add9e8 100644
--- a/libavcodec/aarch64/hevcdsp_idct_neon.S
+++ b/libavcodec/aarch64/hevcdsp_idct_neon.S
@@ -791,7 +791,7 @@ function func_tr_32x4_\name
         add             x3, x11, #(32 + 3 * 64)
         scale_store     \shift
-        br              x10
+        ret             x10
 endfunc
 .endm
--
2.40.1
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [PATCH] Replace br return with ret
2023-07-27 10:26 [FFmpeg-devel] [PATCH] Replace br return with ret Casey Smalley
@ 2023-07-27 13:55 ` Rémi Denis-Courmont
2023-07-27 17:22 ` Reimar Döffinger
2023-08-04 10:48 ` Martin Storsjö
0 siblings, 2 replies; 7+ messages in thread
From: Rémi Denis-Courmont @ 2023-07-27 13:55 UTC
To: FFmpeg development discussions and patches
Hi,
The use of RET vs BR also has microarchitectural side effects. AFAIU, RET should always be paired with an earlier BL/BLR to avoid interfering with branch prediction.
So depending on the circumstances, either one of these should be addressed:
* Clarify that this is actually a function return, and RET should be used anyway, regardless of BTI.
* Keep BR and add BTI J landing pads where appropriate, if this wasn't really a function return.
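For illustration only, the two options could look roughly like the sketch below. This is a minimal sketch and not the FFmpeg code: the labels helper_ret, nested_helper and jump_example are hypothetical, and the use of x10 to hold the return address across a nested call is an assumption about the general pattern. "hint #36" is the encoding of "bti j", written this way so the sketch assembles without enabling Armv8.5-A in the assembler.

```asm
        .text

// Hypothetical leaf routine, only here so the nested call has a target.
nested_helper:
        ret                     // plain return through x30

// Option 1: the branch really is a function return, so use RET.
// Assumed to be entered with "bl helper_ret".
helper_ret:
        mov     x10, x30        // save the return address; the nested bl overwrites x30
        bl      nested_helper
        ret     x10             // return via x10; RET is not subject to BTI target checks

// Option 2: the branch is not a return, so keep BR and mark its target.
jump_example:
        adr     x10, 1f         // address of the local label below
        br      x10             // indirect jump; on a BTI-guarded page the target must be a landing pad
1:
        hint    #36             // "bti j": accepted as a BR target, executes as a NOP without BTI
        ret
```

In option 1 the RET also pairs with the BL that entered the routine, which is the branch-prediction point made above; in option 2 the landing pad only satisfies the BTI check and says nothing about call/return pairing.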
Br,
* Re: [FFmpeg-devel] [PATCH] Replace br return with ret
2023-07-27 13:55 ` Rémi Denis-Courmont
@ 2023-07-27 17:22 ` Reimar Döffinger
2023-08-04 9:14 ` Casey Smalley
2023-08-04 10:48 ` Martin Storsjö
1 sibling, 1 reply; 7+ messages in thread
From: Reimar Döffinger @ 2023-07-27 17:22 UTC
To: FFmpeg development discussions and patches
> On 27 Jul 2023, at 15:55, Rémi Denis-Courmont <remi@remlab.net> wrote:
>
> Hi,
>
> The use of RET vs BR also has microarchitectural side effects. AFAIU, RET should always be paired with an earlier BL/BLR to avoid interfering with branch prediction.
>
> So depending on the circumstances, either one of these should be addressed:
> * Clarify that this is actually a function return, and RET should be used anyway, regardless of BTI.
> * Keep BR and add BTI J landing pads where appropriate, if this wasn't really a function return.
Yes, BL and RET are best matched up.
For this function:
% git grep func_tr_32x4
libavcodec/aarch64/hevcdsp_idct_neon.S:function func_tr_32x4_\name
libavcodec/aarch64/hevcdsp_idct_neon.S: bl func_tr_32x4_firstpass
libavcodec/aarch64/hevcdsp_idct_neon.S: bl func_tr_32x4_secondpass_\bitdepth
libavcodec/arm/hevcdsp_idct_neon.S:function func_tr_32x4_\name
libavcodec/arm/hevcdsp_idct_neon.S: bl func_tr_32x4_firstpass
libavcodec/arm/hevcdsp_idct_neon.S: bl func_tr_32x4_secondpass_\bitdepth
It is always used with "bl", thus ret is also more correct from
that aspect.
Was your comment only on checking that, or did you mean that this should
be mentioned in the commit message?
(If you are wondering why the code did not use ret before, I guess it's
because it was ported from the 32-bit arm assembler and it slipped by code review.)
Best regards,
Reimar
* Re: [FFmpeg-devel] [PATCH] Replace br return with ret
2023-07-27 17:22 ` Reimar Döffinger
@ 2023-08-04 9:14 ` Casey Smalley
0 siblings, 0 replies; 7+ messages in thread
From: Casey Smalley @ 2023-08-04 9:14 UTC
To: ffmpeg-devel
Hi,
Just wondering what the current thoughts on the patch are. It looks as
though the change is fine, but if there is still an issue I can submit a
new patch using BTI landing pads instead.
Best regards,
Casey.
* Re: [FFmpeg-devel] [PATCH] Replace br return with ret
2023-07-27 13:55 ` Rémi Denis-Courmont
2023-07-27 17:22 ` Reimar Döffinger
@ 2023-08-04 10:48 ` Martin Storsjö
1 sibling, 0 replies; 7+ messages in thread
From: Martin Storsjö @ 2023-08-04 10:48 UTC
To: FFmpeg development discussions and patches
On Thu, 27 Jul 2023, Rémi Denis-Courmont wrote:
> Hi,
>
> The use of RET vs BR also has microarchitectural side effects. AFAIU, RET should always be paired with an earlier BL/BLR to avoid interfering with branch prediction.
>
> So depending on the circumstances, either one of these should be
> addressed:
> * Clarify that this is actually a function return, and RET should be
> used anyway, regardless of BTI.
This is the case, and the patch looks good to me.
I guess the commit message could be clarified to say that this is an issue
even without BTI (even if the effect is much harder to notice there).
Would this amended commit message be ok with you? (If no input I guess
I'll push it in a few days.)
---8<---
Subject: aarch64/hevc: Replace br return with ret
This patch changes the return instruction in the tr_32x4 macro from br to
ret.
Function returns should always use the RET instruction instead of BR, to
avoid interfering with branch prediction.
On devices that support BTI, this is observable, as a landing pad is
required when branching with BR. The change fixes
fate-hevc-hdr-vivid-metadata when running on hardware with BTI support.
---8<---
// Martin
* [FFmpeg-devel] [PATCH] Replace br return with ret
@ 2023-08-08 12:22 Casey Smalley
2023-08-08 17:46 ` Martin Storsjö
0 siblings, 1 reply; 7+ messages in thread
From: Casey Smalley @ 2023-08-08 12:22 UTC
To: ffmpeg-devel
This patch changes the return instruction in the
tr_32x4 macro from br to ret.
Using ret properly hints that the branch is a
function return.
On devices that support BTI, a landing pad is
required when branching with br; alternatively, the
instruction can be replaced with ret.
The change fixes fate-hevc-hdr-vivid-metadata when
running on hardware with BTI support.
Signed-off-by: Casey Smalley <casey.smalley@arm.com>
---
libavcodec/aarch64/hevcdsp_idct_neon.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S
index f7142c939c..dbb3705670 100644
--- a/libavcodec/aarch64/hevcdsp_idct_neon.S
+++ b/libavcodec/aarch64/hevcdsp_idct_neon.S
@@ -790,7 +790,7 @@ function func_tr_32x4_\name
         add             x3, x11, #(32 + 3 * 64)
         scale_store     \shift
-        br              x10
+        ret             x10
 endfunc
 .endm
--
2.40.1
* Re: [FFmpeg-devel] [PATCH] Replace br return with ret
2023-08-08 12:22 Casey Smalley
@ 2023-08-08 17:46 ` Martin Storsjö
0 siblings, 0 replies; 7+ messages in thread
From: Martin Storsjö @ 2023-08-08 17:46 UTC
To: FFmpeg development discussions and patches
On Tue, 8 Aug 2023, Casey Smalley wrote:
> This patch changes the return instruction in the
> tr_32x4 macro from br to ret.
>
> Using ret properly hints that the branch is a
> function return.
>
> On devices that support BTI a landing pad is
> required when branching with br, or the instruction
> can be replaced with a ret.
>
> The change fixes fate-hevc-hdr-vivid-metadata when
> on hardware with BTI support.
>
> Signed-off-by: Casey Smalley <casey.smalley@arm.com>
> ---
> libavcodec/aarch64/hevcdsp_idct_neon.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
I already pushed this patch yesterday (but probably forgot to reply to the
list about it). Thanks for the patch in any case!
// Martin
Thread overview: 7+ messages
2023-07-27 10:26 [FFmpeg-devel] [PATCH] Replace br return with ret Casey Smalley
2023-07-27 13:55 ` Rémi Denis-Courmont
2023-07-27 17:22 ` Reimar Döffinger
2023-08-04 9:14 ` Casey Smalley
2023-08-04 10:48 ` Martin Storsjö
2023-08-08 12:22 Casey Smalley
2023-08-08 17:46 ` Martin Storsjö