* [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
@ 2023-01-07 3:54 Rui Ueyama
2023-01-09 16:01 ` Martin Storsjö
2023-01-14 4:08 ` Rui Ueyama
0 siblings, 2 replies; 5+ messages in thread
From: Rui Ueyama @ 2023-01-07 3:54 UTC
To: ffmpeg-devel
It looks like compiler-generated code always uses `b`, `bl` or `blx`
instructions for function calls. These instructions have a 24-bit
immediate and can therefore jump anywhere within PC +- 16 MiB.
This hand-written assembly code instead uses `bge` and `beq` for
interprocedural jumps. Since these instructions have only a 19-bit
immediate (the condition code leaves fewer bits), they can jump only
within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
out of range" error when linked with the mold linker. This error can
easily be avoided by using `b` instead of `bge` or `beq`.
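The reach figures above follow from treating the immediate as a signed count of 2-byte halfwords. A minimal Python sketch of that arithmetic, assuming the widths quoted in this message; the helper name `branch_reach_bytes` is made up for illustration. The Arm Architecture Reference Manual is the authority on the actual encodings; it lists roughly +-1 MiB for the 32-bit conditional Thumb branch, so take the 512 KiB figure as the limit reported here rather than an architectural constant.

```python
def branch_reach_bytes(imm_bits: int, unit_bytes: int = 2) -> int:
    """Return the +/- reach in bytes of a PC-relative branch.

    Assumes the immediate is a signed field of `imm_bits` bits that counts
    units of `unit_bytes` bytes (2 for Thumb halfwords).
    """
    return (1 << (imm_bits - 1)) * unit_bytes


KIB = 1024
MIB = 1024 * KIB

# Widths as quoted in the message above (assumption, not read from a manual).
print(branch_reach_bytes(24) // MIB, "MiB")  # unconditional b/bl/blx
print(branch_reach_bytes(19) // KIB, "KiB")  # conditional bge/beq
```

Each extra immediate bit doubles the reach, which is why losing a few bits to the condition code shrinks the range so sharply.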
Signed-off-by: Rui Ueyama <rui314@gmail.com>
---
libswresample/arm/audio_convert_neon.S | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/libswresample/arm/audio_convert_neon.S b/libswresample/arm/audio_convert_neon.S
index 085d50aafa..3fe114772c 100644
--- a/libswresample/arm/audio_convert_neon.S
+++ b/libswresample/arm/audio_convert_neon.S
@@ -133,12 +133,13 @@ endfunc
 function swri_oldapi_conv_fltp_to_s16_nch_neon, export=1
         cmp             r3,  #2
-        itt             lt
-        ldrlt           r1,  [r1]
-        blt             .L_swri_oldapi_conv_flt_to_s16_neon
-        beq             .L_swri_oldapi_conv_fltp_to_s16_2ch_neon
+        bgt             2f
+        beq             1f
+        ldr             r1,  [r1]
+        b               .L_swri_oldapi_conv_flt_to_s16_neon
+1:      b               .L_swri_oldapi_conv_fltp_to_s16_2ch_neon
-        push            {r4-r8, lr}
+2:      push            {r4-r8, lr}
         cmp             r3,  #4
         lsl             r12, r3,  #1
         blt             4f
--
2.34.1
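The rewritten sequence is meant to keep the dispatch on `r3` unchanged while making every branch to another function unconditional. A rough Python model of both versions, assuming `r3` holds the channel count and that only the branch target and the `ldr r1, [r1]` side effect matter; the function names and the `nch_body` label are invented for this sketch.

```python
from typing import Tuple

FLT = ".L_swri_oldapi_conv_flt_to_s16_neon"
TWO_CH = ".L_swri_oldapi_conv_fltp_to_s16_2ch_neon"
NCH_BODY = "nch_body"  # hypothetical name for the fall-through path (push {r4-r8, lr} ...)


def dispatch_old(channels: int) -> Tuple[str, bool]:
    """Original code; returns (branch target, whether r1 was dereferenced)."""
    if channels < 2:      # itt lt / ldrlt r1, [r1] / blt: conditional cross-function branch
        return FLT, True
    if channels == 2:     # beq: conditional cross-function branch
        return TWO_CH, False
    return NCH_BODY, False


def dispatch_new(channels: int) -> Tuple[str, bool]:
    """Patched code; conditional branches only target local labels 1: and 2:."""
    if channels > 2:      # bgt 2f
        return NCH_BODY, False
    if channels == 2:     # beq 1f, then the unconditional b at label 1
        return TWO_CH, False
    return FLT, True      # ldr r1, [r1], then an unconditional b


# Both versions should agree for every channel count.
for n in range(1, 9):
    assert dispatch_old(n) == dispatch_new(n)
```

The cost of the change is one extra taken branch on the two-channel path, in exchange for only long-range `b` instructions crossing function boundaries.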
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
From: Martin Storsjö @ 2023-01-09 16:01 UTC
To: FFmpeg development discussions and patches
Hi Rui,
Long time no see!
On Sat, 7 Jan 2023, Rui Ueyama wrote:
> It looks like compiler-generated code always uses `b`, `bl` or `blx`
> instructions for function calls. These instructions have a 24-bit
> immediate and can therefore jump anywhere within PC +- 16 MiB.
>
> This hand-written assembly code instead uses `bge` and `beq` for
> interprocedural jumps. Since these instructions have only a 19-bit
> immediate (the condition code leaves fewer bits), they can jump only
> within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
> out of range" error when linked with the mold linker. This error can
> easily be avoided by using `b` instead of `bge` or `beq`.
Can you add a bit more explanation about what happens in mold in this case
and context about the setup - I don't quite understand how this can happen
(even if the code admittedly is a bit unusual)?
Since .L_swri_oldapi_conv_flt_to_s16_neon and
.L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get
emitted by the assembler, and the branch instructions are encoded with
fixed offsets and no relocations. And even if there were a relocation,
the destination is within the same text section chunk in the object file,
so it shouldn't be possible for it to be out of range.
The only way for this to be out of range is if the destination is
treated as a global and routed via the PLT?
What am I missing here?
// Martin
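The same-section argument above can be checked with simple arithmetic: a branch and a target that both sit inside one section can never be further apart than the section is long. A small Python sketch under that assumption, using a simplified displacement (target minus branch address, ignoring the PC read-ahead offset) and the 512 KiB limit quoted earlier in the thread; `fits_in_range`, `worst_case_in_section` and the example sizes are hypothetical.

```python
REACH = 512 * 1024  # +/- limit in bytes, as quoted earlier in the thread


def fits_in_range(branch_addr: int, target_addr: int, reach: int = REACH) -> bool:
    """Simplified linker-style check: is the displacement within [-reach, reach)?"""
    displacement = target_addr - branch_addr
    return -reach <= displacement < reach


def worst_case_in_section(section_size: int, reach: int = REACH) -> bool:
    """True if any branch inside a section of this size can reach any target in it."""
    # The extreme cases are first byte -> last byte and last byte -> first byte.
    last = section_size - 1
    return fits_in_range(0, last, reach) and fits_in_range(last, 0, reach)


# A section of a few KiB (size made up) is always safe...
print(worst_case_in_section(8 * 1024))
# ...while a target 2 MiB away, e.g. a PLT entry placed elsewhere in the image, is not.
print(fits_in_range(0x10000, 0x10000 + 2 * 1024 * 1024))
```

Only when the target can end up outside the section, as with a global symbol resolved by the linker, does the distance stop being bounded by the section size.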
* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
From: Martin Storsjö @ 2023-01-09 21:48 UTC
To: FFmpeg development discussions and patches
On Mon, 9 Jan 2023, Martin Storsjö wrote:
> Hi Rui,
>
> Long time no see!
>
> On Sat, 7 Jan 2023, Rui Ueyama wrote:
>
>> It looks like compiler-generated code always uses `b`, `bl` or `blx`
>> instructions for function calls. These instructions have a 24-bit
>> immediate and can therefore jump anywhere within PC +- 16 MiB.
>>
>> This hand-written assembly code instead uses `bge` and `beq` for
>> interprocedural jumps. Since these instructions have only a 19-bit
>> immediate (the condition code leaves fewer bits), they can jump only
>> within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
>> out of range" error when linked with the mold linker. This error can
>> easily be avoided by using `b` instead of `bge` or `beq`.
>
> Can you add a bit more explanation about what happens in mold in this case
> and context about the setup - I don't quite understand how this can happen
> (even if the code admittedly is a bit unusual)?
>
> Since .L_swri_oldapi_conv_flt_to_s16_neon and
> .L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get
> emitted by the assembler, and the branch instructions are encoded with fixed
> offsets and no relocations. And even if there were a relocation, the
> destination is within the same text section chunk in the object file, so it
> shouldn't be possible for it to be out of range.
>
> The only way for this to be out of range is if the destination is
> treated as a global and routed via the PLT?
>
> What am I missing here?
In particular, it seems like the commits
b22db4f465c9adb2cf1489e04f7b65ef6bb55b8b and
e84212b78e00df17799e01be1e153a073eb8f689 were introduced to fix exactly
this issue - by converting references from using the external global
symbols into local labels instead.
// Martin
* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
From: Rui Ueyama @ 2023-01-14 4:08 UTC
To: ffmpeg-devel
Hey Martin,
It's nice to see you on this mailing list!
Sorry for sending this as a reply to the wrong email; I didn't
receive your mail, so I couldn't reply to it directly.
> On Sat, 7 Jan 2023, Rui Ueyama wrote:
>
> > It looks like compiler-generated code always uses `b`, `bl` or `blx`
> > instructions for function calls. These instructions have a 24-bit
> > immediate and can therefore jump anywhere within PC +- 16 MiB.
> >
> > This hand-written assembly code instead uses `bge` and `beq` for
> > interprocedural jumps. Since these instructions have only a 19-bit
> > immediate (the condition code leaves fewer bits), they can jump only
> > within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
> > out of range" error when linked with the mold linker. This error can
> > easily be avoided by using `b` instead of `bge` or `beq`.
>
> Can you add a bit more explanation about what happens in mold in this case
> and context about the setup - I don't quite understand how this can happen
> (even if the code admittedly is a bit unusual)?
>
> Since .L_swri_oldapi_conv_flt_to_s16_neon and
> .L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get
> emitted by the assembler, and the branch instructions are encoded with
> fixed offsets and no relocations. And even if there were a relocation,
> the destination is within the same text section chunk in the object file,
> so it shouldn't be possible for it to be out of range.
>
> The only way for this to be out of range is if the destination is
> treated as a global and routed via the PLT?
There was confusion on our side. ffmpeg used to contain two
audio_convert_neon.S files, listed below:
libswresample/arm/audio_convert_neon.S
libavresample/arm/audio_convert_neon.S
The latter had the problem that I explained in the previous mail.
But that file has been removed, so there's no problem with the
existing code. I'll retract the patch I sent before. Sorry for the
confusion.
Rui
* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
From: Martin Storsjö @ 2023-01-14 22:30 UTC
To: FFmpeg development discussions and patches
Hi Rui,
On Sat, 14 Jan 2023, Rui Ueyama wrote:
>> On Sat, 7 Jan 2023, Rui Ueyama wrote:
>>
>>> It looks like compiler-generated code always uses `b`, `bl` or `blx`
>>> instructions for function calls. These instructions have a 24-bit
>>> immediate and can therefore jump anywhere within PC +- 16 MiB.
>>>
>>> This hand-written assembly code instead uses `bge` and `beq` for
>>> interprocedural jumps. Since these instructions have only a 19-bit
>>> immediate (the condition code leaves fewer bits), they can jump only
>>> within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
>>> out of range" error when linked with the mold linker. This error can
>>> easily be avoided by using `b` instead of `bge` or `beq`.
>>
>> Can you add a bit more explanation about what happens in mold in this case
>> and context about the setup - I don't quite understand how this can happen
>> (even if the code admittedly is a bit unusual)?
>>
>> Since .L_swri_oldapi_conv_flt_to_s16_neon and
>> .L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get
>> emitted by the assembler, and the branch instructions are encoded with
>> fixed offsets and no relocations. And even if there were a relocation,
>> the destination is within the same text section chunk in the object file,
>> so it shouldn't be possible for it to be out of range.
>>
>> The only way for this to be out of range is if the destination is
>> treated as a global and routed via the PLT?
>
> There was confusion on our side. ffmpeg used to contain two
> audio_convert_neon.S files, listed below:
>
> libswresample/arm/audio_convert_neon.S
> libavresample/arm/audio_convert_neon.S
>
> The latter had the problem that I explained in the previous mail.
> But that file has been removed, so there's no problem with the
> existing code. I'll retract the patch I sent before. Sorry for the
> confusion.
Ah, I see - that explains it!
Ok, good then, that there's no issue with that code pattern in assembly -
otherwise there could be a whole lot of issues to run into...
// Martin