Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
@ 2023-01-07  3:54 Rui Ueyama
  2023-01-09 16:01 ` Martin Storsjö
  2023-01-14  4:08 ` Rui Ueyama
  0 siblings, 2 replies; 5+ messages in thread
From: Rui Ueyama @ 2023-01-07  3:54 UTC (permalink / raw)
  To: ffmpeg-devel

It looks like compiler-generated code always uses `b`, `bl` or `blx`
instructions for function calls. These instructions have a 24-bit
immediate and therefore can jump anywhere between PC +- 16 MiB.

This hand-written assembly code instead uses `bge` and `beq` for
interprocedural jumps. Since these instructions have only a 19-bit
immediate (the condition code leaves fewer bits), they can jump only
within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
out of range" error when linked with the mold linker. This error can
easily be avoided by using `b` instead of `bge` or `beq`.

Signed-off-by: Rui Ueyama <rui314@gmail.com>
---
 libswresample/arm/audio_convert_neon.S | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/libswresample/arm/audio_convert_neon.S b/libswresample/arm/audio_convert_neon.S
index 085d50aafa..3fe114772c 100644
--- a/libswresample/arm/audio_convert_neon.S
+++ b/libswresample/arm/audio_convert_neon.S
@@ -133,12 +133,13 @@ endfunc

 function swri_oldapi_conv_fltp_to_s16_nch_neon, export=1
         cmp             r3,  #2
-        itt             lt
-        ldrlt           r1,  [r1]
-        blt             .L_swri_oldapi_conv_flt_to_s16_neon
-        beq             .L_swri_oldapi_conv_fltp_to_s16_2ch_neon
+        bgt             2f
+        beq             1f
+        ldr             r1,  [r1]
+        b               .L_swri_oldapi_conv_flt_to_s16_neon
+1:      b               .L_swri_oldapi_conv_fltp_to_s16_2ch_neon

-        push            {r4-r8, lr}
+2:      push            {r4-r8, lr}
         cmp             r3,  #4
         lsl             r12, r3,  #1
         blt             4f
--
2.34.1
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
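
A minimal sketch of the range arithmetic described in the message above,
written in C. It takes the immediate widths (24 and 19 bits) exactly as
stated in the message and assumes a signed immediate that counts 2-byte
halfwords; the helper names branch_reach_bytes and branch_in_range are
hypothetical. The Arm Architecture Reference Manual remains the
authoritative source for the real instruction encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Reach, in bytes, of a PC-relative branch whose signed immediate has
 * `imm_bits` bits and counts 2-byte halfwords (assumption, see lead-in).
 * A signed n-bit field spans -2^(n-1) .. 2^(n-1)-1 halfwords, which is
 * -2^n .. 2^n-2 bytes. */
static int64_t branch_reach_bytes(unsigned imm_bits)
{
    assert(imm_bits > 0 && imm_bits < 62);
    return (int64_t)1 << imm_bits;
}

/* Nonzero if the displacement (target minus PC, in bytes) fits the
 * range implied by an immediate of `imm_bits` bits. */
static int branch_in_range(int64_t disp, unsigned imm_bits)
{
    int64_t reach = branch_reach_bytes(imm_bits);
    return disp >= -reach && disp < reach;
}
```

Under these assumptions, branch_reach_bytes(24) is 16 MiB and
branch_reach_bytes(19) is 512 KiB, so a displacement of 600 KiB passes
branch_in_range with 24 bits but fails with 19, which is the situation
the quoted linker error describes.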

* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
  2023-01-07  3:54 [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls Rui Ueyama
@ 2023-01-09 16:01 ` Martin Storsjö
  2023-01-09 21:48   ` Martin Storsjö
  2023-01-14  4:08 ` Rui Ueyama
  1 sibling, 1 reply; 5+ messages in thread
From: Martin Storsjö @ 2023-01-09 16:01 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi Rui,

Long time no see!

On Sat, 7 Jan 2023, Rui Ueyama wrote:

> It looks like compiler-generated code always uses `b`, `bl` or `blx`
> instructions for function calls. These instructions have a 24-bit
> immediate and therefore can jump anywhere between PC +- 16 MiB.
>
> This hand-written assembly code instead uses `bge` and `beq` for
> interprocedural jumps. Since these instructions have only a 19-bit
> immediate (the condition code leaves fewer bits), they can jump only
> within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
> out of range" error when linked with the mold linker. This error can
> easily be avoided by using `b` instead of `bge` or `beq`.

Can you add a bit more explanation about what happens in mold in this case 
and context about the setup - I don't quite understand how this can happen 
(even if the code admittedly is a bit unusual)?

Since .L_swri_oldapi_conv_flt_to_s16_neon and 
.L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get 
emitted by the assembler, and the branch instructions are encoded with 
fixed offsets and no relocations. And even if there would be a relocation, 
the destination is within the same text section chunk in the object file, 
so it shouldn't be possible for it to be out of range.

The only possibility for this to be out of range is if the destination is 
treated as a global and routed via the PLT?

What am I missing here?

// Martin

* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
  2023-01-09 16:01 ` Martin Storsjö
@ 2023-01-09 21:48   ` Martin Storsjö
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Storsjö @ 2023-01-09 21:48 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Mon, 9 Jan 2023, Martin Storsjö wrote:

> Hi Rui,
>
> Long time no see!
>
> On Sat, 7 Jan 2023, Rui Ueyama wrote:
>
>> It looks like compiler-generated code always uses `b`, `bl` or `blx`
>> instructions for function calls. These instructions have a 24-bit
>> immediate and therefore can jump anywhere between PC +- 16 MiB.
>> 
>> This hand-written assembly code instead uses `bge` and `beq` for
>> interprocedural jumps. Since these instructions have only a 19-bit
>> immediate (the condition code leaves fewer bits), they can jump only
>> within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
>> out of range" error when linked with the mold linker. This error can
>> easily be avoided by using `b` instead of `bge` or `beq`.
>
> Can you add a bit more explanation about what happens in mold in this case 
> and context about the setup - I don't quite understand how this can happen 
> (even if the code admittedly is a bit unusual)?
>
> Since .L_swri_oldapi_conv_flt_to_s16_neon and 
> .L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get 
> emitted by the assembler, and the branch instructions are encoded with fixed 
> offsets and no relocations. And even if there would be a relocation, the 
> destination is within the same text section chunk in the object file, so it 
> shouldn't be possible for it to be out of range.
>
> The only possibility for this to be out of range is if the destination is 
> treated as a global and routed via the PLT?
>
> What am I missing here?

In particular, it seems like the commits 
b22db4f465c9adb2cf1489e04f7b65ef6bb55b8b and 
e84212b78e00df17799e01be1e153a073eb8f689 were introduced to fix exactly 
this issue - by converting references from using the external global 
symbols into local labels instead.

// Martin

* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
  2023-01-07  3:54 [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls Rui Ueyama
  2023-01-09 16:01 ` Martin Storsjö
@ 2023-01-14  4:08 ` Rui Ueyama
  2023-01-14 22:30   ` Martin Storsjö
  1 sibling, 1 reply; 5+ messages in thread
From: Rui Ueyama @ 2023-01-14  4:08 UTC (permalink / raw)
  To: ffmpeg-devel

Hey Martin,

It's nice to see you on this mailing list!

Sorry for sending this as a reply to the wrong email; I didn't receive
your mail, so I couldn't reply to it directly.

> On Sat, 7 Jan 2023, Rui Ueyama wrote:
>
> > It looks like compiler-generated code always uses `b`, `bl` or `blx`
> > instructions for function calls. These instructions have a 24-bit
> > immediate and therefore can jump anywhere between PC +- 16 MiB.
> >
> > This hand-written assembly code instead uses `bge` and `beq` for
> > interprocedural jumps. Since these instructions have only a 19-bit
> > immediate (the condition code leaves fewer bits), they can jump only
> > within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
> > out of range" error when linked with the mold linker. This error can
> > easily be avoided by using `b` instead of `bge` or `beq`.
>
> Can you add a bit more explanation about what happens in mold in this case
> and context about the setup - I don't quite understand how this can happen
> (even if the code admittedly is a bit unusual)?
>
> Since .L_swri_oldapi_conv_flt_to_s16_neon and
> .L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get
> emitted by the assembler, and the branch instructions are encoded with
> fixed offsets and no relocations. And even if there would be a relocation,
> the destination is within the same text section chunk in the object file,
> so it shouldn't be possible for it to be out of range.
>
> The only possibility for this to be out of range is if the destination is
> treated as a global and routed via the PLT?

There was confusion on our side. ffmpeg used to contain two
audio_convert_neon.S files, as listed below:

 libswresample/arm/audio_convert_neon.S
 libavresample/arm/audio_convert_neon.S

and the latter had a problem that I explained in the previous mail.
But that file has been removed, so there's no problem with the
existing code. I'll retract the patch I sent before. Sorry for the
confusion.

Rui

* Re: [FFmpeg-devel] [PATCH] arm32/neon: Avoid using bge/beq for function calls
  2023-01-14  4:08 ` Rui Ueyama
@ 2023-01-14 22:30   ` Martin Storsjö
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Storsjö @ 2023-01-14 22:30 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi Rui,

On Sat, 14 Jan 2023, Rui Ueyama wrote:

>> On Sat, 7 Jan 2023, Rui Ueyama wrote:
>>
>>> It looks like compiler-generated code always uses `b`, `bl` or `blx`
>>> instructions for function calls. These instructions have a 24-bit
>>> immediate and therefore can jump anywhere between PC +- 16 MiB.
>>>
>>> This hand-written assembly code instead uses `bge` and `beq` for
>>> interprocedural jumps. Since these instructions have only a 19-bit
>>> immediate (we have less bits for condition code), they can jump only
>>> within PC +- 512 KiB. This sometimes causes a "relocation R_ARM_THM_JUMP19
>>> out of range" error when linked with the mold linker. This error can
>>> easily be avoided by using `b` instead of `bge` or `beq`.
>>
>> Can you add a bit more explanation about what happens in mold in this case
>> and context about the setup - I don't quite understand how this can happen
>> (even if the code admittedly is a bit unusual)?
>>
>> Since .L_swri_oldapi_conv_flt_to_s16_neon and
>> .L_swri_oldapi_conv_fltp_to_s16_2ch_neon are local symbols, they don't get
>> emitted by the assembler, and the branch instructions are encoded with
>> fixed offsets and no relocations. And even if there would be a relocation,
>> the destination is within the same text section chunk in the object file,
>> so it shouldn't be possible for it to be out of range.
>>
>> The only possibility for this to be out of range is if the destination is
>> treated as a global and routed via the PLT?
>
> There was confusion on our side. ffmpeg used to contain two
> audio_convert_neon.S files, as listed below:
>
> libswresample/arm/audio_convert_neon.S
> libavresample/arm/audio_convert_neon.S
>
> and the latter had a problem that I explained in the previous mail.
> But that file has been removed, so there's no problem with the
> existing code. I'll retract the patch I sent before. Sorry for the
> confusion.

Ah, I see - that explains it!

Ok, good then, that there's no issue with that code pattern in assembly - 
otherwise there could be a whole lot of issues to run into...

// Martin
