Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
* [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS
@ 2024-06-05 17:35 Mario Hros
  2024-06-11 17:19 ` Mario Hros
  2024-06-26 17:54 ` [FFmpeg-devel] [PATCH v2] " Mario Hros
  0 siblings, 2 replies; 8+ messages in thread
From: Mario Hros @ 2024-06-05 17:35 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Mario Hros

Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.

Signed-off-by: Mario Hros <k3x-devel@outlook.com>
---
 libswscale/x86/yuv_2_rgb.asm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libswscale/x86/yuv_2_rgb.asm b/libswscale/x86/yuv_2_rgb.asm
index e3470fd9ad..7a247797e4 100644
--- a/libswscale/x86/yuv_2_rgb.asm
+++ b/libswscale/x86/yuv_2_rgb.asm
@@ -353,6 +353,7 @@ cglobal %1_420_%2%3, GPR_num, GPR_num, reg_num, parameters
 add imageq, 8 * depth * time_num
 add indexq, 4 * time_num
 js .loop0
+emms
 
 RET
 
-- 
2.39.3 (Apple Git-146)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-06-05 17:35 [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS Mario Hros
@ 2024-06-11 17:19 ` Mario Hros
  2024-06-25 20:09   ` Michael Niedermayer
  2024-06-26 17:54 ` [FFmpeg-devel] [PATCH v2] " Mario Hros
  1 sibling, 1 reply; 8+ messages in thread
From: Mario Hros @ 2024-06-11 17:19 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: ting.fu

Ping?

Not including EMMS causes the FPU to become unusable (as the IE and SF bits are set in the status word). This breaks code which, for instance, calls fmod() and is compiled by GCC with the -O3 flag enabled. In that scenario, GCC implements fmod using the FPREM FPU instruction, but because FPU is in the invalid state at that point, the result is NaN.

This must to breaking other code using FPU also. The original code with inline assembly used EMMS at the end.
________________________________
From: Mario Hros <k3x-devel@outlook.com>
Sent: Wednesday, June 5, 2024 7:35 PM
To: ffmpeg-devel@ffmpeg.org <ffmpeg-devel@ffmpeg.org>
Cc: Mario Hros <k3x-devel@outlook.com>
Subject: [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS

Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.

Signed-off-by: Mario Hros <k3x-devel@outlook.com>
---
 libswscale/x86/yuv_2_rgb.asm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libswscale/x86/yuv_2_rgb.asm b/libswscale/x86/yuv_2_rgb.asm
index e3470fd9ad..7a247797e4 100644
--- a/libswscale/x86/yuv_2_rgb.asm
+++ b/libswscale/x86/yuv_2_rgb.asm
@@ -353,6 +353,7 @@ cglobal %1_420_%2%3, GPR_num, GPR_num, reg_num, parameters
 add imageq, 8 * depth * time_num
 add indexq, 4 * time_num
 js .loop0
+emms

 RET

--
2.39.3 (Apple Git-146)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-06-11 17:19 ` Mario Hros
@ 2024-06-25 20:09   ` Michael Niedermayer
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Niedermayer @ 2024-06-25 20:09 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


[-- Attachment #1.1: Type: text/plain, Size: 1752 bytes --]

On Tue, Jun 11, 2024 at 05:19:06PM +0000, Mario Hros wrote:
> Ping?
> 
> Not including EMMS causes the FPU to become unusable (as the IE and SF bits are set in the status word). This breaks code which, for instance, calls fmod() and is compiled by GCC with the -O3 flag enabled. In that scenario, GCC implements fmod using the FPREM FPU instruction, but because FPU is in the invalid state at that point, the result is NaN.
> 
> This must to breaking other code using FPU also. The original code with inline assembly used EMMS at the end.
> ________________________________
> From: Mario Hros <k3x-devel@outlook.com>
> Sent: Wednesday, June 5, 2024 7:35 PM
> To: ffmpeg-devel@ffmpeg.org <ffmpeg-devel@ffmpeg.org>
> Cc: Mario Hros <k3x-devel@outlook.com>
> Subject: [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS
> 
> Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.
> 
> Signed-off-by: Mario Hros <k3x-devel@outlook.com>
> ---
>  libswscale/x86/yuv_2_rgb.asm | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/libswscale/x86/yuv_2_rgb.asm b/libswscale/x86/yuv_2_rgb.asm
> index e3470fd9ad..7a247797e4 100644
> --- a/libswscale/x86/yuv_2_rgb.asm
> +++ b/libswscale/x86/yuv_2_rgb.asm
> @@ -353,6 +353,7 @@ cglobal %1_420_%2%3, GPR_num, GPR_num, reg_num, parameters
>  add imageq, 8 * depth * time_num
>  add indexq, 4 * time_num
>  js .loop0
> +emms
> 

this feels like its missing a
    %if mmsize == 8

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are too smart to engage in politics are punished by being
governed by those who are dumber. -- Plato 

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [FFmpeg-devel] [PATCH v2] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-06-05 17:35 [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS Mario Hros
  2024-06-11 17:19 ` Mario Hros
@ 2024-06-26 17:54 ` Mario Hros
  2024-06-26 18:55   ` Ramiro Polla
  1 sibling, 1 reply; 8+ messages in thread
From: Mario Hros @ 2024-06-26 17:54 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Mario Hros

Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.
This needs to be done for 8-byte MMX or Extended MMX only.

Signed-off-by: Mario Hros <k3x-devel@outlook.com>
---
 libswscale/x86/yuv_2_rgb.asm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/libswscale/x86/yuv_2_rgb.asm b/libswscale/x86/yuv_2_rgb.asm
index e3470fd9ad..5926133af8 100644
--- a/libswscale/x86/yuv_2_rgb.asm
+++ b/libswscale/x86/yuv_2_rgb.asm
@@ -354,6 +354,10 @@ add imageq, 8 * depth * time_num
 add indexq, 4 * time_num
 js .loop0
 
+%if mmsize == 8
+emms
+%endif
+
 RET
 
 %endmacro
-- 
2.39.3 (Apple Git-146)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [FFmpeg-devel] [PATCH v2] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-06-26 17:54 ` [FFmpeg-devel] [PATCH v2] " Mario Hros
@ 2024-06-26 18:55   ` Ramiro Polla
  2024-07-01 11:46     ` Ramiro Polla
  0 siblings, 1 reply; 8+ messages in thread
From: Ramiro Polla @ 2024-06-26 18:55 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Hi,

On Wed, Jun 26, 2024 at 8:03 PM Mario Hros <k3x-devel@outlook.com> wrote:
> Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.
> This needs to be done for 8-byte MMX or Extended MMX only.

Sorry I didn't catch this thread earlier. I sent a patch to outright
remove the mmx/mmxext code (thread "swscale/yuv2rgb/x86: remove
mmx/mmxext yuv2rgb functions"):
https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-June/329785.html

The C code should be faster in most cases, or have very similar
performance to the mmx/mmxext code. Is this not the case for you?

Ramiro
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [FFmpeg-devel] [PATCH v2] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-06-26 18:55   ` Ramiro Polla
@ 2024-07-01 11:46     ` Ramiro Polla
  2024-07-04 17:52       ` Mario Hros
  0 siblings, 1 reply; 8+ messages in thread
From: Ramiro Polla @ 2024-07-01 11:46 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: k3x-devel

On Wed, Jun 26, 2024 at 8:55 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> On Wed, Jun 26, 2024 at 8:03 PM Mario Hros <k3x-devel@outlook.com> wrote:
> > Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.
> > This needs to be done for 8-byte MMX or Extended MMX only.
>
> Sorry I didn't catch this thread earlier. I sent a patch to outright
> remove the mmx/mmxext code (thread "swscale/yuv2rgb/x86: remove
> mmx/mmxext yuv2rgb functions"):
> https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-June/329785.html
>
> The C code should be faster in most cases, or have very similar
> performance to the mmx/mmxext code. Is this not the case for you?

Mario, ping? If there are no more comments I'll go ahead and push the
patch that removes mmx/mmxext.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [FFmpeg-devel] [PATCH v2] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-07-01 11:46     ` Ramiro Polla
@ 2024-07-04 17:52       ` Mario Hros
  2024-07-05  8:46         ` Ramiro Polla
  0 siblings, 1 reply; 8+ messages in thread
From: Mario Hros @ 2024-07-04 17:52 UTC (permalink / raw)
  To: Ramiro Polla, FFmpeg development discussions and patches

It is possible that removal would be faster, though I haven't benchmarked that. I think that Kodi only uses this function for UI textures, so nothing performance-intensive. My main point was that the current MMX code without EMMS is basically broken (it breaks fmod() function and other code using x87 FPU). So, either my patch or yours will fix my problem.
________________________________
From: Ramiro Polla <ramiro.polla@gmail.com>
Sent: Monday, July 1, 2024 1:46 PM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: k3x-devel@outlook.com <k3x-devel@outlook.com>
Subject: Re: [FFmpeg-devel] [PATCH v2] libswscale/x86/yuv2rgb: Add missing EMMS

On Wed, Jun 26, 2024 at 8:55 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> On Wed, Jun 26, 2024 at 8:03 PM Mario Hros <k3x-devel@outlook.com> wrote:
> > Previous rewrite from inline assembly into nasm (commit e934194) missed the required EMMS instruction to bring the x87 FPU back into usable state.
> > This needs to be done for 8-byte MMX or Extended MMX only.
>
> Sorry I didn't catch this thread earlier. I sent a patch to outright
> remove the mmx/mmxext code (thread "swscale/yuv2rgb/x86: remove
> mmx/mmxext yuv2rgb functions"):
> https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-June/329785.html
>
> The C code should be faster in most cases, or have very similar
> performance to the mmx/mmxext code. Is this not the case for you?

Mario, ping? If there are no more comments I'll go ahead and push the
patch that removes mmx/mmxext.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [FFmpeg-devel] [PATCH v2] libswscale/x86/yuv2rgb: Add missing EMMS
  2024-07-04 17:52       ` Mario Hros
@ 2024-07-05  8:46         ` Ramiro Polla
  0 siblings, 0 replies; 8+ messages in thread
From: Ramiro Polla @ 2024-07-05  8:46 UTC (permalink / raw)
  To: Mario Hros; +Cc: FFmpeg development discussions and patches

Hi,

On Thu, Jul 4, 2024 at 7:52 PM Mario Hros <k3x-devel@outlook.com> wrote:
> It is possible that removal would be faster, though I haven't benchmarked that. I think that Kodi only uses this function for UI textures, so nothing performance-intensive. My main point was that the current MMX code without EMMS is basically broken (it breaks fmod() function and other code using x87 FPU). So, either my patch or yours will fix my problem.

I ended up pushing the change to remove mmx/mmxext code, but thank you
for sending the patch. Don't hesitate to send more if you come across
other similar issues.

Ramiro
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-07-05  8:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-05 17:35 [FFmpeg-devel] [PATCH] libswscale/x86/yuv2rgb: Add missing EMMS Mario Hros
2024-06-11 17:19 ` Mario Hros
2024-06-25 20:09   ` Michael Niedermayer
2024-06-26 17:54 ` [FFmpeg-devel] [PATCH v2] " Mario Hros
2024-06-26 18:55   ` Ramiro Polla
2024-07-01 11:46     ` Ramiro Polla
2024-07-04 17:52       ` Mario Hros
2024-07-05  8:46         ` Ramiro Polla

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git