From: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH] swscale/x86/rgb2_rgb: Empty MMX state in ff_shuffle_bytes_2103_mmxext
Date: Tue, 23 Aug 2022 19:28:19 +0200
Message-ID: <DB6PR0101MB22141C1D7C572A2C39CAA9048F709@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com>
In-Reply-To: <20220823154215.GJ2088045@pb2>

Michael Niedermayer:
> On Mon, Aug 22, 2022 at 11:59:17PM +0200, Andreas Rheinhardt wrote:
>> Andreas Rheinhardt:
>>> Fixes FATE-failures with the filter-2xbr filter-3xbr filter-4xbr
>>> filter-ep2x filter-ep3x filter-hq2x filter-hq3x filter-hq4x
>>> filter-paletteuse-bayer filter-paletteuse-bayer0
>>> filter-paletteuse-nodither and filter-paletteuse-sierra2_4a tests
>>> when using 32bit x86 with CPUFLAGS ranging from "mmx+mmxext" to
>>> "mmx+mmxext+sse+sse2+sse3" (the relevant function is only overwritten
>>> when using SSSE3).
>>>
>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
>>> ---
>>>  libswscale/x86/rgb_2_rgb.asm | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/libswscale/x86/rgb_2_rgb.asm b/libswscale/x86/rgb_2_rgb.asm
>>> index c695c61d5c..76ca1eec03 100644
>>> --- a/libswscale/x86/rgb_2_rgb.asm
>>> +++ b/libswscale/x86/rgb_2_rgb.asm
>>> @@ -104,6 +104,7 @@ jge .end
>>>      jl .loop_simd
>>>
>>>  .end:
>>> +    emms
>>>      RET
>>>
>>>  ;------------------------------------------------------------------------------
>>
>> I'd really love if someone with x86 assembly skills could look over this
>> trivial patch and confirm whether it is indeed correct. All I currently
>> know is that it works for me.
>
> emms needs to be called between MMX and float code, as far outside of loops
> as possible
> that would suggest outside the for() loops in rgbToRgbWrapper() and any
> other code using it.

But there is another aspect that the above is missing: namely that if
emms_c() is put outside of MMX functions, then it will be called even
when it is unnecessary. In this case it is unnecessary for all modern
CPUs, as this function is overridden when SSSE3 is available.

>
> that's what we did and what is most efficient. One can make an argument that
> emms must be called before returning to C code when it's needed. That though
> would imply also that all uses of emms_c() are wrong
>

Well, e.g. the x64 psABI contains this clause: "The CPU shall be in x87
mode upon entry to a function. Therefore, every function that uses the
MMX registers is required to issue an emms or femms instruction after
using MMX registers, before returning or calling another function." So
using emms_c() is not ABI-compliant. If I add an av_assert0_fpu() at the
beginning of av_log_default_callback (a function that may be overridden
by a user-defined callback that actually relies on us conforming to the
ABI), several FATE tests fail. I am sure that there are lots of av_logs
or other functions that are in parts of the code where the CPU is not in
x87 mode and that are just not executed in fate because they are error
logs.

- Andreas

PS: On the brighter side: fate.ffmpeg.org now contains three more green
boxes!
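For readers following the argument, the two positions can be illustrated with a toy model. The sketch below is not FFmpeg code and does not touch real hardware. It only models the architectural fact that MMX instructions mark all eight x87 tag-word entries as valid, while emms marks them all empty again. The class `X87Model` and the functions `shuffle_bytes_without_emms`, `shuffle_bytes_with_emms` and `callee_entry_ok` are hypothetical names chosen for illustration; `callee_entry_ok` is only loosely inspired by the idea behind av_assert0_fpu() and makes no claim about how that function is implemented.

```python
EMPTY = 0b11  # x87 tag value for an empty register
VALID = 0b00  # x87 tag value for a register holding valid data


class X87Model:
    """Toy model of the x87 tag word that the x87 stack and MMX registers share."""

    def __init__(self):
        # After finit or emms, every register is tagged empty ("x87 mode").
        self.tags = [EMPTY] * 8

    def tag_word(self):
        # Pack the eight 2-bit tags into one 16-bit word.
        word = 0
        for i, tag in enumerate(self.tags):
            word |= tag << (2 * i)
        return word

    def mmx_op(self):
        # Any MMX instruction other than emms tags all registers as valid.
        self.tags = [VALID] * 8

    def emms(self):
        # emms tags all registers as empty again.
        self.tags = [EMPTY] * 8

    def in_x87_mode(self):
        return self.tag_word() == 0xFFFF


def shuffle_bytes_without_emms(fpu):
    # Hypothetical MMX routine that returns without emms and leaves
    # the cleanup to an emms_c()-style call somewhere in the caller.
    fpu.mmx_op()


def shuffle_bytes_with_emms(fpu):
    # Hypothetical MMX routine that restores x87 mode before returning,
    # as the quoted psABI clause requires.
    fpu.mmx_op()
    fpu.emms()


def callee_entry_ok(fpu):
    # Hypothetical entry check: is the CPU in x87 mode on function entry?
    return fpu.in_x87_mode()


if __name__ == "__main__":
    a, b = X87Model(), X87Model()
    shuffle_bytes_without_emms(a)
    shuffle_bytes_with_emms(b)
    print("without emms, callee entry ok:", callee_entry_ok(a))
    print("with emms, callee entry ok:", callee_entry_ok(b))
```

In this model, any function called between `shuffle_bytes_without_emms` and the caller's later cleanup sees a non-empty tag word. That is the window in which a user-defined log callback could run with the CPU outside x87 mode.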