Hi,

On Tue, Jun 11, 2024 at 8:42 PM James Almer wrote:
>
> On 6/11/2024 3:26 PM, Michael Niedermayer wrote:
> > On Tue, Jun 11, 2024 at 02:28:56PM +0200, Ramiro Polla wrote:
> >> chrRangeFromJpeg_8_c: 28.7
> >> chrRangeFromJpeg_8_sse4: 16.2
> >> chrRangeFromJpeg_24_c: 152.7
> >> chrRangeFromJpeg_24_sse4: 29.7
> >> chrRangeFromJpeg_128_c: 366.5
> >> chrRangeFromJpeg_128_sse4: 233.0
> >> chrRangeFromJpeg_144_c: 408.0
> >> chrRangeFromJpeg_144_sse4: 182.5
> >> chrRangeFromJpeg_256_c: 698.7
> >> chrRangeFromJpeg_256_sse4: 325.5
> >> chrRangeFromJpeg_512_c: 1348.7
> >> chrRangeFromJpeg_512_sse4: 660.2
> >> chrRangeToJpeg_8_c: 37.7
> >> chrRangeToJpeg_8_sse4: 16.2
> >> chrRangeToJpeg_24_c: 115.7
> >> chrRangeToJpeg_24_sse4: 36.2
> >> chrRangeToJpeg_128_c: 631.2
> >> chrRangeToJpeg_128_sse4: 163.7
> >> chrRangeToJpeg_144_c: 710.7
> >> chrRangeToJpeg_144_sse4: 183.0
> >> chrRangeToJpeg_256_c: 1253.0
> >> chrRangeToJpeg_256_sse4: 343.5
> >> chrRangeToJpeg_512_c: 2491.2
> >> chrRangeToJpeg_512_sse4: 654.2
> >> lumRangeFromJpeg_8_c: 11.7
> >> lumRangeFromJpeg_8_sse4: 10.5
> >> lumRangeFromJpeg_24_c: 38.5
> >> lumRangeFromJpeg_24_sse4: 19.0
> >> lumRangeFromJpeg_128_c: 237.5
> >> lumRangeFromJpeg_128_sse4: 79.2
> >> lumRangeFromJpeg_144_c: 255.7
> >> lumRangeFromJpeg_144_sse4: 90.5
> >> lumRangeFromJpeg_256_c: 441.5
> >> lumRangeFromJpeg_256_sse4: 161.7
> >> lumRangeFromJpeg_512_c: 879.0
> >> lumRangeFromJpeg_512_sse4: 333.2
> >> lumRangeToJpeg_8_c: 20.0
> >> lumRangeToJpeg_8_sse4: 11.7
> >> lumRangeToJpeg_24_c: 61.5
> >> lumRangeToJpeg_24_sse4: 17.7
> >> lumRangeToJpeg_128_c: 357.5
> >> lumRangeToJpeg_128_sse4: 80.0
> >> lumRangeToJpeg_144_c: 371.5
> >> lumRangeToJpeg_144_sse4: 93.2
> >> lumRangeToJpeg_256_c: 651.5
> >> lumRangeToJpeg_256_sse4: 164.5
> >> lumRangeToJpeg_512_c: 1279.0
> >> lumRangeToJpeg_512_sse4: 333.7
> >> ---
> >>  libswscale/swscale_internal.h    |   1 +
> >>  libswscale/utils.c               |   2 +
> >>  libswscale/x86/Makefile          |   1 +
> >>  libswscale/x86/range_convert.asm | 130 +++++++++++++++++++++++++++++++
> >>  libswscale/x86/swscale.c         |  36 +++++++++
> >>  5 files changed, 170 insertions(+)
> >>  create mode 100644 libswscale/x86/range_convert.asm
> >
> > breaks x86-32 build
> >
> > LD ffmpeg_g
> > /usr/lib/gcc-cross/i686-linux-gnu/7/../../../../i686-linux-gnu/bin/ld: libswscale/libswscale.a(utils.o): in function `sws_setColorspaceDetails':
> > ffmpeg/linux32/src/libswscale/utils.c:1086: undefined reference to `ff_sws_init_range_convert_x86'
> > collect2: error: ld returned 1 exit status
> > make: *** [Makefile:139: ffmpeg_g] Error 1
> >
> > thx
>
> The functions are wrapped in ARCH_X86_64 checks for seemingly no reason,
> so they should be removed in the next iteration.

Fixed. James walked me through optimizing and improving the functions
on IRC, so that they now work with both sse2 and avx2.

New patch attached.
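
For anyone following along, the x86-32 breakage quoted above looks like a mismatch between where the init function is called and where it is defined. Below is a minimal sketch of that pattern, not the code from the patch: every name prefixed with sketch_ or SKETCH_ is hypothetical, the macro values simulate a 32-bit x86 build, and the conversion bodies are empty placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the 0/1 architecture macros a configure step would define.
 * These values are assumptions that simulate a 32-bit x86 build. */
#define SKETCH_ARCH_X86    1
#define SKETCH_ARCH_X86_64 0

/* Hypothetical context holding a single range-conversion callback. */
typedef struct SketchContext {
    void (*lum_convert)(int16_t *dst, int width);
} SketchContext;

/* Portable fallback (placeholder body: leaves the samples unchanged). */
static void sketch_lum_convert_c(int16_t *dst, int width)
{
    (void)dst;
    (void)width;
}

/* Placeholder for a SIMD routine that is valid on 32- and 64-bit x86. */
static void sketch_lum_convert_simd(int16_t *dst, int width)
{
    (void)dst;
    (void)width;
}

/*
 * x86-specific init. If this definition were guarded by
 * "#if SKETCH_ARCH_X86_64" while the call in sketch_init() stayed under
 * "#if SKETCH_ARCH_X86", a 32-bit build split across object files would
 * compile the call but not the definition, and the linker would report
 * an undefined reference. Using the same guard for both avoids that.
 * A real init function would also check CPU flags at run time; that is
 * omitted here.
 */
#if SKETCH_ARCH_X86
static void sketch_init_range_convert_x86(SketchContext *c)
{
    c->lum_convert = sketch_lum_convert_simd;
}
#endif

/* Generic init: install the C fallback, then let the arch code override it. */
static void sketch_init(SketchContext *c)
{
    assert(c != NULL);
    c->lum_convert = sketch_lum_convert_c;
#if SKETCH_ARCH_X86
    sketch_init_range_convert_x86(c);
#endif
}
```

With SKETCH_ARCH_X86 set to 1, sketch_init() ends up with the SIMD placeholder installed; setting it to 0 removes both the definition and the call, so the C fallback stays in place and the build still links.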