On Thu, Dec 21, 2023 at 10:15:49PM -0300, James Almer wrote:
> On an Intel Core i7 12700k:
>
> decorrelate_ls_c: 814.3
> decorrelate_ls_sse2: 165.8
> decorrelate_ls_avx2: 101.3
> decorrelate_sf_c: 1602.6
> decorrelate_sf_sse4: 640.1
> decorrelate_sf_avx2: 324.6
> decorrelate_sm_c: 1564.8
> decorrelate_sm_sse2: 379.3
> decorrelate_sm_avx2: 203.3
> decorrelate_sr_c: 785.3
> decorrelate_sr_sse2: 176.3
> decorrelate_sr_avx2: 99.8
>
> Signed-off-by: James Almer

on AMD Ryzen 9 3950X 16-Core Processor

Illegal instruction (core dumped)
threads=1
tests/Makefile:308: recipe for target 'fate-lossless-tak' failed
make: *** [fate-lossless-tak] Error 132

(gdb) disassemble $rip-32, $rip+32
Dump of assembler code from 0x55555651a580 to 0x55555651a5c0:
   0x000055555651a580:  or     $0x17,%al
   0x000055555651a582:  movdqa %xmm1,(%rdi,%rdx,1)
   0x000055555651a587:  add    $0x10,%rdx
   0x000055555651a58b:  jl     0x55555651a562
   0x000055555651a58d:  retq
   0x000055555651a58e:  nop
   0x000055555651a58f:  nop
   0x000055555651a590:  shl    $0x2,%edx
   0x000055555651a593:  add    %rdx,%rdi
   0x000055555651a596:  add    %rdx,%rsi
   0x000055555651a599:  neg    %rdx
   0x000055555651a59c:  vmovd  %ecx,%xmm2
=> 0x000055555651a5a0:  vpbroadcastd %r8d,%ymm3
   0x000055555651a5a6:  vbroadcasti128 0x4bc751(%rip),%ymm4        # 0x5555569d6d00
   0x000055555651a5af:  vmovdqa (%rsi,%rdx,1),%ymm1
   0x000055555651a5b4:  vpsrad %xmm2,%ymm1,%ymm1
   0x000055555651a5b8:  vpmulld %ymm3,%ymm1,%ymm1
   0x000055555651a5bd:  vpaddd %ymm4,%ymm1,%ymm1
End of assembler dump.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope