From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4210df0c-2d44-49ef-84f2-3ea856a72bcd@gmail.com>
Date: Fri, 22 Dec 2023 20:30:09 -0300
From: James Almer
To: ffmpeg-devel@ffmpeg.org
References: <20231222011549.16057-1-jamrial@gmail.com> <20231222011549.16057-2-jamrial@gmail.com> <20231222230834.GH6420@pb2>
In-Reply-To: <20231222230834.GH6420@pb2>
Subject: Re: [FFmpeg-devel] [PATCH 2/2] x86/takdsp: add avx2 versions of all functions
List-Id: FFmpeg development discussions and patches

On 12/22/2023 8:08 PM, Michael Niedermayer wrote:
> On Thu, Dec 21, 2023 at 10:15:49PM -0300, James Almer wrote:
>> On an Intel Core i7 12700k:
>>
>> decorrelate_ls_c: 814.3
>> decorrelate_ls_sse2: 165.8
>> decorrelate_ls_avx2: 101.3
>> decorrelate_sf_c: 1602.6
>> decorrelate_sf_sse4: 640.1
>> decorrelate_sf_avx2: 324.6
>> decorrelate_sm_c: 1564.8
>> decorrelate_sm_sse2: 379.3
>> decorrelate_sm_avx2: 203.3
>> decorrelate_sr_c: 785.3
>> decorrelate_sr_sse2: 176.3
>> decorrelate_sr_avx2: 99.8
>>
>> Signed-off-by: James Almer
>
> on AMD Ryzen 9 3950X 16-Core Processor
>
> Illegal instruction (core dumped)
> threads=1
> tests/Makefile:308: recipe for target 'fate-lossless-tak' failed
> make: *** [fate-lossless-tak] Error 132
>
> (gdb) disassemble $rip-32, $rip+32
> Dump of assembler code from 0x55555651a580 to 0x55555651a5c0:
>    0x000055555651a580:  or     $0x17,%al
>    0x000055555651a582:  movdqa %xmm1,(%rdi,%rdx,1)
>    0x000055555651a587:  add    $0x10,%rdx
>    0x000055555651a58b:  jl     0x55555651a562
>    0x000055555651a58d:  retq
>    0x000055555651a58e:  nop
>    0x000055555651a58f:  nop
>    0x000055555651a590:  shl    $0x2,%edx
>    0x000055555651a593:  add    %rdx,%rdi
>    0x000055555651a596:  add    %rdx,%rsi
>    0x000055555651a599:  neg    %rdx
>    0x000055555651a59c:  vmovd  %ecx,%xmm2
> => 0x000055555651a5a0:  vpbroadcastd %r8d,%ymm3

Right, on linux the fifth argument is on a gpr, and vpbroadcastd with gpr
source is avx512. Will fix and resend.

>    0x000055555651a5a6:  vbroadcasti128 0x4bc751(%rip),%ymm4        # 0x5555569d6d00
>    0x000055555651a5af:  vmovdqa (%rsi,%rdx,1),%ymm1
>    0x000055555651a5b4:  vpsrad %xmm2,%ymm1,%ymm1
>    0x000055555651a5b8:  vpmulld %ymm3,%ymm1,%ymm1
>    0x000055555651a5bd:  vpaddd %ymm4,%ymm1,%ymm1
> End of assembler dump.
>
>
> [...]
>
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
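
For illustration only, here is a rough C sketch of the point made in the reply above: an integer argument that arrives in a general-purpose register can be broadcast on plain AVX2 by first moving it into an xmm register and broadcasting from there. This is not the FFmpeg code and not the actual fix (the resent patch is not shown in this message); the names scale_c, scale_avx2 and scale are hypothetical, the multiply-by-factor loop is a simplified stand-in for the vpmulld step in the dump, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed to be available.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX2_PATH 1
#else
#define HAVE_AVX2_PATH 0
#endif

/* Scalar reference: multiply each element by one 32-bit factor, keeping
 * the low 32 bits (unsigned arithmetic avoids signed overflow). */
static void scale_c(int32_t *dst, const int32_t *src, size_t n, int32_t factor)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int32_t)((uint32_t)src[i] * (uint32_t)factor);
}

#if HAVE_AVX2_PATH
/* Hypothetical AVX2 version. The factor is moved from a GPR into an xmm
 * register, then broadcast from that xmm register. Both steps are legal
 * without AVX-512, unlike a broadcast taken directly from a GPR. */
__attribute__((target("avx2")))
static void scale_avx2(int32_t *dst, const int32_t *src, size_t n, int32_t factor)
{
    __m128i lo = _mm_cvtsi32_si128(factor);     /* GPR -> xmm           */
    __m256i f  = _mm256_broadcastd_epi32(lo);   /* xmm -> all ymm lanes */
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(src + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_mullo_epi32(v, f));
    }
    for (; i < n; i++)                          /* scalar tail */
        dst[i] = (int32_t)((uint32_t)src[i] * (uint32_t)factor);
}
#endif

/* Runtime dispatch, so a CPU without AVX2 never executes the AVX2 path. */
static void scale(int32_t *dst, const int32_t *src, size_t n, int32_t factor)
{
#if HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2")) {
        scale_avx2(dst, src, n, factor);
        return;
    }
#endif
    scale_c(dst, src, n, factor);
}
```

Calling scale(dst, src, n, factor) should give the same result as scale_c on any machine. The runtime check matters here for the same reason the FATE run crashed: the Ryzen 9 3950X supports AVX2 but not AVX-512, so an AVX-512-only instruction form in an AVX2-gated function raises an illegal instruction.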