From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <52761b73-d6a2-48e2-aa6a-f1e8d5f28029@jkqxz.net>
Date: Wed, 23 Apr 2025 21:47:25 +0100
To: ffmpeg-devel@ffmpeg.org
References: <20250421152445.2110045-1-sw@jkqxz.net> <20250421152445.2110045-6-sw@jkqxz.net> <20250423195238.GR4991@pb2>
From: Mark Thompson <sw@jkqxz.net>
In-Reply-To: <20250423195238.GR4991@pb2>
Subject: Re: [FFmpeg-devel] [PATCH v2 5/6] lavc/apv: AVX2 transquant for x86-64
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/52761b73-d6a2-48e2-aa6a-f1e8d5f28029@jkqxz.net/>

On 23/04/2025 20:52, Michael Niedermayer wrote:
> Hi
>
> On Mon, Apr 21, 2025 at 04:24:36PM +0100, Mark Thompson wrote:
>> Typical checkasm result on Alder Lake:
>>
>> decode_transquant_8_c:       461.1 ( 1.00x)
>> decode_transquant_8_avx2:     97.5 ( 4.73x)
>> decode_transquant_10_c:      483.9 ( 1.00x)
>> decode_transquant_10_avx2:    91.7 ( 5.28x)
>> ---
>>  libavcodec/apv_dsp.c          |   4 +
>>  libavcodec/apv_dsp.h          |   2 +
>>  libavcodec/x86/Makefile       |   2 +
>>  libavcodec/x86/apv_dsp.asm    | 279 ++++++++++++++++++++++++++++++++++
>>  libavcodec/x86/apv_dsp_init.c |  40 +++++
>>  tests/checkasm/Makefile       |   1 +
>>  tests/checkasm/apv_dsp.c      | 109 +++++++++++++
>>  tests/checkasm/checkasm.c     |   3 +
>>  tests/checkasm/checkasm.h     |   1 +
>>  tests/fate/checkasm.mak       |   1 +
>>  10 files changed, 442 insertions(+)
>>  create mode 100644 libavcodec/x86/apv_dsp.asm
>>  create mode 100644 libavcodec/x86/apv_dsp_init.c
>>  create mode 100644 tests/checkasm/apv_dsp.c
>
> breaks build on x86-32
> make
> X86ASM libavcodec/x86/apv_dsp.o
> src/libavcodec/x86/apv_dsp.asm:64: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:66: error: symbol `xmmm8' undefined
> src//libavutil/x86/x86inc.asm:1637: ... from macro `movd' defined here
> src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
> src/libavcodec/x86/apv_dsp.asm:67: error: symbol `xmmm9' undefined
> src//libavutil/x86/x86inc.asm:1637: ... from macro `movd' defined here
> src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
> src/libavcodec/x86/apv_dsp.asm:68: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:69: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `m11' undefined
> src/libavcodec/x86/apv_dsp.asm:78: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `m11' undefined
> src/libavcodec/x86/apv_dsp.asm:79: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `xmmm8' undefined
> src/libavcodec/x86/apv_dsp.asm:80: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:81: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `xmmm9' undefined
> src/libavcodec/x86/apv_dsp.asm:82: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:87: error: symbol `m11' undefined
> src/libavcodec/x86/apv_dsp.asm:78: ... from macro `LOAD_AND_DEQUANT' defined here
> ...

This was intended to be x86-64 only (due to register pressure) and wasn't guarded properly.  Fixed in the latest version.

Thank you for testing!

- Mark
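
For illustration, the kind of guard described above can be sketched in C. The undefined symbols in the log (m8 to m11) are vector registers that exist only in 64-bit mode, so any code using them has to be compiled out on x86-32. This is a minimal standalone sketch under stated assumptions, not the code from the patch: `SketchDSPContext`, `sketch_dsp_init`, `decode_transquant_x86_64` and the `have_avx2` flag are hypothetical names, the arithmetic is a placeholder rather than the APV transform, and `ARCH_X86_64` is derived here from compiler macros as a stand-in for the value the build system would normally provide. On the assembly side the usual counterpart is to wrap the routine in `%if ARCH_X86_64` ... `%endif`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the 0/1 ARCH_X86_64 value that a build
 * system would normally define; derived here from compiler macros. */
#ifndef ARCH_X86_64
#  if defined(__x86_64__) || defined(_M_X64)
#    define ARCH_X86_64 1
#  else
#    define ARCH_X86_64 0
#  endif
#endif

/* Hypothetical DSP context holding one function pointer. */
typedef struct SketchDSPContext {
    void (*decode_transquant)(int16_t *coeffs, int n, int scale);
} SketchDSPContext;

/* Portable fallback. Placeholder arithmetic only (a plain scale),
 * not the actual APV dequantisation and inverse transform. */
static void decode_transquant_c(int16_t *coeffs, int n, int scale)
{
    for (int i = 0; i < n; i++)
        coeffs[i] = (int16_t)(coeffs[i] * scale);
}

#if ARCH_X86_64
/* Stand-in for an assembly routine that needs more than eight vector
 * registers. It is only declared and defined on x86-64, mirroring an
 * %if ARCH_X86_64 guard around the assembly itself. */
static void decode_transquant_x86_64(int16_t *coeffs, int n, int scale)
{
    decode_transquant_c(coeffs, n, scale);
}
#endif

/* Install the fallback first, then override it only when both the
 * compile-time architecture check and the runtime CPU flag allow it. */
static void sketch_dsp_init(SketchDSPContext *c, int have_avx2)
{
    assert(c != NULL);
    c->decode_transquant = decode_transquant_c;
#if ARCH_X86_64
    if (have_avx2)
        c->decode_transquant = decode_transquant_x86_64;
#else
    (void)have_avx2; /* nothing to select on other architectures */
#endif
}
```

With both the symbol and the assignment behind the same compile-time check, a 32-bit build never references the 64-bit-only routine, and the C fallback stays in place there.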