From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTPS id 5E1194DBE8
	for <ffmpegdev@gitmailbox.com>; Wed, 23 Apr 2025 20:47:52 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4EBCA68B379;
	Wed, 23 Apr 2025 23:47:29 +0300 (EEST)
Received: from mail-wr1-f51.google.com (mail-wr1-f51.google.com
 [209.85.221.51])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id E3D3068B316
 for <ffmpeg-devel@ffmpeg.org>; Wed, 23 Apr 2025 23:47:22 +0300 (EEST)
Received: by mail-wr1-f51.google.com with SMTP id
 ffacd0b85a97d-39c1ee0fd43so219021f8f.0
 for <ffmpeg-devel@ffmpeg.org>; Wed, 23 Apr 2025 13:47:22 -0700 (PDT)
Received: from [192.168.0.15]
 (cpc92320-cmbg19-2-0-cust719.5-4.cable.virginm.net. [82.13.66.208])
 by smtp.gmail.com with ESMTPSA id
 ffacd0b85a97d-39efa421c6csm20232490f8f.18.2025.04.23.13.47.21
 for <ffmpeg-devel@ffmpeg.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Wed, 23 Apr 2025 13:47:21 -0700 (PDT)
Message-ID: <52761b73-d6a2-48e2-aa6a-f1e8d5f28029@jkqxz.net>
Date: Wed, 23 Apr 2025 21:47:25 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: ffmpeg-devel@ffmpeg.org
References: <20250421152445.2110045-1-sw@jkqxz.net>
 <20250421152445.2110045-6-sw@jkqxz.net> <20250423195238.GR4991@pb2>
From: Mark Thompson <sw@jkqxz.net>
In-Reply-To: <20250423195238.GR4991@pb2>
Subject: Re: [FFmpeg-devel] [PATCH v2 5/6] lavc/apv: AVX2 transquant for
 x86-64
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/52761b73-d6a2-48e2-aa6a-f1e8d5f28029@jkqxz.net/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

On 23/04/2025 20:52, Michael Niedermayer wrote:
> Hi
> 
> On Mon, Apr 21, 2025 at 04:24:36PM +0100, Mark Thompson wrote:
>> Typical checkasm result on Alder Lake:
>>
>> decode_transquant_8_c:                                 461.1 ( 1.00x)
>> decode_transquant_8_avx2:                               97.5 ( 4.73x)
>> decode_transquant_10_c:                                483.9 ( 1.00x)
>> decode_transquant_10_avx2:                              91.7 ( 5.28x)
>> ---
>>  libavcodec/apv_dsp.c          |   4 +
>>  libavcodec/apv_dsp.h          |   2 +
>>  libavcodec/x86/Makefile       |   2 +
>>  libavcodec/x86/apv_dsp.asm    | 279 ++++++++++++++++++++++++++++++++++
>>  libavcodec/x86/apv_dsp_init.c |  40 +++++
>>  tests/checkasm/Makefile       |   1 +
>>  tests/checkasm/apv_dsp.c      | 109 +++++++++++++
>>  tests/checkasm/checkasm.c     |   3 +
>>  tests/checkasm/checkasm.h     |   1 +
>>  tests/fate/checkasm.mak       |   1 +
>>  10 files changed, 442 insertions(+)
>>  create mode 100644 libavcodec/x86/apv_dsp.asm
>>  create mode 100644 libavcodec/x86/apv_dsp_init.c
>>  create mode 100644 tests/checkasm/apv_dsp.c
> 
> breaks build on x86-32
> make
> X86ASM	libavcodec/x86/apv_dsp.o
> src/libavcodec/x86/apv_dsp.asm:64: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:66: error: symbol `xmmm8' undefined
> src//libavutil/x86/x86inc.asm:1637: ... from macro `movd' defined here
> src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
> src/libavcodec/x86/apv_dsp.asm:67: error: symbol `xmmm9' undefined
> src//libavutil/x86/x86inc.asm:1637: ... from macro `movd' defined here
> src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
> src/libavcodec/x86/apv_dsp.asm:68: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:69: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `m11' undefined
> src/libavcodec/x86/apv_dsp.asm:78: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `m11' undefined
> src/libavcodec/x86/apv_dsp.asm:79: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `xmmm8' undefined
> src/libavcodec/x86/apv_dsp.asm:80: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `m10' undefined
> src/libavcodec/x86/apv_dsp.asm:81: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:86: error: symbol `xmmm9' undefined
> src/libavcodec/x86/apv_dsp.asm:82: ... from macro `LOAD_AND_DEQUANT' defined here
> src/libavcodec/x86/apv_dsp.asm:87: error: symbol `m11' undefined
> src/libavcodec/x86/apv_dsp.asm:78: ... from macro `LOAD_AND_DEQUANT' defined here
> ...

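This was intended to be x86-64 only (due to register pressure: it uses vector registers up to m11, and 32-bit x86 only has m0-m7) and wasn't guarded properly, hence the undefined-symbol errors.  Fixed in the latest version.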
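
For illustration only, here is a minimal sketch of the kind of guard meant here. It is not the actual patch. ARCH_X86_64 normally comes from FFmpeg's config.h; in this sketch it is derived from compiler macros as a stand-in, and transquant_c, transquant_wide and select_transquant are hypothetical names. The x86-64-only routine is both compiled and selected only inside the guard, so a 32-bit build never sees it.

```c
#include <stddef.h>

/* Stand-in for the ARCH_X86_64 macro from FFmpeg's config.h (assumption:
 * derived here from compiler-predefined macros so the sketch builds alone). */
#ifndef ARCH_X86_64
#  if defined(__x86_64__) || defined(_M_X64)
#    define ARCH_X86_64 1
#  else
#    define ARCH_X86_64 0
#  endif
#endif

/* Hypothetical function-pointer type for a dequantisation routine. */
typedef void (*transquant_fn)(int *coeffs, size_t n, int qscale);

/* Portable reference version, always available. */
static void transquant_c(int *coeffs, size_t n, int qscale)
{
    for (size_t i = 0; i < n; i++)
        coeffs[i] *= qscale;
}

#if ARCH_X86_64
/* Placeholder for the routine that needs more than 8 vector registers.
 * It only exists in 64-bit builds, like the guarded assembly would. */
static void transquant_wide(int *coeffs, size_t n, int qscale)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        coeffs[i]     *= qscale;
        coeffs[i + 1] *= qscale;
    }
    if (i < n)
        coeffs[i] *= qscale;
}
#endif

/* Pick an implementation; have_avx2 stands in for a runtime CPU-flag check. */
static transquant_fn select_transquant(int have_avx2)
{
    transquant_fn fn = transquant_c;
#if ARCH_X86_64
    if (have_avx2)
        fn = transquant_wide;
#else
    (void)have_avx2; /* no x86-64-only code to select on this build */
#endif
    return fn;
}
```

On the assembly side the equivalent would be wrapping the function body in %if ARCH_X86_64 ... %endif, so that x86inc never expands m8 and above on a 32-bit target.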

Thank you for testing!

- Mark
