From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 3 Jun 2024 22:45:03 -0300
From: James Almer
To: ffmpeg-devel@ffmpeg.org
References: <20240604012343.1771-1-jamrial@gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH] x86/aacencdsp: add SSE2 and AVX versions of quantize_bands
List-Id: FFmpeg development discussions and patches

On 6/3/2024 10:42 PM, Andreas Rheinhardt wrote:
> James Almer:
>> quant_bands_signed_sse2: 417.0
>> quant_bands_signed_avx: 202.0
>
> Missing benchmark numbers for the C code

About 1670. And it doesn't matter, as I'm only adding the AVX version (the subject is wrong, copy-paste fail), so I mentioned the SSE2 number as a comparison to the existing SIMD version. But sure, I can add the C one before pushing.
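For context on the C number, here is a minimal scalar sketch of the per-element work that the SIMD versions appear to vectorize. It is not copied from the tree: the name quantize_bands_scalar is hypothetical, the parameter list mirrors the prototype declared in aacencdsp_init.c, and the clamp-then-truncate order is an assumption based on the routine's arguments.

```c
/* Hypothetical scalar sketch; not the in-tree implementation. */
static void quantize_bands_scalar(int *out, const float *in, const float *scaled,
                                  int size, int is_signed, int maxval,
                                  float Q34, float rounding)
{
    for (int i = 0; i < size; i++) {
        /* Scale the coefficient and add the rounding offset. */
        float qc    = scaled[i] * Q34 + rounding;
        float limit = (float)maxval;
        /* Clamp to maxval, then truncate toward zero. */
        int tmp     = (int)(qc < limit ? qc : limit);
        /* In signed mode, the result takes the sign of the unscaled input. */
        if (is_signed && in[i] < 0.0f)
            tmp = -tmp;
        out[i] = tmp;
    }
}
```

Each iteration is independent of the others, so the loop maps directly onto 4 floats per XMM register or 8 per YMM register.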
>
>>
>> Signed-off-by: James Almer
>> ---
>>  libavcodec/aacenc.h             |  2 +-
>>  libavcodec/x86/aacencdsp.asm    | 27 ++++++++++++++++++++++++---
>>  libavcodec/x86/aacencdsp_init.c |  6 ++++++
>>  tests/checkasm/aacencdsp.c      |  4 ++--
>>  4 files changed, 33 insertions(+), 6 deletions(-)
>>
>> diff --git a/libavcodec/aacenc.h b/libavcodec/aacenc.h
>> index d07960620e..ae15f91e06 100644
>> --- a/libavcodec/aacenc.h
>> +++ b/libavcodec/aacenc.h
>> @@ -242,7 +242,7 @@ typedef struct AACEncContext {
>>      enum RawDataBlockType cur_type; ///< channel group type cur_channel belongs to
>>
>>      AudioFrameQueue afq;
>> -    DECLARE_ALIGNED(16, int, qcoefs)[96]; ///< quantized coefficients
>> +    DECLARE_ALIGNED(32, int, qcoefs)[96]; ///< quantized coefficients
>>      DECLARE_ALIGNED(32, float, scoefs)[1024]; ///< scaled coefficients
>>
>>      uint16_t quantize_band_cost_cache_generation;
>> diff --git a/libavcodec/x86/aacencdsp.asm b/libavcodec/x86/aacencdsp.asm
>> index 0d3ba4b89d..99be2d87f5 100644
>> --- a/libavcodec/x86/aacencdsp.asm
>> +++ b/libavcodec/x86/aacencdsp.asm
>> @@ -53,8 +53,19 @@ cglobal abs_pow34, 3, 3, 3, out, in, size
>>  ;                            int size, int is_signed, int maxval, const float Q34,
>>  ;                            const float rounding)
>>  ;*******************************************************************
>> -INIT_XMM sse2
>> +%macro AAC_QUANTIZE_BANDS 0
>>  cglobal aac_quantize_bands, 5, 5, 6, out, in, scaled, size, is_signed, maxval, Q34, rounding
>> +%if mmsize == 32
>> +    vbroadcastss m0, Q34m
>> +    vbroadcastss m1, roundingm
>> +%if UNIX64 == 0
>> +    cvtsi2ss     xm3, dword maxvalm
>> +%else
>> +    cvtsi2ss     xm3, maxvald
>> +%endif
>> +    shufps       xm3, xm3, xm3, 0
>> +    vinsertf128  m3, m3, xm3, 1
>> +%else ; mmsize == 16
>>  %if UNIX64 == 0
>>      movss        m0, Q34m
>>      movss        m1, roundingm
>> @@ -65,9 +76,13 @@ cglobal aac_quantize_bands, 5, 5, 6, out, in, scaled, size, is_signed, maxval, Q
>>      shufps       m0, m0, 0
>>      shufps       m1, m1, 0
>>      shufps       m3, m3, 0
>> +%endif
>>      shl          is_signedd, 31
>> -    movd         m4, is_signedd
>> -    shufps       m4, m4, 0
>> +    movd         xm4, is_signedd
>> +    shufps       xm4, xm4, xm4, 0
>> +%if mmsize == 32
>> +    vinsertf128  m4, m4, xm4, 1
>> +%endif
>>      shl          sized, 2
>>      add          inq, sizeq
>>      add          outq, sizeq
>> @@ -84,3 +99,9 @@ cglobal aac_quantize_bands, 5, 5, 6, out, in, scaled, size, is_signed, maxval, Q
>>      add          sizeq, mmsize
>>      jl .loop
>>      RET
>> +%endmacro
>> +
>> +INIT_XMM sse2
>> +AAC_QUANTIZE_BANDS
>> +INIT_YMM avx
>> +AAC_QUANTIZE_BANDS
>> diff --git a/libavcodec/x86/aacencdsp_init.c b/libavcodec/x86/aacencdsp_init.c
>> index e0d8dec4f8..cf17dbf91d 100644
>> --- a/libavcodec/x86/aacencdsp_init.c
>> +++ b/libavcodec/x86/aacencdsp_init.c
>> @@ -30,6 +30,9 @@ void ff_abs_pow34_sse(float *out, const float *in, const int size);
>>  void ff_aac_quantize_bands_sse2(int *out, const float *in, const float *scaled,
>>                                  int size, int is_signed, int maxval, const float Q34,
>>                                  const float rounding);
>> +void ff_aac_quantize_bands_avx(int *out, const float *in, const float *scaled,
>> +                               int size, int is_signed, int maxval, const float Q34,
>> +                               const float rounding);
>>
>>  av_cold void ff_aacenc_dsp_init_x86(AACEncDSPContext *s)
>>  {
>> @@ -40,4 +43,7 @@ av_cold void ff_aacenc_dsp_init_x86(AACEncDSPContext *s)
>>
>>      if (EXTERNAL_SSE2(cpu_flags))
>>          s->quant_bands = ff_aac_quantize_bands_sse2;
>
> Seems like the commit message is wrong: You are not adding an SSE2 version.
>
>> +
>> +    if (EXTERNAL_AVX_FAST(cpu_flags))
>> +        s->quant_bands = ff_aac_quantize_bands_avx;
>>  }
>> diff --git a/tests/checkasm/aacencdsp.c b/tests/checkasm/aacencdsp.c
>> index 791dd30320..5308a2ac03 100644
>> --- a/tests/checkasm/aacencdsp.c
>> +++ b/tests/checkasm/aacencdsp.c
>> @@ -81,8 +81,8 @@ static void test_quant_bands(AACEncDSPContext *s)
>>      for (int sign = 0; sign <= 1; sign++) {
>>          if (check_func(s->quant_bands, "quant_bands_%s",
>>                         sign ? "signed" : "unsigned")) {
>> -            LOCAL_ALIGNED_16(int, out, [BUF_SIZE]);
>> -            LOCAL_ALIGNED_16(int, out2, [BUF_SIZE]);
>> +            LOCAL_ALIGNED_32(int, out, [BUF_SIZE]);
>> +            LOCAL_ALIGNED_32(int, out2, [BUF_SIZE]);
>>
>>              call_ref(out, in, scaled, BUF_SIZE, sign, maxval, q34, rounding);
>>              call_new(out2, in, scaled, BUF_SIZE, sign, maxval, q34, rounding);
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".