From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 12 May 2024 13:06:57 -0300
Message-ID: <20240512160657.2733-6-jamrial@gmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240511194656.1576-1-jamrial@gmail.com>
References: <20240511194656.1576-1-jamrial@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 8/8] x86/flacdsp: add SSE4 and AVX2 versions of wasted33
List-Id: FFmpeg development discussions and patches
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

flac_wasted_33_c: 214.1
flac_wasted_33_sse4: 133.6
flac_wasted_33_avx2: 93.1

Signed-off-by: James Almer
---
 libavcodec/x86/flacdsp.asm    | 24 ++++++++++++++++++++++++
 libavcodec/x86/flacdsp_init.c |  6 ++++++
 2 files changed, 30 insertions(+)

diff --git a/libavcodec/x86/flacdsp.asm b/libavcodec/x86/flacdsp.asm
index 3a940059c7..84cd4dd465 100644
--- a/libavcodec/x86/flacdsp.asm
+++ b/libavcodec/x86/flacdsp.asm
@@ -104,6 +104,30 @@ ALIGN 16
     jl .loop
     RET
 
+%macro WASTED_33 1
+cglobal flac_wasted_33, 4,4,2, decoded, residuals, wasted, len
+    shl lend, 2
+    lea decodedq, [decodedq+lenq*2]
+    add residualsq, lenq
+    neg lenq
+    movd xm1, wastedd
+ALIGN 16
+.loop:
+    pmovsxdq m0, [residualsq+lenq]
+    psllq m0, xm1
+    mov%1 [decodedq+lenq*2], m0
+    add lenq, mmsize / 2
+    jl .loop
+    RET
+%endmacro
+
+INIT_XMM sse4
+WASTED_33 a
+%if HAVE_AVX2_EXTERNAL
+INIT_YMM avx2
+WASTED_33 u
+%endif
+
 ;----------------------------------------------------------------------------------
 ;void ff_flac_decorrelate_[lrm]s_16_sse2(uint8_t **out, int32_t **in, int channels,
 ; int len, int shift);
diff --git a/libavcodec/x86/flacdsp_init.c b/libavcodec/x86/flacdsp_init.c
index 67aa118760..22482f8787 100644
--- a/libavcodec/x86/flacdsp_init.c
+++ b/libavcodec/x86/flacdsp_init.c
@@ -31,6 +31,8 @@ void ff_flac_lpc_32_xop(int32_t *samples, const int coeffs[32], int order,
                         int qlevel, int len);
 
 void ff_flac_wasted_32_sse2(int32_t *decoded, int wasted, int len);
+void ff_flac_wasted_33_sse4(int64_t *decoded, const int32_t *residual, int wasted, int len);
+void ff_flac_wasted_33_avx2(int64_t *decoded, const int32_t *residual, int wasted, int len);
 
 #define DECORRELATE_FUNCS(fmt, opt) \
 void ff_flac_decorrelate_ls_##fmt##_##opt(uint8_t **out, int32_t **in, int channels, \
@@ -100,6 +102,7 @@ av_cold void ff_flacdsp_init_x86(FLACDSPContext *c, enum AVSampleFormat fmt, int
     if (EXTERNAL_SSE4(cpu_flags)) {
         c->lpc16 = ff_flac_lpc_16_sse4;
         c->lpc32 = ff_flac_lpc_32_sse4;
+        c->wasted33 = ff_flac_wasted_33_sse4;
     }
     if (EXTERNAL_AVX(cpu_flags)) {
         if (fmt == AV_SAMPLE_FMT_S16) {
@@ -117,5 +120,8 @@ av_cold void ff_flacdsp_init_x86(FLACDSPContext *c, enum AVSampleFormat fmt, int
     if (EXTERNAL_XOP(cpu_flags)) {
         c->lpc32 = ff_flac_lpc_32_xop;
     }
+    if (EXTERNAL_AVX2_FAST(cpu_flags)) {
+        c->wasted33 = ff_flac_wasted_33_avx2;
+    }
 #endif /* HAVE_X86ASM */
 }
-- 
2.45.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
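
For readers following the assembly, here is a rough scalar sketch of what the WASTED_33 loop appears to compute: each 32-bit residual is sign-extended to 64 bits (pmovsxdq) and shifted left by the wasted-bits count (psllq). The name wasted_33_ref is hypothetical and this is not the C function benchmarked above; it only mirrors the prototype added to flacdsp_init.c. The vector loops handle 2 (SSE4) or 4 (AVX2) samples per iteration, whereas this sketch goes one at a time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference, not taken from the patch: widen each
 * 32-bit residual to 64 bits with sign extension (as pmovsxdq does),
 * then shift left by the wasted-bits count (as psllq does). */
static void wasted_33_ref(int64_t *decoded, const int32_t *residual,
                          int wasted, int len)
{
    /* Assumption for this sketch: the shift count fits a 64-bit lane. */
    assert(wasted >= 0 && wasted < 64);

    for (int i = 0; i < len; i++) {
        /* Shift in the unsigned domain so a negative residual is not
         * left-shifted as a signed value, which C leaves undefined. */
        decoded[i] = (int64_t)((uint64_t)(int64_t)residual[i] << wasted);
    }
}
```

A scalar version like this may be useful as a comparison point when checking the SIMD output by hand.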