Message-ID: <4c270bfb-7f64-4e25-8914-38350ea94fbe@gmail.com>
Date: Tue, 7 May 2024 12:26:53 -0300
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH 2/3] checkasm/blockdsp: use smallest allowed aligned buffers for fill_block_tab tests
References: <20240507002723.1603-1-jamrial@gmail.com> <20240507150205.2039-1-jamrial@gmail.com>
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

On 5/7/2024 12:14 PM, Andreas Rheinhardt wrote:
> James Almer:
>> The requirement is either 8 or 16 bytes alignment, not 32.
>> This should help finding bugs in asm implementations.
>>
>> Signed-off-by: James Almer
>> ---
>>  tests/checkasm/blockdsp.c | 23 +++++++++--------------
>>  1 file changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/tests/checkasm/blockdsp.c b/tests/checkasm/blockdsp.c
>> index ab87fc8fa4..f67a38d302 100644
>> --- a/tests/checkasm/blockdsp.c
>> +++ b/tests/checkasm/blockdsp.c
>> @@ -29,11 +29,6 @@
>>  #include "libavutil/intreadwrite.h"
>>  #include "libavutil/mem_internal.h"
>>
>> -typedef struct {
>> -    const char *name;
>> -    int size;
>> -} test;
>> -
>>  #define randomize_buffers(size) \
>>      do { \
>>          int i; \
>> @@ -58,18 +53,18 @@ do { \
>>      } while (0)
>>
>>  static void check_fill(BlockDSPContext *h){
>> -    const test tests[] = {
>> -        {"fill_block_tab[0]", 16},
>> -        {"fill_block_tab[1]", 8},
>> -    };
>> -    LOCAL_ALIGNED_32(uint8_t, buf0, [16 * 16]);
>> -    LOCAL_ALIGNED_32(uint8_t, buf1, [16 * 16]);
>> +    LOCAL_ALIGNED_16(uint8_t, buf0_16, [16 * 16]);
>> +    LOCAL_ALIGNED_16(uint8_t, buf1_16, [16 * 16]);
>> +    LOCAL_ALIGNED_8(uint8_t, buf0_8, [8 * 8]);
>> +    LOCAL_ALIGNED_8(uint8_t, buf1_8, [8 * 8]);
>>
>> -    for (size_t t = 0; t < FF_ARRAY_ELEMS(tests); ++t) {
>> -        int n = tests[t].size;
>> +    for (int t = 0; t < 2; ++t) {
>> +        uint8_t *buf0 = t ? buf0_8 : buf0_16;
>> +        uint8_t *buf1 = t ? buf1_8 : buf1_16;
>> +        int n = 16 - 8 * t;
>>          declare_func(void, uint8_t *block, uint8_t value,
>>                       ptrdiff_t line_size, int h);
>> -        if (check_func(h->fill_block_tab[t], "blockdsp.%s", tests[t].name)) {
>> +        if (check_func(h->fill_block_tab[t], "blockdsp.fill_block_tab[%d]", t)) {
>>              uint8_t value = rnd();
>>              memset(buf0, 0, sizeof(*buf0) * n * n);
>>              memset(buf1, 0, sizeof(*buf0) * n * n);
>
> 1. I wouldn't be surprised if the *_8 buffers were still 16 byte
> aligned. You should probably force 8 byte alignment by using a 16
> byte-aligned buffer with an offset of eight.

Amended the following locally:

> diff --git a/tests/checkasm/blockdsp.c b/tests/checkasm/blockdsp.c
> index 8c1f8281d2..5f4d46b8fa 100644
> --- a/tests/checkasm/blockdsp.c
> +++ b/tests/checkasm/blockdsp.c
> @@ -55,12 +55,10 @@ do { \
>  static void check_fill(BlockDSPContext *h){
>      LOCAL_ALIGNED_16(uint8_t, buf0_16, [16 * 16]);
>      LOCAL_ALIGNED_16(uint8_t, buf1_16, [16 * 16]);
> -    LOCAL_ALIGNED_8(uint8_t, buf0_8, [8 * 8]);
> -    LOCAL_ALIGNED_8(uint8_t, buf1_8, [8 * 8]);
>
>      for (int t = 0; t < 2; ++t) {
> -        uint8_t *buf0 = t ? buf0_8 : buf0_16;
> -        uint8_t *buf1 = t ? buf1_8 : buf1_16;
> +        uint8_t *buf0 = buf0_16 + t * /* force 8 byte alignment */ 8;
> +        uint8_t *buf1 = buf1_16 + t * /* force 8 byte alignment */ 8;
>          int n = 16 - 8 * t;
>          declare_func(void, uint8_t *block, uint8_t value,
>                       ptrdiff_t line_size, int h);

> 2. Can you also extend this test to actually test the case of stride !=
> width? (And negative strides.)

Maybe later.

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".