From: Chris Phlipot
Date: Wed, 20 Jul 2022 19:30:46 -0700
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 5/5] avfilter/vf_yadif: Add x86_64 avx yadif asm
References: <20220720044117.1282961-1-cphlipot0@gmail.com> <20220720044117.1282961-5-cphlipot0@gmail.com> <20220720131650.GX2088045@pb2>
In-Reply-To: <20220720131650.GX2088045@pb2>

Thanks for calling that out. It looks like I was cross-compiling for
32-bit incorrectly from my 64-bit host. I've reproduced the failure and
submitted a v2 with the fix.

If you're still seeing build failures after v2, can you also provide
more details on how you are building so I can reproduce and fix them?

- Chris

On Wed, Jul 20, 2022 at 6:17 AM Michael Niedermayer wrote:

> On Tue, Jul 19, 2022 at 09:41:17PM -0700, Chris Phlipot wrote:
> > Add a new version of yadif_filter_line performed using packed bytes
> > instead of the packed words used by the current implementation. As
> > a result this implementation runs almost 2x as fast as the current
> > fastest SSSE3 implementation.
> >
> > This implementation is created from scratch based on the C code, with
> > the goal of keeping all intermediate values within 8 bits so that
> > the vectorized code can be computed using packed bytes. Differences
> > are as follows:
> > - Use algorithms to compute avg and abs difference using only 8-bit
> >   intermediate values.
> > - Reworked the mode 1 code by applying various mathematical identities
> >   to keep all intermediate values within 8 bits.
> > - Attempt to compute the spatial score using only 8 bits. The actual
> >   spatial score fits within this range 97% (content dependent) of the
> >   time for the entire 128-bit xmm vector. In the case that spatial
> >   score needs more than 8 bits to be represented, we detect this case,
> >   and recompute the spatial score using 16-bit packed words instead.
> >
> > In 3% of cases the spatial_score will need more than 8 bits to store
> > so we have a slow path, where the spatial score is computed using
> > packed words instead.
> >
> > This implementation is currently limited to x86_64 due to the number
> > of registers required. x86_32 is possible, but the performance benefit
> > over the existing SSSE3 implementation is not as great, due to all of the
> > stack spills that would result from having far fewer registers. ASM was
> > not generated for the 32-bit variant due to limited ROI, as most AVX
> > users are likely on 64-bit OS at this point and 32-bit users would
> > lose out on most of the performance benefit.
> >
> > Signed-off-by: Chris Phlipot
>
> there's no need to support 32-bit but ffmpeg build must not break
> on linux x86-32
>
> src/libavfilter/x86/vf_yadif_x64.asm:145: error: impossible combination of address sizes
> src/libavfilter/x86/vf_yadif_x64.asm:145: error: invalid effective address
> src/libavfilter/x86/vf_yadif_x64.asm:146: error: impossible combination of address sizes
> src//libavutil/x86/x86inc.asm:1399: ... from macro `movdqu' defined here
> src//libavutil/x86/x86inc.asm:1264: ... from macro `RUN_AVX_INSTR' defined here
> src//libavutil/x86/x86inc.asm:1717: ... from macro `vmovdqu' defined here
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Everything should be made as simple as possible, but not simpler.
> -- Albert Einstein
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
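
For readers following the thread: the quoted patch description mentions
computing the average and the absolute difference with only 8-bit
intermediate values. Below is a minimal scalar C sketch of two well-known
identities that achieve this. It is not taken from the patch, the asm may
well use different instructions or identities, and every function name
here (avg_floor_u8, avg_round_u8, sub_sat_u8, absdiff_u8,
count_mismatches) is hypothetical. C promotes the operands to int, but
each intermediate value stays in the range 0..255, which is the property
a packed-byte implementation would need.

```c
#include <stdint.h>

/* floor((a + b) / 2) without forming the 9-bit sum a + b. */
static uint8_t avg_floor_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a & b) + ((a ^ b) >> 1));
}

/* (a + b + 1) >> 1, the rounding that the x86 pavgb instruction uses. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a | b) - ((a ^ b) >> 1));
}

/* Unsigned saturating subtract: a scalar model of psubusb. */
static uint8_t sub_sat_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}

/* |a - b|: one saturating difference is zero, so OR selects the other. */
static uint8_t absdiff_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(sub_sat_u8(a, b) | sub_sat_u8(b, a));
}

/* Compare against a wide-integer reference for all 65536 input pairs;
 * returns the number of mismatching results. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            int d = a > b ? a - b : b - a;
            bad += avg_floor_u8((uint8_t)a, (uint8_t)b) != (a + b) / 2;
            bad += avg_round_u8((uint8_t)a, (uint8_t)b) != (a + b + 1) / 2;
            bad += absdiff_u8((uint8_t)a, (uint8_t)b) != d;
        }
    }
    return bad;
}
```

Calling count_mismatches() checks all three helpers exhaustively against
ordinary wide arithmetic. In a vectorized version each helper would
correspond to a handful of byte-wise instructions applied to all lanes
at once.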