Date: Mon, 3 Jun 2024 11:07:29 +0300 (EEST)
From: Martin Storsjö
To: FFmpeg development discussions and patches
Cc: Zhao Zhili
Subject: Re: [FFmpeg-devel] [PATCH 2/2] swscale/aarch64: Add rgb24 to yuv implementation
Message-ID: <43dcb452-d266-78f5-d232-228912d41a9b@martin.st>
References: <20240603071732.52523-1-quinkblack@foxmail.com>

On Mon, 3 Jun 2024, Zhao Zhili wrote:

> diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
> new file mode 100644
> index 0000000000..0a46475723
> --- /dev/null
> +++ b/libswscale/aarch64/input.S
> @@ -0,0 +1,229 @@
> +/*
> + * Copyright (c) 2024 Zhao Zhili
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/aarch64/asm.S"
> +
> +.macro rgb24_to_yuv_load_rgb, src
> + ld3.16b { v16, v17, v18 }, [\src]
> + ushll.8h v19, v16, #0 // v19: r
> + ushll.8h v20, v17, #0 // v20: g
> + ushll.8h v21, v18, #0 // v21: b
> + ushll2.8h v22, v16, #0 // v22: r
> + ushll2.8h v23, v17, #0 // v23: g
> + ushll2.8h v24, v18, #0 // v24: b

Don't use this nonstandard, Apple-specific aarch64 syntax. Apple's tools
used it early on, when the proper standardized aarch64 syntax wasn't
quite settled yet, and they still accept it. (Apparently it is also
still the preferred form for disassembly on Apple platforms.) With this
syntax, the assembly is rejected by GNU binutils and MSVC.

> +function ff_rgb24ToY_neon, export=1
> + cmp w4, #0 // check width > 0
> + b.le 4f
> +
> + ldp w10, w11, [x5], #8 // w10: ry, w11: gy
> + dup v0.8H, w10
> + dup v1.8H, w11
> + ldr w12, [x5] // w12: by
> + dup v2.8H, w12

Don't use uppercase .8H for the vector layout specifiers; we prefer to
stick to all lowercase here - see
184103b3105f02f1189fa0047af4269e027dfbd6. The same applies in a number
of other places in this patch.
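
For reference, here is a rough sketch (not assembled or tested here) of
how the lines quoted above might look in the standard syntax, with the
element layout attached to each register operand and written in
lowercase. The register numbers and the \src macro argument are taken
from the quoted patch; the rest of the macro and function is assumed to
stay unchanged:

```asm
        // Inside the rgb24_to_yuv_load_rgb macro:
        // de-interleave 16 RGB pixels, then zero-extend bytes to 16 bit.
        ld3             { v16.16b, v17.16b, v18.16b }, [\src]
        ushll           v19.8h, v16.8b,  #0     // v19: r, pixels 0-7
        ushll           v20.8h, v17.8b,  #0     // v20: g, pixels 0-7
        ushll           v21.8h, v18.8b,  #0     // v21: b, pixels 0-7
        ushll2          v22.8h, v16.16b, #0     // v22: r, pixels 8-15
        ushll2          v23.8h, v17.16b, #0     // v23: g, pixels 8-15
        ushll2          v24.8h, v18.16b, #0     // v24: b, pixels 8-15

        // In ff_rgb24ToY_neon: same dup instructions, lowercase layout.
        dup             v0.8h, w10              // w10: ry
        dup             v1.8h, w11              // w11: gy
        dup             v2.8h, w12              // w12: by
```

In this form, the layout suffix moves from the mnemonic onto the
operands, and the source operand of ushll/ushll2 names the narrow layout
(.8b for the low half, .16b for the high half). uxtl/uxtl2 are aliases
for ushll/ushll2 with a zero shift, if you prefer that spelling.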
> + add w9, w9, #1 // i++
> + add x3, x3, #6 // src += 6
> +3:
> + cmp w9, w5
> + b.lt 2b
> +4:

The indentation of the cmp/b.lt instructions here is incorrect.

I have set up a bunch of GitHub Actions workflows for testing aarch64
assembly - see https://github.com/mstorsjo/ffmpeg/commits/gha-aarch64.
If you have a GitHub account, grab a copy of this branch into your repo,
add your own commits on top, and push to your fork (and, if necessary,
enable running the actions); that should give your patches broad test
coverage.

See https://github.com/mstorsjo/FFmpeg/actions/runs/9346228714 for one
example run of these actions with your patches.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".