From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Cox
To: Martin Storsjö
Cc: thomas.mundt@hr.de, ffmpeg-devel@ffmpeg.org
Date: Mon, 03 Jul 2023 09:44:36 +0100
Message-ID: <2g25ai9rs4f7m0ej5slaua63m8kctgk6va@4ax.com>
References: <20230702123242.232484-1-jc@kynesim.co.uk>
 <2098e326-8016-ccc-ee90-364c6a5af182@martin.st>
In-Reply-To: <2098e326-8016-ccc-ee90-364c6a5af182@martin.st>
Subject: Re: [FFmpeg-devel] [PATCH v2 00/15] avfilter/vf_bwdif: Add aarch64 neon functions
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

On Mon, 3 Jul 2023 00:09:52 +0300 (EEST), you wrote:

>On Sun, 2 Jul 2023, John Cox wrote:
>
>> Also adds a filter_line3 method which on aarch64 neon yields approx 30%
>> speedup over 2xfilter_line and a memcpy
>>
>> Differences from v1:
>> .align 16 corrected to .balign 16
>> SXTW tolower
>> Mac ABI (hopefully) fixed
>> V register pop/push macroed & prettified
>>
>> John Cox (15):
>>   avfilter/vf_bwdif: Add outline for aarch neon functions
>>   avfilter/vf_bwdif: Add common macros and consts for aarch64 neon
>>   avfilter/vf_bwdif: Export C filter_intra
>>   avfilter/vf_bwdif: Add neon for filter_intra
>>   tests/checkasm: Add test for vf_bwdif filter_intra
>>   avfilter/vf_bwdif: Add clip and spatial macros for aarch64 neon
>>   avfilter/vf_bwdif: Export C filter_edge
>>   avfilter/vf_bwdif: Add neon for filter_edge
>>   tests/checkasm: Add test for vf_bwdif filter_edge
>>   avfilter/vf_bwdif: Export C filter_line
>>   avfilter/vf_bwdif: Add neon for filter_line
>>   avfilter/vf_bwdif: Add a filter_line3 method for optimisation
>>   avfilter/vf_bwdif: Add neon for filter_line3
>>   tests/checkasm: Add test for vf_bwdif filter_line3
>>   avfilter/vf_bwdif: Block filter slices into a multiple of 4 lines
>
>Overall, I'd suggest squashing/reordering the patches like this:
>
>- tests/checkasm: Add test for vf_bwdif filter_intra
>- avfilter/vf_bwdif: Add neon for filter_intra
>  (With the preceding patches squashed. For extra common macros, only add
>  the ones you use in this patch here.)
>- tests/checkasm: Add test for vf_bwdif filter_edge
>- avfilter/vf_bwdif: Add neon for filter_edge (with other dependencies
>  squashed)
>- avfilter/vf_bwdif: Add neon for filter_line
>- avfilter/vf_bwdif: Add a filter_line3 method for optimisation
>  + checkasm test squashed
>- avfilter/vf_bwdif: Add neon for filter_line3

I'm happy with that if everyone else is - it is easy to merge patches -
harder to take them apart.

JC

>// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".