From: John Cox
To: ffmpeg-devel@ffmpeg.org
Cc: thomas.mundt@hr.de, John Cox, martin@martin.st
Date: Sun, 2 Jul 2023 12:32:27 +0000
Message-Id: <20230702123242.232484-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v2 00/15] avfilter/vf_bwdif: Add aarch64 neon functions

Also adds a filter_line3 method which on aarch64 neon yields approx 30%
speedup over 2xfilter_line and a memcpy

Differences from v1:
.align 16 corrected to .balign 16
SXTW tolower
Mac ABI (hopefully) fixed
V register pop/push macroed & prettified

John Cox (15):
  avfilter/vf_bwdif: Add outline for aarch neon functions
  avfilter/vf_bwdif: Add common macros and consts for aarch64 neon
  avfilter/vf_bwdif: Export C filter_intra
  avfilter/vf_bwdif: Add neon for filter_intra
  tests/checkasm: Add test for vf_bwdif filter_intra
  avfilter/vf_bwdif: Add clip and spatial macros for aarch64 neon
  avfilter/vf_bwdif: Export C filter_edge
  avfilter/vf_bwdif: Add neon for filter_edge
  tests/checkasm: Add test for vf_bwdif filter_edge
  avfilter/vf_bwdif: Export C filter_line
  avfilter/vf_bwdif: Add neon for filter_line
  avfilter/vf_bwdif: Add a filter_line3 method for optimisation
  avfilter/vf_bwdif: Add neon for filter_line3
  tests/checkasm: Add test for vf_bwdif filter_line3
  avfilter/vf_bwdif: Block filter slices into a multiple of 4 lines

 libavfilter/aarch64/Makefile                |   2 +
 libavfilter/aarch64/vf_bwdif_init_aarch64.c | 125 ++++
 libavfilter/aarch64/vf_bwdif_neon.S         | 788 ++++++++++++++++++++
 libavfilter/bwdif.h                         |  20 +
 libavfilter/vf_bwdif.c                      |  70 +-
 tests/checkasm/vf_bwdif.c                   | 172 +++++
 6 files changed, 1162 insertions(+), 15 deletions(-)
 create mode 100644 libavfilter/aarch64/vf_bwdif_init_aarch64.c
 create mode 100644 libavfilter/aarch64/vf_bwdif_neon.S

-- 
2.39.2
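For readers unfamiliar with the "2xfilter_line and a memcpy" baseline mentioned above, the following is a rough sketch of what a three-line method might be composed of in plain C. It is not the code from this series: toy_filter_line and toy_filter_line3 are hypothetical names, and the per-line filter is a trivial average standing in for the real (much more elaborate) bwdif filter_line. It only illustrates the shape: one interpolated line, one line copied straight from the current field, one more interpolated line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a per-line deinterlacing filter: it just
 * averages the lines above and below with rounding. The real bwdif
 * filter_line takes more inputs and does considerably more work. */
static void toy_filter_line(uint8_t *dst, const uint8_t *above,
                            const uint8_t *below, int w)
{
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((above[x] + below[x] + 1) >> 1);
}

/* Assumed reference composition for three consecutive output lines:
 *   line 0: interpolated from the field lines at cur - s_stride and
 *           cur + s_stride
 *   line 1: copied unchanged from cur + s_stride
 *   line 2: interpolated from cur + s_stride and cur + 3 * s_stride
 * cur points at the (missing) source line that corresponds to output
 * line 0. */
static void toy_filter_line3(uint8_t *dst, ptrdiff_t d_stride,
                             const uint8_t *cur, ptrdiff_t s_stride, int w)
{
    assert(w >= 0);
    toy_filter_line(dst, cur - s_stride, cur + s_stride, w);
    memcpy(dst + d_stride, cur + s_stride, (size_t)w);
    toy_filter_line(dst + 2 * d_stride, cur + s_stride,
                    cur + 3 * s_stride, w);
}
```

A fused implementation can keep the shared middle line in registers across all three outputs instead of reloading it, which is one plausible source of a gain over three separate calls; the sketch makes no claim about where the measured 30% actually comes from.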