Date: Sun, 2 Mar 2025 01:03:44 +0200 (EET)
From: Martin Storsjö
To: Krzysztof Pyrkosz via ffmpeg-devel
Cc: Krzysztof Pyrkosz
Subject: Re: [FFmpeg-devel] [PATCH] swscale/aarch64/hscale.S Refactor hscale_16_to_15__fs_4
In-Reply-To: <20250301125859.113969-2-ffmpeg@szaka.eu>
References: <20250301125859.113969-2-ffmpeg@szaka.eu>
Message-ID: <34183bb-28d2-c340-bbf-dba42d724a87@martin.st>

On Sat, 1 Mar 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:

> Before/after:
>
> A78
> hscale_16_to_15__fs_4_dstW_8_neon:     86.8 ( 1.72x)
> hscale_16_to_15__fs_4_dstW_24_neon:   147.5 ( 2.73x)
> hscale_16_to_15__fs_4_dstW_128_neon:  614.0 ( 3.14x)
> hscale_16_to_15__fs_4_dstW_144_neon:  680.5 ( 3.18x)
> hscale_16_to_15__fs_4_dstW_256_neon: 1193.2 ( 3.19x)
> hscale_16_to_15__fs_4_dstW_512_neon: 2305.0 ( 3.27x)
>
> hscale_16_to_15__fs_4_dstW_8_neon:     86.0 ( 1.74x)
> hscale_16_to_15__fs_4_dstW_24_neon:   106.8 ( 3.78x)
> hscale_16_to_15__fs_4_dstW_128_neon:  404.0 ( 4.81x)
> hscale_16_to_15__fs_4_dstW_144_neon:  451.8 ( 4.80x)
> hscale_16_to_15__fs_4_dstW_256_neon:  760.5 ( 5.06x)
> hscale_16_to_15__fs_4_dstW_512_neon: 1520.0 ( 5.01x)
>
> A72
> hscale_16_to_15__fs_4_dstW_8_neon:    156.8 ( 1.52x)
> hscale_16_to_15__fs_4_dstW_24_neon:   217.8 ( 2.52x)
> hscale_16_to_15__fs_4_dstW_128_neon:  906.8 ( 2.90x)
> hscale_16_to_15__fs_4_dstW_144_neon: 1014.5 ( 2.91x)
> hscale_16_to_15__fs_4_dstW_256_neon: 1751.5 ( 2.96x)
> hscale_16_to_15__fs_4_dstW_512_neon: 3469.3 ( 2.97x)
>
> hscale_16_to_15__fs_4_dstW_8_neon:    151.2 ( 1.54x)
> hscale_16_to_15__fs_4_dstW_24_neon:   173.4 ( 3.15x)
> hscale_16_to_15__fs_4_dstW_128_neon:  660.0 ( 3.98x)
> hscale_16_to_15__fs_4_dstW_144_neon:  735.7 ( 4.00x)
> hscale_16_to_15__fs_4_dstW_256_neon: 1273.5 ( 4.09x)
> hscale_16_to_15__fs_4_dstW_512_neon: 2488.2 ( 4.16x)
> ---
>
> This patch removes the use of stack for temporary state and replaces
> interleaved ld4 loads with ld1.
> I'm aware the component is being deprecated, however in my use case
> (screen recording) the total time spent in this function is roughly 15%,
> the improvement is significant and worth sharing.

The patch looks good. I didn't follow it in exact detail, but it overall
looks reasonable, and looks much better than the previous form.

This description of what the patch does and why also is worth keeping in
the final commit message, but as there's no need to repost the patch, I
could just adjust the message myself before pushing it.

// Martin