From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTPS id DC75C4DD98
	for <ffmpegdev@gitmailbox.com>; Fri, 25 Apr 2025 08:29:19 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2DA59687DB9;
	Fri, 25 Apr 2025 11:29:14 +0300 (EEST)
Received: from mail-lf1-f48.google.com (mail-lf1-f48.google.com
 [209.85.167.48])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C31DC687D52
 for <ffmpeg-devel@ffmpeg.org>; Fri, 25 Apr 2025 11:29:07 +0300 (EEST)
Received: by mail-lf1-f48.google.com with SMTP id
 2adb3069b0e04-54b166fa41bso2304511e87.0
 for <ffmpeg-devel@ffmpeg.org>; Fri, 25 Apr 2025 01:29:07 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=martin-st.20230601.gappssmtp.com; s=20230601; t=1745569747; x=1746174547;
 darn=ffmpeg.org; 
 h=mime-version:references:message-id:in-reply-to:subject:cc:to:from
 :date:from:to:cc:subject:date:message-id:reply-to;
 bh=NyvWVdiRq0wUjlPAqe8VkuvYHBelFcgPTjYJTeoXLsQ=;
 b=O3Vm9MGutXJy9Oc2nlxKRJFpLxiE+bTxsGhPbWaJqMlJc0I5twG7q89ZUcsNaJdO2b
 63vsy/pRoC1pkoAKwbw7eJaCrVa+NCZvc9fOVU6oTsoMCP/azYaecOMOaRJ9AppXRKda
 B2uESvlkPBMbLrquly8DDYlnRu4Fhlf75f6hVhmlgmIySERxTyxhnIg4BSm7DJmWgMC2
 xaDWwhyzpRHEPUl8ZkfEPy8SzuXlYOlAdF4LKOSFUppg/z+14903k3LQRI6KUsyRa9BH
 8lHNfgaeeqa4y88IlALOxS9TqeL01+erF5fWLstUa1W4WTvrr0x223R9nKzxJt7BGgT6
 v7Hg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1745569747; x=1746174547;
 h=mime-version:references:message-id:in-reply-to:subject:cc:to:from
 :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=NyvWVdiRq0wUjlPAqe8VkuvYHBelFcgPTjYJTeoXLsQ=;
 b=M6K1T5hMXyV4Nad6hI2AddjxAGu/OZuTQMErSzILVHGa3HWv238FOLTIcxtN85eIkm
 JxvULOqSHvJyoThQBoCoWEwPhVDfozJIo6LUSmPLZhi5+h3129RvG+jk5XBIB846O8WP
 PHE3zTA4x7uJFE1ARzkqA1BhnlpRefn/F+mi2kVoSPQyuMHPgk6s/EoDnPswLPZ09YzN
 a8XtQjUUo/235zQ/dzKrZHugAyP35y13uWBNJT/6bgFOo1OKlJUQIeRb5ybz3l0hQOFk
 o9m57oE1ov/nGJA8IiTaThMq/3dcqpoRXpqNi2aznvgyWtDRMKjeDNvAF5VrMvBdPpDP
 K32w==
X-Gm-Message-State: AOJu0YxOxXHZ0lQ6T4zkz/dKY9ZbgoWCHzX1WSSO3KR9bl9/nih29Y7w
 JW1Dfx4iD14XSZqWPsByo97n+xaVEYVr8KozOt8+35LYd9YU7gg/mi2Nk4zURv5nZBH2PF9DzoN
 kvQ==
X-Gm-Gg: ASbGncsW3P3aavVe+lBvGcjGHmjJnYMP53gURuhiOVaomIfFKjkn5go8j11ednSrvk9
 7fcAum3dmOICZwwcX4scu0yNrN1d8V1lA2LC9Tg3Ht54NzoHHWx0xRXE1szKt5xcC3skIzh8/GL
 0eKQTJLJcfdJRHkNHMMeZLJUy+GNvNAQ0ky/5hIt5cK5rlgXg+YFeOGVbpS4EQJxJllq2ypmYi7
 AzeBs4AX4i7frIYbdyHC1nweczsRk4IPecoBsYw3Mju7SDJDME1DK1mylw31Ig0PcocM5Y17VBF
 ClVbIzW6RD9BQxc3R473P0KQGOgxl6+oPOk5T/t8ekguPp+t8fAwZurSGQiFGbkMD/HfGpK80sa
 cQy4LQDJiOGMcMSf3PwgeHeLnaRDluiIfVdlS
X-Google-Smtp-Source: AGHT+IHLItj2ZxaMuFgphaa5z6HnjHMxytcBx+k6yd828459v+8/gm+VpIe0/tqrI8sQBXlHzIzY/Q==
X-Received: by 2002:a05:6512:2201:b0:545:285f:cd7f with SMTP id
 2adb3069b0e04-54e8cbd53e0mr362086e87.14.1745569746677; 
 Fri, 25 Apr 2025 01:29:06 -0700 (PDT)
Received: from tunnel335574-pt.tunnel.tserv24.sto1.ipv6.he.net
 (tunnel335574-pt.tunnel.tserv24.sto1.ipv6.he.net. [2001:470:27:11::2])
 by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-54e7cc9e9dbsm522283e87.145.2025.04.25.01.29.06
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 25 Apr 2025 01:29:06 -0700 (PDT)
Date: Fri, 25 Apr 2025 11:29:05 +0300 (EEST)
From: =?ISO-8859-15?Q?Martin_Storsj=F6?= <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
In-Reply-To: <tencent_E595C1A91CCEAE4550D30CC85533ECF37005@qq.com>
Message-ID: <baa3bb48-a390-96b0-d0ea-d9c39e1027a2@martin.st>
References: <tencent_E595C1A91CCEAE4550D30CC85533ECF37005@qq.com>
MIME-Version: 1.0
Subject: Re: [FFmpeg-devel] [PATCH] aarch64/h26x: Add
 put_hevc_pel_bi_w_pixels
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Zhao Zhili <zhilizhao@tencent.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/baa3bb48-a390-96b0-d0ea-d9c39e1027a2@martin.st/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

On Wed, 23 Apr 2025, Zhao Zhili wrote:

> From: Zhao Zhili <zhilizhao@tencent.com>
>
> On rpi5 (A76):
>
> put_hevc_pel_bi_w_pixels4_8_c:                          90.0 ( 1.00x)
> put_hevc_pel_bi_w_pixels4_8_neon:                       34.1 ( 2.64x)
> put_hevc_pel_bi_w_pixels6_8_c:                         188.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels6_8_neon:                       73.5 ( 2.56x)
> put_hevc_pel_bi_w_pixels8_8_c:                         327.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels8_8_neon:                       75.8 ( 4.32x)
> put_hevc_pel_bi_w_pixels12_8_c:                        728.8 ( 1.00x)
> put_hevc_pel_bi_w_pixels12_8_neon:                     186.1 ( 3.92x)
> put_hevc_pel_bi_w_pixels16_8_c:                       1288.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels16_8_neon:                     268.5 ( 4.80x)
> put_hevc_pel_bi_w_pixels24_8_c:                       2855.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels24_8_neon:                     723.8 ( 3.95x)
> put_hevc_pel_bi_w_pixels32_8_c:                       5095.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels32_8_neon:                    1165.0 ( 4.37x)
> put_hevc_pel_bi_w_pixels48_8_c:                      11521.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels48_8_neon:                    2856.0 ( 4.03x)
> put_hevc_pel_bi_w_pixels64_8_c:                      21020.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels64_8_neon:                    4699.1 ( 4.47x)
> ---
> libavcodec/aarch64/h26x/dsp.h             |   5 +
> libavcodec/aarch64/h26x/epel_neon.S       | 373 ++++++++++++++++++++++
> libavcodec/aarch64/hevcdsp_init_aarch64.c |  13 +
> 3 files changed, 391 insertions(+)

This looks good overall, thanks!

It's quite regrettable how many near-identical functions there are in 
the h26x qpel/epel code; ideally we should be able to produce most of 
these function variants from some sort of template instead of 
duplicating them all (with minor style differences).

// Martin
