Date: Fri, 25 Apr 2025 11:29:05 +0300 (EEST)
From: Martin Storsjö <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Zhao Zhili <zhilizhao@tencent.com>
Subject: Re: [FFmpeg-devel] [PATCH] aarch64/h26x: Add put_hevc_pel_bi_w_pixels
Message-ID: <baa3bb48-a390-96b0-d0ea-d9c39e1027a2@martin.st>
In-Reply-To: <tencent_E595C1A91CCEAE4550D30CC85533ECF37005@qq.com>
References: <tencent_E595C1A91CCEAE4550D30CC85533ECF37005@qq.com>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/baa3bb48-a390-96b0-d0ea-d9c39e1027a2@martin.st/>

On Wed, 23 Apr 2025, Zhao Zhili wrote:

> From: Zhao Zhili <zhilizhao@tencent.com>
>
> On rpi5 (A76):
>
> put_hevc_pel_bi_w_pixels4_8_c:         90.0 ( 1.00x)
> put_hevc_pel_bi_w_pixels4_8_neon:      34.1 ( 2.64x)
> put_hevc_pel_bi_w_pixels6_8_c:        188.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels6_8_neon:      73.5 ( 2.56x)
> put_hevc_pel_bi_w_pixels8_8_c:        327.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels8_8_neon:      75.8 ( 4.32x)
> put_hevc_pel_bi_w_pixels12_8_c:       728.8 ( 1.00x)
> put_hevc_pel_bi_w_pixels12_8_neon:    186.1 ( 3.92x)
> put_hevc_pel_bi_w_pixels16_8_c:      1288.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels16_8_neon:    268.5 ( 4.80x)
> put_hevc_pel_bi_w_pixels24_8_c:      2855.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels24_8_neon:    723.8 ( 3.95x)
> put_hevc_pel_bi_w_pixels32_8_c:      5095.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels32_8_neon:   1165.0 ( 4.37x)
> put_hevc_pel_bi_w_pixels48_8_c:     11521.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels48_8_neon:   2856.0 ( 4.03x)
> put_hevc_pel_bi_w_pixels64_8_c:     21020.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels64_8_neon:   4699.1 ( 4.47x)
> ---
>  libavcodec/aarch64/h26x/dsp.h             |   5 +
>  libavcodec/aarch64/h26x/epel_neon.S       | 373 ++++++++++++++++++++++
>  libavcodec/aarch64/hevcdsp_init_aarch64.c |  13 +
>  3 files changed, 391 insertions(+)

This looks good overall, thanks!

It's quite regrettable how many duplicates of near-identical functions
there are in the h26x qpel/epel code; ideally we should be able to
produce most of these function variants with some sort of template
instead of having them all duplicated (with minor style differences).

// Martin