Date: Sun, 2 Mar 2025 00:59:20 +0200 (EET)
From: Martin Storsjö
To: Krzysztof Pyrkosz via ffmpeg-devel
Cc: Krzysztof Pyrkosz
Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/aarch64/ac3dsp_neon.S: Optimize ac3_extract_exponents
Message-ID: <39fc9b98-6e97-a659-32f3-5060a32f60d6@martin.st>
In-Reply-To: <20250228212148.11560-2-ffmpeg@szaka.eu>

On Fri, 28 Feb 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:

> Before and after:
>
> A78
> ac3_extract_exponents_n512_neon:      503.2 ( 3.36x)
> ac3_extract_exponents_n3072_neon:    2986.2 ( 3.35x)
>
> ac3_extract_exponents_n512_neon:      211.2 ( 8.02x)
> ac3_extract_exponents_n3072_neon:    1251.5 ( 8.00x)
>
> A72
> ac3_extract_exponents_n512_neon:      964.7 ( 2.39x)
> ac3_extract_exponents_n3072_neon:    5434.5 ( 2.47x)
>
> ac3_extract_exponents_n512_neon:      465.6 ( 4.87x)
> ac3_extract_exponents_n3072_neon:    2696.3 ( 4.97x)
> ---
> This version handles 16 ints in one go and consolidates separate
> extractions and writes into one. I assume the length of the input is a
> multiple of 16 (there are no constraints defined in the template file),
> but the tests are passing.

I have no clue about whether this is ok or not (it may be good to check
other assembly implementations if we do this on e.g. x86).
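For illustration only, the multiple-of-16 assumption discussed above can be sketched in plain C. This is not the NEON code from the patch and not FFmpeg's reference implementation: the function names are hypothetical, and the exponent formula (24 for a zero coefficient, otherwise 23 minus floor(log2(|coef|)) for 24-bit input) is an assumption that should be checked against the C version in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: floor(log2(v)) for v > 0. */
static int ilog2_u32(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Scalar sketch, one exponent per coefficient.
 * Assumed formula: 24 for zero, else 23 - floor(log2(|coef|)).
 * Only meaningful for coefficients that fit in 24 bits. */
static void extract_exponents_scalar(uint8_t *exp, const int32_t *coef,
                                     int nb_coefs)
{
    for (int i = 0; i < nb_coefs; i++) {
        /* Absolute value computed in unsigned arithmetic to avoid overflow. */
        uint32_t v = coef[i] < 0 ? 0u - (uint32_t)coef[i] : (uint32_t)coef[i];
        exp[i] = v ? (uint8_t)(23 - ilog2_u32(v)) : 24;
    }
}

/* Walks the input in blocks of 16, the way a 16-wide SIMD loop would.
 * There is no tail handling, so the length has to be a multiple of 16;
 * the assert makes that assumption explicit. */
static void extract_exponents_by16(uint8_t *exp, const int32_t *coef,
                                   int nb_coefs)
{
    assert(nb_coefs % 16 == 0);
    for (int i = 0; i < nb_coefs; i += 16)
        extract_exponents_scalar(exp + i, coef + i, 16);
}
```

Without the assert, a length such as 520 would make a real 16-wide loop read and write 8 elements past the end of the buffers on its last iteration, which is why the constraint is worth confirming against the callers and the other architectures' implementations.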
Codewise, the patch looks good, thanks! This description of the patch, what
it does and the assumptions it makes, is probably nice to keep in the
final commit as well, so it could be included above "---" too.

// Martin