From mboxrd@z Thu Jan  1 00:00:00 1970
From: James Almer <jamrial@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Tue,  1 Apr 2025 12:29:06 -0300
Message-ID: <20250401152906.2280-1-jamrial@gmail.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <Z-wDq8UYVsXlCZ3E@phare.normalesup.org>
References: <Z-wDq8UYVsXlCZ3E@phare.normalesup.org>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v2 1/2] avutil/aes_ctr: simplify and optimize
 av_aes_ctr_crypt()
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/20250401152906.2280-1-jamrial@gmail.com/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

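XOR the encrypted counter into the data in chunks of eight bytes on hosts
with fast 64-bit operations (HAVE_FAST_64BIT) and four bytes elsewhere,
instead of one byte at a time. Only a trailing partial block is still
processed bytewise. The block_offset field is removed, so every call to
av_aes_ctr_crypt() starts on a fresh counter block.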

before:
55561 decicycles in av_aes_ctr_crypt

after:
52204 decicycles in av_aes_ctr_crypt

Signed-off-by: James Almer <jamrial@gmail.com>
---
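Note for reviewers: the word-wise XOR can be checked against the bytewise
loop outside of FFmpeg. The following is only a minimal standalone sketch,
not the code in the patch. The names xor_bytes(), xor_block_words() and
xor_chunked() are hypothetical, memcpy() stands in for the unaligned
AV_RN64()/AV_WN64() accesses, and the keystream is assumed to be a
precomputed buffer rather than being regenerated per block with
av_aes_crypt().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16 /* same value as AES_BLOCK_SIZE */

/* Reference implementation: XOR one byte at a time. */
static void xor_bytes(uint8_t *dst, const uint8_t *src,
                      const uint8_t *keystream, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] ^ keystream[i];
}

/* XOR one full block in 64-bit words. memcpy() is used for the loads and
 * stores so that unaligned pointers are safe (assumption: it plays the
 * role of the unaligned read/write macros used in the patch). */
static void xor_block_words(uint8_t *dst, const uint8_t *src,
                            const uint8_t *keystream)
{
    for (size_t i = 0; i < BLOCK_SIZE; i += sizeof(uint64_t)) {
        uint64_t s, k;
        memcpy(&s, src + i, sizeof(s));
        memcpy(&k, keystream + i, sizeof(k));
        s ^= k;
        memcpy(dst + i, &s, sizeof(s));
    }
}

/* Full blocks go through the word-wise path, the tail is done bytewise.
 * keystream must provide at least n bytes. */
static void xor_chunked(uint8_t *dst, const uint8_t *src,
                        const uint8_t *keystream, size_t n)
{
    assert(dst && src && keystream);

    while (n >= BLOCK_SIZE) {
        xor_block_words(dst, src, keystream);
        dst       += BLOCK_SIZE;
        src       += BLOCK_SIZE;
        keystream += BLOCK_SIZE;
        n         -= BLOCK_SIZE;
    }
    xor_bytes(dst, src, keystream, n);
}
```

Since XOR acts on every bit independently, both paths give identical output
regardless of host endianness, which is why native-endian accesses are
enough here.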
 libavutil/aes_ctr.c | 47 ++++++++++++++++++++-------------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/libavutil/aes_ctr.c b/libavutil/aes_ctr.c
index c2d6d570e7..d720aa6aaf 100644
--- a/libavutil/aes_ctr.c
+++ b/libavutil/aes_ctr.c
@@ -24,6 +24,7 @@
 #include "aes_ctr.h"
 #include "aes.h"
 #include "aes_internal.h"
+#include "intreadwrite.h"
 #include "macros.h"
 #include "mem.h"
 #include "random_seed.h"
@@ -32,8 +33,7 @@
 
 typedef struct AVAESCTR {
     uint8_t counter[AES_BLOCK_SIZE];
-    uint8_t encrypted_counter[AES_BLOCK_SIZE];
-    int block_offset;
+    DECLARE_ALIGNED(8, uint8_t, encrypted_counter)[AES_BLOCK_SIZE];
     AVAES aes;
 } AVAESCTR;
 
@@ -46,13 +46,11 @@ void av_aes_ctr_set_iv(struct AVAESCTR *a, const uint8_t* iv)
 {
     memcpy(a->counter, iv, AES_CTR_IV_SIZE);
     memset(a->counter + AES_CTR_IV_SIZE, 0, sizeof(a->counter) - AES_CTR_IV_SIZE);
-    a->block_offset = 0;
 }
 
 void av_aes_ctr_set_full_iv(struct AVAESCTR *a, const uint8_t* iv)
 {
     memcpy(a->counter, iv, sizeof(a->counter));
-    a->block_offset = 0;
 }
 
 const uint8_t* av_aes_ctr_get_iv(struct AVAESCTR *a)
@@ -75,7 +73,6 @@ int av_aes_ctr_init(struct AVAESCTR *a, const uint8_t *key)
     av_aes_init(&a->aes, key, 128, 0);
 
     memset(a->counter, 0, sizeof(a->counter));
-    a->block_offset = 0;
 
     return 0;
 }
@@ -101,31 +98,29 @@ void av_aes_ctr_increment_iv(struct AVAESCTR *a)
 {
     av_aes_ctr_increment_be64(a->counter);
     memset(a->counter + AES_CTR_IV_SIZE, 0, sizeof(a->counter) - AES_CTR_IV_SIZE);
-    a->block_offset = 0;
 }
 
 void av_aes_ctr_crypt(struct AVAESCTR *a, uint8_t *dst, const uint8_t *src, int count)
 {
-    const uint8_t* src_end = src + count;
-    const uint8_t* cur_end_pos;
-    uint8_t* encrypted_counter_pos;
-
-    while (src < src_end) {
-        if (a->block_offset == 0) {
-            av_aes_crypt(&a->aes, a->encrypted_counter, a->counter, 1, NULL, 0);
-
-            av_aes_ctr_increment_be64(a->counter + 8);
-        }
-
-        encrypted_counter_pos = a->encrypted_counter + a->block_offset;
-        cur_end_pos = src + AES_BLOCK_SIZE - a->block_offset;
-        cur_end_pos = FFMIN(cur_end_pos, src_end);
-
-        a->block_offset += cur_end_pos - src;
-        a->block_offset &= (AES_BLOCK_SIZE - 1);
+    while (count >= AES_BLOCK_SIZE) {
+        av_aes_crypt(&a->aes, a->encrypted_counter, a->counter, 1, NULL, 0);
+        av_aes_ctr_increment_be64(a->counter + 8);
+#if HAVE_FAST_64BIT
+        for (int len = 0; len < AES_BLOCK_SIZE; len += 8)
+            AV_WN64(&dst[len], AV_RN64(&src[len]) ^ AV_RN64A(&a->encrypted_counter[len]));
+#else
+        for (int len = 0; len < AES_BLOCK_SIZE; len += 4)
+            AV_WN32(&dst[len], AV_RN32(&src[len]) ^ AV_RN32A(&a->encrypted_counter[len]));
+#endif
+        dst += AES_BLOCK_SIZE;
+        src += AES_BLOCK_SIZE;
+        count -= AES_BLOCK_SIZE;
+    }
 
-        while (src < cur_end_pos) {
-            *dst++ = *src++ ^ *encrypted_counter_pos++;
-        }
+    if (count > 0) {
+        av_aes_crypt(&a->aes, a->encrypted_counter, a->counter, 1, NULL, 0);
+        av_aes_ctr_increment_be64(a->counter + 8);
+        for (int len = 0; len < count; len++)
+            dst[len] = src[len] ^ a->encrypted_counter[len];
     }
 }
-- 
2.49.0
