From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.ffmpeg.org (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTPS id BF77B4B7D1 for ; Sat, 25 Oct 2025 17:00:00 +0000 (UTC) Authentication-Results: ffbox; dkim=fail (body hash mismatch (got b'a2/b+Wz0jD6VIU17CnzXMIIwU1x0ni7vRlA6lwiMHnI=', expected b'BmQhaGZeMliZgkG4vXVQ6axq11RbkGYwpdWvxmKbnIw=')) header.d=ffmpeg.org header.i=@ffmpeg.org header.a=rsa-sha256 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ffmpeg.org; i=@ffmpeg.org; q=dns/txt; s=mail; t=1761411568; h=mime-version : to : date : message-id : reply-to : subject : list-id : list-archive : list-archive : list-help : list-owner : list-post : list-subscribe : list-unsubscribe : from : cc : content-type : content-transfer-encoding : from; bh=a2/b+Wz0jD6VIU17CnzXMIIwU1x0ni7vRlA6lwiMHnI=; b=AEcQ3S0orAL0MBz3rhm8BWPh9f4O9/vS+DwoeQoNPUDYDBf4mwTWR24cbqen5/hu9ybfY TqGIrbE8109VnRwnv0Ycd2YwH2gKrRdtQ3TUv3QYgbC4kagPy76T4lDnB2LhpGcwCJdlFsB hs3Z5oOei4y2brh+H/K7d+XeCUC32ctgTF5ru0yW7RQP0L1gSnVtxSDSaXr1WgZqoqKmxb6 3u8+e2W4az7D/j5Rqmb5fm3dYkxEeZNdbxsdMOcnBGqV77kdJxuHrFK91LHquSwsQWdqkOq uTMNawXDqM4/gCx/RhXcZcQ+Sx1QG0d1jrMWktQtsN42H9hVsDs6B4umlMIA== Received: from [172.19.0.2] (unknown [172.19.0.2]) by ffbox0-bg.ffmpeg.org (Postfix) with ESMTP id 8EF2368F52D; Sat, 25 Oct 2025 19:59:28 +0300 (EEST) ARC-Seal: i=1; cv=none; a=rsa-sha256; d=ffmpeg.org; s=arc; t=1761411565; b=Zt4VULqY7kyA9638YXk/lhyrU0W1Ws4xxIgH8FGYi+yYnekywm5TJ0lAaNysvcHsxCOGD 9wtDSMpEJHgUCr7flRJdI9YO7NxQ7QLI/IPUOOJb78gCfTOzTb29RNritsmpV8V9uJGTNEK zDzEQ+BB02nsKWMg61cCpsURZZFw7zpsjP4QQk+VkfTGHX+QJOf74gcwu2okaOVKvE8pBTd JudRs/2K9I4AMc4VQREP4o0ClRiCFpZWbkDg3jZ3K7NMN8Xq7RURdd+n/YWRZHHDve+NjGB pHQR4qtvlVWg+EgHIouK7wiHQa80K5XLlQCL57h0ASq15/HFrIgMLweA6fXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=ffmpeg.org; s=arc; t=1761411565; h=from : sender : reply-to : subject : date : message-id : to : cc : mime-version : content-type : 
content-transfer-encoding : content-id : content-description : resent-date : resent-from : resent-sender : resent-to : resent-cc : resent-message-id : in-reply-to : references : list-id : list-help : list-unsubscribe : list-subscribe : list-post : list-owner : list-archive; bh=I3COP7G5De+sfyZMv73nV6CUUUdhiu745WewiK6g1sQ=; b=OnM0xS2QkVKCvk0URJ68lrBn3syi+Q/1af5/c5B8V6WBOPdq9cJJ8lcvdG4ACOXv9mSbb XsBkajb1HArISNybMH1aBWuTtcmM/UG8NyPxNJ00HJAZbARo9OwL70358fj50sBL0lROVUA 2cMxiliSq7Z8JcDuaG+fZ1qChx+fg9DaQcaf2YC1Jn/BkyswZNZG8o9RBWd0dA7hlhWGW3J a62vYBe6eGtF0UkdPyvfcbynNOXvSeJpWjttaFmePR08OzZHLol/vy/1sdQuAcE9oGxmdC8 cJTOZOvSSmjov0nkx+s4RpHn82yQZb74mpKSI2EbM6KtgjHoXxI04O4ja7cw== ARC-Authentication-Results: i=1; ffmpeg.org; dkim=pass header.d=ffmpeg.org header.i=@ffmpeg.org; arc=none; dmarc=pass header.from=ffmpeg.org policy.dmarc=quarantine Authentication-Results: ffmpeg.org; dkim=pass header.d=ffmpeg.org header.i=@ffmpeg.org; arc=none (Message is not ARC signed); dmarc=pass (Used From Domain Record) header.from=ffmpeg.org policy.dmarc=quarantine DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ffmpeg.org; i=@ffmpeg.org; q=dns/txt; s=mail; t=1761411559; h=content-type : mime-version : content-transfer-encoding : from : to : reply-to : subject : date : from; bh=BmQhaGZeMliZgkG4vXVQ6axq11RbkGYwpdWvxmKbnIw=; b=ET9qB+xlLIZ310LLCAMX6HgvcwY5xepecTzWjkpCgdxZdlmf6+Uu1VVDZjJXJ29CB20fd OXZSqZgKiJ0+D7JsCTVLLMieTxm1b9w4UkqXb4Aq7G7qmS3Q1p8gIG7CalK9gEHzK8JPUCg X/OSX+zSXtcOdiJfd44WI/9LJUUORB+hy6sNwRN0r6X0CeXJdbsVccTXJS/Kw37DIRXhYTg DXTsLs7c6807oh77JydUsDx6s4DncbbPMhQGzlw3dpREjbsoF25AMgl2dFb61lhGI9JuG7R c8pYMUxd7gBCQxNDAw3kKRv20kCF7a7F4ieFPoeeHPj71SGIfIlxEuBWP4qA== Received: from 547bf0a948a1 (code.ffmpeg.org [188.245.149.3]) by ffbox0-bg.ffmpeg.org (Postfix) with ESMTPS id A578968F491 for ; Sat, 25 Oct 2025 19:59:19 +0300 (EEST) MIME-Version: 1.0 To: ffmpeg-devel@ffmpeg.org Date: Sat, 25 Oct 2025 16:59:19 -0000 Message-ID: <176141155986.25.6618563836001670415@7d278768979e> 
Message-ID-Hash: C3WVFZSY3CKFDU7WDCIKZUVUHCYT4NWX
X-Message-ID-Hash: C3WVFZSY3CKFDU7WDCIKZUVUHCYT4NWX
X-MailFrom: code@ffmpeg.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; header-match-ffmpeg-devel.ffmpeg.org-0; header-match-ffmpeg-devel.ffmpeg.org-1; header-match-ffmpeg-devel.ffmpeg.org-2; header-match-ffmpeg-devel.ffmpeg.org-3; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
X-Mailman-Version: 3.3.10
Precedence: list
Reply-To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] avutil/crc: add x86 SSE4.2 clmul SIMD implementation for av_crc (PR #20751)
List-Id: FFmpeg development discussions and patches
Archived-At:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:
From: Shreesh Adiga via ffmpeg-devel
Cc: Shreesh Adiga
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

PR #20751 opened by Shreesh Adiga (tantei3)
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20751
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20751.patch

Implemented the algorithm described in the Intel paper "Fast CRC
Computation for Generic Polynomials Using PCLMULQDQ Instruction".

Added function pointer indirection for av_crc, av_crc_get_table and
av_crc_init, so that the existing public API is unchanged while x86 can
install its own implementations.

Added a checkasm test that checks every input length from 0 to 200, to
cover the different loops in the ASM, with different initial CRC values
and for all the CRC types supported by libavutil.
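
For readers skimming the description, the function pointer indirection and
the all-lengths check can be sketched roughly as follows. This is not the
code from the patch: it is a minimal C sketch, assuming a plain reflected
CRC-32 (polynomial 0xEDB88320, no final XOR). The names crc32_bitwise,
crc32_table, crc32_dispatch, crc32_func, crc32_public, cpu_has_fast_path
and check_all_lengths are all hypothetical, and the table-driven routine
merely stands in for a SIMD implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference: bit-at-a-time reflected CRC-32, no final XOR. */
static uint32_t crc32_bitwise(uint32_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Stand-in for an optimized routine: byte-wise table lookup. */
static uint32_t table[256];

static void build_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int b = 0; b < 8; b++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        table[i] = c;
    }
}

static uint32_t crc32_table(uint32_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        crc = table[(crc ^ buf[i]) & 0xFF] ^ (crc >> 8);
    return crc;
}

/* Hypothetical CPU feature check; always reports the fast path here. */
static int cpu_has_fast_path(void) { return 1; }

static uint32_t crc32_dispatch(uint32_t crc, const uint8_t *buf, size_t len);

/* The pointer starts at the dispatcher and is replaced on first use. */
static uint32_t (*crc32_func)(uint32_t, const uint8_t *, size_t) = crc32_dispatch;

static uint32_t crc32_dispatch(uint32_t crc, const uint8_t *buf, size_t len)
{
    if (cpu_has_fast_path()) {
        build_table();
        crc32_func = crc32_table;
    } else {
        crc32_func = crc32_bitwise;
    }
    return crc32_func(crc, buf, len);
}

/* Public entry point: its signature never changes, only the target does. */
uint32_t crc32_public(uint32_t crc, const uint8_t *buf, size_t len)
{
    return crc32_func(crc, buf, len);
}

/* Compare both routines for every length 0..200 and a few initial values. */
int check_all_lengths(void)
{
    static const uint32_t seeds[] = { 0, 0xFFFFFFFFu, 0x12345678u };
    uint8_t buf[200];

    for (size_t i = 0; i < sizeof(buf); i++)
        buf[i] = (uint8_t)(i * 31 + 7);
    for (size_t s = 0; s < sizeof(seeds) / sizeof(seeds[0]); s++)
        for (size_t len = 0; len <= sizeof(buf); len++)
            if (crc32_public(seeds[s], buf, len) !=
                crc32_bitwise(seeds[s], buf, len))
                return 0;
    return 1;
}
```

Calling check_all_lengths() from a small main() should return 1 when the
two routines agree. The sketch does not guard the first call against
concurrent callers; real code would likely want a once-style
initialization around the pointer swap.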

Observed near 10x speedup on AMD Zen4 7950x:
av_crc_c: 22057.0 ( 1.00x)
av_crc_clmul: 2202.8 (10.01x)

Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>


>>From 8d9e0e77250906f89f01766c5866629d410a4828 Mon Sep 17 00:00:00 2001
From: Shreesh Adiga <16567adigashreesh@gmail.com>
Date: Sat, 25 Oct 2025 21:48:15 +0530
Subject: [PATCH] avutil/crc: add x86 SSE4.2 clmul SIMD implementation for av_crc

Implemented the algorithm described in the Intel paper "Fast CRC
Computation for Generic Polynomials Using PCLMULQDQ Instruction".

Added function pointer indirection for av_crc, av_crc_get_table and
av_crc_init, so that the existing public API is unchanged while x86 can
install its own implementations.

Added a checkasm test that checks every input length from 0 to 200, to
cover the different loops in the ASM, with different initial CRC values
and for all the CRC types supported by libavutil.

Observed near 10x speedup on AMD Zen4 7950x:
av_crc_c: 22057.0 ( 1.00x)
av_crc_clmul: 2202.8 (10.01x)

Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>
---
 configure                 |   4 +
 libavutil/cpu.c           |   1 +
 libavutil/cpu.h           |   1 +
 libavutil/crc.c           |  77 +++++++-
 libavutil/tests/cpu.c     |   1 +
 libavutil/x86/Makefile    |  16 +-
 libavutil/x86/cpu.c       |   2 +
 libavutil/x86/cpu.h       |   2 +
 libavutil/x86/crc.asm     | 297 +++++++++++++++++++++++++++++
 libavutil/x86/crc.h       |  38 ++++
 libavutil/x86/crc_init.c  | 233 +++++++++++++++++++++++
 tests/checkasm/Makefile   |   1 +
 tests/checkasm/checkasm.c |   2 +
 tests/checkasm/checkasm.h |   1 +
 tests/checkasm/crc.c      | 386 ++++++++++++++++++++++++++++++++++++++
 15 files changed, 1046 insertions(+), 16 deletions(-)
 create mode 100644 libavutil/x86/crc.asm
 create mode 100644 libavutil/x86/crc.h
 create mode 100644 libavutil/x86/crc_init.c
 create mode 100644 tests/checkasm/crc.c

diff --git a/configure b/configure
index ed4f8c4a94..8a74691d16 100755
--- a/configure
+++ b/configure
@@ -469,6 +469,7 @@ Optimization options (experts only):
 --disable-avx512 disable AVX-512 optimizations
 --disable-avx512icl disable AVX-512ICL optimizations
--disable-aesni disable AESNI optimizations + --disable-clmul disable CLMUL optimizations --disable-armv5te disable armv5te optimizations --disable-armv6 disable armv6 optimizations --disable-armv6t2 disable armv6t2 optimizations @@ -2252,6 +2253,7 @@ ARCH_EXT_LIST_WASM=" ARCH_EXT_LIST_X86_SIMD=" aesni + clmul amd3dnow amd3dnowext avx @@ -2871,6 +2873,7 @@ ssse3_deps="sse3" sse4_deps="ssse3" sse42_deps="sse4" aesni_deps="sse42" +clmul_deps="sse42" avx_deps="sse42" xop_deps="avx" fma3_deps="avx" @@ -8180,6 +8183,7 @@ if enabled x86; then echo "SSE enabled ${sse-no}" echo "SSSE3 enabled ${ssse3-no}" echo "AESNI enabled ${aesni-no}" + echo "CLMUL enabled ${clmul-no}" echo "AVX enabled ${avx-no}" echo "AVX2 enabled ${avx2-no}" echo "AVX-512 enabled ${avx512-no}" diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 8f9b785ebc..0ddbc50da5 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -149,6 +149,7 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) { "3dnowext", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_3DNOWEXT }, .unit = "flags" }, { "cmov", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_CMOV }, .unit = "flags" }, { "aesni", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_AESNI }, .unit = "flags" }, + { "clmul", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_CLMUL }, .unit = "flags" }, { "avx512" , NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_AVX512 }, .unit = "flags" }, { "avx512icl", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_AVX512ICL }, .unit = "flags" }, { "slowgather", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_SLOW_GATHER }, .unit = "flags" }, diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 5ef5da58eb..c63718d10f 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -45,6 +45,7 @@ #define AV_CPU_FLAG_SSE4 0x0100 ///< Penryn SSE4.1 functions #define AV_CPU_FLAG_SSE42 0x0200 ///< Nehalem SSE4.2 functions #define AV_CPU_FLAG_AESNI 0x80000 ///< Advanced Encryption Standard functions +#define AV_CPU_FLAG_CLMUL 0x400000 
///< Carry-less Multiplication instruction #define AV_CPU_FLAG_AVX 0x4000 ///< AVX functions: requires OS support even if YMM registers aren't used #define AV_CPU_FLAG_AVXSLOW 0x8000000 ///< AVX supported, but slow when using YMM registers (e.g. Bulldozer) #define AV_CPU_FLAG_XOP 0x0400 ///< Bulldozer XOP functions diff --git a/libavutil/crc.c b/libavutil/crc.c index 703b56f4e0..52c604625a 100644 --- a/libavutil/crc.c +++ b/libavutil/crc.c @@ -25,6 +25,7 @@ #include "bswap.h" #include "crc.h" #include "error.h" +#include "x86/crc.h" #if CONFIG_HARDCODED_TABLES static const AVCRC av_crc_table[AV_CRC_MAX][257] = { @@ -319,11 +320,13 @@ static const AVCRC av_crc_table[AV_CRC_MAX][257] = { #endif static AVCRC av_crc_table[AV_CRC_MAX][CRC_TABLE_SIZE]; -#define DECLARE_CRC_INIT_TABLE_ONCE(id, le, bits, poly) \ -static AVOnce id ## _once_control = AV_ONCE_INIT; \ -static void id ## _init_table_once(void) \ -{ \ - av_assert0(av_crc_init(av_crc_table[id], le, bits, poly, sizeof(av_crc_table[id])) >= 0); \ +static int av_crc_init_c(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size); + +#define DECLARE_CRC_INIT_TABLE_ONCE(id, le, bits, poly) \ +static AVOnce id ## _once_control = AV_ONCE_INIT; \ +static void id ## _init_table_once(void) \ +{ \ + av_assert0(av_crc_init_c(av_crc_table[id], le, bits, poly, sizeof(av_crc_table[id])) >= 0); \ } #define CRC_INIT_TABLE_ONCE(id) ff_thread_once(&id ## _once_control, id ## _init_table_once) @@ -338,7 +341,7 @@ DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_32_IEEE_LE, 1, 32, 0xEDB88320) DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_16_ANSI_LE, 1, 16, 0xA001) #endif -int av_crc_init(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size) +static int av_crc_init_c(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size) { unsigned i, j; uint32_t c; @@ -371,7 +374,7 @@ int av_crc_init(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size) return 0; } -const AVCRC *av_crc_get_table(AVCRCId crc_id) +static const AVCRC *av_crc_get_table_c(AVCRCId 
crc_id) { #if !CONFIG_HARDCODED_TABLES switch (crc_id) { @@ -389,8 +392,8 @@ const AVCRC *av_crc_get_table(AVCRCId crc_id) return av_crc_table[crc_id]; } -uint32_t av_crc(const AVCRC *ctx, uint32_t crc, - const uint8_t *buffer, size_t length) +static uint32_t av_crc_c(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length) { const uint8_t *end = buffer + length; @@ -413,3 +416,59 @@ uint32_t av_crc(const AVCRC *ctx, uint32_t crc, return crc; } + +static const AVCRC *av_crc_get_table_dispatch(AVCRCId crc_id); +static uint32_t av_crc_dispatch(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length); +static int av_crc_init_dispatch(AVCRC *ctx, int le, int bits, + uint32_t poly, int ctx_size); + +const AVCRC *(*av_crc_get_table_func)(AVCRCId crc_id) = av_crc_get_table_dispatch; +uint32_t (*av_crc_func)(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length) = av_crc_dispatch; +int (*av_crc_init_func)(AVCRC *ctx, int le, int bits, + uint32_t poly, int ctx_size) = av_crc_init_dispatch; + +void av_crc_init_fn(void) { + av_crc_get_table_func = av_crc_get_table_c; + av_crc_func = av_crc_c; + av_crc_init_func = av_crc_init_c; +#if ARCH_X86 + av_crc_init_x86(); +#endif +} + +static const AVCRC *av_crc_get_table_dispatch(AVCRCId crc_id) +{ + av_crc_init_fn(); + return av_crc_get_table_func(crc_id); +} + +static uint32_t av_crc_dispatch(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length) +{ + av_crc_init_fn(); + return av_crc_func(ctx, crc, buffer, length); +} + +static int av_crc_init_dispatch(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size) +{ + av_crc_init_fn(); + return av_crc_init_func(ctx, le, bits, poly, ctx_size); +} + +uint32_t av_crc(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length) +{ + return av_crc_func(ctx, crc, buffer, length); +} + +const AVCRC *av_crc_get_table(AVCRCId crc_id) +{ + return av_crc_get_table_func(crc_id); +} + +int av_crc_init(AVCRC *ctx, int le, 
int bits, uint32_t poly, int ctx_size) +{ + return av_crc_init_func(ctx, le, bits, poly, ctx_size); +} diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c index fd2e32901d..6f8a0be2c3 100644 --- a/libavutil/tests/cpu.c +++ b/libavutil/tests/cpu.c @@ -88,6 +88,7 @@ static const struct { { AV_CPU_FLAG_BMI1, "bmi1" }, { AV_CPU_FLAG_BMI2, "bmi2" }, { AV_CPU_FLAG_AESNI, "aesni" }, + { AV_CPU_FLAG_CLMUL, "clmul" }, { AV_CPU_FLAG_AVX512, "avx512" }, { AV_CPU_FLAG_AVX512ICL, "avx512icl" }, { AV_CPU_FLAG_SLOW_GATHER, "slowgather" }, diff --git a/libavutil/x86/Makefile b/libavutil/x86/Makefile index 8cfd646108..8a721ecd05 100644 --- a/libavutil/x86/Makefile +++ b/libavutil/x86/Makefile @@ -1,5 +1,6 @@ OBJS += x86/aes_init.o \ x86/cpu.o \ + x86/crc_init.o \ x86/fixed_dsp_init.o \ x86/float_dsp_init.o \ x86/imgutils_init.o \ @@ -12,12 +13,13 @@ OBJS-$(CONFIG_PIXELUTILS) += x86/pixelutils_init.o \ EMMS_OBJS_$(HAVE_MMX_INLINE)_$(HAVE_MMX_EXTERNAL)_$(HAVE_MM_EMPTY) = x86/emms.o X86ASM-OBJS += x86/aes.o \ - x86/cpuid.o \ - $(EMMS_OBJS__yes_) \ - x86/fixed_dsp.o \ - x86/float_dsp.o \ - x86/imgutils.o \ - x86/lls.o \ - x86/tx_float.o \ + x86/cpuid.o \ + x86/crc.o \ + $(EMMS_OBJS__yes_) \ + x86/fixed_dsp.o \ + x86/float_dsp.o \ + x86/imgutils.o \ + x86/lls.o \ + x86/tx_float.o \ X86ASM-OBJS-$(CONFIG_PIXELUTILS) += x86/pixelutils.o \ diff --git a/libavutil/x86/cpu.c b/libavutil/x86/cpu.c index 1a592f3bf4..5563f6cc3b 100644 --- a/libavutil/x86/cpu.c +++ b/libavutil/x86/cpu.c @@ -121,6 +121,8 @@ int ff_get_cpu_flags_x86(void) rval |= AV_CPU_FLAG_SSE2; if (ecx & 1) rval |= AV_CPU_FLAG_SSE3; + if (ecx & 0x2) + rval |= AV_CPU_FLAG_CLMUL; if (ecx & 0x00000200 ) rval |= AV_CPU_FLAG_SSSE3; if (ecx & 0x00080000 ) diff --git a/libavutil/x86/cpu.h b/libavutil/x86/cpu.h index 00e82255b1..af081b2ed8 100644 --- a/libavutil/x86/cpu.h +++ b/libavutil/x86/cpu.h @@ -44,6 +44,7 @@ #define X86_FMA4(flags) CPUEXT(flags, FMA4) #define X86_AVX2(flags) CPUEXT(flags, AVX2) #define X86_AESNI(flags) 
CPUEXT(flags, AESNI) +#define X86_CLMUL(flags) CPUEXT(flags, CLMUL) #define X86_AVX512(flags) CPUEXT(flags, AVX512) #define EXTERNAL_MMX(flags) CPUEXT_SUFFIX(flags, _EXTERNAL, MMX) @@ -72,6 +73,7 @@ #define EXTERNAL_AVX2_FAST(flags) CPUEXT_SUFFIX_FAST2(flags, _EXTERNAL, AVX2, AVX) #define EXTERNAL_AVX2_SLOW(flags) CPUEXT_SUFFIX_SLOW2(flags, _EXTERNAL, AVX2, AVX) #define EXTERNAL_AESNI(flags) CPUEXT_SUFFIX(flags, _EXTERNAL, AESNI) +#define EXTERNAL_CLMUL(flags) CPUEXT_SUFFIX(flags, _EXTERNAL, CLMUL) #define EXTERNAL_AVX512(flags) CPUEXT_SUFFIX(flags, _EXTERNAL, AVX512) #define EXTERNAL_AVX512ICL(flags) CPUEXT_SUFFIX(flags, _EXTERNAL, AVX512ICL) diff --git a/libavutil/x86/crc.asm b/libavutil/x86/crc.asm new file mode 100644 index 0000000000..89b2becc8d --- /dev/null +++ b/libavutil/x86/crc.asm @@ -0,0 +1,297 @@ +;***************************************************************************** +;* Copyright (c) 2025 Shreesh Adiga <16567adigashreesh@gmail.com> +;* +;* This file is part of FFmpeg. +;* +;* FFmpeg is free software; you can redistribute it and/or +;* modify it under the terms of the GNU Lesser General Public +;* License as published by the Free Software Foundation; either +;* version 2.1 of the License, or (at your option) any later version. +;* +;* FFmpeg is distributed in the hope that it will be useful, +;* but WITHOUT ANY WARRANTY; without even the implied warranty of +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;* Lesser General Public License for more details. 
+;* +;* You should have received a copy of the GNU Lesser General Public +;* License along with FFmpeg; if not, write to the Free Software +;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA +;****************************************************************************** + +%include "x86util.asm" + +SECTION RODATA +reverse_shuffle: db 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 + +partial_bytes_shuf_tab: db 255, 254, 253, 252, 251, 250, 249, 248,\ + 247, 246, 245, 244, 243, 242, 241, 240,\ + 0, 1, 2, 3, 4, 5, 6, 7,\ + 8, 9, 10, 11, 12, 13, 14, 15 + +SECTION .text + +%macro FOLD_128_TO_64 4 +; %1 LE ; %2 128 bit fold reg ; %3 pre-computed constant reg ; %4 tmp reg +%if %1 == 1 + mova %4, %2 + pclmulqdq %2, %3, 0x00 + psrldq %4, 8 + pxor %2, %4 + mova %4, %2 + psllq %4, 32 + pclmulqdq %4, %3, 0x10 + pxor %2, %4 +%else + movq %4, %2 + pclmulqdq %2, %3, 0x11 + pslldq %4, 4 + pxor %4, %2 + mova %2, %4 + pclmulqdq %4, %3, 0x01 + pxor %2, %4 +%endif +%endmacro + +%macro FOLD_64_TO_32 4 +; %1 LE ; %2 128 bit fold reg ; %3 pre-computed constant reg ; %4 tmp reg +%if %1 == 1 + pxor %4, %4 + pblendw %4, %2, 0xfc + mova %2, %4 + pclmulqdq %4, %3, 0x00 + pxor %4, %2 + pclmulqdq %4, %3, 0x10 + pxor %2, %4 + pextrd eax, %2, 2 +%else + mova %4, %2 + pclmulqdq %2, %3, 0x00 + pclmulqdq %2, %3, 0x11 + pxor %2, %4 + movd eax, %2 + bswap eax +%endif +%endmacro + +%macro FOLD_SINGLE 4 +; %1 temp ; %2 fold reg ; %3 pre-computed constants ; %4 input data block + mova %1, %2 + pclmulqdq %1, %3, 0x01 + pxor %1, %4 + pclmulqdq %2, %3, 0x10 + pxor %2, %1 +%endmacro + +%macro XMM_SHIFT_LEFT 4 +; %1 xmm input reg ; %2 shift bytes amount ; %3 temp xmm register ; %4 temp gpr + lea %4, [partial_bytes_shuf_tab] + movu %3, [%4 + 16 - (%2)] + pshufb %1, %3 +%endmacro + +%macro MEMCPY_0_15 6 +; %1 dst ; %2 src ; %3 len ; %4, %5 temp gpr register; %6 done label + cmp %3, 8 + jae .between_8_15 + cmp %3, 4 + jae .between_4_7 + cmp %3, 1 + ja .between_2_3 + jb %6 + 
mov %4b, [%2] + mov [%1], %4b + jmp %6 + +.between_8_15: +%if ARCH_X86_64 + mov %4q, [%2] + mov %5q, [%2 + %3 - 8] + mov [%1], %4q + mov [%1 + %3 - 8], %5q + jmp %6 +%else + xor %5, %5 +.copy4b: + mov %4d, [%2 + %5] + mov [%1 + %5], %4d + add %5, 4 + lea %4, [%5 + 4] + cmp %4, %3 + jb .copy4b + + mov %4d, [%2 + %3 - 4] + mov [%1 + %3 - 4], %4d + jmp %6 +%endif +.between_4_7: + mov %4d, [%2] + mov %5d, [%2 + %3 - 4] + mov [%1], %4d + mov [%1 + %3 - 4], %5d + jmp %6 +.between_2_3: + mov %4w, [%2] + mov %5w, [%2 + %3 - 2] + mov [%1], %4w + mov [%1 + %3 - 2], %5w + ; fall through, %6 label is expected to be next instruction +%endmacro + +%macro CRC 1 +;----------------------------------------------------------------------------------------------- +; uint32_t ff_av_crc[_le]_clmul(const AVCRC *ctx, uint32_t crc, const uint8_t *buffer, size_t length) +;----------------------------------------------------------------------------------------------- +; %1 == 1 - LE format +%if %1 == 1 +cglobal av_crc_le, 4, 6, 6+4*ARCH_X86_64, 0x10 +%else +cglobal av_crc, 4, 6, 7+4*ARCH_X86_64, 0x10 +%endif + +%if ARCH_X86_32 + %define m10 m6 +%endif + +%if %1 == 0 + movu m10, [reverse_shuffle] +%endif + + movd m4, r1d +%if ARCH_X86_32 + ; skip the 4x unrolled loop because only 8 XMM registers are available on X86_32 + jmp .less_than_64bytes +%else + cmp r3, 64 + jb .less_than_64bytes + movu m1, [r2 + 0] + movu m3, [r2 + 16] + movu m2, [r2 + 32] + movu m0, [r2 + 48] + pxor m1, m4 +%if %1 == 0 + pshufb m0, m10 + pshufb m1, m10 + pshufb m2, m10 + pshufb m3, m10 +%endif + mov r4, 64 + cmp r3, 128 + jb .reduce_4x_to_1 + movu m4, [r0] + +.fold_4x_loop: + movu m6, [r2 + r4 + 0] + movu m7, [r2 + r4 + 16] + movu m8, [r2 + r4 + 32] + movu m9, [r2 + r4 + 48] +%if %1 == 0 + pshufb m6, m10 + pshufb m7, m10 + pshufb m8, m10 + pshufb m9, m10 +%endif + FOLD_SINGLE m5, m1, m4, m6 + FOLD_SINGLE m5, m3, m4, m7 + FOLD_SINGLE m5, m2, m4, m8 + FOLD_SINGLE m5, m0, m4, m9 + add r4, 64 + lea r5, [r4 + 64] + cmp r5, r3 + jbe 
.fold_4x_loop + +.reduce_4x_to_1: + movu m4, [r0 + 16] + FOLD_SINGLE m5, m1, m4, m3 + FOLD_SINGLE m5, m1, m4, m2 + FOLD_SINGLE m5, m1, m4, m0 +%endif + +.fold_1x_pre: + lea r5, [r4 + 16] + cmp r5, r3 + ja .partial_block + +.fold_1x_loop: + movu m2, [r2 + r4] +%if %1 == 0 + pshufb m2, m10 +%endif + FOLD_SINGLE m5, m1, m4, m2 + add r4, 16 + lea r5, [r4 + 16] + cmp r5, r3 + jbe .fold_1x_loop + +.partial_block: + cmp r4, r3 + jae .reduce_128_to_64 + movu m2, [r2 + r3 - 16] + and r3, 0xf + lea r4, [partial_bytes_shuf_tab] + movu m0, [r3 + r4] +%if %1 == 0 + pshufb m1, m10 +%endif + mova m3, m1 + pcmpeqd m5, m5 ; m5 = _mm_set1_epi8(0xff) + pxor m5, m0 + pshufb m3, m5 + pblendvb m2, m3, m0 + pshufb m1, m0 +%if %1 == 0 + pshufb m1, m10 + pshufb m2, m10 +%endif + FOLD_SINGLE m5, m1, m4, m2 + +.reduce_128_to_64: + movu m4, [r0 + 32] + FOLD_128_TO_64 %1, m1, m4, m5 +.reduce_64_to_32: + movu m4, [r0 + 48] + FOLD_64_TO_32 %1, m1, m4, m5 + RET + +.less_than_64bytes: + cmp r3, 16 + jb .less_than_16bytes + movu m1, [r2] + pxor m1, m4 +%if %1 == 0 + pshufb m1, m10 +%endif + mov r4, 16 + movu m4, [r0 + 16] + jmp .fold_1x_pre + +.less_than_16bytes: + pxor m1, m1 + movu [rsp], m1 + MEMCPY_0_15 rsp, r2, r3, r1, r4, .memcpy_done + +.memcpy_done: + movu m1, [rsp] + pxor m1, m4 + cmp r3, 5 + jb .less_than_5bytes + XMM_SHIFT_LEFT m1, (16 - r3), m2, r4 +%if %1 == 0 + pshufb m1, m10 +%endif + jmp .reduce_128_to_64 + +.less_than_5bytes: +%if %1 == 0 + XMM_SHIFT_LEFT m1, (4 - r3), m2, r4 + movq m10, [reverse_shuffle + 8] ; 0x0001020304050607 + pshufb m1, m10 +%else + XMM_SHIFT_LEFT m1, (8 - r3), m2, r4 +%endif + jmp .reduce_64_to_32 + +%endmacro + +INIT_XMM clmul +CRC 0 +CRC 1 diff --git a/libavutil/x86/crc.h b/libavutil/x86/crc.h new file mode 100644 index 0000000000..0cb00fe567 --- /dev/null +++ b/libavutil/x86/crc.h @@ -0,0 +1,38 @@ +/* + * Copyright (c) 2025 Shreesh Adiga <16567adigashreesh@gmail.com> + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ +#ifndef AVUTIL_X86_CRC_H +#define AVUTIL_X86_CRC_H + +#include "libavutil/crc.h" + +extern const AVCRC *(*av_crc_get_table_func)(AVCRCId crc_id); +extern uint32_t (*av_crc_func)(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length); +extern int (*av_crc_init_func)(AVCRC *ctx, int le, int bits, + uint32_t poly, int ctx_size); + +uint32_t ff_av_crc_clmul(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length); +uint32_t ff_av_crc_le_clmul(const AVCRC *ctx, uint32_t crc, + const uint8_t *buffer, size_t length); +void av_crc_init_x86(void); +void av_crc_init_fn(void); + +#endif /* AVUTIL_X86_CRC_H */ diff --git a/libavutil/x86/crc_init.c b/libavutil/x86/crc_init.c new file mode 100644 index 0000000000..37998954a6 --- /dev/null +++ b/libavutil/x86/crc_init.c @@ -0,0 +1,233 @@ +/* + * Copyright (c) 2025 Shreesh Adiga <16567adigashreesh@gmail.com> + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/avassert.h" +#include "libavutil/cpu.h" +#include "libavutil/crc.h" +#include "libavutil/thread.h" +#include "libavutil/x86/cpu.h" +#include "libavutil/x86/crc.h" + +static int av_crc_init_clmul(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size); +#if CONFIG_HARDCODED_TABLES +static const AVCRC av_crc_table_clmul[AV_CRC_MAX][16] = { + [AV_CRC_8_ATM] = { + 0x32000000, 0x0, 0xbc000000, 0x0, + 0xc4000000, 0x0, 0x94000000, 0x0, + 0x62000000, 0x0, 0x79000000, 0x0, + 0x07156a16, 0x1, 0x07000000, 0x1, + }, + [AV_CRC_8_EBU] = { + 0xb5000000, 0x0, 0xf3000000, 0x0, + 0xfc000000, 0x0, 0x0d000000, 0x0, + 0x6a000000, 0x0, 0x65000000, 0x0, + 0x1c4b8192, 0x1, 0x1d000000, 0x1, + }, + [AV_CRC_16_ANSI] = { + 0xf9e30000, 0x0, 0x807d0000, 0x0, + 0xf9130000, 0x0, 0xff830000, 0x0, + 0x807b0000, 0x0, 0x86630000, 0x0, + 0xfffbffe7, 0x1, 0x80050000, 0x1, + }, + [AV_CRC_16_CCITT] = { + 0x60190000, 0x0, 0x59b00000, 0x0, + 0xd5f60000, 0x0, 0x45630000, 0x0, + 0xaa510000, 0x0, 0xeb230000, 0x0, + 0x11303471, 0x1, 0x10210000, 0x1, + }, + [AV_CRC_24_IEEE] = { + 0x1f428700, 0x0, 0x467d2400, 0x0, + 0x2c8c9d00, 0x0, 0x64e4d700, 0x0, + 0xd9fe8c00, 0x0, 0xfd7e0c00, 0x0, + 0xf845fe24, 0x1, 0x864cfb00, 0x1, + }, + [AV_CRC_32_IEEE] = { + 0x8833794c, 0x0, 0xe6228b11, 0x0, + 0xc5b9cd4c, 0x0, 0xe8a45605, 0x0, + 0x490d678d, 0x0, 0xf200aa66, 0x0, + 0x04d101df, 0x1, 0x04c11db7, 0x1, + }, + [AV_CRC_32_IEEE_LE] = { + 0xc6e41596, 0x1, 0x54442bd4, 
0x1, + 0xccaa009e, 0x0, 0x751997d0, 0x1, + 0xccaa009e, 0x0, 0x63cd6124, 0x1, + 0xf7011640, 0x1, 0xdb710641, 0x1, + }, + [AV_CRC_16_ANSI_LE] = { + 0x0000bffa, 0x0, 0x1b0c2, 0x0, + 0x00018cc2, 0x0, 0x1d0c2, 0x0, + 0x00018cc2, 0x0, 0x1bc02, 0x0, + 0xcfffbffe, 0x1, 0x14003, 0x0, + }, +}; +#else +static AVCRC av_crc_table_clmul[AV_CRC_MAX][16]; + +#define DECLARE_CRC_INIT_TABLE_ONCE(id, le, bits, poly) \ +static AVOnce id ## _once_control = AV_ONCE_INIT; \ +static void id ## _init_table_once(void) \ +{ \ + av_assert0(av_crc_init_clmul(av_crc_table_clmul[id], le, bits, poly, sizeof(av_crc_table_clmul[id])) >= 0); \ +} + +#define CRC_INIT_TABLE_ONCE(id) ff_thread_once(&id ## _once_control, id ## _init_table_once) + +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_8_ATM, 0, 8, 0x07) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_8_EBU, 0, 8, 0x1D) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_16_ANSI, 0, 16, 0x8005) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_16_CCITT, 0, 16, 0x1021) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_24_IEEE, 0, 24, 0x864CFB) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_32_IEEE, 0, 32, 0x04C11DB7) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_32_IEEE_LE, 1, 32, 0xEDB88320) +DECLARE_CRC_INIT_TABLE_ONCE(AV_CRC_16_ANSI_LE, 1, 16, 0xA001) +#endif + +static uint32_t av_crc_clmul(const AVCRC *ctx, uint32_t crc, const uint8_t *buffer, size_t length) +{ + if (ctx[4] == ctx[8]) { + return ff_av_crc_le_clmul(ctx, crc, buffer, length); + } else { + return ff_av_crc_clmul(ctx, crc, buffer, length); + } +} + +static const AVCRC *av_crc_get_table_clmul(AVCRCId crc_id) +{ +#if !CONFIG_HARDCODED_TABLES + switch (crc_id) { + case AV_CRC_8_ATM: CRC_INIT_TABLE_ONCE(AV_CRC_8_ATM); break; + case AV_CRC_8_EBU: CRC_INIT_TABLE_ONCE(AV_CRC_8_EBU); break; + case AV_CRC_16_ANSI: CRC_INIT_TABLE_ONCE(AV_CRC_16_ANSI); break; + case AV_CRC_16_CCITT: CRC_INIT_TABLE_ONCE(AV_CRC_16_CCITT); break; + case AV_CRC_24_IEEE: CRC_INIT_TABLE_ONCE(AV_CRC_24_IEEE); break; + case AV_CRC_32_IEEE: CRC_INIT_TABLE_ONCE(AV_CRC_32_IEEE); break; + case 
AV_CRC_32_IEEE_LE: CRC_INIT_TABLE_ONCE(AV_CRC_32_IEEE_LE); break; + case AV_CRC_16_ANSI_LE: CRC_INIT_TABLE_ONCE(AV_CRC_16_ANSI_LE); break; + default: av_assert0(0); + } +#endif + return av_crc_table_clmul[crc_id]; +} + +static uint64_t reverse(uint64_t p, unsigned int deg) { + uint64_t ret = 0; + for (int i = 0; i < (deg + 1); i++) { + ret = (ret << 1) | (p & 1); + p >>= 1; + } + return ret; +} + +static uint64_t xnmodp(unsigned n, uint64_t poly, unsigned deg, uint64_t *div, int bitreverse) +{ + uint64_t mod, mask, high; + + if (n < deg) { + *div = 0; + return poly; + } + mask = ((uint64_t)1 << deg) - 1; + poly &= mask; + mod = poly; + *div = 1; + deg--; + while (--n > deg) { + high = (mod >> deg) & 1; + *div = (*div << 1) | high; + mod <<= 1; + if (high) + mod ^= poly; + } + uint64_t ret = mod & mask; + if (bitreverse) { + *div = reverse(*div, deg) << 1; + return reverse(ret, deg) << 1; + } + return ret; +} + +static int av_crc_init_clmul(AVCRC *ctx, int le, int bits, uint32_t poly, int ctx_size) +{ + if (bits < 8 || bits > 32 || poly >= (1LL << bits)) + return AVERROR(EINVAL); + if (ctx_size < sizeof(AVCRC) * 16) + return AVERROR(EINVAL); + + uint64_t poly_; + if (le) { + // convert the reversed representation to regular form + poly = reverse(poly, bits) >> 1; + } + // convert to 32 degree polynomial + poly_ = ((uint64_t)poly) << (32 - bits); + + uint64_t x1, x2, x3, x4, x5, x6, x7, x8, div; + if (le) { + x1 = xnmodp(4 * 128 - 32, poly_, 32, &div, le); + x2 = xnmodp(4 * 128 + 32, poly_, 32, &div, le); + x3 = xnmodp(128 - 32, poly_, 32, &div, le); + x4 = xnmodp(128 + 32, poly_, 32, &div, le); + x5 = x3; + x6 = xnmodp(64, poly_, 32, &div, le); + x7 = div; + x8 = reverse(poly_ | (1ULL << 32), 32); + } else { + x1 = xnmodp(4 * 128 + 64, poly_, 32, &div, le); + x2 = xnmodp(4 * 128, poly_, 32, &div, le); + x3 = xnmodp(128 + 64, poly_, 32, &div, le); + x4 = xnmodp(128, poly_, 32, &div, le); + x5 = xnmodp(64, poly_, 32, &div, le); + x7 = div; + x6 = xnmodp(96, poly_, 32, 
&div, le); + x8 = poly_ | (1ULL << 32); + } + ctx[0] = (AVCRC)x1; + ctx[1] = (AVCRC)(x1 >> 32); + ctx[2] = (AVCRC)x2; + ctx[3] = (AVCRC)(x2 >> 32); + ctx[4] = (AVCRC)x3; + ctx[5] = (AVCRC)(x3 >> 32); + ctx[6] = (AVCRC)x4; + ctx[7] = (AVCRC)(x4 >> 32); + ctx[8] = (AVCRC)x5; + ctx[9] = (AVCRC)(x5 >> 32); + ctx[10] = (AVCRC)x6; + ctx[11] = (AVCRC)(x6 >> 32); + ctx[12] = (AVCRC)x7; + ctx[13] = (AVCRC)(x7 >> 32); + ctx[14] = (AVCRC)x8; + ctx[15] = (AVCRC)(x8 >> 32); + return 0; +} + +av_cold void av_crc_init_x86(void) +{ + int cpu_flags = av_get_cpu_flags(); + + if (EXTERNAL_CLMUL(cpu_flags)) { + av_crc_func = av_crc_clmul; + av_crc_get_table_func = av_crc_get_table_clmul; + av_crc_init_func = av_crc_init_clmul; + } +} diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile index e47070d90f..08d734efb8 100644 --- a/tests/checkasm/Makefile +++ b/tests/checkasm/Makefile @@ -87,6 +87,7 @@ CHECKASMOBJS-$(CONFIG_SWSCALE) += $(SWSCALEOBJS) # libavutil tests AVUTILOBJS += aes.o AVUTILOBJS += av_tx.o +AVUTILOBJS += crc.o AVUTILOBJS += fixed_dsp.o AVUTILOBJS += float_dsp.o AVUTILOBJS += lls.o diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index 4469e043f5..05ffabc054 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -327,6 +327,7 @@ static const struct { #endif #if CONFIG_AVUTIL { "aes", checkasm_check_aes }, + { "crc", checkasm_check_crc }, { "fixed_dsp", checkasm_check_fixed_dsp }, { "float_dsp", checkasm_check_float_dsp }, { "lls", checkasm_check_lls }, @@ -385,6 +386,7 @@ static const struct { { "SSE4.1", "sse4", AV_CPU_FLAG_SSE4 }, { "SSE4.2", "sse42", AV_CPU_FLAG_SSE42 }, { "AES-NI", "aesni", AV_CPU_FLAG_AESNI }, + { "CLMUL", "clmul", AV_CPU_FLAG_CLMUL }, { "AVX", "avx", AV_CPU_FLAG_AVX }, { "XOP", "xop", AV_CPU_FLAG_XOP }, { "FMA3", "fma3", AV_CPU_FLAG_FMA3 }, diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h index e1ccd4011b..7ad705e28e 100644 --- a/tests/checkasm/checkasm.h +++ 
b/tests/checkasm/checkasm.h @@ -93,6 +93,7 @@ void checkasm_check_bswapdsp(void); void checkasm_check_cavsdsp(void); void checkasm_check_colordetect(void); void checkasm_check_colorspace(void); +void checkasm_check_crc(void); void checkasm_check_dcadsp(void); void checkasm_check_diracdsp(void); void checkasm_check_exrdsp(void); diff --git a/tests/checkasm/crc.c b/tests/checkasm/crc.c new file mode 100644 index 0000000000..9544210b63 --- /dev/null +++ b/tests/checkasm/crc.c @@ -0,0 +1,386 @@ +/* + * Copyright (c) 2025 Shreesh Adiga <16567adigashreesh@gmail.com> + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with FFmpeg; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 
+ */ + +#include "checkasm.h" +#include "libavutil/crc.h" +#include "libavutil/mem_internal.h" +#include "libavutil/x86/crc.h" + +#define BUF_SIZE (8192 + 11) + +static const AVCRC av_crc_table_ref[AV_CRC_MAX][257] = { + [AV_CRC_8_ATM] = { + 0x00, 0x07, 0x0E, 0x09, 0x1C, 0x1B, 0x12, 0x15, 0x38, 0x3F, 0x36, 0x31, + 0x24, 0x23, 0x2A, 0x2D, 0x70, 0x77, 0x7E, 0x79, 0x6C, 0x6B, 0x62, 0x65, + 0x48, 0x4F, 0x46, 0x41, 0x54, 0x53, 0x5A, 0x5D, 0xE0, 0xE7, 0xEE, 0xE9, + 0xFC, 0xFB, 0xF2, 0xF5, 0xD8, 0xDF, 0xD6, 0xD1, 0xC4, 0xC3, 0xCA, 0xCD, + 0x90, 0x97, 0x9E, 0x99, 0x8C, 0x8B, 0x82, 0x85, 0xA8, 0xAF, 0xA6, 0xA1, + 0xB4, 0xB3, 0xBA, 0xBD, 0xC7, 0xC0, 0xC9, 0xCE, 0xDB, 0xDC, 0xD5, 0xD2, + 0xFF, 0xF8, 0xF1, 0xF6, 0xE3, 0xE4, 0xED, 0xEA, 0xB7, 0xB0, 0xB9, 0xBE, + 0xAB, 0xAC, 0xA5, 0xA2, 0x8F, 0x88, 0x81, 0x86, 0x93, 0x94, 0x9D, 0x9A, + 0x27, 0x20, 0x29, 0x2E, 0x3B, 0x3C, 0x35, 0x32, 0x1F, 0x18, 0x11, 0x16, + 0x03, 0x04, 0x0D, 0x0A, 0x57, 0x50, 0x59, 0x5E, 0x4B, 0x4C, 0x45, 0x42, + 0x6F, 0x68, 0x61, 0x66, 0x73, 0x74, 0x7D, 0x7A, 0x89, 0x8E, 0x87, 0x80, + 0x95, 0x92, 0x9B, 0x9C, 0xB1, 0xB6, 0xBF, 0xB8, 0xAD, 0xAA, 0xA3, 0xA4, + 0xF9, 0xFE, 0xF7, 0xF0, 0xE5, 0xE2, 0xEB, 0xEC, 0xC1, 0xC6, 0xCF, 0xC8, + 0xDD, 0xDA, 0xD3, 0xD4, 0x69, 0x6E, 0x67, 0x60, 0x75, 0x72, 0x7B, 0x7C, + 0x51, 0x56, 0x5F, 0x58, 0x4D, 0x4A, 0x43, 0x44, 0x19, 0x1E, 0x17, 0x10, + 0x05, 0x02, 0x0B, 0x0C, 0x21, 0x26, 0x2F, 0x28, 0x3D, 0x3A, 0x33, 0x34, + 0x4E, 0x49, 0x40, 0x47, 0x52, 0x55, 0x5C, 0x5B, 0x76, 0x71, 0x78, 0x7F, + 0x6A, 0x6D, 0x64, 0x63, 0x3E, 0x39, 0x30, 0x37, 0x22, 0x25, 0x2C, 0x2B, + 0x06, 0x01, 0x08, 0x0F, 0x1A, 0x1D, 0x14, 0x13, 0xAE, 0xA9, 0xA0, 0xA7, + 0xB2, 0xB5, 0xBC, 0xBB, 0x96, 0x91, 0x98, 0x9F, 0x8A, 0x8D, 0x84, 0x83, + 0xDE, 0xD9, 0xD0, 0xD7, 0xC2, 0xC5, 0xCC, 0xCB, 0xE6, 0xE1, 0xE8, 0xEF, + 0xFA, 0xFD, 0xF4, 0xF3, 0x01 + }, + [AV_CRC_8_EBU] = { + 0x00, 0x1D, 0x3A, 0x27, 0x74, 0x69, 0x4E, 0x53, 0xE8, 0xF5, 0xD2, 0xCF, + 0x9C, 0x81, 0xA6, 0xBB, 0xCD, 0xD0, 0xF7, 0xEA, 0xB9, 0xA4, 0x83, 0x9E, 
+ 0x25, 0x38, 0x1F, 0x02, 0x51, 0x4C, 0x6B, 0x76, 0x87, 0x9A, 0xBD, 0xA0, + 0xF3, 0xEE, 0xC9, 0xD4, 0x6F, 0x72, 0x55, 0x48, 0x1B, 0x06, 0x21, 0x3C, + 0x4A, 0x57, 0x70, 0x6D, 0x3E, 0x23, 0x04, 0x19, 0xA2, 0xBF, 0x98, 0x85, + 0xD6, 0xCB, 0xEC, 0xF1, 0x13, 0x0E, 0x29, 0x34, 0x67, 0x7A, 0x5D, 0x40, + 0xFB, 0xE6, 0xC1, 0xDC, 0x8F, 0x92, 0xB5, 0xA8, 0xDE, 0xC3, 0xE4, 0xF9, + 0xAA, 0xB7, 0x90, 0x8D, 0x36, 0x2B, 0x0C, 0x11, 0x42, 0x5F, 0x78, 0x65, + 0x94, 0x89, 0xAE, 0xB3, 0xE0, 0xFD, 0xDA, 0xC7, 0x7C, 0x61, 0x46, 0x5B, + 0x08, 0x15, 0x32, 0x2F, 0x59, 0x44, 0x63, 0x7E, 0x2D, 0x30, 0x17, 0x0A, + 0xB1, 0xAC, 0x8B, 0x96, 0xC5, 0xD8, 0xFF, 0xE2, 0x26, 0x3B, 0x1C, 0x01, + 0x52, 0x4F, 0x68, 0x75, 0xCE, 0xD3, 0xF4, 0xE9, 0xBA, 0xA7, 0x80, 0x9D, + 0xEB, 0xF6, 0xD1, 0xCC, 0x9F, 0x82, 0xA5, 0xB8, 0x03, 0x1E, 0x39, 0x24, + 0x77, 0x6A, 0x4D, 0x50, 0xA1, 0xBC, 0x9B, 0x86, 0xD5, 0xC8, 0xEF, 0xF2, + 0x49, 0x54, 0x73, 0x6E, 0x3D, 0x20, 0x07, 0x1A, 0x6C, 0x71, 0x56, 0x4B, + 0x18, 0x05, 0x22, 0x3F, 0x84, 0x99, 0xBE, 0xA3, 0xF0, 0xED, 0xCA, 0xD7, + 0x35, 0x28, 0x0F, 0x12, 0x41, 0x5C, 0x7B, 0x66, 0xDD, 0xC0, 0xE7, 0xFA, + 0xA9, 0xB4, 0x93, 0x8E, 0xF8, 0xE5, 0xC2, 0xDF, 0x8C, 0x91, 0xB6, 0xAB, + 0x10, 0x0D, 0x2A, 0x37, 0x64, 0x79, 0x5E, 0x43, 0xB2, 0xAF, 0x88, 0x95, + 0xC6, 0xDB, 0xFC, 0xE1, 0x5A, 0x47, 0x60, 0x7D, 0x2E, 0x33, 0x14, 0x09, + 0x7F, 0x62, 0x45, 0x58, 0x0B, 0x16, 0x31, 0x2C, 0x97, 0x8A, 0xAD, 0xB0, + 0xE3, 0xFE, 0xD9, 0xC4, 0x01 + }, + [AV_CRC_16_ANSI] = { + 0x0000, 0x0580, 0x0F80, 0x0A00, 0x1B80, 0x1E00, 0x1400, 0x1180, + 0x3380, 0x3600, 0x3C00, 0x3980, 0x2800, 0x2D80, 0x2780, 0x2200, + 0x6380, 0x6600, 0x6C00, 0x6980, 0x7800, 0x7D80, 0x7780, 0x7200, + 0x5000, 0x5580, 0x5F80, 0x5A00, 0x4B80, 0x4E00, 0x4400, 0x4180, + 0xC380, 0xC600, 0xCC00, 0xC980, 0xD800, 0xDD80, 0xD780, 0xD200, + 0xF000, 0xF580, 0xFF80, 0xFA00, 0xEB80, 0xEE00, 0xE400, 0xE180, + 0xA000, 0xA580, 0xAF80, 0xAA00, 0xBB80, 0xBE00, 0xB400, 0xB180, + 0x9380, 0x9600, 0x9C00, 0x9980, 0x8800, 0x8D80, 0x8780, 0x8200, + 
0x8381, 0x8601, 0x8C01, 0x8981, 0x9801, 0x9D81, 0x9781, 0x9201, + 0xB001, 0xB581, 0xBF81, 0xBA01, 0xAB81, 0xAE01, 0xA401, 0xA181, + 0xE001, 0xE581, 0xEF81, 0xEA01, 0xFB81, 0xFE01, 0xF401, 0xF181, + 0xD381, 0xD601, 0xDC01, 0xD981, 0xC801, 0xCD81, 0xC781, 0xC201, + 0x4001, 0x4581, 0x4F81, 0x4A01, 0x5B81, 0x5E01, 0x5401, 0x5181, + 0x7381, 0x7601, 0x7C01, 0x7981, 0x6801, 0x6D81, 0x6781, 0x6201, + 0x2381, 0x2601, 0x2C01, 0x2981, 0x3801, 0x3D81, 0x3781, 0x3201, + 0x1001, 0x1581, 0x1F81, 0x1A01, 0x0B81, 0x0E01, 0x0401, 0x0181, + 0x0383, 0x0603, 0x0C03, 0x0983, 0x1803, 0x1D83, 0x1783, 0x1203, + 0x3003, 0x3583, 0x3F83, 0x3A03, 0x2B83, 0x2E03, 0x2403, 0x2183, + 0x6003, 0x6583, 0x6F83, 0x6A03, 0x7B83, 0x7E03, 0x7403, 0x7183, + 0x5383, 0x5603, 0x5C03, 0x5983, 0x4803, 0x4D83, 0x4783, 0x4203, + 0xC003, 0xC583, 0xCF83, 0xCA03, 0xDB83, 0xDE03, 0xD403, 0xD183, + 0xF383, 0xF603, 0xFC03, 0xF983, 0xE803, 0xED83, 0xE783, 0xE203, + 0xA383, 0xA603, 0xAC03, 0xA983, 0xB803, 0xBD83, 0xB783, 0xB203, + 0x9003, 0x9583, 0x9F83, 0x9A03, 0x8B83, 0x8E03, 0x8403, 0x8183, + 0x8002, 0x8582, 0x8F82, 0x8A02, 0x9B82, 0x9E02, 0x9402, 0x9182, + 0xB382, 0xB602, 0xBC02, 0xB982, 0xA802, 0xAD82, 0xA782, 0xA202, + 0xE382, 0xE602, 0xEC02, 0xE982, 0xF802, 0xFD82, 0xF782, 0xF202, + 0xD002, 0xD582, 0xDF82, 0xDA02, 0xCB82, 0xCE02, 0xC402, 0xC182, + 0x4382, 0x4602, 0x4C02, 0x4982, 0x5802, 0x5D82, 0x5782, 0x5202, + 0x7002, 0x7582, 0x7F82, 0x7A02, 0x6B82, 0x6E02, 0x6402, 0x6182, + 0x2002, 0x2582, 0x2F82, 0x2A02, 0x3B82, 0x3E02, 0x3402, 0x3182, + 0x1382, 0x1602, 0x1C02, 0x1982, 0x0802, 0x0D82, 0x0782, 0x0202, + 0x0001 + }, + [AV_CRC_16_CCITT] = { + 0x0000, 0x2110, 0x4220, 0x6330, 0x8440, 0xA550, 0xC660, 0xE770, + 0x0881, 0x2991, 0x4AA1, 0x6BB1, 0x8CC1, 0xADD1, 0xCEE1, 0xEFF1, + 0x3112, 0x1002, 0x7332, 0x5222, 0xB552, 0x9442, 0xF772, 0xD662, + 0x3993, 0x1883, 0x7BB3, 0x5AA3, 0xBDD3, 0x9CC3, 0xFFF3, 0xDEE3, + 0x6224, 0x4334, 0x2004, 0x0114, 0xE664, 0xC774, 0xA444, 0x8554, + 0x6AA5, 0x4BB5, 0x2885, 0x0995, 0xEEE5, 0xCFF5, 
0xACC5, 0x8DD5, + 0x5336, 0x7226, 0x1116, 0x3006, 0xD776, 0xF666, 0x9556, 0xB446, + 0x5BB7, 0x7AA7, 0x1997, 0x3887, 0xDFF7, 0xFEE7, 0x9DD7, 0xBCC7, + 0xC448, 0xE558, 0x8668, 0xA778, 0x4008, 0x6118, 0x0228, 0x2338, + 0xCCC9, 0xEDD9, 0x8EE9, 0xAFF9, 0x4889, 0x6999, 0x0AA9, 0x2BB9, + 0xF55A, 0xD44A, 0xB77A, 0x966A, 0x711A, 0x500A, 0x333A, 0x122A, + 0xFDDB, 0xDCCB, 0xBFFB, 0x9EEB, 0x799B, 0x588B, 0x3BBB, 0x1AAB, + 0xA66C, 0x877C, 0xE44C, 0xC55C, 0x222C, 0x033C, 0x600C, 0x411C, + 0xAEED, 0x8FFD, 0xECCD, 0xCDDD, 0x2AAD, 0x0BBD, 0x688D, 0x499D, + 0x977E, 0xB66E, 0xD55E, 0xF44E, 0x133E, 0x322E, 0x511E, 0x700E, + 0x9FFF, 0xBEEF, 0xDDDF, 0xFCCF, 0x1BBF, 0x3AAF, 0x599F, 0x788F, + 0x8891, 0xA981, 0xCAB1, 0xEBA1, 0x0CD1, 0x2DC1, 0x4EF1, 0x6FE1, + 0x8010, 0xA100, 0xC230, 0xE320, 0x0450, 0x2540, 0x4670, 0x6760, + 0xB983, 0x9893, 0xFBA3, 0xDAB3, 0x3DC3, 0x1CD3, 0x7FE3, 0x5EF3, + 0xB102, 0x9012, 0xF322, 0xD232, 0x3542, 0x1452, 0x7762, 0x5672, + 0xEAB5, 0xCBA5, 0xA895, 0x8985, 0x6EF5, 0x4FE5, 0x2CD5, 0x0DC5, + 0xE234, 0xC324, 0xA014, 0x8104, 0x6674, 0x4764, 0x2454, 0x0544, + 0xDBA7, 0xFAB7, 0x9987, 0xB897, 0x5FE7, 0x7EF7, 0x1DC7, 0x3CD7, + 0xD326, 0xF236, 0x9106, 0xB016, 0x5766, 0x7676, 0x1546, 0x3456, + 0x4CD9, 0x6DC9, 0x0EF9, 0x2FE9, 0xC899, 0xE989, 0x8AB9, 0xABA9, + 0x4458, 0x6548, 0x0678, 0x2768, 0xC018, 0xE108, 0x8238, 0xA328, + 0x7DCB, 0x5CDB, 0x3FEB, 0x1EFB, 0xF98B, 0xD89B, 0xBBAB, 0x9ABB, + 0x754A, 0x545A, 0x376A, 0x167A, 0xF10A, 0xD01A, 0xB32A, 0x923A, + 0x2EFD, 0x0FED, 0x6CDD, 0x4DCD, 0xAABD, 0x8BAD, 0xE89D, 0xC98D, + 0x267C, 0x076C, 0x645C, 0x454C, 0xA23C, 0x832C, 0xE01C, 0xC10C, + 0x1FEF, 0x3EFF, 0x5DCF, 0x7CDF, 0x9BAF, 0xBABF, 0xD98F, 0xF89F, + 0x176E, 0x367E, 0x554E, 0x745E, 0x932E, 0xB23E, 0xD10E, 0xF01E, + 0x0001 + }, + [AV_CRC_24_IEEE] = { + 0x000000, 0xFB4C86, 0x0DD58A, 0xF6990C, 0xE1E693, 0x1AAA15, 0xEC3319, + 0x177F9F, 0x3981A1, 0xC2CD27, 0x34542B, 0xCF18AD, 0xD86732, 0x232BB4, + 0xD5B2B8, 0x2EFE3E, 0x894EC5, 0x720243, 0x849B4F, 0x7FD7C9, 0x68A856, + 0x93E4D0, 
0x657DDC, 0x9E315A, 0xB0CF64, 0x4B83E2, 0xBD1AEE, 0x465668, + 0x5129F7, 0xAA6571, 0x5CFC7D, 0xA7B0FB, 0xE9D10C, 0x129D8A, 0xE40486, + 0x1F4800, 0x08379F, 0xF37B19, 0x05E215, 0xFEAE93, 0xD050AD, 0x2B1C2B, + 0xDD8527, 0x26C9A1, 0x31B63E, 0xCAFAB8, 0x3C63B4, 0xC72F32, 0x609FC9, + 0x9BD34F, 0x6D4A43, 0x9606C5, 0x81795A, 0x7A35DC, 0x8CACD0, 0x77E056, + 0x591E68, 0xA252EE, 0x54CBE2, 0xAF8764, 0xB8F8FB, 0x43B47D, 0xB52D71, + 0x4E61F7, 0xD2A319, 0x29EF9F, 0xDF7693, 0x243A15, 0x33458A, 0xC8090C, + 0x3E9000, 0xC5DC86, 0xEB22B8, 0x106E3E, 0xE6F732, 0x1DBBB4, 0x0AC42B, + 0xF188AD, 0x0711A1, 0xFC5D27, 0x5BEDDC, 0xA0A15A, 0x563856, 0xAD74D0, + 0xBA0B4F, 0x4147C9, 0xB7DEC5, 0x4C9243, 0x626C7D, 0x9920FB, 0x6FB9F7, + 0x94F571, 0x838AEE, 0x78C668, 0x8E5F64, 0x7513E2, 0x3B7215, 0xC03E93, + 0x36A79F, 0xCDEB19, 0xDA9486, 0x21D800, 0xD7410C, 0x2C0D8A, 0x02F3B4, + 0xF9BF32, 0x0F263E, 0xF46AB8, 0xE31527, 0x1859A1, 0xEEC0AD, 0x158C2B, + 0xB23CD0, 0x497056, 0xBFE95A, 0x44A5DC, 0x53DA43, 0xA896C5, 0x5E0FC9, + 0xA5434F, 0x8BBD71, 0x70F1F7, 0x8668FB, 0x7D247D, 0x6A5BE2, 0x911764, + 0x678E68, 0x9CC2EE, 0xA44733, 0x5F0BB5, 0xA992B9, 0x52DE3F, 0x45A1A0, + 0xBEED26, 0x48742A, 0xB338AC, 0x9DC692, 0x668A14, 0x901318, 0x6B5F9E, + 0x7C2001, 0x876C87, 0x71F58B, 0x8AB90D, 0x2D09F6, 0xD64570, 0x20DC7C, + 0xDB90FA, 0xCCEF65, 0x37A3E3, 0xC13AEF, 0x3A7669, 0x148857, 0xEFC4D1, + 0x195DDD, 0xE2115B, 0xF56EC4, 0x0E2242, 0xF8BB4E, 0x03F7C8, 0x4D963F, + 0xB6DAB9, 0x4043B5, 0xBB0F33, 0xAC70AC, 0x573C2A, 0xA1A526, 0x5AE9A0, + 0x74179E, 0x8F5B18, 0x79C214, 0x828E92, 0x95F10D, 0x6EBD8B, 0x982487, + 0x636801, 0xC4D8FA, 0x3F947C, 0xC90D70, 0x3241F6, 0x253E69, 0xDE72EF, + 0x28EBE3, 0xD3A765, 0xFD595B, 0x0615DD, 0xF08CD1, 0x0BC057, 0x1CBFC8, + 0xE7F34E, 0x116A42, 0xEA26C4, 0x76E42A, 0x8DA8AC, 0x7B31A0, 0x807D26, + 0x9702B9, 0x6C4E3F, 0x9AD733, 0x619BB5, 0x4F658B, 0xB4290D, 0x42B001, + 0xB9FC87, 0xAE8318, 0x55CF9E, 0xA35692, 0x581A14, 0xFFAAEF, 0x04E669, + 0xF27F65, 0x0933E3, 0x1E4C7C, 0xE500FA, 0x1399F6, 0xE8D570, 
0xC62B4E, + 0x3D67C8, 0xCBFEC4, 0x30B242, 0x27CDDD, 0xDC815B, 0x2A1857, 0xD154D1, + 0x9F3526, 0x6479A0, 0x92E0AC, 0x69AC2A, 0x7ED3B5, 0x859F33, 0x73063F, + 0x884AB9, 0xA6B487, 0x5DF801, 0xAB610D, 0x502D8B, 0x475214, 0xBC1E92, + 0x4A879E, 0xB1CB18, 0x167BE3, 0xED3765, 0x1BAE69, 0xE0E2EF, 0xF79D70, + 0x0CD1F6, 0xFA48FA, 0x01047C, 0x2FFA42, 0xD4B6C4, 0x222FC8, 0xD9634E, + 0xCE1CD1, 0x355057, 0xC3C95B, 0x3885DD, 0x000001, + }, + [AV_CRC_32_IEEE] = { + 0x00000000, 0xB71DC104, 0x6E3B8209, 0xD926430D, 0xDC760413, 0x6B6BC517, + 0xB24D861A, 0x0550471E, 0xB8ED0826, 0x0FF0C922, 0xD6D68A2F, 0x61CB4B2B, + 0x649B0C35, 0xD386CD31, 0x0AA08E3C, 0xBDBD4F38, 0x70DB114C, 0xC7C6D048, + 0x1EE09345, 0xA9FD5241, 0xACAD155F, 0x1BB0D45B, 0xC2969756, 0x758B5652, + 0xC836196A, 0x7F2BD86E, 0xA60D9B63, 0x11105A67, 0x14401D79, 0xA35DDC7D, + 0x7A7B9F70, 0xCD665E74, 0xE0B62398, 0x57ABE29C, 0x8E8DA191, 0x39906095, + 0x3CC0278B, 0x8BDDE68F, 0x52FBA582, 0xE5E66486, 0x585B2BBE, 0xEF46EABA, + 0x3660A9B7, 0x817D68B3, 0x842D2FAD, 0x3330EEA9, 0xEA16ADA4, 0x5D0B6CA0, + 0x906D32D4, 0x2770F3D0, 0xFE56B0DD, 0x494B71D9, 0x4C1B36C7, 0xFB06F7C3, + 0x2220B4CE, 0x953D75CA, 0x28803AF2, 0x9F9DFBF6, 0x46BBB8FB, 0xF1A679FF, + 0xF4F63EE1, 0x43EBFFE5, 0x9ACDBCE8, 0x2DD07DEC, 0x77708634, 0xC06D4730, + 0x194B043D, 0xAE56C539, 0xAB068227, 0x1C1B4323, 0xC53D002E, 0x7220C12A, + 0xCF9D8E12, 0x78804F16, 0xA1A60C1B, 0x16BBCD1F, 0x13EB8A01, 0xA4F64B05, + 0x7DD00808, 0xCACDC90C, 0x07AB9778, 0xB0B6567C, 0x69901571, 0xDE8DD475, + 0xDBDD936B, 0x6CC0526F, 0xB5E61162, 0x02FBD066, 0xBF469F5E, 0x085B5E5A, + 0xD17D1D57, 0x6660DC53, 0x63309B4D, 0xD42D5A49, 0x0D0B1944, 0xBA16D840, + 0x97C6A5AC, 0x20DB64A8, 0xF9FD27A5, 0x4EE0E6A1, 0x4BB0A1BF, 0xFCAD60BB, + 0x258B23B6, 0x9296E2B2, 0x2F2BAD8A, 0x98366C8E, 0x41102F83, 0xF60DEE87, + 0xF35DA999, 0x4440689D, 0x9D662B90, 0x2A7BEA94, 0xE71DB4E0, 0x500075E4, + 0x892636E9, 0x3E3BF7ED, 0x3B6BB0F3, 0x8C7671F7, 0x555032FA, 0xE24DF3FE, + 0x5FF0BCC6, 0xE8ED7DC2, 0x31CB3ECF, 0x86D6FFCB, 0x8386B8D5, 
0x349B79D1, + 0xEDBD3ADC, 0x5AA0FBD8, 0xEEE00C69, 0x59FDCD6D, 0x80DB8E60, 0x37C64F64, + 0x3296087A, 0x858BC97E, 0x5CAD8A73, 0xEBB04B77, 0x560D044F, 0xE110C54B, + 0x38368646, 0x8F2B4742, 0x8A7B005C, 0x3D66C158, 0xE4408255, 0x535D4351, + 0x9E3B1D25, 0x2926DC21, 0xF0009F2C, 0x471D5E28, 0x424D1936, 0xF550D832, + 0x2C769B3F, 0x9B6B5A3B, 0x26D61503, 0x91CBD407, 0x48ED970A, 0xFFF0560E, + 0xFAA01110, 0x4DBDD014, 0x949B9319, 0x2386521D, 0x0E562FF1, 0xB94BEEF5, + 0x606DADF8, 0xD7706CFC, 0xD2202BE2, 0x653DEAE6, 0xBC1BA9EB, 0x0B0668EF, + 0xB6BB27D7, 0x01A6E6D3, 0xD880A5DE, 0x6F9D64DA, 0x6ACD23C4, 0xDDD0E2C0, + 0x04F6A1CD, 0xB3EB60C9, 0x7E8D3EBD, 0xC990FFB9, 0x10B6BCB4, 0xA7AB7DB0, + 0xA2FB3AAE, 0x15E6FBAA, 0xCCC0B8A7, 0x7BDD79A3, 0xC660369B, 0x717DF79F, + 0xA85BB492, 0x1F467596, 0x1A163288, 0xAD0BF38C, 0x742DB081, 0xC3307185, + 0x99908A5D, 0x2E8D4B59, 0xF7AB0854, 0x40B6C950, 0x45E68E4E, 0xF2FB4F4A, + 0x2BDD0C47, 0x9CC0CD43, 0x217D827B, 0x9660437F, 0x4F460072, 0xF85BC176, + 0xFD0B8668, 0x4A16476C, 0x93300461, 0x242DC565, 0xE94B9B11, 0x5E565A15, + 0x87701918, 0x306DD81C, 0x353D9F02, 0x82205E06, 0x5B061D0B, 0xEC1BDC0F, + 0x51A69337, 0xE6BB5233, 0x3F9D113E, 0x8880D03A, 0x8DD09724, 0x3ACD5620, + 0xE3EB152D, 0x54F6D429, 0x7926A9C5, 0xCE3B68C1, 0x171D2BCC, 0xA000EAC8, + 0xA550ADD6, 0x124D6CD2, 0xCB6B2FDF, 0x7C76EEDB, 0xC1CBA1E3, 0x76D660E7, + 0xAFF023EA, 0x18EDE2EE, 0x1DBDA5F0, 0xAAA064F4, 0x738627F9, 0xC49BE6FD, + 0x09FDB889, 0xBEE0798D, 0x67C63A80, 0xD0DBFB84, 0xD58BBC9A, 0x62967D9E, + 0xBBB03E93, 0x0CADFF97, 0xB110B0AF, 0x060D71AB, 0xDF2B32A6, 0x6836F3A2, + 0x6D66B4BC, 0xDA7B75B8, 0x035D36B5, 0xB440F7B1, 0x00000001 + }, + [AV_CRC_32_IEEE_LE] = { + 0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA, 0x076DC419, 0x706AF48F, + 0xE963A535, 0x9E6495A3, 0x0EDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988, + 0x09B64C2B, 0x7EB17CBD, 0xE7B82D07, 0x90BF1D91, 0x1DB71064, 0x6AB020F2, + 0xF3B97148, 0x84BE41DE, 0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7, + 0x136C9856, 0x646BA8C0, 0xFD62F97A, 
0x8A65C9EC, 0x14015C4F, 0x63066CD9, + 0xFA0F3D63, 0x8D080DF5, 0x3B6E20C8, 0x4C69105E, 0xD56041E4, 0xA2677172, + 0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B, 0x35B5A8FA, 0x42B2986C, + 0xDBBBC9D6, 0xACBCF940, 0x32D86CE3, 0x45DF5C75, 0xDCD60DCF, 0xABD13D59, + 0x26D930AC, 0x51DE003A, 0xC8D75180, 0xBFD06116, 0x21B4F4B5, 0x56B3C423, + 0xCFBA9599, 0xB8BDA50F, 0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924, + 0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D, 0x76DC4190, 0x01DB7106, + 0x98D220BC, 0xEFD5102A, 0x71B18589, 0x06B6B51F, 0x9FBFE4A5, 0xE8B8D433, + 0x7807C9A2, 0x0F00F934, 0x9609A88E, 0xE10E9818, 0x7F6A0DBB, 0x086D3D2D, + 0x91646C97, 0xE6635C01, 0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E, + 0x6C0695ED, 0x1B01A57B, 0x8208F4C1, 0xF50FC457, 0x65B0D9C6, 0x12B7E950, + 0x8BBEB8EA, 0xFCB9887C, 0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3, 0xFBD44C65, + 0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2, 0x4ADFA541, 0x3DD895D7, + 0xA4D1C46D, 0xD3D6F4FB, 0x4369E96A, 0x346ED9FC, 0xAD678846, 0xDA60B8D0, + 0x44042D73, 0x33031DE5, 0xAA0A4C5F, 0xDD0D7CC9, 0x5005713C, 0x270241AA, + 0xBE0B1010, 0xC90C2086, 0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F, + 0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4, 0x59B33D17, 0x2EB40D81, + 0xB7BD5C3B, 0xC0BA6CAD, 0xEDB88320, 0x9ABFB3B6, 0x03B6E20C, 0x74B1D29A, + 0xEAD54739, 0x9DD277AF, 0x04DB2615, 0x73DC1683, 0xE3630B12, 0x94643B84, + 0x0D6D6A3E, 0x7A6A5AA8, 0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1, + 0xF00F9344, 0x8708A3D2, 0x1E01F268, 0x6906C2FE, 0xF762575D, 0x806567CB, + 0x196C3671, 0x6E6B06E7, 0xFED41B76, 0x89D32BE0, 0x10DA7A5A, 0x67DD4ACC, + 0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5, 0xD6D6A3E8, 0xA1D1937E, + 0x38D8C2C4, 0x4FDFF252, 0xD1BB67F1, 0xA6BC5767, 0x3FB506DD, 0x48B2364B, + 0xD80D2BDA, 0xAF0A1B4C, 0x36034AF6, 0x41047A60, 0xDF60EFC3, 0xA867DF55, + 0x316E8EEF, 0x4669BE79, 0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236, + 0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F, 0xC5BA3BBE, 0xB2BD0B28, + 0x2BB45A92, 0x5CB36A04, 0xC2D7FFA7, 
0xB5D0CF31, 0x2CD99E8B, 0x5BDEAE1D, + 0x9B64C2B0, 0xEC63F226, 0x756AA39C, 0x026D930A, 0x9C0906A9, 0xEB0E363F, + 0x72076785, 0x05005713, 0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38, + 0x92D28E9B, 0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21, 0x86D3D2D4, 0xF1D4E242, + 0x68DDB3F8, 0x1FDA836E, 0x81BE16CD, 0xF6B9265B, 0x6FB077E1, 0x18B74777, + 0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C, 0x8F659EFF, 0xF862AE69, + 0x616BFFD3, 0x166CCF45, 0xA00AE278, 0xD70DD2EE, 0x4E048354, 0x3903B3C2, + 0xA7672661, 0xD06016F7, 0x4969474D, 0x3E6E77DB, 0xAED16A4A, 0xD9D65ADC, + 0x40DF0B66, 0x37D83BF0, 0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9, + 0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6, 0xBAD03605, 0xCDD70693, + 0x54DE5729, 0x23D967BF, 0xB3667A2E, 0xC4614AB8, 0x5D681B02, 0x2A6F2B94, + 0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D, 0x00000001 + }, + [AV_CRC_16_ANSI_LE] = { + 0x0000, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241, + 0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440, + 0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40, + 0x0A00, 0xCAC1, 0xCB81, 0x0B40, 0xC901, 0x09C0, 0x0880, 0xC841, + 0xD801, 0x18C0, 0x1980, 0xD941, 0x1B00, 0xDBC1, 0xDA81, 0x1A40, + 0x1E00, 0xDEC1, 0xDF81, 0x1F40, 0xDD01, 0x1DC0, 0x1C80, 0xDC41, + 0x1400, 0xD4C1, 0xD581, 0x1540, 0xD701, 0x17C0, 0x1680, 0xD641, + 0xD201, 0x12C0, 0x1380, 0xD341, 0x1100, 0xD1C1, 0xD081, 0x1040, + 0xF001, 0x30C0, 0x3180, 0xF141, 0x3300, 0xF3C1, 0xF281, 0x3240, + 0x3600, 0xF6C1, 0xF781, 0x3740, 0xF501, 0x35C0, 0x3480, 0xF441, + 0x3C00, 0xFCC1, 0xFD81, 0x3D40, 0xFF01, 0x3FC0, 0x3E80, 0xFE41, + 0xFA01, 0x3AC0, 0x3B80, 0xFB41, 0x3900, 0xF9C1, 0xF881, 0x3840, + 0x2800, 0xE8C1, 0xE981, 0x2940, 0xEB01, 0x2BC0, 0x2A80, 0xEA41, + 0xEE01, 0x2EC0, 0x2F80, 0xEF41, 0x2D00, 0xEDC1, 0xEC81, 0x2C40, + 0xE401, 0x24C0, 0x2580, 0xE541, 0x2700, 0xE7C1, 0xE681, 0x2640, + 0x2200, 0xE2C1, 0xE381, 0x2340, 0xE101, 0x21C0, 0x2080, 0xE041, + 0xA001, 0x60C0, 0x6180, 0xA141, 0x6300, 0xA3C1, 0xA281, 0x6240, + 0x6600, 
0xA6C1, 0xA781, 0x6740, 0xA501, 0x65C0, 0x6480, 0xA441, + 0x6C00, 0xACC1, 0xAD81, 0x6D40, 0xAF01, 0x6FC0, 0x6E80, 0xAE41, + 0xAA01, 0x6AC0, 0x6B80, 0xAB41, 0x6900, 0xA9C1, 0xA881, 0x6840, + 0x7800, 0xB8C1, 0xB981, 0x7940, 0xBB01, 0x7BC0, 0x7A80, 0xBA41, + 0xBE01, 0x7EC0, 0x7F80, 0xBF41, 0x7D00, 0xBDC1, 0xBC81, 0x7C40, + 0xB401, 0x74C0, 0x7580, 0xB541, 0x7700, 0xB7C1, 0xB681, 0x7640, + 0x7200, 0xB2C1, 0xB381, 0x7340, 0xB101, 0x71C0, 0x7080, 0xB041, + 0x5000, 0x90C1, 0x9181, 0x5140, 0x9301, 0x53C0, 0x5280, 0x9241, + 0x9601, 0x56C0, 0x5780, 0x9741, 0x5500, 0x95C1, 0x9481, 0x5440, + 0x9C01, 0x5CC0, 0x5D80, 0x9D41, 0x5F00, 0x9FC1, 0x9E81, 0x5E40, + 0x5A00, 0x9AC1, 0x9B81, 0x5B40, 0x9901, 0x59C0, 0x5880, 0x9841, + 0x8801, 0x48C0, 0x4980, 0x8941, 0x4B00, 0x8BC1, 0x8A81, 0x4A40, + 0x4E00, 0x8EC1, 0x8F81, 0x4F40, 0x8D01, 0x4DC0, 0x4C80, 0x8C41, + 0x4400, 0x84C1, 0x8581, 0x4540, 0x8701, 0x47C0, 0x4680, 0x8641, + 0x8201, 0x42C0, 0x4380, 0x8341, 0x4100, 0x81C1, 0x8081, 0x4040, + 0x0001 + }, +}; + +static const AVCRC *av_crc_get_table_ref(AVCRCId crc_id) +{ + return av_crc_table_ref[crc_id]; +} + +static uint32_t av_crc_ref(const AVCRC *ctx, uint32_t crc, const uint8_t *buffer, size_t length) +{ + const uint8_t *end = buffer + length; + + while (buffer < end) + crc = ctx[((uint8_t) crc) ^ *buffer++] ^ (crc >> 8); + + return crc; +} + +static void check_crc(void) +{ + LOCAL_ALIGNED_32(uint8_t, buf, [BUF_SIZE]); + + for (int i = 0; i < BUF_SIZE; i++) + buf[i] = rnd(); + + av_crc_init_fn(); + declare_func(uint32_t, const AVCRC *crctab, uint32_t crc, + const uint8_t *buffer, size_t length); + + func_type *fn = av_crc_func; + uint32_t crc_arr[] = { + 0, + 0x12, + 0x1234, + 0x123456, + 0x12345678, + 0x12000000, + 0x12340000, + 0x12345600, + UINT32_MAX + }; + AVCRCId crc_id_arr[] = { + AV_CRC_8_ATM, + AV_CRC_16_ANSI, + AV_CRC_16_CCITT, + AV_CRC_32_IEEE, + AV_CRC_32_IEEE_LE, + AV_CRC_16_ANSI_LE, + AV_CRC_24_IEEE, + AV_CRC_8_EBU, + }; + + if (check_func(fn, "av_crc")) { + for (uint32_t 
crc_id_idx = 0; crc_id_idx < FF_ARRAY_ELEMS(crc_id_arr); crc_id_idx++) { + AVCRCId crc_id = crc_id_arr[crc_id_idx]; + for (uint32_t crc_idx = 0; crc_idx < FF_ARRAY_ELEMS(crc_arr); crc_idx++) { + for (int i = 0; i < 200; i++) { + uint32_t crc0 = crc_arr[crc_idx]; + uint32_t crc1 = crc_arr[crc_idx]; + + crc0 = call_new(av_crc_get_table(crc_id), crc0, buf, i); + crc1 = av_crc_ref(av_crc_get_table_ref(crc_id), crc1, buf, i); + if (crc0 != crc1) + fail(); + } + } + } + + bench_new(av_crc_get_table(AV_CRC_32_IEEE_LE), UINT32_MAX, buf, BUF_SIZE); + } +} + +void checkasm_check_crc(void) +{ + check_crc(); + report("crc"); +} -- 2.49.1
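Note on the reference tables: the hard-coded av_crc_table_ref entries in tests/checkasm/crc.c can be cross-checked with a small generator. The following is a minimal sketch, not part of the patch. It assumes the usual generator polynomials for each CRC id (0x07, 0x1D, 0x8005, 0x1021, 0x864CFB, 0x04C11DB7, and the reflected forms 0xEDB88320 and 0xA001), which are not shown in this diff, and it assumes that big-endian table entries are stored byte-swapped, which is consistent with the values listed above. The names bswap32_sketch, table_entry_be, table_entry_le, crc_update_sketch and spot_check are hypothetical helpers introduced only for illustration. crc_update_sketch mirrors the loop in av_crc_ref.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit word. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Entry i of a non-reflected ("big-endian") table for a CRC of the given
 * width. The polynomial is left-aligned to 32 bits, the byte is processed
 * MSB first, and the result is byte-swapped (assumed storage layout). */
static uint32_t table_entry_be(unsigned i, int bits, uint32_t poly)
{
    uint32_t c = (uint32_t)i << 24;
    uint32_t p = poly << (32 - bits);

    for (int j = 0; j < 8; j++)
        c = (c << 1) ^ ((c & 0x80000000u) ? p : 0);
    return bswap32_sketch(c);
}

/* Entry i of a reflected ("little-endian") table; poly is already given
 * in reflected form and the byte is processed LSB first. */
static uint32_t table_entry_le(unsigned i, uint32_t poly)
{
    uint32_t c = i;

    for (int j = 0; j < 8; j++)
        c = (c >> 1) ^ ((c & 1) ? poly : 0);
    return c;
}

/* Byte-at-a-time update, same shape as the loop in av_crc_ref. */
static uint32_t crc_update_sketch(const uint32_t *tab, uint32_t crc,
                                  const uint8_t *buf, size_t len)
{
    for (size_t k = 0; k < len; k++)
        crc = tab[(uint8_t)crc ^ buf[k]] ^ (crc >> 8);
    return crc;
}

/* Compare index 1 of each generated table with the second value of the
 * corresponding row in av_crc_table_ref. */
static void spot_check(void)
{
    assert(table_entry_be(1,  8, 0x07)       == 0x07);       /* AV_CRC_8_ATM      */
    assert(table_entry_be(1,  8, 0x1D)       == 0x1D);       /* AV_CRC_8_EBU      */
    assert(table_entry_be(1, 16, 0x8005)     == 0x0580);     /* AV_CRC_16_ANSI    */
    assert(table_entry_be(1, 16, 0x1021)     == 0x2110);     /* AV_CRC_16_CCITT   */
    assert(table_entry_be(1, 24, 0x864CFB)   == 0xFB4C86);   /* AV_CRC_24_IEEE    */
    assert(table_entry_be(1, 32, 0x04C11DB7) == 0xB71DC104); /* AV_CRC_32_IEEE    */
    assert(table_entry_le(1, 0xEDB88320)     == 0x77073096); /* AV_CRC_32_IEEE_LE */
    assert(table_entry_le(1, 0xA001)         == 0xC0C1);     /* AV_CRC_16_ANSI_LE */
}
```

Looping i over 0..255 with either helper would regenerate a full 256-entry row. The 257th entry of each row (0x01) is not produced by this sketch.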