From: Pranav Kant via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
To: ffmpeg-devel@ffmpeg.org
Cc: Pranav Kant <prka@google.com>
Date: Fri, 21 Mar 2025 00:28:20 +0000
Subject: [FFmpeg-devel] [PATCH v5] Mark C globals with small code model
Message-ID: <20250321002820.717356-1-prka@google.com>
In-Reply-To: <CAPvNhsL-V9bKMDk1pw3DuZb-X_nZw0=-mnjBbNQYWQhADk65bQ@mail.gmail.com>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/20250321002820.717356-1-prka@google.com/>

By default, all globals in C/C++ compiled by clang are allocated in
non-large data sections. See [1] for background on code models.
For PIC (position-independent code), this is fine as long as the binary
is small, but as the binary grows, users may want to use the medium or
large code model (e.g. -mcmodel=medium), which moves data into large
sections. Because data in these large sections can no longer be reached
by the usual PC-relative code (it may be too far away), the compiler
uses a different instruction sequence when building C/C++ code: it
accesses these globals through the GOT (which the linker can relax at
link time if the binary ends up being smaller).

However, external assembly and inline assembly continue to access these
globals using the older PC-relative addressing, which may not work
because the globals may be placed too far away.

Introduce new macros for such variables that mark them with the small
code model attribute. This ensures that these variables are never
allocated in large data sections and can still be validly accessed from
assembly code.

This patch should not have any effect on builds that use the small code
model, which is the default mode.

[1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models

Signed-off-by: Pranav Kant <prka@google.com>
---
 libavcodec/ac3dsp.h             |  4 +++-
 libavcodec/cabac.h              |  4 +++-
 libavcodec/x86/constants.h      | 12 +++++++-----
 libavutil/attributes.h          |  6 ++++++
 libavutil/attributes_internal.h | 16 ++++++++++++++++
 libavutil/mem_internal.h        | 13 +++++++++++++
 6 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/libavcodec/ac3dsp.h b/libavcodec/ac3dsp.h
index b1b2bced8f..a3c55a833b 100644
--- a/libavcodec/ac3dsp.h
+++ b/libavcodec/ac3dsp.h
@@ -25,11 +25,13 @@
 #include <stddef.h>
 #include <stdint.h>
 
+#include "libavutil/mem_internal.h"
+
 /**
  * Number of mantissa bits written for each bap value.
  * bap values with fractional bits are set to 0 and are calculated separately.
  */
-extern const uint16_t ff_ac3_bap_bits[16];
+extern DECLARE_EXTERNAL_ASM_VAR(16, const uint16_t, ff_ac3_bap_bits)[16];
 
 typedef struct AC3DSPContext {
     /**
diff --git a/libavcodec/cabac.h b/libavcodec/cabac.h
index 38d06b2842..df352258c6 100644
--- a/libavcodec/cabac.h
+++ b/libavcodec/cabac.h
@@ -29,7 +29,9 @@
 
 #include <stdint.h>
 
-extern const uint8_t ff_h264_cabac_tables[512 + 4*2*64 + 4*64 + 63];
+#include "libavutil/mem_internal.h"
+
+extern DECLARE_ASM_VAR(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63];
 #define H264_NORM_SHIFT_OFFSET 0
 #define H264_LPS_RANGE_OFFSET 512
 #define H264_MLPS_STATE_OFFSET 1024
diff --git a/libavcodec/x86/constants.h b/libavcodec/x86/constants.h
index 0c6bf41fa0..2561302604 100644
--- a/libavcodec/x86/constants.h
+++ b/libavcodec/x86/constants.h
@@ -23,13 +23,14 @@
 
 #include <stdint.h>
 
+#include "libavutil/mem_internal.h"
 #include "libavutil/x86/asm.h"
 
-extern const ymm_reg ff_pw_1;
+extern DECLARE_EXTERNAL_ASM_VAR(32, const ymm_reg, ff_pw_1);
 extern const ymm_reg ff_pw_2;
 extern const xmm_reg ff_pw_3;
-extern const ymm_reg ff_pw_4;
-extern const xmm_reg ff_pw_5;
+extern DECLARE_ASM_VAR(32, const ymm_reg, ff_pw_4);
+extern DECLARE_ASM_VAR(16, const xmm_reg, ff_pw_5);
 extern const xmm_reg ff_pw_8;
 extern const xmm_reg ff_pw_9;
 extern const uint64_t ff_pw_15;
@@ -43,7 +44,7 @@ extern const uint64_t ff_pw_128;
 extern const ymm_reg ff_pw_255;
 extern const ymm_reg ff_pw_256;
 extern const ymm_reg ff_pw_512;
-extern const ymm_reg ff_pw_1023;
+extern DECLARE_EXTERNAL_ASM_VAR(32, const ymm_reg, ff_pw_1023);
 extern const ymm_reg ff_pw_1024;
 extern const ymm_reg ff_pw_2048;
 extern const ymm_reg ff_pw_4095;
@@ -52,9 +53,10 @@ extern const ymm_reg ff_pw_8192;
 extern const ymm_reg ff_pw_m1;
 
 extern const ymm_reg ff_pb_0;
-extern const ymm_reg ff_pb_1;
+extern DECLARE_EXTERNAL_ASM_VAR(32, const ymm_reg, ff_pb_1);
 extern const ymm_reg ff_pb_2;
 extern const ymm_reg ff_pb_3;
+extern DECLARE_ASM_VAR(32, const xmm_reg, ff_pb_15);
 extern const ymm_reg ff_pb_80;
 extern const ymm_reg ff_pb_FE;
 extern const uint64_t ff_pb_FC;
diff --git a/libavutil/attributes.h b/libavutil/attributes.h
index 04c615c952..dfc35fa31e 100644
--- a/libavutil/attributes.h
+++ b/libavutil/attributes.h
@@ -40,6 +40,12 @@
 # define AV_HAS_BUILTIN(x) 0
 #endif
 
+#ifdef __has_attribute
+# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
+#else
+# define AV_HAS_ATTRIBUTE(x) 0
+#endif
+
 #ifndef av_always_inline
 #if AV_GCC_VERSION_AT_LEAST(3,1)
 # define av_always_inline __attribute__((always_inline)) inline
diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h
index bc85ce77ff..c557fa0af0 100644
--- a/libavutil/attributes_internal.h
+++ b/libavutil/attributes_internal.h
@@ -19,6 +19,7 @@
 #ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
 #define AVUTIL_ATTRIBUTES_INTERNAL_H
 
+#include "config.h"
 #include "attributes.h"
 
 #if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) && (defined(__ELF__) || defined(__MACH__))
@@ -33,4 +34,19 @@
 
 #define EXTERN extern attribute_visibility_hidden
 
+/**
+ * Some globals defined in C files are used from hardcoded asm that assumes small
+ * code model (that is, accessing these globals without GOT). This is a problem
+ * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
+ * all globals in a data section that's unreachable with PC relative instructions
+ * (small code model instruction sequence). We mark all such globals with this
+ * attribute_mcmodel_small to ensure assembly accessible globals continue to be
+ * allocated in sections reachable from PC relative instructions.
+ */
+#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
+# define attribute_mcmodel_small __attribute__((model("small")))
+#else
+# define attribute_mcmodel_small
+#endif
+
 #endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
diff --git a/libavutil/mem_internal.h b/libavutil/mem_internal.h
index c027fa51c3..efb4c89c39 100644
--- a/libavutil/mem_internal.h
+++ b/libavutil/mem_internal.h
@@ -29,6 +29,7 @@
 #endif
 
 #include "attributes.h"
+#include "attributes_internal.h"
 #include "macros.h"
 
 /**
@@ -113,6 +114,18 @@
 #define DECLARE_ALIGNED_32(t,v) DECLARE_ALIGNED_T(ALIGN_32, t, v)
 #define DECLARE_ALIGNED_64(t,v) DECLARE_ALIGNED_T(ALIGN_64, t, v)
 
+// DECLARE_ASM_VAR used for variables accessed by inline asm
+// and external assembly
+#define DECLARE_ASM_VAR(n,t,v) \
+    attribute_mcmodel_small \
+    alignas(n) t v
+// DECLARE_EXTERNAL_ASM_VAR used for variables exclusively
+// accessed by external assembly
+#define DECLARE_EXTERNAL_ASM_VAR(n,t,v) \
+    attribute_visibility_hidden \
+    attribute_mcmodel_small \
+    alignas(n) t v
+
 // Some broken preprocessors need a second expansion
 // to be forced to tokenize __VA_ARGS__
 #define E1(x) x
-- 
2.49.0.395.g12beb8f557-goog
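
For readers who want to try the idea outside the FFmpeg tree, here is a
minimal standalone sketch of how a definition site might use a macro
shaped like DECLARE_ASM_VAR. It is not part of the patch. All demo_*
and DEMO_* names are hypothetical, the table contents are made up, and
__x86_64__ is assumed here as a stand-in for ARCH_X86_64 from config.h.
On compilers without the model attribute the macro expands to nothing,
so only the alignment takes effect.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Feature probe shaped like the AV_HAS_ATTRIBUTE macro in the patch. */
#ifdef __has_attribute
#  define DEMO_HAS_ATTRIBUTE(x) __has_attribute(x)
#else
#  define DEMO_HAS_ATTRIBUTE(x) 0
#endif

/* Assumption: __x86_64__ stands in for FFmpeg's ARCH_X86_64. */
#if defined(__x86_64__) && defined(__ELF__) && DEMO_HAS_ATTRIBUTE(model)
#  define demo_mcmodel_small __attribute__((model("small")))
#else
#  define demo_mcmodel_small /* no-op when the attribute is unavailable */
#endif

/* Same shape as DECLARE_ASM_VAR: code model attribute, alignment, type, name. */
#define DEMO_DECLARE_ASM_VAR(n, t, v) demo_mcmodel_small alignas(n) t v

/* Hypothetical table that hand-written assembly would read PC-relatively. */
DEMO_DECLARE_ASM_VAR(16, const uint16_t, demo_table)[4] = { 1, 2, 3, 4 };

static_assert(sizeof(demo_table) == 4 * sizeof(uint16_t), "unexpected table size");

/* Returns 1 if the table honours the requested 16-byte alignment. */
int demo_table_is_aligned(void)
{
    return ((uintptr_t)demo_table % 16) == 0;
}

/* Sums the table from C; the attribute does not change C-level semantics. */
unsigned demo_table_sum(void)
{
    unsigned sum = 0;
    for (int i = 0; i < 4; i++)
        sum += demo_table[i];
    return sum;
}
```

Building this with -mcmodel=medium on a compiler that supports the
attribute and inspecting the object file's sections is one way to check
that the variable stays out of the large data sections; the exact
section names depend on the toolchain.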