From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTPS id 062354E53D for ; Wed, 12 Mar 2025 07:06:15 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A5CB768D80F; Wed, 12 Mar 2025 09:06:11 +0200 (EET) Received: from mail-lj1-f176.google.com (mail-lj1-f176.google.com [209.85.208.176]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id B444468D6CF for ; Wed, 12 Mar 2025 09:06:04 +0200 (EET) Received: by mail-lj1-f176.google.com with SMTP id 38308e7fff4ca-30bd21f887aso55932621fa.1 for ; Wed, 12 Mar 2025 00:06:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20230601.gappssmtp.com; s=20230601; t=1741763164; x=1742367964; darn=ffmpeg.org; h=mime-version:references:message-id:in-reply-to:subject:to:from:date :from:to:cc:subject:date:message-id:reply-to; bh=Imv3oyZHbZu39lvUrK1rSXcaXbbBTgwCV4qUQac4NoY=; b=hqBtbN+yzjlU+MwVXDs3LdXCrHKbDaBU4KQz52BkYIl5cPMRR/De509jISwgnOWfXK M3LWDeP16WTAZn+FjWlFWng4TithRJAdx71p2pD9itoTwSxK+SKzM7ZqKA/NZOIawa+I F4qDQ7qI4/n9teAjbO7g9aeWXKw0U3Uu9cSWWX6t0HU6UsujS3zm0sFEq0egJys8E0ka sNvFZTmt1rBhYyHgdzqEoH8wRlXNJO5Cw1m+C82wYWeIdgst+feppxL1HJBHkQ1JXzlA EeaowKqqkZAjos4/Z+3IlbQGBLg9/ViPakRsXxwP0145Y40O2pCLHxXJChgOXf0VSIJ7 NyMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741763164; x=1742367964; h=mime-version:references:message-id:in-reply-to:subject:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Imv3oyZHbZu39lvUrK1rSXcaXbbBTgwCV4qUQac4NoY=; b=q0YB4SgdiQvpXpvkfZS31buXfAbohWu7O7/5YOj0pJkJxjAE7N7vCNLVFwRuDXhtnz Y7CprmE4xxTuBahkdge1nERcrnJI+j6XwDG3nhK09bP092Q+FNICskVll0vlAoaAAdM8 pT6QAhgJvGGx3xmQZmuNgTCs2w7AwoHRNGf9RAFKcKQR6CW/EimoXgQBAvKjg39jtdey lQBJldXcA+kVNCnQq+Ne2jgsosjyMYcvyGlnvrycfWFocOfw4k93K7jiP4EAmwc/KCRf 1GPBUFNNlNSEC5MteAEBiQ9wB5dmCNgJAgvCRWuEXpK+rBe3YSZiTFquwTQAG4M+8Oxl Arfw== X-Gm-Message-State: AOJu0YxCdbnumSoLd1zUrgwIPe/ymffGqDHVaV5SbF62aBGEHNEqHsnQ 1dIKxO4wnnP01QIEn2GZhCavGJyjN0Dmup9s8g0Ze4MVKzm0o245rAnSXLLLo8NySvNwLgZ5L+L Ot9Dm X-Gm-Gg: ASbGncuCv6e4hK/FNJgFIrsnLXzbPQIBvrZ6OAqDvmyr46VvAHygSP9TKOUA42z3ocp a945L35funEpRGL7u/s9bPe5ESw+eO3kOwr1lAq35YSa3QpfZN+j4nxueSdKMdWRRQzlmGmRXzu /NPaQH51jhz/JLbnXa4e4s+F0/HrK640L1K3UjNI4HGuJEZWxBnVMlneBMNe7VDl3N6VAktYyEN qJLjq5rKH42Si0k4s4IIvDet2wl1iOGHWHArgDlCzLNz0gVwY3G2sD+2p7vMD4T52MOt/G7/RHu g4YM8C3rz8MEIrsM/47/2gFfqK586Posn3Ey71RX177y1Edy4D9d0SAOsCV44lk2XDM5tXPWdsS hYqgVE58dArRBtSmbH+/tWkKK+3crtev6seIqfl3O X-Google-Smtp-Source: AGHT+IEVyE9/PmH5SN3FbOVCN06rm67ahiyl5rgsu8wJyYIbxesOtuSlXcw43M4O+GiffGBiDV5H0Q== X-Received: by 2002:a05:651c:4ca:b0:30c:aae:6d4e with SMTP id 38308e7fff4ca-30c0aae6eafmr51897681fa.23.1741763163440; Wed, 12 Mar 2025 00:06:03 -0700 (PDT) Received: from tunnel335574-pt.tunnel.tserv24.sto1.ipv6.he.net (tunnel335574-pt.tunnel.tserv24.sto1.ipv6.he.net. [2001:470:27:11::2]) by smtp.gmail.com with ESMTPSA id 38308e7fff4ca-30c1055bdc6sm11788621fa.18.2025.03.12.00.06.01 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Mar 2025 00:06:02 -0700 (PDT) Date: Wed, 12 Mar 2025 09:05:59 +0200 (EET) From: =?ISO-8859-15?Q?Martin_Storsj=F6?= To: FFmpeg development discussions and patches In-Reply-To: Message-ID: References: <20250303205612.3955254-1-prka@google.com> <20250311191737.2393125-1-prka@google.com> MIME-Version: 1.0 Subject: Re: [FFmpeg-devel] [PATCH v4] Mark C globals with small code model X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Archived-At: List-Archive: List-Post: On Wed, 12 Mar 2025, Andreas Rheinhardt wrote: > Pranav Kant via ffmpeg-devel: >> By default, all globals in C/C++ compiled by clang are allocated >> in non-large data sections. See [1] for background on code models. >> For PIC (Position independent code), this is fine as long as binary is >> small but as binary size increases, users maybe want to use medium/large >> code models (-mcmodel=medium) which moves data in to large sections. >> As data in these large sections cannot be accessed using PIC code >> anymore (as it may be too far away), compiler ends up using a different >> instruction sequence when building C/C++ code -- using GOT to access >> these globals (which can be relaxed by linker at link time if binary >> ends up being smaller). However, assembly files continue to access these >> globals defined in C/C++ files using older (and invalid instruction >> sequence). So, we mark all such globals with an attribute that forces >> them to be allocated in small sections allowing them to validly be >> accessed from the assembly code. >> >> This patch should not have any affect on builds that use small code >> model, which is the default mode. >> >> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models >> >> Signed-off-by: Pranav Kant >> --- >> libavcodec/ac3dsp.c | 2 ++ >> libavcodec/cabac.c | 2 ++ >> libavcodec/x86/constants.c | 8 ++++++++ >> libavutil/attributes.h | 6 ++++++ >> libavutil/attributes_internal.h | 16 ++++++++++++++++ >> 5 files changed, 34 insertions(+) >> >> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c >> index 730fa70fff..d16b6c24c3 100644 >> --- a/libavcodec/ac3dsp.c >> +++ b/libavcodec/ac3dsp.c >> @@ -25,6 +25,7 @@ >> >> #include "config.h" >> #include "libavutil/attributes.h" >> +#include "libavutil/attributes_internal.h" >> #include "libavutil/common.h" >> #include "libavutil/intmath.h" >> #include "libavutil/mem_internal.h" >> @@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap, >> mant_cnt[bap[len]]++; >> } >> >> +attribute_mcmodel_small >> DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = { > > Shouldn't stuff like this be applied to the declaration so that C code > can also take advantage of the knowledge that this object will be placed > in the small code section? > >> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16 >> }; >> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c >> index 7d41cd2ae6..b8c6db29a2 100644 >> --- a/libavcodec/cabac.c >> +++ b/libavcodec/cabac.c >> @@ -24,11 +24,13 @@ >> * Context Adaptive Binary Arithmetic Coder. >> */ >> >> +#include "libavutil/attributes_internal.h" >> #include "libavutil/error.h" >> #include "libavutil/mem_internal.h" >> >> #include "cabac.h" >> >> +attribute_mcmodel_small >> DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = { > > Your commit message ("However, assembly files continue to access") > speaks only of assembly files, i.e. external asm. Yet this here is only > used by inline ASM. Looking through the code the reason for this is that > I thought that specifying the memory model is only necessary for stuff > used by external asm, yet ff_h264_cabac_tables does not seem to be used > by external ASM at all, only inline ASM. If I see this correctly, the > reason for this is that LOCAL_MANGLE (and therefore MANGLE) uses rip > addressing on x64 when configure sets the RIP define. But this means > that the set of files needing attribute_mcmodel_small is a superset of > the files currently using DECLARE_ASM_ALIGNED. This means that one would > only need two macros for the variables accessed by ASM: One for only > external ASM and one for inline ASM (and potentially external ASM) > instead of adding attribute_mcmodel_small at various places in the codebase. > >> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5, >> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4, >> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c >> index bc7f2b17b8..347b7dd1d3 100644 >> --- a/libavcodec/x86/constants.c >> +++ b/libavcodec/x86/constants.c >> @@ -18,17 +18,21 @@ >> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA >> */ >> >> +#include "libavutil/attributes_internal.h" >> #include "libavutil/mem_internal.h" >> #include "libavutil/x86/asm.h" // for xmm_reg >> #include "constants.h" >> >> +attribute_mcmodel_small >> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = { > 0x0001000100010001ULL, 0x0001000100010001ULL,> > 0x0001000100010001ULL, 0x0001000100010001ULL }; >> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL, >> 0x0002000200020002ULL, 0x0002000200020002ULL }; >> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL }; >> +attribute_mcmodel_small >> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL, >> 0x0004000400040004ULL, 0x0004000400040004ULL }; >> +attribute_mcmodel_small >> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL }; >> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL }; >> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL }; >> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010 >> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL, >> 0x0200020002000200ULL, 0x0200020002000200ULL }; >> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL }; >> +attribute_mcmodel_small >> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL, >> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL}; >> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL, >> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF >> >> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL, >> 0x0000000000000000ULL, 0x0000000000000000ULL }; >> +attribute_mcmodel_small >> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL, >> 0x0101010101010101ULL, 0x0101010101010101ULL }; >> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL, >> 0x0202020202020202ULL, 0x0202020202020202ULL }; >> +attribute_mcmodel_small >> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL, >> 0x0303030303030303ULL, 0x0303030303030303ULL }; >> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL }; >> +attribute_mcmodel_small >> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL, >> 0x8080808080808080ULL, 0x8080808080808080ULL }; >> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL, > > Why do you add this attribute to only eight of these constants? How did > you come up with this list? In addition to the above, git grep cextern > shows that there are several more variables than just these one (e.g. > ff_h263_loop_filter_strength, ff_sbr_noise_table). > >> diff --git a/libavutil/attributes.h b/libavutil/attributes.h >> index 04c615c952..dfc35fa31e 100644 >> --- a/libavutil/attributes.h >> +++ b/libavutil/attributes.h >> @@ -40,6 +40,12 @@ >> # define AV_HAS_BUILTIN(x) 0 >> #endif >> >> +#ifdef __has_attribute >> +# define AV_HAS_ATTRIBUTE(x) __has_attribute(x) >> +#else >> +# define AV_HAS_ATTRIBUTE(x) 0 >> +#endif >> + >> #ifndef av_always_inline >> #if AV_GCC_VERSION_AT_LEAST(3,1) >> # define av_always_inline __attribute__((always_inline)) inline >> diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h >> index bc85ce77ff..c557fa0af0 100644 >> --- a/libavutil/attributes_internal.h >> +++ b/libavutil/attributes_internal.h >> @@ -19,6 +19,7 @@ >> #ifndef AVUTIL_ATTRIBUTES_INTERNAL_H >> #define AVUTIL_ATTRIBUTES_INTERNAL_H >> >> +#include "config.h" >> #include "attributes.h" >> >> #if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) && (defined(__ELF__) || defined(__MACH__)) >> @@ -33,4 +34,19 @@ >> >> #define EXTERN extern attribute_visibility_hidden >> >> +/** >> + * Some globals defined in C files are used from hardcoded asm that assumes small >> + * code model (that is, accessing these globals without GOT). This is a problem >> + * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates > > s/FFMpeg/FFmpeg/ > >> + * all globals in a data section that's unreachable with PC relative instructions >> + * (small code model instruction sequence). We mark all such globals with this >> + * attribute_mcmodel_small to ensure assembly accessible globals continue to be >> + * allocated in sections reachable from PC relative instructions. >> + */ >> +#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model) > > You make this ARCH_X86_64 only, yet the concept of code models also > exists for other arches, e.g. AArch64. I presume that assembly for other > arches presumes that we are not using the large code model for accesses, > so that the same issue can happen with them. Then we have two options: FWIW, in e4ac156b7c47725327dff78bb83f5eecbaee3add we added attribute_visibility_hidden to a bunch of symbols that are accessed from aarch64 assembly, in a similar fashion. While the effects are different, I wonder if we should abstract these two down to a more generic attribute for any symbol accessed directly from any assembly. // Martin _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".