* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-25 21:37 [FFmpeg-devel] [PATCH] Mark C globals with small code model Pranav Kant via ffmpeg-devel
@ 2025-02-25 23:03 ` Andreas Rheinhardt
2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
2025-02-28 2:31 ` [FFmpeg-devel] [PATCH] " Lynne
2 siblings, 0 replies; 16+ messages in thread
From: Andreas Rheinhardt @ 2025-02-25 23:03 UTC (permalink / raw)
To: ffmpeg-devel
Pranav Kant via ffmpeg-devel:
> By default, all globals in C/C++ compiled by clang are allocated
> in non-large data sections. See [1] for background on code models.
> For PIC (Position independent code), this is fine as long as binary is
> small but as binary size increases, users maybe want to use medium/large
> code models (-mcmodel=medium) which moves data in to large sections.
> As data in these large sections cannot be accessed using PIC code
> anymore (as it may be too far away), compiler ends up using a different
> instruction sequence when building C/C++ code -- using GOT to access
> these globals (which can be relaxed by linker at link time if binary
> ends up being smaller). However, assembly files continue to access these
> globals defined in C/C++ files using older (and invalid instruction
> sequence). So, we mark all such globals with an attribute that forces
> them to be allocated in small sections allowing them to validly be
> accessed from the assembly code.
>
> This patch should not have any affect on builds that use small code
> model, which is the default mode.
>
> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
> ---
> libavcodec/ac3dsp.c | 2 +-
> libavcodec/cabac.c | 3 ++-
> libavcodec/x86/constants.c | 8 ++++++++
> libavutil/attributes.h | 6 ++++++
> 4 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> index 730fa70fff..43b4fcbda9 100644
> --- a/libavcodec/ac3dsp.c
> +++ b/libavcodec/ac3dsp.c
> @@ -104,7 +104,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
> mant_cnt[bap[len]]++;
> }
>
> -DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> +av_mcmodel_small DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
> };
>
> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
> index 7d41cd2ae6..dfc3ba135a 100644
> --- a/libavcodec/cabac.c
> +++ b/libavcodec/cabac.c
> @@ -24,12 +24,13 @@
> * Context Adaptive Binary Arithmetic Coder.
> */
>
> +#include "libavutil/attributes.h"
> #include "libavutil/error.h"
> #include "libavutil/mem_internal.h"
>
> #include "cabac.h"
>
> -DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
> +av_mcmodel_small DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
> 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
> index bc7f2b17b8..9a5af2871c 100644
> --- a/libavcodec/x86/constants.c
> +++ b/libavcodec/x86/constants.c
> @@ -18,17 +18,21 @@
> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> */
>
> +#include "libavutil/attributes.h"
> #include "libavutil/mem_internal.h"
> #include "libavutil/x86/asm.h" // for xmm_reg
> #include "constants.h"
>
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = { 0x0001000100010001ULL, 0x0001000100010001ULL,
> 0x0001000100010001ULL, 0x0001000100010001ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
> 0x0002000200020002ULL, 0x0002000200020002ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
> +av_mcmodel_small
> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
> 0x0004000400040004ULL, 0x0004000400040004ULL };
> +av_mcmodel_small
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
> 0x0200020002000200ULL, 0x0200020002000200ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
> 0x0000000000000000ULL, 0x0000000000000000ULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
> 0x0101010101010101ULL, 0x0101010101010101ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
> 0x0202020202020202ULL, 0x0202020202020202ULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
> 0x0303030303030303ULL, 0x0303030303030303ULL };
> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
> 0x8080808080808080ULL, 0x8080808080808080ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
> diff --git a/libavutil/attributes.h b/libavutil/attributes.h
> index 04c615c952..704a7070db 100644
> --- a/libavutil/attributes.h
> +++ b/libavutil/attributes.h
> @@ -104,6 +104,12 @@
> # define attribute_deprecated
> #endif
>
> +#if defined(__clang__) && __has_attribute(model)
> +# define av_mcmodel_small __attribute__((model("small")))
> +#else
> +# define av_mcmodel_small
> +#endif
This should be in attributes_internal.h. Or maybe we should add new
macros in mem_internal.h for the case of variables with external linkage
and static storage duration (one macro for the const case, one for the
non-const case) that would automatically include your proposed attribute?
> +
> /**
> * Disable warnings about deprecated features
> * This is useful for sections of code kept for backward compatibility and
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-25 21:37 [FFmpeg-devel] [PATCH] Mark C globals with small code model Pranav Kant via ffmpeg-devel
2025-02-25 23:03 ` Andreas Rheinhardt
@ 2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
2025-02-26 19:45 ` Pranav Kant via ffmpeg-devel
` (3 more replies)
2025-02-28 2:31 ` [FFmpeg-devel] [PATCH] " Lynne
2 siblings, 4 replies; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-02-26 19:44 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Pranav Kant
By default, all globals in C/C++ compiled by clang are allocated
in non-large data sections. See [1] for background on code models.
For PIC (Position independent code), this is fine as long as binary is
small but as binary size increases, users maybe want to use medium/large
code models (-mcmodel=medium) which moves data in to large sections.
As data in these large sections cannot be accessed using PIC code
anymore (as it may be too far away), compiler ends up using a different
instruction sequence when building C/C++ code -- using GOT to access
these globals (which can be relaxed by linker at link time if binary
ends up being smaller). However, assembly files continue to access these
globals defined in C/C++ files using older (and invalid instruction
sequence). So, we mark all such globals with an attribute that forces
them to be allocated in small sections allowing them to validly be
accessed from the assembly code.
This patch should not have any affect on builds that use small code
model, which is the default mode.
[1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
Signed-off-by: Pranav Kant <prka@google.com>
---
libavcodec/ac3dsp.c | 2 ++
libavcodec/cabac.c | 2 ++
libavcodec/x86/constants.c | 8 ++++++++
libavutil/attributes_internal.h | 15 +++++++++++++++
4 files changed, 27 insertions(+)
diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
index 730fa70fff..d16b6c24c3 100644
--- a/libavcodec/ac3dsp.c
+++ b/libavcodec/ac3dsp.c
@@ -25,6 +25,7 @@
#include "config.h"
#include "libavutil/attributes.h"
+#include "libavutil/attributes_internal.h"
#include "libavutil/common.h"
#include "libavutil/intmath.h"
#include "libavutil/mem_internal.h"
@@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
mant_cnt[bap[len]]++;
}
+attribute_mcmodel_small
DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
};
diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
index 7d41cd2ae6..b8c6db29a2 100644
--- a/libavcodec/cabac.c
+++ b/libavcodec/cabac.c
@@ -24,11 +24,13 @@
* Context Adaptive Binary Arithmetic Coder.
*/
+#include "libavutil/attributes_internal.h"
#include "libavutil/error.h"
#include "libavutil/mem_internal.h"
#include "cabac.h"
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
index bc7f2b17b8..347b7dd1d3 100644
--- a/libavcodec/x86/constants.c
+++ b/libavcodec/x86/constants.c
@@ -18,17 +18,21 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
+#include "libavutil/attributes_internal.h"
#include "libavutil/mem_internal.h"
#include "libavutil/x86/asm.h" // for xmm_reg
#include "constants.h"
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = { 0x0001000100010001ULL, 0x0001000100010001ULL,
0x0001000100010001ULL, 0x0001000100010001ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
0x0002000200020002ULL, 0x0002000200020002ULL };
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
0x0004000400040004ULL, 0x0004000400040004ULL };
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
@@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
0x0200020002000200ULL, 0x0200020002000200ULL };
DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
@@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
0x0000000000000000ULL, 0x0000000000000000ULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
0x0101010101010101ULL, 0x0101010101010101ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
0x0202020202020202ULL, 0x0202020202020202ULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
0x0303030303030303ULL, 0x0303030303030303ULL };
DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
0x8080808080808080ULL, 0x8080808080808080ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h
index 3df1ee6af3..0c40e6461f 100644
--- a/libavutil/attributes_internal.h
+++ b/libavutil/attributes_internal.h
@@ -31,4 +31,19 @@
# define FF_VISIBILITY_POP_HIDDEN
#endif
+/**
+ * Some globals defined in C files are used from hardcoded asm that assumes small
+ * code model (that is, accessing these globals without GOT). This is a problem
+ * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
+ * all globals in a data section that's unreachable with PC relative instructions
+ * (small code model instruction sequence). We mark all such globals with this
+ * attribute_mcmodel_small to ensure assembly accessible globals continue to be
+ * allocated in sections reachable from PC relative instructions.
+ */
+#if ARCH_X86_64 && defined(__ELF__) && __has_attribute(model)
+# define attribute_mcmodel_small __attribute__(model("small"))
+#else
+# define attribute_mcmodel_small
+#endif
+
#endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
--
2.48.1.658.g4767266eb4-goog
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
@ 2025-02-26 19:45 ` Pranav Kant via ffmpeg-devel
2025-02-27 3:36 ` James Zern via ffmpeg-devel
` (2 subsequent siblings)
3 siblings, 0 replies; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-02-26 19:45 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Pranav Kant
I added it to attributes_internal.h. The existing attribute in
attributes_internal.h (attribute_visibility_hidden) is also being used with
DECLARE_ALIGNED macros (see libavcodec/sbrdsp_template.c). My new macro is
similar in nature.
On Wed, Feb 26, 2025 at 11:44 AM Pranav Kant <prka@google.com> wrote:
> By default, all globals in C/C++ compiled by clang are allocated
> in non-large data sections. See [1] for background on code models.
> For PIC (Position independent code), this is fine as long as binary is
> small but as binary size increases, users maybe want to use medium/large
> code models (-mcmodel=medium) which moves data in to large sections.
> As data in these large sections cannot be accessed using PIC code
> anymore (as it may be too far away), compiler ends up using a different
> instruction sequence when building C/C++ code -- using GOT to access
> these globals (which can be relaxed by linker at link time if binary
> ends up being smaller). However, assembly files continue to access these
> globals defined in C/C++ files using older (and invalid instruction
> sequence). So, we mark all such globals with an attribute that forces
> them to be allocated in small sections allowing them to validly be
> accessed from the assembly code.
>
> This patch should not have any affect on builds that use small code
> model, which is the default mode.
>
> [1]
> https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
>
> Signed-off-by: Pranav Kant <prka@google.com>
> ---
> libavcodec/ac3dsp.c | 2 ++
> libavcodec/cabac.c | 2 ++
> libavcodec/x86/constants.c | 8 ++++++++
> libavutil/attributes_internal.h | 15 +++++++++++++++
> 4 files changed, 27 insertions(+)
>
> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> index 730fa70fff..d16b6c24c3 100644
> --- a/libavcodec/ac3dsp.c
> +++ b/libavcodec/ac3dsp.c
> @@ -25,6 +25,7 @@
>
> #include "config.h"
> #include "libavutil/attributes.h"
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/common.h"
> #include "libavutil/intmath.h"
> #include "libavutil/mem_internal.h"
> @@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t
> mant_cnt[16], uint8_t *bap,
> mant_cnt[bap[len]]++;
> }
>
> +attribute_mcmodel_small
> DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
> };
> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
> index 7d41cd2ae6..b8c6db29a2 100644
> --- a/libavcodec/cabac.c
> +++ b/libavcodec/cabac.c
> @@ -24,11 +24,13 @@
> * Context Adaptive Binary Arithmetic Coder.
> */
>
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/error.h"
> #include "libavutil/mem_internal.h"
>
> #include "cabac.h"
>
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64
> + 4*64 + 63] = {
> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
> index bc7f2b17b8..347b7dd1d3 100644
> --- a/libavcodec/x86/constants.c
> +++ b/libavcodec/x86/constants.c
> @@ -18,17 +18,21 @@
> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> */
>
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/mem_internal.h"
> #include "libavutil/x86/asm.h" // for xmm_reg
> #include "constants.h"
>
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = {
> 0x0001000100010001ULL, 0x0001000100010001ULL,
>
> 0x0001000100010001ULL, 0x0001000100010001ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = {
> 0x0002000200020002ULL, 0x0002000200020002ULL,
>
> 0x0002000200020002ULL, 0x0002000200020002ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = {
> 0x0003000300030003ULL, 0x0003000300030003ULL };
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = {
> 0x0004000400040004ULL, 0x0004000400040004ULL,
>
> 0x0004000400040004ULL, 0x0004000400040004ULL };
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = {
> 0x0005000500050005ULL, 0x0005000500050005ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = {
> 0x0008000800080008ULL, 0x0008000800080008ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = {
> 0x0009000900090009ULL, 0x0009000900090009ULL };
> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = {
> 0x0100010001000100ULL, 0x010
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = {
> 0x0200020002000200ULL, 0x0200020002000200ULL,
>
> 0x0200020002000200ULL, 0x0200020002000200ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = {
> 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = {
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
>
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = {
> 0x0400040004000400ULL, 0x0400040004000400ULL,
> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = {
> 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = {
> 0x0000000000000000ULL, 0x0000000000000000ULL,
>
> 0x0000000000000000ULL, 0x0000000000000000ULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = {
> 0x0101010101010101ULL, 0x0101010101010101ULL,
>
> 0x0101010101010101ULL, 0x0101010101010101ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = {
> 0x0202020202020202ULL, 0x0202020202020202ULL,
>
> 0x0202020202020202ULL, 0x0202020202020202ULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = {
> 0x0303030303030303ULL, 0x0303030303030303ULL,
>
> 0x0303030303030303ULL, 0x0303030303030303ULL };
> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = {
> 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = {
> 0x8080808080808080ULL, 0x8080808080808080ULL,
>
> 0x8080808080808080ULL, 0x8080808080808080ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = {
> 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
> diff --git a/libavutil/attributes_internal.h
> b/libavutil/attributes_internal.h
> index 3df1ee6af3..0c40e6461f 100644
> --- a/libavutil/attributes_internal.h
> +++ b/libavutil/attributes_internal.h
> @@ -31,4 +31,19 @@
> # define FF_VISIBILITY_POP_HIDDEN
> #endif
>
> +/**
> + * Some globals defined in C files are used from hardcoded asm that
> assumes small
> + * code model (that is, accessing these globals without GOT). This is a
> problem
> + * when FFMpeg is built with medium code model (-mcmodel=medium) which
> allocates
> + * all globals in a data section that's unreachable with PC relative
> instructions
> + * (small code model instruction sequence). We mark all such globals with
> this
> + * attribute_mcmodel_small to ensure assembly accessible globals continue
> to be
> + * allocated in sections reachable from PC relative instructions.
> + */
> +#if ARCH_X86_64 && defined(__ELF__) && __has_attribute(model)
> +# define attribute_mcmodel_small __attribute__(model("small"))
> +#else
> +# define attribute_mcmodel_small
> +#endif
> +
> #endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
> --
> 2.48.1.658.g4767266eb4-goog
>
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
2025-02-26 19:45 ` Pranav Kant via ffmpeg-devel
@ 2025-02-27 3:36 ` James Zern via ffmpeg-devel
2025-02-28 1:14 ` Michael Niedermayer
2025-03-03 20:56 ` [FFmpeg-devel] [PATCH v3] " Pranav Kant via ffmpeg-devel
3 siblings, 0 replies; 16+ messages in thread
From: James Zern via ffmpeg-devel @ 2025-02-27 3:36 UTC (permalink / raw)
To: FFmpeg development discussions and patches; +Cc: James Zern
On Wed, Feb 26, 2025 at 11:44 AM Pranav Kant via ffmpeg-devel
<ffmpeg-devel@ffmpeg.org> wrote:
>
> [...]
> --- a/libavutil/attributes_internal.h
> +++ b/libavutil/attributes_internal.h
> @@ -31,4 +31,19 @@
> # define FF_VISIBILITY_POP_HIDDEN
> #endif
>
> +/**
> + * Some globals defined in C files are used from hardcoded asm that assumes small
> + * code model (that is, accessing these globals without GOT). This is a problem
> + * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
> + * all globals in a data section that's unreachable with PC relative instructions
> + * (small code model instruction sequence). We mark all such globals with this
> + * attribute_mcmodel_small to ensure assembly accessible globals continue to be
> + * allocated in sections reachable from PC relative instructions.
> + */
> +#if ARCH_X86_64 && defined(__ELF__) && __has_attribute(model)
You should check `defined(__has_attribute)` before using it [1], the
preprocessor won't short circuit. See also __has_feature in
libavutil/aarch64/asm.S.
> +# define attribute_mcmodel_small __attribute__(model("small"))
> +#else
> +# define attribute_mcmodel_small
> +#endif
> +
> #endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
> --
[1]: https://clang.llvm.org/docs/LanguageExtensions.html#has-attribute
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
2025-02-26 19:45 ` Pranav Kant via ffmpeg-devel
2025-02-27 3:36 ` James Zern via ffmpeg-devel
@ 2025-02-28 1:14 ` Michael Niedermayer
2025-03-07 2:13 ` Pranav Kant via ffmpeg-devel
2025-03-03 20:56 ` [FFmpeg-devel] [PATCH v3] " Pranav Kant via ffmpeg-devel
3 siblings, 1 reply; 16+ messages in thread
From: Michael Niedermayer @ 2025-02-28 1:14 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 2187 bytes --]
On Wed, Feb 26, 2025 at 07:44:37PM +0000, Pranav Kant via ffmpeg-devel wrote:
> By default, all globals in C/C++ compiled by clang are allocated
> in non-large data sections. See [1] for background on code models.
> For PIC (Position independent code), this is fine as long as binary is
> small but as binary size increases, users maybe want to use medium/large
> code models (-mcmodel=medium) which moves data in to large sections.
> As data in these large sections cannot be accessed using PIC code
> anymore (as it may be too far away), compiler ends up using a different
> instruction sequence when building C/C++ code -- using GOT to access
> these globals (which can be relaxed by linker at link time if binary
> ends up being smaller). However, assembly files continue to access these
> globals defined in C/C++ files using older (and invalid instruction
> sequence). So, we mark all such globals with an attribute that forces
> them to be allocated in small sections allowing them to validly be
> accessed from the assembly code.
>
> This patch should not have any affect on builds that use small code
> model, which is the default mode.
>
> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
>
> Signed-off-by: Pranav Kant <prka@google.com>
> ---
> libavcodec/ac3dsp.c | 2 ++
> libavcodec/cabac.c | 2 ++
> libavcodec/x86/constants.c | 8 ++++++++
> libavutil/attributes_internal.h | 15 +++++++++++++++
> 4 files changed, 27 insertions(+)
This produces many warnings:
CC libavcodec/svq1.o
In file included from libavcodec/svq1.h:40,
from libavcodec/svq1.c:35:
./libavutil/attributes_internal.h:43:5: warning: "ARCH_X86_64" is not defined, evaluates to 0 [-Wundef]
43 | #if ARCH_X86_64 && defined(__ELF__) && __has_attribute(model)
| ^~~~~~~~~~~
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicin you will get
a job for life at the pharma industry.
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-28 1:14 ` Michael Niedermayer
@ 2025-03-07 2:13 ` Pranav Kant via ffmpeg-devel
0 siblings, 0 replies; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-03-07 2:13 UTC (permalink / raw)
To: FFmpeg development discussions and patches; +Cc: Pranav Kant
I uploaded a new patch (v3) that addresses these concerns.
On Thu, Feb 27, 2025 at 5:14 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:
> On Wed, Feb 26, 2025 at 07:44:37PM +0000, Pranav Kant via ffmpeg-devel
> wrote:
> > By default, all globals in C/C++ compiled by clang are allocated
> > in non-large data sections. See [1] for background on code models.
> > For PIC (Position independent code), this is fine as long as binary is
> > small but as binary size increases, users maybe want to use medium/large
> > code models (-mcmodel=medium) which moves data in to large sections.
> > As data in these large sections cannot be accessed using PIC code
> > anymore (as it may be too far away), compiler ends up using a different
> > instruction sequence when building C/C++ code -- using GOT to access
> > these globals (which can be relaxed by linker at link time if binary
> > ends up being smaller). However, assembly files continue to access these
> > globals defined in C/C++ files using older (and invalid instruction
> > sequence). So, we mark all such globals with an attribute that forces
> > them to be allocated in small sections allowing them to validly be
> > accessed from the assembly code.
> >
> > This patch should not have any affect on builds that use small code
> > model, which is the default mode.
> >
> > [1]
> https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
> >
> > Signed-off-by: Pranav Kant <prka@google.com>
> > ---
> > libavcodec/ac3dsp.c | 2 ++
> > libavcodec/cabac.c | 2 ++
> > libavcodec/x86/constants.c | 8 ++++++++
> > libavutil/attributes_internal.h | 15 +++++++++++++++
> > 4 files changed, 27 insertions(+)
>
> This produces many warnings:
>
> CC libavcodec/svq1.o
> In file included from libavcodec/svq1.h:40,
> from libavcodec/svq1.c:35:
> ./libavutil/attributes_internal.h:43:5: warning: "ARCH_X86_64" is not
> defined, evaluates to 0 [-Wundef]
> 43 | #if ARCH_X86_64 && defined(__ELF__) && __has_attribute(model)
> | ^~~~~~~~~~~
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> If you fake or manipulate statistics in a paper in physics you will never
> get a job again.
> If you fake or manipulate statistics in a paper in medicin you will get
> a job for life at the pharma industry.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* [FFmpeg-devel] [PATCH v3] Mark C globals with small code model
2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
` (2 preceding siblings ...)
2025-02-28 1:14 ` Michael Niedermayer
@ 2025-03-03 20:56 ` Pranav Kant via ffmpeg-devel
2025-03-11 19:17 ` [FFmpeg-devel] [PATCH v4] " Pranav Kant via ffmpeg-devel
3 siblings, 1 reply; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-03-03 20:56 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Pranav Kant
By default, all globals in C/C++ compiled by clang are allocated
in non-large data sections. See [1] for background on code models.
For PIC (Position independent code), this is fine as long as binary is
small but as binary size increases, users maybe want to use medium/large
code models (-mcmodel=medium) which moves data in to large sections.
As data in these large sections cannot be accessed using PIC code
anymore (as it may be too far away), compiler ends up using a different
instruction sequence when building C/C++ code -- using GOT to access
these globals (which can be relaxed by linker at link time if binary
ends up being smaller). However, assembly files continue to access these
globals defined in C/C++ files using older (and invalid instruction
sequence). So, we mark all such globals with an attribute that forces
them to be allocated in small sections allowing them to validly be
accessed from the assembly code.
This patch should not have any affect on builds that use small code
model, which is the default mode.
[1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
Signed-off-by: Pranav Kant <prka@google.com>
---
libavcodec/ac3dsp.c | 2 ++
libavcodec/cabac.c | 2 ++
libavcodec/x86/constants.c | 8 ++++++++
libavutil/attributes.h | 6 ++++++
libavutil/attributes_internal.h | 16 ++++++++++++++++
5 files changed, 34 insertions(+)
diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
index 730fa70fff..d16b6c24c3 100644
--- a/libavcodec/ac3dsp.c
+++ b/libavcodec/ac3dsp.c
@@ -25,6 +25,7 @@
#include "config.h"
#include "libavutil/attributes.h"
+#include "libavutil/attributes_internal.h"
#include "libavutil/common.h"
#include "libavutil/intmath.h"
#include "libavutil/mem_internal.h"
@@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
mant_cnt[bap[len]]++;
}
+attribute_mcmodel_small
DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
};
diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
index 7d41cd2ae6..b8c6db29a2 100644
--- a/libavcodec/cabac.c
+++ b/libavcodec/cabac.c
@@ -24,11 +24,13 @@
* Context Adaptive Binary Arithmetic Coder.
*/
+#include "libavutil/attributes_internal.h"
#include "libavutil/error.h"
#include "libavutil/mem_internal.h"
#include "cabac.h"
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
index bc7f2b17b8..347b7dd1d3 100644
--- a/libavcodec/x86/constants.c
+++ b/libavcodec/x86/constants.c
@@ -18,17 +18,21 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
+#include "libavutil/attributes_internal.h"
#include "libavutil/mem_internal.h"
#include "libavutil/x86/asm.h" // for xmm_reg
#include "constants.h"
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = { 0x0001000100010001ULL, 0x0001000100010001ULL,
0x0001000100010001ULL, 0x0001000100010001ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
0x0002000200020002ULL, 0x0002000200020002ULL };
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
0x0004000400040004ULL, 0x0004000400040004ULL };
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
@@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
0x0200020002000200ULL, 0x0200020002000200ULL };
DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
@@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
0x0000000000000000ULL, 0x0000000000000000ULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
0x0101010101010101ULL, 0x0101010101010101ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
0x0202020202020202ULL, 0x0202020202020202ULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
0x0303030303030303ULL, 0x0303030303030303ULL };
DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
0x8080808080808080ULL, 0x8080808080808080ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
diff --git a/libavutil/attributes.h b/libavutil/attributes.h
index 04c615c952..dfc35fa31e 100644
--- a/libavutil/attributes.h
+++ b/libavutil/attributes.h
@@ -40,6 +40,12 @@
# define AV_HAS_BUILTIN(x) 0
#endif
+#ifdef __has_attribute
+# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
+#else
+# define AV_HAS_ATTRIBUTE(x) 0
+#endif
+
#ifndef av_always_inline
#if AV_GCC_VERSION_AT_LEAST(3,1)
# define av_always_inline __attribute__((always_inline)) inline
diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h
index 3df1ee6af3..87e490dd18 100644
--- a/libavutil/attributes_internal.h
+++ b/libavutil/attributes_internal.h
@@ -19,6 +19,7 @@
#ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
#define AVUTIL_ATTRIBUTES_INTERNAL_H
+#include "config.h"
#include "attributes.h"
#if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) && (defined(__ELF__) || defined(__MACH__))
@@ -31,4 +32,19 @@
# define FF_VISIBILITY_POP_HIDDEN
#endif
+/**
+ * Some globals defined in C files are used from hardcoded asm that assumes small
+ * code model (that is, accessing these globals without GOT). This is a problem
+ * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
+ * all globals in a data section that's unreachable with PC relative instructions
+ * (small code model instruction sequence). We mark all such globals with this
+ * attribute_mcmodel_small to ensure assembly accessible globals continue to be
+ * allocated in sections reachable from PC relative instructions.
+ */
+#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
+# define attribute_mcmodel_small __attribute__(model("small"))
+#else
+# define attribute_mcmodel_small
+#endif
+
#endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
--
2.48.1.711.g2feabab25a-goog
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* [FFmpeg-devel] [PATCH v4] Mark C globals with small code model
2025-03-03 20:56 ` [FFmpeg-devel] [PATCH v3] " Pranav Kant via ffmpeg-devel
@ 2025-03-11 19:17 ` Pranav Kant via ffmpeg-devel
2025-03-11 19:18 ` Pranav Kant via ffmpeg-devel
2025-03-11 23:45 ` Andreas Rheinhardt
0 siblings, 2 replies; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-03-11 19:17 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Pranav Kant
By default, all globals in C/C++ compiled by clang are allocated
in non-large data sections. See [1] for background on code models.
For PIC (Position independent code), this is fine as long as binary is
small but as binary size increases, users maybe want to use medium/large
code models (-mcmodel=medium) which moves data in to large sections.
As data in these large sections cannot be accessed using PIC code
anymore (as it may be too far away), compiler ends up using a different
instruction sequence when building C/C++ code -- using GOT to access
these globals (which can be relaxed by linker at link time if binary
ends up being smaller). However, assembly files continue to access these
globals defined in C/C++ files using older (and invalid instruction
sequence). So, we mark all such globals with an attribute that forces
them to be allocated in small sections allowing them to validly be
accessed from the assembly code.
This patch should not have any affect on builds that use small code
model, which is the default mode.
[1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
Signed-off-by: Pranav Kant <prka@google.com>
---
libavcodec/ac3dsp.c | 2 ++
libavcodec/cabac.c | 2 ++
libavcodec/x86/constants.c | 8 ++++++++
libavutil/attributes.h | 6 ++++++
libavutil/attributes_internal.h | 16 ++++++++++++++++
5 files changed, 34 insertions(+)
diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
index 730fa70fff..d16b6c24c3 100644
--- a/libavcodec/ac3dsp.c
+++ b/libavcodec/ac3dsp.c
@@ -25,6 +25,7 @@
#include "config.h"
#include "libavutil/attributes.h"
+#include "libavutil/attributes_internal.h"
#include "libavutil/common.h"
#include "libavutil/intmath.h"
#include "libavutil/mem_internal.h"
@@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
mant_cnt[bap[len]]++;
}
+attribute_mcmodel_small
DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
};
diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
index 7d41cd2ae6..b8c6db29a2 100644
--- a/libavcodec/cabac.c
+++ b/libavcodec/cabac.c
@@ -24,11 +24,13 @@
* Context Adaptive Binary Arithmetic Coder.
*/
+#include "libavutil/attributes_internal.h"
#include "libavutil/error.h"
#include "libavutil/mem_internal.h"
#include "cabac.h"
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
index bc7f2b17b8..347b7dd1d3 100644
--- a/libavcodec/x86/constants.c
+++ b/libavcodec/x86/constants.c
@@ -18,17 +18,21 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
+#include "libavutil/attributes_internal.h"
#include "libavutil/mem_internal.h"
#include "libavutil/x86/asm.h" // for xmm_reg
#include "constants.h"
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = { 0x0001000100010001ULL, 0x0001000100010001ULL,
0x0001000100010001ULL, 0x0001000100010001ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
0x0002000200020002ULL, 0x0002000200020002ULL };
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
0x0004000400040004ULL, 0x0004000400040004ULL };
+attribute_mcmodel_small
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
@@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
0x0200020002000200ULL, 0x0200020002000200ULL };
DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
@@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
0x0000000000000000ULL, 0x0000000000000000ULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
0x0101010101010101ULL, 0x0101010101010101ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
0x0202020202020202ULL, 0x0202020202020202ULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
0x0303030303030303ULL, 0x0303030303030303ULL };
DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
+attribute_mcmodel_small
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
0x8080808080808080ULL, 0x8080808080808080ULL };
DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
diff --git a/libavutil/attributes.h b/libavutil/attributes.h
index 04c615c952..dfc35fa31e 100644
--- a/libavutil/attributes.h
+++ b/libavutil/attributes.h
@@ -40,6 +40,12 @@
# define AV_HAS_BUILTIN(x) 0
#endif
+#ifdef __has_attribute
+# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
+#else
+# define AV_HAS_ATTRIBUTE(x) 0
+#endif
+
#ifndef av_always_inline
#if AV_GCC_VERSION_AT_LEAST(3,1)
# define av_always_inline __attribute__((always_inline)) inline
diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h
index bc85ce77ff..c557fa0af0 100644
--- a/libavutil/attributes_internal.h
+++ b/libavutil/attributes_internal.h
@@ -19,6 +19,7 @@
#ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
#define AVUTIL_ATTRIBUTES_INTERNAL_H
+#include "config.h"
#include "attributes.h"
#if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) && (defined(__ELF__) || defined(__MACH__))
@@ -33,4 +34,19 @@
#define EXTERN extern attribute_visibility_hidden
+/**
+ * Some globals defined in C files are used from hardcoded asm that assumes small
+ * code model (that is, accessing these globals without GOT). This is a problem
+ * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
+ * all globals in a data section that's unreachable with PC relative instructions
+ * (small code model instruction sequence). We mark all such globals with this
+ * attribute_mcmodel_small to ensure assembly accessible globals continue to be
+ * allocated in sections reachable from PC relative instructions.
+ */
+#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
+# define attribute_mcmodel_small __attribute__((model("small")))
+#else
+# define attribute_mcmodel_small
+#endif
+
#endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
--
2.49.0.rc0.332.g42c0ae87b1-goog
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH v4] Mark C globals with small code model
2025-03-11 19:17 ` [FFmpeg-devel] [PATCH v4] " Pranav Kant via ffmpeg-devel
@ 2025-03-11 19:18 ` Pranav Kant via ffmpeg-devel
2025-03-11 23:45 ` Andreas Rheinhardt
1 sibling, 0 replies; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-03-11 19:18 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Pranav Kant
Patch version v4.
- Rebased
- Missing "(" in __attribute__((model("small)) in the earlier patch version.
On Tue, Mar 11, 2025 at 12:17 PM Pranav Kant <prka@google.com> wrote:
> By default, all globals in C/C++ compiled by clang are allocated
> in non-large data sections. See [1] for background on code models.
> For PIC (Position independent code), this is fine as long as binary is
> small but as binary size increases, users maybe want to use medium/large
> code models (-mcmodel=medium) which moves data in to large sections.
> As data in these large sections cannot be accessed using PIC code
> anymore (as it may be too far away), compiler ends up using a different
> instruction sequence when building C/C++ code -- using GOT to access
> these globals (which can be relaxed by linker at link time if binary
> ends up being smaller). However, assembly files continue to access these
> globals defined in C/C++ files using older (and invalid instruction
> sequence). So, we mark all such globals with an attribute that forces
> them to be allocated in small sections allowing them to validly be
> accessed from the assembly code.
>
> This patch should not have any affect on builds that use small code
> model, which is the default mode.
>
> [1]
> https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
>
> Signed-off-by: Pranav Kant <prka@google.com>
> ---
> libavcodec/ac3dsp.c | 2 ++
> libavcodec/cabac.c | 2 ++
> libavcodec/x86/constants.c | 8 ++++++++
> libavutil/attributes.h | 6 ++++++
> libavutil/attributes_internal.h | 16 ++++++++++++++++
> 5 files changed, 34 insertions(+)
>
> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> index 730fa70fff..d16b6c24c3 100644
> --- a/libavcodec/ac3dsp.c
> +++ b/libavcodec/ac3dsp.c
> @@ -25,6 +25,7 @@
>
> #include "config.h"
> #include "libavutil/attributes.h"
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/common.h"
> #include "libavutil/intmath.h"
> #include "libavutil/mem_internal.h"
> @@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t
> mant_cnt[16], uint8_t *bap,
> mant_cnt[bap[len]]++;
> }
>
> +attribute_mcmodel_small
> DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
> };
> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
> index 7d41cd2ae6..b8c6db29a2 100644
> --- a/libavcodec/cabac.c
> +++ b/libavcodec/cabac.c
> @@ -24,11 +24,13 @@
> * Context Adaptive Binary Arithmetic Coder.
> */
>
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/error.h"
> #include "libavutil/mem_internal.h"
>
> #include "cabac.h"
>
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64
> + 4*64 + 63] = {
> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
> index bc7f2b17b8..347b7dd1d3 100644
> --- a/libavcodec/x86/constants.c
> +++ b/libavcodec/x86/constants.c
> @@ -18,17 +18,21 @@
> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> */
>
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/mem_internal.h"
> #include "libavutil/x86/asm.h" // for xmm_reg
> #include "constants.h"
>
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = {
> 0x0001000100010001ULL, 0x0001000100010001ULL,
>
> 0x0001000100010001ULL, 0x0001000100010001ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = {
> 0x0002000200020002ULL, 0x0002000200020002ULL,
>
> 0x0002000200020002ULL, 0x0002000200020002ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = {
> 0x0003000300030003ULL, 0x0003000300030003ULL };
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = {
> 0x0004000400040004ULL, 0x0004000400040004ULL,
>
> 0x0004000400040004ULL, 0x0004000400040004ULL };
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = {
> 0x0005000500050005ULL, 0x0005000500050005ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = {
> 0x0008000800080008ULL, 0x0008000800080008ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = {
> 0x0009000900090009ULL, 0x0009000900090009ULL };
> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = {
> 0x0100010001000100ULL, 0x010
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = {
> 0x0200020002000200ULL, 0x0200020002000200ULL,
>
> 0x0200020002000200ULL, 0x0200020002000200ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = {
> 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = {
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
>
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = {
> 0x0400040004000400ULL, 0x0400040004000400ULL,
> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = {
> 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = {
> 0x0000000000000000ULL, 0x0000000000000000ULL,
>
> 0x0000000000000000ULL, 0x0000000000000000ULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = {
> 0x0101010101010101ULL, 0x0101010101010101ULL,
>
> 0x0101010101010101ULL, 0x0101010101010101ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = {
> 0x0202020202020202ULL, 0x0202020202020202ULL,
>
> 0x0202020202020202ULL, 0x0202020202020202ULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = {
> 0x0303030303030303ULL, 0x0303030303030303ULL,
>
> 0x0303030303030303ULL, 0x0303030303030303ULL };
> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = {
> 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = {
> 0x8080808080808080ULL, 0x8080808080808080ULL,
>
> 0x8080808080808080ULL, 0x8080808080808080ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = {
> 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
> diff --git a/libavutil/attributes.h b/libavutil/attributes.h
> index 04c615c952..dfc35fa31e 100644
> --- a/libavutil/attributes.h
> +++ b/libavutil/attributes.h
> @@ -40,6 +40,12 @@
> # define AV_HAS_BUILTIN(x) 0
> #endif
>
> +#ifdef __has_attribute
> +# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
> +#else
> +# define AV_HAS_ATTRIBUTE(x) 0
> +#endif
> +
> #ifndef av_always_inline
> #if AV_GCC_VERSION_AT_LEAST(3,1)
> # define av_always_inline __attribute__((always_inline)) inline
> diff --git a/libavutil/attributes_internal.h
> b/libavutil/attributes_internal.h
> index bc85ce77ff..c557fa0af0 100644
> --- a/libavutil/attributes_internal.h
> +++ b/libavutil/attributes_internal.h
> @@ -19,6 +19,7 @@
> #ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
> #define AVUTIL_ATTRIBUTES_INTERNAL_H
>
> +#include "config.h"
> #include "attributes.h"
>
> #if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) &&
> (defined(__ELF__) || defined(__MACH__))
> @@ -33,4 +34,19 @@
>
> #define EXTERN extern attribute_visibility_hidden
>
> +/**
> + * Some globals defined in C files are used from hardcoded asm that
> assumes small
> + * code model (that is, accessing these globals without GOT). This is a
> problem
> + * when FFMpeg is built with medium code model (-mcmodel=medium) which
> allocates
> + * all globals in a data section that's unreachable with PC relative
> instructions
> + * (small code model instruction sequence). We mark all such globals with
> this
> + * attribute_mcmodel_small to ensure assembly accessible globals continue
> to be
> + * allocated in sections reachable from PC relative instructions.
> + */
> +#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
> +# define attribute_mcmodel_small __attribute__((model("small")))
> +#else
> +# define attribute_mcmodel_small
> +#endif
> +
> #endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
> --
> 2.49.0.rc0.332.g42c0ae87b1-goog
>
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH v4] Mark C globals with small code model
2025-03-11 19:17 ` [FFmpeg-devel] [PATCH v4] " Pranav Kant via ffmpeg-devel
2025-03-11 19:18 ` Pranav Kant via ffmpeg-devel
@ 2025-03-11 23:45 ` Andreas Rheinhardt
2025-03-12 7:05 ` Martin Storsjö
1 sibling, 1 reply; 16+ messages in thread
From: Andreas Rheinhardt @ 2025-03-11 23:45 UTC (permalink / raw)
To: ffmpeg-devel
Pranav Kant via ffmpeg-devel:
> By default, all globals in C/C++ compiled by clang are allocated
> in non-large data sections. See [1] for background on code models.
> For PIC (Position independent code), this is fine as long as binary is
> small but as binary size increases, users maybe want to use medium/large
> code models (-mcmodel=medium) which moves data in to large sections.
> As data in these large sections cannot be accessed using PIC code
> anymore (as it may be too far away), compiler ends up using a different
> instruction sequence when building C/C++ code -- using GOT to access
> these globals (which can be relaxed by linker at link time if binary
> ends up being smaller). However, assembly files continue to access these
> globals defined in C/C++ files using older (and invalid instruction
> sequence). So, we mark all such globals with an attribute that forces
> them to be allocated in small sections allowing them to validly be
> accessed from the assembly code.
>
> This patch should not have any affect on builds that use small code
> model, which is the default mode.
>
> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
>
> Signed-off-by: Pranav Kant <prka@google.com>
> ---
> libavcodec/ac3dsp.c | 2 ++
> libavcodec/cabac.c | 2 ++
> libavcodec/x86/constants.c | 8 ++++++++
> libavutil/attributes.h | 6 ++++++
> libavutil/attributes_internal.h | 16 ++++++++++++++++
> 5 files changed, 34 insertions(+)
>
> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> index 730fa70fff..d16b6c24c3 100644
> --- a/libavcodec/ac3dsp.c
> +++ b/libavcodec/ac3dsp.c
> @@ -25,6 +25,7 @@
>
> #include "config.h"
> #include "libavutil/attributes.h"
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/common.h"
> #include "libavutil/intmath.h"
> #include "libavutil/mem_internal.h"
> @@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
> mant_cnt[bap[len]]++;
> }
>
> +attribute_mcmodel_small
> DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
Shouldn't stuff like this be applied to the declaration so that C code
can also take advantage of the knowledge that this object will be placed
in the small code section?
> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
> };
> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
> index 7d41cd2ae6..b8c6db29a2 100644
> --- a/libavcodec/cabac.c
> +++ b/libavcodec/cabac.c
> @@ -24,11 +24,13 @@
> * Context Adaptive Binary Arithmetic Coder.
> */
>
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/error.h"
> #include "libavutil/mem_internal.h"
>
> #include "cabac.h"
>
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
Your commit message ("However, assembly files continue to access")
speaks only of assembly files, i.e. external asm. Yet this here is only
used by inline ASM. Looking through the code the reason for this is that
I thought that specifying the memory model is only necessary for stuff
used by external asm, yet ff_h264_cabac_tables does not seem to be used
by external ASM at all, only inline ASM. If I see this correctly, the
reason for this is that LOCAL_MANGLE (and therefore MANGLE) uses rip
addressing on x64 when configure sets the RIP define. But this means
that the set of files needing attribute_mcmodel_small is a superset of
the files currently using DECLARE_ASM_ALIGNED. This means that one would
only need two macros for the variables accessed by ASM: One for only
external ASM and one for inline ASM (and potentially external ASM)
instead of adding attribute_mcmodel_small at various places in the codebase.
> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
> index bc7f2b17b8..347b7dd1d3 100644
> --- a/libavcodec/x86/constants.c
> +++ b/libavcodec/x86/constants.c
> @@ -18,17 +18,21 @@
> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> */
>
> +#include "libavutil/attributes_internal.h"
> #include "libavutil/mem_internal.h"
> #include "libavutil/x86/asm.h" // for xmm_reg
> #include "constants.h"
>
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = {
0x0001000100010001ULL, 0x0001000100010001ULL,>
0x0001000100010001ULL, 0x0001000100010001ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
> 0x0002000200020002ULL, 0x0002000200020002ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
> 0x0004000400040004ULL, 0x0004000400040004ULL };
> +attribute_mcmodel_small
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
> 0x0200020002000200ULL, 0x0200020002000200ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
> 0x0000000000000000ULL, 0x0000000000000000ULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
> 0x0101010101010101ULL, 0x0101010101010101ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
> 0x0202020202020202ULL, 0x0202020202020202ULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
> 0x0303030303030303ULL, 0x0303030303030303ULL };
> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
> +attribute_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
> 0x8080808080808080ULL, 0x8080808080808080ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
Why do you add this attribute to only eight of these constants? How did
you come up with this list? In addition to the above, git grep cextern
shows that there are several more variables than just these one (e.g.
ff_h263_loop_filter_strength, ff_sbr_noise_table).
> diff --git a/libavutil/attributes.h b/libavutil/attributes.h
> index 04c615c952..dfc35fa31e 100644
> --- a/libavutil/attributes.h
> +++ b/libavutil/attributes.h
> @@ -40,6 +40,12 @@
> # define AV_HAS_BUILTIN(x) 0
> #endif
>
> +#ifdef __has_attribute
> +# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
> +#else
> +# define AV_HAS_ATTRIBUTE(x) 0
> +#endif
> +
> #ifndef av_always_inline
> #if AV_GCC_VERSION_AT_LEAST(3,1)
> # define av_always_inline __attribute__((always_inline)) inline
> diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h
> index bc85ce77ff..c557fa0af0 100644
> --- a/libavutil/attributes_internal.h
> +++ b/libavutil/attributes_internal.h
> @@ -19,6 +19,7 @@
> #ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
> #define AVUTIL_ATTRIBUTES_INTERNAL_H
>
> +#include "config.h"
> #include "attributes.h"
>
> #if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) && (defined(__ELF__) || defined(__MACH__))
> @@ -33,4 +34,19 @@
>
> #define EXTERN extern attribute_visibility_hidden
>
> +/**
> + * Some globals defined in C files are used from hardcoded asm that assumes small
> + * code model (that is, accessing these globals without GOT). This is a problem
> + * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
s/FFMpeg/FFmpeg/
> + * all globals in a data section that's unreachable with PC relative instructions
> + * (small code model instruction sequence). We mark all such globals with this
> + * attribute_mcmodel_small to ensure assembly accessible globals continue to be
> + * allocated in sections reachable from PC relative instructions.
> + */
> +#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
You make this ARCH_X86_64 only, yet the concept of code models also
exists for other arches, e.g. AArch64. I presume that assembly for other
arches presumes that we are not using the large code model for accesses,
so that the same issue can happen with them. Then we have two options:
a) Continue to use the same macro you are adding here. This will mean
that stuff only used in assembly by different arches will nevertheless
be declared as using the small code model, potentially clogging the
small symbol address space unnecessarily.
b) Use one macro per arch.
I presume we want to go with a), because the objects we are talking
about are so small that they should be in the small data section anyway.
> +# define attribute_mcmodel_small __attribute__((model("small")))
> +#else
> +# define attribute_mcmodel_small
> +#endif
> +
> #endif /* AVUTIL_ATTRIBUTES_INTERNAL_H */
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH v4] Mark C globals with small code model
2025-03-11 23:45 ` Andreas Rheinhardt
@ 2025-03-12 7:05 ` Martin Storsjö
2025-03-12 7:17 ` Andreas Rheinhardt
0 siblings, 1 reply; 16+ messages in thread
From: Martin Storsjö @ 2025-03-12 7:05 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Wed, 12 Mar 2025, Andreas Rheinhardt wrote:
> Pranav Kant via ffmpeg-devel:
>> By default, all globals in C/C++ compiled by clang are allocated
>> in non-large data sections. See [1] for background on code models.
>> For PIC (Position independent code), this is fine as long as binary is
>> small but as binary size increases, users maybe want to use medium/large
>> code models (-mcmodel=medium) which moves data in to large sections.
>> As data in these large sections cannot be accessed using PIC code
>> anymore (as it may be too far away), compiler ends up using a different
>> instruction sequence when building C/C++ code -- using GOT to access
>> these globals (which can be relaxed by linker at link time if binary
>> ends up being smaller). However, assembly files continue to access these
>> globals defined in C/C++ files using older (and invalid instruction
>> sequence). So, we mark all such globals with an attribute that forces
>> them to be allocated in small sections allowing them to validly be
>> accessed from the assembly code.
>>
>> This patch should not have any affect on builds that use small code
>> model, which is the default mode.
>>
>> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
>>
>> Signed-off-by: Pranav Kant <prka@google.com>
>> ---
>> libavcodec/ac3dsp.c | 2 ++
>> libavcodec/cabac.c | 2 ++
>> libavcodec/x86/constants.c | 8 ++++++++
>> libavutil/attributes.h | 6 ++++++
>> libavutil/attributes_internal.h | 16 ++++++++++++++++
>> 5 files changed, 34 insertions(+)
>>
>> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
>> index 730fa70fff..d16b6c24c3 100644
>> --- a/libavcodec/ac3dsp.c
>> +++ b/libavcodec/ac3dsp.c
>> @@ -25,6 +25,7 @@
>>
>> #include "config.h"
>> #include "libavutil/attributes.h"
>> +#include "libavutil/attributes_internal.h"
>> #include "libavutil/common.h"
>> #include "libavutil/intmath.h"
>> #include "libavutil/mem_internal.h"
>> @@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
>> mant_cnt[bap[len]]++;
>> }
>>
>> +attribute_mcmodel_small
>> DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
>
> Shouldn't stuff like this be applied to the declaration so that C code
> can also take advantage of the knowledge that this object will be placed
> in the small code section?
>
>> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
>> };
>> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
>> index 7d41cd2ae6..b8c6db29a2 100644
>> --- a/libavcodec/cabac.c
>> +++ b/libavcodec/cabac.c
>> @@ -24,11 +24,13 @@
>> * Context Adaptive Binary Arithmetic Coder.
>> */
>>
>> +#include "libavutil/attributes_internal.h"
>> #include "libavutil/error.h"
>> #include "libavutil/mem_internal.h"
>>
>> #include "cabac.h"
>>
>> +attribute_mcmodel_small
>> DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
>
> Your commit message ("However, assembly files continue to access")
> speaks only of assembly files, i.e. external asm. Yet this here is only
> used by inline ASM. Looking through the code the reason for this is that
> I thought that specifying the memory model is only necessary for stuff
> used by external asm, yet ff_h264_cabac_tables does not seem to be used
> by external ASM at all, only inline ASM. If I see this correctly, the
> reason for this is that LOCAL_MANGLE (and therefore MANGLE) uses rip
> addressing on x64 when configure sets the RIP define. But this means
> that the set of files needing attribute_mcmodel_small is a superset of
> the files currently using DECLARE_ASM_ALIGNED. This means that one would
> only need two macros for the variables accessed by ASM: One for only
> external ASM and one for inline ASM (and potentially external ASM)
> instead of adding attribute_mcmodel_small at various places in the codebase.
>
>> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
>> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
>> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
>> index bc7f2b17b8..347b7dd1d3 100644
>> --- a/libavcodec/x86/constants.c
>> +++ b/libavcodec/x86/constants.c
>> @@ -18,17 +18,21 @@
>> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> */
>>
>> +#include "libavutil/attributes_internal.h"
>> #include "libavutil/mem_internal.h"
>> #include "libavutil/x86/asm.h" // for xmm_reg
>> #include "constants.h"
>>
>> +attribute_mcmodel_small
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = {
> 0x0001000100010001ULL, 0x0001000100010001ULL,>
> 0x0001000100010001ULL, 0x0001000100010001ULL };
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
>> 0x0002000200020002ULL, 0x0002000200020002ULL };
>> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
>> +attribute_mcmodel_small
>> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
>> 0x0004000400040004ULL, 0x0004000400040004ULL };
>> +attribute_mcmodel_small
>> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
>> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
>> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
>> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
>> 0x0200020002000200ULL, 0x0200020002000200ULL };
>> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
>> +attribute_mcmodel_small
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
>> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
>> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>>
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
>> 0x0000000000000000ULL, 0x0000000000000000ULL };
>> +attribute_mcmodel_small
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
>> 0x0101010101010101ULL, 0x0101010101010101ULL };
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
>> 0x0202020202020202ULL, 0x0202020202020202ULL };
>> +attribute_mcmodel_small
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
>> 0x0303030303030303ULL, 0x0303030303030303ULL };
>> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
>> +attribute_mcmodel_small
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
>> 0x8080808080808080ULL, 0x8080808080808080ULL };
>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
>
> Why do you add this attribute to only eight of these constants? How did
> you come up with this list? In addition to the above, git grep cextern
> shows that there are several more variables than just these one (e.g.
> ff_h263_loop_filter_strength, ff_sbr_noise_table).
>
>> diff --git a/libavutil/attributes.h b/libavutil/attributes.h
>> index 04c615c952..dfc35fa31e 100644
>> --- a/libavutil/attributes.h
>> +++ b/libavutil/attributes.h
>> @@ -40,6 +40,12 @@
>> # define AV_HAS_BUILTIN(x) 0
>> #endif
>>
>> +#ifdef __has_attribute
>> +# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
>> +#else
>> +# define AV_HAS_ATTRIBUTE(x) 0
>> +#endif
>> +
>> #ifndef av_always_inline
>> #if AV_GCC_VERSION_AT_LEAST(3,1)
>> # define av_always_inline __attribute__((always_inline)) inline
>> diff --git a/libavutil/attributes_internal.h b/libavutil/attributes_internal.h
>> index bc85ce77ff..c557fa0af0 100644
>> --- a/libavutil/attributes_internal.h
>> +++ b/libavutil/attributes_internal.h
>> @@ -19,6 +19,7 @@
>> #ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
>> #define AVUTIL_ATTRIBUTES_INTERNAL_H
>>
>> +#include "config.h"
>> #include "attributes.h"
>>
>> #if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) && (defined(__ELF__) || defined(__MACH__))
>> @@ -33,4 +34,19 @@
>>
>> #define EXTERN extern attribute_visibility_hidden
>>
>> +/**
>> + * Some globals defined in C files are used from hardcoded asm that assumes small
>> + * code model (that is, accessing these globals without GOT). This is a problem
>> + * when FFMpeg is built with medium code model (-mcmodel=medium) which allocates
>
> s/FFMpeg/FFmpeg/
>
>> + * all globals in a data section that's unreachable with PC relative instructions
>> + * (small code model instruction sequence). We mark all such globals with this
>> + * attribute_mcmodel_small to ensure assembly accessible globals continue to be
>> + * allocated in sections reachable from PC relative instructions.
>> + */
>> +#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
>
> You make this ARCH_X86_64 only, yet the concept of code models also
> exists for other arches, e.g. AArch64. I presume that assembly for other
> arches presumes that we are not using the large code model for accesses,
> so that the same issue can happen with them. Then we have two options:
FWIW, in e4ac156b7c47725327dff78bb83f5eecbaee3add we added
attribute_visibility_hidden to a bunch of symbols that are accessed from
aarch64 assembly, in a similar fashion. While the effects are different, I
wonder if we should abstract these two down to a more generic attribute
for any symbol accessed directly from any assembly.
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH v4] Mark C globals with small code model
2025-03-12 7:05 ` Martin Storsjö
@ 2025-03-12 7:17 ` Andreas Rheinhardt
2025-03-12 7:47 ` Martin Storsjö
0 siblings, 1 reply; 16+ messages in thread
From: Andreas Rheinhardt @ 2025-03-12 7:17 UTC (permalink / raw)
To: ffmpeg-devel
Martin Storsjö:
> On Wed, 12 Mar 2025, Andreas Rheinhardt wrote:
>
>> Pranav Kant via ffmpeg-devel:
>>> By default, all globals in C/C++ compiled by clang are allocated
>>> in non-large data sections. See [1] for background on code models.
>>> For PIC (Position independent code), this is fine as long as binary is
>>> small but as binary size increases, users maybe want to use medium/large
>>> code models (-mcmodel=medium) which moves data in to large sections.
>>> As data in these large sections cannot be accessed using PIC code
>>> anymore (as it may be too far away), compiler ends up using a different
>>> instruction sequence when building C/C++ code -- using GOT to access
>>> these globals (which can be relaxed by linker at link time if binary
>>> ends up being smaller). However, assembly files continue to access these
>>> globals defined in C/C++ files using older (and invalid instruction
>>> sequence). So, we mark all such globals with an attribute that forces
>>> them to be allocated in small sections allowing them to validly be
>>> accessed from the assembly code.
>>>
>>> This patch should not have any affect on builds that use small code
>>> model, which is the default mode.
>>>
>>> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-
>>> code-models
>>>
>>> Signed-off-by: Pranav Kant <prka@google.com>
>>> ---
>>> libavcodec/ac3dsp.c | 2 ++
>>> libavcodec/cabac.c | 2 ++
>>> libavcodec/x86/constants.c | 8 ++++++++
>>> libavutil/attributes.h | 6 ++++++
>>> libavutil/attributes_internal.h | 16 ++++++++++++++++
>>> 5 files changed, 34 insertions(+)
>>>
>>> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
>>> index 730fa70fff..d16b6c24c3 100644
>>> --- a/libavcodec/ac3dsp.c
>>> +++ b/libavcodec/ac3dsp.c
>>> @@ -25,6 +25,7 @@
>>>
>>> #include "config.h"
>>> #include "libavutil/attributes.h"
>>> +#include "libavutil/attributes_internal.h"
>>> #include "libavutil/common.h"
>>> #include "libavutil/intmath.h"
>>> #include "libavutil/mem_internal.h"
>>> @@ -104,6 +105,7 @@ static void ac3_update_bap_counts_c(uint16_t
>>> mant_cnt[16], uint8_t *bap,
>>> mant_cnt[bap[len]]++;
>>> }
>>>
>>> +attribute_mcmodel_small
>>> DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
>>
>> Shouldn't stuff like this be applied to the declaration so that C code
>> can also take advantage of the knowledge that this object will be placed
>> in the small code section?
>>
>>> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
>>> };
>>> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
>>> index 7d41cd2ae6..b8c6db29a2 100644
>>> --- a/libavcodec/cabac.c
>>> +++ b/libavcodec/cabac.c
>>> @@ -24,11 +24,13 @@
>>> * Context Adaptive Binary Arithmetic Coder.
>>> */
>>>
>>> +#include "libavutil/attributes_internal.h"
>>> #include "libavutil/error.h"
>>> #include "libavutil/mem_internal.h"
>>>
>>> #include "cabac.h"
>>>
>>> +attribute_mcmodel_small
>>> DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 +
>>> 4*2*64 + 4*64 + 63] = {
>>
>> Your commit message ("However, assembly files continue to access")
>> speaks only of assembly files, i.e. external asm. Yet this here is only
>> used by inline ASM. Looking through the code the reason for this is that
>> I thought that specifying the memory model is only necessary for stuff
>> used by external asm, yet ff_h264_cabac_tables does not seem to be used
>> by external ASM at all, only inline ASM. If I see this correctly, the
>> reason for this is that LOCAL_MANGLE (and therefore MANGLE) uses rip
>> addressing on x64 when configure sets the RIP define. But this means
>> that the set of files needing attribute_mcmodel_small is a superset of
>> the files currently using DECLARE_ASM_ALIGNED. This means that one would
>> only need two macros for the variables accessed by ASM: One for only
>> external ASM and one for inline ASM (and potentially external ASM)
>> instead of adding attribute_mcmodel_small at various places in the
>> codebase.
>>
>>> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
>>> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
>>> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
>>> index bc7f2b17b8..347b7dd1d3 100644
>>> --- a/libavcodec/x86/constants.c
>>> +++ b/libavcodec/x86/constants.c
>>> @@ -18,17 +18,21 @@
>>> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>>> 02110-1301 USA
>>> */
>>>
>>> +#include "libavutil/attributes_internal.h"
>>> #include "libavutil/mem_internal.h"
>>> #include "libavutil/x86/asm.h" // for xmm_reg
>>> #include "constants.h"
>>>
>>> +attribute_mcmodel_small
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = {
>> 0x0001000100010001ULL, 0x0001000100010001ULL,>
>> 0x0001000100010001ULL, 0x0001000100010001ULL };
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) =
>>> { 0x0002000200020002ULL, 0x0002000200020002ULL,
>>>
>>> 0x0002000200020002ULL, 0x0002000200020002ULL };
>>> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) =
>>> { 0x0003000300030003ULL, 0x0003000300030003ULL };
>>> +attribute_mcmodel_small
>>> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) =
>>> { 0x0004000400040004ULL, 0x0004000400040004ULL,
>>>
>>> 0x0004000400040004ULL, 0x0004000400040004ULL };
>>> +attribute_mcmodel_small
>>> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) =
>>> { 0x0005000500050005ULL, 0x0005000500050005ULL };
>>> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) =
>>> { 0x0008000800080008ULL, 0x0008000800080008ULL };
>>> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) =
>>> { 0x0009000900090009ULL, 0x0009000900090009ULL };
>>> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) =
>>> { 0x0100010001000100ULL, 0x010
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) =
>>> { 0x0200020002000200ULL, 0x0200020002000200ULL,
>>>
>>> 0x0200020002000200ULL, 0x0200020002000200ULL };
>>> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) =
>>> { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
>>> +attribute_mcmodel_small
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) =
>>> { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
>>>
>>> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) =
>>> { 0x0400040004000400ULL, 0x0400040004000400ULL,
>>> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) =
>>> { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>>>
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) =
>>> { 0x0000000000000000ULL, 0x0000000000000000ULL,
>>>
>>> 0x0000000000000000ULL, 0x0000000000000000ULL };
>>> +attribute_mcmodel_small
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) =
>>> { 0x0101010101010101ULL, 0x0101010101010101ULL,
>>>
>>> 0x0101010101010101ULL, 0x0101010101010101ULL };
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) =
>>> { 0x0202020202020202ULL, 0x0202020202020202ULL,
>>>
>>> 0x0202020202020202ULL, 0x0202020202020202ULL };
>>> +attribute_mcmodel_small
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) =
>>> { 0x0303030303030303ULL, 0x0303030303030303ULL,
>>>
>>> 0x0303030303030303ULL, 0x0303030303030303ULL };
>>> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) =
>>> { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
>>> +attribute_mcmodel_small
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) =
>>> { 0x8080808080808080ULL, 0x8080808080808080ULL,
>>>
>>> 0x8080808080808080ULL, 0x8080808080808080ULL };
>>> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) =
>>> { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
>>
>> Why do you add this attribute to only eight of these constants? How did
>> you come up with this list? In addition to the above, git grep cextern
>> shows that there are several more variables than just these one (e.g.
>> ff_h263_loop_filter_strength, ff_sbr_noise_table).
>>
>>> diff --git a/libavutil/attributes.h b/libavutil/attributes.h
>>> index 04c615c952..dfc35fa31e 100644
>>> --- a/libavutil/attributes.h
>>> +++ b/libavutil/attributes.h
>>> @@ -40,6 +40,12 @@
>>> # define AV_HAS_BUILTIN(x) 0
>>> #endif
>>>
>>> +#ifdef __has_attribute
>>> +# define AV_HAS_ATTRIBUTE(x) __has_attribute(x)
>>> +#else
>>> +# define AV_HAS_ATTRIBUTE(x) 0
>>> +#endif
>>> +
>>> #ifndef av_always_inline
>>> #if AV_GCC_VERSION_AT_LEAST(3,1)
>>> # define av_always_inline __attribute__((always_inline)) inline
>>> diff --git a/libavutil/attributes_internal.h b/libavutil/
>>> attributes_internal.h
>>> index bc85ce77ff..c557fa0af0 100644
>>> --- a/libavutil/attributes_internal.h
>>> +++ b/libavutil/attributes_internal.h
>>> @@ -19,6 +19,7 @@
>>> #ifndef AVUTIL_ATTRIBUTES_INTERNAL_H
>>> #define AVUTIL_ATTRIBUTES_INTERNAL_H
>>>
>>> +#include "config.h"
>>> #include "attributes.h"
>>>
>>> #if (AV_GCC_VERSION_AT_LEAST(4,0) || defined(__clang__)) &&
>>> (defined(__ELF__) || defined(__MACH__))
>>> @@ -33,4 +34,19 @@
>>>
>>> #define EXTERN extern attribute_visibility_hidden
>>>
>>> +/**
>>> + * Some globals defined in C files are used from hardcoded asm that
>>> assumes small
>>> + * code model (that is, accessing these globals without GOT). This
>>> is a problem
>>> + * when FFMpeg is built with medium code model (-mcmodel=medium)
>>> which allocates
>>
>> s/FFMpeg/FFmpeg/
>>
>>> + * all globals in a data section that's unreachable with PC relative
>>> instructions
>>> + * (small code model instruction sequence). We mark all such globals
>>> with this
>>> + * attribute_mcmodel_small to ensure assembly accessible globals
>>> continue to be
>>> + * allocated in sections reachable from PC relative instructions.
>>> + */
>>> +#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
>>
>> You make this ARCH_X86_64 only, yet the concept of code models also
>> exists for other arches, e.g. AArch64. I presume that assembly for other
>> arches presumes that we are not using the large code model for accesses,
>> so that the same issue can happen with them. Then we have two options:
>
> FWIW, in e4ac156b7c47725327dff78bb83f5eecbaee3add we added
> attribute_visibility_hidden to a bunch of symbols that are accessed from
> aarch64 assembly, in a similar fashion. While the effects are different,
> I wonder if we should abstract these two down to a more generic
> attribute for any symbol accessed directly from any assembly.
>
As I said above: We would only need two macros: One for any object
accessed from inline assembly and one for objects accessed by external
assembly (and not inline assembly). The latter would be hidden
visibility+small code model, the former would also have av_used. Of
course, both would allow to specify alignment.
But naming is hard. How about DECLARE_ASM_VAR and DECLARE_INLINE_ASM_VAR?
- Andreas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH v4] Mark C globals with small code model
2025-03-12 7:17 ` Andreas Rheinhardt
@ 2025-03-12 7:47 ` Martin Storsjö
0 siblings, 0 replies; 16+ messages in thread
From: Martin Storsjö @ 2025-03-12 7:47 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Wed, 12 Mar 2025, Andreas Rheinhardt wrote:
> Martin Storsjö:
>> On Wed, 12 Mar 2025, Andreas Rheinhardt wrote:
>>
>>> Pranav Kant via ffmpeg-devel:
>>>> + * all globals in a data section that's unreachable with PC relative
>>>> instructions
>>>> + * (small code model instruction sequence). We mark all such globals
>>>> with this
>>>> + * attribute_mcmodel_small to ensure assembly accessible globals
>>>> continue to be
>>>> + * allocated in sections reachable from PC relative instructions.
>>>> + */
>>>> +#if ARCH_X86_64 && defined(__ELF__) && AV_HAS_ATTRIBUTE(model)
>>>
>>> You make this ARCH_X86_64 only, yet the concept of code models also
>>> exists for other arches, e.g. AArch64. I presume that assembly for other
>>> arches presumes that we are not using the large code model for accesses,
>>> so that the same issue can happen with them. Then we have two options:
>>
>> FWIW, in e4ac156b7c47725327dff78bb83f5eecbaee3add we added
>> attribute_visibility_hidden to a bunch of symbols that are accessed from
>> aarch64 assembly, in a similar fashion. While the effects are different,
>> I wonder if we should abstract these two down to a more generic
>> attribute for any symbol accessed directly from any assembly.
>>
>
> As I said above: We would only need two macros: One for any object
> accessed from inline assembly and one for objects accessed by external
> assembly (and not inline assembly). The latter would be hidden
> visibility+small code model, the former would also have av_used. Of
> course, both would allow to specify alignment.
>
> But naming is hard. How about DECLARE_ASM_VAR and DECLARE_INLINE_ASM_VAR?
That sounds ok to me.
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-25 21:37 [FFmpeg-devel] [PATCH] Mark C globals with small code model Pranav Kant via ffmpeg-devel
2025-02-25 23:03 ` Andreas Rheinhardt
2025-02-26 19:44 ` Pranav Kant via ffmpeg-devel
@ 2025-02-28 2:31 ` Lynne
2025-03-07 2:14 ` Pranav Kant via ffmpeg-devel
2 siblings, 1 reply; 16+ messages in thread
From: Lynne @ 2025-02-28 2:31 UTC (permalink / raw)
To: ffmpeg-devel
[-- Attachment #1.1.1.1: Type: text/plain, Size: 7010 bytes --]
On 25/02/2025 22:37, Pranav Kant via ffmpeg-devel wrote:
> By default, all globals in C/C++ compiled by clang are allocated
> in non-large data sections. See [1] for background on code models.
> For PIC (Position independent code), this is fine as long as binary is
> small but as binary size increases, users maybe want to use medium/large
> code models (-mcmodel=medium) which moves data in to large sections.
> As data in these large sections cannot be accessed using PIC code
> anymore (as it may be too far away), compiler ends up using a different
> instruction sequence when building C/C++ code -- using GOT to access
> these globals (which can be relaxed by linker at link time if binary
> ends up being smaller). However, assembly files continue to access these
> globals defined in C/C++ files using older (and invalid instruction
> sequence). So, we mark all such globals with an attribute that forces
> them to be allocated in small sections allowing them to validly be
> accessed from the assembly code.
>
> This patch should not have any affect on builds that use small code
> model, which is the default mode.
>
> [1] https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
> ---
> libavcodec/ac3dsp.c | 2 +-
> libavcodec/cabac.c | 3 ++-
> libavcodec/x86/constants.c | 8 ++++++++
> libavutil/attributes.h | 6 ++++++
> 4 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> index 730fa70fff..43b4fcbda9 100644
> --- a/libavcodec/ac3dsp.c
> +++ b/libavcodec/ac3dsp.c
> @@ -104,7 +104,7 @@ static void ac3_update_bap_counts_c(uint16_t mant_cnt[16], uint8_t *bap,
> mant_cnt[bap[len]]++;
> }
>
> -DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> +av_mcmodel_small DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
> };
>
> diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
> index 7d41cd2ae6..dfc3ba135a 100644
> --- a/libavcodec/cabac.c
> +++ b/libavcodec/cabac.c
> @@ -24,12 +24,13 @@
> * Context Adaptive Binary Arithmetic Coder.
> */
>
> +#include "libavutil/attributes.h"
> #include "libavutil/error.h"
> #include "libavutil/mem_internal.h"
>
> #include "cabac.h"
>
> -DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
> +av_mcmodel_small DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
> 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
> 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
> 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
> diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
> index bc7f2b17b8..9a5af2871c 100644
> --- a/libavcodec/x86/constants.c
> +++ b/libavcodec/x86/constants.c
> @@ -18,17 +18,21 @@
> * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> */
>
> +#include "libavutil/attributes.h"
> #include "libavutil/mem_internal.h"
> #include "libavutil/x86/asm.h" // for xmm_reg
> #include "constants.h"
>
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = { 0x0001000100010001ULL, 0x0001000100010001ULL,
> 0x0001000100010001ULL, 0x0001000100010001ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = { 0x0002000200020002ULL, 0x0002000200020002ULL,
> 0x0002000200020002ULL, 0x0002000200020002ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = { 0x0003000300030003ULL, 0x0003000300030003ULL };
> +av_mcmodel_small
> DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = { 0x0004000400040004ULL, 0x0004000400040004ULL,
> 0x0004000400040004ULL, 0x0004000400040004ULL };
> +av_mcmodel_small
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = { 0x0005000500050005ULL, 0x0005000500050005ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = { 0x0008000800080008ULL, 0x0008000800080008ULL };
> DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = { 0x0009000900090009ULL, 0x0009000900090009ULL };
> @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = { 0x0100010001000100ULL, 0x010
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = { 0x0200020002000200ULL, 0x0200020002000200ULL,
> 0x0200020002000200ULL, 0x0200020002000200ULL };
> DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = { 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
> DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = { 0x0400040004000400ULL, 0x0400040004000400ULL,
> @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = { 0xFFFFFFFFFFFFFFFFULL, 0xFFF
>
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = { 0x0000000000000000ULL, 0x0000000000000000ULL,
> 0x0000000000000000ULL, 0x0000000000000000ULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = { 0x0101010101010101ULL, 0x0101010101010101ULL,
> 0x0101010101010101ULL, 0x0101010101010101ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = { 0x0202020202020202ULL, 0x0202020202020202ULL,
> 0x0202020202020202ULL, 0x0202020202020202ULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = { 0x0303030303030303ULL, 0x0303030303030303ULL,
> 0x0303030303030303ULL, 0x0303030303030303ULL };
> DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = { 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
> +av_mcmodel_small
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = { 0x8080808080808080ULL, 0x8080808080808080ULL,
> 0x8080808080808080ULL, 0x8080808080808080ULL };
> DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = { 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
> diff --git a/libavutil/attributes.h b/libavutil/attributes.h
> index 04c615c952..704a7070db 100644
> --- a/libavutil/attributes.h
> +++ b/libavutil/attributes.h
> @@ -104,6 +104,12 @@
> # define attribute_deprecated
> #endif
>
> +#if defined(__clang__) && __has_attribute(model)
> +# define av_mcmodel_small __attribute__((model("small")))
> +#else
> +# define av_mcmodel_small
> +#endif
That's a bad macro name, moreover being a compiler-specific name.
Could you consider calling it simply av_global?
[-- Attachment #1.1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 637 bytes --]
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
[-- Attachment #2: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [FFmpeg-devel] [PATCH] Mark C globals with small code model
2025-02-28 2:31 ` [FFmpeg-devel] [PATCH] " Lynne
@ 2025-03-07 2:14 ` Pranav Kant via ffmpeg-devel
0 siblings, 0 replies; 16+ messages in thread
From: Pranav Kant via ffmpeg-devel @ 2025-03-07 2:14 UTC (permalink / raw)
To: FFmpeg development discussions and patches; +Cc: Pranav Kant
I think you were looking at an older version of the patch. Newer version
didn't have this. Anyhow, there's a new version I uploaded (v3).
On Thu, Feb 27, 2025 at 6:31 PM Lynne <dev@lynne.ee> wrote:
> On 25/02/2025 22:37, Pranav Kant via ffmpeg-devel wrote:
> > By default, all globals in C/C++ compiled by clang are allocated
> > in non-large data sections. See [1] for background on code models.
> > For PIC (Position independent code), this is fine as long as binary is
> > small but as binary size increases, users maybe want to use medium/large
> > code models (-mcmodel=medium) which moves data in to large sections.
> > As data in these large sections cannot be accessed using PIC code
> > anymore (as it may be too far away), compiler ends up using a different
> > instruction sequence when building C/C++ code -- using GOT to access
> > these globals (which can be relaxed by linker at link time if binary
> > ends up being smaller). However, assembly files continue to access these
> > globals defined in C/C++ files using older (and invalid instruction
> > sequence). So, we mark all such globals with an attribute that forces
> > them to be allocated in small sections allowing them to validly be
> > accessed from the assembly code.
> >
> > This patch should not have any affect on builds that use small code
> > model, which is the default mode.
> >
> > [1]
> https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
> > ---
> > libavcodec/ac3dsp.c | 2 +-
> > libavcodec/cabac.c | 3 ++-
> > libavcodec/x86/constants.c | 8 ++++++++
> > libavutil/attributes.h | 6 ++++++
> > 4 files changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> > index 730fa70fff..43b4fcbda9 100644
> > --- a/libavcodec/ac3dsp.c
> > +++ b/libavcodec/ac3dsp.c
> > @@ -104,7 +104,7 @@ static void ac3_update_bap_counts_c(uint16_t
> mant_cnt[16], uint8_t *bap,
> > mant_cnt[bap[len]]++;
> > }
> >
> > -DECLARE_ALIGNED(16, const uint16_t, ff_ac3_bap_bits)[16] = {
> > +av_mcmodel_small DECLARE_ALIGNED(16, const uint16_t,
> ff_ac3_bap_bits)[16] = {
> > 0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
> > };
> >
> > diff --git a/libavcodec/cabac.c b/libavcodec/cabac.c
> > index 7d41cd2ae6..dfc3ba135a 100644
> > --- a/libavcodec/cabac.c
> > +++ b/libavcodec/cabac.c
> > @@ -24,12 +24,13 @@
> > * Context Adaptive Binary Arithmetic Coder.
> > */
> >
> > +#include "libavutil/attributes.h"
> > #include "libavutil/error.h"
> > #include "libavutil/mem_internal.h"
> >
> > #include "cabac.h"
> >
> > -DECLARE_ASM_ALIGNED(1, const uint8_t, ff_h264_cabac_tables)[512 +
> 4*2*64 + 4*64 + 63] = {
> > +av_mcmodel_small DECLARE_ASM_ALIGNED(1, const uint8_t,
> ff_h264_cabac_tables)[512 + 4*2*64 + 4*64 + 63] = {
> > 9,8,7,7,6,6,6,6,5,5,5,5,5,5,5,5,
> > 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
> > 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
> > diff --git a/libavcodec/x86/constants.c b/libavcodec/x86/constants.c
> > index bc7f2b17b8..9a5af2871c 100644
> > --- a/libavcodec/x86/constants.c
> > +++ b/libavcodec/x86/constants.c
> > @@ -18,17 +18,21 @@
> > * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > */
> >
> > +#include "libavutil/attributes.h"
> > #include "libavutil/mem_internal.h"
> > #include "libavutil/x86/asm.h" // for xmm_reg
> > #include "constants.h"
> >
> > +av_mcmodel_small
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1) = {
> 0x0001000100010001ULL, 0x0001000100010001ULL,
> >
> 0x0001000100010001ULL, 0x0001000100010001ULL };
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pw_2) = {
> 0x0002000200020002ULL, 0x0002000200020002ULL,
> >
> 0x0002000200020002ULL, 0x0002000200020002ULL };
> > DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_3) = {
> 0x0003000300030003ULL, 0x0003000300030003ULL };
> > +av_mcmodel_small
> > DECLARE_ASM_ALIGNED(32, const ymm_reg, ff_pw_4) = {
> 0x0004000400040004ULL, 0x0004000400040004ULL,
> >
> 0x0004000400040004ULL, 0x0004000400040004ULL };
> > +av_mcmodel_small
> > DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_5) = {
> 0x0005000500050005ULL, 0x0005000500050005ULL };
> > DECLARE_ALIGNED(16, const xmm_reg, ff_pw_8) = {
> 0x0008000800080008ULL, 0x0008000800080008ULL };
> > DECLARE_ASM_ALIGNED(16, const xmm_reg, ff_pw_9) = {
> 0x0009000900090009ULL, 0x0009000900090009ULL };
> > @@ -49,6 +53,7 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_256) = {
> 0x0100010001000100ULL, 0x010
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pw_512) = {
> 0x0200020002000200ULL, 0x0200020002000200ULL,
> >
> 0x0200020002000200ULL, 0x0200020002000200ULL };
> > DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1019) = {
> 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
> > +av_mcmodel_small
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1023) = {
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL,
> >
> 0x03ff03ff03ff03ffULL, 0x03ff03ff03ff03ffULL};
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pw_1024) = {
> 0x0400040004000400ULL, 0x0400040004000400ULL,
> > @@ -66,13 +71,16 @@ DECLARE_ALIGNED(32, const ymm_reg, ff_pw_m1) = {
> 0xFFFFFFFFFFFFFFFFULL, 0xFFF
> >
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pb_0) = {
> 0x0000000000000000ULL, 0x0000000000000000ULL,
> >
> 0x0000000000000000ULL, 0x0000000000000000ULL };
> > +av_mcmodel_small
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pb_1) = {
> 0x0101010101010101ULL, 0x0101010101010101ULL,
> >
> 0x0101010101010101ULL, 0x0101010101010101ULL };
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pb_2) = {
> 0x0202020202020202ULL, 0x0202020202020202ULL,
> >
> 0x0202020202020202ULL, 0x0202020202020202ULL };
> > +av_mcmodel_small
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pb_3) = {
> 0x0303030303030303ULL, 0x0303030303030303ULL,
> >
> 0x0303030303030303ULL, 0x0303030303030303ULL };
> > DECLARE_ALIGNED(32, const xmm_reg, ff_pb_15) = {
> 0x0F0F0F0F0F0F0F0FULL, 0x0F0F0F0F0F0F0F0FULL };
> > +av_mcmodel_small
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pb_80) = {
> 0x8080808080808080ULL, 0x8080808080808080ULL,
> >
> 0x8080808080808080ULL, 0x8080808080808080ULL };
> > DECLARE_ALIGNED(32, const ymm_reg, ff_pb_FE) = {
> 0xFEFEFEFEFEFEFEFEULL, 0xFEFEFEFEFEFEFEFEULL,
> > diff --git a/libavutil/attributes.h b/libavutil/attributes.h
> > index 04c615c952..704a7070db 100644
> > --- a/libavutil/attributes.h
> > +++ b/libavutil/attributes.h
> > @@ -104,6 +104,12 @@
> > # define attribute_deprecated
> > #endif
> >
> > +#if defined(__clang__) && __has_attribute(model)
> > +# define av_mcmodel_small __attribute__((model("small")))
> > +#else
> > +# define av_mcmodel_small
> > +#endif
>
> That's a bad macro name, moreover being a compiler-specific name.
> Could you consider calling it simply av_global?
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 16+ messages in thread