* [FFmpeg-devel] [PATCH 01/41] avcodec/x86/qpeldsp: Remove unused ff_put_no_rnd_pixels16_l2_3dnow
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 02/41] avcodec/x86/hevcdsp_init: Remove unnecessary inclusion of get_bits.h Andreas Rheinhardt
` (40 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
qpeldsp does not use 3dnow; it is MMXEXT-only.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/qpeldsp.asm | 2 --
1 file changed, 2 deletions(-)
diff --git a/libavcodec/x86/qpeldsp.asm b/libavcodec/x86/qpeldsp.asm
index 282faed14f..3a6a650654 100644
--- a/libavcodec/x86/qpeldsp.asm
+++ b/libavcodec/x86/qpeldsp.asm
@@ -166,8 +166,6 @@ cglobal put_no_rnd_pixels16_l2, 6,6
INIT_MMX mmxext
PUT_NO_RND_PIXELS16_l2
-INIT_MMX 3dnow
-PUT_NO_RND_PIXELS16_l2
%macro MPEG4_QPEL16_H_LOWPASS 1
cglobal %1_mpeg4_qpel16_h_lowpass, 5, 5, 0, 16
--
2.34.1
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH 02/41] avcodec/x86/hevcdsp_init: Remove unnecessary inclusion of get_bits.h
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 01/41] avcodec/x86/qpeldsp: Remove unused ff_put_no_rnd_pixels16_l2_3dnow Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 03/41] avcodec/hevcdec: Make ff_hevc_pel_weight static Andreas Rheinhardt
` (39 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
This file does not use anything from get_bits.h at all;
furthermore hevcdsp.h now includes get_bits.h itself.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/hevcdsp_init.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/libavcodec/x86/hevcdsp_init.c b/libavcodec/x86/hevcdsp_init.c
index f3061bda84..48f48a925f 100644
--- a/libavcodec/x86/hevcdsp_init.c
+++ b/libavcodec/x86/hevcdsp_init.c
@@ -25,7 +25,6 @@
#include "libavutil/mem_internal.h"
#include "libavutil/x86/asm.h"
#include "libavutil/x86/cpu.h"
-#include "libavcodec/get_bits.h" /* required for hevcdsp.h GetBitContext */
#include "libavcodec/hevcdsp.h"
#include "libavcodec/x86/hevcdsp.h"
--
2.34.1
* [FFmpeg-devel] [PATCH 03/41] avcodec/hevcdec: Make ff_hevc_pel_weight static
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 01/41] avcodec/x86/qpeldsp: Remove unused ff_put_no_rnd_pixels16_l2_3dnow Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 02/41] avcodec/x86/hevcdsp_init: Remove unnecessary inclusion of get_bits.h Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 04/41] avcodec/v4l2_m2m: Remove unused ff_v4l2_m2m_codec_full_reinit Andreas Rheinhardt
` (38 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Only used here.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/hevcdec.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
index f782ea6394..e84c30dd13 100644
--- a/libavcodec/hevcdec.c
+++ b/libavcodec/hevcdec.c
@@ -52,7 +52,7 @@
#include "thread.h"
#include "threadframe.h"
-const uint8_t ff_hevc_pel_weight[65] = { [2] = 0, [4] = 1, [6] = 2, [8] = 3, [12] = 4, [16] = 5, [24] = 6, [32] = 7, [48] = 8, [64] = 9 };
+static const uint8_t hevc_pel_weight[65] = { [2] = 0, [4] = 1, [6] = 2, [8] = 3, [12] = 4, [16] = 5, [24] = 6, [32] = 7, [48] = 8, [64] = 9 };
/**
* NOTE: Each function hls_foo correspond to the function foo in the
@@ -1509,7 +1509,7 @@ static void luma_mc_uni(HEVCContext *s, uint8_t *dst, ptrdiff_t dststride,
int my = mv->y & 3;
int weight_flag = (s->sh.slice_type == HEVC_SLICE_P && s->ps.pps->weighted_pred_flag) ||
(s->sh.slice_type == HEVC_SLICE_B && s->ps.pps->weighted_bipred_flag);
- int idx = ff_hevc_pel_weight[block_w];
+ int idx = hevc_pel_weight[block_w];
x_off += mv->x >> 2;
y_off += mv->y >> 2;
@@ -1576,7 +1576,7 @@ static void luma_mc_uni(HEVCContext *s, uint8_t *dst, ptrdiff_t dststride,
int y_off0 = y_off + (mv0->y >> 2);
int x_off1 = x_off + (mv1->x >> 2);
int y_off1 = y_off + (mv1->y >> 2);
- int idx = ff_hevc_pel_weight[block_w];
+ int idx = hevc_pel_weight[block_w];
uint8_t *src0 = ref0->data[0] + y_off0 * src0stride + (int)((unsigned)x_off0 << s->ps.sps->pixel_shift);
uint8_t *src1 = ref1->data[0] + y_off1 * src1stride + (int)((unsigned)x_off1 << s->ps.sps->pixel_shift);
@@ -1658,7 +1658,7 @@ static void chroma_mc_uni(HEVCContext *s, uint8_t *dst0,
const Mv *mv = &current_mv->mv[reflist];
int weight_flag = (s->sh.slice_type == HEVC_SLICE_P && s->ps.pps->weighted_pred_flag) ||
(s->sh.slice_type == HEVC_SLICE_B && s->ps.pps->weighted_bipred_flag);
- int idx = ff_hevc_pel_weight[block_w];
+ int idx = hevc_pel_weight[block_w];
int hshift = s->ps.sps->hshift[1];
int vshift = s->ps.sps->vshift[1];
intptr_t mx = av_mod_uintp2(mv->x, 2 + hshift);
@@ -1743,7 +1743,7 @@ static void chroma_mc_bi(HEVCContext *s, uint8_t *dst0, ptrdiff_t dststride, AVF
int y_off0 = y_off + (mv0->y >> (2 + vshift));
int x_off1 = x_off + (mv1->x >> (2 + hshift));
int y_off1 = y_off + (mv1->y >> (2 + vshift));
- int idx = ff_hevc_pel_weight[block_w];
+ int idx = hevc_pel_weight[block_w];
src1 += y_off0 * src1stride + (int)((unsigned)x_off0 << s->ps.sps->pixel_shift);
src2 += y_off1 * src2stride + (int)((unsigned)x_off1 << s->ps.sps->pixel_shift);
--
2.34.1
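The table made static in this patch uses C99 designated initializers, so every width without an explicit entry maps to index 0. A minimal standalone sketch of that behavior follows; the names `pel_weight` and `pel_weight_index` and the bounds check are illustrative assumptions, not FFmpeg code.

```c
#include <assert.h>
#include <stdint.h>

/* File-local lookup table: block width -> index into per-width function
 * tables. Entries without a designated initializer are zero-initialized. */
static const uint8_t pel_weight[65] = {
    [2]  = 0, [4]  = 1, [6]  = 2, [8]  = 3, [12] = 4,
    [16] = 5, [24] = 6, [32] = 7, [48] = 8, [64] = 9
};

/* Hypothetical accessor (not part of FFmpeg) that guards the array bounds. */
static int pel_weight_index(int block_w)
{
    assert(block_w >= 0 && block_w <= 64);
    return pel_weight[block_w];
}
```

Because the table has internal linkage, the `ff_` prefix that FFmpeg uses for symbols shared across files is no longer needed, which is why the patch also renames it.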
* [FFmpeg-devel] [PATCH 04/41] avcodec/v4l2_m2m: Remove unused ff_v4l2_m2m_codec_full_reinit
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (2 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 03/41] avcodec/hevcdec: Make ff_hevc_pel_weight static Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 05/41] avcodec/videodsp: Make ff_emulated_edge_mc_16 static Andreas Rheinhardt
` (37 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Unused since df701ed0b582a6b5c763310b4225446089cbcfb1.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/v4l2_m2m.c | 76 -------------------------------------------
1 file changed, 76 deletions(-)
diff --git a/libavcodec/v4l2_m2m.c b/libavcodec/v4l2_m2m.c
index 3178ef06b8..51932baf84 100644
--- a/libavcodec/v4l2_m2m.c
+++ b/libavcodec/v4l2_m2m.c
@@ -244,82 +244,6 @@ int ff_v4l2_m2m_codec_reinit(V4L2m2mContext *s)
return 0;
}
-int ff_v4l2_m2m_codec_full_reinit(V4L2m2mContext *s)
-{
- void *log_ctx = s->avctx;
- int ret;
-
- av_log(log_ctx, AV_LOG_DEBUG, "%s full reinit\n", s->devname);
-
- /* wait for pending buffer references */
- if (atomic_load(&s->refcount))
- while(sem_wait(&s->refsync) == -1 && errno == EINTR);
-
- ret = ff_v4l2_context_set_status(&s->output, VIDIOC_STREAMOFF);
- if (ret) {
- av_log(log_ctx, AV_LOG_ERROR, "output VIDIOC_STREAMOFF\n");
- goto error;
- }
-
- ret = ff_v4l2_context_set_status(&s->capture, VIDIOC_STREAMOFF);
- if (ret) {
- av_log(log_ctx, AV_LOG_ERROR, "capture VIDIOC_STREAMOFF\n");
- goto error;
- }
-
- /* release and unmmap the buffers */
- ff_v4l2_context_release(&s->output);
- ff_v4l2_context_release(&s->capture);
-
- /* start again now that we know the stream dimensions */
- s->draining = 0;
- s->reinit = 0;
-
- ret = ff_v4l2_context_get_format(&s->output, 0);
- if (ret) {
- av_log(log_ctx, AV_LOG_DEBUG, "v4l2 output format not supported\n");
- goto error;
- }
-
- ret = ff_v4l2_context_get_format(&s->capture, 0);
- if (ret) {
- av_log(log_ctx, AV_LOG_DEBUG, "v4l2 capture format not supported\n");
- goto error;
- }
-
- ret = ff_v4l2_context_set_format(&s->output);
- if (ret) {
- av_log(log_ctx, AV_LOG_ERROR, "can't set v4l2 output format\n");
- goto error;
- }
-
- ret = ff_v4l2_context_set_format(&s->capture);
- if (ret) {
- av_log(log_ctx, AV_LOG_ERROR, "can't to set v4l2 capture format\n");
- goto error;
- }
-
- ret = ff_v4l2_context_init(&s->output);
- if (ret) {
- av_log(log_ctx, AV_LOG_ERROR, "no v4l2 output context's buffers\n");
- goto error;
- }
-
- /* decoder's buffers need to be updated at a later stage */
- if (s->avctx && !av_codec_is_decoder(s->avctx->codec)) {
- ret = ff_v4l2_context_init(&s->capture);
- if (ret) {
- av_log(log_ctx, AV_LOG_ERROR, "no v4l2 capture context's buffers\n");
- goto error;
- }
- }
-
- return 0;
-
-error:
- return ret;
-}
-
static void v4l2_m2m_destroy_context(void *opaque, uint8_t *context)
{
V4L2m2mContext *s = (V4L2m2mContext*)context;
--
2.34.1
* [FFmpeg-devel] [PATCH 05/41] avcodec/videodsp: Make ff_emulated_edge_mc_16 static
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (3 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 04/41] avcodec/v4l2_m2m: Remove unused ff_v4l2_m2m_codec_full_reinit Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-10 15:50 ` Ronald S. Bultje
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 06/41] avcodec/x86/fpel: Remove unused ff_avg_pixels4_mmx Andreas Rheinhardt
` (36 subsequent siblings)
41 siblings, 1 reply; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Only ff_emulated_edge_mc_8() is used outside of lavc/videodsp.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/videodsp.c | 4 ++++
libavcodec/videodsp.h | 1 -
libavcodec/videodsp_template.c | 1 +
3 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/libavcodec/videodsp.c b/libavcodec/videodsp.c
index 2198d46c15..02af046b81 100644
--- a/libavcodec/videodsp.c
+++ b/libavcodec/videodsp.c
@@ -25,11 +25,15 @@
#include "videodsp.h"
#define BIT_DEPTH 8
+#define STATIC
#include "videodsp_template.c"
+#undef STATIC
#undef BIT_DEPTH
#define BIT_DEPTH 16
+#define STATIC static
#include "videodsp_template.c"
+#undef STATIC
#undef BIT_DEPTH
static void just_return(uint8_t *buf, ptrdiff_t stride, int h)
diff --git a/libavcodec/videodsp.h b/libavcodec/videodsp.h
index ac971dc57f..b5219d236c 100644
--- a/libavcodec/videodsp.h
+++ b/libavcodec/videodsp.h
@@ -36,7 +36,6 @@ void ff_emulated_edge_mc_ ## depth(uint8_t *dst, const uint8_t *src, \
int src_x, int src_y, int w, int h);
EMULATED_EDGE(8)
-EMULATED_EDGE(16)
typedef struct VideoDSPContext {
/**
diff --git a/libavcodec/videodsp_template.c b/libavcodec/videodsp_template.c
index 55123a5844..8bc3290248 100644
--- a/libavcodec/videodsp_template.c
+++ b/libavcodec/videodsp_template.c
@@ -20,6 +20,7 @@
*/
#include "bit_depth_template.c"
+STATIC
void FUNC(ff_emulated_edge_mc)(uint8_t *buf, const uint8_t *src,
ptrdiff_t buf_linesize,
ptrdiff_t src_linesize,
--
2.34.1
* Re: [FFmpeg-devel] [PATCH 05/41] avcodec/videodsp: Make ff_emulated_edge_mc_16 static
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 05/41] avcodec/videodsp: Make ff_emulated_edge_mc_16 static Andreas Rheinhardt
@ 2022-06-10 15:50 ` Ronald S. Bultje
2022-06-10 16:07 ` Andreas Rheinhardt
0 siblings, 1 reply; 46+ messages in thread
From: Ronald S. Bultje @ 2022-06-10 15:50 UTC (permalink / raw)
To: FFmpeg development discussions and patches; +Cc: Andreas Rheinhardt
Hi,
On Thu, Jun 9, 2022 at 7:56 PM Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> Only ff_emulated_edge_mc_8() is used outside of lavc/videodsp.c.
>
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
> ---
> libavcodec/videodsp.c | 4 ++++
> libavcodec/videodsp.h | 1 -
> libavcodec/videodsp_template.c | 1 +
> 3 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/videodsp.c b/libavcodec/videodsp.c
> index 2198d46c15..02af046b81 100644
> --- a/libavcodec/videodsp.c
> +++ b/libavcodec/videodsp.c
> @@ -25,11 +25,15 @@
> #include "videodsp.h"
>
> #define BIT_DEPTH 8
> +#define STATIC
> #include "videodsp_template.c"
> +#undef STATIC
> #undef BIT_DEPTH
>
> #define BIT_DEPTH 16
> +#define STATIC static
> #include "videodsp_template.c"
> +#undef STATIC
> #undef BIT_DEPTH
>
> static void just_return(uint8_t *buf, ptrdiff_t stride, int h)
>
[..]
> diff --git a/libavcodec/videodsp_template.c b/libavcodec/videodsp_template.c
> index 55123a5844..8bc3290248 100644
> --- a/libavcodec/videodsp_template.c
> +++ b/libavcodec/videodsp_template.c
> @@ -20,6 +20,7 @@
> */
>
> #include "bit_depth_template.c"
> +STATIC
> void FUNC(ff_emulated_edge_mc)(uint8_t *buf, const uint8_t *src,
> ptrdiff_t buf_linesize,
> ptrdiff_t src_linesize,
> --
> 2.34.1
>
This splits the "staticness" over two places (i.e. to understand what
STATIC means and/or why it exists, you have to look at two places), and
also doesn't explain why we need "variable staticness" (i.e. one being
static, but not the other one).
Maybe you could use the following:
#if BIT_DEPTH != 8 // we make a call to the 8-bit version in $fill_me_in_here$
static
#endif
void FUNC(ff_..
That way the meaning of STATIC is not obfuscated (I know, STATIC should be
obvious, but it's still an indirection) and the reasoning is included also.
Ronald
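Both variants discussed in this message aim at the same outcome: the 8-bit instantiation keeps external linkage while the 16-bit one becomes file-local. A single-file sketch of that outcome is given here, assuming a macro in place of the re-included template so that the example is self-contained. `DEFINE_FILL` and `fill_edge_*` are hypothetical names, not FFmpeg's, and the function body is a placeholder rather than the real edge emulation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-bit-depth template: one macro expansion
 * plays the role of one #include of the template file. The linkage argument
 * may be empty (external linkage) or "static". */
#define DEFINE_FILL(depth, type, linkage)                      \
linkage void fill_edge_##depth(type *dst, int n, type value)   \
{                                                              \
    assert(dst != 0 && n >= 0);                                \
    for (int i = 0; i < n; i++)                                \
        dst[i] = value;                                        \
}

/* The 8-bit version stays externally visible because other files call it. */
DEFINE_FILL(8, uint8_t, )
/* The 16-bit version is used only in this file, so it is made static. */
DEFINE_FILL(16, uint16_t, static)
```

With a real template file, the `#if BIT_DEPTH != 8` form keeps the linkage decision and its explanation next to the function definition, whereas the macro argument here (like `STATIC` in the patch) moves that decision to the instantiation site.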
* Re: [FFmpeg-devel] [PATCH 05/41] avcodec/videodsp: Make ff_emulated_edge_mc_16 static
2022-06-10 15:50 ` Ronald S. Bultje
@ 2022-06-10 16:07 ` Andreas Rheinhardt
0 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-10 16:07 UTC (permalink / raw)
To: Ronald S. Bultje, FFmpeg development discussions and patches
Ronald S. Bultje:
> Hi,
>
> On Thu, Jun 9, 2022 at 7:56 PM Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>
>> Only ff_emulated_edge_mc_8() is used outside of lavc/videodsp.c.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
>> ---
>> libavcodec/videodsp.c | 4 ++++
>> libavcodec/videodsp.h | 1 -
>> libavcodec/videodsp_template.c | 1 +
>> 3 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/videodsp.c b/libavcodec/videodsp.c
>> index 2198d46c15..02af046b81 100644
>> --- a/libavcodec/videodsp.c
>> +++ b/libavcodec/videodsp.c
>> @@ -25,11 +25,15 @@
>> #include "videodsp.h"
>>
>> #define BIT_DEPTH 8
>> +#define STATIC
>> #include "videodsp_template.c"
>> +#undef STATIC
>> #undef BIT_DEPTH
>>
>> #define BIT_DEPTH 16
>> +#define STATIC static
>> #include "videodsp_template.c"
>> +#undef STATIC
>> #undef BIT_DEPTH
>>
>> static void just_return(uint8_t *buf, ptrdiff_t stride, int h)
>>
> [..]
>
>> diff --git a/libavcodec/videodsp_template.c b/libavcodec/videodsp_template.c
>> index 55123a5844..8bc3290248 100644
>> --- a/libavcodec/videodsp_template.c
>> +++ b/libavcodec/videodsp_template.c
>> @@ -20,6 +20,7 @@
>> */
>>
>> #include "bit_depth_template.c"
>> +STATIC
>> void FUNC(ff_emulated_edge_mc)(uint8_t *buf, const uint8_t *src,
>> ptrdiff_t buf_linesize,
>> ptrdiff_t src_linesize,
>> --
>> 2.34.1
>>
>
> This splits the "staticness" over two places (i.e. to understand what
> STATIC means and/or why it exists, you have to look at two places), and
> also doesn't explain why we need "variable staticness" (i.e. one being
> static, but not the other one).
>
> Maybe you could use the following:
>
> #if BIT_DEPTH != 8 // we make a call to the 8-bit version in $fill_me_in_here$
> static
> #endif
> void FUNC(ff_..
>
> That way the meaning of STATIC is not obfuscated (I know, STATIC should be
> obvious, but it's still an indirection) and the reasoning is included also.
>
> Ronald
>
I'll change it if you prefer it that way.
- Andreas
* [FFmpeg-devel] [PATCH 06/41] avcodec/x86/fpel: Remove unused ff_avg_pixels4_mmx
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (4 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 05/41] avcodec/videodsp: Make ff_emulated_edge_mc_16 static Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 07/41] avcodec/x86/rv34dsp: Remove unused ff_rv34_idct_dc_mmxext Andreas Rheinhardt
` (35 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/fpel.asm | 1 -
1 file changed, 1 deletion(-)
diff --git a/libavcodec/x86/fpel.asm b/libavcodec/x86/fpel.asm
index 961a1587a7..d38a1b1035 100644
--- a/libavcodec/x86/fpel.asm
+++ b/libavcodec/x86/fpel.asm
@@ -90,7 +90,6 @@ cglobal %1_pixels%2, 4,5,4
INIT_MMX mmx
OP_PIXELS put, 4
-OP_PIXELS avg, 4
OP_PIXELS put, 8
OP_PIXELS avg, 8
OP_PIXELS put, 16
--
2.34.1
* [FFmpeg-devel] [PATCH 07/41] avcodec/x86/rv34dsp: Remove unused ff_rv34_idct_dc_mmxext
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (5 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 06/41] avcodec/x86/fpel: Remove unused ff_avg_pixels4_mmx Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 08/41] avcodec/x86/h264_qpel_8bit: Remove unused function Andreas Rheinhardt
` (34 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Forgotten in 9ba9c3402499d90e54f8aa111b62c278206d11af.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/rv34dsp.asm | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/libavcodec/x86/rv34dsp.asm b/libavcodec/x86/rv34dsp.asm
index 692b4acfcd..5568ddfdf8 100644
--- a/libavcodec/x86/rv34dsp.asm
+++ b/libavcodec/x86/rv34dsp.asm
@@ -44,10 +44,10 @@ SECTION .text
sar %1, 10
%endmacro
-%macro rv34_idct 1
-cglobal rv34_idct_%1, 1, 2, 0
+INIT_MMX mmxext
+cglobal rv34_idct_dc_noround, 1, 2, 0
movsx r1, word [r0]
- IDCT_DC r1
+ IDCT_DC_NOROUND r1
movd m0, r1d
pshufw m0, m0, 0
movq [r0+ 0], m0
@@ -55,13 +55,6 @@ cglobal rv34_idct_%1, 1, 2, 0
movq [r0+16], m0
movq [r0+24], m0
REP_RET
-%endmacro
-
-INIT_MMX mmxext
-%define IDCT_DC IDCT_DC_ROUND
-rv34_idct dc
-%define IDCT_DC IDCT_DC_NOROUND
-rv34_idct dc_noround
; ff_rv34_idct_dc_add_mmx(uint8_t *dst, int stride, int dc);
%if ARCH_X86_32
--
2.34.1
* [FFmpeg-devel] [PATCH 08/41] avcodec/x86/h264_qpel_8bit: Remove unused function
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (6 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 07/41] avcodec/x86/rv34dsp: Remove unused ff_rv34_idct_dc_mmxext Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 09/41] avcodec/x86/vc1dsp_init: Disable overridden functions on x64 Andreas Rheinhardt
` (33 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Namely ff_avg_h264_qpel8or16_hv1_lowpass_op_mmxext. It seems to exist
since 610e00b3594bf0f2a75713f20e9c4edf0d03a818 (a function like this
already existed before that commit, but it was static and
av_always_inline and was therefore not present in the actual binaries).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/h264_qpel_8bit.asm | 1 -
1 file changed, 1 deletion(-)
diff --git a/libavcodec/x86/h264_qpel_8bit.asm b/libavcodec/x86/h264_qpel_8bit.asm
index 2d287ba443..03c7d88f8c 100644
--- a/libavcodec/x86/h264_qpel_8bit.asm
+++ b/libavcodec/x86/h264_qpel_8bit.asm
@@ -583,7 +583,6 @@ cglobal %1_h264_qpel8or16_hv1_lowpass_op, 4,4,8 ; src, tmp, srcStride, size
INIT_MMX mmxext
QPEL8OR16_HV1_LOWPASS_OP put
-QPEL8OR16_HV1_LOWPASS_OP avg
INIT_XMM sse2
QPEL8OR16_HV1_LOWPASS_OP put
--
2.34.1
* [FFmpeg-devel] [PATCH 09/41] avcodec/x86/vc1dsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (7 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 08/41] avcodec/x86/h264_qpel_8bit: Remove unused function Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 10/41] avcodec/x86/ac3dsp_init: " Andreas Rheinhardt
` (32 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables these functions
at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/h264_chromamc.asm | 2 ++
libavcodec/x86/vc1dsp_init.c | 41 +++++++++++++++++++---------
libavcodec/x86/vc1dsp_loopfilter.asm | 2 ++
3 files changed, 32 insertions(+), 13 deletions(-)
diff --git a/libavcodec/x86/h264_chromamc.asm b/libavcodec/x86/h264_chromamc.asm
index b5a78b537d..0421fa8695 100644
--- a/libavcodec/x86/h264_chromamc.asm
+++ b/libavcodec/x86/h264_chromamc.asm
@@ -448,7 +448,9 @@ chroma_mc2_mmx_func avg, h264
INIT_MMX 3dnow
chroma_mc8_mmx_func avg, h264, _rnd
+%if ARCH_X86_32
chroma_mc8_mmx_func avg, vc1, _nornd
+%endif
chroma_mc8_mmx_func avg, rv40
chroma_mc4_mmx_func avg, h264
chroma_mc4_mmx_func avg, rv40
diff --git a/libavcodec/x86/vc1dsp_init.c b/libavcodec/x86/vc1dsp_init.c
index 2fbf0b3a74..66d894061c 100644
--- a/libavcodec/x86/vc1dsp_init.c
+++ b/libavcodec/x86/vc1dsp_init.c
@@ -33,9 +33,10 @@
#include "vc1dsp.h"
#include "config.h"
-#define LOOP_FILTER(EXT) \
+#define LOOP_FILTER4(EXT) \
void ff_vc1_v_loop_filter4_ ## EXT(uint8_t *src, ptrdiff_t stride, int pq); \
-void ff_vc1_h_loop_filter4_ ## EXT(uint8_t *src, ptrdiff_t stride, int pq); \
+void ff_vc1_h_loop_filter4_ ## EXT(uint8_t *src, ptrdiff_t stride, int pq);
+#define LOOP_FILTER816(EXT) \
void ff_vc1_v_loop_filter8_ ## EXT(uint8_t *src, ptrdiff_t stride, int pq); \
void ff_vc1_h_loop_filter8_ ## EXT(uint8_t *src, ptrdiff_t stride, int pq); \
\
@@ -52,9 +53,13 @@ static void vc1_h_loop_filter16_ ## EXT(uint8_t *src, ptrdiff_t stride, int pq)
}
#if HAVE_X86ASM
-LOOP_FILTER(mmxext)
-LOOP_FILTER(sse2)
-LOOP_FILTER(ssse3)
+LOOP_FILTER4(mmxext)
+#if ARCH_X86_32
+LOOP_FILTER816(mmxext)
+#endif
+LOOP_FILTER816(sse2)
+LOOP_FILTER4(ssse3)
+LOOP_FILTER816(ssse3)
void ff_vc1_h_loop_filter8_sse4(uint8_t *src, ptrdiff_t stride, int pq);
@@ -71,12 +76,14 @@ static void vc1_h_loop_filter16_sse4(uint8_t *src, ptrdiff_t stride, int pq)
ff_ ## OP ## pixels ## DEPTH ## INSN(dst, src, stride, DEPTH); \
}
-DECLARE_FUNCTION(put_, 8, _mmx)
+#if ARCH_X86_32
DECLARE_FUNCTION(put_, 16, _mmx)
DECLARE_FUNCTION(avg_, 8, _mmx)
DECLARE_FUNCTION(avg_, 16, _mmx)
-DECLARE_FUNCTION(avg_, 8, _mmxext)
DECLARE_FUNCTION(avg_, 16, _mmxext)
+#endif
+DECLARE_FUNCTION(put_, 8, _mmx)
+DECLARE_FUNCTION(avg_, 8, _mmxext)
DECLARE_FUNCTION(put_, 16, _sse2)
DECLARE_FUNCTION(avg_, 16, _sse2)
@@ -114,9 +121,10 @@ av_cold void ff_vc1dsp_init_x86(VC1DSPContext *dsp)
if (EXTERNAL_MMXEXT(cpu_flags))
ff_vc1dsp_init_mmxext(dsp);
-#define ASSIGN_LF(EXT) \
+#define ASSIGN_LF4(EXT) \
dsp->vc1_v_loop_filter4 = ff_vc1_v_loop_filter4_ ## EXT; \
- dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_ ## EXT; \
+ dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_ ## EXT
+#define ASSIGN_LF816(EXT) \
dsp->vc1_v_loop_filter8 = ff_vc1_v_loop_filter8_ ## EXT; \
dsp->vc1_h_loop_filter8 = ff_vc1_h_loop_filter8_ ## EXT; \
dsp->vc1_v_loop_filter16 = vc1_v_loop_filter16_ ## EXT; \
@@ -127,19 +135,25 @@ av_cold void ff_vc1dsp_init_x86(VC1DSPContext *dsp)
dsp->put_no_rnd_vc1_chroma_pixels_tab[0] = ff_put_vc1_chroma_mc8_nornd_mmx;
dsp->put_vc1_mspel_pixels_tab[1][0] = put_vc1_mspel_mc00_8_mmx;
+#if ARCH_X86_32
dsp->put_vc1_mspel_pixels_tab[0][0] = put_vc1_mspel_mc00_16_mmx;
dsp->avg_vc1_mspel_pixels_tab[1][0] = avg_vc1_mspel_mc00_8_mmx;
dsp->avg_vc1_mspel_pixels_tab[0][0] = avg_vc1_mspel_mc00_16_mmx;
}
if (EXTERNAL_AMD3DNOW(cpu_flags)) {
dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_nornd_3dnow;
+#endif
}
if (EXTERNAL_MMXEXT(cpu_flags)) {
- ASSIGN_LF(mmxext);
- dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_nornd_mmxext;
+ ASSIGN_LF4(mmxext);
+#if ARCH_X86_32
+ ASSIGN_LF816(mmxext);
- dsp->avg_vc1_mspel_pixels_tab[1][0] = avg_vc1_mspel_mc00_8_mmxext;
dsp->avg_vc1_mspel_pixels_tab[0][0] = avg_vc1_mspel_mc00_16_mmxext;
+#endif
+ dsp->avg_vc1_mspel_pixels_tab[1][0] = avg_vc1_mspel_mc00_8_mmxext;
+
+ dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_nornd_mmxext;
dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_mmxext;
dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_mmxext;
@@ -156,7 +170,8 @@ av_cold void ff_vc1dsp_init_x86(VC1DSPContext *dsp)
dsp->avg_vc1_mspel_pixels_tab[0][0] = avg_vc1_mspel_mc00_16_sse2;
}
if (EXTERNAL_SSSE3(cpu_flags)) {
- ASSIGN_LF(ssse3);
+ ASSIGN_LF4(ssse3);
+ ASSIGN_LF816(ssse3);
dsp->put_no_rnd_vc1_chroma_pixels_tab[0] = ff_put_vc1_chroma_mc8_nornd_ssse3;
dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_nornd_ssse3;
}
diff --git a/libavcodec/x86/vc1dsp_loopfilter.asm b/libavcodec/x86/vc1dsp_loopfilter.asm
index 74360949dc..3475a682b3 100644
--- a/libavcodec/x86/vc1dsp_loopfilter.asm
+++ b/libavcodec/x86/vc1dsp_loopfilter.asm
@@ -249,6 +249,7 @@ cglobal vc1_h_loop_filter4, 3,5,0
call vc1_h_loop_filter_internal
RET
+%if ARCH_X86_32
; void ff_vc1_v_loop_filter8_mmxext(uint8_t *src, ptrdiff_t stride, int pq)
cglobal vc1_v_loop_filter8, 3,5,0
START_V_FILTER
@@ -265,6 +266,7 @@ cglobal vc1_h_loop_filter8, 3,5,0
lea r0, [r0+4*r1]
call vc1_h_loop_filter_internal
RET
+%endif
%endmacro
INIT_MMX mmxext
--
2.34.1
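The commit message's reasoning can be illustrated with a small dispatch sketch: init code assigns function pointers in ascending order of instruction-set capability, so on x64, where SSE2 is always available, the earlier assignment never survives and can be compiled out. Everything named here (`FLAG_*`, `IMPL_*`, `DSPContext`, `dsp_init`, the `filter8_*` functions) is a hypothetical stand-in, and `ARCH_X86_32` is assumed to default to 0 (an x64 build); FFmpeg's real flags and macros come from its own headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU flag bits and implementation IDs. */
#define FLAG_MMXEXT 0x1
#define FLAG_SSE2   0x2
enum { IMPL_MMXEXT = 1, IMPL_SSE2 = 2 };

#ifndef ARCH_X86_32
#define ARCH_X86_32 0   /* assumption: x64 build unless told otherwise */
#endif

typedef int (*filter_fn)(void);

#if ARCH_X86_32
/* Only compiled for 32-bit builds, where SSE2 may be missing at runtime. */
static int filter8_mmxext(void) { return IMPL_MMXEXT; }
#endif
static int filter8_sse2(void)   { return IMPL_SSE2; }

typedef struct { filter_fn filter8; } DSPContext;

static void dsp_init(DSPContext *dsp, int cpu_flags)
{
    assert(dsp != NULL);
    dsp->filter8 = NULL;
#if ARCH_X86_32
    if (cpu_flags & FLAG_MMXEXT)
        dsp->filter8 = filter8_mmxext;
#endif
    /* Later assignments override earlier ones; on x64 this one always runs
     * unless SSE2 has been explicitly disabled. */
    if (cpu_flags & FLAG_SSE2)
        dsp->filter8 = filter8_sse2;
}
```

The trade-off named in the commit message shows up when SSE2 is masked out on an x64 build: no SIMD version is assigned in this sketch, so a caller would have to fall back to whatever default was set before.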
* [FFmpeg-devel] [PATCH 10/41] avcodec/x86/ac3dsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (8 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 09/41] avcodec/x86/vc1dsp_init: Disable overridden functions on x64 Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 11/41] avcodec/x86/audiodsp_init: " Andreas Rheinhardt
` (31 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such AC-3 functions
at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/ac3dsp.asm | 5 +++++
libavcodec/x86/ac3dsp_init.c | 2 ++
2 files changed, 7 insertions(+)
diff --git a/libavcodec/x86/ac3dsp.asm b/libavcodec/x86/ac3dsp.asm
index 4ddaa94320..050cbfe06a 100644
--- a/libavcodec/x86/ac3dsp.asm
+++ b/libavcodec/x86/ac3dsp.asm
@@ -64,6 +64,7 @@ cglobal ac3_exponent_min, 3, 4, 2, exp, reuse_blks, expn, offset
%endmacro
%define LOOP_ALIGN
+%if ARCH_X86_32
INIT_MMX mmx
AC3_EXPONENT_MIN
%if HAVE_MMXEXT_EXTERNAL
@@ -71,7 +72,9 @@ AC3_EXPONENT_MIN
INIT_MMX mmxext
AC3_EXPONENT_MIN
%endif
+%endif
%if HAVE_SSE2_EXTERNAL
+%define LOOP_ALIGN ALIGN 16
INIT_XMM sse2
AC3_EXPONENT_MIN
%endif
@@ -81,6 +84,7 @@ AC3_EXPONENT_MIN
; void ff_float_to_fixed24(int32_t *dst, const float *src, unsigned int len)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
; The 3DNow! version is not bit-identical because pf2id uses truncation rather
; than round-to-nearest.
INIT_MMX 3dnow
@@ -134,6 +138,7 @@ cglobal float_to_fixed24, 3, 3, 3, dst, src, len
ja .loop
emms
RET
+%endif
INIT_XMM sse2
cglobal float_to_fixed24, 3, 3, 9, dst, src, len
diff --git a/libavcodec/x86/ac3dsp_init.c b/libavcodec/x86/ac3dsp_init.c
index 5f20e6dc31..47ec5d8070 100644
--- a/libavcodec/x86/ac3dsp_init.c
+++ b/libavcodec/x86/ac3dsp_init.c
@@ -41,6 +41,7 @@ av_cold void ff_ac3dsp_init_x86(AC3DSPContext *c, int bit_exact)
{
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags)) {
c->ac3_exponent_min = ff_ac3_exponent_min_mmx;
}
@@ -55,6 +56,7 @@ av_cold void ff_ac3dsp_init_x86(AC3DSPContext *c, int bit_exact)
if (EXTERNAL_SSE(cpu_flags)) {
c->float_to_fixed24 = ff_float_to_fixed24_sse;
}
+#endif
if (EXTERNAL_SSE2(cpu_flags)) {
c->ac3_exponent_min = ff_ac3_exponent_min_sse2;
c->float_to_fixed24 = ff_float_to_fixed24_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 11/41] avcodec/x86/audiodsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (9 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 10/41] avcodec/x86/ac3dsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 12/41] avcodec/x86/diracdsp_init: " Andreas Rheinhardt
` (30 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such audiodsp functions
at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
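Note (illustration only, not part of the patch): scalar sketches of what the
two function families touched here compute, written from the argument lists
visible in the asm (dst, src, min, max, len and v1, v2, order). The sketch_
names are hypothetical and the code is an assumption about the semantics,
not a copy of FFmpeg's C reference.

```c
#include <stdint.h>

/* Clamp each of len values from src into [min, max] and store in dst. */
static void sketch_vector_clip_int32(int32_t *dst, const int32_t *src,
                                     int32_t min, int32_t max,
                                     unsigned int len)
{
    for (unsigned int i = 0; i < len; i++) {
        int32_t v = src[i];
        if (v < min)
            v = min;
        else if (v > max)
            v = max;
        dst[i] = v;
    }
}

/* Dot product of two int16 vectors; no overflow handling in this sketch. */
static int32_t sketch_scalarproduct_int16(const int16_t *v1,
                                          const int16_t *v2, int order)
{
    int32_t sum = 0;
    for (int i = 0; i < order; i++)
        sum += v1[i] * v2[i];
    return sum;
}
```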
libavcodec/x86/audiodsp.asm | 4 ++++
libavcodec/x86/audiodsp_init.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/libavcodec/x86/audiodsp.asm b/libavcodec/x86/audiodsp.asm
index de395e5fa8..e4a498b516 100644
--- a/libavcodec/x86/audiodsp.asm
+++ b/libavcodec/x86/audiodsp.asm
@@ -48,8 +48,10 @@ cglobal scalarproduct_int16, 3,3,3, v1, v2, order
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
SCALARPRODUCT
+%endif
INIT_XMM sse2
SCALARPRODUCT
@@ -117,8 +119,10 @@ cglobal vector_clip_int32%5, 5,5,%1, dst, src, min, max, len
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
VECTOR_CLIP_INT32 0, 1, 0, 0
+%endif
INIT_XMM sse2
VECTOR_CLIP_INT32 6, 1, 0, 0, _int
VECTOR_CLIP_INT32 6, 2, 0, 1
diff --git a/libavcodec/x86/audiodsp_init.c b/libavcodec/x86/audiodsp_init.c
index 98e296c264..ebb28ece78 100644
--- a/libavcodec/x86/audiodsp_init.c
+++ b/libavcodec/x86/audiodsp_init.c
@@ -44,11 +44,13 @@ av_cold void ff_audiodsp_init_x86(AudioDSPContext *c)
{
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags))
c->vector_clip_int32 = ff_vector_clip_int32_mmx;
if (EXTERNAL_MMXEXT(cpu_flags))
c->scalarproduct_int16 = ff_scalarproduct_int16_mmxext;
+#endif
if (EXTERNAL_SSE(cpu_flags))
c->vector_clipf = ff_vector_clipf_sse;
--
2.34.1
* [FFmpeg-devel] [PATCH 12/41] avcodec/x86/diracdsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (10 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 11/41] avcodec/x86/audiodsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 13/41] avcodec/x86/mpegvideoenc: " Andreas Rheinhardt
` (29 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such diracdsp functions
at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/diracdsp_init.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index 8baacf3129..c9dca58f1b 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -87,9 +87,11 @@ static void OPNAME ## _dirac_pixels32_ ## EXT(uint8_t *dst, const uint8_t *src[5
}\
}
+#if !ARCH_X86_64
DIRAC_PIXOP(put, mmx)
DIRAC_PIXOP(avg, mmx)
DIRAC_PIXOP(avg, mmxext)
+#endif
DIRAC_PIXOP(put, sse2)
DIRAC_PIXOP(avg, sse2)
@@ -114,13 +116,13 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
c->dirac_hpel_filter = dirac_hpel_filter_mmx;
c->add_rect_clamped = ff_add_rect_clamped_mmx;
c->put_signed_rect_clamped[0] = (void *)ff_put_signed_rect_clamped_mmx;
-#endif
PIXFUNC(put, 0, mmx);
PIXFUNC(avg, 0, mmx);
}
if (EXTERNAL_MMXEXT(mm_flags)) {
PIXFUNC(avg, 0, mmxext);
+#endif
}
if (EXTERNAL_SSE2(mm_flags)) {
--
2.34.1
* [FFmpeg-devel] [PATCH 13/41] avcodec/x86/mpegvideoenc: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (11 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 12/41] avcodec/x86/diracdsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 14/41] avcodec/x86/fdct: " Andreas Rheinhardt
` (28 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such mpegvideoenc
functions at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/mpegvideoenc.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/libavcodec/x86/mpegvideoenc.c b/libavcodec/x86/mpegvideoenc.c
index 3691cce26c..d9c89d568b 100644
--- a/libavcodec/x86/mpegvideoenc.c
+++ b/libavcodec/x86/mpegvideoenc.c
@@ -42,6 +42,7 @@ DECLARE_ALIGNED(16, static const uint16_t, inv_zigzag_direct16)[64] = {
#if HAVE_6REGS
+#if ARCH_X86_32
#if HAVE_MMX_INLINE
#define COMPILE_TEMPLATE_MMXEXT 0
#define COMPILE_TEMPLATE_SSE2 0
@@ -64,6 +65,7 @@ DECLARE_ALIGNED(16, static const uint16_t, inv_zigzag_direct16)[64] = {
#define RENAME_FDCT(a) a ## _mmxext
#include "mpegvideoenc_template.c"
#endif /* HAVE_MMXEXT_INLINE */
+#endif /* ARCH_X86_32 */
#if HAVE_SSE2_INLINE
#undef COMPILE_TEMPLATE_MMXEXT
@@ -96,7 +98,7 @@ DECLARE_ALIGNED(16, static const uint16_t, inv_zigzag_direct16)[64] = {
#endif /* HAVE_6REGS */
#if HAVE_INLINE_ASM
-#if HAVE_MMX_INLINE
+#if HAVE_MMX_INLINE && ARCH_X86_32
static void denoise_dct_mmx(MpegEncContext *s, int16_t *block){
const int intra= s->mb_intra;
int *sum= s->dct_error_sum[intra];
@@ -218,17 +220,18 @@ av_cold void ff_dct_encode_init_x86(MpegEncContext *s)
if (dct_algo == FF_DCT_AUTO || dct_algo == FF_DCT_MMX) {
#if HAVE_MMX_INLINE
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (INLINE_MMX(cpu_flags)) {
#if HAVE_6REGS
s->dct_quantize = dct_quantize_mmx;
#endif
s->denoise_dct = denoise_dct_mmx;
}
-#endif
#if HAVE_6REGS && HAVE_MMXEXT_INLINE
if (INLINE_MMXEXT(cpu_flags))
s->dct_quantize = dct_quantize_mmxext;
#endif
+#endif
#if HAVE_SSE2_INLINE
if (INLINE_SSE2(cpu_flags)) {
#if HAVE_6REGS
@@ -240,6 +243,7 @@ av_cold void ff_dct_encode_init_x86(MpegEncContext *s)
#if HAVE_6REGS && HAVE_SSSE3_INLINE
if (INLINE_SSSE3(cpu_flags))
s->dct_quantize = dct_quantize_ssse3;
+#endif
#endif
}
}
--
2.34.1
* [FFmpeg-devel] [PATCH 14/41] avcodec/x86/fdct: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (12 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 13/41] avcodec/x86/mpegvideoenc: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 15/41] avcodec/x86/hevcdsp_init: " Andreas Rheinhardt
` (27 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such fdct
functions at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/tests/x86/dct.c | 2 ++
libavcodec/x86/fdct.c | 12 ++++++++----
libavcodec/x86/fdctdsp_init.c | 2 ++
3 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/libavcodec/tests/x86/dct.c b/libavcodec/tests/x86/dct.c
index b332c9642d..1eb9400567 100644
--- a/libavcodec/tests/x86/dct.c
+++ b/libavcodec/tests/x86/dct.c
@@ -58,12 +58,14 @@ PR_WRAP(avx)
#endif
static const struct algo fdct_tab_arch[] = {
+#if ARCH_X86_32
#if HAVE_MMX_INLINE
{ "MMX", ff_fdct_mmx, FF_IDCT_PERM_NONE, AV_CPU_FLAG_MMX },
#endif
#if HAVE_MMXEXT_INLINE
{ "MMXEXT", ff_fdct_mmxext, FF_IDCT_PERM_NONE, AV_CPU_FLAG_MMXEXT },
#endif
+#endif
#if HAVE_SSE2_INLINE
{ "SSE2", ff_fdct_sse2, FF_IDCT_PERM_NONE, AV_CPU_FLAG_SSE2 },
#endif
diff --git a/libavcodec/x86/fdct.c b/libavcodec/x86/fdct.c
index 835fcc2b28..5e00287764 100644
--- a/libavcodec/x86/fdct.c
+++ b/libavcodec/x86/fdct.c
@@ -71,8 +71,6 @@ DECLARE_ALIGNED(16, static const int16_t, ocos_4_16)[8] = {
DECLARE_ALIGNED(16, static const int16_t, fdct_one_corr)[8] = { X8(1) };
-DECLARE_ALIGNED(8, static const int32_t, fdct_r_row)[2] = {RND_FRW_ROW, RND_FRW_ROW };
-
static const struct
{
DECLARE_ALIGNED(16, const int32_t, fdct_r_row_sse2)[4];
@@ -375,7 +373,6 @@ static av_always_inline void fdct_col_##cpu(const int16_t *in, int16_t *out, int
"r" (out + offset), "r" (ocos_4_16)); \
}
-FDCT_COL(mmx, mm, movq)
FDCT_COL(sse2, xmm, movdqa)
static av_always_inline void fdct_row_sse2(const int16_t *in, int16_t *out)
@@ -443,6 +440,12 @@ static av_always_inline void fdct_row_sse2(const int16_t *in, int16_t *out)
);
}
+#if ARCH_X86_32
+
+DECLARE_ALIGNED(8, static const int32_t, fdct_r_row)[2] = { RND_FRW_ROW, RND_FRW_ROW };
+
+FDCT_COL(mmx, mm, movq)
+
static av_always_inline void fdct_row_mmxext(const int16_t *in, int16_t *out,
const int16_t *table)
{
@@ -559,9 +562,10 @@ void ff_fdct_mmx(int16_t *block)
}
}
+#endif /* ARCH_X86_32 */
#endif /* HAVE_MMX_INLINE */
-#if HAVE_MMXEXT_INLINE
+#if HAVE_MMXEXT_INLINE && ARCH_X86_32
void ff_fdct_mmxext(int16_t *block)
{
diff --git a/libavcodec/x86/fdctdsp_init.c b/libavcodec/x86/fdctdsp_init.c
index 0cb5fd625b..b801e57701 100644
--- a/libavcodec/x86/fdctdsp_init.c
+++ b/libavcodec/x86/fdctdsp_init.c
@@ -31,11 +31,13 @@ av_cold void ff_fdctdsp_init_x86(FDCTDSPContext *c, AVCodecContext *avctx,
if (!high_bit_depth) {
if ((dct_algo == FF_DCT_AUTO || dct_algo == FF_DCT_MMX)) {
+#if ARCH_X86_32
if (INLINE_MMX(cpu_flags))
c->fdct = ff_fdct_mmx;
if (INLINE_MMXEXT(cpu_flags))
c->fdct = ff_fdct_mmxext;
+#endif
if (INLINE_SSE2(cpu_flags))
c->fdct = ff_fdct_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 15/41] avcodec/x86/hevcdsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (13 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 14/41] avcodec/x86/fdct: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 16/41] avcodec/x86/rv40dsp_init: " Andreas Rheinhardt
` (26 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such hevcdsp
functions at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
FYI: There is a pre-existing stride/alignment bug in this code:
If one configures with --disable-sse3, one gets STRIDE_ALIGN 16.
The test fate-hevc-conformance-DBLK_A_MAIN10_VIXS_3 then fails
when using SSE2. More precisely, if one comments out both
SAO_BAND_INIT(10, sse2); and SAO_EDGE_INIT(10, sse2); in
x86/hevcdsp_init.c, the test passes. It also passes if one hardcodes
STRIDE_ALIGN to 32 in avcodec_align_dimensions2().
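To make the 16 versus 32 difference concrete, here is a small sketch of the
usual power-of-two round-up idiom. sketch_align_up is a hypothetical helper
written for this note; it only illustrates how the two alignments pad a row
size differently and does not identify the cause of the failure.

```c
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t sketch_align_up(size_t x, size_t a)
{
    return (x + a - 1) & ~(a - 1);
}
```

A 100-byte row pads to 112 bytes with an alignment of 16 but to 128 bytes
with an alignment of 32, so a row that satisfies the smaller alignment is not
necessarily a multiple of the larger one.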
libavcodec/x86/hevc_idct.asm | 2 ++
libavcodec/x86/hevcdsp_init.c | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/libavcodec/x86/hevc_idct.asm b/libavcodec/x86/hevc_idct.asm
index 1eb1973f27..eb44e06123 100644
--- a/libavcodec/x86/hevc_idct.asm
+++ b/libavcodec/x86/hevc_idct.asm
@@ -811,7 +811,9 @@ cglobal hevc_idct_32x32_%1, 1, 6, 16, 256, coeffs
%macro INIT_IDCT_DC 1
INIT_MMX mmxext
IDCT_DC_NL 4, %1
+%if ARCH_X86_32
IDCT_DC 8, 2, %1
+%endif
INIT_XMM sse2
IDCT_DC_NL 8, %1
diff --git a/libavcodec/x86/hevcdsp_init.c b/libavcodec/x86/hevcdsp_init.c
index 48f48a925f..b48661fe35 100644
--- a/libavcodec/x86/hevcdsp_init.c
+++ b/libavcodec/x86/hevcdsp_init.c
@@ -712,7 +712,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
if (bit_depth == 8) {
if (EXTERNAL_MMXEXT(cpu_flags)) {
c->idct_dc[0] = ff_hevc_idct_4x4_dc_8_mmxext;
+#if ARCH_X86_32
c->idct_dc[1] = ff_hevc_idct_8x8_dc_8_mmxext;
+#endif
c->add_residual[0] = ff_hevc_add_residual_4_8_mmxext;
}
@@ -889,7 +891,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
if (EXTERNAL_MMXEXT(cpu_flags)) {
c->add_residual[0] = ff_hevc_add_residual_4_10_mmxext;
c->idct_dc[0] = ff_hevc_idct_4x4_dc_10_mmxext;
+#if ARCH_X86_32
c->idct_dc[1] = ff_hevc_idct_8x8_dc_10_mmxext;
+#endif
}
if (EXTERNAL_SSE2(cpu_flags)) {
c->hevc_v_loop_filter_chroma = ff_hevc_v_loop_filter_chroma_10_sse2;
@@ -1105,7 +1109,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
} else if (bit_depth == 12) {
if (EXTERNAL_MMXEXT(cpu_flags)) {
c->idct_dc[0] = ff_hevc_idct_4x4_dc_12_mmxext;
+#if ARCH_X86_32
c->idct_dc[1] = ff_hevc_idct_8x8_dc_12_mmxext;
+#endif
}
if (EXTERNAL_SSE2(cpu_flags)) {
c->hevc_v_loop_filter_chroma = ff_hevc_v_loop_filter_chroma_12_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 16/41] avcodec/x86/rv40dsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (14 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 15/41] avcodec/x86/hevcdsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 17/41] avcodec/x86/cavsdsp: " Andreas Rheinhardt
` (25 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such RV40-dsp
functions at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
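Note (illustration only, not part of the patch): the init code touched here
stores the mc33 functions at index 15 of put_pixels_tab and avg_pixels_tab,
with row 0 holding the 16x16 variants and row 1 the 8x8 ones. A sketch of an
index mapping consistent with that, assuming slot = x + 4 * y for
quarter-pel offsets x and y in 0..3; the helper name is hypothetical.

```c
/* Hypothetical helper: map quarter-pel offsets (x, y), each in 0..3,
 * to a slot in a 16-entry motion-compensation table. */
static int sketch_qpel_index(int x, int y)
{
    return x + 4 * y;
}
```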
libavcodec/x86/h264_chromamc.asm | 4 ++--
libavcodec/x86/rv40dsp.asm | 2 ++
libavcodec/x86/rv40dsp_init.c | 10 ++++++----
3 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/libavcodec/x86/h264_chromamc.asm b/libavcodec/x86/h264_chromamc.asm
index 0421fa8695..d59c183371 100644
--- a/libavcodec/x86/h264_chromamc.asm
+++ b/libavcodec/x86/h264_chromamc.asm
@@ -450,10 +450,10 @@ INIT_MMX 3dnow
chroma_mc8_mmx_func avg, h264, _rnd
%if ARCH_X86_32
chroma_mc8_mmx_func avg, vc1, _nornd
-%endif
chroma_mc8_mmx_func avg, rv40
-chroma_mc4_mmx_func avg, h264
chroma_mc4_mmx_func avg, rv40
+%endif
+chroma_mc4_mmx_func avg, h264
%macro chroma_mc8_ssse3_func 2-3
cglobal %1_%2_chroma_mc8%3, 6, 7, 8
diff --git a/libavcodec/x86/rv40dsp.asm b/libavcodec/x86/rv40dsp.asm
index bcad1aee80..7fa271a5d5 100644
--- a/libavcodec/x86/rv40dsp.asm
+++ b/libavcodec/x86/rv40dsp.asm
@@ -481,11 +481,13 @@ cglobal rv40_weight_func_%1_%2, 6, 7, 8
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
RV40_WEIGHT rnd, 8, 3
RV40_WEIGHT rnd, 16, 4
RV40_WEIGHT nornd, 8, 3
RV40_WEIGHT nornd, 16, 4
+%endif
INIT_XMM sse2
RV40_WEIGHT rnd, 8, 3
diff --git a/libavcodec/x86/rv40dsp_init.c b/libavcodec/x86/rv40dsp_init.c
index 7a05ab14ad..7a60a30295 100644
--- a/libavcodec/x86/rv40dsp_init.c
+++ b/libavcodec/x86/rv40dsp_init.c
@@ -207,10 +207,12 @@ DEFINE_FN(avg, 16, ssse3)
#if HAVE_MMX_INLINE
DEFINE_FN(put, 8, mmx)
+#if ARCH_X86_32
DEFINE_FN(avg, 8, mmx)
DEFINE_FN(put, 16, mmx)
DEFINE_FN(avg, 16, mmx)
#endif
+#endif
av_cold void ff_rv40dsp_init_x86(RV34DSPContext *c)
{
@@ -218,10 +220,12 @@ av_cold void ff_rv40dsp_init_x86(RV34DSPContext *c)
#if HAVE_MMX_INLINE
if (INLINE_MMX(cpu_flags)) {
- c->put_pixels_tab[0][15] = put_rv40_qpel16_mc33_mmx;
c->put_pixels_tab[1][15] = put_rv40_qpel8_mc33_mmx;
+#if ARCH_X86_32
+ c->put_pixels_tab[0][15] = put_rv40_qpel16_mc33_mmx;
c->avg_pixels_tab[0][15] = avg_rv40_qpel16_mc33_mmx;
c->avg_pixels_tab[1][15] = avg_rv40_qpel8_mc33_mmx;
+#endif
}
#endif /* HAVE_MMX_INLINE */
@@ -231,12 +235,10 @@ av_cold void ff_rv40dsp_init_x86(RV34DSPContext *c)
c->put_chroma_pixels_tab[1] = ff_put_rv40_chroma_mc4_mmx;
#if ARCH_X86_32
QPEL_MC_SET(put_, _mmx)
-#endif
}
if (EXTERNAL_AMD3DNOW(cpu_flags)) {
c->avg_chroma_pixels_tab[0] = ff_avg_rv40_chroma_mc8_3dnow;
c->avg_chroma_pixels_tab[1] = ff_avg_rv40_chroma_mc4_3dnow;
-#if ARCH_X86_32
QPEL_MC_SET(avg_, _3dnow)
#endif
}
@@ -244,11 +246,11 @@ av_cold void ff_rv40dsp_init_x86(RV34DSPContext *c)
c->avg_pixels_tab[1][15] = avg_rv40_qpel8_mc33_mmxext;
c->avg_chroma_pixels_tab[0] = ff_avg_rv40_chroma_mc8_mmxext;
c->avg_chroma_pixels_tab[1] = ff_avg_rv40_chroma_mc4_mmxext;
+#if ARCH_X86_32
c->rv40_weight_pixels_tab[0][0] = ff_rv40_weight_func_rnd_16_mmxext;
c->rv40_weight_pixels_tab[0][1] = ff_rv40_weight_func_rnd_8_mmxext;
c->rv40_weight_pixels_tab[1][0] = ff_rv40_weight_func_nornd_16_mmxext;
c->rv40_weight_pixels_tab[1][1] = ff_rv40_weight_func_nornd_8_mmxext;
-#if ARCH_X86_32
QPEL_MC_SET(avg_, _mmxext)
#endif
}
--
2.34.1
* [FFmpeg-devel] [PATCH 17/41] avcodec/x86/cavsdsp: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (15 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 16/41] avcodec/x86/rv40dsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:54 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 18/41] avcodec/x86/h264_intrapred_init: " Andreas Rheinhardt
` (24 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:54 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such CAVS-dsp
functions at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
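Note (illustration only, not part of the patch): the cavs_idct8_add_mmx
wrapper guarded below runs the IDCT into a temporary buffer and then calls
ff_add_pixels_clamped_mmx(b2, dst, stride). A scalar sketch of that second
step follows, assuming it adds an 8x8 block of 16-bit residuals to 8-bit
pixels with saturation. The sketch_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturate an int to the 0..255 range of an 8-bit pixel. */
static uint8_t sketch_clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Add an 8x8 block of residuals to dst, saturating each result. */
static void sketch_add_pixels_clamped(const int16_t *block, uint8_t *dst,
                                      ptrdiff_t stride)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            dst[x] = sketch_clip_uint8(dst[x] + block[y * 8 + x]);
        dst += stride;
    }
}
```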
libavcodec/x86/cavsdsp.c | 20 +++++++++++++++-----
libavcodec/x86/cavsidct.asm | 2 ++
2 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/libavcodec/x86/cavsdsp.c b/libavcodec/x86/cavsdsp.c
index f974f93fc0..fea9daa0ff 100644
--- a/libavcodec/x86/cavsdsp.c
+++ b/libavcodec/x86/cavsdsp.c
@@ -38,6 +38,7 @@
#if HAVE_MMX_EXTERNAL
+#if ARCH_X86_32
void ff_cavs_idct8_mmx(int16_t *out, const int16_t *in);
static void cavs_idct8_add_mmx(uint8_t *dst, int16_t *block, ptrdiff_t stride)
@@ -46,6 +47,7 @@ static void cavs_idct8_add_mmx(uint8_t *dst, int16_t *block, ptrdiff_t stride)
ff_cavs_idct8_mmx(b2, block);
ff_add_pixels_clamped_mmx(b2, dst, stride);
}
+#endif /* ARCH_X86_32 */
void ff_cavs_idct8_sse2(int16_t *out, const int16_t *in);
@@ -335,11 +337,13 @@ static void put_cavs_qpel8_mc00_mmx(uint8_t *dst, const uint8_t *src,
ff_put_pixels8_mmx(dst, src, stride, 8);
}
+#if ARCH_X86_32
static void avg_cavs_qpel8_mc00_mmx(uint8_t *dst, const uint8_t *src,
ptrdiff_t stride)
{
ff_avg_pixels8_mmx(dst, src, stride, 8);
}
+#endif
static void avg_cavs_qpel8_mc00_mmxext(uint8_t *dst, const uint8_t *src,
ptrdiff_t stride)
@@ -347,6 +351,7 @@ static void avg_cavs_qpel8_mc00_mmxext(uint8_t *dst, const uint8_t *src,
ff_avg_pixels8_mmxext(dst, src, stride, 8);
}
+#if ARCH_X86_32
static void put_cavs_qpel16_mc00_mmx(uint8_t *dst, const uint8_t *src,
ptrdiff_t stride)
{
@@ -364,6 +369,7 @@ static void avg_cavs_qpel16_mc00_mmxext(uint8_t *dst, const uint8_t *src,
{
ff_avg_pixels16_mmxext(dst, src, stride, 16);
}
+#endif
static void put_cavs_qpel16_mc00_sse2(uint8_t *dst, const uint8_t *src,
ptrdiff_t stride)
@@ -382,13 +388,15 @@ static av_cold void cavsdsp_init_mmx(CAVSDSPContext *c,
AVCodecContext *avctx)
{
#if HAVE_MMX_EXTERNAL
- c->put_cavs_qpel_pixels_tab[0][0] = put_cavs_qpel16_mc00_mmx;
c->put_cavs_qpel_pixels_tab[1][0] = put_cavs_qpel8_mc00_mmx;
+#if ARCH_X86_32
+ c->put_cavs_qpel_pixels_tab[0][0] = put_cavs_qpel16_mc00_mmx;
c->avg_cavs_qpel_pixels_tab[0][0] = avg_cavs_qpel16_mc00_mmx;
c->avg_cavs_qpel_pixels_tab[1][0] = avg_cavs_qpel8_mc00_mmx;
c->cavs_idct8_add = cavs_idct8_add_mmx;
c->idct_perm = FF_IDCT_PERM_TRANSPOSE;
+#endif /* ARCH_X86_32 */
#endif /* HAVE_MMX_EXTERNAL */
}
@@ -408,7 +416,7 @@ CAVS_MC(avg_, 8, mmxext)
CAVS_MC(avg_, 16, mmxext)
#endif /* HAVE_MMXEXT_INLINE */
-#if HAVE_AMD3DNOW_INLINE
+#if ARCH_X86_32 && HAVE_AMD3DNOW_INLINE
QPEL_CAVS(put_, PUT_OP, 3dnow)
QPEL_CAVS(avg_, AVG_3DNOW_OP, 3dnow)
@@ -425,7 +433,7 @@ static av_cold void cavsdsp_init_3dnow(CAVSDSPContext *c,
DSPFUNC(avg, 0, 16, 3dnow);
DSPFUNC(avg, 1, 8, 3dnow);
}
-#endif /* HAVE_AMD3DNOW_INLINE */
+#endif /* ARCH_X86_32 && HAVE_AMD3DNOW_INLINE */
av_cold void ff_cavsdsp_init_x86(CAVSDSPContext *c, AVCodecContext *avctx)
{
@@ -434,10 +442,10 @@ av_cold void ff_cavsdsp_init_x86(CAVSDSPContext *c, AVCodecContext *avctx)
if (X86_MMX(cpu_flags))
cavsdsp_init_mmx(c, avctx);
-#if HAVE_AMD3DNOW_INLINE
+#if ARCH_X86_32 && HAVE_AMD3DNOW_INLINE
if (INLINE_AMD3DNOW(cpu_flags))
cavsdsp_init_3dnow(c, avctx);
-#endif /* HAVE_AMD3DNOW_INLINE */
+#endif /* ARCH_X86_32 && HAVE_AMD3DNOW_INLINE */
#if HAVE_MMXEXT_INLINE
if (INLINE_MMXEXT(cpu_flags)) {
DSPFUNC(put, 0, 16, mmxext);
@@ -448,7 +456,9 @@ av_cold void ff_cavsdsp_init_x86(CAVSDSPContext *c, AVCodecContext *avctx)
#endif
#if HAVE_MMX_EXTERNAL
if (EXTERNAL_MMXEXT(cpu_flags)) {
+#if ARCH_X86_32
c->avg_cavs_qpel_pixels_tab[0][0] = avg_cavs_qpel16_mc00_mmxext;
+#endif
c->avg_cavs_qpel_pixels_tab[1][0] = avg_cavs_qpel8_mc00_mmxext;
}
#endif
diff --git a/libavcodec/x86/cavsidct.asm b/libavcodec/x86/cavsidct.asm
index 6c768c2646..070b46a6cc 100644
--- a/libavcodec/x86/cavsidct.asm
+++ b/libavcodec/x86/cavsidct.asm
@@ -107,6 +107,7 @@ SECTION .text
SUMSUB_BA w, 1, 0 ; m1 = dst3, m0 = dst4
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
cglobal cavs_idct8, 2, 4, 8, 8 * 16, out, in, cnt, tmp
mov cntd, 2
@@ -168,6 +169,7 @@ cglobal cavs_idct8, 2, 4, 8, 8 * 16, out, in, cnt, tmp
jg .loop_2
RET
+%endif
INIT_XMM sse2
cglobal cavs_idct8, 2, 2, 8 + ARCH_X86_64, 0 - 8 * 16, out, in
--
2.34.1
* [FFmpeg-devel] [PATCH 18/41] avcodec/x86/h264_intrapred_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (16 preceding siblings ...)
2022-06-09 23:54 ` [FFmpeg-devel] [PATCH 17/41] avcodec/x86/cavsdsp: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 19/41] avfilter/x86/vf_noise: " Andreas Rheinhardt
` (23 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such
H.264-intrapred-dsp functions at compile-time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
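Note (illustration only, not part of the patch): a scalar sketch of the
simplest predictor touched below, based on the prototype
void ff_pred16x16_vertical_8(uint8_t *src, ptrdiff_t stride) and the initial
"sub r0, r1" in the asm. It assumes the function copies the 16 pixels
directly above the block into each of its 16 rows. The sketch_ name is
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill the 16x16 block at src with copies of the row directly above it. */
static void sketch_pred16x16_vertical(uint8_t *src, ptrdiff_t stride)
{
    const uint8_t *top = src - stride;
    for (int y = 0; y < 16; y++)
        memcpy(src + y * stride, top, 16);
}
```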
libavcodec/x86/h264_intrapred.asm | 26 +++++++++++++++++++++++++
libavcodec/x86/h264_intrapred_10bit.asm | 16 +++++++++++++++
libavcodec/x86/h264_intrapred_init.c | 20 +++++++++++++++----
3 files changed, 58 insertions(+), 4 deletions(-)
diff --git a/libavcodec/x86/h264_intrapred.asm b/libavcodec/x86/h264_intrapred.asm
index b36c198fbb..9426598a63 100644
--- a/libavcodec/x86/h264_intrapred.asm
+++ b/libavcodec/x86/h264_intrapred.asm
@@ -48,6 +48,7 @@ cextern pw_8
; void ff_pred16x16_vertical_8(uint8_t *src, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
INIT_MMX mmx
cglobal pred16x16_vertical_8, 2,3
sub r0, r1
@@ -63,6 +64,7 @@ cglobal pred16x16_vertical_8, 2,3
dec r2
jg .loop
REP_RET
+%endif
INIT_XMM sse
cglobal pred16x16_vertical_8, 2,3
@@ -114,8 +116,10 @@ cglobal pred16x16_horizontal_8, 2,3
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
PRED16x16_H
+%endif
INIT_MMX mmxext
PRED16x16_H
INIT_XMM ssse3
@@ -176,8 +180,10 @@ cglobal pred16x16_dc_8, 2,7
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_DC
+%endif
INIT_XMM sse2
PRED16x16_DC
INIT_XMM ssse3
@@ -187,6 +193,7 @@ PRED16x16_DC
; void ff_pred16x16_tm_vp8_8(uint8_t *src, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
%macro PRED16x16_TM 0
cglobal pred16x16_tm_vp8_8, 2,5
sub r0, r1
@@ -227,6 +234,7 @@ INIT_MMX mmx
PRED16x16_TM
INIT_MMX mmxext
PRED16x16_TM
+%endif
INIT_XMM sse2
cglobal pred16x16_tm_vp8_8, 2,6,6
@@ -565,6 +573,7 @@ cglobal pred16x16_plane_%1_8, 2,9,7
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
H264_PRED16x16_PLANE h264
H264_PRED16x16_PLANE rv40
@@ -573,6 +582,7 @@ INIT_MMX mmxext
H264_PRED16x16_PLANE h264
H264_PRED16x16_PLANE rv40
H264_PRED16x16_PLANE svq3
+%endif
INIT_XMM sse2
H264_PRED16x16_PLANE h264
H264_PRED16x16_PLANE rv40
@@ -747,10 +757,12 @@ ALIGN 16
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
H264_PRED8x8_PLANE
INIT_MMX mmxext
H264_PRED8x8_PLANE
+%endif
INIT_XMM sse2
H264_PRED8x8_PLANE
INIT_XMM ssse3
@@ -794,8 +806,10 @@ cglobal pred8x8_horizontal_8, 2,3
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
PRED8x8_H
+%endif
INIT_MMX mmxext
PRED8x8_H
INIT_MMX ssse3
@@ -937,6 +951,7 @@ cglobal pred8x8_dc_rv40_8, 2,7
; void ff_pred8x8_tm_vp8_8(uint8_t *src, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
%macro PRED8x8_TM 0
cglobal pred8x8_tm_vp8_8, 2,6
sub r0, r1
@@ -976,6 +991,7 @@ INIT_MMX mmx
PRED8x8_TM
INIT_MMX mmxext
PRED8x8_TM
+%endif
INIT_XMM sse2
cglobal pred8x8_tm_vp8_8, 2,6,4
@@ -1333,6 +1349,7 @@ PRED8x8L_VERTICAL
; int has_topright, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
INIT_MMX mmxext
cglobal pred8x8l_down_left_8, 4,5
sub r0, r3
@@ -1440,6 +1457,7 @@ cglobal pred8x8l_down_left_8, 4,5
por mm1, mm0
movq [r0+r3*1], mm1
RET
+%endif
%macro PRED8x8L_DOWN_LEFT 0
cglobal pred8x8l_down_left_8, 4,4
@@ -1534,6 +1552,7 @@ PRED8x8L_DOWN_LEFT
; int has_topright, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
INIT_MMX mmxext
cglobal pred8x8l_down_right_8, 4,5
sub r0, r3
@@ -1665,6 +1684,7 @@ cglobal pred8x8l_down_right_8, 4,5
por mm0, mm1
movq [r0+r3*1], mm0
RET
+%endif
%macro PRED8x8L_DOWN_RIGHT 0
cglobal pred8x8l_down_right_8, 4,5
@@ -1786,6 +1806,7 @@ PRED8x8L_DOWN_RIGHT
; int has_topright, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
INIT_MMX mmxext
cglobal pred8x8l_vertical_right_8, 4,5
sub r0, r3
@@ -1892,6 +1913,7 @@ cglobal pred8x8l_vertical_right_8, 4,5
PALIGNR mm5, mm0, 7, mm1
movq [r4+r3*2], mm5
RET
+%endif
%macro PRED8x8L_VERTICAL_RIGHT 0
cglobal pred8x8l_vertical_right_8, 4,5,7
@@ -2192,6 +2214,7 @@ PRED8x8L_HORIZONTAL_UP
; int has_topright, ptrdiff_t stride)
;-----------------------------------------------------------------------------
+%if ARCH_X86_32
INIT_MMX mmxext
cglobal pred8x8l_horizontal_down_8, 4,5
sub r0, r3
@@ -2306,6 +2329,7 @@ cglobal pred8x8l_horizontal_down_8, 4,5
PALIGNR mm3, mm4, 6, mm4
movq [r0+r3*1], mm3
RET
+%endif
%macro PRED8x8L_HORIZONTAL_DOWN 0
cglobal pred8x8l_horizontal_down_8, 4,5
@@ -2508,8 +2532,10 @@ cglobal pred4x4_tm_vp8_8, 3,6
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
PRED4x4_TM
+%endif
INIT_MMX mmxext
PRED4x4_TM
diff --git a/libavcodec/x86/h264_intrapred_10bit.asm b/libavcodec/x86/h264_intrapred_10bit.asm
index 629e0a72e3..e978d91ff1 100644
--- a/libavcodec/x86/h264_intrapred_10bit.asm
+++ b/libavcodec/x86/h264_intrapred_10bit.asm
@@ -411,8 +411,10 @@ cglobal pred8x8_dc_10, 2, 6
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED8x8_DC pshufw
+%endif
INIT_XMM sse2
PRED8x8_DC pshuflw
@@ -526,8 +528,10 @@ cglobal pred8x8l_128_dc_10, 4, 4
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED8x8L_128_DC
+%endif
INIT_XMM sse2
PRED8x8L_128_DC
@@ -1033,8 +1037,10 @@ cglobal pred16x16_vertical_10, 2, 3
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_VERTICAL
+%endif
INIT_XMM sse2
PRED16x16_VERTICAL
@@ -1057,8 +1063,10 @@ cglobal pred16x16_horizontal_10, 2, 3
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_HORIZONTAL
+%endif
INIT_XMM sse2
PRED16x16_HORIZONTAL
@@ -1103,8 +1111,10 @@ cglobal pred16x16_dc_10, 2, 6
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_DC
+%endif
INIT_XMM sse2
PRED16x16_DC
@@ -1135,8 +1145,10 @@ cglobal pred16x16_top_dc_10, 2, 3
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_TOP_DC
+%endif
INIT_XMM sse2
PRED16x16_TOP_DC
@@ -1172,8 +1184,10 @@ cglobal pred16x16_left_dc_10, 2, 6
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_LEFT_DC
+%endif
INIT_XMM sse2
PRED16x16_LEFT_DC
@@ -1193,7 +1207,9 @@ cglobal pred16x16_128_dc_10, 2,3
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PRED16x16_128_DC
+%endif
INIT_XMM sse2
PRED16x16_128_DC
diff --git a/libavcodec/x86/h264_intrapred_init.c b/libavcodec/x86/h264_intrapred_init.c
index a95cfbca55..b4b04beff5 100644
--- a/libavcodec/x86/h264_intrapred_init.c
+++ b/libavcodec/x86/h264_intrapred_init.c
@@ -193,10 +193,13 @@ av_cold void ff_h264_pred_init_x86(H264PredContext *h, int codec_id,
if (bit_depth == 8) {
if (EXTERNAL_MMX(cpu_flags)) {
+#if ARCH_X86_32
h->pred16x16[VERT_PRED8x8 ] = ff_pred16x16_vertical_8_mmx;
h->pred16x16[HOR_PRED8x8 ] = ff_pred16x16_horizontal_8_mmx;
+#endif
if (chroma_format_idc <= 1) {
h->pred8x8 [VERT_PRED8x8 ] = ff_pred8x8_vertical_8_mmx;
+#if ARCH_X86_32
h->pred8x8 [HOR_PRED8x8 ] = ff_pred8x8_horizontal_8_mmx;
}
if (codec_id == AV_CODEC_ID_VP7 || codec_id == AV_CODEC_ID_VP8) {
@@ -214,23 +217,28 @@ av_cold void ff_h264_pred_init_x86(H264PredContext *h, int codec_id,
} else {
h->pred16x16[PLANE_PRED8x8] = ff_pred16x16_plane_h264_8_mmx;
}
+#endif
}
}
if (EXTERNAL_MMXEXT(cpu_flags)) {
h->pred16x16[HOR_PRED8x8 ] = ff_pred16x16_horizontal_8_mmxext;
+#if ARCH_X86_32
h->pred16x16[DC_PRED8x8 ] = ff_pred16x16_dc_8_mmxext;
+#endif
if (chroma_format_idc <= 1)
h->pred8x8[HOR_PRED8x8 ] = ff_pred8x8_horizontal_8_mmxext;
h->pred8x8l [TOP_DC_PRED ] = ff_pred8x8l_top_dc_8_mmxext;
h->pred8x8l [DC_PRED ] = ff_pred8x8l_dc_8_mmxext;
h->pred8x8l [HOR_PRED ] = ff_pred8x8l_horizontal_8_mmxext;
h->pred8x8l [VERT_PRED ] = ff_pred8x8l_vertical_8_mmxext;
- h->pred8x8l [DIAG_DOWN_RIGHT_PRED ] = ff_pred8x8l_down_right_8_mmxext;
- h->pred8x8l [VERT_RIGHT_PRED ] = ff_pred8x8l_vertical_right_8_mmxext;
h->pred8x8l [HOR_UP_PRED ] = ff_pred8x8l_horizontal_up_8_mmxext;
+#if ARCH_X86_32
h->pred8x8l [DIAG_DOWN_LEFT_PRED ] = ff_pred8x8l_down_left_8_mmxext;
+ h->pred8x8l [DIAG_DOWN_RIGHT_PRED ] = ff_pred8x8l_down_right_8_mmxext;
+ h->pred8x8l [VERT_RIGHT_PRED ] = ff_pred8x8l_vertical_right_8_mmxext;
h->pred8x8l [HOR_DOWN_PRED ] = ff_pred8x8l_horizontal_down_8_mmxext;
+#endif
h->pred4x4 [DIAG_DOWN_RIGHT_PRED ] = ff_pred4x4_down_right_8_mmxext;
h->pred4x4 [VERT_RIGHT_PRED ] = ff_pred4x4_vertical_right_8_mmxext;
h->pred4x4 [HOR_DOWN_PRED ] = ff_pred4x4_horizontal_down_8_mmxext;
@@ -252,11 +260,12 @@ av_cold void ff_h264_pred_init_x86(H264PredContext *h, int codec_id,
}
}
if (codec_id == AV_CODEC_ID_VP7 || codec_id == AV_CODEC_ID_VP8) {
- h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_tm_vp8_8_mmxext;
h->pred8x8 [DC_PRED8x8 ] = ff_pred8x8_dc_rv40_8_mmxext;
- h->pred8x8 [PLANE_PRED8x8 ] = ff_pred8x8_tm_vp8_8_mmxext;
h->pred4x4 [TM_VP8_PRED ] = ff_pred4x4_tm_vp8_8_mmxext;
h->pred4x4 [VERT_PRED ] = ff_pred4x4_vertical_vp8_8_mmxext;
+#if ARCH_X86_32
+ h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_tm_vp8_8_mmxext;
+ h->pred8x8 [PLANE_PRED8x8 ] = ff_pred8x8_tm_vp8_8_mmxext;
} else {
if (chroma_format_idc <= 1)
h->pred8x8 [PLANE_PRED8x8] = ff_pred8x8_plane_8_mmxext;
@@ -267,6 +276,7 @@ av_cold void ff_h264_pred_init_x86(H264PredContext *h, int codec_id,
} else {
h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_h264_8_mmxext;
}
+#endif
}
}
@@ -338,6 +348,7 @@ av_cold void ff_h264_pred_init_x86(H264PredContext *h, int codec_id,
h->pred4x4[DC_PRED ] = ff_pred4x4_dc_10_mmxext;
h->pred4x4[HOR_UP_PRED ] = ff_pred4x4_horizontal_up_10_mmxext;
+#if ARCH_X86_32
if (chroma_format_idc <= 1)
h->pred8x8[DC_PRED8x8 ] = ff_pred8x8_dc_10_mmxext;
@@ -349,6 +360,7 @@ av_cold void ff_h264_pred_init_x86(H264PredContext *h, int codec_id,
h->pred16x16[LEFT_DC_PRED8x8 ] = ff_pred16x16_left_dc_10_mmxext;
h->pred16x16[VERT_PRED8x8 ] = ff_pred16x16_vertical_10_mmxext;
h->pred16x16[HOR_PRED8x8 ] = ff_pred16x16_horizontal_10_mmxext;
+#endif
}
if (EXTERNAL_SSE2(cpu_flags)) {
h->pred4x4[DIAG_DOWN_LEFT_PRED ] = ff_pred4x4_down_left_10_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 19/41] avfilter/x86/vf_noise: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (17 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 18/41] avcodec/x86/h264_intrapred_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 20/41] avcodec/x86/me_cmp: " Andreas Rheinhardt
` (22 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables line_noise_mmx
at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
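Not part of the commit: a minimal, generic sketch of the override pattern
the message above describes. All names here (DemoContext, demo_init,
demo_op_*, DEMO_*) are hypothetical stand-ins rather than FFmpeg's actual
symbols or CPU flags. The sketch assumes an init function that assigns
function pointers in order of increasing instruction-set level, so a later
assignment replaces an earlier one.

```c
#include <stddef.h>

/* Hypothetical build-time constant: 0 models an x64 build. */
#define DEMO_ARCH_X86_32 0

/* Hypothetical runtime CPU flags. */
#define DEMO_CPU_FLAG_MMX  0x1
#define DEMO_CPU_FLAG_SSE2 0x2

typedef int (*demo_op_fn)(int);

typedef struct DemoContext {
    demo_op_fn op;
} DemoContext;

/* Each variant returns a distinct offset so the caller can tell them apart. */
static int demo_op_c(int x)    { return x; }
#if DEMO_ARCH_X86_32
/* Only built for 32-bit targets; on x64 it would always be replaced below. */
static int demo_op_mmx(int x)  { return x + 1; }
#endif
static int demo_op_sse2(int x) { return x + 2; }

static void demo_init(DemoContext *c, int cpu_flags)
{
    c->op = demo_op_c;
    if (cpu_flags & DEMO_CPU_FLAG_MMX) {
#if DEMO_ARCH_X86_32
        c->op = demo_op_mmx;
#endif
    }
    /* On x64 SSE2 is always available, so this assignment always wins. */
    if (cpu_flags & DEMO_CPU_FLAG_SSE2)
        c->op = demo_op_sse2;
}
```

With DEMO_ARCH_X86_32 set to 0, the MMX variant is neither compiled nor
referenced, which is the effect the guards in this patch aim for.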
libavfilter/x86/vf_noise.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/libavfilter/x86/vf_noise.c b/libavfilter/x86/vf_noise.c
index f7a4d00336..ce7260e94f 100644
--- a/libavfilter/x86/vf_noise.c
+++ b/libavfilter/x86/vf_noise.c
@@ -25,6 +25,7 @@
#include "libavfilter/vf_noise.h"
#if HAVE_INLINE_ASM
+#if ARCH_X86_32
static void line_noise_mmx(uint8_t *dst, const uint8_t *src,
const int8_t *noise, int len, int shift)
{
@@ -52,6 +53,7 @@ static void line_noise_mmx(uint8_t *dst, const uint8_t *src,
if (mmx_len != len)
ff_line_noise_c(dst+mmx_len, src+mmx_len, noise+mmx_len, len-mmx_len, 0);
}
+#endif
#if HAVE_6REGS
static void line_noise_avg_mmx(uint8_t *dst, const uint8_t *src,
@@ -132,7 +134,9 @@ av_cold void ff_noise_init_x86(NoiseContext *n)
int cpu_flags = av_get_cpu_flags();
if (INLINE_MMX(cpu_flags)) {
+#if ARCH_X86_32
n->line_noise = line_noise_mmx;
+#endif
#if HAVE_6REGS
n->line_noise_avg = line_noise_avg_mmx;
#endif
--
2.34.1
* [FFmpeg-devel] [PATCH 20/41] avcodec/x86/me_cmp: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (18 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 19/41] avfilter/x86/vf_noise: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 21/41] avcodec/x86/mpegvideoencdsp: Disable ff_pix_norm1_mmx " Andreas Rheinhardt
` (21 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such me_cmp functions
at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
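Not part of the commit: a small sketch of why a function-generating macro
might be split in two, as the diff below does with PIX_SAD and PIX_SADXY.
The macro and function names (DEMO_SAD, DEMO_SADXY, demo_sad16_*) and the
arithmetic are hypothetical and much simpler than the real code. The only
point is that once the functions come from separate macros, one
instantiation can be guarded while the other is always emitted.

```c
/* Hypothetical build-time constant: 0 models an x64 build. */
#define DEMO_ARCH_X86_32 0

/* Generates the plain variant; assumed to be superseded on x64. */
#define DEMO_SAD(suf)                                   \
static int demo_sad16_ ## suf(int a, int b)             \
{                                                       \
    return a > b ? a - b : b - a;                       \
}

/* Generates the xy2 variant; assumed to have no replacement on x64. */
#define DEMO_SADXY(suf)                                 \
static int demo_sad16_xy2_ ## suf(int a, int b)         \
{                                                       \
    int d = a > b ? a - b : b - a;                      \
    return (d + 1) / 2;                                 \
}

#if DEMO_ARCH_X86_32
DEMO_SAD(mmx)      /* emitted for 32-bit builds only */
#endif
DEMO_SADXY(mmx)    /* emitted for every build */
```

If both functions stayed in a single macro, guarding the instantiation
would drop the xy2 function as well.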
libavcodec/x86/me_cmp.asm | 6 ++++
libavcodec/x86/me_cmp_init.c | 61 +++++++++++++++++++++---------------
2 files changed, 42 insertions(+), 25 deletions(-)
diff --git a/libavcodec/x86/me_cmp.asm b/libavcodec/x86/me_cmp.asm
index ad06d485ab..05e521cb08 100644
--- a/libavcodec/x86/me_cmp.asm
+++ b/libavcodec/x86/me_cmp.asm
@@ -261,11 +261,15 @@ hadamard8_16_wrapper 0, 14
%endif
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
HADAMARD8_DIFF
+%endif
+%if ARCH_X86_32 || HAVE_ALIGNED_STACK == 0
INIT_MMX mmxext
HADAMARD8_DIFF
+%endif
INIT_XMM sse2
%if ARCH_X86_64
@@ -385,10 +389,12 @@ cglobal sum_abs_dctelem, 1, 1, %1, block
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
SUM_ABS_DCTELEM 0, 4
INIT_MMX mmxext
SUM_ABS_DCTELEM 0, 4
+%endif
INIT_XMM sse2
SUM_ABS_DCTELEM 7, 2
INIT_XMM ssse3
diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
index 9af911bb88..6144bb9496 100644
--- a/libavcodec/x86/me_cmp_init.c
+++ b/libavcodec/x86/me_cmp_init.c
@@ -126,6 +126,7 @@ static int nsse8_mmx(MpegEncContext *c, uint8_t *pix1, uint8_t *pix2,
#if HAVE_INLINE_ASM
+#if ARCH_X86_32
static int vsad_intra16_mmx(MpegEncContext *v, uint8_t *pix, uint8_t *dummy,
ptrdiff_t stride, int h)
{
@@ -270,6 +271,7 @@ static int vsad16_mmx(MpegEncContext *v, uint8_t *pix1, uint8_t *pix2,
return tmp & 0x7FFF;
}
#undef SUM
+#endif
DECLARE_ASM_CONST(8, uint64_t, round_tab)[3] = {
0x0000000000000000ULL,
@@ -478,20 +480,6 @@ static int sad8_y2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
return sum_ ## suf(); \
} \
\
-static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
- uint8_t *blk1, ptrdiff_t stride, int h) \
-{ \
- av_assert2(h == 8); \
- __asm__ volatile ( \
- "pxor %%mm7, %%mm7 \n\t" \
- "pxor %%mm6, %%mm6 \n\t" \
- ::); \
- \
- sad8_4_ ## suf(blk1, blk2, stride, 8); \
- \
- return sum_ ## suf(); \
-} \
- \
static int sad16_ ## suf(MpegEncContext *v, uint8_t *blk2, \
uint8_t *blk1, ptrdiff_t stride, int h) \
{ \
@@ -535,7 +523,8 @@ static int sad16_y2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
\
return sum_ ## suf(); \
} \
- \
+
+#define PIX_SADXY(suf) \
static int sad16_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
uint8_t *blk1, ptrdiff_t stride, int h) \
{ \
@@ -549,8 +538,25 @@ static int sad16_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
\
return sum_ ## suf(); \
} \
+ \
+static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
+ uint8_t *blk1, ptrdiff_t stride, int h) \
+{ \
+ av_assert2(h == 8); \
+ __asm__ volatile ( \
+ "pxor %%mm7, %%mm7 \n\t" \
+ "pxor %%mm6, %%mm6 \n\t" \
+ ::); \
+ \
+ sad8_4_ ## suf(blk1, blk2, stride, 8); \
+ \
+ return sum_ ## suf(); \
+} \
+#if ARCH_X86_32
PIX_SAD(mmx)
+#endif
+PIX_SADXY(mmx)
#endif /* HAVE_INLINE_ASM */
@@ -560,32 +566,35 @@ av_cold void ff_me_cmp_init_x86(MECmpContext *c, AVCodecContext *avctx)
#if HAVE_INLINE_ASM
if (INLINE_MMX(cpu_flags)) {
+#if ARCH_X86_32
+ c->sad[0] = sad16_mmx;
+ c->sad[1] = sad8_mmx;
+
+ if (!(avctx->flags & AV_CODEC_FLAG_BITEXACT)) {
+ c->vsad[0] = vsad16_mmx;
+ }
+ c->vsad[4] = vsad_intra16_mmx;
+
c->pix_abs[0][0] = sad16_mmx;
c->pix_abs[0][1] = sad16_x2_mmx;
c->pix_abs[0][2] = sad16_y2_mmx;
- c->pix_abs[0][3] = sad16_xy2_mmx;
c->pix_abs[1][0] = sad8_mmx;
c->pix_abs[1][1] = sad8_x2_mmx;
c->pix_abs[1][2] = sad8_y2_mmx;
+#endif
+ c->pix_abs[0][3] = sad16_xy2_mmx;
c->pix_abs[1][3] = sad8_xy2_mmx;
-
- c->sad[0] = sad16_mmx;
- c->sad[1] = sad8_mmx;
-
- c->vsad[4] = vsad_intra16_mmx;
-
- if (!(avctx->flags & AV_CODEC_FLAG_BITEXACT)) {
- c->vsad[0] = vsad16_mmx;
- }
}
#endif /* HAVE_INLINE_ASM */
if (EXTERNAL_MMX(cpu_flags)) {
+#if ARCH_X86_32
c->hadamard8_diff[0] = ff_hadamard8_diff16_mmx;
c->hadamard8_diff[1] = ff_hadamard8_diff_mmx;
c->sum_abs_dctelem = ff_sum_abs_dctelem_mmx;
c->sse[0] = ff_sse16_mmx;
+#endif
c->sse[1] = ff_sse8_mmx;
#if HAVE_X86ASM
c->nsse[0] = nsse16_mmx;
@@ -594,9 +603,11 @@ av_cold void ff_me_cmp_init_x86(MECmpContext *c, AVCodecContext *avctx)
}
if (EXTERNAL_MMXEXT(cpu_flags)) {
+#if ARCH_X86_32 || !HAVE_ALIGNED_STACK
c->hadamard8_diff[0] = ff_hadamard8_diff16_mmxext;
c->hadamard8_diff[1] = ff_hadamard8_diff_mmxext;
c->sum_abs_dctelem = ff_sum_abs_dctelem_mmxext;
+#endif
c->sad[0] = ff_sad16_mmxext;
c->sad[1] = ff_sad8_mmxext;
--
2.34.1
* [FFmpeg-devel] [PATCH 21/41] avcodec/x86/mpegvideoencdsp: Disable ff_pix_norm1_mmx on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (19 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 20/41] avcodec/x86/me_cmp: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 22/41] avcodec/x86/h264dsp_init: Disable overridden functions " Andreas Rheinhardt
` (20 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Forgotten in acebff8e5dc0789c228b10ffcae2f2eb6c30a91d.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/mpegvideoencdsp.asm | 2 ++
1 file changed, 2 insertions(+)
diff --git a/libavcodec/x86/mpegvideoencdsp.asm b/libavcodec/x86/mpegvideoencdsp.asm
index aec73f82dc..639abc429d 100644
--- a/libavcodec/x86/mpegvideoencdsp.asm
+++ b/libavcodec/x86/mpegvideoencdsp.asm
@@ -147,8 +147,10 @@ cglobal pix_norm1, 2, 3, %1
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
PIX_NORM1 0, 16
+%endif
INIT_XMM sse2
PIX_NORM1 6, 8
--
2.34.1
* [FFmpeg-devel] [PATCH 22/41] avcodec/x86/h264dsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (20 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 21/41] avcodec/x86/mpegvideoencdsp: Disable ff_pix_norm1_mmx " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 23/41] avcodec/x86/sbrdsp_init: " Andreas Rheinhardt
` (19 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such h264dsp functions
at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/h264_deblock.asm | 24 +++-----------
libavcodec/x86/h264_idct.asm | 57 +++++++--------------------------
libavcodec/x86/h264_weight.asm | 8 +++++
libavcodec/x86/h264dsp_init.c | 21 ++++++++----
4 files changed, 38 insertions(+), 72 deletions(-)
diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index a2e745cd8e..9e671af45c 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -867,7 +867,6 @@ DEBLOCK_LUMA_INTRA v
%if ARCH_X86_64 == 0
INIT_MMX mmxext
DEBLOCK_LUMA_INTRA v8
-%endif
INIT_MMX mmxext
@@ -911,17 +910,8 @@ cglobal deblock_v_chroma_8, 5,6
; int8_t *tc0)
;-----------------------------------------------------------------------------
cglobal deblock_h_chroma_8, 5,7
-%if ARCH_X86_64
- ; This could use the red zone on 64 bit unix to avoid the stack pointer
- ; readjustment, but valgrind assumes the red zone is clobbered on
- ; function calls and returns.
- sub rsp, 16
- %define buf0 [rsp]
- %define buf1 [rsp+8]
-%else
%define buf0 r0m
%define buf1 r2m
-%endif
CHROMA_H_START
TRANSPOSE4x8_LOAD bw, wd, dq, PASS8ROWS(t5, r0, r1, t6)
movq buf0, m0
@@ -934,9 +924,6 @@ cglobal deblock_h_chroma_8, 5,7
movq m0, buf0
movq m3, buf1
TRANSPOSE8x4B_STORE PASS8ROWS(t5, r0, r1, t6)
-%if ARCH_X86_64
- add rsp, 16
-%endif
RET
ALIGN 16
@@ -953,13 +940,8 @@ ff_chroma_inter_body_mmxext:
cglobal deblock_h_chroma422_8, 5, 6
SUB rsp, (1+ARCH_X86_64*2)*mmsize
- %if ARCH_X86_64
- %define buf0 [rsp+16]
- %define buf1 [rsp+8]
- %else
- %define buf0 r0m
- %define buf1 r2m
- %endif
+ %define buf0 r0m
+ %define buf1 r2m
movd m6, [r4]
punpcklbw m6, m6
@@ -1059,6 +1041,8 @@ ff_chroma_intra_body_mmxext:
paddb m2, m6
ret
+%endif ; ARCH_X86_64 == 0
+
%macro LOAD_8_ROWS 8
movd m0, %1
movd m1, %2
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index c54f9f1a68..17c7af388c 100644
--- a/libavcodec/x86/h264_idct.asm
+++ b/libavcodec/x86/h264_idct.asm
@@ -87,12 +87,14 @@ SECTION .text
STORE_DIFFx2 m2, m3, m4, m5, m7, 6, %1, %3
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
; void ff_h264_idct_add_8_mmx(uint8_t *dst, int16_t *block, int stride)
cglobal h264_idct_add_8, 3, 3, 0
movsxdifnidn r2, r2d
IDCT4_ADD r0, r1, r2
RET
+%endif
%macro IDCT8_1D 2
psraw m0, m1, 1
@@ -207,6 +209,7 @@ cglobal h264_idct_add_8, 3, 3, 0
STORE_DIFFx2 m1, m2, m5, m6, m7, 6, %1, %3
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
; void ff_h264_idct8_add_8_mmx(uint8_t *dst, int16_t *block, int stride)
cglobal h264_idct8_add_8, 3, 4, 0
@@ -223,6 +226,7 @@ cglobal h264_idct8_add_8, 3, 4, 0
ADD rsp, pad
RET
+%endif
; %1=uint8_t *dst, %2=int16_t *block, %3=int stride
%macro IDCT8_ADD_SSE 4
@@ -315,16 +319,7 @@ cglobal h264_idct8_add_8, 3, 4, 10
%endmacro
INIT_MMX mmxext
-; void ff_h264_idct_dc_add_8_mmxext(uint8_t *dst, int16_t *block, int stride)
%if ARCH_X86_64
-cglobal h264_idct_dc_add_8, 3, 4, 0
- movsxd r2, r2d
- movsx r3, word [r1]
- mov dword [r1], 0
- DC_ADD_MMXEXT_INIT r3, r2
- DC_ADD_MMXEXT_OP movh, r0, r2, r3
- RET
-
; void ff_h264_idct8_dc_add_8_mmxext(uint8_t *dst, int16_t *block, int stride)
cglobal h264_idct8_dc_add_8, 3, 4, 0
movsxd r2, r2d
@@ -358,6 +353,7 @@ cglobal h264_idct8_dc_add_8, 2, 3, 0
%endif
INIT_MMX mmx
+%if ARCH_X86_32
; void ff_h264_idct_add16_8_mmx(uint8_t *dst, const int *block_offset,
; int16_t *block, int stride,
; const uint8_t nnzc[6 * 8])
@@ -438,16 +434,12 @@ cglobal h264_idct_add16_8, 5, 8 + npicregs, 0, dst1, block_offset, block, stride
jz .no_dc
mov word [r2], 0
DC_ADD_MMXEXT_INIT r6, r3
-%if ARCH_X86_64 == 0
%define dst2q r1
%define dst2d r1d
-%endif
mov dst2d, dword [r1+r5*4]
lea dst2q, [r0+dst2q]
DC_ADD_MMXEXT_OP movh, dst2q, r3, r6
-%if ARCH_X86_64 == 0
mov r1, r1m
-%endif
inc r5
add r2, 32
cmp r5, 16
@@ -519,16 +511,12 @@ cglobal h264_idct_add16intra_8, 5, 8 + npicregs, 0, dst1, block_offset, block, s
jz .skipblock
mov word [r2], 0
DC_ADD_MMXEXT_INIT r6, r3
-%if ARCH_X86_64 == 0
%define dst2q r1
%define dst2d r1d
-%endif
mov dst2d, dword [r1+r5*4]
add dst2q, r0
DC_ADD_MMXEXT_OP movh, dst2q, r3, r6
-%if ARCH_X86_64 == 0
mov r1, r1m
-%endif
.skipblock:
inc r5
add r2, 32
@@ -560,18 +548,14 @@ cglobal h264_idct8_add4_8, 5, 8 + npicregs, 0, dst1, block_offset, block, stride
jz .no_dc
mov word [r2], 0
DC_ADD_MMXEXT_INIT r6, r3
-%if ARCH_X86_64 == 0
%define dst2q r1
%define dst2d r1d
-%endif
mov dst2d, dword [r1+r5*4]
lea dst2q, [r0+dst2q]
DC_ADD_MMXEXT_OP mova, dst2q, r3, r6
lea dst2q, [dst2q+r3*4]
DC_ADD_MMXEXT_OP mova, dst2q, r3, r6
-%if ARCH_X86_64 == 0
mov r1, r1m
-%endif
add r5, 4
add r2, 128
cmp r5, 16
@@ -597,6 +581,7 @@ cglobal h264_idct8_add4_8, 5, 8 + npicregs, 0, dst1, block_offset, block, stride
ADD rsp, pad
RET
+%endif
INIT_XMM sse2
; void ff_h264_idct8_add4_8_sse2(uint8_t *dst, const int *block_offset,
@@ -678,6 +663,7 @@ h264_idct_add8_mmx_plane:
jnz .nextblock
rep ret
+%if ARCH_X86_32
; void ff_h264_idct_add8_8_mmx(uint8_t **dest, const int *block_offset,
; int16_t *block, int stride,
; const uint8_t nnzc[6 * 8])
@@ -687,20 +673,14 @@ cglobal h264_idct_add8_8, 5, 8 + npicregs, 0, dst1, block_offset, block, stride,
add r2, 512
%ifdef PIC
lea picregq, [scan8_mem]
-%endif
-%if ARCH_X86_64
- mov dst2q, r0
%endif
call h264_idct_add8_mmx_plane
mov r5, 32
add r2, 384
-%if ARCH_X86_64
- add dst2q, gprsize
-%else
add r0mp, gprsize
-%endif
call h264_idct_add8_mmx_plane
RET ; TODO: check rep ret after a function call
+%endif
cglobal h264_idct_add8_422_8, 5, 8 + npicregs, 0, dst1, block_offset, block, stride, nnzc, cntr, coeff, dst2, picreg
; dst1, block_offset, block, stride, nnzc, cntr, coeff, dst2, picreg
@@ -734,6 +714,7 @@ cglobal h264_idct_add8_422_8, 5, 8 + npicregs, 0, dst1, block_offset, block, str
RET ; TODO: check rep ret after a function call
+%if ARCH_X86_32
h264_idct_add8_mmxext_plane:
movsxdifnidn r3, r3d
.nextblock:
@@ -741,14 +722,9 @@ h264_idct_add8_mmxext_plane:
movzx r6, byte [r4+r6]
test r6, r6
jz .try_dc
-%if ARCH_X86_64
- mov r0d, dword [r1+r5*4]
- add r0, [dst2q]
-%else
mov r0, r1m ; XXX r1m here is actually r0m of the calling func
mov r0, [r0]
add r0, dword [r1+r5*4]
-%endif
IDCT4_ADD r0, r2, r3
inc r5
add r2, 32
@@ -761,14 +737,9 @@ h264_idct_add8_mmxext_plane:
jz .skipblock
mov word [r2], 0
DC_ADD_MMXEXT_INIT r6, r3
-%if ARCH_X86_64
- mov r0d, dword [r1+r5*4]
- add r0, [dst2q]
-%else
mov r0, r1m ; XXX r1m here is actually r0m of the calling func
mov r0, [r0]
add r0, dword [r1+r5*4]
-%endif
DC_ADD_MMXEXT_OP movh, r0, r3, r6
.skipblock:
inc r5
@@ -785,22 +756,16 @@ cglobal h264_idct_add8_8, 5, 8 + npicregs, 0, dst1, block_offset, block, stride,
movsxdifnidn r3, r3d
mov r5, 16
add r2, 512
-%if ARCH_X86_64
- mov dst2q, r0
-%endif
%ifdef PIC
lea picregq, [scan8_mem]
%endif
call h264_idct_add8_mmxext_plane
mov r5, 32
add r2, 384
-%if ARCH_X86_64
- add dst2q, gprsize
-%else
add r0mp, gprsize
-%endif
call h264_idct_add8_mmxext_plane
RET ; TODO: check rep ret after a function call
+%endif
; r0 = uint8_t *dst, r2 = int16_t *block, r3 = int stride, r6=clobbered
h264_idct_dc_add8_mmxext:
@@ -1139,8 +1104,10 @@ cglobal h264_luma_dc_dequant_idct, 3, 4, %1
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
IDCT_DC_DEQUANT 0
+%endif
INIT_MMX sse2
IDCT_DC_DEQUANT 7
diff --git a/libavcodec/x86/h264_weight.asm b/libavcodec/x86/h264_weight.asm
index 0975d74fcf..086616e633 100644
--- a/libavcodec/x86/h264_weight.asm
+++ b/libavcodec/x86/h264_weight.asm
@@ -70,6 +70,7 @@ SECTION .text
packuswb m0, m1
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
cglobal h264_weight_16, 6, 6, 0
WEIGHT_SETUP
@@ -82,6 +83,7 @@ cglobal h264_weight_16, 6, 6, 0
dec r2d
jnz .nextrow
REP_RET
+%endif
%macro WEIGHT_FUNC_MM 2
cglobal h264_weight_%1, 6, 6, %2
@@ -95,8 +97,10 @@ cglobal h264_weight_%1, 6, 6, %2
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
WEIGHT_FUNC_MM 8, 0
+%endif
INIT_XMM sse2
WEIGHT_FUNC_MM 16, 8
@@ -198,6 +202,7 @@ WEIGHT_FUNC_HALF_MM 8, 8
packuswb m0, m1
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
cglobal h264_biweight_16, 7, 8, 0
BIWEIGHT_SETUP
@@ -216,6 +221,7 @@ cglobal h264_biweight_16, 7, 8, 0
dec r3d
jnz .nextrow
REP_RET
+%endif
%macro BIWEIGHT_FUNC_MM 2
cglobal h264_biweight_%1, 7, 8, %2
@@ -233,8 +239,10 @@ cglobal h264_biweight_%1, 7, 8, %2
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
BIWEIGHT_FUNC_MM 8, 0
+%endif
INIT_XMM sse2
BIWEIGHT_FUNC_MM 16, 8
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index c9a96c7dca..9ef6c6bb53 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -236,6 +236,10 @@ av_cold void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth,
if (bit_depth == 8) {
if (EXTERNAL_MMX(cpu_flags)) {
+#if ARCH_X86_32
+ if (cpu_flags & AV_CPU_FLAG_CMOV)
+ c->h264_luma_dc_dequant_idct = ff_h264_luma_dc_dequant_idct_mmx;
+
c->h264_idct_dc_add =
c->h264_idct_add = ff_h264_idct_add_8_mmx;
c->h264_idct8_dc_add =
@@ -243,18 +247,21 @@ av_cold void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth,
c->h264_idct_add16 = ff_h264_idct_add16_8_mmx;
c->h264_idct8_add4 = ff_h264_idct8_add4_8_mmx;
+
+ c->h264_idct_add16intra = ff_h264_idct_add16intra_8_mmx;
+#endif
if (chroma_format_idc <= 1) {
+#if ARCH_X86_32
c->h264_idct_add8 = ff_h264_idct_add8_8_mmx;
+#endif
} else {
c->h264_idct_add8 = ff_h264_idct_add8_422_8_mmx;
}
- c->h264_idct_add16intra = ff_h264_idct_add16intra_8_mmx;
- if (cpu_flags & AV_CPU_FLAG_CMOV)
- c->h264_luma_dc_dequant_idct = ff_h264_luma_dc_dequant_idct_mmx;
}
if (EXTERNAL_MMXEXT(cpu_flags)) {
- c->h264_idct_dc_add = ff_h264_idct_dc_add_8_mmxext;
c->h264_idct8_dc_add = ff_h264_idct8_dc_add_8_mmxext;
+#if ARCH_X86_32 && HAVE_MMXEXT_EXTERNAL
+ c->h264_idct_dc_add = ff_h264_idct_dc_add_8_mmxext;
c->h264_idct_add16 = ff_h264_idct_add16_8_mmxext;
c->h264_idct8_add4 = ff_h264_idct8_add4_8_mmxext;
if (chroma_format_idc <= 1)
@@ -270,18 +277,18 @@ av_cold void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth,
c->h264_h_loop_filter_chroma = ff_deblock_h_chroma422_8_mmxext;
c->h264_h_loop_filter_chroma_intra = ff_deblock_h_chroma422_intra_8_mmxext;
}
-#if ARCH_X86_32 && HAVE_MMXEXT_EXTERNAL
c->h264_v_loop_filter_luma = deblock_v_luma_8_mmxext;
c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_mmxext;
c->h264_v_loop_filter_luma_intra = deblock_v_luma_intra_8_mmxext;
c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_mmxext;
-#endif /* ARCH_X86_32 && HAVE_MMXEXT_EXTERNAL */
+
c->weight_h264_pixels_tab[0] = ff_h264_weight_16_mmxext;
c->weight_h264_pixels_tab[1] = ff_h264_weight_8_mmxext;
- c->weight_h264_pixels_tab[2] = ff_h264_weight_4_mmxext;
c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_mmxext;
c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_mmxext;
+#endif /* ARCH_X86_32 && HAVE_MMXEXT_EXTERNAL */
+ c->weight_h264_pixels_tab[2] = ff_h264_weight_4_mmxext;
c->biweight_h264_pixels_tab[2] = ff_h264_biweight_4_mmxext;
}
if (EXTERNAL_SSE2(cpu_flags)) {
--
2.34.1
* [FFmpeg-devel] [PATCH 23/41] avcodec/x86/sbrdsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (21 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 22/41] avcodec/x86/h264dsp_init: Disable overridden functions " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 24/41] avcodec/x86/idctdsp_init: " Andreas Rheinhardt
` (18 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
ff_sbr_qmf_deint_bfly_sse (which is overridden by an SSE2 function)
at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/sbrdsp.asm | 2 ++
libavcodec/x86/sbrdsp_init.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
index 62bbe512ec..ea5099a5d1 100644
--- a/libavcodec/x86/sbrdsp.asm
+++ b/libavcodec/x86/sbrdsp.asm
@@ -286,8 +286,10 @@ cglobal sbr_qmf_deint_bfly, 3,5,8, v,src0,src1,vrev,c
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_XMM sse
SBR_QMF_DEINT_BFLY
+%endif
INIT_XMM sse2
SBR_QMF_DEINT_BFLY
diff --git a/libavcodec/x86/sbrdsp_init.c b/libavcodec/x86/sbrdsp_init.c
index 6911a1a515..a710f10dc1 100644
--- a/libavcodec/x86/sbrdsp_init.c
+++ b/libavcodec/x86/sbrdsp_init.c
@@ -67,7 +67,9 @@ av_cold void ff_sbrdsp_init_x86(SBRDSPContext *s)
s->hf_g_filt = ff_sbr_hf_g_filt_sse;
s->hf_gen = ff_sbr_hf_gen_sse;
s->qmf_post_shuffle = ff_sbr_qmf_post_shuffle_sse;
+#if ARCH_X86_32
s->qmf_deint_bfly = ff_sbr_qmf_deint_bfly_sse;
+#endif
s->qmf_deint_neg = ff_sbr_qmf_deint_neg_sse;
s->autocorrelate = ff_sbr_autocorrelate_sse;
}
--
2.34.1
* [FFmpeg-devel] [PATCH 24/41] avcodec/x86/idctdsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (22 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 23/41] avcodec/x86/sbrdsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 25/41] avcodec/x86/blockdsp_init: " Andreas Rheinhardt
` (17 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables the MMX functions
as well as the non-64-bit implementations (which are overridden by
the 64-bit-specific implementations) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
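Not part of the commit: a sketch contrasting the two kinds of guard that
appear in the idctdsp_init.c hunk below, a preprocessor `#if` and an
ordinary `if` on a compile-time constant. All names (DEMO_ARCH_*,
demo_idct_*, demo_select_idct) are hypothetical. The sketch assumes the
32-bit variant is not built at all on x64, so every reference to it has to
sit inside `#if`, while the 64-bit variant is selected by a constant
condition.

```c
/* Hypothetical build-time constants: this models an x64 build. */
#define DEMO_ARCH_X86_64 1
#define DEMO_ARCH_X86_32 (!DEMO_ARCH_X86_64)

typedef int (*demo_idct_fn)(int);

static int demo_idct_c(int x)       { return x; }
#if DEMO_ARCH_X86_32
/* Not compiled on x64, so it must never be named outside an #if. */
static int demo_idct_sse2_32(int x) { return x + 32; }
#endif
static int demo_idct_sse2_64(int x) { return x + 64; }

static demo_idct_fn demo_select_idct(int have_sse2)
{
    demo_idct_fn f = demo_idct_c;

    if (have_sse2) {
#if DEMO_ARCH_X86_32
        f = demo_idct_sse2_32;
#endif
        /* Constant condition: the branch stays in the source for both
         * targets and is only taken when building for x64. */
        if (DEMO_ARCH_X86_64)
            f = demo_idct_sse2_64;
    }
    return f;
}
```

The `#if` form removes the reference from the translation unit entirely;
the plain `if` form still requires the named function to be declared.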
libavcodec/tests/x86/dct.c | 2 +-
libavcodec/x86/idctdsp.asm | 6 ++++++
libavcodec/x86/idctdsp_init.c | 4 ++++
libavcodec/x86/simple_idct.asm | 2 ++
4 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/libavcodec/tests/x86/dct.c b/libavcodec/tests/x86/dct.c
index 1eb9400567..144d055cff 100644
--- a/libavcodec/tests/x86/dct.c
+++ b/libavcodec/tests/x86/dct.c
@@ -73,7 +73,7 @@ static const struct algo fdct_tab_arch[] = {
};
static const struct algo idct_tab_arch[] = {
-#if HAVE_MMX_EXTERNAL
+#if ARCH_X86_32 && HAVE_MMX_EXTERNAL
{ "SIMPLE-MMX", ff_simple_idct_mmx, FF_IDCT_PERM_SIMPLE, AV_CPU_FLAG_MMX },
#endif
#if CONFIG_MPEG4_DECODER && HAVE_X86ASM
diff --git a/libavcodec/x86/idctdsp.asm b/libavcodec/x86/idctdsp.asm
index 089425a9ab..701a8c5a43 100644
--- a/libavcodec/x86/idctdsp.asm
+++ b/libavcodec/x86/idctdsp.asm
@@ -74,8 +74,10 @@ cglobal put_signed_pixels_clamped, 3, 4, %1, block, pixels, lsize, lsize3
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
PUT_SIGNED_PIXELS_CLAMPED 0
+%endif
INIT_XMM sse2
PUT_SIGNED_PIXELS_CLAMPED 3
@@ -117,8 +119,10 @@ cglobal put_pixels_clamped, 3, 4, 2, block, pixels, lsize, lsize3
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
PUT_PIXELS_CLAMPED
+%endif
INIT_XMM sse2
PUT_PIXELS_CLAMPED
@@ -177,7 +181,9 @@ cglobal add_pixels_clamped, 3, 3, 5, block, pixels, lsize
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
ADD_PIXELS_CLAMPED
+%endif
INIT_XMM sse2
ADD_PIXELS_CLAMPED
diff --git a/libavcodec/x86/idctdsp_init.c b/libavcodec/x86/idctdsp_init.c
index 9103b92ce7..41ba9d68cb 100644
--- a/libavcodec/x86/idctdsp_init.c
+++ b/libavcodec/x86/idctdsp_init.c
@@ -63,6 +63,7 @@ av_cold void ff_idctdsp_init_x86(IDCTDSPContext *c, AVCodecContext *avctx,
{
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags)) {
c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_mmx;
c->put_pixels_clamped = ff_put_pixels_clamped_mmx;
@@ -79,12 +80,14 @@ av_cold void ff_idctdsp_init_x86(IDCTDSPContext *c, AVCodecContext *avctx,
c->perm_type = FF_IDCT_PERM_SIMPLE;
}
}
+#endif
if (EXTERNAL_SSE2(cpu_flags)) {
c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_sse2;
c->put_pixels_clamped = ff_put_pixels_clamped_sse2;
c->add_pixels_clamped = ff_add_pixels_clamped_sse2;
+#if ARCH_X86_32
if (!high_bit_depth &&
avctx->lowres == 0 &&
(avctx->idct_algo == FF_IDCT_AUTO ||
@@ -94,6 +97,7 @@ av_cold void ff_idctdsp_init_x86(IDCTDSPContext *c, AVCodecContext *avctx,
c->idct_add = ff_simple_idct_add_sse2;
c->perm_type = FF_IDCT_PERM_SIMPLE;
}
+#endif
if (ARCH_X86_64 &&
!high_bit_depth &&
diff --git a/libavcodec/x86/simple_idct.asm b/libavcodec/x86/simple_idct.asm
index 6fedbb5784..002fdede90 100644
--- a/libavcodec/x86/simple_idct.asm
+++ b/libavcodec/x86/simple_idct.asm
@@ -25,6 +25,7 @@
%include "libavutil/x86/x86util.asm"
+%if ARCH_X86_32
SECTION_RODATA
cextern pb_80
@@ -887,3 +888,4 @@ cglobal simple_idct_add, 3, 4, 8, 128, pixels, lsize, block, t0
lea pixelsq, [pixelsq+lsizeq*2]
ADD_PIXELS_CLAMPED 96
RET
+%endif
--
2.34.1
* [FFmpeg-devel] [PATCH 25/41] avcodec/x86/blockdsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (23 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 24/41] avcodec/x86/idctdsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 26/41] avcodec/x86/pixblockdsp_init: " Andreas Rheinhardt
` (16 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by the SSE-specific
implementation) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
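The override described above can be pictured with a minimal sketch. It is
not part of the patch: the SKETCH_* names, the struct and the clear_block_*
functions are hypothetical stand-ins for FFmpeg's real CPU flags, context
and assembly routines. It only shows that the init code assigns function
pointers in order, so a later SSE assignment replaces an earlier MMX one,
and the MMX branch can be compiled out when SSE is always available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; FFmpeg's real CPU flags and contexts differ. */
#define SKETCH_CPU_MMX (1 << 0)
#define SKETCH_CPU_SSE (1 << 1)

/* 1 mimics a 32-bit x86 build, 0 mimics x64. */
#ifndef SKETCH_ARCH_X86_32
#define SKETCH_ARCH_X86_32 0
#endif

typedef struct SketchBlockDSP {
    const char *(*clear_block)(short *block);
} SketchBlockDSP;

/* Each variant clears 64 coefficients and reports its own name. */
static const char *clear_block_c(short *block)
{
    memset(block, 0, 64 * sizeof(*block));
    return "c";
}

#if SKETCH_ARCH_X86_32
static const char *clear_block_mmx(short *block)
{
    memset(block, 0, 64 * sizeof(*block));
    return "mmx";
}
#endif

static const char *clear_block_sse(short *block)
{
    memset(block, 0, 64 * sizeof(*block));
    return "sse";
}

static void sketch_blockdsp_init(SketchBlockDSP *c, int cpu_flags)
{
    assert(c != NULL);
    c->clear_block = clear_block_c;
#if SKETCH_ARCH_X86_32
    /* Only survives as the final choice on CPUs without SSE. */
    if (cpu_flags & SKETCH_CPU_MMX)
        c->clear_block = clear_block_mmx;
#endif
    /* Later assignments win, so SSE replaces MMX whenever both exist. */
    if (cpu_flags & SKETCH_CPU_SSE)
        c->clear_block = clear_block_sse;
}
```

With SKETCH_ARCH_X86_32 set to 0 the MMX variant is neither compiled nor
referenced, which mirrors the effect this patch aims for on x64.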
libavcodec/x86/blockdsp.asm | 4 ++++
libavcodec/x86/blockdsp_init.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/libavcodec/x86/blockdsp.asm b/libavcodec/x86/blockdsp.asm
index 9d203df8f5..dfa38f1d3d 100644
--- a/libavcodec/x86/blockdsp.asm
+++ b/libavcodec/x86/blockdsp.asm
@@ -46,9 +46,11 @@ cglobal clear_block, 1, 1, %1, blocks
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
%define ZERO pxor
CLEAR_BLOCK 0, 4
+%endif
INIT_XMM sse
%define ZERO xorps
CLEAR_BLOCK 1, 2
@@ -78,9 +80,11 @@ cglobal clear_blocks, 1, 2, %1, blocks, len
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
%define ZERO pxor
CLEAR_BLOCKS 0
+%endif
INIT_XMM sse
%define ZERO xorps
CLEAR_BLOCKS 1
diff --git a/libavcodec/x86/blockdsp_init.c b/libavcodec/x86/blockdsp_init.c
index d7f8a8e508..3b857a51b6 100644
--- a/libavcodec/x86/blockdsp_init.c
+++ b/libavcodec/x86/blockdsp_init.c
@@ -37,10 +37,12 @@ av_cold void ff_blockdsp_init_x86(BlockDSPContext *c,
#if HAVE_X86ASM
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags)) {
c->clear_block = ff_clear_block_mmx;
c->clear_blocks = ff_clear_blocks_mmx;
}
+#endif
if (EXTERNAL_SSE(cpu_flags)) {
c->clear_block = ff_clear_block_sse;
--
2.34.1
* [FFmpeg-devel] [PATCH 26/41] avcodec/x86/pixblockdsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (24 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 25/41] avcodec/x86/blockdsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 27/41] avcodec/x86/lossless_audiodsp_init: " Andreas Rheinhardt
` (15 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by the SSE2-specific
implementation) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/pixblockdsp.asm | 4 ++++
libavcodec/x86/pixblockdsp_init.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/libavcodec/x86/pixblockdsp.asm b/libavcodec/x86/pixblockdsp.asm
index 440fe29bcc..4a6369d7da 100644
--- a/libavcodec/x86/pixblockdsp.asm
+++ b/libavcodec/x86/pixblockdsp.asm
@@ -25,6 +25,7 @@
SECTION .text
+%if ARCH_X86_32
INIT_MMX mmx
; void ff_get_pixels_mmx(int16_t *block, const uint8_t *pixels, ptrdiff_t stride)
cglobal get_pixels, 3,4
@@ -48,6 +49,7 @@ cglobal get_pixels, 3,4
add r3, 32
js .loop
REP_RET
+%endif
INIT_XMM sse2
cglobal get_pixels, 3, 4, 5
@@ -121,8 +123,10 @@ cglobal diff_pixels, 4,5,5
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
DIFF_PIXELS
+%endif
INIT_XMM sse2
DIFF_PIXELS
diff --git a/libavcodec/x86/pixblockdsp_init.c b/libavcodec/x86/pixblockdsp_init.c
index 3a5eb6959c..684879598e 100644
--- a/libavcodec/x86/pixblockdsp_init.c
+++ b/libavcodec/x86/pixblockdsp_init.c
@@ -36,6 +36,7 @@ av_cold void ff_pixblockdsp_init_x86(PixblockDSPContext *c,
{
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags)) {
if (!high_bit_depth) {
c->get_pixels_unaligned =
@@ -44,6 +45,7 @@ av_cold void ff_pixblockdsp_init_x86(PixblockDSPContext *c,
c->diff_pixels_unaligned =
c->diff_pixels = ff_diff_pixels_mmx;
}
+#endif
if (EXTERNAL_SSE2(cpu_flags)) {
if (!high_bit_depth) {
--
2.34.1
* [FFmpeg-devel] [PATCH 27/41] avcodec/x86/lossless_audiodsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (25 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 26/41] avcodec/x86/pixblockdsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 28/41] avcodec/x86/svq1enc_init: " Andreas Rheinhardt
` (14 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by the SSE2-specific
implementation) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/lossless_audiodsp.asm | 2 ++
libavcodec/x86/lossless_audiodsp_init.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/libavcodec/x86/lossless_audiodsp.asm b/libavcodec/x86/lossless_audiodsp.asm
index 063d7b41af..ad8a34531a 100644
--- a/libavcodec/x86/lossless_audiodsp.asm
+++ b/libavcodec/x86/lossless_audiodsp.asm
@@ -63,8 +63,10 @@ cglobal scalarproduct_and_madd_int16, 4,4,8, v1, v2, v3, order, mul
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
SCALARPRODUCT
+%endif
INIT_XMM sse2
SCALARPRODUCT
diff --git a/libavcodec/x86/lossless_audiodsp_init.c b/libavcodec/x86/lossless_audiodsp_init.c
index f74c7e4361..f90cab55d3 100644
--- a/libavcodec/x86/lossless_audiodsp_init.c
+++ b/libavcodec/x86/lossless_audiodsp_init.c
@@ -40,8 +40,10 @@ av_cold void ff_llauddsp_init_x86(LLAudDSPContext *c)
#if HAVE_X86ASM
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMXEXT(cpu_flags))
c->scalarproduct_and_madd_int16 = ff_scalarproduct_and_madd_int16_mmxext;
+#endif
if (EXTERNAL_SSE2(cpu_flags))
c->scalarproduct_and_madd_int16 = ff_scalarproduct_and_madd_int16_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 28/41] avcodec/x86/svq1enc_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (26 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 27/41] avcodec/x86/lossless_audiodsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 29/41] avcodec/x86/fmtconvert_init: " Andreas Rheinhardt
` (13 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by the SSE2-specific
implementation) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/svq1enc.asm | 2 ++
libavcodec/x86/svq1enc_init.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/libavcodec/x86/svq1enc.asm b/libavcodec/x86/svq1enc.asm
index a87632836d..a43d3b4029 100644
--- a/libavcodec/x86/svq1enc.asm
+++ b/libavcodec/x86/svq1enc.asm
@@ -55,7 +55,9 @@ cglobal ssd_int8_vs_int16, 3, 3, 3, pix1, pix2, size
RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
SSD_INT8_VS_INT16
+%endif
INIT_XMM sse2
SSD_INT8_VS_INT16
diff --git a/libavcodec/x86/svq1enc_init.c b/libavcodec/x86/svq1enc_init.c
index 40b4b0e183..aaebbb92cc 100644
--- a/libavcodec/x86/svq1enc_init.c
+++ b/libavcodec/x86/svq1enc_init.c
@@ -33,9 +33,11 @@ av_cold void ff_svq1enc_init_x86(SVQ1EncContext *c)
{
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags)) {
c->ssd_int8_vs_int16 = ff_ssd_int8_vs_int16_mmx;
}
+#endif
if (EXTERNAL_SSE2(cpu_flags)) {
c->ssd_int8_vs_int16 = ff_ssd_int8_vs_int16_sse2;
}
--
2.34.1
* [FFmpeg-devel] [PATCH 29/41] avcodec/x86/fmtconvert_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (27 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 28/41] avcodec/x86/svq1enc_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 30/41] avcodec/x86/hpeldsp_vp3_init: " Andreas Rheinhardt
` (12 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the SSE implementation (which is overridden by the SSE2-specific
implementation) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/fmtconvert.asm | 4 ++++
libavcodec/x86/fmtconvert_init.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/libavcodec/x86/fmtconvert.asm b/libavcodec/x86/fmtconvert.asm
index 8f62a0a093..5b7578825a 100644
--- a/libavcodec/x86/fmtconvert.asm
+++ b/libavcodec/x86/fmtconvert.asm
@@ -71,8 +71,10 @@ cglobal int32_to_float_fmul_scalar, 4, 4, %1, dst, src, mul, len
RET
%endmacro
+%if ARCH_X86_32
INIT_XMM sse
INT32_TO_FLOAT_FMUL_SCALAR 5
+%endif
INIT_XMM sse2
INT32_TO_FLOAT_FMUL_SCALAR 3
@@ -117,8 +119,10 @@ cglobal int32_to_float_fmul_array8, 5, 5, 5, c, dst, src, mul, len
RET
%endmacro
+%if ARCH_X86_32
INIT_XMM sse
INT32_TO_FLOAT_FMUL_ARRAY8
+%endif
INIT_XMM sse2
INT32_TO_FLOAT_FMUL_ARRAY8
diff --git a/libavcodec/x86/fmtconvert_init.c b/libavcodec/x86/fmtconvert_init.c
index df097054e4..0883cd8a56 100644
--- a/libavcodec/x86/fmtconvert_init.c
+++ b/libavcodec/x86/fmtconvert_init.c
@@ -43,10 +43,12 @@ av_cold void ff_fmt_convert_init_x86(FmtConvertContext *c, AVCodecContext *avctx
#if HAVE_X86ASM
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_SSE(cpu_flags)) {
c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_sse;
c->int32_to_float_fmul_array8 = ff_int32_to_float_fmul_array8_sse;
}
+#endif
if (EXTERNAL_SSE2(cpu_flags)) {
c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_sse2;
c->int32_to_float_fmul_array8 = ff_int32_to_float_fmul_array8_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 30/41] avcodec/x86/hpeldsp_vp3_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (28 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 29/41] avcodec/x86/fmtconvert_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 31/41] avcodec/x86/hpeldsp_init: " Andreas Rheinhardt
` (11 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the 3dnow implementation (which is overridden by the MMXEXT-specific
implementation) at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/hpeldsp_vp3.asm | 4 ++++
libavcodec/x86/hpeldsp_vp3_init.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/libavcodec/x86/hpeldsp_vp3.asm b/libavcodec/x86/hpeldsp_vp3.asm
index cba96d06cb..284babfd70 100644
--- a/libavcodec/x86/hpeldsp_vp3.asm
+++ b/libavcodec/x86/hpeldsp_vp3.asm
@@ -65,8 +65,10 @@ cglobal put_no_rnd_pixels8_x2_exact, 4,5
INIT_MMX mmxext
PUT_NO_RND_PIXELS8_X2_EXACT
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_NO_RND_PIXELS8_X2_EXACT
+%endif
; void ff_put_no_rnd_pixels8_y2_exact(uint8_t *block, const uint8_t *pixels, ptrdiff_t line_size, int h)
@@ -107,5 +109,7 @@ cglobal put_no_rnd_pixels8_y2_exact, 4,5
INIT_MMX mmxext
PUT_NO_RND_PIXELS8_Y2_EXACT
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_NO_RND_PIXELS8_Y2_EXACT
+%endif
diff --git a/libavcodec/x86/hpeldsp_vp3_init.c b/libavcodec/x86/hpeldsp_vp3_init.c
index 5979f4123c..3c52f9d6f4 100644
--- a/libavcodec/x86/hpeldsp_vp3_init.c
+++ b/libavcodec/x86/hpeldsp_vp3_init.c
@@ -40,12 +40,14 @@ void ff_put_no_rnd_pixels8_y2_exact_3dnow(uint8_t *block,
av_cold void ff_hpeldsp_vp3_init_x86(HpelDSPContext *c, int cpu_flags, int flags)
{
+#if ARCH_X86_32
if (EXTERNAL_AMD3DNOW(cpu_flags)) {
if (flags & AV_CODEC_FLAG_BITEXACT) {
c->put_no_rnd_pixels_tab[1][1] = ff_put_no_rnd_pixels8_x2_exact_3dnow;
c->put_no_rnd_pixels_tab[1][2] = ff_put_no_rnd_pixels8_y2_exact_3dnow;
}
}
+#endif
if (EXTERNAL_MMXEXT(cpu_flags)) {
if (flags & AV_CODEC_FLAG_BITEXACT) {
--
2.34.1
* [FFmpeg-devel] [PATCH 31/41] avcodec/x86/hpeldsp_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (29 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 30/41] avcodec/x86/hpeldsp_vp3_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 32/41] avcodec/x86/h264_qpel: Make functions only used here static Andreas Rheinhardt
` (10 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the 3dnow implementation (which is overridden by the MMXEXT-specific
implementation) as well as some MMX functions
at compile time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
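The macro split in hpeldsp_init.c can be illustrated with a minimal sketch.
It is not part of the patch: the SketchHpel type, the put_* functions and
the SKETCH_* macros are hypothetical, and the runtime arch_x86_32 parameter
stands in for the compile-time ARCH_X86_32 check so that both paths can be
exercised. Table entries 0 and 3 are filled by one macro and entries 1 and 2
by another, so the latter can be skipped where they would be overridden
anyway.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*sketch_fn)(void);

typedef struct SketchHpel {
    sketch_fn put_tab[4]; /* 0: full-pel, 1: x2, 2: y2, 3: xy2 */
} SketchHpel;

static int put_full(void) { return 0; }
static int put_x2(void)   { return 1; }
static int put_y2(void)   { return 2; }
static int put_xy2(void)  { return 3; }

/* Each macro expands to exactly one statement without a trailing
 * semicolon, so "if (x) MACRO(c);" behaves as expected. */
#define SKETCH_SET03(c)             \
    do {                            \
        (c)->put_tab[0] = put_full; \
        (c)->put_tab[3] = put_xy2;  \
    } while (0)

#define SKETCH_SET12(c)             \
    do {                            \
        (c)->put_tab[1] = put_x2;   \
        (c)->put_tab[2] = put_y2;   \
    } while (0)

/* Convenience wrapper for callers that want all four entries. */
#define SKETCH_SET_ALL(c)           \
    do {                            \
        SKETCH_SET03(c);            \
        SKETCH_SET12(c);            \
    } while (0)

static void sketch_hpel_init(SketchHpel *c, int arch_x86_32)
{
    assert(c != NULL);
    /* Entries 0 and 3 are set on every build in this sketch. */
    SKETCH_SET03(c);
    /* Entries 1 and 2 are only set where nothing overrides them. */
    if (arch_x86_32)
        SKETCH_SET12(c);
}
```

Dropping the trailing semicolon from the inner macro is what allows it to be
used as an ordinary statement inside the do/while wrappers.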
libavcodec/x86/fpel.asm | 2 ++
libavcodec/x86/hpeldsp.asm | 22 +++++++++++++++++++
libavcodec/x86/hpeldsp_init.c | 40 +++++++++++++++++++++++++----------
libavcodec/x86/rnd_template.c | 2 ++
4 files changed, 55 insertions(+), 11 deletions(-)
diff --git a/libavcodec/x86/fpel.asm b/libavcodec/x86/fpel.asm
index d38a1b1035..8c810265c3 100644
--- a/libavcodec/x86/fpel.asm
+++ b/libavcodec/x86/fpel.asm
@@ -91,7 +91,9 @@ cglobal %1_pixels%2, 4,5,4
INIT_MMX mmx
OP_PIXELS put, 4
OP_PIXELS put, 8
+%if ARCH_X86_32
OP_PIXELS avg, 8
+%endif
OP_PIXELS put, 16
OP_PIXELS avg, 16
diff --git a/libavcodec/x86/hpeldsp.asm b/libavcodec/x86/hpeldsp.asm
index ce5d7a4e28..97f9f06539 100644
--- a/libavcodec/x86/hpeldsp.asm
+++ b/libavcodec/x86/hpeldsp.asm
@@ -83,8 +83,10 @@ cglobal put_pixels8_x2, 4,5
INIT_MMX mmxext
PUT_PIXELS8_X2
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_PIXELS8_X2
+%endif
; void ff_put_pixels16_x2(uint8_t *block, const uint8_t *pixels, ptrdiff_t line_size, int h)
@@ -127,8 +129,10 @@ cglobal put_pixels16_x2, 4,5
INIT_MMX mmxext
PUT_PIXELS_16
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_PIXELS_16
+%endif
; The 8_X2 macro can easily be used here
INIT_XMM sse2
PUT_PIXELS8_X2
@@ -171,8 +175,10 @@ cglobal put_no_rnd_pixels8_x2, 4,5
INIT_MMX mmxext
PUT_NO_RND_PIXELS8_X2
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_NO_RND_PIXELS8_X2
+%endif
; void ff_put_pixels8_y2(uint8_t *block, const uint8_t *pixels, ptrdiff_t line_size, int h)
@@ -209,8 +215,10 @@ cglobal put_pixels8_y2, 4,5
INIT_MMX mmxext
PUT_PIXELS8_Y2
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_PIXELS8_Y2
+%endif
; actually, put_pixels16_y2_sse2
INIT_XMM sse2
PUT_PIXELS8_Y2
@@ -249,8 +257,10 @@ cglobal put_no_rnd_pixels8_y2, 4,5
INIT_MMX mmxext
PUT_NO_RND_PIXELS8_Y2
+%if ARCH_X86_32
INIT_MMX 3dnow
PUT_NO_RND_PIXELS8_Y2
+%endif
; void ff_avg_pixels8(uint8_t *block, const uint8_t *pixels, ptrdiff_t line_size, int h)
@@ -279,8 +289,10 @@ cglobal avg_pixels8, 4,5
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX 3dnow
AVG_PIXELS8
+%endif
; void ff_avg_pixels8_x2(uint8_t *block, const uint8_t *pixels, ptrdiff_t line_size, int h)
@@ -335,12 +347,16 @@ cglobal avg_pixels8_x2, 4,5
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
AVG_PIXELS8_X2
+%endif
INIT_MMX mmxext
AVG_PIXELS8_X2
+%if ARCH_X86_32
INIT_MMX 3dnow
AVG_PIXELS8_X2
+%endif
; actually avg_pixels16_x2
INIT_XMM sse2
AVG_PIXELS8_X2
@@ -384,8 +400,10 @@ cglobal avg_pixels8_y2, 4,5
INIT_MMX mmxext
AVG_PIXELS8_Y2
+%if ARCH_X86_32
INIT_MMX 3dnow
AVG_PIXELS8_Y2
+%endif
; actually avg_pixels16_y2
INIT_XMM sse2
AVG_PIXELS8_Y2
@@ -433,8 +451,10 @@ cglobal avg_approx_pixels8_xy2, 4,5
INIT_MMX mmxext
AVG_APPROX_PIXELS8_XY2
+%if ARCH_X86_32
INIT_MMX 3dnow
AVG_APPROX_PIXELS8_XY2
+%endif
; void ff_avg_pixels16_xy2(uint8_t *block, const uint8_t *pixels, ptrdiff_t line_size, int h)
@@ -517,8 +537,10 @@ cglobal %1_pixels8_xy2, 4,5
INIT_MMX mmxext
SET_PIXELS_XY2 avg
+%if ARCH_X86_32
INIT_MMX 3dnow
SET_PIXELS_XY2 avg
+%endif
INIT_XMM sse2
SET_PIXELS_XY2 put
SET_PIXELS_XY2 avg
diff --git a/libavcodec/x86/hpeldsp_init.c b/libavcodec/x86/hpeldsp_init.c
index 6336587281..06ba5390d7 100644
--- a/libavcodec/x86/hpeldsp_init.c
+++ b/libavcodec/x86/hpeldsp_init.c
@@ -131,19 +131,25 @@ CALL_2X_PIXELS(put_no_rnd_pixels16_xy2_mmx, put_no_rnd_pixels8_xy2_mmx, 8)
#undef DEF
#define DEF(x, y) ff_ ## x ## _ ## y ## _mmx
#define STATIC
+#if ARCH_X86_64
+#define NO_AVG
+#endif
#include "rnd_template.c"
+#undef NO_AVG
#undef DEF
#undef SET_RND
#undef PAVGBP
#undef PAVGB
#if HAVE_MMX
+#if ARCH_X86_32
CALL_2X_PIXELS(avg_pixels16_y2_mmx, avg_pixels8_y2_mmx, 8)
CALL_2X_PIXELS(put_pixels16_y2_mmx, put_pixels8_y2_mmx, 8)
CALL_2X_PIXELS_EXPORT(ff_avg_pixels16_xy2_mmx, ff_avg_pixels8_xy2_mmx, 8)
+#endif
CALL_2X_PIXELS_EXPORT(ff_put_pixels16_xy2_mmx, ff_put_pixels8_xy2_mmx, 8)
#endif
@@ -162,38 +168,49 @@ CALL_2X_PIXELS_EXPORT(ff_put_pixels16_xy2_mmx, ff_put_pixels8_xy2_mmx, 8)
CALL_2X_PIXELS(avg_pixels16_xy2 ## CPUEXT, ff_avg_pixels8_xy2 ## CPUEXT, 8) \
CALL_2X_PIXELS(avg_approx_pixels16_xy2## CPUEXT, ff_avg_approx_pixels8_xy2## CPUEXT, 8)
+#if ARCH_X86_32
HPELDSP_AVG_PIXELS16(_3dnow)
+#endif
HPELDSP_AVG_PIXELS16(_mmxext)
#endif /* HAVE_X86ASM */
#define SET_HPEL_FUNCS_EXT(PFX, IDX, SIZE, CPU) \
if (HAVE_MMX_EXTERNAL) \
- c->PFX ## _pixels_tab IDX [0] = PFX ## _pixels ## SIZE ## _ ## CPU;
+ c->PFX ## _pixels_tab IDX [0] = PFX ## _pixels ## SIZE ## _ ## CPU
#if HAVE_MMX_INLINE
-#define SET_HPEL_FUNCS(PFX, IDX, SIZE, CPU) \
+#define SET_HPEL_FUNCS03(PFX, IDX, SIZE, CPU) \
+ do { \
+ SET_HPEL_FUNCS_EXT(PFX, IDX, SIZE, CPU); \
+ c->PFX ## _pixels_tab IDX [3] = PFX ## _pixels ## SIZE ## _xy2_ ## CPU; \
+ } while (0)
+#define SET_HPEL_FUNCS12(PFX, IDX, SIZE, CPU) \
do { \
- SET_HPEL_FUNCS_EXT(PFX, IDX, SIZE, CPU) \
c->PFX ## _pixels_tab IDX [1] = PFX ## _pixels ## SIZE ## _x2_ ## CPU; \
c->PFX ## _pixels_tab IDX [2] = PFX ## _pixels ## SIZE ## _y2_ ## CPU; \
- c->PFX ## _pixels_tab IDX [3] = PFX ## _pixels ## SIZE ## _xy2_ ## CPU; \
} while (0)
#else
+#define SET_HPEL_FUNCS03(PFX, IDX, SIZE, CPU) SET_HPEL_FUNCS_EXT(PFX, IDX, SIZE, CPU)
+#define SET_HPEL_FUNCS12(PFX, IDX, SIZE, CPU) ((void)0)
+#endif
#define SET_HPEL_FUNCS(PFX, IDX, SIZE, CPU) \
do { \
- SET_HPEL_FUNCS_EXT(PFX, IDX, SIZE, CPU) \
+ SET_HPEL_FUNCS03(PFX, IDX, SIZE, CPU); \
+ SET_HPEL_FUNCS12(PFX, IDX, SIZE, CPU); \
} while (0)
-#endif
static void hpeldsp_init_mmx(HpelDSPContext *c, int flags)
{
- SET_HPEL_FUNCS(put, [0], 16, mmx);
+ SET_HPEL_FUNCS03(put, [0], 16, mmx);
SET_HPEL_FUNCS(put_no_rnd, [0], 16, mmx);
- SET_HPEL_FUNCS(avg, [0], 16, mmx);
SET_HPEL_FUNCS(avg_no_rnd, , 16, mmx);
- SET_HPEL_FUNCS(put, [1], 8, mmx);
+ SET_HPEL_FUNCS03(put, [1], 8, mmx);
SET_HPEL_FUNCS(put_no_rnd, [1], 8, mmx);
+#if ARCH_X86_32
+ SET_HPEL_FUNCS12(put, [0], 16, mmx);
+ SET_HPEL_FUNCS12(put, [1], 8, mmx);
+ SET_HPEL_FUNCS(avg, [0], 16, mmx);
if (HAVE_MMX_EXTERNAL) {
c->avg_pixels_tab[1][0] = ff_avg_pixels8_mmx;
c->avg_pixels_tab[1][1] = ff_avg_pixels8_x2_mmx;
@@ -202,6 +219,7 @@ static void hpeldsp_init_mmx(HpelDSPContext *c, int flags)
c->avg_pixels_tab[1][2] = avg_pixels8_y2_mmx;
c->avg_pixels_tab[1][3] = ff_avg_pixels8_xy2_mmx;
#endif
+#endif
}
static void hpeldsp_init_mmxext(HpelDSPContext *c, int flags)
@@ -237,7 +255,7 @@ static void hpeldsp_init_mmxext(HpelDSPContext *c, int flags)
static void hpeldsp_init_3dnow(HpelDSPContext *c, int flags)
{
-#if HAVE_AMD3DNOW_EXTERNAL
+#if HAVE_AMD3DNOW_EXTERNAL && ARCH_X86_32
c->put_pixels_tab[0][1] = ff_put_pixels16_x2_3dnow;
c->put_pixels_tab[0][2] = put_pixels16_y2_3dnow;
@@ -263,7 +281,7 @@ static void hpeldsp_init_3dnow(HpelDSPContext *c, int flags)
c->avg_pixels_tab[0][3] = avg_approx_pixels16_xy2_3dnow;
c->avg_pixels_tab[1][3] = ff_avg_approx_pixels8_xy2_3dnow;
}
-#endif /* HAVE_AMD3DNOW_EXTERNAL */
+#endif /* HAVE_AMD3DNOW_EXTERNAL && ARCH_X86_32 */
}
static void hpeldsp_init_sse2_fast(HpelDSPContext *c, int flags)
diff --git a/libavcodec/x86/rnd_template.c b/libavcodec/x86/rnd_template.c
index 09946bd23f..b825eeba6e 100644
--- a/libavcodec/x86/rnd_template.c
+++ b/libavcodec/x86/rnd_template.c
@@ -97,6 +97,7 @@ av_unused STATIC void DEF(put, pixels8_xy2)(uint8_t *block, const uint8_t *pixel
:FF_REG_a, "memory");
}
+#ifndef NO_AVG
// avg_pixels
// this routine is 'slightly' suboptimal but mostly unused
av_unused STATIC void DEF(avg, pixels8_xy2)(uint8_t *block, const uint8_t *pixels,
@@ -173,3 +174,4 @@ av_unused STATIC void DEF(avg, pixels8_xy2)(uint8_t *block, const uint8_t *pixel
:"D"(block), "r"((x86_reg)line_size)
:FF_REG_a, "memory");
}
+#endif
--
2.34.1
* [FFmpeg-devel] [PATCH 32/41] avcodec/x86/h264_qpel: Make functions only used here static
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (30 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 31/41] avcodec/x86/hpeldsp_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 33/41] avcodec/x86/h264_qpel: Disable overridden functions on x64 Andreas Rheinhardt
` (9 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/h264_qpel.c | 44 ++++++++++++++++++--------------------
1 file changed, 21 insertions(+), 23 deletions(-)
diff --git a/libavcodec/x86/h264_qpel.c b/libavcodec/x86/h264_qpel.c
index dda50ded89..fd1070247b 100644
--- a/libavcodec/x86/h264_qpel.c
+++ b/libavcodec/x86/h264_qpel.c
@@ -409,13 +409,11 @@ H264_MC_816(H264_MC_HV, ssse3)
void ff_ ## OP ## _h264_qpel ## NUM ## _ ## TYPE ## _ ## DEPTH ## _ ## OPT \
(uint8_t *dst, const uint8_t *src, ptrdiff_t stride);
-#define LUMA_MC_ALL(DEPTH, TYPE, OPT) \
+#define LUMA_MC_48(DEPTH, TYPE, OPT) \
LUMA_MC_OP(put, 4, DEPTH, TYPE, OPT) \
LUMA_MC_OP(avg, 4, DEPTH, TYPE, OPT) \
LUMA_MC_OP(put, 8, DEPTH, TYPE, OPT) \
- LUMA_MC_OP(avg, 8, DEPTH, TYPE, OPT) \
- LUMA_MC_OP(put, 16, DEPTH, TYPE, OPT) \
- LUMA_MC_OP(avg, 16, DEPTH, TYPE, OPT)
+ LUMA_MC_OP(avg, 8, DEPTH, TYPE, OPT)
#define LUMA_MC_816(DEPTH, TYPE, OPT) \
LUMA_MC_OP(put, 8, DEPTH, TYPE, OPT) \
@@ -423,22 +421,22 @@ void ff_ ## OP ## _h264_qpel ## NUM ## _ ## TYPE ## _ ## DEPTH ## _ ## OPT \
LUMA_MC_OP(put, 16, DEPTH, TYPE, OPT) \
LUMA_MC_OP(avg, 16, DEPTH, TYPE, OPT)
-LUMA_MC_ALL(10, mc00, mmxext)
-LUMA_MC_ALL(10, mc10, mmxext)
-LUMA_MC_ALL(10, mc20, mmxext)
-LUMA_MC_ALL(10, mc30, mmxext)
-LUMA_MC_ALL(10, mc01, mmxext)
-LUMA_MC_ALL(10, mc11, mmxext)
-LUMA_MC_ALL(10, mc21, mmxext)
-LUMA_MC_ALL(10, mc31, mmxext)
-LUMA_MC_ALL(10, mc02, mmxext)
-LUMA_MC_ALL(10, mc12, mmxext)
-LUMA_MC_ALL(10, mc22, mmxext)
-LUMA_MC_ALL(10, mc32, mmxext)
-LUMA_MC_ALL(10, mc03, mmxext)
-LUMA_MC_ALL(10, mc13, mmxext)
-LUMA_MC_ALL(10, mc23, mmxext)
-LUMA_MC_ALL(10, mc33, mmxext)
+LUMA_MC_48(10, mc00, mmxext)
+LUMA_MC_48(10, mc10, mmxext)
+LUMA_MC_48(10, mc20, mmxext)
+LUMA_MC_48(10, mc30, mmxext)
+LUMA_MC_48(10, mc01, mmxext)
+LUMA_MC_48(10, mc11, mmxext)
+LUMA_MC_48(10, mc21, mmxext)
+LUMA_MC_48(10, mc31, mmxext)
+LUMA_MC_48(10, mc02, mmxext)
+LUMA_MC_48(10, mc12, mmxext)
+LUMA_MC_48(10, mc22, mmxext)
+LUMA_MC_48(10, mc32, mmxext)
+LUMA_MC_48(10, mc03, mmxext)
+LUMA_MC_48(10, mc13, mmxext)
+LUMA_MC_48(10, mc23, mmxext)
+LUMA_MC_48(10, mc33, mmxext)
LUMA_MC_816(10, mc00, sse2)
LUMA_MC_816(10, mc10, sse2)
@@ -464,7 +462,7 @@ LUMA_MC_816(10, mc23, sse2)
LUMA_MC_816(10, mc33, sse2)
#define QPEL16_OPMC(OP, MC, MMX)\
-void ff_ ## OP ## _h264_qpel16_ ## MC ## _10_ ## MMX(uint8_t *dst, const uint8_t *src, ptrdiff_t stride){\
+static void OP ## _h264_qpel16_ ## MC ## _10_ ## MMX(uint8_t *dst, const uint8_t *src, ptrdiff_t stride){\
ff_ ## OP ## _h264_qpel8_ ## MC ## _10_ ## MMX(dst , src , stride);\
ff_ ## OP ## _h264_qpel8_ ## MC ## _10_ ## MMX(dst+16, src+16, stride);\
src += 8*stride;\
@@ -553,8 +551,8 @@ av_cold void ff_h264qpel_init_x86(H264QpelContext *c, int bit_depth)
SET_QPEL_FUNCS(avg_h264_qpel, 2, 4, mmxext, );
} else if (bit_depth == 10) {
#if ARCH_X86_32
- SET_QPEL_FUNCS(avg_h264_qpel, 0, 16, 10_mmxext, ff_);
- SET_QPEL_FUNCS(put_h264_qpel, 0, 16, 10_mmxext, ff_);
+ SET_QPEL_FUNCS(avg_h264_qpel, 0, 16, 10_mmxext, );
+ SET_QPEL_FUNCS(put_h264_qpel, 0, 16, 10_mmxext, );
SET_QPEL_FUNCS(put_h264_qpel, 1, 8, 10_mmxext, ff_);
SET_QPEL_FUNCS(avg_h264_qpel, 1, 8, 10_mmxext, ff_);
#endif
--
2.34.1
* [FFmpeg-devel] [PATCH 33/41] avcodec/x86/h264_qpel: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (31 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 32/41] avcodec/x86/h264_qpel: Make functions only used here static Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 34/41] avcodec/x86/h264chroma_init: " Andreas Rheinhardt
` (8 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables several MMXEXT
functions (that are overridden by SSE2 functions)
at compile time for x64.
Note that some 10-bit SSE2 functions are overridden by sse2_cache64
functions in the same code block. This is suboptimal: either the
overridden functions should be removed or the sse2_cache64
functions should be put behind suitable checks. This commit does neither.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
I would love to get input on what to do with these sse2_cache64
functions. If no one says anything, I will send a patch that
retains the current behaviour and removes the functions
overridden by the sse2_cache64 functions.
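To make the two options concrete, here is a minimal sketch. It is not part
of the patch: the SketchQpel type, the put_mc10_* functions and in
particular the SKETCH_CPU_CACHE64 flag are hypothetical, since which check
would be "suitable" is exactly the open question. The first init function
shows an assignment that is always replaced within the same block; the
second puts the cache64 variant behind a check so that both variants remain
reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags; the real condition for cache64 is an open question. */
#define SKETCH_CPU_SSE2    (1 << 0)
#define SKETCH_CPU_CACHE64 (1 << 1)

typedef struct SketchQpel {
    int (*put_mc10)(void);
} SketchQpel;

static int put_mc10_sse2(void)         { return 1; }
static int put_mc10_sse2_cache64(void) { return 2; }

/* Pattern described in the message: the second store always wins. */
static void sketch_init_unconditional(SketchQpel *c, int cpu_flags)
{
    assert(c != NULL);
    if (cpu_flags & SKETCH_CPU_SSE2) {
        c->put_mc10 = put_mc10_sse2;         /* never observable */
        c->put_mc10 = put_mc10_sse2_cache64; /* always the result */
    }
}

/* Alternative: keep both, but guard the cache64 variant. */
static void sketch_init_guarded(SketchQpel *c, int cpu_flags)
{
    assert(c != NULL);
    if (cpu_flags & SKETCH_CPU_SSE2) {
        c->put_mc10 = put_mc10_sse2;
        if (cpu_flags & SKETCH_CPU_CACHE64)
            c->put_mc10 = put_mc10_sse2_cache64;
    }
}
```

Removing the plain SSE2 variant instead would keep the behaviour of the
first function while dropping the dead assignment.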
libavcodec/x86/h264_qpel.c | 44 +++++++++++++++++++++----------
libavcodec/x86/h264_qpel_8bit.asm | 4 +++
2 files changed, 34 insertions(+), 14 deletions(-)
diff --git a/libavcodec/x86/h264_qpel.c b/libavcodec/x86/h264_qpel.c
index fd1070247b..cb5f8a126c 100644
--- a/libavcodec/x86/h264_qpel.c
+++ b/libavcodec/x86/h264_qpel.c
@@ -236,7 +236,11 @@ static av_always_inline void ff_ ## OPNAME ## h264_qpel16_hv_lowpass_ ## MMX(uin
#define ff_put_h264_qpel8or16_hv2_lowpass_sse2 ff_put_h264_qpel8or16_hv2_lowpass_mmxext
#define ff_avg_h264_qpel8or16_hv2_lowpass_sse2 ff_avg_h264_qpel8or16_hv2_lowpass_mmxext
-#define H264_MC(OPNAME, SIZE, MMX, ALIGN) \
+#define H264_MC_C_H(OPNAME, SIZE, MMX, ALIGN) \
+H264_MC_C(OPNAME, SIZE, MMX, ALIGN)\
+H264_MC_H(OPNAME, SIZE, MMX, ALIGN)\
+
+#define H264_MC_C_V_H_HV(OPNAME, SIZE, MMX, ALIGN) \
H264_MC_C(OPNAME, SIZE, MMX, ALIGN)\
H264_MC_V(OPNAME, SIZE, MMX, ALIGN)\
H264_MC_H(OPNAME, SIZE, MMX, ALIGN)\
@@ -372,13 +376,9 @@ static void OPNAME ## h264_qpel ## SIZE ## _mc32_ ## MMX(uint8_t *dst, const uin
ff_ ## OPNAME ## pixels ## SIZE ## _l2_shift5_mmxext(dst, halfV+3, halfHV, stride, SIZE, SIZE);\
}\
-#define H264_MC_4816(MMX)\
-H264_MC(put_, 4, MMX, 8)\
-H264_MC(put_, 8, MMX, 8)\
-H264_MC(put_, 16,MMX, 8)\
-H264_MC(avg_, 4, MMX, 8)\
-H264_MC(avg_, 8, MMX, 8)\
-H264_MC(avg_, 16,MMX, 8)\
+#define H264_MC(QPEL, SIZE, MMX, ALIGN)\
+QPEL(put_, SIZE, MMX, ALIGN) \
+QPEL(avg_, SIZE, MMX, ALIGN) \
#define H264_MC_816(QPEL, XMM)\
QPEL(put_, 8, XMM, 16)\
@@ -397,7 +397,14 @@ QPEL_H264_H_XMM(avg_,AVG_MMXEXT_OP, ssse3)
QPEL_H264_HV_XMM(put_, PUT_OP, ssse3)
QPEL_H264_HV_XMM(avg_,AVG_MMXEXT_OP, ssse3)
-H264_MC_4816(mmxext)
+H264_MC(H264_MC_C_V_H_HV, 4, mmxext, 8)
+#if ARCH_X86_32
+H264_MC(H264_MC_C_V_H_HV, 8, mmxext, 8)
+H264_MC(H264_MC_C_V_H_HV, 16, mmxext, 8)
+#else
+H264_MC(H264_MC_C_H, 8, mmxext, 8)
+H264_MC(H264_MC_C_H, 16, mmxext, 8)
+#endif
H264_MC_816(H264_MC_V, sse2)
H264_MC_816(H264_MC_HV, sse2)
H264_MC_816(H264_MC_H, ssse3)
@@ -499,12 +506,16 @@ QPEL16(mmxext)
#endif /* HAVE_X86ASM */
-#define SET_QPEL_FUNCS(PFX, IDX, SIZE, CPU, PREFIX) \
+#define SET_QPEL_FUNCS0123(PFX, IDX, SIZE, CPU, PREFIX) \
do { \
c->PFX ## _pixels_tab[IDX][ 0] = PREFIX ## PFX ## SIZE ## _mc00_ ## CPU; \
c->PFX ## _pixels_tab[IDX][ 1] = PREFIX ## PFX ## SIZE ## _mc10_ ## CPU; \
c->PFX ## _pixels_tab[IDX][ 2] = PREFIX ## PFX ## SIZE ## _mc20_ ## CPU; \
c->PFX ## _pixels_tab[IDX][ 3] = PREFIX ## PFX ## SIZE ## _mc30_ ## CPU; \
+ } while (0)
+#define SET_QPEL_FUNCS(PFX, IDX, SIZE, CPU, PREFIX) \
+ do { \
+ SET_QPEL_FUNCS0123(PFX, IDX, SIZE, CPU, PREFIX); \
c->PFX ## _pixels_tab[IDX][ 4] = PREFIX ## PFX ## SIZE ## _mc01_ ## CPU; \
c->PFX ## _pixels_tab[IDX][ 5] = PREFIX ## PFX ## SIZE ## _mc11_ ## CPU; \
c->PFX ## _pixels_tab[IDX][ 6] = PREFIX ## PFX ## SIZE ## _mc21_ ## CPU; \
@@ -543,11 +554,16 @@ av_cold void ff_h264qpel_init_x86(H264QpelContext *c, int bit_depth)
if (EXTERNAL_MMXEXT(cpu_flags)) {
if (!high_bit_depth) {
- SET_QPEL_FUNCS(put_h264_qpel, 0, 16, mmxext, );
- SET_QPEL_FUNCS(put_h264_qpel, 1, 8, mmxext, );
+#if ARCH_X86_32
+#define SET_MMXEXT_QPEL_FUNCS(PFX, IDX, SIZE, CPU, PREFIX) SET_QPEL_FUNCS(PFX, IDX, SIZE, CPU, PREFIX)
+#else
+#define SET_MMXEXT_QPEL_FUNCS(PFX, IDX, SIZE, CPU, PREFIX) SET_QPEL_FUNCS0123(PFX, IDX, SIZE, CPU, PREFIX)
+#endif
+ SET_MMXEXT_QPEL_FUNCS(put_h264_qpel, 0, 16, mmxext, );
+ SET_MMXEXT_QPEL_FUNCS(put_h264_qpel, 1, 8, mmxext, );
SET_QPEL_FUNCS(put_h264_qpel, 2, 4, mmxext, );
- SET_QPEL_FUNCS(avg_h264_qpel, 0, 16, mmxext, );
- SET_QPEL_FUNCS(avg_h264_qpel, 1, 8, mmxext, );
+ SET_MMXEXT_QPEL_FUNCS(avg_h264_qpel, 0, 16, mmxext, );
+ SET_MMXEXT_QPEL_FUNCS(avg_h264_qpel, 1, 8, mmxext, );
SET_QPEL_FUNCS(avg_h264_qpel, 2, 4, mmxext, );
} else if (bit_depth == 10) {
#if ARCH_X86_32
diff --git a/libavcodec/x86/h264_qpel_8bit.asm b/libavcodec/x86/h264_qpel_8bit.asm
index 03c7d88f8c..72e98248d8 100644
--- a/libavcodec/x86/h264_qpel_8bit.asm
+++ b/libavcodec/x86/h264_qpel_8bit.asm
@@ -461,9 +461,11 @@ cglobal %1_h264_qpel8or16_v_lowpass_op, 5,5,8 ; dst, src, dstStride, srcStride,
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
QPEL8OR16_V_LOWPASS_OP put
QPEL8OR16_V_LOWPASS_OP avg
+%endif
INIT_XMM sse2
QPEL8OR16_V_LOWPASS_OP put
@@ -581,8 +583,10 @@ cglobal %1_h264_qpel8or16_hv1_lowpass_op, 4,4,8 ; src, tmp, srcStride, size
REP_RET
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
QPEL8OR16_HV1_LOWPASS_OP put
+%endif
INIT_XMM sse2
QPEL8OR16_HV1_LOWPASS_OP put
--
2.34.1
* [FFmpeg-devel] [PATCH 34/41] avcodec/x86/h264chroma_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (32 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 33/41] avcodec/x86/h264_qpel: Disable overridden functions on x64 Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 35/41] swresample/x86/audio_convert_init: " Andreas Rheinhardt
` (7 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the 3dnow implementation (which is overridden by the MMXEXT
specific implementation) at compile-time for x64.
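The override pattern and the effect of the guard can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the real init function: `FLAG_3DNOW`, `FLAG_MMXEXT`, `chroma_fn`, `select_avg_mc8` and the `avg_mc8_*` stubs are hypothetical, and `ARCH_X86_32` defaults to 0 here only so that the snippet builds on its own.

```c
#include <stddef.h>

/* Assumption for this sketch: default to an x64 build unless told otherwise. */
#ifndef ARCH_X86_32
#define ARCH_X86_32 0
#endif

/* Hypothetical flag bits standing in for the real AV_CPU_FLAG_* values. */
enum { FLAG_3DNOW = 1 << 0, FLAG_MMXEXT = 1 << 1 };

typedef int (*chroma_fn)(void);

static int avg_mc8_c(void)      { return 0; }
#if ARCH_X86_32
static int avg_mc8_3dnow(void)  { return 1; }  /* not even compiled on x64 */
#endif
static int avg_mc8_mmxext(void) { return 2; }

static chroma_fn select_avg_mc8(int cpu_flags)
{
    chroma_fn fn = avg_mc8_c;
#if ARCH_X86_32
    if (cpu_flags & FLAG_3DNOW)
        fn = avg_mc8_3dnow;
#endif
    /* Later checks win: whenever MMXEXT is present, it replaces the above. */
    if (cpu_flags & FLAG_MMXEXT)
        fn = avg_mc8_mmxext;
    return fn;
}
```

Since MMXEXT is always available on x64, the 3dnow branch could only ever matter there if MMXEXT were explicitly disabled, which is why compiling it out loses nothing in practice.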
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavcodec/x86/h264_chromamc.asm | 4 ++--
libavcodec/x86/h264chroma_init.c | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/libavcodec/x86/h264_chromamc.asm b/libavcodec/x86/h264_chromamc.asm
index d59c183371..5704deed1c 100644
--- a/libavcodec/x86/h264_chromamc.asm
+++ b/libavcodec/x86/h264_chromamc.asm
@@ -446,14 +446,14 @@ chroma_mc4_mmx_func avg, h264
chroma_mc4_mmx_func avg, rv40
chroma_mc2_mmx_func avg, h264
+%if ARCH_X86_32
INIT_MMX 3dnow
chroma_mc8_mmx_func avg, h264, _rnd
-%if ARCH_X86_32
chroma_mc8_mmx_func avg, vc1, _nornd
chroma_mc8_mmx_func avg, rv40
chroma_mc4_mmx_func avg, rv40
-%endif
chroma_mc4_mmx_func avg, h264
+%endif
%macro chroma_mc8_ssse3_func 2-3
cglobal %1_%2_chroma_mc8%3, 6, 7, 8
diff --git a/libavcodec/x86/h264chroma_init.c b/libavcodec/x86/h264chroma_init.c
index 36bf29df02..26883d57e9 100644
--- a/libavcodec/x86/h264chroma_init.c
+++ b/libavcodec/x86/h264chroma_init.c
@@ -77,10 +77,12 @@ av_cold void ff_h264chroma_init_x86(H264ChromaContext *c, int bit_depth)
c->put_h264_chroma_pixels_tab[1] = ff_put_h264_chroma_mc4_mmx;
}
+#if ARCH_X86_32
if (EXTERNAL_AMD3DNOW(cpu_flags) && !high_bit_depth) {
c->avg_h264_chroma_pixels_tab[0] = ff_avg_h264_chroma_mc8_rnd_3dnow;
c->avg_h264_chroma_pixels_tab[1] = ff_avg_h264_chroma_mc4_3dnow;
}
+#endif
if (EXTERNAL_MMXEXT(cpu_flags) && !high_bit_depth) {
c->avg_h264_chroma_pixels_tab[0] = ff_avg_h264_chroma_mc8_rnd_mmxext;
--
2.34.1
* [FFmpeg-devel] [PATCH 35/41] swresample/x86/audio_convert_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (33 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 34/41] avcodec/x86/h264chroma_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 36/41] swresample/x86/rematrix_init: " Andreas Rheinhardt
` (6 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by the SSE2
specific implementation) at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libswresample/x86/audio_convert.asm | 2 ++
libswresample/x86/audio_convert_init.c | 4 ++++
2 files changed, 6 insertions(+)
diff --git a/libswresample/x86/audio_convert.asm b/libswresample/x86/audio_convert.asm
index d441636d3c..d2413f9f04 100644
--- a/libswresample/x86/audio_convert.asm
+++ b/libswresample/x86/audio_convert.asm
@@ -608,6 +608,7 @@ pack_8ch_%2_to_%1_u_int %+ SUFFIX:
%macro NOP_N 0-6
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
CONV int32, int16, u, 2, 1, INT16_TO_INT32_N, NOP_N
CONV int32, int16, a, 2, 1, INT16_TO_INT32_N, NOP_N
@@ -616,6 +617,7 @@ CONV int16, int32, a, 1, 2, INT32_TO_INT16_N, NOP_N
PACK_6CH float, float, u, 2, 2, 0, NOP_N, NOP_N
PACK_6CH float, float, a, 2, 2, 0, NOP_N, NOP_N
+%endif
INIT_XMM sse
PACK_6CH float, float, u, 2, 2, 7, NOP_N, NOP_N
diff --git a/libswresample/x86/audio_convert_init.c b/libswresample/x86/audio_convert_init.c
index a7d5ab89f8..7728c9ae00 100644
--- a/libswresample/x86/audio_convert_init.c
+++ b/libswresample/x86/audio_convert_init.c
@@ -52,15 +52,19 @@ av_cold void swri_audio_convert_init_x86(struct AudioConvert *ac,
ac->simd_f = ff_int32_to_int16_a_ ## cap;\
}
+#if ARCH_X86_32
MULTI_CAPS_FUNC(MMX, mmx)
+#endif
MULTI_CAPS_FUNC(SSE2, sse2)
+#if ARCH_X86_32
if(EXTERNAL_MMX(mm_flags)) {
if(channels == 6) {
if( out_fmt == AV_SAMPLE_FMT_FLT && in_fmt == AV_SAMPLE_FMT_FLTP || out_fmt == AV_SAMPLE_FMT_S32 && in_fmt == AV_SAMPLE_FMT_S32P)
ac->simd_f = ff_pack_6ch_float_to_float_a_mmx;
}
}
+#endif
if(EXTERNAL_SSE(mm_flags)) {
if(channels == 6) {
if( out_fmt == AV_SAMPLE_FMT_FLT && in_fmt == AV_SAMPLE_FMT_FLTP || out_fmt == AV_SAMPLE_FMT_S32 && in_fmt == AV_SAMPLE_FMT_S32P)
--
2.34.1
* [FFmpeg-devel] [PATCH 36/41] swresample/x86/rematrix_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (34 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 35/41] swresample/x86/audio_convert_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 37/41] swscale/x86/rgb2rgb: " Andreas Rheinhardt
` (5 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by the SSE2
specific implementation) at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libswresample/x86/rematrix.asm | 2 ++
libswresample/x86/rematrix_init.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/libswresample/x86/rematrix.asm b/libswresample/x86/rematrix.asm
index 7984b9a729..1c657ff72f 100644
--- a/libswresample/x86/rematrix.asm
+++ b/libswresample/x86/rematrix.asm
@@ -223,11 +223,13 @@ mix_2_1_int16_u_int %+ SUFFIX:
%endmacro
+%if ARCH_X86_32
INIT_MMX mmx
MIX1_INT16 u
MIX1_INT16 a
MIX2_INT16 u
MIX2_INT16 a
+%endif
INIT_XMM sse
MIX2_FLT u
diff --git a/libswresample/x86/rematrix_init.c b/libswresample/x86/rematrix_init.c
index 0608c74e7f..3981de4277 100644
--- a/libswresample/x86/rematrix_init.c
+++ b/libswresample/x86/rematrix_init.c
@@ -43,10 +43,12 @@ av_cold int swri_rematrix_init_x86(struct SwrContext *s){
s->mix_2_1_simd = NULL;
if (s->midbuf.fmt == AV_SAMPLE_FMT_S16P){
+#if ARCH_X86_32
if(EXTERNAL_MMX(mm_flags)) {
s->mix_1_1_simd = ff_mix_1_1_a_int16_mmx;
s->mix_2_1_simd = ff_mix_2_1_a_int16_mmx;
}
+#endif
if(EXTERNAL_SSE2(mm_flags)) {
s->mix_1_1_simd = ff_mix_1_1_a_int16_sse2;
s->mix_2_1_simd = ff_mix_2_1_a_int16_sse2;
--
2.34.1
* [FFmpeg-devel] [PATCH 37/41] swscale/x86/rgb2rgb: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (35 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 36/41] swresample/x86/rematrix_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 38/41] swscale/x86/yuv2rgb: " Andreas Rheinhardt
` (4 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX and 3dnow implementations (overridden by MMXEXT)
and a single MMXEXT function (overridden by SSE2)
at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libswscale/x86/rgb2rgb.c | 6 ++++++
libswscale/x86/rgb2rgb_template.c | 10 ++++++----
2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/libswscale/x86/rgb2rgb.c b/libswscale/x86/rgb2rgb.c
index 0ab139aca4..d8dfbbca35 100644
--- a/libswscale/x86/rgb2rgb.c
+++ b/libswscale/x86/rgb2rgb.c
@@ -91,9 +91,11 @@ DECLARE_ALIGNED(8, extern const uint64_t, ff_bgr2UVOffset);
#define COMPILE_TEMPLATE_AVX 0
//MMX versions
+#if ARCH_X86_32
#undef RENAME
#define RENAME(a) a ## _mmx
#include "rgb2rgb_template.c"
+#endif
// MMXEXT versions
#undef RENAME
@@ -116,6 +118,7 @@ DECLARE_ALIGNED(8, extern const uint64_t, ff_bgr2UVOffset);
#define RENAME(a) a ## _avx
#include "rgb2rgb_template.c"
+#if ARCH_X86_32
//3DNOW versions
#undef RENAME
#undef COMPILE_TEMPLATE_MMXEXT
@@ -128,6 +131,7 @@ DECLARE_ALIGNED(8, extern const uint64_t, ff_bgr2UVOffset);
#define COMPILE_TEMPLATE_AMD3DNOW 1
#define RENAME(a) a ## _3dnow
#include "rgb2rgb_template.c"
+#endif
/*
RGB15->RGB16 original by Strepto/Astral
@@ -165,10 +169,12 @@ av_cold void rgb2rgb_init_x86(void)
int cpu_flags = av_get_cpu_flags();
#if HAVE_INLINE_ASM
+#if ARCH_X86_32
if (INLINE_MMX(cpu_flags))
rgb2rgb_init_mmx();
if (INLINE_AMD3DNOW(cpu_flags))
rgb2rgb_init_3dnow();
+#endif
if (INLINE_MMXEXT(cpu_flags))
rgb2rgb_init_mmxext();
if (INLINE_SSE2(cpu_flags))
diff --git a/libswscale/x86/rgb2rgb_template.c b/libswscale/x86/rgb2rgb_template.c
index ae2469e663..ae7af550e0 100644
--- a/libswscale/x86/rgb2rgb_template.c
+++ b/libswscale/x86/rgb2rgb_template.c
@@ -1822,7 +1822,7 @@ static inline void RENAME(rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_
#endif /* HAVE_7REGS */
#endif /* !COMPILE_TEMPLATE_SSE2 */
-#if !COMPILE_TEMPLATE_AMD3DNOW && !COMPILE_TEMPLATE_AVX
+#if !COMPILE_TEMPLATE_AMD3DNOW && !COMPILE_TEMPLATE_AVX && (ARCH_X86_32 || COMPILE_TEMPLATE_SSE2)
static void RENAME(interleaveBytes)(const uint8_t *src1, const uint8_t *src2, uint8_t *dest,
int width, int height, int src1Stride,
int src2Stride, int dstStride)
@@ -2185,7 +2185,7 @@ static void RENAME(extract_odd)(const uint8_t *src, uint8_t *dst, x86_reg count)
}
}
-#if !COMPILE_TEMPLATE_AMD3DNOW
+#if !COMPILE_TEMPLATE_AMD3DNOW && ARCH_X86_32
static void RENAME(extract_even2)(const uint8_t *src, uint8_t *dst0, uint8_t *dst1, x86_reg count)
{
dst0+= count;
@@ -2465,7 +2465,7 @@ static void RENAME(uyvytoyuv420)(uint8_t *ydst, uint8_t *udst, uint8_t *vdst, co
);
}
-#if !COMPILE_TEMPLATE_AMD3DNOW
+#if !COMPILE_TEMPLATE_AMD3DNOW && ARCH_X86_32
static void RENAME(uyvytoyuv422)(uint8_t *ydst, uint8_t *udst, uint8_t *vdst, const uint8_t *src,
int width, int height,
int lumStride, int chromStride, int srcStride)
@@ -2519,7 +2519,9 @@ static av_cold void RENAME(rgb2rgb_init)(void)
yuy2toyv12 = RENAME(yuy2toyv12);
vu9_to_vu12 = RENAME(vu9_to_vu12);
yvu9_to_yuy2 = RENAME(yvu9_to_yuy2);
+#if ARCH_X86_32
uyvytoyuv422 = RENAME(uyvytoyuv422);
+#endif
yuyvtoyuv422 = RENAME(yuyvtoyuv422);
#endif /* !COMPILE_TEMPLATE_AMD3DNOW */
@@ -2534,7 +2536,7 @@ static av_cold void RENAME(rgb2rgb_init)(void)
uyvytoyuv420 = RENAME(uyvytoyuv420);
#endif /* !COMPILE_TEMPLATE_SSE2 */
-#if !COMPILE_TEMPLATE_AMD3DNOW && !COMPILE_TEMPLATE_AVX
+#if !COMPILE_TEMPLATE_AMD3DNOW && !COMPILE_TEMPLATE_AVX && (ARCH_X86_32 || COMPILE_TEMPLATE_SSE2)
interleaveBytes = RENAME(interleaveBytes);
#endif /* !COMPILE_TEMPLATE_AMD3DNOW && !COMPILE_TEMPLATE_AVX */
#if !COMPILE_TEMPLATE_AVX || HAVE_AVX_EXTERNAL
--
2.34.1
* [FFmpeg-devel] [PATCH 38/41] swscale/x86/yuv2rgb: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (36 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 37/41] swscale/x86/rgb2rgb: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 39/41] swscale/x86/swscale: " Andreas Rheinhardt
` (3 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
some MMX functions that are overridden by MMXEXT
at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libswscale/x86/yuv2rgb.c | 13 +++++++------
libswscale/x86/yuv2rgb_template.c | 5 +++--
libswscale/x86/yuv_2_rgb.asm | 5 +++++
3 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c
index 47f45bd7c2..60dcb4b7be 100644
--- a/libswscale/x86/yuv2rgb.c
+++ b/libswscale/x86/yuv2rgb.c
@@ -44,23 +44,22 @@
//MMX versions
#if HAVE_MMX
#undef RENAME
-#undef COMPILE_TEMPLATE_MMXEXT
-#define COMPILE_TEMPLATE_MMXEXT 0
+#define COMPILE_TEMPLATE_MMX
#define RENAME(a) a ## _mmx
#include "yuv2rgb_template.c"
+#undef COMPILE_TEMPLATE_MMX
#endif /* HAVE_MMX */
// MMXEXT versions
#undef RENAME
-#undef COMPILE_TEMPLATE_MMXEXT
-#define COMPILE_TEMPLATE_MMXEXT 1
+#define COMPILE_TEMPLATE_MMXEXT
#define RENAME(a) a ## _mmxext
#include "yuv2rgb_template.c"
+#undef COMPILE_TEMPLATE_MMXEXT
//SSSE3 versions
#undef RENAME
-#undef COMPILE_TEMPLATE_MMXEXT
-#define COMPILE_TEMPLATE_MMXEXT 0
+#define COMPILE_TEMPLATE_SSSE3
#define RENAME(a) a ## _ssse3
#include "yuv2rgb_template.c"
@@ -127,10 +126,12 @@ av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)
break;
} else
return yuv420_bgr32_mmx;
+#if ARCH_X86_32
case AV_PIX_FMT_RGB24:
return yuv420_rgb24_mmx;
case AV_PIX_FMT_BGR24:
return yuv420_bgr24_mmx;
+#endif
case AV_PIX_FMT_RGB565:
return yuv420_rgb16_mmx;
case AV_PIX_FMT_RGB555:
diff --git a/libswscale/x86/yuv2rgb_template.c b/libswscale/x86/yuv2rgb_template.c
index d506f75e15..b60567b8f6 100644
--- a/libswscale/x86/yuv2rgb_template.c
+++ b/libswscale/x86/yuv2rgb_template.c
@@ -47,7 +47,7 @@ extern void RENAME(ff_yuv_420_bgr24)(x86_reg index, uint8_t *image, const uint8_
const uint8_t *pv_index, const uint64_t *pointer_c_dither,
const uint8_t *py_2index);
-#if !COMPILE_TEMPLATE_MMXEXT
+#ifndef COMPILE_TEMPLATE_MMXEXT
extern void RENAME(ff_yuv_420_rgb15)(x86_reg index, uint8_t *image, const uint8_t *pu_index,
const uint8_t *pv_index, const uint64_t *pointer_c_dither,
const uint8_t *py_2index);
@@ -165,6 +165,7 @@ static inline int RENAME(yuva420_bgr32)(SwsContext *c, const uint8_t *src[],
}
#endif
+#if ARCH_X86_32 || !defined(COMPILE_TEMPLATE_MMX)
static inline int RENAME(yuv420_rgb24)(SwsContext *c, const uint8_t *src[],
int srcStride[],
int srcSliceY, int srcSliceH,
@@ -192,4 +193,4 @@ static inline int RENAME(yuv420_bgr24)(SwsContext *c, const uint8_t *src[],
}
return srcSliceH;
}
-
+#endif
diff --git a/libswscale/x86/yuv_2_rgb.asm b/libswscale/x86/yuv_2_rgb.asm
index f968b3a0a2..542e3cb1ab 100644
--- a/libswscale/x86/yuv_2_rgb.asm
+++ b/libswscale/x86/yuv_2_rgb.asm
@@ -69,6 +69,9 @@ SECTION .text
%ifidn %1, yuva
%define parameters index, image, pu_index, pv_index, pointer_c_dither, py_2index, pa_2index
%define GPR_num 7
+ %else
+ %define parameters index, image, pu_index, pv_index, pointer_c_dither, py_2index
+ %define GPR_num 6
%endif
%else
%define parameters index, image, pu_index, pv_index, pointer_c_dither, py_2index
@@ -356,8 +359,10 @@ REP_RET
%endmacro
INIT_MMX mmx
+%if ARCH_X86_32
yuv2rgb_fn yuv, rgb, 24
yuv2rgb_fn yuv, bgr, 24
+%endif
yuv2rgb_fn yuv, rgb, 32
yuv2rgb_fn yuv, bgr, 32
yuv2rgb_fn yuva, rgb, 32
--
2.34.1
* [FFmpeg-devel] [PATCH 39/41] swscale/x86/swscale: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (37 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 38/41] swscale/x86/yuv2rgb: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 40/41] avfilter/x86/vf_eq_init: " Andreas Rheinhardt
` (2 subsequent siblings)
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the MMX implementation (which is overridden by MMXEXT)
at compile-time for x64.
Notice that yuv2yuvX_mmx is not removed, because it is used
by SSE3 and AVX2 as fallback in case of unaligned data and
also for tail processing. I don't know why yuv2yuvX_mmxext
isn't being used for this; an earlier version [1] of
554c2bc7086f49ef5a6a989ad6bc4bc11807eb6f used it, but
the version that was eventually applied does not.
[1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2020-November/272124.html
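To make the fallback role mentioned above concrete, here is a generic sketch of the pattern (it is not the swscale code, and the arithmetic is a placeholder): a wide routine handles the aligned bulk, while a narrower routine covers unaligned destinations and the tail. `scale_narrow`, `scale_wide` and `scale` are hypothetical names, and the 16-byte block size is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Narrow routine: no alignment or length requirements. */
static void scale_narrow(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(src[i] >> 1);
}

/* Wide routine stand-in: assumes dst is 16-byte aligned and
 * n is a multiple of 16. */
static void scale_wide(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i += 16)
        for (size_t j = 0; j < 16; j++)
            dst[i + j] = (uint8_t)(src[i + j] >> 1);
}

static void scale(uint8_t *dst, const uint8_t *src, size_t n)
{
    if ((uintptr_t)dst % 16) {      /* unaligned: narrow routine does it all */
        scale_narrow(dst, src, n);
        return;
    }
    size_t bulk = n - n % 16;       /* largest multiple of 16 not exceeding n */
    scale_wide(dst, src, bulk);
    scale_narrow(dst + bulk, src + bulk, n - bulk);  /* tail */
}
```

In such a layout the narrow routine has to stay in the binary as long as any wide routine depends on it, independently of whether it is still installed as a function pointer of its own.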
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
There is some issue with 32bit versions of some of the MMX functions
here, leading to failures with the f32le and f32be versions of
gray, gbrp and gbrap. See
https://fate.ffmpeg.org/report.cgi?time=20220609221253&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
It does not affect x64 and if SSE2 is available, the MMX functions
are overridden by SSE2 functions that don't suffer from this defect.
This is a good reason to remove these overridden functions for x86, too.
libswscale/x86/swscale.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/libswscale/x86/swscale.c b/libswscale/x86/swscale.c
index 73869355b8..5cc174694c 100644
--- a/libswscale/x86/swscale.c
+++ b/libswscale/x86/swscale.c
@@ -55,7 +55,7 @@ DECLARE_ASM_ALIGNED(8, const uint64_t, ff_w1111) = 0x0001000100010001ULL;
//MMX versions
-#if HAVE_MMX_INLINE
+#if ARCH_X86_32 && HAVE_MMX_INLINE
#undef RENAME
#define COMPILE_TEMPLATE_MMXEXT 0
#define RENAME(a) a ## _mmx
@@ -470,7 +470,7 @@ av_cold void ff_sws_init_swscale_x86(SwsContext *c)
{
int cpu_flags = av_get_cpu_flags();
-#if HAVE_MMX_INLINE
+#if ARCH_X86_32 && HAVE_MMX_INLINE
if (INLINE_MMX(cpu_flags))
sws_init_swscale_mmx(c);
#endif
@@ -479,7 +479,7 @@ av_cold void ff_sws_init_swscale_x86(SwsContext *c)
sws_init_swscale_mmxext(c);
#endif
if(c->use_mmx_vfilter && !(c->flags & SWS_ACCURATE_RND)) {
-#if HAVE_MMX_EXTERNAL
+#if ARCH_X86_32 && HAVE_MMX_EXTERNAL
if (EXTERNAL_MMX(cpu_flags))
c->yuv2planeX = yuv2yuvX_mmx;
#endif
--
2.34.1
* [FFmpeg-devel] [PATCH 40/41] avfilter/x86/vf_eq_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (38 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 39/41] swscale/x86/swscale: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 41/41] avutil/x86/pixelutils_init: " Andreas Rheinhardt
2022-06-11 20:14 ` [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
process_mmxext (overridden by process_sse2) at compile-time for x64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavfilter/x86/vf_eq.asm | 2 ++
libavfilter/x86/vf_eq_init.c | 4 ++++
2 files changed, 6 insertions(+)
diff --git a/libavfilter/x86/vf_eq.asm b/libavfilter/x86/vf_eq.asm
index a30a287029..0b55d28c6b 100644
--- a/libavfilter/x86/vf_eq.asm
+++ b/libavfilter/x86/vf_eq.asm
@@ -83,8 +83,10 @@ cglobal process_one_line, 5, 7, 5, src, dst, contrast, brightness, w
%endmacro
+%if ARCH_X86_32
INIT_MMX mmxext
PROCESS_ONE_LINE 3
+%endif
INIT_XMM sse2
PROCESS_ONE_LINE 4
diff --git a/libavfilter/x86/vf_eq_init.c b/libavfilter/x86/vf_eq_init.c
index 113056e76b..0ca341e07d 100644
--- a/libavfilter/x86/vf_eq_init.c
+++ b/libavfilter/x86/vf_eq_init.c
@@ -31,6 +31,7 @@ extern void ff_process_one_line_sse2(const uint8_t *src, uint8_t *dst, short con
short brightness, int w);
#if HAVE_X86ASM
+#if ARCH_X86_32
static void process_mmxext(EQParameters *param, uint8_t *dst, int dst_stride,
const uint8_t *src, int src_stride, int w, int h)
{
@@ -45,6 +46,7 @@ static void process_mmxext(EQParameters *param, uint8_t *dst, int dst_stride,
}
emms_c();
}
+#endif
static void process_sse2(EQParameters *param, uint8_t *dst, int dst_stride,
const uint8_t *src, int src_stride, int w, int h)
@@ -65,9 +67,11 @@ av_cold void ff_eq_init_x86(EQContext *eq)
{
#if HAVE_X86ASM
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMXEXT(cpu_flags)) {
eq->process = process_mmxext;
}
+#endif
if (EXTERNAL_SSE2(cpu_flags)) {
eq->process = process_sse2;
}
--
2.34.1
* [FFmpeg-devel] [PATCH 41/41] avutil/x86/pixelutils_init: Disable overridden functions on x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (39 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 40/41] avfilter/x86/vf_eq_init: " Andreas Rheinhardt
@ 2022-06-09 23:55 ` Andreas Rheinhardt
2022-06-11 20:14 ` [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
41 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-09 23:55 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Andreas Rheinhardt
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables
the 8x8 MMX (overridden by MMXEXT) as well as the 16x16 MMXEXT
(overridden by SSE2) sad functions at compile-time for x64.
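For reference, SAD is the sum of absolute differences between two pixel blocks. A plain-C sketch using the same argument layout as the prototypes quoted in the asm comments might look like this; `block_sad`, `sad_8x8_ref` and `sad_16x16_ref` are hypothetical names used only for illustration, not FFmpeg's own C fallback.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Reference SAD over a w x h block; strides are given in bytes. */
static int block_sad(const uint8_t *src1, ptrdiff_t stride1,
                     const uint8_t *src2, ptrdiff_t stride2,
                     int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

static int sad_8x8_ref(const uint8_t *src1, ptrdiff_t stride1,
                       const uint8_t *src2, ptrdiff_t stride2)
{
    return block_sad(src1, stride1, src2, stride2, 8, 8);
}

static int sad_16x16_ref(const uint8_t *src1, ptrdiff_t stride1,
                         const uint8_t *src2, ptrdiff_t stride2)
{
    return block_sad(src1, stride1, src2, stride2, 16, 16);
}
```

Any SIMD variant installed into the same table slot has to return the same value as such a reference, so which variant ends up selected affects speed only.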
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
libavutil/x86/pixelutils.asm | 4 ++++
libavutil/x86/pixelutils_init.c | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/libavutil/x86/pixelutils.asm b/libavutil/x86/pixelutils.asm
index 8b45ead78b..ea79186e19 100644
--- a/libavutil/x86/pixelutils.asm
+++ b/libavutil/x86/pixelutils.asm
@@ -25,6 +25,7 @@
SECTION .text
+%if ARCH_X86_32
;-------------------------------------------------------------------------------
; int ff_pixelutils_sad_8x8_mmx(const uint8_t *src1, ptrdiff_t stride1,
; const uint8_t *src2, ptrdiff_t stride2);
@@ -62,6 +63,7 @@ cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2
movd eax, m6
movzx eax, ax
RET
+%endif
;-------------------------------------------------------------------------------
; int ff_pixelutils_sad_8x8_mmxext(const uint8_t *src1, ptrdiff_t stride1,
@@ -83,6 +85,7 @@ cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2
movd eax, m2
RET
+%if ARCH_X86_32
;-------------------------------------------------------------------------------
; int ff_pixelutils_sad_16x16_mmxext(const uint8_t *src1, ptrdiff_t stride1,
; const uint8_t *src2, ptrdiff_t stride2);
@@ -102,6 +105,7 @@ cglobal pixelutils_sad_16x16, 4,4,0, src1, stride1, src2, stride2
%endrep
movd eax, m2
RET
+%endif
;-------------------------------------------------------------------------------
; int ff_pixelutils_sad_16x16_sse2(const uint8_t *src1, ptrdiff_t stride1,
diff --git a/libavutil/x86/pixelutils_init.c b/libavutil/x86/pixelutils_init.c
index 184a3a4a9f..e74d5634d7 100644
--- a/libavutil/x86/pixelutils_init.c
+++ b/libavutil/x86/pixelutils_init.c
@@ -53,9 +53,11 @@ void ff_pixelutils_sad_init_x86(av_pixelutils_sad_fn *sad, int aligned)
{
int cpu_flags = av_get_cpu_flags();
+#if ARCH_X86_32
if (EXTERNAL_MMX(cpu_flags)) {
sad[2] = ff_pixelutils_sad_8x8_mmx;
}
+#endif
// The best way to use SSE2 would be to do 2 SADs in parallel,
// but we'd have to modify the pixelutils API to return SIMD functions.
@@ -65,7 +67,9 @@ void ff_pixelutils_sad_init_x86(av_pixelutils_sad_fn *sad, int aligned)
// so just use the MMX 8x8 version even when SSE2 is available.
if (EXTERNAL_MMXEXT(cpu_flags)) {
sad[2] = ff_pixelutils_sad_8x8_mmxext;
+#if ARCH_X86_32
sad[3] = ff_pixelutils_sad_16x16_mmxext;
+#endif
}
if (EXTERNAL_SSE2(cpu_flags)) {
--
2.34.1
* Re: [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64
2022-06-09 23:15 [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
` (40 preceding siblings ...)
2022-06-09 23:55 ` [FFmpeg-devel] [PATCH 41/41] avutil/x86/pixelutils_init: " Andreas Rheinhardt
@ 2022-06-11 20:14 ` Andreas Rheinhardt
2022-06-20 11:16 ` Andreas Rheinhardt
41 siblings, 1 reply; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-11 20:14 UTC (permalink / raw)
To: ffmpeg-devel
Andreas Rheinhardt:
> x64 requires MMX, MMXEXT, SSE and SSE2; yet there is no shortage
> of code like the following:
>
> if (EXTERNAL_MMX(cpu_flags)) {
> c->ssd_int8_vs_int16 = ff_ssd_int8_vs_int16_mmx;
> }
> if (EXTERNAL_SSE2(cpu_flags)) {
> c->ssd_int8_vs_int16 = ff_ssd_int8_vs_int16_sse2;
> }
>
> Given that SSE2 is always present on x64, the only way
> for the mmx version to be chosen in the above example
> is if SSE2 has been disabled either at compile-time
> or at runtime, i.e. it is never used unless one shoots
> oneself in the foot.
> This patchset therefore disables such functions for x64
> by #if'ing them away; x86 has not been affected. This
> saves about 140KB.
>
> (Another way to handle this would be to remove every function
> that would be overridden if one had a processor capable of
> MMX, MMXEXT, SSE and SSE2. x86 processors not fulfilling
> this requirement (which are truly ancient nowadays)
> would still work, but would be slower, i.e. they would be treated
> as second-class citizens. This would have the advantage of
> avoiding #ifs and would lighten x86 binaries of code that is
> not used at all by the overwhelming majority of users.
> I'll update this patchset if it is preferred to do it that way.)
>
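For illustration, the override pattern quoted above can be sketched in self-contained C. This is only a sketch under stated assumptions: FLAG_MMX, FLAG_SSE2, ssd_c, ssd_mmx, ssd_sse2, select_ssd and check_dispatch are hypothetical stand-ins, not FFmpeg symbols, and ARCH_X86_32 is defined locally here rather than taken from FFmpeg's configuration. The "SIMD" variants simply forward to the C version so that the example stays portable.

```c
#include <assert.h>

/* Hypothetical stand-ins for CPU feature bits (not FFmpeg's AV_CPU_FLAG_*). */
#define FLAG_MMX  (1 << 0)
#define FLAG_SSE2 (1 << 1)

/* Assumption: defined locally; 0 pretends that we are building for x64. */
#ifndef ARCH_X86_32
#define ARCH_X86_32 0
#endif

typedef int (*ssd_fn)(const signed char *a, const short *b, int n);

/* Portable reference: sum of squared differences. */
static int ssd_c(const signed char *a, const short *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        int d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

#if ARCH_X86_32
/* Stand-in for an MMX version; only compiled for 32-bit x86. */
static int ssd_mmx(const signed char *a, const short *b, int n)
{
    return ssd_c(a, b, n);
}
#endif

/* Stand-in for an SSE2 version. */
static int ssd_sse2(const signed char *a, const short *b, int n)
{
    return ssd_c(a, b, n);
}

/* Later assignments override earlier ones, so whenever SSE2 is
 * reported the MMX pointer is never the one that is returned. */
static ssd_fn select_ssd(int cpu_flags)
{
    ssd_fn fn = ssd_c;
#if ARCH_X86_32
    if (cpu_flags & FLAG_MMX)
        fn = ssd_mmx;
#endif
    if (cpu_flags & FLAG_SSE2)
        fn = ssd_sse2;
    return fn;
}

/* Small self-check of the selection logic. */
void check_dispatch(void)
{
    const signed char a[4] = { 1, 2, 3, 4 };
    const short       b[4] = { 0, 0, 0, 0 };

    /* With both flags set, the SSE2 stand-in is selected. */
    assert(select_ssd(FLAG_MMX | FLAG_SSE2) == ssd_sse2);
    /* 1 + 4 + 9 + 16 = 30 */
    assert(select_ssd(FLAG_MMX | FLAG_SSE2)(a, b, 4) == 30);
}
```

With ARCH_X86_32 set to 0, the MMX stand-in is not compiled at all, which is the effect the #if guards in this patchset aim for on x64. Building the same sketch with -DARCH_X86_32=1 keeps it as a fallback for the case where only the MMX flag is reported.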
I have now implemented the other approach mentioned above (i.e. removing
everything that is overridden when SSE2 is available altogether, also for
x86-32); the result can be seen here:
https://github.com/mkver/FFmpeg/commits/mmx2
I prefer this to the old version because the reduced complexity far
outweighs the potential to slow down some ancient systems a bit (and it is
quite unlikely that these ancient systems use an up-to-date FFmpeg).
Furthermore, some of the MMX scale functions that are removed are
buggy/not bitexact. See
https://github.com/mkver/FFmpeg/commit/c5513ad962100040601b5eba0042692a740ac50a
(or shall I post these patches?)
Is anyone against this removal?
- Andreas
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64
2022-06-11 20:14 ` [FFmpeg-devel] [PATCH 00/41] Stop including superseded functions for x64 Andreas Rheinhardt
@ 2022-06-20 11:16 ` Andreas Rheinhardt
0 siblings, 0 replies; 46+ messages in thread
From: Andreas Rheinhardt @ 2022-06-20 11:16 UTC (permalink / raw)
To: ffmpeg-devel
Andreas Rheinhardt:
> Andreas Rheinhardt:
>> x64 requires MMX, MMXEXT, SSE and SSE2; yet there is no shortage
>> of code like the following:
>>
>> if (EXTERNAL_MMX(cpu_flags)) {
>> c->ssd_int8_vs_int16 = ff_ssd_int8_vs_int16_mmx;
>> }
>> if (EXTERNAL_SSE2(cpu_flags)) {
>> c->ssd_int8_vs_int16 = ff_ssd_int8_vs_int16_sse2;
>> }
>>
>> Given that SSE2 is always present on x64, the only way
>> for the mmx version to be chosen in the above example
>> is if SSE2 has been disabled either at compile-time
>> or at runtime, i.e. it is never used unless one shoots
>> oneself in the foot.
>> This patchset therefore disables such functions for x64
>> by #if'ing them away; x86 has not been affected. This
>> saves about 140KB.
>>
>> (Another way to handle this would be to remove every function
>> that would be overridden if one had a processor capable of
>> MMX, MMXEXT, SSE and SSE2. x86 processors not fulfilling
>> this requirement (which are truly ancient nowadays)
>> would still work, but would be slower, i.e. they would be treated
>> as second-class citizens. This would have the advantage of
>> avoiding #ifs and would lighten x86 binaries of code that is
>> not used at all by the overwhelming majority of users.
>> I'll update this patchset if it is preferred to do it that way.)
>>
>
> I have now implemented the other approach mentioned above (i.e. removing
> everything that is overridden when SSE2 is available altogether, also for
> x86-32); the result can be seen here:
> https://github.com/mkver/FFmpeg/commits/mmx2
> I prefer this to the old version because the reduced complexity far
> outweighs the potential to slow down some ancient systems a bit (and it is
> quite unlikely that these ancient systems use an up-to-date FFmpeg).
> Furthermore, some of the MMX scale functions that are removed are
> buggy/not bitexact. See
> https://github.com/mkver/FFmpeg/commit/c5513ad962100040601b5eba0042692a740ac50a
> (or shall I post these patches?)
> Is anyone against this removal?
>
Given that no one was against removing these old functions and several
people (on IRC) supported the idea, I will go ahead and do it. I will
apply https://github.com/mkver/FFmpeg/commits/mmx3 (an updated and
extended version of the branch linked to above) in two days unless there
are objections.
(I can also send this to the mailing list if desired.)
- Andreas
^ permalink raw reply [flat|nested] 46+ messages in thread