* [FFmpeg-devel] [PATCH v2] libavfilter/x86/vf_convolution: fix sobel swap issue on WIN64
From: bin.wang-at-intel.com @ 2022-11-14 15:20 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Wang, Bin, Wang
From: "Wang, Bin" <bin.wang@intel.com>
Signed-off-by: Wang, Bin <bin.wang@intel.com>
---
libavfilter/x86/vf_convolution.asm | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/libavfilter/x86/vf_convolution.asm b/libavfilter/x86/vf_convolution.asm
index c912d56752..9ac9ef5d73 100644
--- a/libavfilter/x86/vf_convolution.asm
+++ b/libavfilter/x86/vf_convolution.asm
@@ -189,15 +189,16 @@ cglobal filter_sobel, 4, 15, 7, dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5,
cglobal filter_sobel, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
%endif
%if WIN64
- SWAP xmm0, xmm2
- SWAP xmm1, xmm3
+ VBROADCASTSS m0, xmm2
+ VBROADCASTSS m1, xmm3
mov r2q, matrixmp
mov r3q, ptrmp
DEFINE_ARGS dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
-%endif
- movsxdifnidn widthq, widthd
+%else
VBROADCASTSS m0, xmm0
VBROADCASTSS m1, xmm1
+%endif
+ movsxdifnidn widthq, widthd
pxor m6, m6
mov c0q, [ptrq + 0*gprsize]
mov c1q, [ptrq + 1*gprsize]
@@ -281,7 +282,7 @@ cglobal filter_sobel, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2,
fmaddss xmm4, xmm5, xmm5, xmm4
sqrtps xmm4, xmm4
- fmaddss xmm4, xmm4, xmm0, xmm1 ;sum = sum * rdiv + bias
+ fmaddss xmm4, xmm4, xm0, xm1 ;sum = sum * rdiv + bias
cvttps2dq xmm4, xmm4 ; trunc to integer
packssdw xmm4, xmm4
packuswb xmm4, xmm4
--
2.27.0
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [PATCH v2] libavfilter/x86/vf_convolution: fix sobel swap issue on WIN64
From: James Almer @ 2022-11-14 16:34 UTC (permalink / raw)
To: ffmpeg-devel
On 11/14/2022 12:20 PM, bin.wang-at-intel.com@ffmpeg.org wrote:
Should be ok.
* Re: [FFmpeg-devel] [PATCH v2] libavfilter/x86/vf_convolution: fix sobel swap issue on WIN64
From: Xiang, Haihao @ 2022-11-21 4:37 UTC (permalink / raw)
To: ffmpeg-devel
On Mon, 2022-11-14 at 13:34 -0300, James Almer wrote:
> Should be ok.
Applied,
-Haihao