* [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx @ 2022-07-13 20:47 Martin Storsjö 2022-07-13 20:47 ` [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters Martin Storsjö 2022-08-04 7:47 ` [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö 0 siblings, 2 replies; 7+ messages in thread From: Martin Storsjö @ 2022-07-13 20:47 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Michael Niedermayer, Jonathan Swinney The height is hardcoded in some of the me_cmp functions, but not in all of them. But in the case of all other functions, it's hardcoded in the same place in SIMD functions as in the C reference functions, while this one function differs from the behaviour of the C code. (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a couple other sad8_*_mmx functions with similar hardcoded height.) --- libavcodec/x86/me_cmp_init.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c index 61e9396b8f..dcc2621276 100644 --- a/libavcodec/x86/me_cmp_init.c +++ b/libavcodec/x86/me_cmp_init.c @@ -202,13 +202,12 @@ static inline int sum_mmx(void) static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \ uint8_t *blk1, ptrdiff_t stride, int h) \ { \ - av_assert2(h == 8); \ __asm__ volatile ( \ "pxor %%mm7, %%mm7 \n\t" \ "pxor %%mm6, %%mm6 \n\t" \ ::); \ \ - sad8_4_ ## suf(blk1, blk2, stride, 8); \ + sad8_4_ ## suf(blk1, blk2, stride, h); \ \ return sum_ ## suf(); \ } \ -- 2.25.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters 2022-07-13 20:47 [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö @ 2022-07-13 20:47 ` Martin Storsjö 2022-08-04 7:47 ` Martin Storsjö 2022-08-04 7:47 ` [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö 1 sibling, 1 reply; 7+ messages in thread From: Martin Storsjö @ 2022-07-13 20:47 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Michael Niedermayer, Jonathan Swinney Previously, the checkasm test always passed h=8, so no other cases were tested. Out of the me_cmp functions, in practice, some functions are hardcoded to always assume a 8x8 block (ignoring the h parameter), while others do use the parameter. For those with hardcoded height, both the reference C function and the assembly implementations ignore the parameter similarly. The documentation for the functions indicate that heights between w/2 and 2*w, within the range of 4 to 16, should be supported. This patch just tests random heights in that range, without knowing what width the current function actually uses. --- I'm not sure if it's good to have checkasm exercise cases that don't occur in practice or not. In particular, the aarch64 functions have a separate implementation for non-multiple-of-4 height, which probably doesn't ever get called in practice, while other SIMD implementations lack that. Alternatively, we'd improve the documentation for the expectations for these functions and make the test match that, and remove the unused non-multiple-of-4 case in the aarch64 assembly. --- tests/checkasm/motion.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/tests/checkasm/motion.c b/tests/checkasm/motion.c index 0112822174..79e4358941 100644 --- a/tests/checkasm/motion.c +++ b/tests/checkasm/motion.c @@ -45,7 +45,7 @@ static void test_motion(const char *name, me_cmp_func test_func) /* motion estimation can look up to 17 bytes ahead */ static const int look_ahead = 17; - int i, x, y, d1, d2; + int i, x, y, h, d1, d2; uint8_t *ptr; LOCAL_ALIGNED_16(uint8_t, img1, [WIDTH * HEIGHT]); @@ -68,14 +68,16 @@ static void test_motion(const char *name, me_cmp_func test_func) for (i = 0; i < ITERATIONS; i++) { x = rnd() % (WIDTH - look_ahead); y = rnd() % (HEIGHT - look_ahead); + // Pick a random h between 4 and 16; pick an even value. + h = 4 + ((rnd() % (16 + 1 - 4)) & ~1); ptr = img2 + y * WIDTH + x; - d2 = call_ref(NULL, img1, ptr, WIDTH, 8); - d1 = call_new(NULL, img1, ptr, WIDTH, 8); + d2 = call_ref(NULL, img1, ptr, WIDTH, h); + d1 = call_new(NULL, img1, ptr, WIDTH, h); if (d1 != d2) { fail(); - printf("func: %s, x=%d y=%d, error: asm=%d c=%d\n", name, x, y, d1, d2); + printf("func: %s, x=%d y=%d h=%d, error: asm=%d c=%d\n", name, x, y, h, d1, d2); break; } } -- 2.25.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters 2022-07-13 20:47 ` [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters Martin Storsjö @ 2022-08-04 7:47 ` Martin Storsjö 2022-08-16 11:16 ` Martin Storsjö 0 siblings, 1 reply; 7+ messages in thread From: Martin Storsjö @ 2022-08-04 7:47 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Michael Niedermayer, Jonathan Swinney On Wed, 13 Jul 2022, Martin Storsjö wrote: > Previously, the checkasm test always passed h=8, so no other cases > were tested. > > Out of the me_cmp functions, in practice, some functions are hardcoded > to always assume a 8x8 block (ignoring the h parameter), while others > do use the parameter. For those with hardcoded height, both the > reference C function and the assembly implementations ignore the > parameter similarly. > > The documentation for the functions indicate that heights between > w/2 and 2*w, within the range of 4 to 16, should be supported. This > patch just tests random heights in that range, without knowing what > width the current function actually uses. > --- > I'm not sure if it's good to have checkasm exercise cases that > don't occur in practice or not. In particular, the aarch64 > functions have a separate implementation for non-multiple-of-4 > height, which probably doesn't ever get called in practice, while > other SIMD implementations lack that. > > Alternatively, we'd improve the documentation for the expectations > for these functions and make the test match that, and remove the > unused non-multiple-of-4 case in the aarch64 assembly. > --- > tests/checkasm/motion.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/tests/checkasm/motion.c b/tests/checkasm/motion.c > index 0112822174..79e4358941 100644 > --- a/tests/checkasm/motion.c > +++ b/tests/checkasm/motion.c > @@ -45,7 +45,7 @@ static void test_motion(const char *name, me_cmp_func test_func) > /* motion estimation can look up to 17 bytes ahead */ > static const int look_ahead = 17; > > - int i, x, y, d1, d2; > + int i, x, y, h, d1, d2; > uint8_t *ptr; > > LOCAL_ALIGNED_16(uint8_t, img1, [WIDTH * HEIGHT]); > @@ -68,14 +68,16 @@ static void test_motion(const char *name, me_cmp_func test_func) > for (i = 0; i < ITERATIONS; i++) { > x = rnd() % (WIDTH - look_ahead); > y = rnd() % (HEIGHT - look_ahead); > + // Pick a random h between 4 and 16; pick an even value. > + h = 4 + ((rnd() % (16 + 1 - 4)) & ~1); > > ptr = img2 + y * WIDTH + x; > - d2 = call_ref(NULL, img1, ptr, WIDTH, 8); > - d1 = call_new(NULL, img1, ptr, WIDTH, 8); > + d2 = call_ref(NULL, img1, ptr, WIDTH, h); > + d1 = call_new(NULL, img1, ptr, WIDTH, h); > > if (d1 != d2) { > fail(); > - printf("func: %s, x=%d y=%d, error: asm=%d c=%d\n", name, x, y, d1, d2); > + printf("func: %s, x=%d y=%d h=%d, error: asm=%d c=%d\n", name, x, y, h, d1, d2); > break; > } > } > -- > 2.25.1 Ping // Martin _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters 2022-08-04 7:47 ` Martin Storsjö @ 2022-08-16 11:16 ` Martin Storsjö 0 siblings, 0 replies; 7+ messages in thread From: Martin Storsjö @ 2022-08-16 11:16 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Michael Niedermayer, Jonathan Swinney On Thu, 4 Aug 2022, Martin Storsjö wrote: > On Wed, 13 Jul 2022, Martin Storsjö wrote: > >> Previously, the checkasm test always passed h=8, so no other cases >> were tested. >> >> Out of the me_cmp functions, in practice, some functions are hardcoded >> to always assume a 8x8 block (ignoring the h parameter), while others >> do use the parameter. For those with hardcoded height, both the >> reference C function and the assembly implementations ignore the >> parameter similarly. >> >> The documentation for the functions indicate that heights between >> w/2 and 2*w, within the range of 4 to 16, should be supported. This >> patch just tests random heights in that range, without knowing what >> width the current function actually uses. >> --- >> I'm not sure if it's good to have checkasm exercise cases that >> don't occur in practice or not. In particular, the aarch64 >> functions have a separate implementation for non-multiple-of-4 >> height, which probably doesn't ever get called in practice, while >> other SIMD implementations lack that. >> >> Alternatively, we'd improve the documentation for the expectations >> for these functions and make the test match that, and remove the >> unused non-multiple-of-4 case in the aarch64 assembly. >> --- >> tests/checkasm/motion.c | 10 ++++++---- >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/tests/checkasm/motion.c b/tests/checkasm/motion.c >> index 0112822174..79e4358941 100644 >> --- a/tests/checkasm/motion.c >> +++ b/tests/checkasm/motion.c >> @@ -45,7 +45,7 @@ static void test_motion(const char *name, me_cmp_func >> test_func) >> /* motion estimation can look up to 17 bytes ahead */ >> static const int look_ahead = 17; >> >> - int i, x, y, d1, d2; >> + int i, x, y, h, d1, d2; >> uint8_t *ptr; >> >> LOCAL_ALIGNED_16(uint8_t, img1, [WIDTH * HEIGHT]); >> @@ -68,14 +68,16 @@ static void test_motion(const char *name, me_cmp_func >> test_func) >> for (i = 0; i < ITERATIONS; i++) { >> x = rnd() % (WIDTH - look_ahead); >> y = rnd() % (HEIGHT - look_ahead); >> + // Pick a random h between 4 and 16; pick an even value. >> + h = 4 + ((rnd() % (16 + 1 - 4)) & ~1); >> >> ptr = img2 + y * WIDTH + x; >> - d2 = call_ref(NULL, img1, ptr, WIDTH, 8); >> - d1 = call_new(NULL, img1, ptr, WIDTH, 8); >> + d2 = call_ref(NULL, img1, ptr, WIDTH, h); >> + d1 = call_new(NULL, img1, ptr, WIDTH, h); >> >> if (d1 != d2) { >> fail(); >> - printf("func: %s, x=%d y=%d, error: asm=%d c=%d\n", name, >> x, y, d1, d2); >> + printf("func: %s, x=%d y=%d h=%d, error: asm=%d c=%d\n", >> name, x, y, h, d1, d2); >> break; >> } >> } >> -- >> 2.25.1 > > Ping As there doesn't seem to be much opinion on this, I'll go ahead and land this (patch 1/2 was approved already), as this improves the test coverage for the aarch64 neon me_cmp assembly patches that are in progress. // Martin _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx 2022-07-13 20:47 [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö 2022-07-13 20:47 ` [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters Martin Storsjö @ 2022-08-04 7:47 ` Martin Storsjö 2022-08-04 15:31 ` Michael Niedermayer 1 sibling, 1 reply; 7+ messages in thread From: Martin Storsjö @ 2022-08-04 7:47 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Michael Niedermayer, Jonathan Swinney On Wed, 13 Jul 2022, Martin Storsjö wrote: > The height is hardcoded in some of the me_cmp functions, but not > in all of them. But in the case of all other functions, it's hardcoded > in the same place in SIMD functions as in the C reference functions, > while this one function differs from the behaviour of the C code. > > (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a > couple other sad8_*_mmx functions with similar hardcoded height.) > --- > libavcodec/x86/me_cmp_init.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c > index 61e9396b8f..dcc2621276 100644 > --- a/libavcodec/x86/me_cmp_init.c > +++ b/libavcodec/x86/me_cmp_init.c > @@ -202,13 +202,12 @@ static inline int sum_mmx(void) > static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \ > uint8_t *blk1, ptrdiff_t stride, int h) \ > { \ > - av_assert2(h == 8); \ > __asm__ volatile ( \ > "pxor %%mm7, %%mm7 \n\t" \ > "pxor %%mm6, %%mm6 \n\t" \ > ::); \ > \ > - sad8_4_ ## suf(blk1, blk2, stride, 8); \ > + sad8_4_ ## suf(blk1, blk2, stride, h); \ > \ > return sum_ ## suf(); \ > } \ > -- > 2.25.1 Ping, does this seem reasonable? Michael indicated a desire to make the me_cmp functions more general and flexible than what they are today, and this would be a first step to making checkasm test such cases. // Martin _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx 2022-08-04 7:47 ` [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö @ 2022-08-04 15:31 ` Michael Niedermayer 2022-08-04 20:29 ` Martin Storsjö 0 siblings, 1 reply; 7+ messages in thread From: Michael Niedermayer @ 2022-08-04 15:31 UTC (permalink / raw) To: FFmpeg development discussions and patches [-- Attachment #1.1: Type: text/plain, Size: 2669 bytes --] On Thu, Aug 04, 2022 at 10:47:34AM +0300, Martin Storsjö wrote: > On Wed, 13 Jul 2022, Martin Storsjö wrote: > > > The height is hardcoded in some of the me_cmp functions, but not > > in all of them. But in the case of all other functions, it's hardcoded > > in the same place in SIMD functions as in the C reference functions, > > while this one function differs from the behaviour of the C code. > > > > (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a > > couple other sad8_*_mmx functions with similar hardcoded height.) > > --- > > libavcodec/x86/me_cmp_init.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c > > index 61e9396b8f..dcc2621276 100644 > > --- a/libavcodec/x86/me_cmp_init.c > > +++ b/libavcodec/x86/me_cmp_init.c > > @@ -202,13 +202,12 @@ static inline int sum_mmx(void) > > static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \ > > uint8_t *blk1, ptrdiff_t stride, int h) \ > > { \ > > - av_assert2(h == 8); \ > > __asm__ volatile ( \ > > "pxor %%mm7, %%mm7 \n\t" \ > > "pxor %%mm6, %%mm6 \n\t" \ > > ::); \ > > \ > > - sad8_4_ ## suf(blk1, blk2, stride, 8); \ > > + sad8_4_ ## suf(blk1, blk2, stride, h); \ > > \ > > return sum_ ## suf(); \ > > } \ > > -- > > 2.25.1 > > Ping, does this seem reasonable? Michael indicated a desire to make the > me_cmp functions more general and flexible than what they are today, and > this would be a first step to making checkasm test such cases. LGTM assuming it doesnt have any problematic perforamce impact thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User questions about the command line tools should be sent to the ffmpeg-user ML. And questions about how to use libav* should be sent to the libav-user ML. [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] [-- Attachment #2: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx 2022-08-04 15:31 ` Michael Niedermayer @ 2022-08-04 20:29 ` Martin Storsjö 0 siblings, 0 replies; 7+ messages in thread From: Martin Storsjö @ 2022-08-04 20:29 UTC (permalink / raw) To: FFmpeg development discussions and patches On Thu, 4 Aug 2022, Michael Niedermayer wrote: > On Thu, Aug 04, 2022 at 10:47:34AM +0300, Martin Storsjö wrote: >> On Wed, 13 Jul 2022, Martin Storsjö wrote: >> >>> The height is hardcoded in some of the me_cmp functions, but not >>> in all of them. But in the case of all other functions, it's hardcoded >>> in the same place in SIMD functions as in the C reference functions, >>> while this one function differs from the behaviour of the C code. >>> >>> (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a >>> couple other sad8_*_mmx functions with similar hardcoded height.) >>> --- >>> libavcodec/x86/me_cmp_init.c | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c >>> index 61e9396b8f..dcc2621276 100644 >>> --- a/libavcodec/x86/me_cmp_init.c >>> +++ b/libavcodec/x86/me_cmp_init.c >>> @@ -202,13 +202,12 @@ static inline int sum_mmx(void) >>> static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \ >>> uint8_t *blk1, ptrdiff_t stride, int h) \ >>> { \ >>> - av_assert2(h == 8); \ >>> __asm__ volatile ( \ >>> "pxor %%mm7, %%mm7 \n\t" \ >>> "pxor %%mm6, %%mm6 \n\t" \ >>> ::); \ >>> \ >>> - sad8_4_ ## suf(blk1, blk2, stride, 8); \ >>> + sad8_4_ ## suf(blk1, blk2, stride, h); \ >>> \ >>> return sum_ ## suf(); \ >>> } \ >>> -- >>> 2.25.1 >> >> Ping, does this seem reasonable? Michael indicated a desire to make the >> me_cmp functions more general and flexible than what they are today, and >> this would be a first step to making checkasm test such cases. > > LGTM assuming it doesnt have any problematic perforamce impact Thanks - I didn't notice any significant change in the checkasm bench numbers for it. // Martin _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-08-16 11:17 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-07-13 20:47 [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö 2022-07-13 20:47 ` [FFmpeg-devel] [PATCH 2/2] RFC: checkasm: motion: Test different h parameters Martin Storsjö 2022-08-04 7:47 ` Martin Storsjö 2022-08-16 11:16 ` Martin Storsjö 2022-08-04 7:47 ` [FFmpeg-devel] [PATCH 1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx Martin Storsjö 2022-08-04 15:31 ` Michael Niedermayer 2022-08-04 20:29 ` Martin Storsjö
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git