[FFmpeg-devel] [PATCH v2 2/3] avcodec/x86/vvc/vvc_alf: use xq to match ptrdiff

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

* [FFmpeg-devel] [PATCH v2 2/3] avcodec/x86/vvc/vvc_alf: use xq to match ptrdiff_t
       [not found] <20240530162807.490-1-toqsxw@outlook.com>
@ 2024-05-30 16:28 ` toqsxw
  2024-05-30 16:28 ` [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8 toqsxw
  1 sibling, 0 replies; 4+ messages in thread
From: toqsxw @ 2024-05-30 16:28 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Wu Jianhua

From: Wu Jianhua <toqsxw@outlook.com>

Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
---
 libavcodec/x86/vvc/vvc_alf.asm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/x86/vvc/vvc_alf.asm b/libavcodec/x86/vvc/vvc_alf.asm
index f7b3e2a6cc..b35dd9b0e9 100644
--- a/libavcodec/x86/vvc/vvc_alf.asm
+++ b/libavcodec/x86/vvc/vvc_alf.asm
@@ -409,7 +409,7 @@ cglobal vvc_alf_filter_%2_%1bpc, 11, 15, 16, 0-0x28, dst, dst_stride, src, src_s
 .loop:
     push            srcq
     push            dstq
-    xor               xd, xd
+    xor               xq, xq
 
     .loop_w:
         LOAD_PARAMS
@@ -417,8 +417,8 @@ cglobal vvc_alf_filter_%2_%1bpc, 11, 15, 16, 0-0x28, dst, dst_stride, src, src_s
 
         add         srcq, 16 * ps
         add         dstq, 16 * ps
-        add           xd, 16
-        cmp           xd, widthd
+        add           xq, 16
+        cmp           xq, widthq
         jl       .loop_w
 
     pop             dstq
@@ -427,7 +427,7 @@ cglobal vvc_alf_filter_%2_%1bpc, 11, 15, 16, 0-0x28, dst, dst_stride, src, src_s
     lea             dstq, [dstq + 4 * dst_strideq]
 
     lea          filterq, [filterq + 2 * strideq]
-    lea            clipq, [clipq + 2 * strideq]
+    lea            clipq, [clipq   + 2 * strideq]
 
     sub          vb_posq, 4
     sub          heightq, 4
-- 
2.44.0.windows.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


* [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8
       [not found] <20240530162807.490-1-toqsxw@outlook.com>
  2024-05-30 16:28 ` [FFmpeg-devel] [PATCH v2 2/3] avcodec/x86/vvc/vvc_alf: use xq to match ptrdiff_t toqsxw
@ 2024-05-30 16:28 ` toqsxw
  2024-06-03 10:42   ` Bross, Benjamin
  1 sibling, 1 reply; 4+ messages in thread
From: toqsxw @ 2024-05-30 16:28 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Wu Jianhua

From: Wu Jianhua <toqsxw@outlook.com>

From Benjamin Bross:
> for ALF where functions are in increments of 4 while 8 should be sufficient according to the spec.

Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
---
 tests/checkasm/vvc_alf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/checkasm/vvc_alf.c b/tests/checkasm/vvc_alf.c
index f35fd2cd3e..84b0f9da15 100644
--- a/tests/checkasm/vvc_alf.c
+++ b/tests/checkasm/vvc_alf.c
@@ -90,8 +90,8 @@ static void check_alf_filter(VVCDSPContext *c, const int bit_depth)
     randomize_buffers2(filter, LUMA_PARAMS_SIZE, 1);
     randomize_buffers2(clip, LUMA_PARAMS_SIZE, 0);
 
-    for (int h = 4; h <= MAX_CTU_SIZE; h += 4) {
-        for (int w = 4; w <= MAX_CTU_SIZE; w += 4) {
+    for (int h = 4; h <= MAX_CTU_SIZE; h += 8) {
+        for (int w = 4; w <= MAX_CTU_SIZE; w += 8) {
             const int ctu_size = MAX_CTU_SIZE;
             if (check_func(c->alf.filter[LUMA], "vvc_alf_filter_luma_%dx%d_%d", w, h, bit_depth)) {
                 const int vb_pos = ctu_size - ALF_VB_POS_ABOVE_LUMA;
@@ -142,8 +142,8 @@ static void check_alf_classify(VVCDSPContext *c, const int bit_depth)
 
     randomize_buffers(src0, src1, SRC_BUF_SIZE);
 
-    for (int h = 4; h <= MAX_CTU_SIZE; h += 4) {
-        for (int w = 4; w <= MAX_CTU_SIZE; w += 4) {
+    for (int h = 4; h <= MAX_CTU_SIZE; h += 8) {
+        for (int w = 4; w <= MAX_CTU_SIZE; w += 8) {
             const int id_size = w * h / ALF_BLOCK_SIZE / ALF_BLOCK_SIZE * sizeof(int);
             const int vb_pos  = MAX_CTU_SIZE - ALF_BLOCK_SIZE;
             if (check_func(c->alf.classify, "vvc_alf_classify_%dx%d_%d", w, h, bit_depth)) {
-- 
2.44.0.windows.1


* Re: [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8
  2024-05-30 16:28 ` [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8 toqsxw
@ 2024-06-03 10:42   ` Bross, Benjamin
  2024-06-06 18:24     ` [FFmpeg-devel] Re: " Wu Jianhua
  0 siblings, 1 reply; 4+ messages in thread
From: Bross, Benjamin @ 2024-06-03 10:42 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Wu Jianhua

> From Benjamin Bross:
>> for ALF where functions are in increments of 4 while 8 should be sufficient according to the spec.

Actually, it is not only the increment: the size itself has to be a multiple of 8, so the loops should also start at 8 instead of 4.

ALF filtering and classification are applied at the CTU level.
According to the VVC spec, CTU sizes can be 32x32, 64x64, or 128x128 luma samples.
However, if the width or height is not a multiple of the CTU size, the larger CTU is "forced to split" until the width or height fits.
E.g. a width of 840 luma samples fits 26.25 CTUs of size 32x32, i.e. 26 CTUs of 32x32 and one of 8x32.
Width and height must each be a multiple of 8 at minimum (see H.266 09/2023, 7.4.3.4 Sequence parameter set RBSP semantics):

sps_pic_width_max_in_luma_samples shall not be equal to 0 and shall be an integer multiple of
Max( 8, MinCbSizeY )
…
sps_pic_height_max_in_luma_samples shall not be equal to 0 and shall be an integer multiple of
Max( 8, MinCbSizeY ).
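
The width=840 example can be checked with a small C sketch. The helper names below are hypothetical and not part of FFmpeg; the sketch only assumes that the last CTU column takes whatever remainder is left after the full-size CTUs.

```c
#include <assert.h>

/* Hypothetical helper (not an FFmpeg function): number of full-size CTU
 * columns that fit into a picture of the given luma width. */
static int full_ctu_cols(int pic_width, int ctu_size)
{
    assert(pic_width > 0 && ctu_size > 0);
    return pic_width / ctu_size;
}

/* Hypothetical helper: width in luma samples of the rightmost CTU column.
 * If the picture width is an exact multiple of the CTU size, the last
 * column is a full CTU; otherwise it is the remainder. */
static int last_ctu_col_width(int pic_width, int ctu_size)
{
    assert(pic_width > 0 && ctu_size > 0);
    const int rem = pic_width % ctu_size;
    return rem ? rem : ctu_size;
}
```

With pic_width = 840 and ctu_size = 32, full_ctu_cols() gives 26 and last_ctu_col_width() gives 8, matching the 26 32x32 CTUs plus one 8x32 column above. Since the picture width is a multiple of Max( 8, MinCbSizeY ) and the CTU sizes are multiples of 8, that remainder is always a multiple of 8 as well.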

Please note that this restriction applies only to classification and 7x7 luma filtering. For 5x5 chroma filtering with 4:2:0 subsampling, CTU sizes can be as small as multiples of 4 chroma samples.
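
To illustrate why the start value matters as much as the step, here is a minimal C sketch. The function name is hypothetical; it only counts how many sizes visited by a loop of the form used in the test are multiples of 8, and the bound of 128 is assumed to match MAX_CTU_SIZE.

```c
#include <assert.h>

/* Hypothetical helper (not part of checkasm): counts how many values
 * visited by "for (v = start; v <= max; v += step)" are multiples of 8. */
static int count_multiples_of_8(int start, int step, int max)
{
    int n = 0;
    assert(start > 0 && step > 0);
    for (int v = start; v <= max; v += step)
        if (v % 8 == 0)
            n++;
    return n;
}
```

Under these assumptions, count_multiples_of_8(4, 8, 128) is 0, because a loop that starts at 4 and steps by 8 only visits 4, 12, 20, ..., 124. count_multiples_of_8(8, 8, 128) is 16, so starting at 8 covers every luma size from 8 to 128 that the spec allows.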



* [FFmpeg-devel] Re: [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8
  2024-06-03 10:42   ` Bross, Benjamin
@ 2024-06-06 18:24     ` Wu Jianhua
  0 siblings, 0 replies; 4+ messages in thread
From: Wu Jianhua @ 2024-06-06 18:24 UTC (permalink / raw)
  To: Bross, Benjamin, FFmpeg development discussions and patches

Bross, Benjamin:
> From: Bross, Benjamin <benjamin.bross@hhi.fraunhofer.de>
> Sent: June 3, 2024 3:42
> To: FFmpeg development discussions and patches
> Cc: Wu Jianhua
> Subject: Re: [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8
> 
>> From Benjamin Bross:
>>> for ALF where functions are in increments of 4 while 8 should be sufficient according to the spec.
> 
> Actually, it is not only the increment: the size itself has to be a multiple of 8, so the loops should also start at 8 instead of 4.
> 
> ALF filtering and classification are applied at the CTU level.
> According to the VVC spec, CTU sizes can be 32x32, 64x64, or 128x128 luma samples.
> However, if the width or height is not a multiple of the CTU size, the larger CTU is "forced to split" until the width or height fits.
> E.g. a width of 840 luma samples fits 26.25 CTUs of size 32x32, i.e. 26 CTUs of 32x32 and one of 8x32.
> Width and height must each be a multiple of 8 at minimum (see H.266 09/2023, 7.4.3.4 Sequence parameter set RBSP semantics):
> 
> sps_pic_width_max_in_luma_samples shall not be equal to 0 and shall be an integer multiple of
> Max( 8, MinCbSizeY )
> …
> sps_pic_height_max_in_luma_samples shall not be equal to 0 and shall be an integer multiple of
> Max( 8, MinCbSizeY ).
> 
> Please note that this restriction applies only to classification and 7x7 luma filtering. For 5x5 chroma filtering with 4:2:0 subsampling, CTU sizes can be as small as multiples of 4 chroma samples.


Hi Benjamin,

I will update some patches according to your comments.

Thanks for your suggestion!
Jianhua
