[FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

* [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
@ 2025-06-16  9:40 Zhao Zhili
  2025-06-16  9:46 ` Andreas Rheinhardt
  0 siblings, 1 reply; 7+ messages in thread
From: Zhao Zhili @ 2025-06-16  9:40 UTC
  To: ffmpeg-devel; +Cc: Zhao Zhili

From: Zhao Zhili <zhilizhao@tencent.com>

---
 tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
index f5f9650224..a0f8fd858a 100644
--- a/tests/checkasm/h264dsp.c
+++ b/tests/checkasm/h264dsp.c
@@ -328,25 +328,35 @@ static void check_idct_multiple(void)
 static void check_idct_dequant(void)
 {
     static const int depths[5] = { 8, 9, 10, 12, 14 };
-    LOCAL_ALIGNED_16(int16_t, src, [16]);
-    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
+    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
+    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
     LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
     LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
+    int16_t *src = (int16_t *)src_buf;
     int16_t *dst_ref = (int16_t *)dst0;
     int16_t *dst_new = (int16_t *)dst1;
     H264DSPContext h;
     int bit_depth, i, qmul;
     declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
 
-    for (int j = 0; j < 16; j++)
-        src[j] = (rnd() % 512) - 256;
-
     qmul = rnd() % 4096;
 
     for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
         bit_depth = depths[i];
         ff_h264dsp_init(&h, bit_depth, 1);
 
+        if (bit_depth == 8) {
+            for (size_t j = 0; j < 16; j++) {
+                int16_t r = (rnd() % 512) - 256;
+                AV_WN16A(&src_buf[j << 1], r);
+            }
+        } else {
+            for (size_t j = 0; j < 16; j++) {
+                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
+                AV_WN32A(&src_buf[j << 2], r);
+            }
+        }
+
         memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
         memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
 
-- 
2.25.1
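
As a side note on the fill loops in the patch, the value range and byte offsets they produce can be worked out in isolation. The following is only a sketch, not FFmpeg code: `coef_min`, `coef_max` and `coef_offset` are hypothetical helpers, and it assumes (as the patch does) that 8-bit coefficients are stored as int16_t and all higher depths as int32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of FFmpeg. */

/* Smallest value the patch's fill loop can produce for a bit depth:
 * (rnd() % 512) - 256 for 8 bits, otherwise
 * (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth). */
static int32_t coef_min(int bit_depth)
{
    return bit_depth == 8 ? -256 : -(INT32_C(1) << bit_depth);
}

/* Largest value the same loop can produce. */
static int32_t coef_max(int bit_depth)
{
    return bit_depth == 8 ? 255 : (INT32_C(1) << bit_depth) - 1;
}

/* Byte offset of coefficient j inside the byte buffer:
 * j << 1 for int16_t storage, j << 2 for int32_t storage. */
static size_t coef_offset(int bit_depth, size_t j)
{
    return bit_depth == 8 ? j * sizeof(int16_t) : j * sizeof(int32_t);
}
```

With these, the last coefficient at the higher bit depths starts at byte 60 and ends at byte 64, which is exactly `16 * sizeof(int32_t)`, the size given to `src_buf`.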


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
  2025-06-16  9:40 [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant Zhao Zhili
@ 2025-06-16  9:46 ` Andreas Rheinhardt
  2025-06-16 10:21   ` Zhao Zhili
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Rheinhardt @ 2025-06-16  9:46 UTC
  To: ffmpeg-devel

Zhao Zhili:
> From: Zhao Zhili <zhilizhao@tencent.com>
> 
> ---
>  tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
> index f5f9650224..a0f8fd858a 100644
> --- a/tests/checkasm/h264dsp.c
> +++ b/tests/checkasm/h264dsp.c
> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>  static void check_idct_dequant(void)
>  {
>      static const int depths[5] = { 8, 9, 10, 12, 14 };
> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>      LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>      LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
> +    int16_t *src = (int16_t *)src_buf;
>      int16_t *dst_ref = (int16_t *)dst0;
>      int16_t *dst_new = (int16_t *)dst1;
>      H264DSPContext h;
>      int bit_depth, i, qmul;
>      declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>  
> -    for (int j = 0; j < 16; j++)
> -        src[j] = (rnd() % 512) - 256;
> -
>      qmul = rnd() % 4096;
>  
>      for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>          bit_depth = depths[i];
>          ff_h264dsp_init(&h, bit_depth, 1);
>  
> +        if (bit_depth == 8) {
> +            for (size_t j = 0; j < 16; j++) {
> +                int16_t r = (rnd() % 512) - 256;
> +                AV_WN16A(&src_buf[j << 1], r);
> +            }
> +        } else {
> +            for (size_t j = 0; j < 16; j++) {
> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
> +                AV_WN32A(&src_buf[j << 2], r);
> +            }
> +        }
> +
>          memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>          memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>  

This still has an effective-type violation: src_buf is of type uint8_t,
yet the ff_h264_luma_dc_dequant_idct functions will read it as
int16_t/int32_t. It also still has the downside that buffer overflows
for the 8bit case can go undetected.

- Andreas
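
The effective-type point can be illustrated outside FFmpeg. In the sketch below all names are hypothetical and nothing is taken from checkasm: a byte buffer is accessed via memcpy, which is permitted for any object, and a union is shown as one possible way to give the storage declared int16_t and int32_t members. It is not necessarily what any later revision of the patch does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not part of FFmpeg. */

/* Read the j-th int16_t stored in a byte buffer. memcpy copies the object
 * representation, so no int16_t lvalue ever touches the uint8_t array. */
static int16_t read_i16(const uint8_t *buf, size_t j)
{
    int16_t v;
    memcpy(&v, buf + j * sizeof(v), sizeof(v));
    return v;
}

/* Store v as the j-th int16_t in a byte buffer, again via memcpy. */
static void write_i16(uint8_t *buf, size_t j, int16_t v)
{
    memcpy(buf + j * sizeof(v), &v, sizeof(v));
}

/* Round-trip one value through the last slot of a 16-coefficient buffer. */
static int16_t roundtrip_i16(int16_t v)
{
    uint8_t buf[16 * sizeof(int16_t)] = { 0 };
    write_i16(buf, 15, v);
    return read_i16(buf, 15);
}

/* Alternative: storage whose members already have the types that the
 * 8-bit (int16_t) and high-bit-depth (int32_t) functions read. */
typedef union CoefBuf {
    int16_t c16[16];
    int32_t c32[16];
} CoefBuf;
```

The memcpy helpers only cover accesses made by the test itself. Functions that dereference an `int16_t *` or `int32_t *` directly would still need storage declared with a matching type, which is what the union variant is meant to show.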


* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
  2025-06-16  9:46 ` Andreas Rheinhardt
@ 2025-06-16 10:21   ` Zhao Zhili
  2025-06-16 11:03     ` Andreas Rheinhardt
  0 siblings, 1 reply; 7+ messages in thread
From: Zhao Zhili @ 2025-06-16 10:21 UTC
  To: FFmpeg development discussions and patches



> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> 
> Zhao Zhili:
>> From: Zhao Zhili <zhilizhao@tencent.com>
>> 
>> ---
>> tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>> 1 file changed, 15 insertions(+), 5 deletions(-)
>> 
>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>> index f5f9650224..a0f8fd858a 100644
>> --- a/tests/checkasm/h264dsp.c
>> +++ b/tests/checkasm/h264dsp.c
>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>> static void check_idct_dequant(void)
>> {
>>     static const int depths[5] = { 8, 9, 10, 12, 14 };
>> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
>> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
>> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
>> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>>     LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>>     LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
>> +    int16_t *src = (int16_t *)src_buf;
>>     int16_t *dst_ref = (int16_t *)dst0;
>>     int16_t *dst_new = (int16_t *)dst1;
>>     H264DSPContext h;
>>     int bit_depth, i, qmul;
>>     declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>> 
>> -    for (int j = 0; j < 16; j++)
>> -        src[j] = (rnd() % 512) - 256;
>> -
>>     qmul = rnd() % 4096;
>> 
>>     for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>>         bit_depth = depths[i];
>>         ff_h264dsp_init(&h, bit_depth, 1);
>> 
>> +        if (bit_depth == 8) {
>> +            for (size_t j = 0; j < 16; j++) {
>> +                int16_t r = (rnd() % 512) - 256;
>> +                AV_WN16A(&src_buf[j << 1], r);
>> +            }
>> +        } else {
>> +            for (size_t j = 0; j < 16; j++) {
>> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
>> +                AV_WN32A(&src_buf[j << 2], r);
>> +            }
>> +        }
>> +
>>         memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>>         memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>> 
> 
> This still has an effective-type violation: src_buf is of type uint8_t,
> yet the ff_h264_luma_dc_dequant_idct functions will read it as
> int16_t/int32_t. It also still has the downside that buffer overflows
> for the 8bit case can go undetected.

Many of the templates contain casts like

    pixel *dst = (pixel *)_dst;
    const pixel *src = (const pixel *)_src;

and then read and write through them as int16_t.

Many checkasm tests also use uint8_t[] arrays on the stack as src and dst,
which leads to UB.

This is not specific to this patch, and the patch adds no new UB: it was already
there before the patch, since src and dst were accessed as int32_t/int16_t while
being declared as int16_t and uint8_t respectively.

> 
> - Andreas
> 

* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
  2025-06-16 10:21   ` Zhao Zhili
@ 2025-06-16 11:03     ` Andreas Rheinhardt
  2025-06-16 11:49       ` Zhao Zhili
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Rheinhardt @ 2025-06-16 11:03 UTC
  To: ffmpeg-devel

Zhao Zhili:
> 
> 
>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>
>> Zhao Zhili:
>>> From: Zhao Zhili <zhilizhao@tencent.com>
>>>
>>> ---
>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>>> index f5f9650224..a0f8fd858a 100644
>>> --- a/tests/checkasm/h264dsp.c
>>> +++ b/tests/checkasm/h264dsp.c
>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>>> static void check_idct_dequant(void)
>>> {
>>>     static const int depths[5] = { 8, 9, 10, 12, 14 };
>>> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
>>> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
>>> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
>>> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>>>     LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>>>     LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
>>> +    int16_t *src = (int16_t *)src_buf;
>>>     int16_t *dst_ref = (int16_t *)dst0;
>>>     int16_t *dst_new = (int16_t *)dst1;
>>>     H264DSPContext h;
>>>     int bit_depth, i, qmul;
>>>     declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>>>
>>> -    for (int j = 0; j < 16; j++)
>>> -        src[j] = (rnd() % 512) - 256;
>>> -
>>>     qmul = rnd() % 4096;
>>>
>>>     for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>>>         bit_depth = depths[i];
>>>         ff_h264dsp_init(&h, bit_depth, 1);
>>>
>>> +        if (bit_depth == 8) {
>>> +            for (size_t j = 0; j < 16; j++) {
>>> +                int16_t r = (rnd() % 512) - 256;
>>> +                AV_WN16A(&src_buf[j << 1], r);
>>> +            }
>>> +        } else {
>>> +            for (size_t j = 0; j < 16; j++) {
>>> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
>>> +                AV_WN32A(&src_buf[j << 2], r);
>>> +            }
>>> +        }
>>> +
>>>         memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>>>         memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>>>
>>
>> This still has an effective-type violation: src_buf is of type uint8_t,
>> yet the ff_h264_luma_dc_dequant_idct functions will read it as
>> int16_t/int32_t. It also still has the downside that buffer overflows
>> for the 8bit case can go undetected.
> 
> A bunch of template has cast like 
> 
>     pixel *dst = (pixel *)_dst;
>     const pixel *src = (const pixel *)_src;
> 
> then read and write as int16_t.
> 
> And a bunch of checkasm use uint8_t[] array on stack as src and dst,
> which leading to UB.
> 
> This patch isn’t specific. And this patch add zero UB (it’s there before the patch,
> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t).
> 

This patch adds UB: src was int16_t before, so that the accesses in the
eight bit function were fine, but are not with this patch. Anyway, it is
irrelevant now.

- Andreas


* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
  2025-06-16 11:03     ` Andreas Rheinhardt
@ 2025-06-16 11:49       ` Zhao Zhili
  2025-06-16 18:29         ` Andreas Rheinhardt
  0 siblings, 1 reply; 7+ messages in thread
From: Zhao Zhili @ 2025-06-16 11:49 UTC
  To: FFmpeg development discussions and patches



> On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> 
> Zhao Zhili:
>> 
>> 
>>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>> 
>>> Zhao Zhili:
>>>> From: Zhao Zhili <zhilizhao@tencent.com>
>>>> 
>>>> ---
>>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>> 
>>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>>>> index f5f9650224..a0f8fd858a 100644
>>>> --- a/tests/checkasm/h264dsp.c
>>>> +++ b/tests/checkasm/h264dsp.c
>>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>>>> static void check_idct_dequant(void)
>>>> {
>>>>    static const int depths[5] = { 8, 9, 10, 12, 14 };
>>>> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
>>>> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
>>>> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
>>>> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>>>>    LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>>>>    LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
>>>> +    int16_t *src = (int16_t *)src_buf;
>>>>    int16_t *dst_ref = (int16_t *)dst0;
>>>>    int16_t *dst_new = (int16_t *)dst1;
>>>>    H264DSPContext h;
>>>>    int bit_depth, i, qmul;
>>>>    declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>>>> 
>>>> -    for (int j = 0; j < 16; j++)
>>>> -        src[j] = (rnd() % 512) - 256;
>>>> -
>>>>    qmul = rnd() % 4096;
>>>> 
>>>>    for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>>>>        bit_depth = depths[i];
>>>>        ff_h264dsp_init(&h, bit_depth, 1);
>>>> 
>>>> +        if (bit_depth == 8) {
>>>> +            for (size_t j = 0; j < 16; j++) {
>>>> +                int16_t r = (rnd() % 512) - 256;
>>>> +                AV_WN16A(&src_buf[j << 1], r);
>>>> +            }
>>>> +        } else {
>>>> +            for (size_t j = 0; j < 16; j++) {
>>>> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
>>>> +                AV_WN32A(&src_buf[j << 2], r);
>>>> +            }
>>>> +        }
>>>> +
>>>>        memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>>>>        memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>>>> 
>>> 
>>> This still has an effective-type violation: src_buf is of type uint8_t,
>>> yet the ff_h264_luma_dc_dequant_idct functions will read it as
>>> int16_t/int32_t. It also still has the downside that buffer overflows
>>> for the 8bit case can go undetected.
>> 
>> A bunch of template has cast like 
>> 
>>    pixel *dst = (pixel *)_dst;
>>    const pixel *src = (const pixel *)_src;
>> 
>> then read and write as int16_t.
>> 
>> And a bunch of checkasm use uint8_t[] array on stack as src and dst,
>> which leading to UB.
>> 
>> This patch isn’t specific. And this patch add zero UB (it’s there before the patch,
>> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t).
>> 
> 
> This patch adds UB: src was int16_t before, so that the accesses in the
> eight bit function were fine, but are not with this patch. Anyway, it is
> irrelevant now.

Why does accessing a properly aligned uint8_t * suddenly become a big problem?

I don’t mind discussing the rules on these strict-aliasing violations,
especially in checkasm. But why has this suddenly become a rule that blocks a patch
trying to fix a FATE failure?

I don’t buy the argument that "the accesses in the eight bit function were fine".

> 
> - Andreas
> 

* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
  2025-06-16 11:49       ` Zhao Zhili
@ 2025-06-16 18:29         ` Andreas Rheinhardt
  2025-06-17  2:01           ` Zhao Zhili
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Rheinhardt @ 2025-06-16 18:29 UTC
  To: ffmpeg-devel

Zhao Zhili:
> 
> 
>> On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>
>> Zhao Zhili:
>>>
>>>
>>>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>>>
>>>> Zhao Zhili:
>>>>> From: Zhao Zhili <zhilizhao@tencent.com>
>>>>>
>>>>> ---
>>>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>>>>> index f5f9650224..a0f8fd858a 100644
>>>>> --- a/tests/checkasm/h264dsp.c
>>>>> +++ b/tests/checkasm/h264dsp.c
>>>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>>>>> static void check_idct_dequant(void)
>>>>> {
>>>>>    static const int depths[5] = { 8, 9, 10, 12, 14 };
>>>>> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
>>>>> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
>>>>> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
>>>>> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>>>>>    LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>>>>>    LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
>>>>> +    int16_t *src = (int16_t *)src_buf;
>>>>>    int16_t *dst_ref = (int16_t *)dst0;
>>>>>    int16_t *dst_new = (int16_t *)dst1;
>>>>>    H264DSPContext h;
>>>>>    int bit_depth, i, qmul;
>>>>>    declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>>>>>
>>>>> -    for (int j = 0; j < 16; j++)
>>>>> -        src[j] = (rnd() % 512) - 256;
>>>>> -
>>>>>    qmul = rnd() % 4096;
>>>>>
>>>>>    for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>>>>>        bit_depth = depths[i];
>>>>>        ff_h264dsp_init(&h, bit_depth, 1);
>>>>>
>>>>> +        if (bit_depth == 8) {
>>>>> +            for (size_t j = 0; j < 16; j++) {
>>>>> +                int16_t r = (rnd() % 512) - 256;
>>>>> +                AV_WN16A(&src_buf[j << 1], r);
>>>>> +            }
>>>>> +        } else {
>>>>> +            for (size_t j = 0; j < 16; j++) {
>>>>> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
>>>>> +                AV_WN32A(&src_buf[j << 2], r);
>>>>> +            }
>>>>> +        }
>>>>> +
>>>>>        memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>>>>>        memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>>>>>
>>>>
>>>> This still has an effective-type violation: src_buf is of type uint8_t,
>>>> yet the ff_h264_luma_dc_dequant_idct functions will read it as
>>>> int16_t/int32_t. It also still has the downside that buffer overflows
>>>> for the 8bit case can go undetected.
>>>
>>> A bunch of template has cast like 
>>>
>>>    pixel *dst = (pixel *)_dst;
>>>    const pixel *src = (const pixel *)_src;
>>>
>>> then read and write as int16_t.
>>>
>>> And a bunch of checkasm use uint8_t[] array on stack as src and dst,
>>> which leading to UB.
>>>
>>> This patch isn’t specific. And this patch add zero UB (it’s there before the patch,
>>> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t).
>>>
>>
>> This patch adds UB: src was int16_t before, so that the accesses in the
>> eight bit function were fine, but are not with this patch. Anyway, it is
>> irrelevant now.
> 
> Why it suddenly becomes a big problem access to properly aligned uint8_t *?
> 
> I don’t mind to discuss the rules regarding to these violating of strict aliasing,
> especially in checkasm. But why it suddenly becomes a rule blocking a patch
> trying to fix a fate failure.
> 
> I don’t buy the reason "the accesses in the eight bit function were fine”.
> 

The effective type violation goes hand in hand with using a too big
buffer for the smaller type, making the test less strict. This is an
issue that checkasm should worry about (the effective type violation
itself is not that important).
Anyway, have you seen my patch?

- Andreas
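
The strictness argument comes down to simple arithmetic. A rough sketch, assuming the buffer sizes from the patch and using hypothetical helper names: any bytes between the end of the real coefficients and the end of the buffer are still in bounds, so an over-read that stays within them would not be reported by a bounds checker such as AddressSanitizer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of FFmpeg. */

enum { NUM_COEFS = 16 };

/* Size of one stored coefficient: int16_t at 8 bits, int32_t above. */
static size_t coef_size(int bit_depth)
{
    return bit_depth == 8 ? sizeof(int16_t) : sizeof(int32_t);
}

/* Bytes the 16 input coefficients actually occupy. */
static size_t needed_bytes(int bit_depth)
{
    return NUM_COEFS * coef_size(bit_depth);
}

/* Bytes past the real data that are still inside the buffer.
 * Precondition: buf_bytes >= needed_bytes(bit_depth). */
static size_t slack_bytes(int bit_depth, size_t buf_bytes)
{
    return buf_bytes - needed_bytes(bit_depth);
}

/* How many extra coefficients a buggy function could read
 * before it leaves the buffer and becomes detectable. */
static size_t undetected_extra_elems(int bit_depth, size_t buf_bytes)
{
    return slack_bytes(bit_depth, buf_bytes) / coef_size(bit_depth);
}
```

For the shared 64-byte `src_buf`, `undetected_extra_elems(8, 64)` is 16, while for an exactly sized 32-byte int16_t buffer it is 0. At the higher bit depths the 64-byte buffer has no slack either way.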


* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant
  2025-06-16 18:29         ` Andreas Rheinhardt
@ 2025-06-17  2:01           ` Zhao Zhili
  0 siblings, 0 replies; 7+ messages in thread
From: Zhao Zhili @ 2025-06-17  2:01 UTC
  To: ffmpeg-devel


> On Jun 17, 2025, at 02:29, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> 
> Zhao Zhili:
>> 
>> 
>>>> On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>> 
>>> Zhao Zhili:
>>>> 
>>>> 
>>>>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
>>>>> 
>>>>> Zhao Zhili:
>>>>>> From: Zhao Zhili <zhilizhao@tencent.com>
>>>>>> 
>>>>>> ---
>>>>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>>>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>>>> 
>>>>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>>>>>> index f5f9650224..a0f8fd858a 100644
>>>>>> --- a/tests/checkasm/h264dsp.c
>>>>>> +++ b/tests/checkasm/h264dsp.c
>>>>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>>>>>> static void check_idct_dequant(void)
>>>>>> {
>>>>>>   static const int depths[5] = { 8, 9, 10, 12, 14 };
>>>>>> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
>>>>>> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
>>>>>> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
>>>>>> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>>>>>>   LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>>>>>>   LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
>>>>>> +    int16_t *src = (int16_t *)src_buf;
>>>>>>   int16_t *dst_ref = (int16_t *)dst0;
>>>>>>   int16_t *dst_new = (int16_t *)dst1;
>>>>>>   H264DSPContext h;
>>>>>>   int bit_depth, i, qmul;
>>>>>>   declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>>>>>> 
>>>>>> -    for (int j = 0; j < 16; j++)
>>>>>> -        src[j] = (rnd() % 512) - 256;
>>>>>> -
>>>>>>   qmul = rnd() % 4096;
>>>>>> 
>>>>>>   for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>>>>>>       bit_depth = depths[i];
>>>>>>       ff_h264dsp_init(&h, bit_depth, 1);
>>>>>> 
>>>>>> +        if (bit_depth == 8) {
>>>>>> +            for (size_t j = 0; j < 16; j++) {
>>>>>> +                int16_t r = (rnd() % 512) - 256;
>>>>>> +                AV_WN16A(&src_buf[j << 1], r);
>>>>>> +            }
>>>>>> +        } else {
>>>>>> +            for (size_t j = 0; j < 16; j++) {
>>>>>> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
>>>>>> +                AV_WN32A(&src_buf[j << 2], r);
>>>>>> +            }
>>>>>> +        }
>>>>>> +
>>>>>>       memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>>>>>>       memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>>>>>> 
>>>>> 
>>>>> This still has an effective-type violation: src_buf is of type uint8_t,
>>>>> yet the ff_h264_luma_dc_dequant_idct functions will read it as
>>>>> int16_t/int32_t. It also still has the downside that buffer overflows
>>>>> for the 8bit case can go undetected.
>>>> 
>>>> A bunch of template has cast like
>>>> 
>>>>   pixel *dst = (pixel *)_dst;
>>>>   const pixel *src = (const pixel *)_src;
>>>> 
>>>> then read and write as int16_t.
>>>> 
>>>> And a bunch of checkasm use uint8_t[] array on stack as src and dst,
>>>> which leading to UB.
>>>> 
>>>> This patch isn’t specific. And this patch add zero UB (it’s there before the patch,
>>>> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t).
>>>> 
>>> 
>>> This patch adds UB: src was int16_t before, so that the accesses in the
>>> eight bit function were fine, but are not with this patch. Anyway, it is
>>> irrelevant now.
>> 
>> Why it suddenly becomes a big problem access to properly aligned uint8_t *?
>> 
>> I don’t mind to discuss the rules regarding to these violating of strict aliasing,
>> especially in checkasm. But why it suddenly becomes a rule blocking a patch
>> trying to fix a fate failure.
>> 
>> I don’t buy the reason "the accesses in the eight bit function were fine”.
>> 
> 
> The effective type violation goes hand in hand with using a too big
> buffer for the smaller type, making the test less strict. This is an
> issue that checkasm should worry about (the effective type violation
> itself is not that important).

It’s the same buffer size as inside libavcodec/h264, so the test is as strict as the real use case. As long as the output is correct, over-reading a few bytes inside the input buffer doesn’t matter.

There are also tools to detect reads of uninitialized values. And without tools, a stack overflow cannot be detected either.

There is a v5. No more comments.

> Anyway, have you seen my patch?
> 
> - Andreas
> 
