* [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant @ 2025-06-16 9:40 Zhao Zhili 2025-06-16 9:46 ` Andreas Rheinhardt 0 siblings, 1 reply; 7+ messages in thread From: Zhao Zhili @ 2025-06-16 9:40 UTC (permalink / raw) To: ffmpeg-devel; +Cc: Zhao Zhili From: Zhao Zhili <zhilizhao@tencent.com> --- tests/checkasm/h264dsp.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c index f5f9650224..a0f8fd858a 100644 --- a/tests/checkasm/h264dsp.c +++ b/tests/checkasm/h264dsp.c @@ -328,25 +328,35 @@ static void check_idct_multiple(void) static void check_idct_dequant(void) { static const int depths[5] = { 8, 9, 10, 12, 14 }; - LOCAL_ALIGNED_16(int16_t, src, [16]); - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); + int16_t *src = (int16_t *)src_buf; int16_t *dst_ref = (int16_t *)dst0; int16_t *dst_new = (int16_t *)dst1; H264DSPContext h; int bit_depth, i, qmul; declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); - for (int j = 0; j < 16; j++) - src[j] = (rnd() % 512) - 256; - qmul = rnd() % 4096; for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { bit_depth = depths[i]; ff_h264dsp_init(&h, bit_depth, 1); + if (bit_depth == 8) { + for (size_t j = 0; j < 16; j++) { + int16_t r = (rnd() % 512) - 256; + AV_WN16A(&src_buf[j << 1], r); + } + } else { + for (size_t j = 0; j < 16; j++) { + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); + AV_WN32A(&src_buf[j << 2], r); + } + } + memset(dst0, 0, 16 * 16 * SIZEOF_COEF); memset(dst1, 0, 16 * 16 * SIZEOF_COEF); -- 2.25.1 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant 2025-06-16 9:40 [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant Zhao Zhili @ 2025-06-16 9:46 ` Andreas Rheinhardt 2025-06-16 10:21 ` Zhao Zhili 0 siblings, 1 reply; 7+ messages in thread From: Andreas Rheinhardt @ 2025-06-16 9:46 UTC (permalink / raw) To: ffmpeg-devel Zhao Zhili: > From: Zhao Zhili <zhilizhao@tencent.com> > > --- > tests/checkasm/h264dsp.c | 20 +++++++++++++++----- > 1 file changed, 15 insertions(+), 5 deletions(-) > > diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c > index f5f9650224..a0f8fd858a 100644 > --- a/tests/checkasm/h264dsp.c > +++ b/tests/checkasm/h264dsp.c > @@ -328,25 +328,35 @@ static void check_idct_multiple(void) > static void check_idct_dequant(void) > { > static const int depths[5] = { 8, 9, 10, 12, 14 }; > - LOCAL_ALIGNED_16(int16_t, src, [16]); > - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ > + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ > + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); > LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); > LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); > + int16_t *src = (int16_t *)src_buf; > int16_t *dst_ref = (int16_t *)dst0; > int16_t *dst_new = (int16_t *)dst1; > H264DSPContext h; > int bit_depth, i, qmul; > declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); > > - for (int j = 0; j < 16; j++) > - src[j] = (rnd() % 512) - 256; > - > qmul = rnd() % 4096; > > for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { > bit_depth = depths[i]; > ff_h264dsp_init(&h, bit_depth, 1); > > + if (bit_depth == 8) { > + for (size_t j = 0; j < 16; j++) { > + int16_t r = (rnd() % 512) - 256; > + AV_WN16A(&src_buf[j << 1], r); > + } > + } else { > + for (size_t j = 0; j < 16; j++) { > + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); > + AV_WN32A(&src_buf[j << 2], r); > + } > + } > + > memset(dst0, 0, 16 * 16 * SIZEOF_COEF); > memset(dst1, 0, 16 * 16 * SIZEOF_COEF); > This still has an effective-type violation: src_buf is of type uint8_t, yet the ff_h264_luma_dc_dequant_idct functions will read it as int16_t/int32_t. It also still has the downside that buffer overflows for the 8bit case can go undetected. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant 2025-06-16 9:46 ` Andreas Rheinhardt @ 2025-06-16 10:21 ` Zhao Zhili 2025-06-16 11:03 ` Andreas Rheinhardt 0 siblings, 1 reply; 7+ messages in thread From: Zhao Zhili @ 2025-06-16 10:21 UTC (permalink / raw) To: FFmpeg development discussions and patches > On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: > > Zhao Zhili: >> From: Zhao Zhili <zhilizhao@tencent.com> >> >> --- >> tests/checkasm/h264dsp.c | 20 +++++++++++++++----- >> 1 file changed, 15 insertions(+), 5 deletions(-) >> >> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c >> index f5f9650224..a0f8fd858a 100644 >> --- a/tests/checkasm/h264dsp.c >> +++ b/tests/checkasm/h264dsp.c >> @@ -328,25 +328,35 @@ static void check_idct_multiple(void) >> static void check_idct_dequant(void) >> { >> static const int depths[5] = { 8, 9, 10, 12, 14 }; >> - LOCAL_ALIGNED_16(int16_t, src, [16]); >> - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ >> + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ >> + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); >> LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); >> LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); >> + int16_t *src = (int16_t *)src_buf; >> int16_t *dst_ref = (int16_t *)dst0; >> int16_t *dst_new = (int16_t *)dst1; >> H264DSPContext h; >> int bit_depth, i, qmul; >> declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); >> >> - for (int j = 0; j < 16; j++) >> - src[j] = (rnd() % 512) - 256; >> - >> qmul = rnd() % 4096; >> >> for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { >> bit_depth = depths[i]; >> ff_h264dsp_init(&h, bit_depth, 1); >> >> + if (bit_depth == 8) { >> + for (size_t j = 0; j < 16; j++) { >> + int16_t r = (rnd() % 512) - 256; >> + AV_WN16A(&src_buf[j << 1], r); >> + } >> + } else { >> + for (size_t j = 0; j < 16; j++) { >> + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); >> + AV_WN32A(&src_buf[j << 2], r); >> + } >> + } >> + >> memset(dst0, 0, 16 * 16 * SIZEOF_COEF); >> memset(dst1, 0, 16 * 16 * SIZEOF_COEF); >> > > This still has an effective-type violation: src_buf is of type uint8_t, > yet the ff_h264_luma_dc_dequant_idct functions will read it as > int16_t/int32_t. It also still has the downside that buffer overflows > for the 8bit case can go undetected. A bunch of template has cast like pixel *dst = (pixel *)_dst; const pixel *src = (const pixel *)_src; then read and write as int16_t. And a bunch of checkasm use uint8_t[] array on stack as src and dst, which leading to UB. This patch isn’t specific. And this patch add zero UB (it’s there before the patch, both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t). > > - Andreas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant 2025-06-16 10:21 ` Zhao Zhili @ 2025-06-16 11:03 ` Andreas Rheinhardt 2025-06-16 11:49 ` Zhao Zhili 0 siblings, 1 reply; 7+ messages in thread From: Andreas Rheinhardt @ 2025-06-16 11:03 UTC (permalink / raw) To: ffmpeg-devel Zhao Zhili: > > >> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >> >> Zhao Zhili: >>> From: Zhao Zhili <zhilizhao@tencent.com> >>> >>> --- >>> tests/checkasm/h264dsp.c | 20 +++++++++++++++----- >>> 1 file changed, 15 insertions(+), 5 deletions(-) >>> >>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c >>> index f5f9650224..a0f8fd858a 100644 >>> --- a/tests/checkasm/h264dsp.c >>> +++ b/tests/checkasm/h264dsp.c >>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void) >>> static void check_idct_dequant(void) >>> { >>> static const int depths[5] = { 8, 9, 10, 12, 14 }; >>> - LOCAL_ALIGNED_16(int16_t, src, [16]); >>> - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ >>> + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ >>> + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); >>> LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); >>> LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); >>> + int16_t *src = (int16_t *)src_buf; >>> int16_t *dst_ref = (int16_t *)dst0; >>> int16_t *dst_new = (int16_t *)dst1; >>> H264DSPContext h; >>> int bit_depth, i, qmul; >>> declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); >>> >>> - for (int j = 0; j < 16; j++) >>> - src[j] = (rnd() % 512) - 256; >>> - >>> qmul = rnd() % 4096; >>> >>> for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { >>> bit_depth = depths[i]; >>> ff_h264dsp_init(&h, bit_depth, 1); >>> >>> + if (bit_depth == 8) { >>> + for (size_t j = 0; j < 16; j++) { >>> + int16_t r = (rnd() % 512) - 256; >>> + AV_WN16A(&src_buf[j << 1], r); >>> + } >>> + } else { >>> + for (size_t j = 0; j < 16; j++) { >>> + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); >>> + AV_WN32A(&src_buf[j << 2], r); >>> + } >>> + } >>> + >>> memset(dst0, 0, 16 * 16 * SIZEOF_COEF); >>> memset(dst1, 0, 16 * 16 * SIZEOF_COEF); >>> >> >> This still has an effective-type violation: src_buf is of type uint8_t, >> yet the ff_h264_luma_dc_dequant_idct functions will read it as >> int16_t/int32_t. It also still has the downside that buffer overflows >> for the 8bit case can go undetected. > > A bunch of template has cast like > > pixel *dst = (pixel *)_dst; > const pixel *src = (const pixel *)_src; > > then read and write as int16_t. > > And a bunch of checkasm use uint8_t[] array on stack as src and dst, > which leading to UB. > > This patch isn’t specific. And this patch add zero UB (it’s there before the patch, > both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t). > This patch adds UB: src was int16_t before, so that the accesses in the eight bit function were fine, but are not with this patch. Anyway, it is irrelevant now. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant 2025-06-16 11:03 ` Andreas Rheinhardt @ 2025-06-16 11:49 ` Zhao Zhili 2025-06-16 18:29 ` Andreas Rheinhardt 0 siblings, 1 reply; 7+ messages in thread From: Zhao Zhili @ 2025-06-16 11:49 UTC (permalink / raw) To: FFmpeg development discussions and patches > On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: > > Zhao Zhili: >> >> >>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >>> >>> Zhao Zhili: >>>> From: Zhao Zhili <zhilizhao@tencent.com> >>>> >>>> --- >>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++----- >>>> 1 file changed, 15 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c >>>> index f5f9650224..a0f8fd858a 100644 >>>> --- a/tests/checkasm/h264dsp.c >>>> +++ b/tests/checkasm/h264dsp.c >>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void) >>>> static void check_idct_dequant(void) >>>> { >>>> static const int depths[5] = { 8, 9, 10, 12, 14 }; >>>> - LOCAL_ALIGNED_16(int16_t, src, [16]); >>>> - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ >>>> + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ >>>> + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); >>>> LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); >>>> LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); >>>> + int16_t *src = (int16_t *)src_buf; >>>> int16_t *dst_ref = (int16_t *)dst0; >>>> int16_t *dst_new = (int16_t *)dst1; >>>> H264DSPContext h; >>>> int bit_depth, i, qmul; >>>> declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); >>>> >>>> - for (int j = 0; j < 16; j++) >>>> - src[j] = (rnd() % 512) - 256; >>>> - >>>> qmul = rnd() % 4096; >>>> >>>> for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { >>>> bit_depth = depths[i]; >>>> ff_h264dsp_init(&h, bit_depth, 1); >>>> >>>> + if (bit_depth == 8) { >>>> + for (size_t j = 0; j < 16; j++) { >>>> + int16_t r = (rnd() % 512) - 256; >>>> + AV_WN16A(&src_buf[j << 1], r); >>>> + } >>>> + } else { >>>> + for (size_t j = 0; j < 16; j++) { >>>> + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); >>>> + AV_WN32A(&src_buf[j << 2], r); >>>> + } >>>> + } >>>> + >>>> memset(dst0, 0, 16 * 16 * SIZEOF_COEF); >>>> memset(dst1, 0, 16 * 16 * SIZEOF_COEF); >>>> >>> >>> This still has an effective-type violation: src_buf is of type uint8_t, >>> yet the ff_h264_luma_dc_dequant_idct functions will read it as >>> int16_t/int32_t. It also still has the downside that buffer overflows >>> for the 8bit case can go undetected. >> >> A bunch of template has cast like >> >> pixel *dst = (pixel *)_dst; >> const pixel *src = (const pixel *)_src; >> >> then read and write as int16_t. >> >> And a bunch of checkasm use uint8_t[] array on stack as src and dst, >> which leading to UB. >> >> This patch isn’t specific. And this patch add zero UB (it’s there before the patch, >> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t). >> > > This patch adds UB: src was int16_t before, so that the accesses in the > eight bit function were fine, but are not with this patch. Anyway, it is > irrelevant now. Why it suddenly becomes a big problem access to properly aligned uint8_t *? I don’t mind to discuss the rules regarding to these violating of strict aliasing, especially in checkasm. But why it suddenly becomes a rule blocking a patch trying to fix a fate failure. I don’t buy the reason "the accesses in the eight bit function were fine”. > > - Andreas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant 2025-06-16 11:49 ` Zhao Zhili @ 2025-06-16 18:29 ` Andreas Rheinhardt 2025-06-17 2:01 ` Zhao Zhili 0 siblings, 1 reply; 7+ messages in thread From: Andreas Rheinhardt @ 2025-06-16 18:29 UTC (permalink / raw) To: ffmpeg-devel Zhao Zhili: > > >> On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >> >> Zhao Zhili: >>> >>> >>>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >>>> >>>> Zhao Zhili: >>>>> From: Zhao Zhili <zhilizhao@tencent.com> >>>>> >>>>> --- >>>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++----- >>>>> 1 file changed, 15 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c >>>>> index f5f9650224..a0f8fd858a 100644 >>>>> --- a/tests/checkasm/h264dsp.c >>>>> +++ b/tests/checkasm/h264dsp.c >>>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void) >>>>> static void check_idct_dequant(void) >>>>> { >>>>> static const int depths[5] = { 8, 9, 10, 12, 14 }; >>>>> - LOCAL_ALIGNED_16(int16_t, src, [16]); >>>>> - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ >>>>> + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ >>>>> + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); >>>>> LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); >>>>> LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); >>>>> + int16_t *src = (int16_t *)src_buf; >>>>> int16_t *dst_ref = (int16_t *)dst0; >>>>> int16_t *dst_new = (int16_t *)dst1; >>>>> H264DSPContext h; >>>>> int bit_depth, i, qmul; >>>>> declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); >>>>> >>>>> - for (int j = 0; j < 16; j++) >>>>> - src[j] = (rnd() % 512) - 256; >>>>> - >>>>> qmul = rnd() % 4096; >>>>> >>>>> for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { >>>>> bit_depth = depths[i]; >>>>> ff_h264dsp_init(&h, bit_depth, 1); >>>>> >>>>> + if (bit_depth == 8) { >>>>> + for (size_t j = 0; j < 16; j++) { >>>>> + int16_t r = (rnd() % 512) - 256; >>>>> + AV_WN16A(&src_buf[j << 1], r); >>>>> + } >>>>> + } else { >>>>> + for (size_t j = 0; j < 16; j++) { >>>>> + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); >>>>> + AV_WN32A(&src_buf[j << 2], r); >>>>> + } >>>>> + } >>>>> + >>>>> memset(dst0, 0, 16 * 16 * SIZEOF_COEF); >>>>> memset(dst1, 0, 16 * 16 * SIZEOF_COEF); >>>>> >>>> >>>> This still has an effective-type violation: src_buf is of type uint8_t, >>>> yet the ff_h264_luma_dc_dequant_idct functions will read it as >>>> int16_t/int32_t. It also still has the downside that buffer overflows >>>> for the 8bit case can go undetected. >>> >>> A bunch of template has cast like >>> >>> pixel *dst = (pixel *)_dst; >>> const pixel *src = (const pixel *)_src; >>> >>> then read and write as int16_t. >>> >>> And a bunch of checkasm use uint8_t[] array on stack as src and dst, >>> which leading to UB. >>> >>> This patch isn’t specific. And this patch add zero UB (it’s there before the patch, >>> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t). >>> >> >> This patch adds UB: src was int16_t before, so that the accesses in the >> eight bit function were fine, but are not with this patch. Anyway, it is >> irrelevant now. > > Why it suddenly becomes a big problem access to properly aligned uint8_t *? > > I don’t mind to discuss the rules regarding to these violating of strict aliasing, > especially in checkasm. But why it suddenly becomes a rule blocking a patch > trying to fix a fate failure. > > I don’t buy the reason "the accesses in the eight bit function were fine”. > The effective type violation goes hand in hand with using a too big buffer for the smaller type, making the test less strict. This is an issue that checkasm should worry about (the effective type violation itself is not that important). Anyway, have you seen my patch? - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant 2025-06-16 18:29 ` Andreas Rheinhardt @ 2025-06-17 2:01 ` Zhao Zhili 0 siblings, 0 replies; 7+ messages in thread From: Zhao Zhili @ 2025-06-17 2:01 UTC (permalink / raw) To: ffmpeg-devel > 在 2025年6月17日,上午2:29,Andreas Rheinhardt <andreas.rheinhardt@outlook.com> 写道: > > Zhao Zhili: >> >> >>>> On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >>> >>> Zhao Zhili: >>>> >>>> >>>>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: >>>>> >>>>> Zhao Zhili: >>>>>> From: Zhao Zhili <zhilizhao@tencent.com> >>>>>> >>>>>> --- >>>>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++----- >>>>>> 1 file changed, 15 insertions(+), 5 deletions(-) >>>>>> >>>>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c >>>>>> index f5f9650224..a0f8fd858a 100644 >>>>>> --- a/tests/checkasm/h264dsp.c >>>>>> +++ b/tests/checkasm/h264dsp.c >>>>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void) >>>>>> static void check_idct_dequant(void) >>>>>> { >>>>>> static const int depths[5] = { 8, 9, 10, 12, 14 }; >>>>>> - LOCAL_ALIGNED_16(int16_t, src, [16]); >>>>>> - /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */ >>>>>> + /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */ >>>>>> + LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]); >>>>>> LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]); >>>>>> LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]); >>>>>> + int16_t *src = (int16_t *)src_buf; >>>>>> int16_t *dst_ref = (int16_t *)dst0; >>>>>> int16_t *dst_new = (int16_t *)dst1; >>>>>> H264DSPContext h; >>>>>> int bit_depth, i, qmul; >>>>>> declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul); >>>>>> >>>>>> - for (int j = 0; j < 16; j++) >>>>>> - src[j] = (rnd() % 512) - 256; >>>>>> - >>>>>> qmul = rnd() % 4096; >>>>>> >>>>>> for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) { >>>>>> bit_depth = depths[i]; >>>>>> ff_h264dsp_init(&h, bit_depth, 1); >>>>>> >>>>>> + if (bit_depth == 8) { >>>>>> + for (size_t j = 0; j < 16; j++) { >>>>>> + int16_t r = (rnd() % 512) - 256; >>>>>> + AV_WN16A(&src_buf[j << 1], r); >>>>>> + } >>>>>> + } else { >>>>>> + for (size_t j = 0; j < 16; j++) { >>>>>> + int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth); >>>>>> + AV_WN32A(&src_buf[j << 2], r); >>>>>> + } >>>>>> + } >>>>>> + >>>>>> memset(dst0, 0, 16 * 16 * SIZEOF_COEF); >>>>>> memset(dst1, 0, 16 * 16 * SIZEOF_COEF); >>>>>> >>>>> >>>>> This still has an effective-type violation: src_buf is of type uint8_t, >>>>> yet the ff_h264_luma_dc_dequant_idct functions will read it as >>>>> int16_t/int32_t. It also still has the downside that buffer overflows >>>>> for the 8bit case can go undetected. >>>> >>>> A bunch of template has cast like >>>> >>>> pixel *dst = (pixel *)_dst; >>>> const pixel *src = (const pixel *)_src; >>>> >>>> then read and write as int16_t. >>>> >>>> And a bunch of checkasm use uint8_t[] array on stack as src and dst, >>>> which leading to UB. >>>> >>>> This patch isn’t specific. And this patch add zero UB (it’s there before the patch, >>>> both src and dst are accessed as int32_t/int16_t while they are int16_t and uint8_t). >>>> >>> >>> This patch adds UB: src was int16_t before, so that the accesses in the >>> eight bit function were fine, but are not with this patch. Anyway, it is >>> irrelevant now. >> >> Why it suddenly becomes a big problem access to properly aligned uint8_t *? >> >> I don’t mind to discuss the rules regarding to these violating of strict aliasing, >> especially in checkasm. But why it suddenly becomes a rule blocking a patch >> trying to fix a fate failure. >> >> I don’t buy the reason "the accesses in the eight bit function were fine”. >> > > The effective type violation goes hand in hand with using a too big > buffer for the smaller type, making the test less strict. This is an > issue that checkasm should worry about (the effective type violation > itself is not that important). It’s the same buffer size inside libavcodec/h264, the test is as strict as real use case. As long as the output is correct, over read a few bytes inside the input buffer doesn’t matter. And there are tools to detect read uninitialized values. Without tools, stack overflow cannot be detected neither. There is a v5. No more comments. > Anyway, have you seen my patch? > > - Andreas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-06-17 2:02 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-06-16 9:40 [FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant Zhao Zhili 2025-06-16 9:46 ` Andreas Rheinhardt 2025-06-16 10:21 ` Zhao Zhili 2025-06-16 11:03 ` Andreas Rheinhardt 2025-06-16 11:49 ` Zhao Zhili 2025-06-16 18:29 ` Andreas Rheinhardt 2025-06-17 2:01 ` Zhao Zhili
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git