From: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/ppc/h264dsp: Fix unaligned stores Date: Thu, 14 Mar 2024 20:34:13 +0100 Message-ID: <AS8P250MB0744BF79544C54397DE319098F292@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM> (raw) In-Reply-To: <CAPBf_OnKZDjwPFMG40+iXogd7wzw0v8if2BMsCq2xsFAvDSVng@mail.gmail.com> Sean McGovern: > Andreas: > > On Wed, Mar 13, 2024 at 7:31 AM Andreas Rheinhardt > <andreas.rheinhardt@outlook.com> wrote: >> >> Also fix an effective-type violation. >> Exposed by https://fate.ffmpeg.org/report.cgi?time=20240312011016&slot=ppc-linux-gcc-13.2-ubsan-altivec-qemu >> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> >> --- >> libavcodec/ppc/h264dsp.c | 35 +++++++++++++++++------------------ >> 1 file changed, 17 insertions(+), 18 deletions(-) >> >> diff --git a/libavcodec/ppc/h264dsp.c b/libavcodec/ppc/h264dsp.c >> index c02733dda2..f50f2553a2 100644 >> --- a/libavcodec/ppc/h264dsp.c >> +++ b/libavcodec/ppc/h264dsp.c >> @@ -401,30 +401,29 @@ static inline void write16x4(uint8_t *dst, int dst_stride, >> register vec_u8 r0, register vec_u8 r1, >> register vec_u8 r2, register vec_u8 r3) { >> DECLARE_ALIGNED(16, unsigned char, result)[64]; >> - uint32_t *src_int = (uint32_t *)result, *dst_int = (uint32_t *)dst; >> - int int_dst_stride = dst_stride/4; >> + uint32_t *src_int = (uint32_t *)result; >> >> vec_st(r0, 0, result); >> vec_st(r1, 16, result); >> vec_st(r2, 32, result); >> vec_st(r3, 48, result); >> /* FIXME: there has to be a better way!!!! */ >> - *dst_int = *src_int; >> - *(dst_int+ int_dst_stride) = *(src_int + 1); >> - *(dst_int+ 2*int_dst_stride) = *(src_int + 2); >> - *(dst_int+ 3*int_dst_stride) = *(src_int + 3); >> - *(dst_int+ 4*int_dst_stride) = *(src_int + 4); >> - *(dst_int+ 5*int_dst_stride) = *(src_int + 5); >> - *(dst_int+ 6*int_dst_stride) = *(src_int + 6); >> - *(dst_int+ 7*int_dst_stride) = *(src_int + 7); >> - *(dst_int+ 8*int_dst_stride) = *(src_int + 8); >> - *(dst_int+ 9*int_dst_stride) = *(src_int + 9); >> - *(dst_int+10*int_dst_stride) = *(src_int + 10); >> - *(dst_int+11*int_dst_stride) = *(src_int + 11); >> - *(dst_int+12*int_dst_stride) = *(src_int + 12); >> - *(dst_int+13*int_dst_stride) = *(src_int + 13); >> - *(dst_int+14*int_dst_stride) = *(src_int + 14); >> - *(dst_int+15*int_dst_stride) = *(src_int + 15); >> + AV_WN32(dst, AV_RN32A(src_int + 0)); >> + AV_WN32(dst + dst_stride, AV_RN32A(src_int + 1)); >> + AV_WN32(dst + 2 * dst_stride, AV_RN32A(src_int + 2)); >> + AV_WN32(dst + 3 * dst_stride, AV_RN32A(src_int + 3)); >> + AV_WN32(dst + 4 * dst_stride, AV_RN32A(src_int + 4)); >> + AV_WN32(dst + 5 * dst_stride, AV_RN32A(src_int + 5)); >> + AV_WN32(dst + 6 * dst_stride, AV_RN32A(src_int + 6)); >> + AV_WN32(dst + 7 * dst_stride, AV_RN32A(src_int + 7)); >> + AV_WN32(dst + 8 * dst_stride, AV_RN32A(src_int + 8)); >> + AV_WN32(dst + 9 * dst_stride, AV_RN32A(src_int + 9)); >> + AV_WN32(dst + 10 * dst_stride, AV_RN32A(src_int + 10)); >> + AV_WN32(dst + 11 * dst_stride, AV_RN32A(src_int + 11)); >> + AV_WN32(dst + 12 * dst_stride, AV_RN32A(src_int + 12)); >> + AV_WN32(dst + 13 * dst_stride, AV_RN32A(src_int + 13)); >> + AV_WN32(dst + 14 * dst_stride, AV_RN32A(src_int + 14)); >> + AV_WN32(dst + 15 * dst_stride, AV_RN32A(src_int + 15)); >> } >> >> /** @brief performs a 6x16 transpose of data in src, and stores it to dst >> -- >> 2.40.1 >> >> _______________________________________________ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> To unsubscribe, visit link above, or email >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". > > First of all, thank you for looking into this. > > Second, do we feel that this change covers the FIXME immediately above > it that exclaims "there has to be a better way!!!!"? > If so, we can remove the comment. I don't think so. This code comes from a time when FFmpeg did not care about the effective-type-rules or about alignment due to undefined behaviour; it only cared about alignment when it led to crashes. The old discussion confirms this: https://ffmpeg.org/pipermail/ffmpeg-devel/2007-May/034609.html https://ffmpeg.org/pipermail/ffmpeg-devel/2007-May/034612.html contains the following: "As I said, I submitted this patch in order to have PPC users get some speed-up now rather than having a hypothetic optimal code when some of us who work on Altivec sit down and work on it. I do think it's better to have a committed faster code that leaves room for improvement than a fastest code that never sees the light." The fixme relates to this; it was probably considered advantageous to avoid storing the vectors in stack buffers. > > I did not perform a full FATE run as it is expensive on my QEMU setup, > but I can confirm that this fixes the checkasm-h264dsp test under GCC > UBsan there as well as on a POWER7 (ppc64) and a POWER9 (ppc64le). > Ok, will apply then. Thanks for testing. - Andreas _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
prev parent reply other threads:[~2024-03-14 19:34 UTC|newest] Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top 2024-03-12 22:15 [FFmpeg-devel] [PATCH] " Andreas Rheinhardt 2024-03-13 11:30 ` [FFmpeg-devel] [PATCH v2] " Andreas Rheinhardt 2024-03-13 12:04 ` James Almer 2024-03-13 12:10 ` Andreas Rheinhardt 2024-03-14 19:13 ` Sean McGovern 2024-03-14 19:23 ` James Almer 2024-03-14 19:34 ` Andreas Rheinhardt [this message]
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=AS8P250MB0744BF79544C54397DE319098F292@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM \ --to=andreas.rheinhardt@outlook.com \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git