Re: [FFmpeg-devel] [PATCH 5/6] tools/target_dec_fuzzer: Use av_buffer_allocz() to avoid missing slices to have unpredictable content

From: Kacper Michajlow <kasper93@gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 5/6] tools/target_dec_fuzzer: Use av_buffer_allocz() to avoid missing slices to have unpredictable content
Date: Fri, 9 Aug 2024 03:56:42 +0200
Message-ID: <CABPLASQvQ4pTOYSmgiX0QWRPSOj-bqgCztZk+cQJWczRgmTyhw@mail.gmail.com>
In-Reply-To: <20240808212701.GC4991@pb2>

On Fri, 9 Aug 2024 at 00:06, Michael Niedermayer <michael@niedermayer.cc> wrote:
>
> On Thu, Aug 08, 2024 at 02:13:12PM -0300, James Almer wrote:
> > On 8/6/2024 7:18 PM, Michael Niedermayer wrote:
> > > Fixes: use of uninitialized values
> > > Fixes: 70885/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VP6F_fuzzer-4610946029387776 (and likely others)
> > >
> > > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > ---
> > >   tools/target_dec_fuzzer.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tools/target_dec_fuzzer.c b/tools/target_dec_fuzzer.c
> > > index d2d7e21dac7..794b5b92cc7 100644
> > > --- a/tools/target_dec_fuzzer.c
> > > +++ b/tools/target_dec_fuzzer.c
> > > @@ -129,7 +129,7 @@ static int fuzz_video_get_buffer(AVCodecContext *ctx, AVFrame *frame)
> > >       frame->extended_data = frame->data;
> > >       for (i = 0; i < 4 && size[i]; i++) {
> > > -        frame->buf[i] = av_buffer_alloc(size[i]);
> > > +        frame->buf[i] = av_buffer_allocz(size[i]);
> > >           if (!frame->buf[i])
> > >               goto fail;
> > >           frame->data[i] = frame->buf[i]->data;
> >
> > Wouldn't this hide actual decoder bugs too?
>
> I am not sure I understand what you mean

In general, clearing buffers before processing makes MSAN less
effective in discovering invalid accesses because they would all
appear valid from its point of view. So, I guess the argument was that
this could hide actual decoder bugs since the buffers are already
initialized by the fuzzing binary itself, which, in theory, is
supposed to emulate the worst-case scenario for a tested decoder.
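
To make the masking effect concrete, here is a minimal sketch in plain C.
It does not use MSAN or any FFmpeg API. It models shadow memory with an
explicit array, and every name in it (ToyBuf, toy_alloc, and so on) is
made up for illustration. A buffer allocated "zeroed" is marked as fully
initialized up front, so a decoder that stops early leaves nothing for
the checker to report:

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for MSan shadow memory: shadow[i] == 1 means data[i] was written. */
typedef struct ToyBuf {
    unsigned char *data;
    unsigned char *shadow;
    size_t         size;
} ToyBuf;

/* zeroed != 0 models av_buffer_allocz(): every byte counts as initialized. */
static int toy_alloc(ToyBuf *b, size_t size, int zeroed)
{
    b->size   = 0;
    b->data   = size ? calloc(size, 1) : NULL; /* calloc keeps the toy itself well-defined */
    b->shadow = size ? malloc(size)    : NULL;
    if (!b->data || !b->shadow) {
        free(b->data);
        free(b->shadow);
        b->data = b->shadow = NULL;
        return -1;
    }
    memset(b->shadow, zeroed ? 1 : 0, size);
    b->size = size;
    return 0;
}

/* Hypothetical decoder that hits an error after writing only `written` bytes. */
static void toy_decode_partial(ToyBuf *b, size_t written)
{
    if (written > b->size)
        written = b->size;
    memset(b->data,   0x80, written);
    memset(b->shadow, 1,    written);
}

/* Count the bytes for which a full-frame read would be reported. */
static size_t toy_uninit_reads(const ToyBuf *b)
{
    size_t n = 0;
    for (size_t i = 0; i < b->size; i++)
        n += !b->shadow[i];
    return n;
}

static void toy_free(ToyBuf *b)
{
    free(b->data);
    free(b->shadow);
    b->data = b->shadow = NULL;
    b->size = 0;
}
```

With a 16-byte buffer and a decoder that writes only 10 bytes, the
non-zeroed variant reports 6 suspicious bytes and the zeroed variant
reports none, although the decoder behaved identically in both cases.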

> If decoders are fed with uninitialized buffers, that's a
> security issue because there are thousands, if not tens of thousands, of
> paths if you consider the number of decoders and the number
> of ways they can hit errors

Clearing those buffers in fuzzers does not alleviate this security
issue, as they may still be uninitialized in production code.

> Paths in which these buffers are not filled completely, so each
> of these paths would then need to clear the right bits of data.
> Basically that means implementing error concealment for every decoder.
> AND making sure that error concealment code is 100% bug-free and never
> leaves a spot uncleaned and never touches something that was not written to

Isn't that the point of uninitialized access checking? I can't speak
to the scale of the problem because I don't know what the issues are.
In principle, you don't have to clear each uninitialized part of the
buffer that may occur due to an error. Instead, you should ensure that
the buffer is not accessed when the error occurs. If decoders rely on
external users to provide zeroed buffers to work correctly, then this
should be documented as an API requirement.

Outputting garbage on errors is acceptable, but if decoders process
uninitialized data internally when errors occur, they are, at best,
non-deterministic...
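
As a rough illustration of "don't access the buffer after an error", the
sketch below is a toy in plain C, not code from any FFmpeg decoder.
ToyFrame, toy_decode and toy_sum_valid are hypothetical names. The
decoder records how many rows it actually wrote, and later stages read
only those rows, so nothing needs to be cleared on the error path:

```c
#include <stddef.h>

enum { TOY_W = 4, TOY_H = 4 };

typedef struct ToyFrame {
    unsigned char px[TOY_H][TOY_W];
    int rows_valid; /* rows [0, rows_valid) were written by the decoder */
} ToyFrame;

/* Hypothetical decoder: copies one row per TOY_W input bytes.
 * Returns 0 on success, -1 if the input is truncated. */
static int toy_decode(ToyFrame *f, const unsigned char *in, size_t in_size)
{
    f->rows_valid = 0;
    for (int y = 0; y < TOY_H; y++) {
        if (in_size < TOY_W)
            return -1; /* remaining rows stay unwritten and are never read */
        for (int x = 0; x < TOY_W; x++)
            f->px[y][x] = in[x];
        in      += TOY_W;
        in_size -= TOY_W;
        f->rows_valid = y + 1;
    }
    return 0;
}

/* Post-processing step that touches only rows known to be written. */
static unsigned toy_sum_valid(const ToyFrame *f)
{
    unsigned sum = 0;
    for (int y = 0; y < f->rows_valid; y++)
        for (int x = 0; x < TOY_W; x++)
            sum += f->px[y][x];
    return sum;
}
```

The frame can start out completely uninitialized here. A truncated input
produces an error and a shorter valid region, and the output stays
deterministic because the unwritten rows never feed into any computation.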

> Security-wise this is not possible for production code, it's too
> fragile (at least with the number of decoders and active maintainers we have)
> (you want less code to have to be bugfree for security not more code having
>  to be bug free)
>
> Now this is the fuzzer and not production code, ok. And of course it is
> great to have error concealment in every decoder
> But then this leaves the question, who will do this work?
> If no one does it then we will accumulate many msan bugs in ossfuzz that we won't
> be able to do much with except ignore them.
> This would make the fuzzer less efficient and it would confuse people looking
> at the issues

MSAN is not forgiving, and I can imagine that stabilizing it could
take time. However, suppressing the reports will not make it more
efficient. I might not fully understand what you meant, though.

That being said, I think the patch makes sense as a short-term
solution to suppress the bulk of reports and focus on the remaining
ones. However, it would be good to make it clear that, at some point,
it should be reverted. As it stands now, no one will remember why it
was zeroed out, and it could remain that way indefinitely. Perhaps it
should be configurable per decoder.
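
One possible shape for "configurable per decoder", as a standalone sketch
rather than a patch: the codec IDs, the allowlist and the toy_* names are
all invented, and malloc()/calloc() stand in for av_buffer_alloc() and
av_buffer_allocz() so the example compiles without FFmpeg headers.
Zeroing stays the default, and a decoder gets the non-zeroed allocator
only once it is listed explicitly:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical codec IDs; the real fuzzer would use AVCodecID values. */
enum ToyCodec { TOY_CODEC_A, TOY_CODEC_B, TOY_CODEC_C };

typedef void *(*toy_alloc_fn)(size_t size);

/* Stand-in for av_buffer_alloc(): contents are left uninitialized. */
static void *toy_alloc_plain(size_t size)  { return malloc(size); }
/* Stand-in for av_buffer_allocz(): contents are zeroed. */
static void *toy_alloc_zeroed(size_t size) { return calloc(1, size); }

/* Decoders believed to cope with uninitialized frame buffers (made-up list). */
static const enum ToyCodec toy_msan_clean[] = { TOY_CODEC_A };

static toy_alloc_fn toy_pick_alloc(enum ToyCodec id)
{
    for (size_t i = 0; i < sizeof(toy_msan_clean) / sizeof(toy_msan_clean[0]); i++)
        if (toy_msan_clean[i] == id)
            return toy_alloc_plain;
    return toy_alloc_zeroed; /* default: zeroed, as in the patch */
}
```

An explicit list like this would at least document why a given decoder
still gets zeroed buffers, and removing the default later becomes a
matter of growing the list.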

> Or the short punchy reply maybe is
> Produce a volunteer who will fix these bugs before declaring them bugs.
> And when doing so consider that we have bugfixes on the mailing list for which we
> seem to not even have the manpower to review and apply them
>
> so yeah my opinion is the default should be the simple & easy to maintain way.
> If someone declares their decoder to have flawless error concealment (and for some
> simple decoders that could be quite simple) these can always be excluded and use
> uninitialized buffers in the fuzzer

What is the problem with keeping those reports and letting "someone"
work on their decoder based on reports?

- Kacper