Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: James Almer <jamrial@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [IMPORTANT] AI written TLS Code in WHIP patch
Date: Tue, 29 Jul 2025 19:39:18 -0300
Message-ID: <5183affe-1897-4c3f-abb2-6f24bd850aca@gmail.com> (raw)
In-Reply-To: <CABPLASRH5mu01nWg2w0Eu0mwkDQQvycFVPxvE-eaRD0gNGsc2Q@mail.gmail.com>


On 7/29/2025 5:56 PM, Kacper Michajlow wrote:
> On Tue, 29 Jul 2025 at 22:11, James Almer <jamrial@gmail.com> wrote:
>>
>> On 7/29/2025 5:02 PM, Kieran Kunhya via ffmpeg-devel wrote:
>>> Hello,
>>>
>>> It seems there is strong evidence that AI wrote TLS code as part of the
>>> WHIP patch. It goes without saying why this is bad. Further discussion
>>> here:
>>> https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20053
>>>
>>> This patch was pushed without ML review.
>>>
>>> I think this code should be removed before the FFmpeg release. I
>>> include TC in this email for that reason.
>>
>> The UTF8 dashes are not so much an indication of LLM output but one that
>> it was written with an unusual locale, I'd say.
> 
> I disagree. I wouldn't call out AI if there weren't a good
> indication that this is where those hyphens came from. I tested many
> LLMs to evaluate their usefulness and this is the kind of thing that
> they love to insert even in code. I would expect any developer (even
> one natively using a different locale) to use - in the .c file; after
> all, this is a common token in the code too.
> 
> Additionally, now I see there is also an ’ (0x2019) a few lines below in
> `to a av_malloc’d PEM string.`, which is also something that LLMs love
> to insert. I can even just now remove those comments and ask one of
> the biggest LLMs to comment on the code to reproduce the same 0x2019
> being inserted.

Alright, I was not aware this was common behavior of LLMs.

> 
> Lastly, a strong indication of an LLM is dummy comments for every
> operation. LLMs love to explain themselves. Comments in code are very
> useful tools, but you don't have to comment every function call and
> every label. IMHO it adds more noise than information, SNR is
> important. It's harmless, but look at pkey_to_pem_string() and tell me
> it really is organic to add `// Copy data & NUL-terminate` to a memcpy
> call. Again, I can reproduce this by querying an LLM to do so.

This one I know is common.

> 
> I'm not saying we should revert this code, but a good review would be
> in order to ensure we are not shipping something bad in there.

I, however, am saying we should revert it in the release/8.0 branch once
that branch is created and before the release is tagged. A proper review
can then happen in the master branch without the risk of realizing we
shipped dubious code in a tarball.

> 
> Note that my intention was not to start some big discussion, just
> clean the file from unnecessary similar looking utf-8 characters. I'm
> not opposed to AI/LLM use, but their output should be heavily
> sanitized as they are not reliable on their own.
> 
> - Kacper
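
The cleanup described above (finding look-alike UTF-8 characters such as
U+2019 in a .c file) could be checked with a small script along these lines.
This is only a minimal sketch in Python: the function name find_non_ascii and
the sample string in the usage note are assumptions made for illustration,
not part of the FFmpeg tree or of the patch under discussion.

```python
import sys
import unicodedata


def find_non_ascii(text):
    """Return (line_no, column, char, unicode_name) for each non-ASCII character.

    Hypothetical helper for illustration; line and column numbers are 1-based.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    # Usage: python find_non_ascii.py file1.c file2.c ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line_no, col, ch, name in find_non_ascii(f.read()):
                print(f"{path}:{line_no}:{col}: U+{ord(ch):04X} {name}")
```

Run against a comment like `to a av_malloc’d PEM string.`, this would report
the apostrophe as RIGHT SINGLE QUOTATION MARK, while an ordinary ASCII `-` or
`'` would not be flagged. It only locates such characters; whether they came
from an LLM or a locale setting is a separate question.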


