On 7/29/2025 5:56 PM, Kacper Michajlow wrote:
> On Tue, 29 Jul 2025 at 22:11, James Almer wrote:
>>
>> On 7/29/2025 5:02 PM, Kieran Kunhya via ffmpeg-devel wrote:
>>> Hello,
>>>
>>> It seems there is strong evidence that AI wrote TLS code as part of the
>>> WHIP patch. It goes without saying why this is bad. Further discussion
>>> here:
>>> https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20053
>>>
>>> This patch was pushed without ML review.
>>>
>>> I think this code should be removed before the FFmpeg release. I
>>> include TC in this email for that reason.
>>
>> The UTF8 dashes are not so much an indication of LLM output but one that
>> it was written with an unusual locale, I'd say.
>
> I disagree. I wouldn't call out AI if there weren't a good
> indication that this is where those hyphens came from. I tested many
> LLMs to evaluate their usefulness and this is the kind of thing that
> they love to insert even in code. I would expect any developer (even
> one natively using a different locale) to use - in the .c file; after all,
> this is a common token in the code too.
>
> Additionally, now I see there is also a ’ (0x2019) a few lines below in
> `to a av_malloc’d PEM string.` Which is also something that LLMs love
> to insert. I can even just now remove those comments and ask one of
> the biggest LLMs to comment on the code to reproduce the same 0x2019
> being inserted.

Alright, I was not aware this was common behavior of LLMs.

>
> Lastly, a strong indication of an LLM is the dummy comments for every
> operation. LLMs love to explain themselves. Comments in code are very
> useful tools, but you don't have to comment every function call and
> every label. IMHO it adds more noise than information, SNR is
> important. It's harmless, but look at pkey_to_pem_string() and tell me
> it really is organic to add `// Copy data & NUL-terminate` to a memcpy
> call. Again I can reproduce this by querying an LLM to do so.

This one I know is common.
>
> I'm not saying we should revert this code, but a good review would be
> in order to ensure we are not shipping something bad in there.

I, however, am saying we should revert it in the release/8.0 branch once
that branch is created and before the release is tagged. A proper review
can then happen in the master branch without the risk of later realizing
we shipped dubious code in a tarball.

>
> Note that my intention was not to start some big discussion, just
> clean the file from unnecessary similar looking utf-8 characters. I'm
> not opposed to AI/LLM use, but their output should be heavily
> sanitized as they are not reliable on their own.
>
> - Kacper
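
On the cleanup of look-alike UTF-8 characters mentioned above: a minimal
sketch of how such characters could be located in a source file before
review. It is not part of the patch or of any FFmpeg tooling. The function
name find_non_ascii and the sample text are hypothetical; the sample is
only modeled on the comment quoted in this thread.

```python
def find_non_ascii(text):
    """Return (line, column, char, codepoint) for every non-ASCII character.

    Hypothetical helper for illustration; line and column are 1-based.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((lineno, col, ch, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical sample modeled on the comment quoted in the thread.
sample = (
    "/* to a av_malloc\u2019d PEM string. */\n"
    "memcpy(dst, src, n); // Copy data & NUL-terminate\n"
)

for lineno, col, ch, codepoint in find_non_ascii(sample):
    print(f"line {lineno}, column {col}: {ch!r} ({codepoint})")
```

To check a real file, the text could be read with
open(path, encoding="utf-8").read() and passed to the same function. A hit
only says that a non-ASCII character is present, not where it came from.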