[FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

* [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
@ 2025-07-01 10:58 Alexander Strasser via ffmpeg-devel
  2025-07-01 11:20 ` Gyan Doshi
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Alexander Strasser via ffmpeg-devel @ 2025-07-01 10:58 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Alexander Strasser

[-- Attachment #1: Type: message/rfc822, Size: 7221 bytes --]

From: Alexander Strasser <eclipse7@gmx.net>
To: ffmpeg-devel@ffmpeg.org
Subject: [RFC] Introducing policies regarding "AI" contributions
Date: Tue, 1 Jul 2025 12:58:23 +0200
Message-ID: <aGO_T_HfuvhbgYoy@metallschleim.local>

Hi all,

I do not like the branding of the LLMs as AI, thus I will for now
continue to call it "AI" in quotes. I'm open for better terms.

It was just yesterday brought up on IRC in #ffmpeg-devel that there
was at least one, marked attempt to include "AI" generated code[1].

At least I would say that this particular patch series was rejected,
but there was no explicit discussion or clear statement about
"AI" generated content; especially code.

Thus I want this thread to start a discussion, that eventually leads
to a policy about submitting and integrating "AI" generated content.

Leaving all ethical issues aside for a moment I still see 2 very big
problems with AI generated code:

* looks generally plausible but is often subtly wrong
    * leading to more work, regressions and costs
        * which often lands on a different group of people (other
          projects, reviewers, bug finders, bug fixers, etc.)
        * which are sometimes delayed for quite some time increasing
          the costs of fixing them
* license/copyright violations
    * this might be sometimes a non-issue with small changes
    * but especially for complete components the risk seems high

There is a lot more to the topic and I probably forgot to bring up
many more important aspects and details. Please feel free to bring
more things up in the discussion!

There was a preparation in the musl project to put up a policy[2],
it has not yet been finalized and realized as far as I understand.

It also brings up the point, that it is not really related to
recent "AI" tech, but more to the origin of work and its handling.
Unfortunately "AI" made problems with this a lot more common.

Best regards,
  Alexander

1. https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2025-April/342146.html
2. https://www.openwall.com/lists/musl/2024/10/19/3

[-- Attachment #2: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
@ 2025-07-01 11:20 ` Gyan Doshi
  2025-07-03 23:42   ` Alexander Strasser via ffmpeg-devel
  2025-07-01 12:44 ` Kacper Michajlow
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Gyan Doshi @ 2025-07-01 11:20 UTC (permalink / raw)
  To: ffmpeg-devel



On 2025-07-01 04:28 pm, Alexander Strasser via ffmpeg-devel wrote:

 > Thus I want this thread to start a discussion, that eventually leads 
 > to a policy about submitting and integrating "AI" generated content.
In practice, unless a patch(set) is explicitly marked or has telltale
signs of being AI-generated, the project can't stop such AI code getting in.
At best, we can require disclosure and for the human submitter to assume
responsibility.

Regards,
Gyan

* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
  2025-07-01 11:20 ` Gyan Doshi
@ 2025-07-01 12:44 ` Kacper Michajlow
  2025-07-03 23:31   ` Alexander Strasser via ffmpeg-devel
  2025-07-04 18:11   ` softworkz .
  2025-07-03  0:16 ` Gerion Entrup
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 14+ messages in thread
From: Kacper Michajlow @ 2025-07-01 12:44 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Alexander Strasser

On Tue, 1 Jul 2025 at 12:58, Alexander Strasser via ffmpeg-devel
<ffmpeg-devel@ffmpeg.org> wrote:
>
>
>
>
> ---------- Forwarded message ----------
> From: Alexander Strasser <eclipse7@gmx.net>
> To: ffmpeg-devel@ffmpeg.org
> Cc:
> Bcc:
> Date: Tue, 1 Jul 2025 12:58:23 +0200
> Subject: [RFC] Introducing policies regarding "AI" contributions
> Hi all,
>
> I do not like the branding of the LLMs as AI, thus I will for now
> continue to call it "AI" in quotes. I'm open for better terms.
>
> It was just yesterday brought up on IRC in #ffmpeg-devel that there
> was at least one, marked attempt to include "AI" generated code[1].
>
> At least I would say that this particular patch series was rejected,
> but there was no explicit discussion or clear statement about
> "AI" generated content; especially code.
>
> Thus I want this thread to start a discussion, that eventually leads
> to a policy about submitting and integrating "AI" generated content.

I don't think labeling code as "AI" matters that much. Let's ignore
licensing/legal issues for now.

What's important is the code itself and its quality. It doesn't matter
how it was created. Whether by a human, "AI" or something else. The
key is the final product. "AI" is just a tool, and like any tool, it
can be used well or poorly. How you use it may be completely different
between "operators".

I think the "AI" label exists because the code that LLMs produce is
often incomplete, low quality, and a pile of spaghetti that somehow
works for a single use case, but is far from being a sane, production
ready implementation. Anyone who has used these tools knows their
limitations and what they can or cannot do.

That said, if "AI" code means low quality code, then by all means, it
should be rejected. This applies to human, alien, or "AI" generated
code. There shouldn't be a different metric for "AI" code. If "AI"
(and its "operator") produces high quality code, there's no reason to
reject it.

After all, how can you even detect "AI" code? If the code, regardless
of who or what wrote it, follows project guidelines and is overall
high quality, that's all that matters.

P.S. I don't like those "This code was fully made by an LLM"
statements and the like. Who cares? Maybe some investor who's pushing
this. But from a technical point of view, there's no difference. After
all, you don't start your patchset by saying, "This code was written
in Vim with <list of plugins> on Arch Linux, on an ergonomic split
keyboard, with an XYZ monitor.".

- Kacper

> Leaving all ethical issues aside for a moment I still see 2 very big
> problems with AI generated code:
>
> * looks generally plausible but is often subtly wrong
>     * leading to more work, regressions and costs
>         * which often lands on a different group of people (other
>           projects, reviewers, bug finders, bug fixers, etc.)
>         * which are sometimes delayed for quite some time increasing
>           the costs of fixing them
> * license/copyright violations
>     * this might be sometimes a non-issue with small changes
>     * but especially for complete components the risk seems high
>
> There is a lot more to the topic and I probably forgot to bring up
> many more important aspects and details. Please feel free to bring
> more things up in the discussion!
>
> There was a preparation in the musl project to put up a policy[2],
> it has not yet been finalized and realized as far as I understand.
>
> It also brings up the point, that it is not really related to
> recent "AI" tech, but more to the origin of work and its handling.
> Unfortunately "AI" made problems with this a lot more common.
>
>
> Best regards,
>   Alexander
>
> 1. https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2025-April/342146.html
> 2. https://www.openwall.com/lists/musl/2024/10/19/3
>

* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
  2025-07-01 11:20 ` Gyan Doshi
  2025-07-01 12:44 ` Kacper Michajlow
@ 2025-07-03  0:16 ` Gerion Entrup
  2025-07-03 23:14   ` Alexander Strasser via ffmpeg-devel
  2025-07-03 23:44 ` Leo Izen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Gerion Entrup @ 2025-07-03  0:16 UTC (permalink / raw)
  To: ffmpeg-devel


[-- Attachment #1.1: Type: text/plain, Size: 2347 bytes --]

On Tuesday, 1 July 2025, 12:58:23 Central European Summer Time, Alexander Strasser via ffmpeg-devel wrote:
> Hi all,
> 
> I do not like the branding of the LLMs as AI, thus I will for now
> continue to call it "AI" in quotes. I'm open for better terms.
> 
> It was just yesterday brought up on IRC in #ffmpeg-devel that there
> was at least one, marked attempt to include "AI" generated code[1].
> 
> At least I would say that this particular patch series was rejected,
> but there was no explicit discussion or clear statement about
> "AI" generated content; especially code.
> 
> Thus I want this thread to start a discussion, that eventually leads
> to a policy about submitting and integrating "AI" generated content.
> 
> Leaving all ethical issues aside for a moment I still see 2 very big
> problems with AI generated code:
> 
> * looks generally plausible but is often subtly wrong
>     * leading to more work, regressions and costs
>         * which often lands on a different group of people (other
>           projects, reviewers, bug finders, bug fixers, etc.)
>         * which are sometimes delayed for quite some time increasing
>           the costs of fixing them
> * license/copyright violations
>     * this might be sometimes a non-issue with small changes
>     * but especially for complete components the risk seems high
> 
> There is a lot more to the topic and I probably forgot to bring up
> many more important aspects and details. Please feel free to bring
> more things up in the discussion!
> 
> There was a preparation in the musl project to put up a policy[2],
> it has not yet been finalized and realized as far as I understand.

Just to link it here: this reminds me of the Gentoo Linux discussion:
https://archives.gentoo.org/gentoo-dev/9007c921a8a57655ecb2027eb4be4bff02673af4.camel@zougloub.eu/T/#t
https://wiki.gentoo.org/wiki/Project:Council/AI_policy


Best,
Gerion


> 
> It also brings up the point, that it is not really related to
> recent "AI" tech, but more to the origin of work and its handling.
> Unfortunately "AI" made problems with this a lot more common.
> 
> 
> Best regards,
>   Alexander
> 
> 1. https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2025-April/342146.html
> 2. https://www.openwall.com/lists/musl/2024/10/19/3
> 


[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 659 bytes --]


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-03  0:16 ` Gerion Entrup
@ 2025-07-03 23:14   ` Alexander Strasser via ffmpeg-devel
  2025-07-04  7:10     ` Nicolas George
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Strasser via ffmpeg-devel @ 2025-07-03 23:14 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Alexander Strasser

[-- Attachment #1: Type: message/rfc822, Size: 9303 bytes --]

From: Alexander Strasser <eclipse7@gmx.net>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
Date: Fri, 4 Jul 2025 01:14:11 +0200
Message-ID: <aGcOw7VGSWuE3OVp@metallschleim.local>

On 2025-07-03 02:16 +0200, Gerion Entrup wrote:
> On Tuesday, 1 July 2025, 12:58:23 Central European Summer Time, Alexander Strasser via ffmpeg-devel wrote:
[...]
> > Thus I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
> > 
> > Leaving all ethical issues aside for a moment I still see 2 very big
> > problems with AI generated code:
> > 
> > * looks generally plausible but is often subtly wrong
> >     * leading to more work, regressions and costs
> >         * which often lands on a different group of people (other
> >           projects, reviewers, bug finders, bug fixers, etc.)
> >         * which are sometimes delayed for quite some time increasing
> >           the costs of fixing them
> > * license/copyright violations
> >     * this might be sometimes a non-issue with small changes
> >     * but especially for complete components the risk seems high
> > 
> > There is a lot more to the topic and I probably forgot to bring up
> > many more important aspects and details. Please feel free to bring
> > more things up in the discussion!
> > 
> > There was a preparation in the musl project to put up a policy[2],
> > it has not yet been finalized and realized as far as I understand.
> 
> Just to link it here: this reminds me of the Gentoo Linux discussion:
> https://archives.gentoo.org/gentoo-dev/9007c921a8a57655ecb2027eb4be4bff02673af4.camel@zougloub.eu/T/#t
> https://wiki.gentoo.org/wiki/Project:Council/AI_policy

Thanks for the links to the Gentoo discussion and policy!

IMHO the discussion and the resulting policy is interesting and maybe
something similar would be appropriate for FFmpeg.

I also became aware of LLVM policy:

  https://llvm.org/docs/DeveloperPolicy.html#ai-generated-contributions

But I must say I do not like it as much. To cite the most critical part:

    As such, the LLVM policy is that contributors are permitted to use
    artificial intelligence tools to produce contributions, provided that
    they have the right to license that code under the project license.
    Contributions found to violate this policy will be removed just like
    any other offending contribution.

For "AI" (in the LLM sense) I think it's usually not at all easy to
say if one has the right to license the code given it's trained on
a huge corpus of copyrighted and particularly licensed code.

Anyway, they agree on the license/copyright concern I raised, as does Gentoo.

And the LLVM policy also comes to a similar conclusion, as does Gentoo,
regarding waste of project resources:

    We encourage contributors to review all generated code before sending
    it for review to verify its correctness and to understand it so that
    they can answer questions during code review. Reviewing and maintaining
    generated code that the original contributor does not understand is not
    a good use of limited project resources.


If anyone has more examples at hand, it would probably be interesting to
know and take a look.

Best regards,
  Alexander


> > It also brings up the point, that it is not really related to
> > recent "AI" tech, but more to the origin of work and its handling.
> > Unfortunately "AI" made problems with this a lot more common.
> > 
> > 
> > Best regards,
> >   Alexander
> > 
> > 1. https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2025-April/342146.html
> > 2. https://www.openwall.com/lists/musl/2024/10/19/3


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 12:44 ` Kacper Michajlow
@ 2025-07-03 23:31   ` Alexander Strasser via ffmpeg-devel
  2025-07-04 16:43     ` compn
  2025-07-04 18:11   ` softworkz .
  1 sibling, 1 reply; 14+ messages in thread
From: Alexander Strasser via ffmpeg-devel @ 2025-07-03 23:31 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Alexander Strasser

[-- Attachment #1: Type: message/rfc822, Size: 9115 bytes --]

From: Alexander Strasser <eclipse7@gmx.net>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
Date: Fri, 4 Jul 2025 01:31:19 +0200
Message-ID: <aGcSx5IzK7aj6UOd@metallschleim.local>

On 2025-07-01 14:44 +0200, Kacper Michajlow wrote:
> On Tue, 1 Jul 2025 at 12:58, Alexander Strasser via ffmpeg-devel
[...]
> >
> > I do not like the branding of the LLMs as AI, thus I will for now
> > continue to call it "AI" in quotes. I'm open for better terms.
> >
> > It was just yesterday brought up on IRC in #ffmpeg-devel that there
> > was at least one, marked attempt to include "AI" generated code[1].
> >
> > At least I would say that this particular patch series was rejected,
> > but there was no explicit discussion or clear statement about
> > "AI" generated content; especially code.
> >
> > Thus I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
> 
> I don't think labeling code as "AI" matters that much. Let's ignore
> licensing/legal issues for now.

OK, but I really don't think we can ignore the legal consequences
for FFmpeg, as it is Open Source software, and we would put all
users of FFmpeg, individuals and companies, at risk.


> What's important is the code itself and its quality. It doesn't matter
> how it was created. Whether by a human, "AI" or something else. The
> key is the final product. "AI" is just a tool, and like any tool, it
> can be used well or poorly. How you use it may be completely different
> between "operators".
> 
> I think the "AI" label exists because the code that LLMs produce is
> often incomplete, low quality, and a pile of spaghetti that somehow
> works for a single use case, but is far from being a sane, production
> ready implementation. Anyone who has used these tools knows their
> limitations and what they can or cannot do.
> 
> That said, if "AI" code means low quality code, then by all means, it
> should be rejected. This applies to human, alien, or "AI" generated
> code. There shouldn't be a different metric for "AI" code. If "AI"
> (and its "operator") produces high quality code, there's no reason to
> reject it.
> 
> After all, how can you even detect "AI" code? If the code, regardless
> of who or what wrote it, follows project guidelines and is overall
> high quality, that's all that matters.

I kind of agree that good code is good code, but it's not enough.
It is also important to have people around who truly understand the
good code.

To find out if it is truly good code someone needs to review it very
deeply, which is extra hard if it is "AI" generated code as it tends
to look very plausible; which could waste a lot of time for the people
looking at it and reviewing it. This also diminishes the actual value
of the use of "AI" in the first place.

Taking that for granted there is the open question for submissions
by maintainers (with git push access), who could submit "AI" generated
code and push it themselves after a considerable push warning.


> P.S. I don't like those "This code was fully made by an LLM"
> statements and the like. Who cares? Maybe some investor who's pushing
> this. But from a technical point of view, there's no difference. After
> all, you don't start your patchset by saying, "This code was written
> in Vim with <list of plugins> on Arch Linux, on an ergonomic split
> keyboard, with an XYZ monitor.".

[...]

Thanks for your feedback!

Greetings,
  Alexander


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 11:20 ` Gyan Doshi
@ 2025-07-03 23:42   ` Alexander Strasser via ffmpeg-devel
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Strasser via ffmpeg-devel @ 2025-07-03 23:42 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Alexander Strasser

[-- Attachment #1: Type: message/rfc822, Size: 6961 bytes --]

From: Alexander Strasser <eclipse7@gmx.net>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
Date: Fri, 4 Jul 2025 01:42:10 +0200
Message-ID: <aGcVUqc_8_sbAhs4@metallschleim.local>

On 2025-07-01 16:50 +0530, Gyan Doshi wrote:
> 
> On 2025-07-01 04:28 pm, Alexander Strasser via ffmpeg-devel wrote:
> 
> > Thus I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
>
> In practice, unless a patch(set) is explicitly marked or has telltale signs
> of being AI-generated, the project can't stop such AI code getting in.
> At best, we can require disclosure and for the human submitter to assume
> responsibility.

That's true. It's impossible to completely enforce adherence to a
policy that bans "AI" generated code.

I guess it would still be worthwhile to just do what you said.

From what I have looked at in the other projects so far (musl,
gentoo, llvm), they acknowledge too that they cannot enforce it.

In a way it's nothing new and actually since forever we would
not want to accept contributions of dubious or license-incompatible
origins.

Just the current times seem to warrant spelling this out, I fear.
So maybe just generically writing about it and explicitly mentioning
"AI" would be the better way to achieve the goal.

Thanks for commenting!

Best regards,
  Alexander


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
                   ` (2 preceding siblings ...)
  2025-07-03  0:16 ` Gerion Entrup
@ 2025-07-03 23:44 ` Leo Izen
  2025-07-04 10:15 ` Michael Niedermayer
  2025-07-05 11:20 ` Rémi Denis-Courmont
  5 siblings, 0 replies; 14+ messages in thread
From: Leo Izen @ 2025-07-03 23:44 UTC (permalink / raw)
  To: ffmpeg-devel

On 7/1/25 06:58, Alexander Strasser via ffmpeg-devel wrote:
> _______________________________________________

While I agree with you on the merits that LLM-generated code tends to be 
low quality, ideally that will be caught during code review. I think a 
blanket ban on it makes more sense because of the legal implications of 
including LLM-generated code in our codebase.

I am not a lawyer, so I cannot say for certain how the legality plays
out, and it may be safer to just not permit it than to try to hire a lawyer
to figure out how to permit it, if it's even possible.

- Leo Izen


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-03 23:14   ` Alexander Strasser via ffmpeg-devel
@ 2025-07-04  7:10     ` Nicolas George
  0 siblings, 0 replies; 14+ messages in thread
From: Nicolas George @ 2025-07-04  7:10 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

Alexander Strasser via ffmpeg-devel (HE12025-07-04):
> For "AI" (in the LLM sense) I think it's usually not at all easy to
> say if one has the right to license the code given it's trained on
> a huge corpus of copyrighted and particularly licensed code.

It is only an issue if the code is taken and submitted as is. But we can
handle this issue because the code will be shit. We just need to be able
to be firm against people who submit shitty code.

On the other hand, if they use an LLM to prototype the use of an API they
rarely use and whose documentation sucks (Android, I am looking at you)
and, once it works, they rewrite the code properly, then there is no
copyright liability.

Regards,

-- 
  Nicolas George

* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
                   ` (3 preceding siblings ...)
  2025-07-03 23:44 ` Leo Izen
@ 2025-07-04 10:15 ` Michael Niedermayer
  2025-07-05 11:20 ` Rémi Denis-Courmont
  5 siblings, 0 replies; 14+ messages in thread
From: Michael Niedermayer @ 2025-07-04 10:15 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


[-- Attachment #1.1: Type: text/plain, Size: 1782 bytes --]

Hi

The use of tools to assist developers is growing and will
continue to grow. It's not going away.
And what one can and cannot do with these tools will evolve.

I don't think I understand the thought process behind this policy.
Licenses need to be complied with, code needs to be of good quality.

If a tool can help with that, you use it; if not, you don't.
If you don't know, you try.

I think we should ensure that (static copies of) our bug tracker,
mailing list archive and git repository are accessible to these tools.
Otherwise we will see more
* DDoS by the tools trying to bypass restrictions
    [you know, the developer saying "search this",
     the tool replying with "it's behind Anubis",
     the developer asking "what can I do?"
     the tool producing a script that DDoSes us because
     "bypassing restrictions" directs the LLM to malicious
     scripts, that's just the context in which "bypassing restrictions" occurs]

* code generators incorporating code from outside that is incompatibly licensed
    If a code generator can see all our code it should preferably use it;
    if it can see none of it, it will use forums and random bits of code
    off the internet, not all of which is LGPL compatible.
    So again, IMHO if you are afraid of license issues the very first thing to
    do is ensure our code and coding docs are accessible to the tools
    generating code

* generally tools being less useful as they have less ffmpeg specific
  context

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

What's the most stupid thing your enemy could do? Blow himself up.
What's the most stupid thing you could do? Give up your rights and
freedom because your enemy blew himself up.


[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-03 23:31   ` Alexander Strasser via ffmpeg-devel
@ 2025-07-04 16:43     ` compn
  0 siblings, 0 replies; 14+ messages in thread
From: compn @ 2025-07-04 16:43 UTC (permalink / raw)
  To: ffmpeg-devel

On Fri, 4 Jul 2025 01:31:19 +0200, Alexander Strasser via ffmpeg-devel
wrote:

> To find out if it is truly good code someone needs to review it very
> deeply, which is extra hard if it is "AI" generated code as it tends
> to look very plausible; which could waste a lot of time for the people
> looking at it and reviewing it. This also diminishes the actual value
> of the use of "AI" in the first place.

eh, if the code works, it works. if it can create a decoder for a
format that we do not have a decoder for, more power to it. its going
to have bugs (at least until code review models show up) the same as
human code.

i'm guessing it will be better than disassembled code (as we've seen in
the past win32 decoder code submitted and committed...)

for me:

morally: 
not opposed.

copyright:
not opposed, if it was trained on gpl code.

code review/quality:
i agree with alex. if it doesnt conform to ffmpeg code standards, its
better if someone conforms it, or says its non conforming to avoid
lengthy code reviews.

-compn

* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 12:44 ` Kacper Michajlow
  2025-07-03 23:31   ` Alexander Strasser via ffmpeg-devel
@ 2025-07-04 18:11   ` softworkz .
  1 sibling, 0 replies; 14+ messages in thread
From: softworkz . @ 2025-07-04 18:11 UTC (permalink / raw)
  To: FFmpeg development discussions and patches; +Cc: Alexander Strasser

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Kacper Michajlow
> Sent: Tuesday, 1 July 2025 14:44
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Cc: Alexander Strasser <eclipse7@gmx.net>
> Subject: Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI"
> contributions
> 
> On Tue, 1 Jul 2025 at 12:58, Alexander Strasser via ffmpeg-devel
> <ffmpeg-devel@ffmpeg.org> wrote:
> >
> >
> >
> >
> > ---------- Forwarded message ----------
> > From: Alexander Strasser <eclipse7@gmx.net>
> > To: ffmpeg-devel@ffmpeg.org
> > Cc:
> > Bcc:
> > Date: Tue, 1 Jul 2025 12:58:23 +0200
> > Subject: [RFC] Introducing policies regarding "AI" contributions
> > Hi all,
> >
> > I do not like the branding of the LLMs as AI, thus I will for now
> > continue to call it "AI" in quotes. I'm open for better terms.
> >
> > It was just yesterday brought up on IRC in #ffmpeg-devel that there
> > was at least one, marked attempt to include "AI" generated code[1].
> >
> > At least I would say that this particular patch series was rejected,
> > but there was no explicit discussion and clear statement about
> > "AI" generated content; especially code.
> >
> > Thus I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
> 
> I don't think labeling code as "AI" matters that much. Let's ignore
> licensing/legal issues for now.
> 
> What's important is the code itself and its quality. It doesn't matter
> how it was created. Whether by a human, "AI" or something else. The
> key is the final product. "AI" is just a tool, and like any tool, it
> can be used well or poorly. How you use it may be completely different
> between "operators".
> 
> I think the "AI" label exists because the code that LLMs produce is
> often incomplete, low quality, and a pile of spaghetti that somehow
> works for a single use case, but is far from being a sane, production
> ready implementation. Anyone who has used these tools knows their
> limitations and what they can or cannot do.
> 
> That said, if "AI" code means low quality code, then by all means, it
> should be rejected. This applies to human, alien, or "AI" generated
> code. There shouldn't be a different metric for "AI" code. If "AI"
> (and its "operator") produces high quality code, there's no reason to
> reject it.

I see it in a similar way. Things are changing so incredibly fast that
there's little sense in establishing a policy based on the assumption
that generated code is of low quality, because that assumption might be
outdated even before the policy has been agreed upon.
It's tough to judge anyway, because these models do not generate the
kind of "low quality" code that less experienced human developers may
write. It's rather an insane mix of good quality code with serious
flaws, oversights and shortcomings. The mix strongly depends
on the topic and also the language. Yesterday I had it generate a
bunch of Bash scripts with an options interface for managing certain
cloud resources. I had to supervise closely, but it tested and fixed
its mistakes itself (after I pointed them out) and generated useful
documentation - wow! It would have taken me several times as long.
But in many other cases where I tried, I ended up spending more time
than if I had done it alone right away. For specific FFmpeg work I
haven't found it useful for anything so far except reviewing changes.
This will probably change at some point, though.

Even though the script creation was impressive, it doesn't mean that
you can close your eyes and be good. You still need to see and review
and evaluate every single line that is generated.

Which brings me to Gyan's comment:

> At best, we can require disclosure and for the human submitter to
> assume
> responsibility.

IMO this is THE one point that would make a reasonable policy which
is valid independently from any "AI" progress now and in the future:

When someone submits code, we can require not only that the submitter
formally takes responsibility, but also that they closely understand
every single line of code that is being submitted.
If that isn't the case (for example, when the submitter states "I
don't know what it does, but it's working"), then that should be
a clear reason for rejection.

Best regards,

softworkz

PS: For the ML (or any future communication method), I think
a simple policy should be in place: "AI" generated
messages must be marked as such.


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
                   ` (4 preceding siblings ...)
  2025-07-04 10:15 ` Michael Niedermayer
@ 2025-07-05 11:20 ` Rémi Denis-Courmont
  2025-07-05 12:22   ` Kacper Michajlow
  5 siblings, 1 reply; 14+ messages in thread
From: Rémi Denis-Courmont @ 2025-07-05 11:20 UTC (permalink / raw)
  To: ffmpeg-devel

On Tuesday, 1 July 2025, 13:58:23 Eastern European Summer Time, Alexander
Strasser via ffmpeg-devel wrote:
> (...) I want this thread to start a discussion, that eventually leads
> to a policy about submitting and integrating "AI" generated content.

Well, you can define a policy and/or make a public statement on FFmpeg.org, but 
as others said, just like we can't prevent someone misattributing their 
contributions and violating copyrights, we can't credibly prevent (mis)use of 
LLMs to generate code.

There is also a problem of definition. While I don't personally use computer 
assistance, I think it's fine to use language servers to automatically generate 
or suggest boilerplate, possible contextual completions, etc. While this sort
of technology predates LLMs and is clearly distinct from them **at the moment**,
it's going to be hard to define "AI" and where to draw a line.

Ultimately, I think you need to define the problem(s) as far as FFmpeg-devel is 
concerned. Potential copyright violations are not new, and I think the current 
policies and license terms are adequate, regardless of AI.

Low quality patches are also not really a new problem, and they can be 
rejected with the current processes too.

*Maybe* LLM usage will (willingly or unwittingly) lead to denial-of-service
attacks on the review capacity and motivation of the FFmpeg-devel, TC and GA
membership, but that remains highly speculative, and I think we don't need to
solve that what-if problem yet. And again, this attack does not necessarily
need an LLM to be carried out.

-- 
ヅニ-クーモン・レミ
Tapio's place new town, former Finnish Republic of Uusimaa


* Re: [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
  2025-07-05 11:20 ` Rémi Denis-Courmont
@ 2025-07-05 12:22   ` Kacper Michajlow
  0 siblings, 0 replies; 14+ messages in thread
From: Kacper Michajlow @ 2025-07-05 12:22 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Sat, 5 Jul 2025 at 13:21, Rémi Denis-Courmont <remi@remlab.net> wrote:
>
> On Tuesday, 1 July 2025, 13:58:23 Eastern European Summer Time, Alexander
> Strasser via ffmpeg-devel wrote:
> > (...) I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
>
> Well, you can define a policy and/or make a public statement on FFmpeg.org, but
> as others said, just like we can't prevent someone misattributing their
> contributions and violating copyrights, we can't credibly prevent (mis)use of
> LLMs to generate code.
>
> There is also a problem of definition. While I don't personally use computer
> assistance, I think it's fine to use language servers to automatically generate
> or suggest boilerplate, possible contextual completions, etc. While this sort
> of technology predates LLMs and is clearly distinct from them **at the moment**,
> it's going to be hard to define "AI" and where to draw a line.
>
> Ultimately, I think you need to define the problem(s) as far as FFmpeg-devel is
> concerned. Potential copyright violations are not new, and I think the current
> policies and license terms are adequate, regardless of AI.
>
> Low quality patches are also not really a new problem, and they can be
> rejected with the current processes too.
>
> *Maybe* LLM usage will (willingly or unwittingly) lead to denial-of-service
> attacks on the review capacity and motivation of the FFmpeg-devel, TC and GA
> membership, but that remains highly speculative, and I think we don't need to
> solve that what-if problem yet. And again, this attack does not necessarily
> need an LLM to be carried out.

Fully agreed. I think the bottom line is that there is a human sending
those patches, and while "AI" tools may have been used, it's
ultimately on the human to ensure quality and send patches in a
reasonable state.

I think we can all agree that skill/experience between people varies.
I trust that experienced developers will produce patches that have
reasonable quality, regardless of tools used. The main issue I
see is that those "AI" tools enable so-called Vibe Coders to produce
something they do not understand and think it's ok to share that. This
is not acceptable.

Of course there is also the possibility of completely automated
patches being sent by bots. This should be considered spam and rejected. I know
there is research towards making this work, but currently it backfires
immensely. GitHub offers a service like that to generate pull requests
(patches) from an issue description in a fully automated process. This
basically converts maintainers/reviewers, against their will, into Vibe
Coders, who are prompting an LLM in the review comments. You can imagine
this doesn't end well... But again I don't think this is specific to
AI tools. If an inexperienced developer produces a patch and doesn't
understand review comments and is not responsive or not able to
correct their changes, it's the same deal. The barrier to entry is
different when using an LLM, but review of code really is the same.

- Kacper

end of thread, other threads:[~2025-07-05 12:23 UTC | newest]

Thread overview: 14+ messages
2025-07-01 10:58 [FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions Alexander Strasser via ffmpeg-devel
2025-07-01 11:20 ` Gyan Doshi
2025-07-03 23:42   ` Alexander Strasser via ffmpeg-devel
2025-07-01 12:44 ` Kacper Michajlow
2025-07-03 23:31   ` Alexander Strasser via ffmpeg-devel
2025-07-04 16:43     ` compn
2025-07-04 18:11   ` softworkz .
2025-07-03  0:16 ` Gerion Entrup
2025-07-03 23:14   ` Alexander Strasser via ffmpeg-devel
2025-07-04  7:10     ` Nicolas George
2025-07-03 23:44 ` Leo Izen
2025-07-04 10:15 ` Michael Niedermayer
2025-07-05 11:20 ` Rémi Denis-Courmont
2025-07-05 12:22   ` Kacper Michajlow

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git