From: Thomas Dullien via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Michael Niedermayer <michael@niedermayer.cc>,
Christophe Gisquet <christophe.gisquet@gmail.com>,
Thomas Dullien <thomas.dullien@googlemail.com>
Subject: [FFmpeg-devel] Re: question about submitting security patches
Date: Wed, 12 Nov 2025 11:26:36 +0100
Message-ID: <CA+GtNgkYs0g3wuaK=uWJ0pdQ3pYoGOUkA0=Vp7K45M5gkEM=Sg@mail.gmail.com>
In-Reply-To: <CAPYFPM3H_tXxzbcQMc+=iiFC+yjeK+hqKsBx6pvc69t3pdkx8g@mail.gmail.com>
Hey all,
A quick note: as a person outside of the FFmpeg project who just happened
to contribute a patch, here is my understanding of the legal situation:
1) Strictly speaking, "nobody knows" what the legalities of LLMs are going
to be. The big LLM providers are trying hard to establish precedent(s) so
that when the actual laws are adapted, they will reflect current practice;
therefore the LLM providers try very hard to establish "as practice" what
is beneficial to themselves.
2) It is very instructive to look at the process that ended up with
software falling under copyright law. This is much more recent than people
think: The CONTU commission ran from 1974 to 1978, and it wasn't until 1980
that the law that put software firmly under the copyright regime we know
today was passed. If you love copyright law, you can find their meeting
notes online.
3) If you take a strict interpretation of the current copyright law, LLM
weights cannot be copyrighted (they are derived by applying a formula to
data, not by a creative act); this hasn't stopped the LLM companies from
attaching license terms to their releases, as if copyright applied. The
goal here is to establish precedent so that in the future LLM weights will
be deemed copyrightable.
4) There are valid arguments that - if LLM weights are copyrightable - they
might be derivative works of the training data, and with that, the output
would be tainted (similar to a song that consists only of sampled music:
There is some input by the composer, but it remixes lots of other
copyrighted material). There are practical issues with this, but more
importantly, given the importance of the AI boom for US GDP currently,
there are strong economic incentives for this interpretation to not gain
traction.
5) So the current position that the LLM providers take is "our weights are
copyrightable (even when current law says they aren't), but all your data
we trained on is present in such minuscule dilution that there's no taint"
(even when current law provides arguments that there should be). Clearly
this is primarily serving their own interests, with the goal to establish
law in their favour.
Given that the future legal regime is entirely unclear, it is a valid
decision for each person (or group of persons) that maintains code to
either (a) take the side that the most likely outcome is that
LLM-generated output is taint-free, or (b) take the side that the most
likely outcome is that LLM-generated output is tainted. This is less a
statement about today's laws, and more a statement about "which societal
forces will be stronger in shaping the consensus".
I'm completely neutral on what FFmpeg (as a project) decides - for the
moment, the patch is human-authored anyhow, so it doesn't matter much for
*this patch*.
That said, it would be helpful to know if commit messages can be authored
by AI if clearly labeled. If societal consensus falls on the side of AI
output being tainted, commit messages *can* be removed automatically,
albeit at the cost of changing the hashes in the git commit history.
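
To illustrate why rewriting a commit message changes hashes, here is a
minimal Python sketch of how git derives a commit ID: it is the SHA-1 of
the commit object, which contains the message and the parent's ID. The
identity string, timestamp, and messages below are made-up placeholders,
and the helper name commit_id is hypothetical; only the object layout and
the empty-tree ID are git's own.

```python
import hashlib

# Git's well-known ID for the empty tree (no files).
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

# Placeholder identity and timestamp, purely illustrative.
IDENT = "Example Dev <dev@example.invalid> 1762943196 +0100"


def commit_id(tree, message, parent=None):
    """Return the SHA-1 git would assign to a commit with these fields."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [f"author {IDENT}", f"committer {IDENT}", "", message]
    body = ("\n".join(lines) + "\n").encode("utf-8")
    # A git object is hashed as "<type> <size>\0<content>".
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()


# Same tree, same author, only the message differs.
labeled = commit_id(EMPTY_TREE, "Fix overflow\n\n[message written by AI]")
stripped = commit_id(EMPTY_TREE, "Fix overflow")

# A later commit embeds its parent's ID, so its own ID changes as well.
child_of_labeled = commit_id(EMPTY_TREE, "Follow-up", parent=labeled)
child_of_stripped = commit_id(EMPTY_TREE, "Follow-up", parent=stripped)

print(labeled != stripped)                    # the rewritten commit gets a new ID
print(child_of_labeled != child_of_stripped)  # and so does every descendant
```

Because each commit records its parent's ID, stripping one message
re-hashes that commit and everything built on top of it, which is the cost
referred to above.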
Cheers,
Thomas
On Wed, 12 Nov 2025 at 09:24, Christophe Gisquet via
ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote:
> Hello,
>
> On Tue, 11 Nov 2025 at 04:01, Michael Niedermayer via ffmpeg-devel
> <ffmpeg-devel@ffmpeg.org> wrote:
> > If you have concrete legal analysis or case law that supports this
> > claim, please share it.
>
> I can name at least one Fortune 500 company, that maybe won't
> disclose publicly these facts, that did equivalent analysis and has
> basically forbidden use of "AI"-generated code for distributed
> software.
> By way of consequence, if that matters to you, maybe these companies
> would be very concerned that the ffmpeg project included such code.
>
> Second, Gyan's Linux Foundation link is extremely telling:
> 1) You need to be able to identify whether the LLM output comes from
> copyrighted code, i.e., what it was trained on.
> 2) You need to report the portions affected, along with their license.
> It's not making it forbidden, just impossible to abide by.
>
> --
> Christophe
> _______________________________________________
> ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org
> To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org
>