From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTP id 8AB1A4556B
	for <ffmpegdev@gitmailbox.com>; Sun, 23 Jun 2024 17:46:30 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id AA17D68D5D2;
	Sun, 23 Jun 2024 20:46:26 +0300 (EEST)
Received: from relay8-d.mail.gandi.net (relay8-d.mail.gandi.net
 [217.70.183.201])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 123A368CB9F
 for <ffmpeg-devel@ffmpeg.org>; Sun, 23 Jun 2024 20:46:21 +0300 (EEST)
Received: by mail.gandi.net (Postfix) with ESMTPSA id 5D51C1BF204
 for <ffmpeg-devel@ffmpeg.org>; Sun, 23 Jun 2024 17:46:19 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=niedermayer.cc;
 s=gm1; t=1719164780;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:mime-version:mime-version:content-type:content-type:
 in-reply-to:in-reply-to:references:references;
 bh=BdEfrigyblSmmunKkgbsgEDzGf/9ZF8pT3EMyhiDRXg=;
 b=IAU9eWlbIUOxbttXp+PS5uE34AKf2r9ZCqRNxETZD1O3ggp8K7tFCoXtixYSFok/brsA3V
 uMf6JK2gmonQ9k4PUZgCUM1v79kGIirGo6w3ZG89G5HouFB6ph47YXYsdise9Sd9Mhe2bK
 flY1S2gvBaT+yADxnY9MNti7tq8lyPHgdmPw5FPGaFTecPihZlO07kaHeKO0eDC2X+m6Qp
 zQYaDXaikcEWgvkYosk8yUvkU5REe3GN7GVz+Ime83x2iNps6hjjthCE1/QA5JmwAJYpgh
 MtOG1H5R5oz2VbUKc1jsRmy1O2NLpXyxEUiTUek+NT8VHs8c5Td8oUWSuRHn1A==
Date: Sun, 23 Jun 2024 19:46:19 +0200
From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Message-ID: <20240623174619.GA4991@pb2>
References: <20240622151334.GD14140@haasn.xyz>
 <CABLWnS_u1_agZG=o7Lft9PCZ3=w3K=agzQJbkGarBWtTpa6cMg@mail.gmail.com>
MIME-Version: 1.0
In-Reply-To: <CABLWnS_u1_agZG=o7Lft9PCZ3=w3K=agzQJbkGarBWtTpa6cMg@mail.gmail.com>
X-GND-Sasl: michael@niedermayer.cc
Subject: Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Content-Type: multipart/mixed; boundary="===============0810577217065947050=="
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/20240623174619.GA4991@pb2/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>


--===============0810577217065947050==
Content-Type: multipart/signed; micalg=pgp-sha512;
	protocol="application/pgp-signature"; boundary="SLk8PSyqxe/ugcCT"
Content-Disposition: inline


--SLk8PSyqxe/ugcCT
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Jun 23, 2024 at 12:19:13AM +0200, Vittorio Giovara wrote:
> On Sat, Jun 22, 2024 at 3:22 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> > Hey,
> >
> > As some of you know, I got contracted (by STF 2024) to work on improving
> > swscale, over the course of the next couple of months. I want to share
> > my current plans and gather feedback + measure sentiment.
> >
> > ## Problem statement
> >
> > The two issues I'd like to focus on for now are:
> >
> > 1. Lack of support for a lot of modern formats and conversions (HDR,
> >    ICtCp, IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
> > 2. Complicated context management, with cascaded contexts, threading,
> >    stateful configuration, multi-step init procedures, etc; and related
> >    bugs
> >
> > In order to make these feasible, some amount of internal re-organization
> > of duties inside swscale is prudent.
> >
> > ## Proposed approach
> >
> > The first step is to create a new API, which will (tentatively) live in
> > <libswscale/avscale.h>. This API will initially start off as a near-copy
> > of the current swscale public API, but with the major difference that I
> > want it to be state-free and only access metadata in terms of AVFrame
> > properties. So there will be no independent configuration of the input
> > chroma location etc. like there is currently, and no need to
> > re-configure or re-init the context when feeding it frames with
> > different properties. The goal is for users to be able to just feed it
> > AVFrame pairs and have it internally cache expensive pre-processing
> > steps as needed. Finally, avscale_* should ultimately also support
> > hardware frames directly, in which case it will dispatch to some
> > equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But
> > I will defer this to a future milestone)
> >
> > After this API is established, I want to start expanding the
> > functionality in the following manner:
> >
> > ### Phase 1
> >
> > For basic operation, avscale_* will just dispatch to a sequence of
> > swscale_* invocations. In the basic case, it will just directly invoke
> > swscale with minimal overhead. In more advanced cases, it might resolve
> > to a *sequence* of swscale operations, with other operations (e.g.
> > colorspace conversions a la vf_colorspace) mixed in.
> >
> > This will allow us to gain new functionality in a minimally invasive
> > way, and will let API users start porting to the new API. This will
> > also serve as a good "selling point" for the new API, allowing us to
> > hopefully break up the legacy swscale API afterwards.
> >
> > ### Phase 2
> >
> > After this is working, I want to cleanly separate swscale into two
> > distinct components:
> >
> > 1. vertical/horizontal scaling
> > 2. input/output conversions
> >
> > Right now, these operations both live inside the main SwsContext, even
> > though they are conceptually orthogonal. Input handling is done entirely
> > by the abstract callbacks lumToYV12 etc., while output conversion is
> > currently "merged" with vertical scaling (yuv2planeX etc.).
> >
> > I want to cleanly separate these components so they can live inside
> > independent contexts, and be considered as semantically distinct steps.
> > (In particular, there should ideally be no more "unscaled special
> > converters", instead this can be seen as a special case where there
> > simply is no vertical/horizontal scaling step)
> >
> > The idea is for the colorspace conversion layer to sit in between the
> > input/output converters and the horizontal/vertical scalers. This all
> > would be orchestrated by the avscale_* abstraction.
> >
> > ## Implementation details
> >
> > To avoid performance loss from separating "merged" functions into their
> > constituents, care needs to be taken such that all intermediate data, in
> > addition to all involved look-up tables, will fit comfortably inside the
> > L1 cache. The approach I propose, which is also (afaict) used by zscale,
> > is to loop over line segments, applying each operation in sequence, on a
> > small temporary buffer.
> >
> > e.g.
> >
> > void hscale_row(pixel *dst, const pixel *src, int img_width)
> > {
> >     // or some other small-ish figure, possibly a design constant
> >     // of the API so that SIMD implementations can be appropriately
> >     // unrolled
> >     const int SIZE = 256;
> >
> >     pixel tmp[SIZE];
> >     for (int i = 0; i < img_width; i += SIZE) {
> >         int pixels = min(SIZE, img_width - i);
> >
> >         { /* inside read input callback */
> >             unpack_input(tmp, src, pixels);
> >             // the amount of separation here will depend on the
> >             // performance
> >             apply_matrix3x3(tmp, yuv2rgb, pixels);
> >             apply_lut3x1d(tmp, gamma_lut, pixels);
> >             ...
> >         }
> >
> >         hscale(dst, tmp, filter, pixels);
> >
> >         src += pixels;
> >         dst += scale_factor(pixels);
> >     }
> > }
> >
> > This function can then output rows into a ring buffer for use inside the
> > vertical scaler, after which the same procedure happens (in reverse) for
> > the final output pass.
> >
> > Possibly, we also want to limit the size of a row for the horizontal
> > scaler, to allow arbitrarily large input images.
> >
> > ## Comments / feedback?
> >
> > Does the above approach seem reasonable? How do people feel about
> > introducing a new API vs. trying to hammer the existing API into the
> > shape I want it to be?
> >
> > I've attached an example of what <avscale.h> could end up looking like.
> > If there is broad agreement on this design, I will move on to an
> > implementation.
> >
>
> What do you think of the concept of kernels like
> https://github.com/lu-zero/avscale/blob/master/kernels/rgb2yuv.c
> The idea is that there is a bit of analysis on input and output format
> requested, and either a specialized kernel is used, or a chain of kernels
> is built and data is passed along.
> Among the design goals of that library, there was also readability (so
> that the flow was always under control) and the ease of writing assembly
> and/or shader for any single kernel.

I think I have not looked at Luca's work before, so I cannot comment on it
specifically, but I think what you suggest is what Niklas intends to do.
swscale has evolved over a long time from code with a very small subset of
the current features. The code is in need of being "refactored" into some
cleaner kernel / modular design.
Also, as you mention lu_zero: I talked with him very briefly, and he will
be on the next extra member vote for the GA (whoever initiates it, I'll try
to make sure Luca is not forgotten). Just saying, I have not forgotten
him, I just wanted to accumulate more people before bringing that up.
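
To make the "chain of kernels" idea concrete, here is a minimal sketch of
what such a design could look like, combined with the small temporary
buffer from the proposal. Every name in it (Kernel, kernel_fn, k_scale,
k_offset, run_chain, TILE) is hypothetical; none of it is taken from
swscale or from lu_zero's avscale, and the float samples are only there to
keep the example short.

```c
#include <stddef.h>

#define TILE 256 /* hypothetical tile size, meant to stay inside L1 */

/* Hypothetical kernel signature: transform n samples in place. */
typedef void (*kernel_fn)(float *buf, size_t n, const void *opaque);

typedef struct Kernel {
    kernel_fn   fn;     /* the operation itself */
    const void *opaque; /* per-kernel parameters (matrix, LUT, ...) */
} Kernel;

/* Two trivial stand-ins for real conversion steps. */
static void k_scale(float *buf, size_t n, const void *opaque)
{
    const float gain = *(const float *)opaque;
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

static void k_offset(float *buf, size_t n, const void *opaque)
{
    const float off = *(const float *)opaque;
    for (size_t i = 0; i < n; i++)
        buf[i] += off;
}

/* Apply the whole chain to one tile before moving on to the next one,
 * so intermediate data never leaves the small temporary buffer. */
static void run_chain(const Kernel *chain, size_t nb_kernels,
                      float *dst, const float *src, size_t width)
{
    float tmp[TILE];

    for (size_t x = 0; x < width; x += TILE) {
        size_t n = width - x < TILE ? width - x : TILE;

        for (size_t i = 0; i < n; i++)
            tmp[i] = src[x + i];
        for (size_t k = 0; k < nb_kernels; k++)
            chain[k].fn(tmp, n, chain[k].opaque);
        for (size_t i = 0; i < n; i++)
            dst[x + i] = tmp[i];
    }
}
```

A caller would fill an array of Kernel entries after looking at the input
and output formats and then call run_chain() once per row. A specialized
converter is then just a chain with a single, fused kernel.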


>
> Needless to say I support the plan of renaming the library so that it can

As the main author of libswscale, I find this quite offensive.

thx

[...]
--=20
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Everything should be made as simple as possible, but not simpler.
-- Albert Einstein

--SLk8PSyqxe/ugcCT
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iF0EABEKAB0WIQSf8hKLFH72cwut8TNhHseHBAsPqwUCZnhfYgAKCRBhHseHBAsP
q3P4AJ9wUqfTdG6RPjlo9EQ2DUB/3o7kGACfc3qFJhL6PnX9CZLqtsouNsoja8A=
=jKBe
-----END PGP SIGNATURE-----

--SLk8PSyqxe/ugcCT--

--===============0810577217065947050==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

--===============0810577217065947050==--