* [FFmpeg-devel] [RFC] swscale modernization proposal
@ 2024-06-22 13:13 Niklas Haas
2024-06-22 14:23 ` Andrew Sayers
` (4 more replies)
0 siblings, 5 replies; 36+ messages in thread
From: Niklas Haas @ 2024-06-22 13:13 UTC (permalink / raw)
To: ffmpeg-devel
[-- Attachment #1: Type: text/plain, Size: 5067 bytes --]
Hey,
As some of you know, I got contracted (by STF 2024) to work on improving
swscale, over the course of the next couple of months. I want to share my
current plans and gather feedback + measure sentiment.
## Problem statement
The two issues I'd like to focus on for now are:
1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
2. Complicated context management, with cascaded contexts, threading, stateful
configuration, multi-step init procedures, etc; and related bugs
In order to make these feasible, some amount of internal re-organization of
duties inside swscale is prudent.
## Proposed approach
The first step is to create a new API, which will (tentatively) live in
<libswscale/avscale.h>. This API will initially start off as a near-copy of the
current swscale public API, but with the major difference that I want it to be
state-free and only access metadata in terms of AVFrame properties. So there
will be no independent configuration of the input chroma location etc. like
there is currently, and no need to re-configure or re-init the context when
feeding it frames with different properties. The goal is for users to be able
to just feed it AVFrame pairs and have it internally cache expensive
pre-processing steps as needed. Finally, avscale_* should ultimately also
support hardware frames directly, in which case it will dispatch to some
equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will
defer this to a future milestone)
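For illustration, a minimal sketch of what calling such a state-free API might look like. The function names come from the attached draft header and the pixel format/logging calls are standard libavutil; everything else (the exact flow, the frame fields consulted) is hypothetical:

```c
AVScaleContext *ctx = avscale_alloc_context();

AVFrame *dst = av_frame_alloc();
dst->format = AV_PIX_FMT_YUV420P;  /* desired output format */
dst->width  = 1280;
dst->height = 720;
/* output colorspace, primaries, range, chroma location etc. would likewise
 * just be tagged on dst, not configured on the context */

/* no re-configure or re-init when src properties change between calls */
int ret = avscale_frame(ctx, dst, src);
if (ret < 0)
    av_log(ctx, AV_LOG_ERROR, "unsupported conversion\n");

avscale_free_context(&ctx);
```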
After this API is established, I want to start expanding the functionality in
the following manner:
### Phase 1
For basic operation, avscale_* will just dispatch to a sequence of swscale_*
invocations. In the basic case, it will just directly invoke swscale with
minimal overhead. In more advanced cases, it might resolve to a *sequence* of
swscale operations, with other operations (e.g. colorspace conversions a la
vf_colorspace) mixed in.
This will allow us to gain new functionality in a minimally invasive way, and
will let API users start porting to the new API. This will also serve as a good
"selling point" for the new API, allowing us to hopefully break up the legacy
swscale API afterwards.
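As a rough sketch of that dispatch (sws_scale_frame() is existing API; every other name here is invented for illustration):

```c
/* Hypothetical internal entry point: trivial cases go straight to swscale,
 * anything else becomes a list of passes executed in sequence. */
int avscale_frame(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src)
{
    if (is_trivial_conversion(src, dst))   /* same primaries/TRC, SDR, ... */
        return sws_scale_frame(ctx->sws, dst, src);

    /* e.g. yuv2rgb -> linearize -> gamut map -> scale -> pack,
     * mixing in vf_colorspace-style operations as needed */
    return avscale_run_pass_list(ctx, dst, src);
}
```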
### Phase 2
After this is working, I want to cleanly separate swscale into two distinct
components:
1. vertical/horizontal scaling
2. input/output conversions
Right now, these operations both live inside the main SwsContext, even though
they are conceptually orthogonal. Input handling is done entirely by the
abstract callbacks lumToYV12 etc., while output conversion is currently
"merged" with vertical scaling (yuv2planeX etc.).
I want to cleanly separate these components so they can live inside independent
contexts, and be considered as semantically distinct steps. (In particular,
there should ideally be no more "unscaled special converters"; instead, this
can be seen as a special case where there simply is no vertical/horizontal
scaling step.)
The idea is for the colorspace conversion layer to sit in between the
input/output converters and the horizontal/vertical scalers. This all would be
orchestrated by the avscale_* abstraction.
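A toy model of the proposed split, with invented names and trivial per-pixel arithmetic standing in for real conversions; the point is only the orchestration, where the "unscaled special converter" case is the same pipeline with the scaling step omitted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Each step lives in its own independent context. */
typedef struct InputCtx  { int bias; }     InputCtx;   /* input conversion  */
typedef struct ScaleCtx  { int num, den; } ScaleCtx;   /* horizontal scaler */
typedef struct OutputCtx { int bias; }     OutputCtx;  /* output conversion */

static void input_convert(const InputCtx *c, uint8_t *dst, const uint8_t *src, int w)
{
    for (int i = 0; i < w; i++)
        dst[i] = (uint8_t)(src[i] + c->bias);
}

/* Nearest-neighbor resize standing in for a real polyphase scaler. */
static int hscale(const ScaleCtx *c, uint8_t *dst, const uint8_t *src, int w)
{
    int out_w = w * c->num / c->den;
    for (int i = 0; i < out_w; i++)
        dst[i] = src[i * c->den / c->num];
    return out_w;
}

static void output_convert(const OutputCtx *c, uint8_t *dst, const uint8_t *src, int w)
{
    for (int i = 0; i < w; i++)
        dst[i] = (uint8_t)(src[i] + c->bias);
}

/* Orchestrator: input conversion -> (optional) scaling -> output conversion.
 * Passing sc == NULL is the "unscaled" case: same pipeline, no scaling step. */
static int convert_row(uint8_t *dst, const uint8_t *src, int w,
                       const InputCtx *in, const ScaleCtx *sc, const OutputCtx *out)
{
    uint8_t tmp1[1024], tmp2[1024];
    input_convert(in, tmp1, src, w);
    if (sc)
        w = hscale(sc, tmp2, tmp1, w);
    else
        memcpy(tmp2, tmp1, w);
    output_convert(out, dst, tmp2, w);
    return w;
}
```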
## Implementation details
To avoid performance loss from separating "merged" functions into their
constituents, care needs to be taken such that all intermediate data, in
addition to all involved look-up tables, will fit comfortably inside the L1
cache. The approach I propose, which is also (afaict) used by zscale, is to
loop over line segments, applying each operation in sequence, on a small
temporary buffer.
e.g.
void hscale_row(pixel *dst, const pixel *src, int img_width)
{
    const int SIZE = 256; // or some other small-ish figure, possibly a design
                          // constant of the API so that SIMD implementations
                          // can be appropriately unrolled

    pixel tmp[SIZE];
    for (int i = 0; i < img_width; i += SIZE) {
        int pixels = FFMIN(SIZE, img_width - i);

        { /* inside read input callback */
            unpack_input(tmp, src, pixels);
            // the amount of separation here will depend on the performance
            apply_matrix3x3(tmp, yuv2rgb, pixels);
            apply_lut3x1d(tmp, gamma_lut, pixels);
            ...
        }

        hscale(dst, tmp, filter, pixels);

        src += pixels;
        dst += scale_factor(pixels);
    }
}
This function can then output rows into a ring buffer for use inside the
vertical scaler, after which the same procedure happens (in reverse) for the
final output pass.
Possibly, we also want to limit the size of a row for the horizontal scaler,
to allow arbitrarily large input images.
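A toy sketch of that ring-buffer handoff between the horizontal and vertical passes (all names are invented, and a plain box filter stands in for real filter taps and phases): rows land in a small ring of N_TAPS rows, and once the ring is full, each new row lets the vertical filter emit one output row without ever materializing the whole image.

```c
#include <assert.h>
#include <stdint.h>

#define N_TAPS 3
#define MAX_W  1024

typedef struct RowRing {
    uint16_t rows[N_TAPS][MAX_W];
    int count;                      /* rows pushed so far */
} RowRing;

/* Store a horizontally processed row in the next ring slot. */
static void ring_push(RowRing *r, const uint16_t *row, int w)
{
    uint16_t *slot = r->rows[r->count % N_TAPS];
    for (int i = 0; i < w; i++)
        slot[i] = row[i];
    r->count++;
}

/* Emit one vertically filtered row (box filter over N_TAPS rows);
 * returns 0 if the ring does not yet hold enough rows. */
static int ring_vscale(const RowRing *r, uint16_t *dst, int w)
{
    if (r->count < N_TAPS)
        return 0;
    for (int i = 0; i < w; i++) {
        unsigned acc = 0;
        for (int t = 0; t < N_TAPS; t++)
            acc += r->rows[t][i];
        dst[i] = (uint16_t)(acc / N_TAPS);
    }
    return 1;
}
```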
## Comments / feedback?
Does the above approach seem reasonable? How do people feel about introducing
a new API vs. trying to hammer the existing API into the shape I want it to be?
I've attached an example of what <avscale.h> could end up looking like. If
there is broad agreement on this design, I will move on to an implementation.
[-- Attachment #2: avscale.h --]
[-- Type: text/plain, Size: 3865 bytes --]
/*
* Copyright (C) 2024 Niklas Haas
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef SWSCALE_AVSCALE_H
#define SWSCALE_AVSCALE_H
/**
* @file
* @ingroup libsws
* Higher-level wrapper around libswscale + related libraries, which is
* capable of handling more advanced colorspace transformations.
*/
#include "libavutil/frame.h"
#include "libavutil/log.h"
/**
* Main external API structure. New fields cannot be added to the end with
* minor version bumps. Removal, reordering and changes to existing fields
* require a major version bump. sizeof(AVScaleContext) is not part of the ABI.
*/
typedef struct AVScaleContext {
const AVClass *av_class;
/**
* Private context used for internal data.
*/
struct AVScaleInternal *internal;
/**
* Private data of the user, can be used to carry app specific stuff.
*/
void *opaque;
/**
* Bitmask of AV_SCALE_* flags.
*/
int64_t flags;
/**
* Bitmask of SWS_* flags.
*/
int sws_flags;
/**
* Bitmask of SWS_DITHER_* flags.
*/
int sws_dither;
} AVScaleContext;
enum {
/**
* Skip linearization, speeds up downscaling at the cost of quality.
*/
AV_SCALE_NOLINEAR = 1 << 0,
/**
* Force full internal conversion to RGB. May improve the quality of
* certain operations, at the cost of performance.
*/
AV_SCALE_FULL_RGB = 1 << 1,
/**
* Perform perceptual conversion between colorspaces instead of clipping.
*/
AV_SCALE_PERCEPTUAL = 1 << 2,
// ...
};
/**
* Allocate an AVScaleContext and set its fields to default values. The
* resulting struct should be freed with avscale_free_context().
*/
AVScaleContext *avscale_alloc_context(void);
/**
 * Free the scaling context and everything associated with it, and write NULL
* to the provided pointer.
*/
void avscale_free_context(AVScaleContext **ctx);
/**
* Get the AVClass for AVScaleContext. It can be used in combination with
* AV_OPT_SEARCH_FAKE_OBJ for examining options.
*
* @see av_opt_find().
*/
const AVClass *avscale_get_class(void);
/**
* Scale source data from `src` and write the output to `dst`.
*
* @param ctx The scaling context.
* @param dst The destination frame.
*
* The data buffers may either be already allocated by the caller
* or left clear, in which case they will be allocated by the
* scaler. The latter may have performance advantages - e.g. in
* certain cases some output planes may be references to input
* planes, rather than copies.
*
* @param src The source frame. If the data buffers are set to NULL, then
* this function performs no conversion. It will instead merely
* initialize internal state that *would* be required to perform
 * the operation, as well as returning the correct error code for
* unsupported frame combinations.
*
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src);
#endif /* SWSCALE_AVSCALE_H */
[-- Attachment #3: Type: text/plain, Size: 251 bytes --]
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [RFC] swscale modernization proposal
2024-06-22 13:13 [FFmpeg-devel] [RFC] swscale modernization proposal Niklas Haas
@ 2024-06-22 14:23 ` Andrew Sayers
2024-06-22 15:10 ` Niklas Haas
2024-06-22 22:19 ` Vittorio Giovara
` (3 subsequent siblings)
4 siblings, 1 reply; 36+ messages in thread
From: Andrew Sayers @ 2024-06-22 14:23 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, Jun 22, 2024 at 03:13:34PM +0200, Niklas Haas wrote:
[...]
>
> ## Comments / feedback?
>
> Does the above approach seem reasonable? How do people feel about introducing
> a new API vs. trying to hammer the existing API into the shape I want it to be?
>
> I've attached an example of what <avscale.h> could end up looking like. If
> there is broad agreement on this design, I will move on to an implementation.
API users seem to have difficulty with this type of big change[1],
and doing the interface before the implementation means there's less
reason for developers to switch while you're still looking for feedback.
What's the plan to bring them along?
[1] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-June/328852.html
* Re: [FFmpeg-devel] [RFC] swscale modernization proposal
2024-06-22 14:23 ` Andrew Sayers
@ 2024-06-22 15:10 ` Niklas Haas
2024-06-22 19:52 ` Michael Niedermayer
0 siblings, 1 reply; 36+ messages in thread
From: Niklas Haas @ 2024-06-22 15:10 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, 22 Jun 2024 15:23:22 +0100 Andrew Sayers <ffmpeg-devel@pileofstuff.org> wrote:
> On Sat, Jun 22, 2024 at 03:13:34PM +0200, Niklas Haas wrote:
> [...]
> >
> > ## Comments / feedback?
> >
> > Does the above approach seem reasonable? How do people feel about introducing
> > a new API vs. trying to hammer the existing API into the shape I want it to be?
> >
> > I've attached an example of what <avscale.h> could end up looking like. If
> > there is broad agreement on this design, I will move on to an implementation.
>
> API users seem to have difficulty with this type of big change[1],
> and doing the interface before the implementation means there's less
> reason for developers to switch while you're still looking for feedback.
>
> What's the plan to bring them along?
Since SwsContext is entirely internal, we can continue providing the
current API on top of whatever internal abstractions we arrive at. As
a trivial example, sws_scale() can construct a temporary AVFrame based
on the provided information, and simply pass that to avscale_frame(). So
I don't think legacy API users are at risk, or under pressure to switch,
unless they want access to *new* functionality.
For that, the harder step is moving from sws_scale() to
sws_scale_frame(). This is something API users can *already* do. By
contrast, moving from sws_scale_frame() to avscale_frame() should
hopefully be simple, since it just requires making sure the AVFrame is
correctly tagged. Usually, the flow is in the opposite direction - users
receive a correctly tagged AVFrame and manually forward this information
to the SwsContext. So, most of the time, moving to a fully AVFrame-based
API will result in deleting code, rather than adding it.
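To illustrate the direction of data flow described above (sws_setColorspaceDetails() and sws_getCoefficients() are existing API, with most arguments elided here for brevity; avscale_frame() is the draft function):

```c
/* Today: manually forward the frame's own tags into the stateful context */
sws_setColorspaceDetails(sws, sws_getCoefficients(src->colorspace),
                         src->color_range == AVCOL_RANGE_JPEG, /* ... */);
sws_scale_frame(sws, dst, src);

/* Proposed: the tags on src and dst *are* the configuration */
avscale_frame(ctx, dst, src);
```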
If we wanted to maximize the transition comfort, we should consider
re-using the sws_scale_frame() entrypoint directly. The main reason I am
hesitant to do this is because sws_scale_frame() is currently tied to
the stateful configuration of SwsContext, and mostly ignores the AVFrame
metadata. While changing that is possible, it would arguably present a bigger
API break than simply introducing a new function.
>
> [1] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-June/328852.html
* Re: [FFmpeg-devel] [RFC] swscale modernization proposal
2024-06-22 15:10 ` Niklas Haas
@ 2024-06-22 19:52 ` Michael Niedermayer
2024-06-22 22:24 ` Niklas Haas
0 siblings, 1 reply; 36+ messages in thread
From: Michael Niedermayer @ 2024-06-22 19:52 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 3309 bytes --]
On Sat, Jun 22, 2024 at 05:10:28PM +0200, Niklas Haas wrote:
> On Sat, 22 Jun 2024 15:23:22 +0100 Andrew Sayers <ffmpeg-devel@pileofstuff.org> wrote:
> > On Sat, Jun 22, 2024 at 03:13:34PM +0200, Niklas Haas wrote:
> > [...]
> > >
> > > ## Comments / feedback?
> > >
> > > Does the above approach seem reasonable? How do people feel about introducing
> > > a new API vs. trying to hammer the existing API into the shape I want it to be?
> > >
> > > I've attached an example of what <avscale.h> could end up looking like. If
> > > there is broad agreement on this design, I will move on to an implementation.
> >
> > API users seem to have difficulty with this type of big change[1],
> > and doing the interface before the implementation means there's less
> > reason for developers to switch while you're still looking for feedback.
> >
> > What's the plan to bring them along?
>
> Since SwsContext is entirely internal, we can continue providing the
> current API on top of whatever internal abstractions we arrive at. As
> a trivial example, sws_scale() can construct a temporary AVFrame based
> on the provided information, and simply pass that to avscale_frame(). So
> I don't think legacy API users are at risk, or pressure to switch,
> unless they want access to *new* functionality.
>
> For that, the harder step is moving from sws_scale() to
> sws_scale_frame(). This is something API users can *already* do. By
> contrast, moving from sws_scale_frame() to avscale_frame() should
> hopefully be simple, since it just requires making sure the AVFrame is
> correctly tagged. Usually, the flow is in the opposite direction - users
> receive a correctly tagged AVFrame and manually forward this information
> to the SwsContext. So, most of the time, moving to a fully AVFrame-based
> API will result in deleting code, rather than adding it.
>
> If we wanted to maximize the transition comfort, we should consider
> re-using the sws_scale_frame() entrypoint directly. The main reason I am
> hesitant to do this is because sws_scale_frame() is currently tied to
> the stateful configuration of SwsContext, and mostly ignores the AVFrame
> metadata. While changing that is possible, it possibly presents a bigger
> API break than simply introducing a new function.
I agree we should keep using the same swscale.h header. It matches the library
name that's installed (that's also what the user expects and what (s)he is used to),
and it's what users #include today.
Also, it's not an audio scaler, so the "a" in "avscale" is confusing.
And sws_scale_frame() should obviously be used if, as you say, that would
"maximize the transition comfort".
Maybe simply add an option for the library user to set the behavior
(favour AVFrame properties vs. initial properties), and then eventually
deprecate and phase out the initial ones.
The big advantage here is that we capture all users; no one stays on the old
API. And the transition is also simpler if it's just a flag to flip for someone
to try the new fully stateless system.
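Something like the following, perhaps (av_opt_set_int() and sws_alloc_context() are existing API; the option name "frame_props" is purely hypothetical):

```c
struct SwsContext *sws = sws_alloc_context();
/* flip one flag to opt in to the fully stateless, AVFrame-driven behavior */
av_opt_set_int(sws, "frame_props", 1, 0);
ret = sws_scale_frame(sws, dst, src);
```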
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact
solution.
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 251 bytes --]
* Re: [FFmpeg-devel] [RFC] swscale modernization proposal
2024-06-22 13:13 [FFmpeg-devel] [RFC] swscale modernization proposal Niklas Haas
2024-06-22 14:23 ` Andrew Sayers
@ 2024-06-22 22:19 ` Vittorio Giovara
2024-06-22 22:39 ` Niklas Haas
` (2 more replies)
2024-06-29 7:41 ` Zhao Zhili
` (2 subsequent siblings)
4 siblings, 3 replies; 36+ messages in thread
From: Vittorio Giovara @ 2024-06-22 22:19 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, Jun 22, 2024 at 3:22 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
> Hey,
>
> As some of you know, I got contracted (by STF 2024) to work on improving
> swscale, over the course of the next couple of months. I want to share my
> current plans and gather feedback + measure sentiment.
>
> ## Problem statement
>
> The two issues I'd like to focus on for now are:
>
> 1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
> IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
> 2. Complicated context management, with cascaded contexts, threading,
> stateful
> configuration, multi-step init procedures, etc; and related bugs
>
> In order to make these feasible, some amount of internal re-organization of
> duties inside swscale is prudent.
>
> ## Proposed approach
>
> The first step is to create a new API, which will (tentatively) live in
> <libswscale/avscale.h>. This API will initially start off as a near-copy
> of the
> current swscale public API, but with the major difference that I want it
> to be
> state-free and only access metadata in terms of AVFrame properties. So
> there
> will be no independent configuration of the input chroma location etc. like
> there is currently, and no need to re-configure or re-init the context when
> feeding it frames with different properties. The goal is for users to be
> able
> to just feed it AVFrame pairs and have it internally cache expensive
> pre-processing steps as needed. Finally, avscale_* should ultimately also
> support hardware frames directly, in which case it will dispatch to some
> equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I
> will
> defer this to a future milestone)
>
> After this API is established, I want to start expanding the functionality
> in
> the following manner:
>
> ### Phase 1
>
> For basic operation, avscale_* will just dispatch to a sequence of
> swscale_*
> invocations. In the basic case, it will just directly invoke swscale with
> minimal overhead. In more advanced cases, it might resolve to a *sequence*
> of
> swscale operations, with other operations (e.g. colorspace conversions a la
> vf_colorspace) mixed in.
>
> This will allow us to gain new functionality in a minimally invasive way,
> and
> will let API users start porting to the new API. This will also serve as a
> good
> "selling point" for the new API, allowing us to hopefully break up the
> legacy
> swscale API afterwards.
>
> ### Phase 2
>
> After this is working, I want to cleanly separate swscale into two distinct
> components:
>
> 1. vertical/horizontal scaling
> 2. input/output conversions
>
> Right now, these operations both live inside the main SwsContext, even
> though
> they are conceptually orthogonal. Input handling is done entirely by the
> abstract callbacks lumToYV12 etc., while output conversion is currently
> "merged" with vertical scaling (yuv2planeX etc.).
>
> I want to cleanly separate these components so they can live inside
> independent
> contexts, and be considered as semantically distinct steps. (In particular,
> there should ideally be no more "unscaled special converters", instead
> this can
> be seen as a special case where there simply is no vertical/horizontal
> scaling
> step)
>
> The idea is for the colorspace conversion layer to sit in between the
> input/output converters and the horizontal/vertical scalers. This all
> would be
> orchestrated by the avscale_* abstraction.
>
> ## Implementation details
>
> To avoid performance loss from separating "merged" functions into their
> constituents, care needs to be taken such that all intermediate data, in
> addition to all involved look-up tables, will fit comfortably inside the L1
> cache. The approach I propose, which is also (afaict) used by zscale, is to
> loop over line segments, applying each operation in sequence, on a small
> temporary buffer.
>
> e.g.
>
> hscale_row(pixel *dst, const pixel *src, int img_width)
> {
>     const int SIZE = 256; // or some other small-ish figure, possibly a design
>                           // constant of the API so that SIMD implementations
>                           // can be appropriately unrolled
>
>     pixel tmp[SIZE];
>     for (i = 0; i < img_width; i += SIZE) {
>         int pixels = min(SIZE, img_width - i);
>
>         { /* inside read input callback */
>             unpack_input(tmp, src, pixels);
>             // the amount of separation here will depend on the performance
>             apply_matrix3x3(tmp, yuv2rgb, pixels);
>             apply_lut3x1d(tmp, gamma_lut, pixels);
>             ...
>         }
>
>         hscale(dst, tmp, filter, pixels);
>
>         src += pixels;
>         dst += scale_factor(pixels);
>     }
> }
>
> This function can then output rows into a ring buffer for use inside the
> vertical scaler, after which the same procedure happens (in reverse) for
> the
> final output pass.
>
> Possibly, we also want to additionally limit the size of a row for the
> horizontal scaler, to allow arbitrary large input images.
>
> ## Comments / feedback?
>
> Does the above approach seem reasonable? How do people feel about
> introducing
> a new API vs. trying to hammer the existing API into the shape I want it
> to be?
>
> I've attached an example of what <avscale.h> could end up looking like. If
> there is broad agreement on this design, I will move on to an
> implementation.
>
What do you think of the concept of kernels, like
https://github.com/lu-zero/avscale/blob/master/kernels/rgb2yuv.c?
The idea is that there is a bit of analysis of the requested input and output
formats, and either a specialized kernel is used or a chain of kernels is
built and data is passed along.
Among the design goals of that library, there was also readability (so that
the flow was always under control) and the ease of writing assembly and/or
shader for any single kernel.
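A toy model of that kernel-selection idea (all names and the stand-in per-pixel math are invented): analyze the (input, output) pair, use a direct specialized kernel when one exists, otherwise build a chain through a common intermediate format.

```c
#include <assert.h>
#include <stddef.h>

enum Fmt { FMT_RGB, FMT_YUV, FMT_GRAY };

typedef void (*KernelFn)(float *px);

/* Trivial stand-ins for real conversion math. */
static void k_yuv_to_rgb(float *px)  { px[0] += 1.0f; }
static void k_rgb_to_gray(float *px) { px[0] += 0.5f; }

typedef struct Kernel { enum Fmt in, out; KernelFn fn; } Kernel;

static const Kernel kernels[] = {
    { FMT_YUV, FMT_RGB,  k_yuv_to_rgb  },
    { FMT_RGB, FMT_GRAY, k_rgb_to_gray },
};
#define N_KERNELS (sizeof(kernels) / sizeof(kernels[0]))

/* Build a chain: a direct specialized kernel if one exists,
 * otherwise route via FMT_RGB as a common intermediate. */
static int build_chain(enum Fmt src, enum Fmt dst, KernelFn *chain)
{
    int n = 0;
    for (size_t i = 0; i < N_KERNELS; i++)
        if (kernels[i].in == src && kernels[i].out == dst) {
            chain[n++] = kernels[i].fn;
            return n;
        }
    for (size_t i = 0; i < N_KERNELS; i++)
        if (kernels[i].in == src && kernels[i].out == FMT_RGB)
            chain[n++] = kernels[i].fn;
    for (size_t i = 0; i < N_KERNELS; i++)
        if (kernels[i].in == FMT_RGB && kernels[i].out == dst)
            chain[n++] = kernels[i].fn;
    return n;
}
```

Data is then passed along the chain by applying each kernel in turn.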
Needless to say, I support the plan of renaming the library so that it is in
line with the other libraries' names, and the use of a separate header, since
downstream applications will need to update a lot to use the new library (or
the new APIs in the existing library); and/or we could provide a thin
conversion layer once the new lib is finalized.
--
Vittorio
* Re: [FFmpeg-devel] [RFC] swscale modernization proposal
2024-06-22 19:52 ` Michael Niedermayer
@ 2024-06-22 22:24 ` Niklas Haas
2024-06-23 17:27 ` Michael Niedermayer
0 siblings, 1 reply; 36+ messages in thread
From: Niklas Haas @ 2024-06-22 22:24 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, 22 Jun 2024 21:52:42 +0200 Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Sat, Jun 22, 2024 at 05:10:28PM +0200, Niklas Haas wrote:
> > On Sat, 22 Jun 2024 15:23:22 +0100 Andrew Sayers <ffmpeg-devel@pileofstuff.org> wrote:
> > > On Sat, Jun 22, 2024 at 03:13:34PM +0200, Niklas Haas wrote:
> > > [...]
> > > >
> > > > ## Comments / feedback?
> > > >
> > > > Does the above approach seem reasonable? How do people feel about introducing
> > > > a new API vs. trying to hammer the existing API into the shape I want it to be?
> > > >
> > > > I've attached an example of what <avscale.h> could end up looking like. If
> > > > there is broad agreement on this design, I will move on to an implementation.
> > >
> > > API users seem to have difficulty with this type of big change[1],
> > > and doing the interface before the implementation means there's less
> > > reason for developers to switch while you're still looking for feedback.
> > >
> > > What's the plan to bring them along?
> >
> > Since SwsContext is entirely internal, we can continue providing the
> > current API on top of whatever internal abstractions we arrive at. As
> > a trivial example, sws_scale() can construct a temporary AVFrame based
> > on the provided information, and simply pass that to avscale_frame(). So
> > I don't think legacy API users are at risk, or pressure to switch,
> > unless they want access to *new* functionality.
> >
> > For that, the harder step is moving from sws_scale() to
> > sws_scale_frame(). This is something API users can *already* do. By
> > contrast, moving from sws_scale_frame() to avscale_frame() should
> > hopefully be simple, since it just requires making sure the AVFrame is
> > correctly tagged. Usually, the flow is in the opposite direction - users
> > receive a correctly tagged AVFrame and manually forward this information
> > to the SwsContext. So, most of the time, moving to a fully AVFrame-based
> > API will result in deleting code, rather than adding it.
> >
> > If we wanted to maximize the transition comfort, we should consider
> > re-using the sws_scale_frame() entrypoint directly. The main reason I am
> > hesitant to do this is because sws_scale_frame() is currently tied to
> > the stateful configuration of SwsContext, and mostly ignores the AVFrame
> > metadata. While changing that is possible, it possibly presents a bigger
> > API break than simply introducing a new function.
>
> I agree we should keep using the same swscale.h header. It matches the library
> name that's installed (that's also what the user expects and what (s)he is used to),
> and it's what users #include today.
> Also, it's not an audio scaler, so the "a" in "avscale" is confusing.
>
> And sws_scale_frame() should obviously be used if, as you say, that would
> "maximize the transition comfort".
>
> Maybe simply add an option for the library user to set the behavior
> (favour AVFrame properties vs. initial properties), and then eventually
> deprecate and phase out the initial ones.
>
> The big advantage here is that we capture all users; no one stays on the old
> API. And the transition is also simpler if it's just a flag to flip for someone
> to try the new fully stateless system.
This could definitely work. We could then also eventually flip the
condition to where the new behavior becomes the default, and you need to
set a flag to *disable* it.
And eventually deprecate sws_init_context(), sws_setCoefficients() etc.
altogether and just have sws_alloc_context() + sws_scale_frame() be the
preferred front-ends.
I expect the actual amount of work to be similar; rather than taking
SwsContext and pulling everything high-level out into AVScaleContext, we
start with SwsContext and pull everything low-level out into separate
sub-contexts (e.g. one SwsScaleContext for each individual scaling
step).
>
> thx
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Complexity theory is the science of finding the exact solution to an
> approximation. Benchmarking OTOH is finding an approximation of the exact
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 22:19 ` Vittorio Giovara
@ 2024-06-22 22:39 ` Niklas Haas
2024-06-23 17:46 ` Michael Niedermayer
2024-06-23 17:57 ` James Almer
2 siblings, 0 replies; 36+ messages in thread
From: Niklas Haas @ 2024-06-22 22:39 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sun, 23 Jun 2024 00:19:13 +0200 Vittorio Giovara <vittorio.giovara@gmail.com> wrote:
> On Sat, Jun 22, 2024 at 3:22 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> > Hey,
> >
> > As some of you know, I got contracted (by STF 2024) to work on improving
> > swscale, over the course of the next couple of months. I want to share my
> > current plans and gather feedback + measure sentiment.
> >
> > ## Problem statement
> >
> > The two issues I'd like to focus on for now are:
> >
> > 1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
> > IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
> > 2. Complicated context management, with cascaded contexts, threading,
> > stateful
> > configuration, multi-step init procedures, etc; and related bugs
> >
> > In order to make these feasible, some amount of internal re-organization of
> > duties inside swscale is prudent.
> >
> > ## Proposed approach
> >
> > The first step is to create a new API, which will (tentatively) live in
> > <libswscale/avscale.h>. This API will initially start off as a near-copy
> > of the
> > current swscale public API, but with the major difference that I want it
> > to be
> > state-free and only access metadata in terms of AVFrame properties. So
> > there
> > will be no independent configuration of the input chroma location etc. like
> > there is currently, and no need to re-configure or re-init the context when
> > feeding it frames with different properties. The goal is for users to be
> > able
> > to just feed it AVFrame pairs and have it internally cache expensive
> > pre-processing steps as needed. Finally, avscale_* should ultimately also
> > support hardware frames directly, in which case it will dispatch to some
> > equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I
> > will
> > defer this to a future milestone)
> >
> > After this API is established, I want to start expanding the functionality
> > in
> > the following manner:
> >
> > ### Phase 1
> >
> > For basic operation, avscale_* will just dispatch to a sequence of
> > swscale_* invocations. In the basic case, it will just directly
> > invoke swscale with minimal overhead. In more advanced cases, it
> > might resolve to a *sequence* of swscale operations, with other
> > operations (e.g. colorspace conversions a la vf_colorspace) mixed
> > in.
> >
> > This will allow us to gain new functionality in a minimally invasive
> > way, and will let API users start porting to the new API. This will
> > also serve as a good "selling point" for the new API, allowing us to
> > hopefully break up the legacy swscale API afterwards.
> >
> > ### Phase 2
> >
> > After this is working, I want to cleanly separate swscale into two distinct
> > components:
> >
> > 1. vertical/horizontal scaling
> > 2. input/output conversions
> >
> > Right now, these operations both live inside the main SwsContext,
> > even though they are conceptually orthogonal. Input handling is done
> > entirely by the abstract callbacks lumToYV12 etc., while output
> > conversion is currently "merged" with vertical scaling (yuv2planeX
> > etc.).
> >
> > I want to cleanly separate these components so they can live inside
> > independent contexts, and be considered as semantically distinct
> > steps. (In particular, there should ideally be no more "unscaled
> > special converters"; instead this can be seen as a special case
> > where there simply is no vertical/horizontal scaling step)
> >
> > The idea is for the colorspace conversion layer to sit in between
> > the input/output converters and the horizontal/vertical scalers.
> > This all would be orchestrated by the avscale_* abstraction.
> >
> > ## Implementation details
> >
> > To avoid performance loss from separating "merged" functions into their
> > constituents, care needs to be taken such that all intermediate data, in
> > addition to all involved look-up tables, will fit comfortably inside the L1
> > cache. The approach I propose, which is also (afaict) used by zscale, is to
> > loop over line segments, applying each operation in sequence, on a small
> > temporary buffer.
> >
> > e.g.
> >
> > hscale_row(pixel *dst, const pixel *src, int img_width)
> > {
> >     // SIZE: some small-ish figure, possibly a design constant of the
> >     // API so that SIMD implementations can be appropriately unrolled
> >     const int SIZE = 256;
> >
> >     pixel tmp[SIZE];
> >     for (int i = 0; i < img_width; i += SIZE) {
> >         int pixels = min(SIZE, img_width - i);
> >
> >         { /* inside read input callback */
> >             unpack_input(tmp, src, pixels);
> >             // the amount of separation here will depend on the performance
> >             apply_matrix3x3(tmp, yuv2rgb, pixels);
> >             apply_lut3x1d(tmp, gamma_lut, pixels);
> >             ...
> >         }
> >
> >         hscale(dst, tmp, filter, pixels);
> >
> >         src += pixels;
> >         dst += scale_factor(pixels);
> >     }
> > }
> >
> > This function can then output rows into a ring buffer for use inside
> > the vertical scaler, after which the same procedure happens (in
> > reverse) for the final output pass.
> >
> > Possibly, we also want to limit the size of a row for the horizontal
> > scaler, to allow arbitrarily large input images.
> >
> > ## Comments / feedback?
> >
> > Does the above approach seem reasonable? How do people feel about
> > introducing a new API vs. trying to hammer the existing API into the
> > shape I want it to be?
> >
> > I've attached an example of what <avscale.h> could end up looking
> > like. If there is broad agreement on this design, I will move on to
> > an implementation.
> >
>
> What do you think of the concept of kernels like
> https://github.com/lu-zero/avscale/blob/master/kernels/rgb2yuv.c
> The idea is that there is a bit of analysis on input and output format
> requested, and either a specialized kernel is used, or a chain of kernels
> is built and data is passed along.
> Among the design goals of that library, there was also readability (so that
> the flow was always under control) and the ease of writing assembly and/or
> shader for any single kernel.
Yes, this is exactly the plan - the high-level wrapping logic will
essentially decompose the desired initial and final state into
a sequence of primitives for getting from A to B. (Quite possibly even
a *literal* list in memory, that we can inspect and apply
merge-optimizations to)
That's the main reason I am so keen on splitting it off from SwsContext
as much as possible - in implementation, if not in name - so that it can
grow from a clean slate.
Incidentally, it would be pretty useful to arrive at a point where we
have a small (~10) set of "atomic" primitives, the sum of which allows
supporting all possible conversions in principle, so that with just
these SIMD implementations you can get acceptable performance across any
platform. And then platforms that really care about fine-tuning can
provide more specialized, "inlined" kernels for popular or slow
operations.
> Needless to say I support the plan of renaming the library so that it
> can be in line with the other libraries' names, and the use of a
> separate header since downstream applications will need to update a
> lot to use the new library (or the new APIs in the existing library),
> and/or we could provide a thin conversion layer when the new lib is
> finalized.
> --
> Vittorio
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 22:24 ` Niklas Haas
@ 2024-06-23 17:27 ` Michael Niedermayer
0 siblings, 0 replies; 36+ messages in thread
From: Michael Niedermayer @ 2024-06-23 17:27 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 4378 bytes --]
On Sun, Jun 23, 2024 at 12:24:57AM +0200, Niklas Haas wrote:
> On Sat, 22 Jun 2024 21:52:42 +0200 Michael Niedermayer <michael@niedermayer.cc> wrote:
> > On Sat, Jun 22, 2024 at 05:10:28PM +0200, Niklas Haas wrote:
> > > On Sat, 22 Jun 2024 15:23:22 +0100 Andrew Sayers <ffmpeg-devel@pileofstuff.org> wrote:
> > > > On Sat, Jun 22, 2024 at 03:13:34PM +0200, Niklas Haas wrote:
> > > > [...]
> > > > >
> > > > > ## Comments / feedback?
> > > > >
> > > > > Does the above approach seem reasonable? How do people feel about introducing
> > > > > a new API vs. trying to hammer the existing API into the shape I want it to be?
> > > > >
> > > > > I've attached an example of what <avscale.h> could end up looking like. If
> > > > > there is broad agreement on this design, I will move on to an implementation.
> > > >
> > > > API users seem to have difficulty with this type of big change [1],
> > > > and doing the interface before the implementation means there's less
> > > > reason for developers to switch while you're still looking for feedback.
> > > >
> > > > What's the plan to bring them along?
> > >
> > > Since SwsContext is entirely internal, we can continue providing the
> > > current API on top of whatever internal abstractions we arrive at. As
> > > a trivial example, sws_scale() can construct a temporary AVFrame based
> > > on the provided information, and simply pass that to avscale_frame(). So
> > > I don't think legacy API users are at risk, or under pressure to
> > > switch, unless they want access to *new* functionality.
> > >
> > > For that, the harder step is moving from sws_scale() to
> > > sws_scale_frame(). This is something API users can *already* do. By
> > > contrast, moving from sws_scale_frame() to avscale_frame() should
> > > hopefully be simple, since it just requires making sure the AVFrame is
> > > correctly tagged. Usually, the flow is in the opposite direction - users
> > > receive a correctly tagged AVFrame and manually forward this information
> > > to the SwsContext. So, most of the time, moving to a fully AVFrame-based
> > > API will result in deleting code, rather than adding it.
> > >
> > > If we wanted to maximize the transition comfort, we should consider
> > > re-using the sws_scale_frame() entrypoint directly. The main reason I am
> > > hesitant to do this is because sws_scale_frame() is currently tied to
> > > the stateful configuration of SwsContext, and mostly ignores the AVFrame
> > > metadata. While changing that is possible, it possibly presents a bigger
> > > API break than simply introducing a new function.
> >
> > I agree we should keep using the same swscale.h header. It matches
> > the library name that's installed (that's also what the user expects
> > and what (s)he is used to), and it's what users #include today.
> > Also it's not an audio scaler, so the A is confusing.
> >
> > And sws_scale_frame() should be used obviously if thats as you say does
> > "maximize the transition comfort"
> >
> > Maybe simply adding an option for the library user to set the behavior
> > (favour AVFrame properties vs initial properties)
> > And then eventually deprecate and phase out the initial ones
> >
> > The big advantage here is that we capture all users; no one stays on
> > the old API. And the transition is also simpler, if it's just a flag
> > to flip for someone to try the new fully stateless system.
>
> This could definitely work. We could then also eventually flip the
> condition to where the new behavior becomes the default, and you need to
> set a flag to *disable* it.
>
> And eventually deprecate sws_init_context(), sws_setCoefficients() etc.
> altogether and just have sws_alloc_context() + sws_scale_frame() be the
> preferred front-ends.
>
> I expect the actual amount of work to be similar; rather than taking
> SwsContext and pulling everything high-level out into AVScaleContext, we
> start with SwsContext and pull everything low-level out into separate
> sub-contexts (e.g. one SwsScaleContext for each individual scaling
> step).
Yes, I prefer this.
Thanks a lot.
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
"Nothing to hide" only works if the folks in power share the values of
you and everyone you know entirely and always will -- Tom Scott
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 22:19 ` Vittorio Giovara
2024-06-22 22:39 ` Niklas Haas
@ 2024-06-23 17:46 ` Michael Niedermayer
2024-06-23 19:00 ` Paul B Mahol
2024-06-23 17:57 ` James Almer
2 siblings, 1 reply; 36+ messages in thread
From: Michael Niedermayer @ 2024-06-23 17:46 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sun, Jun 23, 2024 at 12:19:13AM +0200, Vittorio Giovara wrote:
> On Sat, Jun 22, 2024 at 3:22 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> > [...]
>
> What do you think of the concept of kernels like
> https://github.com/lu-zero/avscale/blob/master/kernels/rgb2yuv.c
> The idea is that there is a bit of analysis on input and output format
> requested, and either a specialized kernel is used, or a chain of kernels
> is built and data is passed along.
> Among the design goals of that library, there was also readability (so that
> the flow was always under control) and the ease of writing assembly and/or
> shader for any single kernel.
I think I have not looked at Luca's work before, so I cannot comment on
it specifically. But I think what you suggest is what Niklas intends to
do. swscale has evolved over a long time from code with a very small
subset of the current features. The code is in need of being
"refactored" into some cleaner kernel / modular design.
Also, as you mention lu_zero: I had talked with him very briefly and he
will be on the next extra member vote for the GA (whoever initiates it,
I'll try to make sure Luca is not forgotten). Just saying, I have not
forgotten him; I just wanted to accumulate more people before bringing
that up.
>
> Needless to say I support the plan of renaming the library so that it can
As the main author of libswscale, I find this quite offensive.
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 22:19 ` Vittorio Giovara
2024-06-22 22:39 ` Niklas Haas
2024-06-23 17:46 ` Michael Niedermayer
@ 2024-06-23 17:57 ` James Almer
2024-06-23 18:40 ` Andrew Sayers
` (2 more replies)
2 siblings, 3 replies; 36+ messages in thread
From: James Almer @ 2024-06-23 17:57 UTC (permalink / raw)
To: ffmpeg-devel
On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> Needless to say I support the plan of renaming the library so that it can
> be inline with the other libraries names, and the use of a separate header
> since downstream applications will need to update a lot to use the new
> library (or the new apis in the existing library) and/or we could provide a
> thin conversion layer when the new lib is finalized.
I don't quite agree with renaming it. As Michael already pointed out,
the av prefix wouldn't fit a scaling library nor a resampling one, as
they only handle one or the other.
There's also the precedent of avresample, which was ultimately dropped
in favor of swresample, so trying to replace swscale with a new avscale
library will be both confusing and going against what was already
established.
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-23 17:57 ` James Almer
@ 2024-06-23 18:40 ` Andrew Sayers
2024-06-24 14:33 ` Niklas Haas
2024-06-24 14:44 ` Vittorio Giovara
2 siblings, 0 replies; 36+ messages in thread
From: Andrew Sayers @ 2024-06-23 18:40 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sun, Jun 23, 2024 at 02:57:31PM -0300, James Almer wrote:
> On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> > Needless to say I support the plan of renaming the library so that it can
> > be inline with the other libraries names, and the use of a separate header
> > since downstream applications will need to update a lot to use the new
> > library (or the new apis in the existing library) and/or we could provide a
> > thin conversion layer when the new lib is finalized.
>
> I don't quite agree with renaming it. As Michael already pointed out, the av
> prefix wouldn't fit a scaling library nor a resampling one, as they only
> handle one or the other.
> There's also the precedent of avresample, which was ultimately dropped in
> favor of swresample, so trying to replace swscale with a new avscale library
> will be both confusing and going against what was already established.
It wouldn't confuse users, because the meaning isn't documented.
Can you summarise what information "av" vs. "sw" prefixes conveys to API users?
Preferably in the form of a patch to the @mainpage section of libavutil/avutil.h :)
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-23 17:46 ` Michael Niedermayer
@ 2024-06-23 19:00 ` Paul B Mahol
0 siblings, 0 replies; 36+ messages in thread
From: Paul B Mahol @ 2024-06-23 19:00 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sun, Jun 23, 2024 at 7:46 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:
> On Sun, Jun 23, 2024 at 12:19:13AM +0200, Vittorio Giovara wrote:
> > On Sat, Jun 22, 2024 at 3:22 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
> >
> > > [...]
> >
> > What do you think of the concept of kernels like
> > https://github.com/lu-zero/avscale/blob/master/kernels/rgb2yuv.c
> > The idea is that there is a bit of analysis on input and output format
> > requested, and either a specialized kernel is used, or a chain of kernels
> > is built and data is passed along.
> > Among the design goals of that library, there was also readability (so
> that
> > the flow was always under control) and the ease of writing assembly
> and/or
> > shader for any single kernel.
>
> I think I have not looked at lucas work before, so i cannot comment on it
> specifically
> But i think what you suggest is what Niklas intends to do.
> swscale has evolved over a long time from code with a very small subset of
> the current features. The code is in need for being "refactored" into some
> cleaner kernel / modular design.
> Also as you mention lu_zero, I had talked with him very briefly and he will
> be on the next extra member vote for the GA (whoever initiates it, ill try
> to
> make sure luca is not forgotten) Just saying, i have not forgotten
> him, just that i wanted to accumulate more people before bringing that up.
>
>
> >
> > Needless to say I support the plan of renaming the library so that it can
>
> As the main author of libswscale, i find this quite offensive.
>
Looks like Rust is not so popular, so bigger coins are in FFland.
>
> thx
>
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Everything should be made as simple as possible, but not simpler.
> -- Albert Einstein
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-23 17:57 ` James Almer
2024-06-23 18:40 ` Andrew Sayers
@ 2024-06-24 14:33 ` Niklas Haas
2024-06-24 14:44 ` Vittorio Giovara
2 siblings, 0 replies; 36+ messages in thread
From: Niklas Haas @ 2024-06-24 14:33 UTC (permalink / raw)
To: ffmpeg-devel
On Sun, 23 Jun 2024 14:57:31 -0300 James Almer <jamrial@gmail.com> wrote:
> On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> > Needless to say I support the plan of renaming the library so that it can
> > be inline with the other libraries names, and the use of a separate header
> > since downstream applications will need to update a lot to use the new
> > library (or the new apis in the existing library) and/or we could provide a
> > thin conversion layer when the new lib is finalized.
>
> I don't quite agree with renaming it. As Michael already pointed out,
> the av prefix wouldn't fit a scaling library nor a resampling one, as
> they only handle one or the other.
By this logic, both libswscale and libswresample should be merged into
libavscale. The mathematics of resampling and scaling is the same :)
Anyway, renaming a library needs a really strong motivating reason, and
I don't see that reason being present here. As discussed further
up-thread, I will try and re-use the existing swscale public API, but
internally restructure things so that SwsContext is itself the
"high-level wrapper" that I intended <avscale.h> to be.
We are very fortunate that SwsContext is entirely private, so I'm not
too concerned about the code implications of this. At worst it will
involve a bunch of renaming commits.
> There's also the precedent of avresample, which was ultimately dropped
> in favor of swresample, so trying to replace swscale with a new avscale
> library will be both confusing and going against what was already
> established.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-23 17:57 ` James Almer
2024-06-23 18:40 ` Andrew Sayers
2024-06-24 14:33 ` Niklas Haas
@ 2024-06-24 14:44 ` Vittorio Giovara
2024-06-25 15:31 ` Niklas Haas
2024-07-01 21:10 ` Stefano Sabatini
2 siblings, 2 replies; 36+ messages in thread
From: Vittorio Giovara @ 2024-06-24 14:44 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sun, Jun 23, 2024 at 7:57 PM James Almer <jamrial@gmail.com> wrote:
> On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> > Needless to say I support the plan of renaming the library so that it can
> > be in line with the other libraries' names, and the use of a separate
> header
> > since downstream applications will need to update a lot to use the new
> > library (or the new apis in the existing library) and/or we could
> provide a
> > thin conversion layer when the new lib is finalized.
>
> I don't quite agree with renaming it. As Michael already pointed out,
> the av prefix wouldn't fit a scaling library nor a resampling one, as
> they only handle one or the other.
>
By that reasoning, we should ban all subtitles from all the libraries.
"av" is a shorthand for multimedia, and many people in the industry refer to
the ffmpeg libs as "libav*", so it feels a bit odd to push for preserving an
alternative name.
> There's also the precedent of avresample, which was ultimately dropped
> in favor of swresample, so trying to replace swscale with a new avscale
> library will be both confusing and going against what was already
> established.
It's still 4 libraries vs 2... and swr/avr is shrouded in bad history that
is not worth bringing up.
I'd understand opposing a rename just for the sake of renaming, but this is
essentially a new library; I see no value in preserving the old naming
scheme, if not making downstream life worse :x
--
Vittorio
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-24 14:44 ` Vittorio Giovara
@ 2024-06-25 15:31 ` Niklas Haas
2024-07-01 21:10 ` Stefano Sabatini
1 sibling, 0 replies; 36+ messages in thread
From: Niklas Haas @ 2024-06-25 15:31 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Mon, 24 Jun 2024 16:44:33 +0200 Vittorio Giovara <vittorio.giovara@gmail.com> wrote:
> On Sun, Jun 23, 2024 at 7:57 PM James Almer <jamrial@gmail.com> wrote:
>
> > [...]
>
> by that reasoning we should ban all subtitles from all the libraries
>
> av is a shorthand of multimedia and many people in the industry refer to
> ffmpeg libs as "libav*" so it feels a bit odd to push for preserving an
> alternative name
>
>
> > There's also the precedent of avresample, which was ultimately dropped
> > in favor of swresample, so trying to replace swscale with a new avscale
> > library will be both confusing and going against what was already
> > established.
>
>
> it's still 4 libraries vs 2... and swr/avr is shrouded in bad history that
> is not worth bringing up
>
> I'd understand opposing a rename just for the sake of renaming, but this is
> essentially a new library, i see no value in preserving the old naming
> scheme, if not making downstream life worse :x
There is a strong motivating reason to keep <avscale.h> (or whatever
name we decide on) and <swscale.h> inside the same *library*, which is
that they both will need to access sws' internal implementation
(ff_sws_* functions).
After a bit of further planning, the current best approach seems to be
to build the new abstraction directly on top of ff_sws_init_scale() and
ff_sws_init_input/output_funcs(), with wholly new code for dispatching
the underlying operations in the correct order.
So I am currently no longer planning on directly wrapping SwsContext at
all. I am also seeing increasingly more hurdles and less benefits to the
idea of directly re-using sws_scale_frame() for the new API.
I think that, in the interest of my own sanity, for the initial
development phase I will go forward with defining a second header file
for the new API; and then only once we've tested it and are confident
that it works as intended, we can think about re-defining sws_scale()
etc. as a wrapper on top of this new API.
> --
> Vittorio
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 13:13 [FFmpeg-devel] [RFC]] swscale modernization proposal Niklas Haas
2024-06-22 14:23 ` Andrew Sayers
2024-06-22 22:19 ` Vittorio Giovara
@ 2024-06-29 7:41 ` Zhao Zhili
2024-06-29 10:58 ` Niklas Haas
2024-06-29 11:47 ` Niklas Haas
2024-07-02 13:27 ` Niklas Haas
4 siblings, 1 reply; 36+ messages in thread
From: Zhao Zhili @ 2024-06-29 7:41 UTC (permalink / raw)
To: FFmpeg development discussions and patches
> On Jun 22, 2024, at 21:13, Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> Hey,
>
> As some of you know, I got contracted (by STF 2024) to work on improving
> swscale, over the course of the next couple of months. I want to share my
> current plans and gather feedback + measure sentiment.
>
> ## Problem statement
>
> The two issues I'd like to focus on for now are:
>
> 1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
> IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
> 2. Complicated context management, with cascaded contexts, threading, stateful
> configuration, multi-step init procedures, etc; and related bugs
>
> In order to make these feasible, some amount of internal re-organization of
> duties inside swscale is prudent.
>
> ## Proposed approach
>
> The first step is to create a new API, which will (tentatively) live in
> <libswscale/avscale.h>. This API will initially start off as a near-copy of the
> current swscale public API, but with the major difference that I want it to be
> state-free and only access metadata in terms of AVFrame properties. So there
> will be no independent configuration of the input chroma location etc. like
> there is currently, and no need to re-configure or re-init the context when
> feeding it frames with different properties. The goal is for users to be able
> to just feed it AVFrame pairs and have it internally cache expensive
> pre-processing steps as needed. Finally, avscale_* should ultimately also
> support hardware frames directly, in which case it will dispatch to some
> equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will
> defer this to a future milestone)
>
> After this API is established, I want to start expanding the functionality in
> the following manner:
>
> ### Phase 1
>
> For basic operation, avscale_* will just dispatch to a sequence of swscale_*
> invocations. In the basic case, it will just directly invoke swscale with
> minimal overhead. In more advanced cases, it might resolve to a *sequence* of
> swscale operations, with other operations (e.g. colorspace conversions a la
> vf_colorspace) mixed in.
>
> This will allow us to gain new functionality in a minimally invasive way, and
> will let API users start porting to the new API. This will also serve as a good
> "selling point" for the new API, allowing us to hopefully break up the legacy
> swscale API afterwards.
>
> ### Phase 2
>
> After this is working, I want to cleanly separate swscale into two distinct
> components:
>
> 1. vertical/horizontal scaling
> 2. input/output conversions
>
> Right now, these operations both live inside the main SwsContext, even though
> they are conceptually orthogonal. Input handling is done entirely by the
> abstract callbacks lumToYV12 etc., while output conversion is currently
> "merged" with vertical scaling (yuv2planeX etc.).
>
> I want to cleanly separate these components so they can live inside independent
> contexts, and be considered as semantically distinct steps. (In particular,
> there should ideally be no more "unscaled special converters", instead this can
> be seen as a special case where there simply is no vertical/horizontal scaling
> step)
>
> The idea is for the colorspace conversion layer to sit in between the
> input/output converters and the horizontal/vertical scalers. This all would be
> orchestrated by the avscale_* abstraction.
>
> ## Implementation details
>
> To avoid performance loss from separating "merged" functions into their
> constituents, care needs to be taken such that all intermediate data, in
> addition to all involved look-up tables, will fit comfortably inside the L1
> cache. The approach I propose, which is also (afaict) used by zscale, is to
> loop over line segments, applying each operation in sequence, on a small
> temporary buffer.
>
> e.g.
>
> hscale_row(pixel *dst, const pixel *src, int img_width)
> {
>     const int SIZE = 256; // or some other small-ish figure, possibly a design
>                           // constant of the API so that SIMD implementations
>                           // can be appropriately unrolled
>
>     pixel tmp[SIZE];
>     for (int i = 0; i < img_width; i += SIZE) {
>         int pixels = min(SIZE, img_width - i);
>
>         { /* inside read input callback */
>             unpack_input(tmp, src, pixels);
>             // the amount of separation here will depend on the performance
>             apply_matrix3x3(tmp, yuv2rgb, pixels);
>             apply_lut3x1d(tmp, gamma_lut, pixels);
>             ...
>         }
>
>         hscale(dst, tmp, filter, pixels);
>
>         src += pixels;
>         dst += scale_factor(pixels);
>     }
> }
>
> This function can then output rows into a ring buffer for use inside the
> vertical scaler, after which the same procedure happens (in reverse) for the
> final output pass.
>
> Possibly, we also want to additionally limit the size of a row for the
> horizontal scaler, to allow arbitrarily large input images.
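The ring-buffer handoff sketched above could look roughly like the following. This is an illustrative sketch only; `RowRing` and its helpers are invented names, not swscale internals:

```c
#include <assert.h>

/* The horizontal scaler pushes finished rows into a small ring; the
 * vertical scaler may read back any row still inside the window of the
 * last RING_ROWS rows, so only a bounded working set stays resident. */
#define RING_ROWS 8
#define ROW_WIDTH 64

typedef struct RowRing {
    float rows[RING_ROWS][ROW_WIDTH];
    int   next; /* absolute index of the next row to be produced */
} RowRing;

/* Reserve storage for the next output row of the horizontal scaler. */
static float *ring_push(RowRing *r)
{
    float *row = r->rows[r->next % RING_ROWS];
    r->next++;
    return row;
}

/* Fetch a previously produced row by absolute index; only rows within
 * the last RING_ROWS rows are still resident. */
static const float *ring_get(const RowRing *r, int abs_row)
{
    assert(abs_row >= 0 && abs_row < r->next && abs_row >= r->next - RING_ROWS);
    return r->rows[abs_row % RING_ROWS];
}
```

With a ring sized to the vertical filter support, the intermediate rows plus look-up tables can plausibly stay inside L1, which is the whole point of the line-segment design.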
I did a simple benchmark to compare the performance between libswscale and
libyuv. On an Apple M1 (arm64), libyuv is about 10 times faster than libswscale for
unscaled rgba to yuv420p. After the recent aarch64 NEON optimizations, libyuv is
still 5 times faster than libswscale. The situation isn't much better with scaled
conversion.
Sure, libswscale has more features and can be more precise than libyuv. Hopefully
we can catch up on performance after your refactor.
>
> ## Comments / feedback?
>
> Does the above approach seem reasonable? How do people feel about introducing
> a new API vs. trying to hammer the existing API into the shape I want it to be?
>
> I've attached an example of what <avscale.h> could end up looking like. If
> there is broad agreement on this design, I will move on to an implementation.
> <avscale.h>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-29 7:41 ` Zhao Zhili
@ 2024-06-29 10:58 ` Niklas Haas
0 siblings, 0 replies; 36+ messages in thread
From: Niklas Haas @ 2024-06-29 10:58 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, 29 Jun 2024 15:41:58 +0800 Zhao Zhili <quinkblack@foxmail.com> wrote:
>
>
> > On Jun 22, 2024, at 21:13, Niklas Haas <ffmpeg@haasn.xyz> wrote:
> > [...]
>
> I did a simple benchmark to compare the performance between libswscale and
> libyuv. On an Apple M1 (arm64), libyuv is about 10 times faster than libswscale for
> unscaled rgba to yuv420p. After the recent aarch64 NEON optimizations, libyuv is
> still 5 times faster than libswscale. The situation isn't much better with scaled
> conversion.
>
> Sure, libswscale has more features and can be more precise than libyuv. Hopefully
> we can catch up on performance after your refactor.
AFAICT, libyuv does not do any dithering nor advanced filtering, which
swscale is capable of. They also do processing at a pretty low bit
depth, e.g. converting from 8-bit yuv420p straight to 8-bit rgba before
scaling at, you guessed it, 8-bit resolution. (libswscale would use
15-bit here)
That said, there's probably still something we can/should learn from
this implementation.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 13:13 [FFmpeg-devel] [RFC]] swscale modernization proposal Niklas Haas
` (2 preceding siblings ...)
2024-06-29 7:41 ` Zhao Zhili
@ 2024-06-29 11:47 ` Niklas Haas
2024-06-29 12:35 ` Michael Niedermayer
2024-06-30 6:25 ` Vittorio Giovara
2024-07-02 13:27 ` Niklas Haas
4 siblings, 2 replies; 36+ messages in thread
From: Niklas Haas @ 2024-06-29 11:47 UTC (permalink / raw)
To: ffmpeg-devel
[-- Attachment #1: Type: text/plain, Size: 4012 bytes --]
On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> [...]
So, I've spent the past days implementing this API and hooking it up to
swscale internally. (For testing, I am also replacing `vf_scale` by the
equivalent AVScale-based implementation to see how the new API impacts
existing users). It mostly works so far, with some left-over translation
issues that I have to address before it can be sent upstream.
------
One of the things I was thinking about was how to configure
scalers/dither modes, which sws currently, somewhat clunkily, controls
with flags. IMO, flags are not the right design here - if anything, it
should be a separate enum/int, and controllable separately for chroma
resampling (4:4:4 <-> 4:2:0) and main scaling (e.g. 50x50 <-> 80x80).
That said, I think that for most end users, having such fine-grained
options is not really providing any end value - unless you're already
knee-deep in signal theory, the actual differences between, say,
"natural bicubic spline" and "Lanczos" are obtuse at best and alien at
worst.
My idea was to provide a single `int quality`, which the user can set to
tune the speed <-> quality trade-off on an arbitrary numeric scale from
0 to 10, with 0 being the fastest (alias everything, nearest neighbour,
drop half chroma samples, etc.), the default being something in the
vicinity of 3-5, and 10 being the maximum quality (full linear
downscaling, anti-aliasing, error diffusion, etc.).
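As a sketch of how a single quality knob could expand into concrete settings (every name below is hypothetical and purely illustrative, not proposed API):

```c
#include <assert.h>

/* Hypothetical expansion of a 0-10 quality knob into concrete scaler
 * settings. The fields and thresholds are invented for illustration;
 * they only show how presets could map onto existing functionality. */
typedef struct ScalePreset {
    int filter_taps;   /* 1 = nearest, 2 = bilinear, 4 = bicubic, 6 = lanczos-ish */
    int dither;        /* 0 = none, 1 = ordered, 2 = error diffusion */
    int full_chroma;   /* 0 = drop/alias chroma samples, 1 = full resampling */
} ScalePreset;

static ScalePreset preset_from_quality(int quality)
{
    ScalePreset p;
    if (quality < 0)  quality = 0;
    if (quality > 10) quality = 10;
    p.filter_taps = quality == 0 ? 1 : quality <= 2 ? 2 : quality <= 7 ? 4 : 6;
    p.dither      = quality <= 2 ? 0 : quality <= 8 ? 1 : 2;
    p.full_chroma = quality >= 4;
    return p;
}
```

The appeal of this shape is that a newly added feature (say, a better tone mapper) only needs one new threshold in the mapping function, and every caller using presets picks it up automatically.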
The upside of this approach is that it would be vastly simpler for most
end users. It would also track newly added functionality automatically;
e.g. if we get a higher-quality tone mapping mode, it can be
retroactively added to the higher quality presets. The biggest downside
I can think of is that doing this would arguably violate the semantics
of a "bitexact" flag, since it would break results relative to
a previous version of libswscale - unless we maybe also force a specific
quality level in bitexact mode?
Open questions:
1. Is this a good idea, or do the downsides outweigh the benefits?
2. Is an "advanced configuration" API still needed, in addition to the
quality presets?
------
I have attached my current working draft of the public half of
<avscale.h>, for reference. You can also find my implementation draft at
the time of writing here:
https://github.com/haasn/FFmpeg/blob/avscale/libswscale/avscale.h
[-- Attachment #2: avscale.h --]
[-- Type: text/plain, Size: 5663 bytes --]
/*
* Copyright (C) 2024 Niklas Haas
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef SWSCALE_AVSCALE_H
#define SWSCALE_AVSCALE_H
/**
* @file
* @ingroup libsws
* Higher-level wrapper around libswscale + related libraries, which is
* capable of handling more advanced colorspace transformations.
*/
#include "libavutil/frame.h"
#include "libavutil/log.h"
/**
* Main external API structure. New fields can be added to the end with
* minor version bumps. Removal, reordering and changes to existing fields
* require a major version bump. sizeof(AVScaleContext) is not part of the ABI.
*/
typedef struct AVScaleContext {
const AVClass *av_class;
/**
* Private context used for internal data.
*/
struct AVScaleInternal *internal;
/**
* Private data of the user, can be used to carry app specific stuff.
*/
void *opaque;
/**
* Bitmask of AV_SCALE_* flags.
*/
int64_t flags;
/**
* How many threads to use for processing, or 0 for automatic selection.
*/
int threads;
/**
* Quality factor (0-10). The default quality is [TBD]. Higher values
* sacrifice speed in exchange for quality.
*
* TODO: explain what changes at each level
*/
int quality;
} AVScaleContext;
enum {
/**
* Force bit-exact output. This will prevent the use of platform-specific
* optimizations that may lead to slight differences in rounding, in favor
* of always maintaining exact bit output compatibility with the reference
* C code.
*
* Note: This is also available under the name "accurate_rnd" for
* backwards compatibility.
*/
AV_SCALE_BITEXACT = 1 << 0,
/**
* Return an error on underspecified conversions. Without this flag,
* unspecified fields are defaulted to sensible values.
*/
AV_SCALE_STRICT = 1 << 1,
};
/**
* Allocate an AVScaleContext and set its fields to default values. The
* resulting struct should be freed with avscale_free_context().
*/
AVScaleContext *avscale_alloc_context(void);
/**
* Free the scaling context and everything associated with it, and write NULL
* to the provided pointer.
*/
void avscale_free_context(AVScaleContext **ctx);
/**
* Get the AVClass for AVScaleContext. It can be used in combination with
* AV_OPT_SEARCH_FAKE_OBJ for examining options.
*
* @see av_opt_find().
*/
const AVClass *avscale_get_class(void);
/**
* Statically test if a conversion is supported. Values of (respectively)
* NONE/UNSPECIFIED are ignored.
*
* Returns 1 if the conversion is supported, or 0 otherwise.
*/
int avscale_test_format(enum AVPixelFormat dst, enum AVPixelFormat src);
int avscale_test_colorspace(enum AVColorSpace dst, enum AVColorSpace src);
int avscale_test_primaries(enum AVColorPrimaries dst, enum AVColorPrimaries src);
int avscale_test_transfer(enum AVColorTransferCharacteristic dst,
enum AVColorTransferCharacteristic src);
/**
* Scale source data from `src` and write the output to `dst`. This is
* merely a convenience wrapper around `avscale_frame_slice(ctx, dst, src, 0,
* src->height)`.
*
* @param ctx The scaling context.
* @param dst The destination frame.
*
* The data buffers may either be already allocated by the caller
* or left clear, in which case they will be allocated by the
* scaler. The latter may have performance advantages - e.g. in
* certain cases some (or all) output planes may be references to
* input planes, rather than copies.
* @param src The source frame. If the data buffers are set to NULL, then
* this function performs no conversion. It will instead merely
* initialize internal state that *would* be required to perform
* the operation, as well as returning the correct error code for
* unsupported frame combinations.
*
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src);
/**
* Like `avscale_frame`, but operates only on the (source) rows from
* `slice_start` to `slice_start + slice_height`.
*
* Note: For interlaced or vertically subsampled frames, `slice_start` and
* `slice_height` must be aligned to a multiple of the subsampling size
* (typically 2, or 4 in the case of interlaced subsampled material).
*
* @param ctx The scaling context.
* @param dst The destination frame. See avscale_framee() for more details.
* @param src The source frame. See avscale_framee() for more details.
* @param slice_start First row of slice, relative to `src`
* @param slice_height Number of (source) rows in the slice
*
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame_slice(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src,
int slice_start, int slice_height);
#endif /* SWSCALE_AVSCALE_H */
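The slice alignment rule documented for avscale_frame_slice() can be illustrated with a small helper (hypothetical, not part of the proposed header):

```c
#include <assert.h>

/* Hypothetical check for the slice alignment rule: slices into
 * vertically subsampled (and interlaced) frames must start and end on
 * subsampling boundaries. */
static int slice_is_aligned(int slice_start, int slice_height,
                            int log2_chroma_h, int interlaced)
{
    /* e.g. yuv420p: log2_chroma_h = 1 -> align = 2, or 4 when interlaced */
    const int align = (1 << log2_chroma_h) << (interlaced ? 1 : 0);
    return slice_start % align == 0 && slice_height % align == 0;
}
```

A caller slicing a yuv420p frame into per-thread bands would round each band to this alignment before calling avscale_frame_slice().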
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-29 11:47 ` Niklas Haas
@ 2024-06-29 12:35 ` Michael Niedermayer
2024-06-29 14:05 ` Niklas Haas
2024-06-30 6:25 ` Vittorio Giovara
1 sibling, 1 reply; 36+ messages in thread
From: Michael Niedermayer @ 2024-06-29 12:35 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 6299 bytes --]
On Sat, Jun 29, 2024 at 01:47:43PM +0200, Niklas Haas wrote:
> On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> > [...]
>
> So, I've spent the past days implementing this API and hooking it up to
> swscale internally. (For testing, I am also replacing `vf_scale` by the
> equivalent AVScale-based implementation to see how the new API impacts
> existing users). It mostly works so far, with some left-over translation
> issues that I have to address before it can be sent upstream.
>
> ------
>
> One of the things I was thinking about was how to configure
> scalers/dither modes, which sws currently, somewhat clunkily, controls
> with flags. IMO, flags are not the right design here - if anything, it
> should be a separate enum/int, and controllable separately for chroma
> resampling (4:4:4 <-> 4:2:0) and main scaling (e.g. 50x50 <-> 80x80).
>
> That said, I think that for most end users, having such fine-grained
> options is not really providing any end value - unless you're already
> knee-deep in signal theory, the actual differences between, say,
> "natural bicubic spline" and "Lanczos" are obtuse at best and alien at
> worst.
>
> My idea was to provide a single `int quality`, which the user can set to
> tune the speed <-> quality trade-off on an arbitrary numeric scale from
> 0 to 10, with 0 being the fastest (alias everything, nearest neighbour,
> drop half chroma samples, etc.), the default being something in the
> vicinity of 3-5, and 10 being the maximum quality (full linear
> downscaling, anti-aliasing, error diffusion, etc.).
I think 10 levels is not fine-grained enough: when there are more than 10
features to switch on/off, we would have to switch more than one at a time.
The scale also has an issue that becomes obvious when you consider the
extremes, like memset(0) at level 0, not converting chroma at level 1,
using a neural net at 9, and hiring a human artist to paint a matching
upscaled image at 10.
The quality factor would thus probably have at least 3 ranges:
1. as fast as possible, with noticeable quality issues
2. the normal range
3. as good as possible, disregarding the computation needed
Some encoders (like x264) use words like UltraFast and Placebo for the ends
of this curve.
It would also be possible to use a more formal definition of how much
quality one wants to trade per unit of time spent, but that makes it harder
to decide which features to actually turn on when one requests a ratio
between PSNR and seconds.
>
> The upside of this approach is that it would be vastly simpler for most
> end users. It would also track newly added functionality automatically;
> e.g. if we get a higher-quality tone mapping mode, it can be
> retroactively added to the higher quality presets. The biggest downside
> I can think of is that doing this would arguably violate the semantics
> of a "bitexact" flag, since it would break results relative to
> a previous version of libswscale - unless we maybe also force a specific
> quality level in bitexact mode?
>
> Open questions:
>
> 1. Is this a good idea, or do the downsides outweigh the benefits?
> 2. Is an "advanced configuration" API still needed, in addition to the
> quality presets?
For regression testing and debugging it is very useful to be able to turn
features on one at a time. A failure can then be quickly isolated to
a specific feature.
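The one-feature-at-a-time idea above maps naturally onto per-feature toggles. A self-contained sketch of how a test harness could bisect a failure by clearing one enabled feature per run; every name here is invented for illustration and is not part of any proposed API:

```c
/* Hypothetical per-feature toggles, purely illustrative. */
enum {
    SCALE_FEAT_DITHER       = 1 << 0,
    SCALE_FEAT_FULL_CHROMA  = 1 << 1,
    SCALE_FEAT_LINEAR_SCALE = 1 << 2,
    SCALE_FEAT_ANTIALIAS    = 1 << 3,
};

/* Return `features` with the i-th *enabled* feature cleared, so a
 * regression harness can re-run a failing conversion with exactly one
 * feature disabled at a time and isolate which one causes a mismatch.
 * If fewer than i+1 features are set, the input is returned unchanged. */
static unsigned drop_nth_feature(unsigned features, int i)
{
    for (unsigned bit = 1; bit; bit <<= 1) {
        if (features & bit) {
            if (i-- == 0)
                return features & ~bit;
        }
    }
    return features;
}
```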
[...]
> /**
> * Statically test if a conversion is supported. Values of (respectively)
> * NONE/UNSPECIFIED are ignored.
> *
> * Returns 1 if the conversion is supported, or 0 otherwise.
> */
> int avscale_test_format(enum AVPixelFormat dst, enum AVPixelFormat src);
> int avscale_test_colorspace(enum AVColorSpace dst, enum AVColorSpace src);
> int avscale_test_primaries(enum AVColorPrimaries dst, enum AVColorPrimaries src);
> int avscale_test_transfer(enum AVColorTransferCharacteristic dst,
> enum AVColorTransferCharacteristic src);
If we support A for any input and support B for any output, then we
should support converting from A to B.
I don't think this API is a good idea. It allows supporting random
subsets, which would cause confusion and weird bugs in code using it.
(For example, removal of an intermediate filter could lead to failure.)
[...]
thx
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Elect your leaders based on what they did after the last election, not
based on what they say before an election.
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
[-- Attachment #2: Type: text/plain, Size: 251 bytes --]
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-29 12:35 ` Michael Niedermayer
@ 2024-06-29 14:05 ` Niklas Haas
2024-06-29 14:11 ` James Almer
0 siblings, 1 reply; 36+ messages in thread
From: Niklas Haas @ 2024-06-29 14:05 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, 29 Jun 2024 14:35:32 +0200 Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Sat, Jun 29, 2024 at 01:47:43PM +0200, Niklas Haas wrote:
> > On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> > > Hey,
> > >
> > > As some of you know, I got contracted (by STF 2024) to work on improving
> > > swscale, over the course of the next couple of months. I want to share my
> > > current plans and gather feedback + measure sentiment.
> > >
> > > ## Problem statement
> > >
> > > The two issues I'd like to focus on for now are:
> > >
> > > 1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
> > > IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
> > > 2. Complicated context management, with cascaded contexts, threading, stateful
> > > configuration, multi-step init procedures, etc; and related bugs
> > >
> > > In order to make these feasible, some amount of internal re-organization of
> > > duties inside swscale is prudent.
> > >
> > > ## Proposed approach
> > >
> > > The first step is to create a new API, which will (tentatively) live in
> > > <libswscale/avscale.h>. This API will initially start off as a near-copy of the
> > > current swscale public API, but with the major difference that I want it to be
> > > state-free and only access metadata in terms of AVFrame properties. So there
> > > will be no independent configuration of the input chroma location etc. like
> > > there is currently, and no need to re-configure or re-init the context when
> > > feeding it frames with different properties. The goal is for users to be able
> > > to just feed it AVFrame pairs and have it internally cache expensive
> > > pre-processing steps as needed. Finally, avscale_* should ultimately also
> > > support hardware frames directly, in which case it will dispatch to some
> > > equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will
> > > defer this to a future milestone)
> >
> > So, I've spent the past days implementing this API and hooking it up to
> > swscale internally. (For testing, I am also replacing `vf_scale` by the
> > equivalent AVScale-based implementation to see how the new API impacts
> > existing users). It mostly works so far, with some left-over translation
> > issues that I have to address before it can be sent upstream.
> >
> > ------
> >
> > One of the things I was thinking about was how to configure
> > scalers/dither modes, which sws currently, somewhat clunkily, controls
> > with flags. IMO, flags are not the right design here - if anything, it
> > should be a separate enum/int, and controllable separately for chroma
> > resampling (4:4:4 <-> 4:2:0) and main scaling (e.g. 50x50 <-> 80x80).
> >
> > That said, I think that for most end users, having such fine-grained
> > options is not really providing any end value - unless you're already
> > knee-deep in signal theory, the actual differences between, say,
> > "natural bicubic spline" and "Lanczos" are obtuse at best and alien at
> > worst.
> >
> > My idea was to provide a single `int quality`, which the user can set to
> > tune the speed <-> quality trade-off on an arbitrary numeric scale from
> > 0 to 10, with 0 being the fastest (alias everything, nearest neighbour,
> > drop half chroma samples, etc.), the default being something in the
> > vicinity of 3-5, and 10 being the maximum quality (full linear
> > downscaling, anti-aliasing, error diffusion, etc.).
>
> I think 10 levels is not fine grained enough,
> when there are more then 10 features to switch on/off we would have
> to switch more than 1 at a time.
>
> also the scale has an issue, that becomes obvious when you consider the
> extreems, like memset(0) at level 0, not converting chroma at level 1
> and hiring a human artist to paint a matching upscaled image at 10
> using a neural net at 9
>
> the quality factor would probably have thus at least 3 ranges
> 1. the as fast as possible with noticeable quality issues
> 2. the normal range
> 3. the as best as possible, disregarding the computation needed
>
> some encoder (like x264) use words like UltraFast and Placebo for the ends
> of this curve
I like the idea of using explicit names instead of numbers. It
translates well onto the human-facing API anyway.
I don't think 10 levels is too few if we also pair it with a granular
API for controlling exactly which scalers etc. you want.
In particular, if we want human-compatible names for them (ranging from
"ultrafast" to "placebo" as discussed), you would be hard-pressed to
come up with more than about 10 sensible names.
Especially if we treat these just as presets and not the only way to
configure them.
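The "presets on top of granular options" idea being discussed can be sketched as a named preset expanding into granular defaults that a fine-grained API could then override field by field. All names, fields, and values below are illustrative, not the actual proposal:

```c
#include <string.h>

/* Illustrative granular settings a preset could expand into. */
typedef struct ScaleOpts {
    const char *scaler;          /* e.g. "nearest", "bicubic", "lanczos" */
    const char *dither;          /* e.g. "none", "ordered", "error_diffusion" */
    int         linear_downscale;
} ScaleOpts;

/* Map a human-facing preset name onto granular defaults.
 * Returns 0 on success, -1 for an unknown preset name. */
static int preset_to_opts(const char *name, ScaleOpts *opts)
{
    if (!strcmp(name, "ultrafast")) {
        *opts = (ScaleOpts){ "nearest", "none", 0 };
    } else if (!strcmp(name, "medium")) {
        *opts = (ScaleOpts){ "bicubic", "ordered", 0 };
    } else if (!strcmp(name, "placebo")) {
        *opts = (ScaleOpts){ "lanczos", "error_diffusion", 1 };
    } else {
        return -1;
    }
    return 0;
}
```

A caller wanting granular control would apply the preset first and then overwrite individual fields, so presets never limit what the fine-grained API can express.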
>
> It also would be possible to use a more formal definition of how much one
> wants to trade quality per time spend but that then makes it harder to
> decide which feature to actually turn on when one requests a ratio between
> PSNR and seconds
>
>
> >
> > The upside of this approach is that it would be vastly simpler for most
> > end users. It would also track newly added functionality automatically;
> > e.g. if we get a higher-quality tone mapping mode, it can be
> > retroactively added to the higher quality presets. The biggest downside
> > I can think of is that doing this would arguably violate the semantics
> > of a "bitexact" flag, since it would break results relative to
> > a previous version of libswscale - unless we maybe also force a specific
> > quality level in bitexact mode?
> >
> > Open questions:
> >
> > 1. Is this a good idea, or do the downsides outweigh the benefits?
>
>
>
> > 2. Is an "advanced configuration" API still needed, in addition to the
> > quality presets?
>
> For regression testing and debuging it is very usefull to be able to turn
> features on one at a time. A failure could then be quickly isolated to
> a feature.
Very strong argument in favor of granular control. I'll find a way to
support it while still having "presets".
>
>
>
> [...]
>
> > /**
> > * Statically test if a conversion is supported. Values of (respectively)
> > * NONE/UNSPECIFIED are ignored.
> > *
> > * Returns 1 if the conversion is supported, or 0 otherwise.
> > */
> > int avscale_test_format(enum AVPixelFormat dst, enum AVPixelFormat src);
> > int avscale_test_colorspace(enum AVColorSpace dst, enum AVColorSpace src);
> > int avscale_test_primaries(enum AVColorPrimaries dst, enum AVColorPrimaries src);
> > int avscale_test_transfer(enum AVColorTransferCharacteristic dst,
> > enum AVColorTransferCharacteristic src);
>
> If we support A for any input and and support B for any output then we
> should support converting from A to B
>
> I dont think this API is a good idea. It allows supporting random subsets
> which would cause confusion and wierd bugs by code using it.
> (for example removial of an intermediate filter could lead to failure)
Good point, will change. The prototypical use case for this API is setting
up format lists inside vf_scale, which need to be set up independently
anyway.
I was planning on adding another _test_frames() function that takes two
AVFrames and returns in a tri-state manner whether conversion is
supported, unsupported, or a no-op. If an exception to the input/output
independence does ever arise, we can test for it in this function.
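The tri-state _test_frames() idea could take a shape like the following. This is only a sketch: the enum, the reduced property struct, and the acceptance logic are all stand-ins, not the draft API:

```c
/* Hypothetical tri-state result for a frame-pair conversion query. */
enum ScaleConvResult {
    SCALE_CONV_UNSUPPORTED = 0,
    SCALE_CONV_SUPPORTED   = 1,
    SCALE_CONV_NOOP        = 2,
};

/* Minimal stand-in for the AVFrame properties such a test would compare. */
typedef struct FrameProps {
    int format, width, height, colorspace;
} FrameProps;

static enum ScaleConvResult test_frames(const FrameProps *dst,
                                        const FrameProps *src)
{
    if (dst->format     == src->format &&
        dst->width      == src->width  &&
        dst->height     == src->height &&
        dst->colorspace == src->colorspace)
        return SCALE_CONV_NOOP; /* nothing to do; caller can skip scaling */

    /* A real implementation would consult the per-property tests here,
     * and could special-case any exception to input/output independence. */
    return SCALE_CONV_SUPPORTED;
}
```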
>
> [...]
>
> thx
>
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-29 14:05 ` Niklas Haas
@ 2024-06-29 14:11 ` James Almer
0 siblings, 0 replies; 36+ messages in thread
From: James Almer @ 2024-06-29 14:11 UTC (permalink / raw)
To: ffmpeg-devel
On 6/29/2024 11:05 AM, Niklas Haas wrote:
> On Sat, 29 Jun 2024 14:35:32 +0200 Michael Niedermayer <michael@niedermayer.cc> wrote:
>> On Sat, Jun 29, 2024 at 01:47:43PM +0200, Niklas Haas wrote:
>>> On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
>>>> Hey,
>>>>
>>>> As some of you know, I got contracted (by STF 2024) to work on improving
>>>> swscale, over the course of the next couple of months. I want to share my
>>>> current plans and gather feedback + measure sentiment.
>>>>
>>>> ## Problem statement
>>>>
>>>> The two issues I'd like to focus on for now are:
>>>>
>>>> 1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
>>>> IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
>>>> 2. Complicated context management, with cascaded contexts, threading, stateful
>>>> configuration, multi-step init procedures, etc; and related bugs
>>>>
>>>> In order to make these feasible, some amount of internal re-organization of
>>>> duties inside swscale is prudent.
>>>>
>>>> ## Proposed approach
>>>>
>>>> The first step is to create a new API, which will (tentatively) live in
>>>> <libswscale/avscale.h>. This API will initially start off as a near-copy of the
>>>> current swscale public API, but with the major difference that I want it to be
>>>> state-free and only access metadata in terms of AVFrame properties. So there
>>>> will be no independent configuration of the input chroma location etc. like
>>>> there is currently, and no need to re-configure or re-init the context when
>>>> feeding it frames with different properties. The goal is for users to be able
>>>> to just feed it AVFrame pairs and have it internally cache expensive
>>>> pre-processing steps as needed. Finally, avscale_* should ultimately also
>>>> support hardware frames directly, in which case it will dispatch to some
>>>> equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will
>>>> defer this to a future milestone)
>>>
>>> So, I've spent the past days implementing this API and hooking it up to
>>> swscale internally. (For testing, I am also replacing `vf_scale` by the
>>> equivalent AVScale-based implementation to see how the new API impacts
>>> existing users). It mostly works so far, with some left-over translation
>>> issues that I have to address before it can be sent upstream.
>>>
>>> ------
>>>
>>> One of the things I was thinking about was how to configure
>>> scalers/dither modes, which sws currently, somewhat clunkily, controls
>>> with flags. IMO, flags are not the right design here - if anything, it
>>> should be a separate enum/int, and controllable separately for chroma
>>> resampling (4:4:4 <-> 4:2:0) and main scaling (e.g. 50x50 <-> 80x80).
>>>
>>> That said, I think that for most end users, having such fine-grained
>>> options is not really providing any end value - unless you're already
>>> knee-deep in signal theory, the actual differences between, say,
>>> "natural bicubic spline" and "Lanczos" are obtuse at best and alien at
>>> worst.
>>>
>>> My idea was to provide a single `int quality`, which the user can set to
>>> tune the speed <-> quality trade-off on an arbitrary numeric scale from
>>> 0 to 10, with 0 being the fastest (alias everything, nearest neighbour,
>>> drop half chroma samples, etc.), the default being something in the
>>> vicinity of 3-5, and 10 being the maximum quality (full linear
>>> downscaling, anti-aliasing, error diffusion, etc.).
>>
>> I think 10 levels is not fine grained enough,
>> when there are more then 10 features to switch on/off we would have
>> to switch more than 1 at a time.
>>
>> also the scale has an issue, that becomes obvious when you consider the
>> extreems, like memset(0) at level 0, not converting chroma at level 1
>> and hiring a human artist to paint a matching upscaled image at 10
>> using a neural net at 9
>>
>> the quality factor would probably have thus at least 3 ranges
>> 1. the as fast as possible with noticeable quality issues
>> 2. the normal range
>> 3. the as best as possible, disregarding the computation needed
>>
>> some encoder (like x264) use words like UltraFast and Placebo for the ends
>> of this curve
>
> I like the idea of using explicit names instead of numbers. It
> translates well onto the human-facing API anyway.
>
> I don't think 10 levels is too few if we also pair it with a granular
> API for controlling exactly which scalers etc. you want.
>
> In particular, if we want human-compatible names for them (ranging from
> "ultrafast" to "placebo" as discussed), you would be hard-pressed to
> find many more sensible names than 10.
>
> Especially if we treat these just as presets and not the only way to
> configure them.
>
>>
>> It also would be possible to use a more formal definition of how much one
>> wants to trade quality per time spend but that then makes it harder to
>> decide which feature to actually turn on when one requests a ratio between
>> PSNR and seconds
>>
>>
>>>
>>> The upside of this approach is that it would be vastly simpler for most
>>> end users. It would also track newly added functionality automatically;
>>> e.g. if we get a higher-quality tone mapping mode, it can be
>>> retroactively added to the higher quality presets. The biggest downside
>>> I can think of is that doing this would arguably violate the semantics
>>> of a "bitexact" flag, since it would break results relative to
>>> a previous version of libswscale - unless we maybe also force a specific
>>> quality level in bitexact mode?
>>>
>>> Open questions:
>>>
>>> 1. Is this a good idea, or do the downsides outweigh the benefits?
>>
>>
>>
>>> 2. Is an "advanced configuration" API still needed, in addition to the
>>> quality presets?
>>
>> For regression testing and debuging it is very usefull to be able to turn
>> features on one at a time. A failure could then be quickly isolated to
>> a feature.
>
> Very strong argument in favor of granular control. I'll find a way to
> support it while still having "presets".
>
>>
>>
>>
>> [...]
>>
>>> /**
>>> * Statically test if a conversion is supported. Values of (respectively)
>>> * NONE/UNSPECIFIED are ignored.
>>> *
>>> * Returns 1 if the conversion is supported, or 0 otherwise.
>>> */
>>> int avscale_test_format(enum AVPixelFormat dst, enum AVPixelFormat src);
>>> int avscale_test_colorspace(enum AVColorSpace dst, enum AVColorSpace src);
>>> int avscale_test_primaries(enum AVColorPrimaries dst, enum AVColorPrimaries src);
>>> int avscale_test_transfer(enum AVColorTransferCharacteristic dst,
>>> enum AVColorTransferCharacteristic src);
I think it'd be best if you define > 0 as supported for an API like this,
so it becomes extensible.
Also maybe < 0 for failure, like AVERROR(EINVAL), so invalid enum values
are not simply treated as "not supported".
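The suggested convention (> 0 supported, 0 unsupported, < 0 invalid input) in sketch form. The range check and the "format 3 is unsupported" rule are placeholders; `NUM_KNOWN_FORMATS` stands in for something like AV_PIX_FMT_NB:

```c
#include <errno.h>

#define NUM_KNOWN_FORMATS 4 /* placeholder for the real enum count */

/* Return 1 if `format` is supported in the given direction, 0 if it is
 * known but unsupported, and a negative error code for out-of-range
 * values, so invalid enums are reported as errors rather than silently
 * as "not supported". */
static int test_format(int format, int output)
{
    if (format < 0 || format >= NUM_KNOWN_FORMATS)
        return -EINVAL; /* AVERROR(EINVAL) in FFmpeg terms */
    (void)output;
    return format != 3; /* pretend format 3 is the unsupported one */
}
```

Callers then distinguish the three cases with `ret > 0`, `ret == 0`, and `ret < 0`, which leaves room for future positive return values.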
>>
>> If we support A for any input and and support B for any output then we
>> should support converting from A to B
>>
>> I dont think this API is a good idea. It allows supporting random subsets
>> which would cause confusion and wierd bugs by code using it.
>> (for example removial of an intermediate filter could lead to failure)
>
> Good point, will change. The prototypal use case for this API is setting
> up format lists inside vf_scale, which need to be set up independently
> anyway.
>
> I was planning on adding another _test_frames() function that takes two
> AVFrames and returns in a tri-state manner whether conversion is
> supported, unsupported, or a no-op. If an exception to the input/output
> independence does ever arise, we can test for it in this function.
>
>>
>> [...]
>>
>> thx
>>
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-29 11:47 ` Niklas Haas
2024-06-29 12:35 ` Michael Niedermayer
@ 2024-06-30 6:25 ` Vittorio Giovara
1 sibling, 0 replies; 36+ messages in thread
From: Vittorio Giovara @ 2024-06-30 6:25 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, Jun 29, 2024 at 1:47 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> Open questions:
>
> 1. Is this a good idea, or do the downsides outweigh the benefits?
> 2. Is an "advanced configuration" API still needed, in addition to the
> quality presets?
>
Having a sort of preset is a good idea, especially to simplify the ffmpeg
CLI, but maybe 10 levels is a bit too many, as users will pick either the
default, the fastest, or the slowest most of the time. In either case, I'm
afraid the advanced config API is needed, as you'd want to let people
tweak and set up the full configuration, since ffmpeg is generally low
level IMO.
--
Vittorio
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-24 14:44 ` Vittorio Giovara
2024-06-25 15:31 ` Niklas Haas
@ 2024-07-01 21:10 ` Stefano Sabatini
1 sibling, 0 replies; 36+ messages in thread
From: Stefano Sabatini @ 2024-07-01 21:10 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On date Monday 2024-06-24 16:44:33 +0200, Vittorio Giovara wrote:
> On Sun, Jun 23, 2024 at 7:57 PM James Almer <jamrial@gmail.com> wrote:
>
> > On 6/22/2024 7:19 PM, Vittorio Giovara wrote:
> > > Needless to say I support the plan of renaming the library so that it can
> > > be inline with the other libraries names, and the use of a separate
> > header
> > > since downstream applications will need to update a lot to use the new
> > > library (or the new apis in the existing library) and/or we could
> > provide a
> > > thin conversion layer when the new lib is finalized.
> >
> > I don't quite agree with renaming it. As Michael already pointed out,
> > the av prefix wouldn't fit a scaling library nor a resampling one, as
> > they only handle one or the other.
> >
>
> by that reasoning we should ban all subtitles from all the libraries
>
> av is a shorthand of multimedia and many people in the industry refer to
> ffmpeg libs as "libav*" so it feels a bit odd to push for preserving an
> alternative name
>
>
> > There's also the precedent of avresample, which was ultimately dropped
> > in favor of swresample, so trying to replace swscale with a new avscale
> > library will be both confusing and going against what was already
> > established.
>
>
> it's still 4 libraries vs 2... and swr/avr is shrouded in bad history that
> is not worth bringing up
>
> I'd understand opposing a rename just for the sake of renaming, but this is
> essentially a new library, i see no value in preserving the old naming
> scheme, if not making downstream life worse :x
+1
av_ is a kind of prefix used by most FFmpeg libraries, so it's not
only about Audio/Video. Since the rewrite will probably break API
anyhow, we should move in the direction of making this more consistent
with the other FFmpeg libraries.
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-06-22 13:13 [FFmpeg-devel] [RFC]] swscale modernization proposal Niklas Haas
` (3 preceding siblings ...)
2024-06-29 11:47 ` Niklas Haas
@ 2024-07-02 13:27 ` Niklas Haas
2024-07-03 13:25 ` Niklas Haas
4 siblings, 1 reply; 36+ messages in thread
From: Niklas Haas @ 2024-07-02 13:27 UTC (permalink / raw)
To: ffmpeg-devel
On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> Finally, avscale_* should ultimately also support hardware frames
> directly, in which case it will dispatch to some equivalent of
> scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will defer
> this to a future milestone)
How do people feel about being able to control the "backend" directly
from the <avscale.h> interface?
For example,
struct AVScaleContext {
int backend;
...
}
enum AVScaleBackend {
AV_SCALE_AUTO = 0, // select first backend that passes avscale_test_*
AV_SCALE_NATIVE, // use new scaling pipeline (when implemented)
AV_SCALE_SWSCALE, // use existing SwsContext code 1:1, to-be-deprecated
AV_SCALE_ZIMG,
AV_SCALE_LIBPLACEBO,
AV_SCALE_CUDA,
AV_SCALE_VAAPI,
...
};
int avscale_test_format(enum AVScaleBackend backend,
enum AVPixelFormat format,
int output);
int avscale_test_colorspace(...);
...
Or alternatively, a perhaps more extensible design:
struct AVScaleBackend {
int (*test_format)(enum AVPixelFormat format, int output);
int (*test_colorspace)(...);
...
/* somehow expose per-backend options here? */
};
My over-arching goal here is to be able to unify vf_scale_*,
vf_libplacebo, vf_zscale etc. into a single common API, and in doing so,
effectively also have a suitable vf_scale_* variant auto-inserted into
filter chains.
I'm also thinking about supporting per-backend options in this case,
since some backends (notably libplacebo) expose about five billion
tunable parameters that go beyond the basic "quality preset, scaling
mode, dither mode" that I plan on exposing at the top level.
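The "more extensible design" above amounts to a vtable of backends probed in priority order, which is also how AV_SCALE_AUTO selection would fall out naturally. A self-contained sketch; the backend names, the registry, and the acceptance rules are all invented for illustration:

```c
#include <stddef.h>

/* Minimal backend vtable in the spirit of the struct above. */
typedef struct Backend {
    const char *name;
    int (*test_format)(int format, int output);
} Backend;

static int gpu_test_format(int format, int output)
{
    (void)output;
    return format == 0; /* pretend only one format has a GPU path */
}

static int cpu_test_format(int format, int output)
{
    (void)output;
    return format >= 0; /* the fallback path accepts anything valid */
}

/* Probed in priority order: prefer the GPU path, fall back to CPU. */
static const Backend backend_registry[] = {
    { "gpu", gpu_test_format },
    { "cpu", cpu_test_format },
};

/* AUTO selection: return the first registered backend that supports
 * the requested format, or NULL if none does. */
static const Backend *select_backend(const Backend *backends, size_t n,
                                     int format, int output)
{
    for (size_t i = 0; i < n; i++)
        if (backends[i].test_format(format, output))
            return &backends[i];
    return NULL;
}
```

Per-backend options could then hang off each Backend entry (e.g. as an AVClass/AVOptions table) without the front-end needing to know about them.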
Open questions:
1. Is this a good idea, or too confusing / complex to be worth the gain?
Specifically, I am worried about confusion arising due to differences
in behavior, and implemented options, between all of the above.
That said, I think there is a big win to be had from unifying all of
the different scaling and/or conversion filters we have in e.g.
libavfilter, as well as making it trivial for users of this API to
try using e.g. GPU scaling instead of CPU scaling.
2. How should we expose per-backend options? Probably the safest bet is
to wrap the relevant options inside AVOptions, as vf_zscale
/ vf_libplacebo / vf_scale etc. currently do.
Alternatively, it could be nice if we could give users direct access
to the underlying API configuration structs (pl_render_params,
zimg_graph_builder_params, SwsContext, etc.).
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-02 13:27 ` Niklas Haas
@ 2024-07-03 13:25 ` Niklas Haas
2024-07-05 18:31 ` Niklas Haas
0 siblings, 1 reply; 36+ messages in thread
From: Niklas Haas @ 2024-07-03 13:25 UTC (permalink / raw)
To: ffmpeg-devel
On Tue, 02 Jul 2024 15:27:00 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> 1. Is this a good idea, or too confusing / complex to be worth the gain?
> Specifically, I am worried about confusion arising due to differences
> in behavior, and implemented options, between all of the above.
>
> That said, I think there is a big win to be had from unifying all of
> the different scaling and/or conversion filters we have in e.g.
> libavfilter, as well as making it trivial for users of this API to
> try using e.g. GPU scaling instead of CPU scaling.
After prototyping this approach a bit (using an internal struct
AVScaleBackend), I think I like it. It specifically makes handling
unscaled special converters pretty straightforward, for example - the
"unscaled" backend can be separate from the generic/scaling backend.
We could also trivially plug in something like libyuv, or some other
limited-use-case fast path, without the user really noticing.
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-03 13:25 ` Niklas Haas
@ 2024-07-05 18:31 ` Niklas Haas
2024-07-05 21:34 ` Michael Niedermayer
0 siblings, 1 reply; 36+ messages in thread
From: Niklas Haas @ 2024-07-05 18:31 UTC (permalink / raw)
To: ffmpeg-devel
[-- Attachment #1: Type: text/plain, Size: 1518 bytes --]
On Wed, 03 Jul 2024 15:25:58 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> On Tue, 02 Jul 2024 15:27:00 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> > 1. Is this a good idea, or too confusing / complex to be worth the gain?
> > Specifically, I am worried about confusion arising due to differences
> > in behavior, and implemented options, between all of the above.
> >
> > That said, I think there is a big win to be had from unifying all of
> > the different scaling and/or conversion filters we have in e.g.
> > libavfilter, as well as making it trivial for users of this API to
> > try using e.g. GPU scaling instead of CPU scaling.
>
> After prototyping this approach a bit (using an internal struct
> AVScaleBackend), I think I like it. It specifically makes handling
> unscaled special converters pretty straightforward, for example - the
> "unscaled" backend can be separate from the generic/scaling backend.
>
> We could also trivially plug in something like libyuv, or some other
> limited-use-case fast path, without the user really noticing.
Small update: I decided to scrap the idea of separate user-visible
"backends" for now, but preserved the internal API boundary between the
avscale_* "front-end" and the actual back-end implementation, which
I have called 'AVScaleGraph' for now.
The idea is that this will grow into a full colorspace <-> colorspace
"solver", but for now it is just hooked up to sws_createContext().
Attached is my revised working draft of <avscale.h>.
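To illustrate the intended usage model, here is a rough caller sketch against
this draft. This is pseudocode for discussion only (the API does not exist
yet); error handling is elided, and all names follow the header below:

```c
/* Pseudocode: hypothetical caller of the draft avscale API below.
 * dst and src are assumed pre-existing AVFrames; dst carries the
 * target format/colorspace as ordinary AVFrame properties. */
AVScaleContext *ctx = avscale_alloc_context();
ctx->preset = AV_SCALE_MEDIUM;   /* one knob for typical users */
ctx->dither = AV_DITHER_FULL;    /* optional per-option override */

/* No separate configure/reinit step between differing inputs */
if (!avscale_is_noop(dst, src))
    ret = avscale_frame(ctx, dst, src);

avscale_free_context(&ctx);
```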
[-- Attachment #2: avscale.h --]
[-- Type: text/plain, Size: 10215 bytes --]
/*
* Copyright (C) 2024 Niklas Haas
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef SWSCALE_AVSCALE_H
#define SWSCALE_AVSCALE_H
/**
* @file
* @ingroup libsws
* Higher-level wrapper around libswscale + related libraries, which is
* capable of handling more advanced colorspace transformations.
*/
#include "libavutil/frame.h"
#include "libavutil/log.h"
/**
* Main external API structure. New fields can be added to the end with
* minor version bumps. Removal, reordering and changes to existing fields
* require a major version bump. sizeof(AVScaleContext) is not part of the ABI.
*/
typedef struct AVScaleContext {
const AVClass *av_class;
/**
* Private context used for internal data.
*/
struct AVScaleInternal *internal;
/**
* Private data of the user, can be used to carry app specific stuff.
*/
void *opaque;
/**
* Bitmask of AV_SCALE_* flags.
*/
int64_t flags;
/**
* How many threads to use for processing, or 0 for automatic selection.
*/
int threads;
/**
* Quality preset, on a scale from 1 to 10. See `enum AVScaleQuality`.
*/
int preset;
/**
* Dither mode. If set to something other than AV_DITHER_AUTO, this will
* override the dither mode implied by the `preset`.
*/
int dither;
/**
* Scaling filter. If set to something other than AV_SCALE_AUTO, this will
* override the filter implied by the `preset`.
*/
int filter;
/**
* Filter used specifically for up/downsampling subsampled (chroma) planes.
* If set to something other than AV_SCALE_AUTO, this will override the
* filter implied by the `preset`.
*/
int filter_sub;
/**
* Backwards compatibility field with libswscale. Anything set here
* will override the corresponding options implied by the fields above.
*
* @deprecated use AVScaleContext.flags/filter/dither
*/
attribute_deprecated
int sws_flags;
} AVScaleContext;
/**
* Allocate an AVScaleContext and set its fields to default values. The
* resulting struct should be freed with avscale_free_context().
*/
AVScaleContext *avscale_alloc_context(void);
/**
* Free the scaling context and everything associated with it, and write NULL
* to the provided pointer.
*/
void avscale_free_context(AVScaleContext **ctx);
/**
* Get the AVClass for AVScaleContext. It can be used in combination with
* AV_OPT_SEARCH_FAKE_OBJ for examining options.
*
* @see av_opt_find().
*/
const AVClass *avscale_get_class(void);
/******************************
* Flags and quality settings *
******************************/
enum AVScaleFlags {
/**
* Force bit-exact output. This will prevent the use of platform-specific
* optimizations that may lead to slight differences in rounding, in favor
* of always maintaining exact bit output compatibility with the reference
* C code.
*/
AV_SCALE_BITEXACT = 1 << 0,
/**
* Return an error on underspecified conversions. Without this flag,
* unspecified fields are defaulted to sensible values.
*/
AV_SCALE_STRICT = 1 << 1,
};
/**
* The exact interpretation of these quality presets depends on the backend
* used, but the backend-invariant common settings are derived as follows:
*/
enum AVScaleQuality {
AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
};
enum AVDitherMode {
AV_DITHER_AUTO = 0, /* auto-select from preset */
AV_DITHER_NONE, /* disable dithering */
AV_DITHER_BAYER, /* ordered dither matrix */
AV_DITHER_FULL, /* full error diffusion */
};
/**
* Returns the default dither mode implied by a given quality preset.
*/
enum AVDitherMode avscale_default_dither(int preset);
enum AVScaleFilter {
AV_SCALE_AUTO = 0, /* auto-select from preset */
AV_SCALE_NEAREST, /* nearest neighbour */
AV_SCALE_BILINEAR, /* bilinear filtering */
AV_SCALE_BICUBIC, /* 2-tap cubic B-spline */
AV_SCALE_GAUSSIAN, /* gaussian approximation */
AV_SCALE_LANCZOS, /* 3-tap sinc/sinc */
};
/**
* Returns the default scaling filters implied by a given quality preset.
*/
enum AVScaleFilter avscale_default_filter(int preset);
enum AVScaleFilter avscale_default_filter_sub(int preset);
/***************************
* Supported frame formats *
***************************/
/**
* Test if a given pixel format is supported.
*
* @param format The format to check.
* @param output If 0, test if compatible with the source/input frame;
* otherwise, with the destination/output frame.
*
* @return A positive integer if supported, 0 otherwise.
*/
int avscale_test_format(enum AVPixelFormat format, int output);
/**
* Test if a given color space is supported.
*
* @param colorspace The colorspace to check.
* @param output If 0, test if compatible with the source/input frame;
* otherwise, with the destination/output frame.
*
* @return A positive integer if supported, 0 otherwise.
*/
int avscale_test_colorspace(enum AVColorSpace colorspace, int output);
/**
* Test if a given set of color primaries is supported.
*
* @param primaries The color primaries to check.
* @param output If 0, test if compatible with the source/input frame;
* otherwise, with the destination/output frame.
*
* @return A positive integer if supported, 0 otherwise.
*/
int avscale_test_primaries(enum AVColorPrimaries primaries, int output);
/**
* Test if a given color transfer function is supported.
*
* @param trc The color transfer function to check.
* @param output If 0, test if compatible with the source/input frame;
* otherwise, with the destination/output frame.
*
* @return A positive integer if supported, 0 otherwise.
*/
int avscale_test_transfer(enum AVColorTransferCharacteristic trc, int output);
/**
* Helper function to run all avscale_test_* against a frame. Ignores irrelevant
* properties; for example, AVColorSpace is not checked for RGB frames.
*/
int avscale_test_frame(const AVFrame *frame, int output);
/********************
* Main scaling API *
********************/
/**
* Check if a given conversion is a noop. Returns a positive integer if
* no operation needs to be performed, 0 otherwise.
*/
int avscale_is_noop(const AVFrame *dst, const AVFrame *src);
/**
* Return the minimum slice alignment required for a given frame. This is
* always a power of two, typically 1, 2 or 4 depending on the frame's
* subsampling and interlacing.
*/
int avscale_slice_alignment(const AVFrame *frame);
/**
* Scale source data from `src` and write the output to `dst`. This is
* merely a convenience wrapper around `avscale_frame_slice(ctx, dst, src, 0,
* src->height)`.
*
* @param ctx The scaling context.
* @param dst The destination frame. See avscale_frame_slice().
* @param src The source frame. See avscale_frame_slice().
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src);
/**
* Like `avscale_frame`, but operates only on the (source) rows from
* `slice_start` to `slice_start + slice_height`.
*
* @param ctx The scaling context.
* @param dst The destination frame. The data buffers may either be already
* allocated by the caller or left clear, in which case they will
* be allocated by the scaler. The latter may have performance
* advantages - e.g. in certain cases some (or all) output planes
* may be references to input planes, rather than copies.
* @param src The source frame. If the data buffers are set to NULL, then
* this function behaves identically to `avscale_frame_setup`.
* @param slice_start First row of slice, relative to `src`. Must be a
* multiple of avscale_slice_alignment(src).
* @param slice_height Number of (source) rows in the slice. Must be a
* multiple of avscale_slice_alignment(src).
*
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame_slice(AVScaleContext *ctx, AVFrame *dst,
const AVFrame *src, int slice_start, int slice_height);
/**
* Like `avscale_frame`, but without actually scaling anything. It merely
* initializes the internal state that *would* be required to perform the
* operation, and returns the correct error code for unsupported frame
* combinations.
*
* @param ctx The scaling context.
* @param dst The destination frame to consider.
* @param src The source frame to consider.
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame_setup(AVScaleContext *ctx, const AVFrame *dst,
const AVFrame *src);
#endif /* SWSCALE_AVSCALE_H */
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-05 18:31 ` Niklas Haas
@ 2024-07-05 21:34 ` Michael Niedermayer
2024-07-06 0:11 ` Hendrik Leppkes
` (2 more replies)
0 siblings, 3 replies; 36+ messages in thread
From: Michael Niedermayer @ 2024-07-05 21:34 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 5066 bytes --]
On Fri, Jul 05, 2024 at 08:31:17PM +0200, Niklas Haas wrote:
> On Wed, 03 Jul 2024 15:25:58 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> > On Tue, 02 Jul 2024 15:27:00 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> >
> > > 1. Is this a good idea, or too confusing / complex to be worth the gain?
> > > Specifically, I am worried about confusion arising due to differences
> > > in behavior, and implemented options, between all of the above.
> > >
> > > That said, I think there is a big win to be had from unifying all of
> > > the different scaling and/or conversion filters we have in e.g.
> > > libavfilter, as well as making it trivial for users of this API to
> > > try using e.g. GPU scaling instead of CPU scaling.
> >
> > After prototyping this approach a bit (using an internal struct
> > AVScaleBackend), I think I like it. It specifically makes handling
> > unscaled special converters pretty straightforward, for example - the
> > "unscaled" backend can be separate from the generic/scaling backend.
> >
> > We could also trivially plug in something like libyuv, or some other
> > limited-use-case fast path, without the user really noticing.
>
> Small update: I decided to scrap the idea of separate user-visible
> "backends" for now, but preserved the internal API boundary between the
> avscale_* "front-end" and the actual back-end implementation, which
> I have called 'AVScaleGraph' for now.
>
> The idea is that this will grow into a full colorspace <-> colorspace
> "solver", but for now it is just hooked up to sws_createContext().
>
> Attached is my revised working draft of <avscale.h>.
I don't agree with the renaming of swscale; that is heading toward
[...]
> /**
> * The exact interpretation of these quality presets depends on the backend
> * used, but the backend-invariant common settings are derived as follows:
> */
> enum AVScaleQuality {
> AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
I don't think it's a good idea to hardcode the dither and "FIR" filter choices to the quality level in the API
[...]
> /**
> * Like `avscale_frame`, but operates only on the (source) range from `ystart`
> * to `height`.
> *
> * @param ctx The scaling context.
> * @param dst The destination frame. The data buffers may either be already
> * allocated by the caller or left clear, in which case they will
> * be allocated by the scaler. The latter may have performance
> * advantages - e.g. in certain cases some (or all) output planes
> * may be references to input planes, rather than copies.
> * @param src The source frame. If the data buffers are set to NULL, then
> * this function behaves identically to `avscale_frame_setup`.
> * @param slice_start First row of slice, relative to `src`. Must be a
> * multiple of avscale_slice_alignment(src).
> * @param slice_height Number of (source) rows in the slice. Must be a
> * multiple of avscale_slice_alignment(src).
> *
> * @return 0 on success, a negative AVERROR code on failure.
> */
> int avscale_frame_slice(AVScaleContext *ctx, AVFrame *dst,
> const AVFrame *src, int slice_start, int slice_height);
>
> /**
> * Like `avscale_frame`, but without actually scaling. It will instead merely
> * initialize internal state that *would* be required to perform the operation,
> * as well as returning the correct error code for unsupported frame
> * combinations.
> *
> * @param ctx The scaling context.
> * @param dst The destination frame to consider.
> * @param src The source frame to consider.
> * @return 0 on success, a negative AVERROR code on failure.
> */
> int avscale_frame_setup(AVScaleContext *ctx, const AVFrame *dst,
> const AVFrame *src);
Somewhat off topic, as this is public API, but the swscale filtering code
could internally use libavutil/executor.h. With filters and slices that
interdepend and need to execute both "in order" and in parallel, maybe that
API is useful; not sure, just wanted to mention it.
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-05 21:34 ` Michael Niedermayer
@ 2024-07-06 0:11 ` Hendrik Leppkes
2024-07-06 12:32 ` Niklas Haas
2024-07-06 16:42 ` Michael Niedermayer
2024-07-06 11:36 ` Andrew Sayers
2024-07-06 12:27 ` Niklas Haas
2 siblings, 2 replies; 36+ messages in thread
From: Hendrik Leppkes @ 2024-07-06 0:11 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, Jul 5, 2024 at 11:34 PM Michael Niedermayer
<michael@niedermayer.cc> wrote:
> > /**
> > * The exact interpretation of these quality presets depends on the backend
> > * used, but the backend-invariant common settings are derived as follows:
> > */
> > enum AVScaleQuality {
> > AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> > AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> > AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> > AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> > AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> > AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> > AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> > AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> > AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> > AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
>
> I dont think its a good idea to hardcode dither and the "FIR" filter to the quality level in the API
>
The entire point of presets is to provide a predefined set of parameters,
making it easy for users to pick one value rather than a bunch.
And what a preset actually means should be documented.
How do you define "presets" if they don't hardcode a list of choices
for all the relevant options?
Advanced settings exist for a user to select any particular detail, if
they so desire.
- Hendrik
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-05 21:34 ` Michael Niedermayer
2024-07-06 0:11 ` Hendrik Leppkes
@ 2024-07-06 11:36 ` Andrew Sayers
2024-07-06 12:27 ` Niklas Haas
2 siblings, 0 replies; 36+ messages in thread
From: Andrew Sayers @ 2024-07-06 11:36 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, Jul 05, 2024 at 11:34:06PM +0200, Michael Niedermayer wrote:
> On Fri, Jul 05, 2024 at 08:31:17PM +0200, Niklas Haas wrote:
[...]
>
> > Attached is my revised working draft of <avscale.h>.
>
> I dont agree to the renaming of swscale, that is heading toward
Would you mind talking a bit more about this distinction?
I feel like I'm missing something important here.
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-05 21:34 ` Michael Niedermayer
2024-07-06 0:11 ` Hendrik Leppkes
2024-07-06 11:36 ` Andrew Sayers
@ 2024-07-06 12:27 ` Niklas Haas
2 siblings, 0 replies; 36+ messages in thread
From: Niklas Haas @ 2024-07-06 12:27 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Fri, 05 Jul 2024 23:34:06 +0200 Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Fri, Jul 05, 2024 at 08:31:17PM +0200, Niklas Haas wrote:
> > On Wed, 03 Jul 2024 15:25:58 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> > > On Tue, 02 Jul 2024 15:27:00 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> > >
> > > > 1. Is this a good idea, or too confusing / complex to be worth the gain?
> > > > Specifically, I am worried about confusion arising due to differences
> > > > in behavior, and implemented options, between all of the above.
> > > >
> > > > That said, I think there is a big win to be had from unifying all of
> > > > the different scaling and/or conversion filters we have in e.g.
> > > > libavfilter, as well as making it trivial for users of this API to
> > > > try using e.g. GPU scaling instead of CPU scaling.
> > >
> > > After prototyping this approach a bit (using an internal struct
> > > AVScaleBackend), I think I like it. It specifically makes handling
> > > unscaled special converters pretty straightforward, for example - the
> > > "unscaled" backend can be separate from the generic/scaling backend.
> > >
> > > We could also trivially plug in something like libyuv, or some other
> > > limited-use-case fast path, without the user really noticing.
> >
> > Small update: I decided to scrap the idea of separate user-visible
> > "backends" for now, but preserved the internal API boundary between the
> > avscale_* "front-end" and the actual back-end implementation, which
> > I have called 'AVScaleGraph' for now.
> >
> > The idea is that this will grow into a full colorspace <-> colorspace
> > "solver", but for now it is just hooked up to sws_createContext().
> >
>
> > Attached is my revised working draft of <avscale.h>.
>
> I dont agree to the renaming of swscale, that is heading toward
I am not married to the name AVScaleContext; I only named it this way
because it is a new API and the old name is already taken. Consider it
a draft name, and if you can come up with something better, we can switch it.
>
>
> [...]
> > /**
> > * The exact interpretation of these quality presets depends on the backend
> > * used, but the backend-invariant common settings are derived as follows:
> > */
> > enum AVScaleQuality {
> > AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> > AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> > AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> > AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> > AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> > AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> > AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> > AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> > AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> > AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
>
> I dont think its a good idea to hardcode dither and the "FIR" filter to the quality level in the API
>
>
> [...]
> > /**
> > * Like `avscale_frame`, but operates only on the (source) range from `ystart`
> > * to `height`.
> > *
> > * @param ctx The scaling context.
> > * @param dst The destination frame. The data buffers may either be already
> > * allocated by the caller or left clear, in which case they will
> > * be allocated by the scaler. The latter may have performance
> > * advantages - e.g. in certain cases some (or all) output planes
> > * may be references to input planes, rather than copies.
> > * @param src The source frame. If the data buffers are set to NULL, then
> > * this function behaves identically to `avscale_frame_setup`.
> > * @param slice_start First row of slice, relative to `src`. Must be a
> > * multiple of avscale_slice_alignment(src).
> > * @param slice_height Number of (source) rows in the slice. Must be a
> > * multiple of avscale_slice_alignment(src).
> > *
> > * @return 0 on success, a negative AVERROR code on failure.
> > */
> > int avscale_frame_slice(AVScaleContext *ctx, AVFrame *dst,
> > const AVFrame *src, int slice_start, int slice_height);
> >
> > /**
> > * Like `avscale_frame`, but without actually scaling. It will instead merely
> > * initialize internal state that *would* be required to perform the operation,
> > * as well as returning the correct error code for unsupported frame
> > * combinations.
> > *
> > * @param ctx The scaling context.
> > * @param dst The destination frame to consider.
> > * @param src The source frame to consider.
> > * @return 0 on success, a negative AVERROR code on failure.
> > */
> > int avscale_frame_setup(AVScaleContext *ctx, const AVFrame *dst,
> > const AVFrame *src);
>
> somewhat off topic as this is public API but
> the swscale filtering code could internally use libavutil/executor.h
> having filters and slices interdepend and need to execute them "in order"
> and parallel, maybe that API is usefull, not sure, just wanted to mention it
Here is the design that I think both works and minimizes work:
1. Split the input image up into slices, dispatch one slice per thread.
2. Input processing, horizontal scaling etc. happens in thread-local
buffers.
3. The output of this process is written to a (shared) thread frame
with a synchronization primitive per slice. (*)
That way, there is a (mostly) clear hierarchy - the high-level code
splits the frame into slices, each thread sees only one slice, and the
dynamic input/output/conversion/etc. code does not need to care about
threading at all. The only nontrivial component is the vertical scaler,
which will have to somehow wait for the following slice's completion
before processing the last few rows.
(*) To save a bit on memory, we could again use a ring buffer for the
rows that are *not* shared between adjacent threads, but I think that is
an optimization we can worry about after it works and is fast enough.
>
> thx
>
> [...]
>
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-06 0:11 ` Hendrik Leppkes
@ 2024-07-06 12:32 ` Niklas Haas
2024-07-06 16:42 ` Michael Niedermayer
1 sibling, 0 replies; 36+ messages in thread
From: Niklas Haas @ 2024-07-06 12:32 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, 06 Jul 2024 02:11:30 +0200 Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> On Fri, Jul 5, 2024 at 11:34 PM Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> > > /**
> > > * The exact interpretation of these quality presets depends on the backend
> > > * used, but the backend-invariant common settings are derived as follows:
> > > */
> > > enum AVScaleQuality {
> > > AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> > > AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> > > AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> > > AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> > > AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> > > AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> > > AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> > > AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> > > AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> > > AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
> >
> > I dont think its a good idea to hardcode dither and the "FIR" filter to the quality level in the API
> >
>
> The entire point of presets is to have them provide a predefined set
> of parameters, easy for users to pick one value, rather than a bunch.
> And what a preset actually means should be documented.
> How do you define "presets" if they don't hardcode a list of choices
> for all the relevant options?
>
> Advanced settings exist for a user to select any particular detail, if
> they so desire.
One thought I had is to make the presets mechanism something separate,
e.g.
/* Explicitly defaults all options left as AVSCALE_*_AUTO */
avscale_default_options(AVScaleContext *ctx, int preset);
But I think this would just represent an extra burden on the typical API
user, who now needs to manually define this "int preset" with all
available options, and call this function at init time, rather than
getting it for free as part of the AVOptions exposed by AVScaleContext.
>
> - Hendrik
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-06 0:11 ` Hendrik Leppkes
2024-07-06 12:32 ` Niklas Haas
@ 2024-07-06 16:42 ` Michael Niedermayer
2024-07-06 17:29 ` Hendrik Leppkes
1 sibling, 1 reply; 36+ messages in thread
From: Michael Niedermayer @ 2024-07-06 16:42 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 2349 bytes --]
On Sat, Jul 06, 2024 at 02:11:30AM +0200, Hendrik Leppkes wrote:
> On Fri, Jul 5, 2024 at 11:34 PM Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> > > /**
> > > * The exact interpretation of these quality presets depends on the backend
> > > * used, but the backend-invariant common settings are derived as follows:
> > > */
> > > enum AVScaleQuality {
> > > AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> > > AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> > > AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> > > AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> > > AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> > > AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> > > AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> > > AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> > > AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> > > AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
> >
> > I dont think its a good idea to hardcode dither and the "FIR" filter to the quality level in the API
> >
>
> The entire point of presets is to have them provide a predefined set
> of parameters, easy for users to pick one value, rather than a bunch.
> And what a preset actually means should be documented.
> How do you define "presets" if they don't hardcode a list of choices
> for all the relevant options?
>
> Advanced settings exist for a user to select any particular detail, if
> they so desire.
The problem is that if new features are added and the API hardcodes a list of
what each quality level corresponds to, then changing it requires a major
version bump.
Also, do we really have, or want to have, optimized nearest-neighbour scaler
code? If not, AV_SCALE_ULTRAFAST could be slower than AV_SCALE_VERYFAST,
simply because it now "has to" do something we have not actually optimized.
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-06 16:42 ` Michael Niedermayer
@ 2024-07-06 17:29 ` Hendrik Leppkes
2024-07-08 11:58 ` Ronald S. Bultje
0 siblings, 1 reply; 36+ messages in thread
From: Hendrik Leppkes @ 2024-07-06 17:29 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Sat, Jul 6, 2024 at 6:42 PM Michael Niedermayer
<michael@niedermayer.cc> wrote:
>
> On Sat, Jul 06, 2024 at 02:11:30AM +0200, Hendrik Leppkes wrote:
> > On Fri, Jul 5, 2024 at 11:34 PM Michael Niedermayer
> > <michael@niedermayer.cc> wrote:
> > > > /**
> > > > * The exact interpretation of these quality presets depends on the backend
> > > > * used, but the backend-invariant common settings are derived as follows:
> > > > */
> > > > enum AVScaleQuality {
> > > > AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> > > > AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> > > > AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> > > > AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> > > > AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> > > > AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> > > > AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> > > > AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> > > > AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> > > > AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
> > >
> > > I dont think its a good idea to hardcode dither and the "FIR" filter to the quality level in the API
> > >
> >
> > The entire point of presets is to have them provide a predefined set
> > of parameters, easy for users to pick one value, rather than a bunch.
> > And what a preset actually means should be documented.
> > How do you define "presets" if they don't hardcode a list of choices
> > for all the relevant options?
> >
> > Advanced settings exist for a user to select any particular detail, if
> > they so desire.
>
> The problem is if new features are added and you have a hardcoded list in
> the API what each quality corresponds to change it you have to bump major
>
> also, do we really have or want to have optimized nearest neighbor scaler
> code ?
> If not the AV_SCALE_ULTRAFAST could be slower than AV_SCALE_VERYFAST
> simply because it now "has to" do something we actually have not optimized
>
So... you object to the comments that explain what it does?
Someone who uses presets never has a guarantee of the selected
algorithms and options; if that's your desire, set what you want. That
doesn't mean you can't inform users what they are at this time. It's a
comment indicating current behavior; we don't make any stability guarantees
about it, and no API linkage occurs.
- Hendrik
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-06 17:29 ` Hendrik Leppkes
@ 2024-07-08 11:58 ` Ronald S. Bultje
2024-07-08 12:33 ` Andrew Sayers
0 siblings, 1 reply; 36+ messages in thread
From: Ronald S. Bultje @ 2024-07-08 11:58 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Hi,
On Sat, Jul 6, 2024 at 1:29 PM Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> On Sat, Jul 6, 2024 at 6:42 PM Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> >
> > On Sat, Jul 06, 2024 at 02:11:30AM +0200, Hendrik Leppkes wrote:
> > > On Fri, Jul 5, 2024 at 11:34 PM Michael Niedermayer
> > > <michael@niedermayer.cc> wrote:
> > > > > /**
> > > > > * The exact interpretation of these quality presets depends on the backend
> > > > > * used, but the backend-invariant common settings are derived as follows:
> > > > > */
> > > > > enum AVScaleQuality {
> > > > > AV_SCALE_ULTRAFAST = 1, /* no dither, nearest+nearest */
> > > > > AV_SCALE_SUPERFAST = 2, /* no dither, bilinear+nearest */
> > > > > AV_SCALE_VERYFAST = 3, /* no dither, bilinear+bilinear */
> > > > > AV_SCALE_FASTER = 4, /* bayer dither, bilinear+bilinear */
> > > > > AV_SCALE_FAST = 5, /* bayer dither, bicubic+bilinear */
> > > > > AV_SCALE_MEDIUM = 6, /* bayer dither, bicubic+bicubic */
> > > > > AV_SCALE_SLOW = 7, /* bayer dither, lanczos+bicubic */
> > > > > AV_SCALE_SLOWER = 8, /* full dither, lanczos+bicubic */
> > > > > AV_SCALE_VERYSLOW = 9, /* full dither, lanczos+lanczos */
> > > > > AV_SCALE_PLACEBO = 10, /* full dither, lanczos+lanczos */
> > > >
> > > > I dont think its a good idea to hardcode dither and the "FIR" filter to the quality level in the API
> > > >
> > >
> > > The entire point of presets is to have them provide a predefined set
> > > of parameters, easy for users to pick one value, rather than a bunch.
> > > And what a preset actually means should be documented.
> > > How do you define "presets" if they don't hardcode a list of choices
> > > for all the relevant options?
> > >
> > > Advanced settings exist for a user to select any particular detail, if
> > > they so desire.
> >
> > The problem is if new features are added and you have a hardcoded list in
> > the API what each quality corresponds to change it you have to bump major
> >
> > also, do we really have or want to have optimized nearest neighbor scaler
> > code ?
> > If not the AV_SCALE_ULTRAFAST could be slower than AV_SCALE_VERYFAST
> > simply because it now "has to" do something we actually have not optimized
> >
>
> So.. you object to the comments that explain what it does?
> Someone that uses presets will never have a guarantee to the selected
> algorithms and options
>
But then why did we provide this information? It's one thing to have it in
a Stack Overflow answer about a specific FFmpeg version, but in a header, it
feels much more ... burdening. Even if no actual API linkage occurred.
Ronald
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-08 11:58 ` Ronald S. Bultje
@ 2024-07-08 12:33 ` Andrew Sayers
2024-07-08 13:25 ` Ronald S. Bultje
0 siblings, 1 reply; 36+ messages in thread
From: Andrew Sayers @ 2024-07-08 12:33 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Mon, Jul 08, 2024 at 07:58:44AM -0400, Ronald S. Bultje wrote:
> On Sat, Jul 6, 2024 at 1:29 PM Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> > On Sat, Jul 6, 2024 at 6:42 PM Michael Niedermayer
[...]
> > > > The entire point of presets is to have them provide a predefined set
> > > > of parameters, easy for users to pick one value, rather than a bunch.
> > > > And what a preset actually means should be documented.
> > > > How do you define "presets" if they don't hardcode a list of choices
> > > > for all the relevant options?
> > > >
> > > > Advanced settings exist for a user to select any particular detail, if
> > > > they so desire.
> > >
> > > The problem is if new features are added and you have a hardcoded list in
> > > the API what each quality corresponds to change it you have to bump major
> > >
> > > also, do we really have or want to have optimized nearest neighbor scaler
> > > code ?
> > > If not the AV_SCALE_ULTRAFAST could be slower than AV_SCALE_VERYFAST
> > > simply because it now "has to" do something we actually have not optimized
> > >
> >
> > So.. you object to the comments that explain what it does?
> > Someone that uses presets will never have a guarantee to the selected
> > algorithms and options
> >
>
> But then why did we provide this information? It's one thing to have it in
> a stackoverflow answer re: a specific FFmpeg version, but in a header, it
> feels much more ... burdening. Even if no actual API linkage occurred.
That burden exists no matter what, documentation just puts it on the shoulders
of a small number of developers who understand the problem; instead of a large
number of users who have to hope an SO answer isn't out-of-date yet.
We often say e.g. "this struct currently has such-and-such members, but the
size is not part of the public API". So it's not much of a stretch to say
"this preset enables such-and-such features, but the value is not part of the
public API".
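Concretely, the hedged wording Andrew suggests could be carried into the header itself. A sketch of such a doc comment (illustrative only, not actual FFmpeg documentation):

```c
/**
 * Quality presets. The settings each preset maps to are documented below
 * as an aid to users, but the exact mapping is informative only and is
 * NOT part of the public API -- it may change in any release without a
 * major version bump.
 *
 * e.g. AV_SCALE_MEDIUM currently selects: bayer dither, bicubic+bicubic.
 */
```

This keeps the enum values themselves stable while explicitly disclaiming stability of the per-preset behavior.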
* Re: [FFmpeg-devel] [RFC]] swscale modernization proposal
2024-07-08 12:33 ` Andrew Sayers
@ 2024-07-08 13:25 ` Ronald S. Bultje
0 siblings, 0 replies; 36+ messages in thread
From: Ronald S. Bultje @ 2024-07-08 13:25 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Hi,
On Mon, Jul 8, 2024 at 8:34 AM Andrew Sayers <ffmpeg-devel@pileofstuff.org>
wrote:
> We often say e.g. "this struct currently has such-and-such members, but the
> size is not part of the public API". So it's not much of a stretch to say
> "this preset enables such-and-such features, but the value is not part of
> the
> public API".
>
I like that, +1. No other comments for now.
Ronald
end of thread, other threads:[~2024-07-08 13:25 UTC | newest]
Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-22 13:13 [FFmpeg-devel] [RFC]] swscale modernization proposal Niklas Haas
2024-06-22 14:23 ` Andrew Sayers
2024-06-22 15:10 ` Niklas Haas
2024-06-22 19:52 ` Michael Niedermayer
2024-06-22 22:24 ` Niklas Haas
2024-06-23 17:27 ` Michael Niedermayer
2024-06-22 22:19 ` Vittorio Giovara
2024-06-22 22:39 ` Niklas Haas
2024-06-23 17:46 ` Michael Niedermayer
2024-06-23 19:00 ` Paul B Mahol
2024-06-23 17:57 ` James Almer
2024-06-23 18:40 ` Andrew Sayers
2024-06-24 14:33 ` Niklas Haas
2024-06-24 14:44 ` Vittorio Giovara
2024-06-25 15:31 ` Niklas Haas
2024-07-01 21:10 ` Stefano Sabatini
2024-06-29 7:41 ` Zhao Zhili
2024-06-29 10:58 ` Niklas Haas
2024-06-29 11:47 ` Niklas Haas
2024-06-29 12:35 ` Michael Niedermayer
2024-06-29 14:05 ` Niklas Haas
2024-06-29 14:11 ` James Almer
2024-06-30 6:25 ` Vittorio Giovara
2024-07-02 13:27 ` Niklas Haas
2024-07-03 13:25 ` Niklas Haas
2024-07-05 18:31 ` Niklas Haas
2024-07-05 21:34 ` Michael Niedermayer
2024-07-06 0:11 ` Hendrik Leppkes
2024-07-06 12:32 ` Niklas Haas
2024-07-06 16:42 ` Michael Niedermayer
2024-07-06 17:29 ` Hendrik Leppkes
2024-07-08 11:58 ` Ronald S. Bultje
2024-07-08 12:33 ` Andrew Sayers
2024-07-08 13:25 ` Ronald S. Bultje
2024-07-06 11:36 ` Andrew Sayers
2024-07-06 12:27 ` Niklas Haas