From: Niklas Haas <ffmpeg@haasn.xyz>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [RFC] swscale modernization proposal
Date: Sat, 29 Jun 2024 13:47:43 +0200
Message-ID: <20240629134743.GD4857@haasn.xyz>
In-Reply-To: <20240622151334.GD14140@haasn.xyz>
[-- Attachment #1: Type: text/plain, Size: 4012 bytes --]
On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> Hey,
>
> As some of you know, I got contracted (by STF 2024) to work on improving
> swscale, over the course of the next couple of months. I want to share my
> current plans and gather feedback + measure sentiment.
>
> ## Problem statement
>
> The two issues I'd like to focus on for now are:
>
> 1. Lack of support for a lot of modern formats and conversions (HDR, ICtCp,
> IPTc2, BT.2020-CL, XYZ, YCgCo, Dolby Vision, ...)
> 2. Complicated context management, with cascaded contexts, threading, stateful
> configuration, multi-step init procedures, etc; and related bugs
>
> In order to make these feasible, some amount of internal re-organization of
> duties inside swscale is prudent.
>
> ## Proposed approach
>
> The first step is to create a new API, which will (tentatively) live in
> <libswscale/avscale.h>. This API will initially start off as a near-copy of the
> current swscale public API, but with the major difference that I want it to be
> state-free and only access metadata in terms of AVFrame properties. So there
> will be no independent configuration of the input chroma location etc. like
> there is currently, and no need to re-configure or re-init the context when
> feeding it frames with different properties. The goal is for users to be able
> to just feed it AVFrame pairs and have it internally cache expensive
> pre-processing steps as needed. Finally, avscale_* should ultimately also
> support hardware frames directly, in which case it will dispatch to some
> equivalent of scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will
> defer this to a future milestone)
So, I've spent the past few days implementing this API and hooking it up
to swscale internally. (For testing, I am also replacing `vf_scale` with
an equivalent AVScale-based implementation, to see how the new API
impacts existing users.) It mostly works so far, with some left-over
translation issues that I have to address before it can be sent upstream.
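
As a rough illustration of the "feed it frame pairs, cache internally"
behaviour described in the quoted proposal, the following minimal sketch
redoes the expensive setup only when the relevant frame properties
change. It is not taken from the draft implementation: the
`ExampleProps`/`ExampleCache` types and `example_cache_update()` are
hypothetical stand-ins, and plain ints stand in for the AVFrame fields.

```c
#include <assert.h>

/* Hypothetical subset of the frame properties that affect a conversion. */
struct ExampleProps {
    int format, width, height;
    int colorspace, primaries, transfer, range, chroma_location;
};

/* Hypothetical internal state: the last frame pair we were set up for. */
struct ExampleCache {
    int valid;
    struct ExampleProps src, dst;
    int init_count; /* how often the expensive setup ran */
};

static int example_props_equal(const struct ExampleProps *a,
                               const struct ExampleProps *b)
{
    return a->format == b->format && a->width == b->width &&
           a->height == b->height && a->colorspace == b->colorspace &&
           a->primaries == b->primaries && a->transfer == b->transfer &&
           a->range == b->range &&
           a->chroma_location == b->chroma_location;
}

/* Returns 1 if the setup had to be redone, 0 on a cache hit. */
static int example_cache_update(struct ExampleCache *c,
                                const struct ExampleProps *dst,
                                const struct ExampleProps *src)
{
    assert(c && dst && src);
    if (c->valid && example_props_equal(&c->src, src) &&
        example_props_equal(&c->dst, dst))
        return 0;
    /* Expensive pre-processing (filter kernels, LUTs, ...) would go here. */
    c->src = *src;
    c->dst = *dst;
    c->valid = 1;
    c->init_count++;
    return 1;
}
```

With this shape, the caller never re-initializes anything explicitly;
a change in any compared property simply causes one more setup pass.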
------
One of the things I was thinking about was how to configure
scalers/dither modes, which sws currently, somewhat clunkily, controls
with flags. IMO, flags are not the right design here - if anything, it
should be a separate enum/int, and controllable separately for chroma
resampling (4:4:4 <-> 4:2:0) and main scaling (e.g. 50x50 <-> 80x80).
That said, I think that for most end users, having such fine-grained
options does not really provide any value - unless you're already
knee-deep in signal theory, the actual differences between, say,
"natural bicubic spline" and "Lanczos" are obscure at best and alien at
worst.
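
To make the "separate enum/int" idea concrete, here is a minimal sketch
of what such a configuration could look like, with one independent
choice per stage instead of a shared flags bitmask. All names
(`ExampleScaler`, `ExampleDither`, `ExampleScaleConfig`, ...) are made up
for illustration and are not part of the attached draft.

```c
#include <assert.h>

/* Hypothetical scaler kernels; names are illustrative only. */
enum ExampleScaler {
    EXAMPLE_SCALER_NEAREST = 0,
    EXAMPLE_SCALER_BILINEAR,
    EXAMPLE_SCALER_BICUBIC,
    EXAMPLE_SCALER_LANCZOS,
    EXAMPLE_SCALER_NB /* number of scalers, not a valid choice */
};

/* Hypothetical dither modes; names are illustrative only. */
enum ExampleDither {
    EXAMPLE_DITHER_NONE = 0,
    EXAMPLE_DITHER_ORDERED,
    EXAMPLE_DITHER_ERROR_DIFFUSION
};

/* One independent setting per stage, rather than bits in a flags field. */
struct ExampleScaleConfig {
    enum ExampleScaler scaler;        /* main scaling, e.g. 50x50 <-> 80x80 */
    enum ExampleScaler chroma_scaler; /* chroma resampling, 4:4:4 <-> 4:2:0 */
    enum ExampleDither dither;
};

/* Human-readable name of a scaler, e.g. for logging or option parsing. */
static const char *example_scaler_name(enum ExampleScaler s)
{
    static const char *const names[EXAMPLE_SCALER_NB] = {
        "nearest", "bilinear", "bicubic", "lanczos",
    };
    assert((int)s >= 0 && (int)s < EXAMPLE_SCALER_NB);
    return names[s];
}
```

A caller could then, for example, request Lanczos for the main scaling
step while keeping a cheaper bilinear filter for chroma.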
My idea was to provide a single `int quality`, which the user can set to
tune the speed <-> quality trade-off on an arbitrary numeric scale from
0 to 10, with 0 being the fastest (alias everything, nearest neighbour,
drop half chroma samples, etc.), the default being something in the
vicinity of 3-5, and 10 being the maximum quality (full linear
downscaling, anti-aliasing, error diffusion, etc.).
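
A minimal sketch of how a single quality level might expand into
individual settings is shown below. The `ExamplePreset` fields and the
thresholds are arbitrary placeholders chosen only to match the two
endpoints described above (everything off at 0, everything on at 10);
they are not proposed values and do not come from the draft.

```c
#include <assert.h>

#define EXAMPLE_QUALITY_MIN 0
#define EXAMPLE_QUALITY_MAX 10

/* Hypothetical per-level settings; field names are illustrative only. */
struct ExamplePreset {
    int full_chroma;     /* interpolate chroma instead of dropping samples */
    int antialias;       /* low-pass filter when downscaling */
    int linear_light;    /* scale in linear light */
    int error_diffusion; /* error diffusion instead of cheaper dithering */
};

/* Expand a quality level into settings; thresholds are placeholders. */
static struct ExamplePreset example_preset_for_quality(int quality)
{
    struct ExamplePreset p;
    assert(quality >= EXAMPLE_QUALITY_MIN && quality <= EXAMPLE_QUALITY_MAX);
    p.full_chroma     = quality >= 2;
    p.antialias       = quality >= 4;
    p.linear_light    = quality >= 8;
    p.error_diffusion = quality >= 9;
    return p;
}
```

Adding a new high-quality feature later would then only mean adding one
more field and one more threshold, without any change on the user side.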
The upside of this approach is that it would be vastly simpler for most
end users. It would also track newly added functionality automatically;
e.g. if we get a higher-quality tone mapping mode, it can be
retroactively added to the higher quality presets. The biggest downside
I can think of is that doing this would arguably violate the semantics
of a "bitexact" flag, since it would break results relative to
a previous version of libswscale - unless we maybe also force a specific
quality level in bitexact mode?
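
The "force a specific quality level in bitexact mode" option could be
sketched roughly as follows. `EXAMPLE_FLAG_BITEXACT` merely mirrors the
role of the bitexact flag, and the pinned level 5 is an arbitrary
placeholder; note that this only helps if the settings behind that
pinned level are themselves kept frozen across versions.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_FLAG_BITEXACT    (1 << 0) /* stand-in for a bitexact flag */
#define EXAMPLE_BITEXACT_QUALITY 5        /* arbitrary placeholder level */

/* Quality actually used: pinned in bitexact mode, else as requested. */
static int example_effective_quality(int64_t flags, int requested)
{
    assert(requested >= 0 && requested <= 10);
    if (flags & EXAMPLE_FLAG_BITEXACT)
        return EXAMPLE_BITEXACT_QUALITY;
    return requested;
}
```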
Open questions:
1. Is this a good idea, or do the downsides outweigh the benefits?
2. Is an "advanced configuration" API still needed, in addition to the
quality presets?
------
I have attached my current working draft of the public half of
<avscale.h>, for reference. You can also find my implementation draft at
the time of writing here:
https://github.com/haasn/FFmpeg/blob/avscale/libswscale/avscale.h
[-- Attachment #2: avscale.h --]
[-- Type: text/plain, Size: 5663 bytes --]
/*
* Copyright (C) 2024 Niklas Haas
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef SWSCALE_AVSCALE_H
#define SWSCALE_AVSCALE_H
/**
* @file
* @ingroup libsws
* Higher-level wrapper around libswscale + related libraries, which is
* capable of handling more advanced colorspace transformations.
*/
#include "libavutil/frame.h"
#include "libavutil/log.h"
/**
* Main external API structure. New fields can be added to the end with
* minor version bumps. Removal, reordering and changes to existing fields
* require a major version bump. sizeof(AVScaleContext) is not part of the ABI.
*/
typedef struct AVScaleContext {
const AVClass *av_class;
/**
* Private context used for internal data.
*/
struct AVScaleInternal *internal;
/**
* Private data of the user, can be used to carry app specific stuff.
*/
void *opaque;
/**
* Bitmask of AV_SCALE_* flags.
*/
int64_t flags;
/**
* How many threads to use for processing, or 0 for automatic selection.
*/
int threads;
/**
* Quality factor (0-10). The default quality is [TBD]. Higher values
* sacrifice speed in exchange for quality.
*
* TODO: explain what changes at each level
*/
int quality;
} AVScaleContext;
enum {
/**
* Force bit-exact output. This will prevent the use of platform-specific
* optimizations that may lead to slight differences in rounding, in favor
* of always maintaining exact bit output compatibility with the reference
* C code.
*
* Note: This is also available under the name "accurate_rnd" for
* backwards compatibility.
*/
AV_SCALE_BITEXACT = 1 << 0,
/**
* Return an error on underspecified conversions. Without this flag,
* unspecified fields are defaulted to sensible values.
*/
AV_SCALE_STRICT = 1 << 1,
};
/**
* Allocate an AVScaleContext and set its fields to default values. The
* resulting struct should be freed with avscale_free_context().
*/
AVScaleContext *avscale_alloc_context(void);
/**
* Free the scaling context and everything associated with it, and write
* NULL to the provided pointer.
*/
void avscale_free_context(AVScaleContext **ctx);
/**
* Get the AVClass for AVScaleContext. It can be used in combination with
* AV_OPT_SEARCH_FAKE_OBJ for examining options.
*
* @see av_opt_find().
*/
const AVClass *avscale_get_class(void);
/**
* Statically test if a conversion is supported. Values of (respectively)
* NONE/UNSPECIFIED are ignored.
*
* Returns 1 if the conversion is supported, or 0 otherwise.
*/
int avscale_test_format(enum AVPixelFormat dst, enum AVPixelFormat src);
int avscale_test_colorspace(enum AVColorSpace dst, enum AVColorSpace src);
int avscale_test_primaries(enum AVColorPrimaries dst, enum AVColorPrimaries src);
int avscale_test_transfer(enum AVColorTransferCharacteristic dst,
enum AVColorTransferCharacteristic src);
/**
* Scale source data from `src` and write the output to `dst`. This is
* merely a convenience wrapper around `avscale_frame_slice(ctx, dst, src, 0,
* src->height)`.
*
* @param ctx The scaling context.
* @param dst The destination frame.
*
* The data buffers may either be already allocated by the caller
* or left clear, in which case they will be allocated by the
* scaler. The latter may have performance advantages - e.g. in
* certain cases some (or all) output planes may be references to
* input planes, rather than copies.
* @param src The source frame. If the data buffers are set to NULL, then
* this function performs no conversion. It will instead merely
* initialize internal state that *would* be required to perform
* the operation, as well as returning the correct error code for
* unsupported frame combinations.
*
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src);
/**
* Like `avscale_frame`, but operates only on the (source) range of
* `slice_height` rows starting at row `slice_start`.
*
* Note: For interlaced or vertically subsampled frames, `slice_start` and
* `slice_height` must be aligned to a multiple of the subsampling size
* (typically 2, or 4 in the case of interlaced subsampled material).
*
* @param ctx The scaling context.
* @param dst The destination frame. See avscale_frame() for more details.
* @param src The source frame. See avscale_frame() for more details.
* @param slice_start First row of slice, relative to `src`
* @param slice_height Number of (source) rows in the slice
*
* @return 0 on success, a negative AVERROR code on failure.
*/
int avscale_frame_slice(AVScaleContext *ctx, AVFrame *dst, const AVFrame *src,
int slice_start, int slice_height);
#endif /* SWSCALE_AVSCALE_H */
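
The slice alignment note on `avscale_frame_slice()` can be illustrated
with a small self-contained check. This is only a sketch: the helper
names are hypothetical, and it assumes the vertical subsampling is given
as a log2 shift (as in `AVPixFmtDescriptor.log2_chroma_h`), so that the
required alignment is 2 for either subsampled or interlaced material
and 4 when both apply.

```c
#include <assert.h>

/*
 * Required row alignment for a slice: 1 << log2_chroma_h rows for
 * vertically subsampled formats, doubled again for interlaced material.
 */
static int example_slice_align(int log2_chroma_h, int interlaced)
{
    assert(log2_chroma_h >= 0 && log2_chroma_h < 8);
    return (1 << log2_chroma_h) << (interlaced ? 1 : 0);
}

/* Returns 1 if the slice lies inside the frame and is aligned, else 0. */
static int example_slice_is_aligned(int slice_start, int slice_height,
                                    int frame_height, int align)
{
    assert(align > 0);
    if (slice_start < 0 || slice_height <= 0 ||
        slice_start > frame_height - slice_height)
        return 0;
    return slice_start % align == 0 && slice_height % align == 0;
}
```

A caller splitting a frame across threads could run such a check on
each slice before handing it to the scaler.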