Re: [FFmpeg-devel] [PATCH 1/1] [doc/filters] add nvidia cuda and cuda npp sections

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed

* Re: [FFmpeg-devel] [PATCH 1/1] [doc/filters] add nvidia cuda and cuda npp sections
       [not found] <20250116004751.56789-1-danyaschenko@gmail.com>
@ 2025-01-26 10:55 ` Gyan Doshi
  2025-01-28 19:12   ` [FFmpeg-devel] [PATCH 1/2] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters Danil Iashchenko
  0 siblings, 1 reply; 12+ messages in thread
From: Gyan Doshi @ 2025-01-26 10:55 UTC (permalink / raw)
  To: ffmpeg-devel



On 2025-01-16 06:17 am, Danil Iashchenko wrote:
> Add Nvidia Cuda and Cuda NPP sections for video filters.
> ---
>   doc/filters.texi | 1545 +++++++++++++++++++++++-----------------------
>   1 file changed, 778 insertions(+), 767 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index b926b865ae..efa5e84f29 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
>   
>   This filter supports the all above options as @ref{commands}.
>   
> -@section bilateral_cuda
> -CUDA accelerated bilateral filter, an edge preserving filter.
> -This filter is mathematically accurate thanks to the use of GPU acceleration.
> -For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
> -
> -The filter accepts the following options:
> -@table @option
> -@item sigmaS
> -Set sigma of gaussian function to calculate spatial weight, also called sigma space.
> -Allowed range is 0.1 to 512. Default is 0.1.
> -
> -@item sigmaR
> -Set sigma of gaussian function to calculate color range weight, also called sigma color.
> -Allowed range is 0.1 to 512. Default is 0.1.
> -
> -@item window_size
> -Set window size of the bilateral function to determine the number of neighbours to loop on.
> -If the number entered is even, one will be added automatically.
> -Allowed range is 1 to 255. Default is 1.
> -@end table
> -@subsection Examples
> -
> -@itemize
> -@item
> -Apply the bilateral filter on a video.
> -
> -@example
> -./ffmpeg -v verbose \
> --hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
> --init_hw_device cuda \
> --filter_complex \
> -" \
> -[0:v]scale_cuda=format=yuv444p[scaled_video];
> -[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
> --an -sn -c:v h264_nvenc -cq 20 out.mp4
> -@end example
> -
> -@end itemize
> -
>   @section bitplanenoise
>   
>   Show and measure bit plane noise.
> @@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
>   The default value is @code{all}.
>   @end table
>   
> -@section bwdif_cuda
> -
> -Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
> -in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> -and/or nvenc.
> -
> -It accepts the following parameters:
> -
> -@table @option
> -@item mode
> -The interlacing mode to adopt. It accepts one of the following values:
> -
> -@table @option
> -@item 0, send_frame
> -Output one frame for each frame.
> -@item 1, send_field
> -Output one frame for each field.
> -@end table
> -
> -The default value is @code{send_field}.
> -
> -@item parity
> -The picture field parity assumed for the input interlaced video. It accepts one
> -of the following values:
> -
> -@table @option
> -@item 0, tff
> -Assume the top field is first.
> -@item 1, bff
> -Assume the bottom field is first.
> -@item -1, auto
> -Enable automatic detection of field parity.
> -@end table
> -
> -The default value is @code{auto}.
> -If the interlacing is unknown or the decoder does not export this information,
> -top field first will be assumed.
> -
> -@item deint
> -Specify which frames to deinterlace. Accepts one of the following
> -values:
> -
> -@table @option
> -@item 0, all
> -Deinterlace all frames.
> -@item 1, interlaced
> -Only deinterlace frames marked as interlaced.
> -@end table
> -
> -The default value is @code{all}.
> -@end table
> -
>   @section ccrepack
>   
>   Repack CEA-708 closed captioning side data
> @@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
>   @end example
>   @end itemize
>   
> -@section chromakey_cuda
> -CUDA accelerated YUV colorspace color/chroma keying.
> -
> -This filter works like normal chromakey filter but operates on CUDA frames.
> -for more details and parameters see @ref{chromakey}.
> -
> -@subsection Examples
> -
> -@itemize
> -@item
> -Make all the green pixels in the input video transparent and use it as an overlay for another video:
> -
> -@example
> -./ffmpeg \
> -    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
> -    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
> -    -init_hw_device cuda \
> -    -filter_complex \
> -    " \
> -        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
> -        [1:v]scale_cuda=format=yuv420p[base]; \
> -        [base][overlay_video]overlay_cuda" \
> -    -an -sn -c:v h264_nvenc -cq 20 output.mp4
> -@end example
> -
> -@item
> -Process two software sources, explicitly uploading the frames:
> -
> -@example
> -./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
> -    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
> -    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
> -    -filter_complex \
> -    " \
> -        [0]hwupload[under]; \
> -        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
> -        [under][over]overlay_cuda" \
> -    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
> -@end example
> -
> -@end itemize
> -
>   @section chromanr
>   Reduce chrominance noise.
>   
> @@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
>   colorspace=smpte240m
>   @end example
>   
> -@section colorspace_cuda
> -
> -CUDA accelerated implementation of the colorspace filter.
> -
> -It is by no means feature complete compared to the software colorspace filter,
> -and at the current time only supports color range conversion between jpeg/full
> -and mpeg/limited range.
> -
> -The filter accepts the following options:
> -
> -@table @option
> -@item range
> -Specify output color range.
> -
> -The accepted values are:
> -@table @samp
> -@item tv
> -TV (restricted) range
> -
> -@item mpeg
> -MPEG (restricted) range
> -
> -@item pc
> -PC (full) range
> -
> -@item jpeg
> -JPEG (full) range
> -
> -@end table
> -
> -@end table
> -
>   @section colortemperature
>   Adjust color temperature in video to simulate variations in ambient color temperature.
>   
> @@ -17058,32 +16893,6 @@ ffmpeg -i distorted.mpg -i reference.mkv -lavfi "[0:v]settb=AVTB,setpts=PTS-STAR
>   @end example
>   @end itemize
>   
> -@section libvmaf_cuda
> -
> -This is the CUDA variant of the @ref{libvmaf} filter. It only accepts CUDA frames.
> -
> -It requires Netflix's vmaf library (libvmaf) as a pre-requisite.
> -After installing the library it can be enabled using:
> -@code{./configure --enable-nonfree --enable-ffnvcodec --enable-libvmaf}.
> -
> -@subsection Examples
> -@itemize
> -
> -@item
> -Basic usage showing CUVID hardware decoding and CUDA scaling with @ref{scale_cuda}:
> -@example
> -ffmpeg \
> -    -hwaccel cuda -hwaccel_output_format cuda -codec:v av1_cuvid -i dis.obu \
> -    -hwaccel cuda -hwaccel_output_format cuda -codec:v av1_cuvid -i ref.obu \
> -    -filter_complex "
> -        [0:v]scale_cuda=format=yuv420p[dis]; \
> -        [1:v]scale_cuda=format=yuv420p[ref]; \
> -        [dis][ref]libvmaf_cuda=log_fmt=json:log_path=output.json
> -    " \
> -    -f null -
> -@end example
> -@end itemize
> -
>   @section limitdiff
>   Apply limited difference filter using second and optionally third video stream.
>   
> @@ -18977,129 +18786,51 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
>   
>   @end itemize
>   
> -@anchor{overlay_cuda}
> -@section overlay_cuda
> -
> -Overlay one video on top of another.
> -
> -This is the CUDA variant of the @ref{overlay} filter.
> -It only accepts CUDA frames. The underlying input pixel formats have to match.
> +@section owdenoise
>   
> -It takes two inputs and has one output. The first input is the "main"
> -video on which the second input is overlaid.
> +Apply Overcomplete Wavelet denoiser.
>   
> -It accepts the following parameters:
> +The filter accepts the following options:
>   
>   @table @option
> -@item x
> -@item y
> -Set expressions for the x and y coordinates of the overlaid video
> -on the main video.
> +@item depth
> +Set depth.
>   
> -They can contain the following parameters:
> +Larger depth values will denoise lower frequency components more, but
> +slow down filtering.
>   
> -@table @option
> +Must be an int in the range 8-16, default is @code{8}.
>   
> -@item main_w, W
> -@item main_h, H
> -The main input width and height.
> +@item luma_strength, ls
> +Set luma strength.
>   
> -@item overlay_w, w
> -@item overlay_h, h
> -The overlay input width and height.
> +Must be a double value in the range 0-1000, default is @code{1.0}.
>   
> -@item x
> -@item y
> -The computed values for @var{x} and @var{y}. They are evaluated for
> -each new frame.
> +@item chroma_strength, cs
> +Set chroma strength.
>   
> -@item n
> -The ordinal index of the main input frame, starting from 0.
> +Must be a double value in the range 0-1000, default is @code{1.0}.
> +@end table
>   
> -@item pos
> -The byte offset position in the file of the main input frame, NAN if unknown.
> -Deprecated, do not use.
> +@anchor{pad}
> +@section pad
>   
> -@item t
> -The timestamp of the main input frame, expressed in seconds, NAN if unknown.
> +Add paddings to the input image, and place the original input at the
> +provided @var{x}, @var{y} coordinates.
>   
> -@end table
> +It accepts the following parameters:
>   
> -Default value is "0" for both expressions.
> +@table @option
> +@item width, w
> +@item height, h
> +Specify an expression for the size of the output image with the
> +paddings added. If the value for @var{width} or @var{height} is 0, the
> +corresponding input size is used for the output.
>   
> -@item eval
> -Set when the expressions for @option{x} and @option{y} are evaluated.
> +The @var{width} expression can reference the value set by the
> +@var{height} expression, and vice versa.
>   
> -It accepts the following values:
> -@table @option
> -@item init
> -Evaluate expressions once during filter initialization or
> -when a command is processed.
> -
> -@item frame
> -Evaluate expressions for each incoming frame
> -@end table
> -
> -Default value is @option{frame}.
> -
> -@item eof_action
> -See @ref{framesync}.
> -
> -@item shortest
> -See @ref{framesync}.
> -
> -@item repeatlast
> -See @ref{framesync}.
> -
> -@end table
> -
> -This filter also supports the @ref{framesync} options.
> -
> -@section owdenoise
> -
> -Apply Overcomplete Wavelet denoiser.
> -
> -The filter accepts the following options:
> -
> -@table @option
> -@item depth
> -Set depth.
> -
> -Larger depth values will denoise lower frequency components more, but
> -slow down filtering.
> -
> -Must be an int in the range 8-16, default is @code{8}.
> -
> -@item luma_strength, ls
> -Set luma strength.
> -
> -Must be a double value in the range 0-1000, default is @code{1.0}.
> -
> -@item chroma_strength, cs
> -Set chroma strength.
> -
> -Must be a double value in the range 0-1000, default is @code{1.0}.
> -@end table
> -
> -@anchor{pad}
> -@section pad
> -
> -Add paddings to the input image, and place the original input at the
> -provided @var{x}, @var{y} coordinates.
> -
> -It accepts the following parameters:
> -
> -@table @option
> -@item width, w
> -@item height, h
> -Specify an expression for the size of the output image with the
> -paddings added. If the value for @var{width} or @var{height} is 0, the
> -corresponding input size is used for the output.
> -
> -The @var{width} expression can reference the value set by the
> -@var{height} expression, and vice versa.
> -
> -The default value of @var{width} and @var{height} is 0.
> +The default value of @var{width} and @var{height} is 0.
>   
>   @item x
>   @item y
> @@ -21479,11 +21210,9 @@ If the specified expression is not valid, it is kept at its current
>   value.
>   @end table
>   
> -@anchor{scale_cuda}
> -@section scale_cuda
> +@section scale_vt
>   
> -Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
> -Setting the output width and height works in the same way as for the @ref{scale} filter.
> +Scale and convert the color parameters using VTPixelTransferSession.
>   
>   The filter accepts the following options:
>   @table @option
> @@ -21491,390 +21220,117 @@ The filter accepts the following options:
>   @item h
>   Set the output video dimension expression. Default value is the input dimension.
>   
> -Allows for the same expressions as the @ref{scale} filter.
> +@item color_matrix
> +Set the output colorspace matrix.
>   
> -@item interp_algo
> -Sets the algorithm used for scaling:
> +@item color_primaries
> +Set the output color primaries.
>   
> -@table @var
> -@item nearest
> -Nearest neighbour
> +@item color_transfer
> +Set the output transfer characteristics.
>   
> -Used by default if input parameters match the desired output.
> +@end table
>   
> -@item bilinear
> -Bilinear
> +@section scharr
> +Apply scharr operator to input video stream.
>   
> -@item bicubic
> -Bicubic
> +The filter accepts the following option:
>   
> -This is the default.
> +@table @option
> +@item planes
> +Set which planes will be processed, unprocessed planes will be copied.
> +By default value 0xf, all planes will be processed.
>   
> -@item lanczos
> -Lanczos
> +@item scale
> +Set value which will be multiplied with filtered result.
>   
> +@item delta
> +Set value which will be added to filtered result.
>   @end table
>   
> -@item format
> -Controls the output pixel format. By default, or if none is specified, the input
> -pixel format is used.
> -
> -The filter does not support converting between YUV and RGB pixel formats.
> +@subsection Commands
>   
> -@item passthrough
> -If set to 0, every frame is processed, even if no conversion is necessary.
> -This mode can be useful to use the filter as a buffer for a downstream
> -frame-consumer that exhausts the limited decoder frame pool.
> +This filter supports the all above options as @ref{commands}.
>   
> -If set to 1, frames are passed through as-is if they match the desired output
> -parameters. This is the default behaviour.
> +@section scroll
> +Scroll input video horizontally and/or vertically by constant speed.
>   
> -@item param
> -Algorithm-Specific parameter.
> +The filter accepts the following options:
> +@table @option
> +@item horizontal, h
> +Set the horizontal scrolling speed. Default is 0. Allowed range is from -1 to 1.
> +Negative values changes scrolling direction.
>   
> -Affects the curves of the bicubic algorithm.
> +@item vertical, v
> +Set the vertical scrolling speed. Default is 0. Allowed range is from -1 to 1.
> +Negative values changes scrolling direction.
>   
> -@item force_original_aspect_ratio
> -@item force_divisible_by
> -Work the same as the identical @ref{scale} filter options.
> +@item hpos
> +Set the initial horizontal scrolling position. Default is 0. Allowed range is from 0 to 1.
>   
> +@item vpos
> +Set the initial vertical scrolling position. Default is 0. Allowed range is from 0 to 1.
>   @end table
>   
> -@subsection Examples
> -
> -@itemize
> -@item
> -Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
> -@example
> -scale_cuda=-2:720:format=yuv420p
> -@end example
> -
> -@item
> -Upscale to 4K using nearest neighbour algorithm.
> -@example
> -scale_cuda=4096:2160:interp_algo=nearest
> -@end example
> -
> -@item
> -Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
> -This can be useful to deal with a filter and encode chain that otherwise exhausts the
> -decoders frame pool.
> -@example
> -scale_cuda=passthrough=0
> -@end example
> -@end itemize
> -
> -@anchor{scale_npp}
> -@section scale_npp
> -
> -Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
> -format conversion on CUDA video frames. Setting the output width and height
> -works in the same way as for the @var{scale} filter.
> +@subsection Commands
>   
> -The following additional options are accepted:
> +This filter supports the following @ref{commands}:
>   @table @option
> -@item format
> -The pixel format of the output CUDA frames. If set to the string "same" (the
> -default), the input format will be kept. Note that automatic format negotiation
> -and conversion is not yet supported for hardware frames
> +@item horizontal, h
> +Set the horizontal scrolling speed.
> +@item vertical, v
> +Set the vertical scrolling speed.
> +@end table
>   
> -@item interp_algo
> -The interpolation algorithm used for resizing. One of the following:
> -@table @option
> -@item nn
> -Nearest neighbour.
> +@anchor{scdet}
> +@section scdet
>   
> -@item linear
> -@item cubic
> -@item cubic2p_bspline
> -2-parameter cubic (B=1, C=0)
> +Detect video scene change.
>   
> -@item cubic2p_catmullrom
> -2-parameter cubic (B=0, C=1/2)
> +This filter sets frame metadata with mafd between frame, the scene score, and
> +forward the frame to the next filter, so they can use these metadata to detect
> +scene change or others.
>   
> -@item cubic2p_b05c03
> -2-parameter cubic (B=1/2, C=3/10)
> +In addition, this filter logs a message and sets frame metadata when it detects
> +a scene change by @option{threshold}.
>   
> -@item super
> -Supersampling
> +@code{lavfi.scd.mafd} metadata keys are set with mafd for every frame.
>   
> -@item lanczos
> -@end table
> +@code{lavfi.scd.score} metadata keys are set with scene change score for every frame
> +to detect scene change.
>   
> -@item force_original_aspect_ratio
> -Enable decreasing or increasing output video width or height if necessary to
> -keep the original aspect ratio. Possible values:
> +@code{lavfi.scd.time} metadata keys are set with current filtered frame time which
> +detect scene change with @option{threshold}.
>   
> -@table @samp
> -@item disable
> -Scale the video as specified and disable this feature.
> +The filter accepts the following options:
>   
> -@item decrease
> -The output video dimensions will automatically be decreased if needed.
> +@table @option
> +@item threshold, t
> +Set the scene change detection threshold as a percentage of maximum change. Good
> +values are in the @code{[8.0, 14.0]} range. The range for @option{threshold} is
> +@code{[0., 100.]}.
>   
> -@item increase
> -The output video dimensions will automatically be increased if needed.
> +Default value is @code{10.}.
>   
> +@item sc_pass, s
> +Set the flag to pass scene change frames to the next filter. Default value is @code{0}
> +You can enable it if you want to get snapshot of scene change frames only.
>   @end table
>   
> -One useful instance of this option is that when you know a specific device's
> -maximum allowed resolution, you can use this to limit the output video to
> -that, while retaining the aspect ratio. For example, device A allows
> -1280x720 playback, and your video is 1920x800. Using this option (set it to
> -decrease) and specifying 1280x720 to the command line makes the output
> -1280x533.
> +@anchor{selectivecolor}
> +@section selectivecolor
>   
> -Please note that this is a different thing than specifying -1 for @option{w}
> -or @option{h}, you still need to specify the output resolution for this option
> -to work.
> +Adjust cyan, magenta, yellow and black (CMYK) to certain ranges of colors (such
> +as "reds", "yellows", "greens", "cyans", ...). The adjustment range is defined
> +by the "purity" of the color (that is, how saturated it already is).
>   
> -@item force_divisible_by
> -Ensures that both the output dimensions, width and height, are divisible by the
> -given integer when used together with @option{force_original_aspect_ratio}. This
> -works similar to using @code{-n} in the @option{w} and @option{h} options.
> +This filter is similar to the Adobe Photoshop Selective Color tool.
>   
> -This option respects the value set for @option{force_original_aspect_ratio},
> -increasing or decreasing the resolution accordingly. The video's aspect ratio
> -may be slightly modified.
> +The filter accepts the following options:
>   
> -This option can be handy if you need to have a video fit within or exceed
> -a defined resolution using @option{force_original_aspect_ratio} but also have
> -encoder restrictions on width or height divisibility.
> -
> -@item eval
> -Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
> -
> -@table @samp
> -@item init
> -Only evaluate expressions once during the filter initialization or when a command is processed.
> -
> -@item frame
> -Evaluate expressions for each incoming frame.
> -
> -@end table
> -
> -@end table
> -
> -The values of the @option{w} and @option{h} options are expressions
> -containing the following constants:
> -
> -@table @var
> -@item in_w
> -@item in_h
> -The input width and height
> -
> -@item iw
> -@item ih
> -These are the same as @var{in_w} and @var{in_h}.
> -
> -@item out_w
> -@item out_h
> -The output (scaled) width and height
> -
> -@item ow
> -@item oh
> -These are the same as @var{out_w} and @var{out_h}
> -
> -@item a
> -The same as @var{iw} / @var{ih}
> -
> -@item sar
> -input sample aspect ratio
> -
> -@item dar
> -The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
> -
> -@item n
> -The (sequential) number of the input frame, starting from 0.
> -Only available with @code{eval=frame}.
> -
> -@item t
> -The presentation timestamp of the input frame, expressed as a number of
> -seconds. Only available with @code{eval=frame}.
> -
> -@item pos
> -The position (byte offset) of the frame in the input stream, or NaN if
> -this information is unavailable and/or meaningless (for example in case of synthetic video).
> -Only available with @code{eval=frame}.
> -Deprecated, do not use.
> -@end table
> -
> -@section scale2ref_npp
> -
> -Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
> -video, based on a reference video.
> -
> -See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
> -but uses the reference video instead of the main input as basis. scale2ref_npp
> -also supports the following additional constants for the @option{w} and
> -@option{h} options:
> -
> -@table @var
> -@item main_w
> -@item main_h
> -The main input video's width and height
> -
> -@item main_a
> -The same as @var{main_w} / @var{main_h}
> -
> -@item main_sar
> -The main input video's sample aspect ratio
> -
> -@item main_dar, mdar
> -The main input video's display aspect ratio. Calculated from
> -@code{(main_w / main_h) * main_sar}.
> -
> -@item main_n
> -The (sequential) number of the main input frame, starting from 0.
> -Only available with @code{eval=frame}.
> -
> -@item main_t
> -The presentation timestamp of the main input frame, expressed as a number of
> -seconds. Only available with @code{eval=frame}.
> -
> -@item main_pos
> -The position (byte offset) of the frame in the main input stream, or NaN if
> -this information is unavailable and/or meaningless (for example in case of synthetic video).
> -Only available with @code{eval=frame}.
> -@end table
> -
> -@subsection Examples
> -
> -@itemize
> -@item
> -Scale a subtitle stream (b) to match the main video (a) in size before overlaying
> -@example
> -'scale2ref_npp[b][a];[a][b]overlay_cuda'
> -@end example
> -
> -@item
> -Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
> -@example
> -[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
> -@end example
> -@end itemize
> -
> -@section scale_vt
> -
> -Scale and convert the color parameters using VTPixelTransferSession.
> -
> -The filter accepts the following options:
> -@table @option
> -@item w
> -@item h
> -Set the output video dimension expression. Default value is the input dimension.
> -
> -@item color_matrix
> -Set the output colorspace matrix.
> -
> -@item color_primaries
> -Set the output color primaries.
> -
> -@item color_transfer
> -Set the output transfer characteristics.
> -
> -@end table
> -
> -@section scharr
> -Apply scharr operator to input video stream.
> -
> -The filter accepts the following option:
> -
> -@table @option
> -@item planes
> -Set which planes will be processed, unprocessed planes will be copied.
> -By default value 0xf, all planes will be processed.
> -
> -@item scale
> -Set value which will be multiplied with filtered result.
> -
> -@item delta
> -Set value which will be added to filtered result.
> -@end table
> -
> -@subsection Commands
> -
> -This filter supports the all above options as @ref{commands}.
> -
> -@section scroll
> -Scroll input video horizontally and/or vertically by constant speed.
> -
> -The filter accepts the following options:
> -@table @option
> -@item horizontal, h
> -Set the horizontal scrolling speed. Default is 0. Allowed range is from -1 to 1.
> -Negative values changes scrolling direction.
> -
> -@item vertical, v
> -Set the vertical scrolling speed. Default is 0. Allowed range is from -1 to 1.
> -Negative values changes scrolling direction.
> -
> -@item hpos
> -Set the initial horizontal scrolling position. Default is 0. Allowed range is from 0 to 1.
> -
> -@item vpos
> -Set the initial vertical scrolling position. Default is 0. Allowed range is from 0 to 1.
> -@end table
> -
> -@subsection Commands
> -
> -This filter supports the following @ref{commands}:
> -@table @option
> -@item horizontal, h
> -Set the horizontal scrolling speed.
> -@item vertical, v
> -Set the vertical scrolling speed.
> -@end table
> -
> -@anchor{scdet}
> -@section scdet
> -
> -Detect video scene change.
> -
> -This filter sets frame metadata with mafd between frame, the scene score, and
> -forward the frame to the next filter, so they can use these metadata to detect
> -scene change or others.
> -
> -In addition, this filter logs a message and sets frame metadata when it detects
> -a scene change by @option{threshold}.
> -
> -@code{lavfi.scd.mafd} metadata keys are set with mafd for every frame.
> -
> -@code{lavfi.scd.score} metadata keys are set with scene change score for every frame
> -to detect scene change.
> -
> -@code{lavfi.scd.time} metadata keys are set with current filtered frame time which
> -detect scene change with @option{threshold}.
> -
> -The filter accepts the following options:
> -
> -@table @option
> -@item threshold, t
> -Set the scene change detection threshold as a percentage of maximum change. Good
> -values are in the @code{[8.0, 14.0]} range. The range for @option{threshold} is
> -@code{[0., 100.]}.
> -
> -Default value is @code{10.}.
> -
> -@item sc_pass, s
> -Set the flag to pass scene change frames to the next filter. Default value is @code{0}
> -You can enable it if you want to get snapshot of scene change frames only.
> -@end table
> -
> -@anchor{selectivecolor}
> -@section selectivecolor
> -
> -Adjust cyan, magenta, yellow and black (CMYK) to certain ranges of colors (such
> -as "reds", "yellows", "greens", "cyans", ...). The adjustment range is defined
> -by the "purity" of the color (that is, how saturated it already is).
> -
> -This filter is similar to the Adobe Photoshop Selective Color tool.
> -
> -The filter accepts the following options:
> -
> -@table @option
> -@item correction_method
> -Select color correction method.
> +@table @option
> +@item correction_method
> +Select color correction method.
>   
>   Available values are:
>   @table @samp
> @@ -22200,23 +21656,6 @@ Keep the same chroma location (default).
>   @end table
>   @end table
>   
> -@section sharpen_npp
> -Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
> -border control.
> -
> -The following additional options are accepted:
> -@table @option
> -
> -@item border_type
> -Type of sampling to be used ad frame borders. One of the following:
> -@table @option
> -
> -@item replicate
> -Replicate pixel values.
> -
> -@end table
> -@end table
> -
>   @section shear
>   Apply shear transform to input video.
>   
> @@ -24304,49 +23743,8 @@ The command above can also be specified as:
>   transpose=1:portrait
>   @end example
>   
> -@section transpose_npp
> -
> -Transpose rows with columns in the input video and optionally flip it.
> -For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
> -
> -It accepts the following parameters:
> -
> -@table @option
> -
> -@item dir
> -Specify the transposition direction.
> -
> -Can assume the following values:
> -@table @samp
> -@item cclock_flip
> -Rotate by 90 degrees counterclockwise and vertically flip. (default)
> -
> -@item clock
> -Rotate by 90 degrees clockwise.
> -
> -@item cclock
> -Rotate by 90 degrees counterclockwise.
> -
> -@item clock_flip
> -Rotate by 90 degrees clockwise and vertically flip.
> -@end table
> -
> -@item passthrough
> -Do not apply the transposition if the input geometry matches the one
> -specified by the specified value. It accepts the following values:
> -@table @samp
> -@item none
> -Always apply transposition. (default)
> -@item portrait
> -Preserve portrait geometry (when @var{height} >= @var{width}).
> -@item landscape
> -Preserve landscape geometry (when @var{width} >= @var{height}).
> -@end table
> -
> -@end table
> -
> -@section trim
> -Trim the input so that the output contains one continuous subpart of the input.
> +@section trim
> +Trim the input so that the output contains one continuous subpart of the input.
>   
>   It accepts the following parameters:
>   @table @option
> @@ -26362,64 +25760,6 @@ filter").
>   It accepts the following parameters:
>   
>   
> -@table @option
> -
> -@item mode
> -The interlacing mode to adopt. It accepts one of the following values:
> -
> -@table @option
> -@item 0, send_frame
> -Output one frame for each frame.
> -@item 1, send_field
> -Output one frame for each field.
> -@item 2, send_frame_nospatial
> -Like @code{send_frame}, but it skips the spatial interlacing check.
> -@item 3, send_field_nospatial
> -Like @code{send_field}, but it skips the spatial interlacing check.
> -@end table
> -
> -The default value is @code{send_frame}.
> -
> -@item parity
> -The picture field parity assumed for the input interlaced video. It accepts one
> -of the following values:
> -
> -@table @option
> -@item 0, tff
> -Assume the top field is first.
> -@item 1, bff
> -Assume the bottom field is first.
> -@item -1, auto
> -Enable automatic detection of field parity.
> -@end table
> -
> -The default value is @code{auto}.
> -If the interlacing is unknown or the decoder does not export this information,
> -top field first will be assumed.
> -
> -@item deint
> -Specify which frames to deinterlace. Accepts one of the following
> -values:
> -
> -@table @option
> -@item 0, all
> -Deinterlace all frames.
> -@item 1, interlaced
> -Only deinterlace frames marked as interlaced.
> -@end table
> -
> -The default value is @code{all}.
> -@end table
> -
> -@section yadif_cuda
> -
> -Deinterlace the input video using the @ref{yadif} algorithm, but implemented
> -in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> -and/or nvenc.
> -
> -It accepts the following parameters:
> -
> -
>   @table @option
>   
>   @item mode
> @@ -26890,6 +26230,677 @@ value.
>   
>   @c man end VIDEO FILTERS
>   
> +@chapter Cuda Video Filters
> +@c man begin Cuda Video Filters
> +
> +@section Cuda
> +Below is a description of the currently available Nvidia Cuda video filters.

s/Cuda/CUDA/
(except where referring to a filter or code)

Mention what are the requirements to build and use these filters.


> +
> +@subsection bilateral_cuda
> +CUDA accelerated bilateral filter, an edge preserving filter.
> +This filter is mathematically accurate thanks to the use of GPU acceleration.
> +For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
> +
> +The filter accepts the following options:
> +@table @option
> +@item sigmaS
> +Set sigma of gaussian function to calculate spatial weight, also called sigma space.
> +Allowed range is 0.1 to 512. Default is 0.1.
> +
> +@item sigmaR
> +Set sigma of gaussian function to calculate color range weight, also called sigma color.
> +Allowed range is 0.1 to 512. Default is 0.1.
> +
> +@item window_size
> +Set window size of the bilateral function to determine the number of neighbours to loop on.
> +If the number entered is even, one will be added automatically.
> +Allowed range is 1 to 255. Default is 1.
> +@end table
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Apply the bilateral filter on a video.
> +
> +@example
> +./ffmpeg -v verbose \
> +-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
> +-init_hw_device cuda \
> +-filter_complex \
> +" \
> +[0:v]scale_cuda=format=yuv444p[scaled_video];
> +[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
> +-an -sn -c:v h264_nvenc -cq 20 out.mp4
> +@end example
> +
> +@end itemize
> +
> +@subsection bwdif_cuda
> +
> +Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
> +in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> +and/or nvenc.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item mode
> +The interlacing mode to adopt. It accepts one of the following values:
> +
> +@table @option
> +@item 0, send_frame
> +Output one frame for each frame.
> +@item 1, send_field
> +Output one frame for each field.
> +@end table
> +
> +The default value is @code{send_field}.
> +
> +@item parity
> +The picture field parity assumed for the input interlaced video. It accepts one
> +of the following values:
> +
> +@table @option
> +@item 0, tff
> +Assume the top field is first.
> +@item 1, bff
> +Assume the bottom field is first.
> +@item -1, auto
> +Enable automatic detection of field parity.
> +@end table
> +
> +The default value is @code{auto}.
> +If the interlacing is unknown or the decoder does not export this information,
> +top field first will be assumed.
> +
> +@item deint
> +Specify which frames to deinterlace. Accepts one of the following
> +values:
> +
> +@table @option
> +@item 0, all
> +Deinterlace all frames.
> +@item 1, interlaced
> +Only deinterlace frames marked as interlaced.
> +@end table
> +
> +The default value is @code{all}.
> +@end table
> +
> +@subsection chromakey_cuda
> +CUDA accelerated YUV colorspace color/chroma keying.
> +
> +This filter works like normal chromakey filter but operates on CUDA frames.
> +for more details and parameters see @ref{chromakey}.
> +
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Make all the green pixels in the input video transparent and use it as an overlay for another video:
> +
> +@example
> +./ffmpeg \
> +    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
> +    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
> +    -init_hw_device cuda \
> +    -filter_complex \
> +    " \
> +        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
> +        [1:v]scale_cuda=format=yuv420p[base]; \
> +        [base][overlay_video]overlay_cuda" \
> +    -an -sn -c:v h264_nvenc -cq 20 output.mp4
> +@end example
> +
> +@item
> +Process two software sources, explicitly uploading the frames:
> +
> +@example
> +./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
> +    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
> +    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
> +    -filter_complex \
> +    " \
> +        [0]hwupload[under]; \
> +        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
> +        [under][over]overlay_cuda" \
> +    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
> +@end example
> +
> +@end itemize
> +
> +@subsection colorspace_cuda
> +
> +CUDA accelerated implementation of the colorspace filter.
> +
> +It is by no means feature complete compared to the software colorspace filter,
> +and at the current time only supports color range conversion between jpeg/full
> +and mpeg/limited range.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item range
> +Specify output color range.
> +
> +The accepted values are:
> +@table @samp
> +@item tv
> +TV (restricted) range
> +
> +@item mpeg
> +MPEG (restricted) range
> +
> +@item pc
> +PC (full) range
> +
> +@item jpeg
> +JPEG (full) range
> +
> +@end table
> +
> +@end table
> +
> +@subsection libvmaf_cuda
> +
> +This is the CUDA variant of the @ref{libvmaf} filter. It only accepts CUDA frames.
> +
> +It requires Netflix's vmaf library (libvmaf) as a pre-requisite.
> +After installing the library it can be enabled using:
> +@code{./configure --enable-nonfree --enable-ffnvcodec --enable-libvmaf}.
> +
> +@subsubsection Examples
> +@itemize
> +
> +@item
> +Basic usage showing CUVID hardware decoding and CUDA scaling with @ref{scale_cuda}:
> +@example
> +ffmpeg \
> +    -hwaccel cuda -hwaccel_output_format cuda -codec:v av1_cuvid -i dis.obu \
> +    -hwaccel cuda -hwaccel_output_format cuda -codec:v av1_cuvid -i ref.obu \
> +    -filter_complex "
> +        [0:v]scale_cuda=format=yuv420p[dis]; \
> +        [1:v]scale_cuda=format=yuv420p[ref]; \
> +        [dis][ref]libvmaf_cuda=log_fmt=json:log_path=output.json
> +    " \
> +    -f null -
> +@end example
> +@end itemize
> +
> +@anchor{overlay_cuda}
> +@subsection overlay_cuda
> +
> +Overlay one video on top of another.
> +
> +This is the CUDA variant of the @ref{overlay} filter.
> +It only accepts CUDA frames. The underlying input pixel formats have to match.
> +
> +It takes two inputs and has one output. The first input is the "main"
> +video on which the second input is overlaid.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item x
> +@item y
> +Set expressions for the x and y coordinates of the overlaid video
> +on the main video.
> +
> +They can contain the following parameters:
> +
> +@table @option
> +
> +@item main_w, W
> +@item main_h, H
> +The main input width and height.
> +
> +@item overlay_w, w
> +@item overlay_h, h
> +The overlay input width and height.
> +
> +@item x
> +@item y
> +The computed values for @var{x} and @var{y}. They are evaluated for
> +each new frame.
> +
> +@item n
> +The ordinal index of the main input frame, starting from 0.
> +
> +@item pos
> +The byte offset position in the file of the main input frame, NAN if unknown.
> +Deprecated, do not use.
> +
> +@item t
> +The timestamp of the main input frame, expressed in seconds, NAN if unknown.
> +
> +@end table
> +
> +Default value is "0" for both expressions.
> +
> +@item eval
> +Set when the expressions for @option{x} and @option{y} are evaluated.
> +
> +It accepts the following values:
> +@table @option
> +@item init
> +Evaluate expressions once during filter initialization or
> +when a command is processed.
> +
> +@item frame
> +Evaluate expressions for each incoming frame
> +@end table
> +
> +Default value is @option{frame}.
> +
> +@item eof_action
> +See @ref{framesync}.
> +
> +@item shortest
> +See @ref{framesync}.
> +
> +@item repeatlast
> +See @ref{framesync}.
> +
> +@end table
> +
> +This filter also supports the @ref{framesync} options.
> +
> +@anchor{scale_cuda}
> +@subsection scale_cuda
> +
> +Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
> +Setting the output width and height works in the same way as for the @ref{scale} filter.
> +
> +The filter accepts the following options:
> +@table @option
> +@item w
> +@item h
> +Set the output video dimension expression. Default value is the input dimension.
> +
> +Allows for the same expressions as the @ref{scale} filter.
> +
> +@item interp_algo
> +Sets the algorithm used for scaling:
> +
> +@table @var
> +@item nearest
> +Nearest neighbour
> +
> +Used by default if input parameters match the desired output.
> +
> +@item bilinear
> +Bilinear
> +
> +@item bicubic
> +Bicubic
> +
> +This is the default.
> +
> +@item lanczos
> +Lanczos
> +
> +@end table
> +
> +@item format
> +Controls the output pixel format. By default, or if none is specified, the input
> +pixel format is used.
> +
> +The filter does not support converting between YUV and RGB pixel formats.
> +
> +@item passthrough
> +If set to 0, every frame is processed, even if no conversion is necessary.
> +This mode can be useful to use the filter as a buffer for a downstream
> +frame-consumer that exhausts the limited decoder frame pool.
> +
> +If set to 1, frames are passed through as-is if they match the desired output
> +parameters. This is the default behaviour.
> +
> +@item param
> +Algorithm-Specific parameter.
> +
> +Affects the curves of the bicubic algorithm.
> +
> +@item force_original_aspect_ratio
> +@item force_divisible_by
> +Work the same as the identical @ref{scale} filter options.
> +
> +@end table
> +
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
> +@example
> +scale_cuda=-2:720:format=yuv420p
> +@end example
> +
> +@item
> +Upscale to 4K using nearest neighbour algorithm.
> +@example
> +scale_cuda=4096:2160:interp_algo=nearest
> +@end example
> +
> +@item
> +Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
> +This can be useful to deal with a filter and encode chain that otherwise exhausts the
> +decoders frame pool.
> +@example
> +scale_cuda=passthrough=0
> +@end example
> +@end itemize
> +
> +@subsection yadif_cuda
> +
> +Deinterlace the input video using the @ref{yadif} algorithm, but implemented
> +in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> +and/or nvenc.
> +
> +It accepts the following parameters:
> +
> +
> +@table @option
> +
> +@item mode
> +The interlacing mode to adopt. It accepts one of the following values:
> +
> +@table @option
> +@item 0, send_frame
> +Output one frame for each frame.
> +@item 1, send_field
> +Output one frame for each field.
> +@item 2, send_frame_nospatial
> +Like @code{send_frame}, but it skips the spatial interlacing check.
> +@item 3, send_field_nospatial
> +Like @code{send_field}, but it skips the spatial interlacing check.
> +@end table
> +
> +The default value is @code{send_frame}.
> +
> +@item parity
> +The picture field parity assumed for the input interlaced video. It accepts one
> +of the following values:
> +
> +@table @option
> +@item 0, tff
> +Assume the top field is first.
> +@item 1, bff
> +Assume the bottom field is first.
> +@item -1, auto
> +Enable automatic detection of field parity.
> +@end table
> +
> +The default value is @code{auto}.
> +If the interlacing is unknown or the decoder does not export this information,
> +top field first will be assumed.
> +
> +@item deint
> +Specify which frames to deinterlace. Accepts one of the following
> +values:
> +
> +@table @option
> +@item 0, all
> +Deinterlace all frames.
> +@item 1, interlaced
> +Only deinterlace frames marked as interlaced.
> +@end table
> +
> +The default value is @code{all}.
> +@end table
> +
> +@section Cuda NPP
> +Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.

Need similar mention here on how to enable these filters.

Once done, you can remove the then-redundant passages from the 
individual filter entries.

Regards,
Gyan


> +
> +@anchor{scale_npp}
> +@subsection scale_npp
> +
> +Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
> +format conversion on CUDA video frames. Setting the output width and height
> +works in the same way as for the @var{scale} filter.
> +
> +The following additional options are accepted:
> +@table @option
> +@item format
> +The pixel format of the output CUDA frames. If set to the string "same" (the
> +default), the input format will be kept. Note that automatic format negotiation
> +and conversion is not yet supported for hardware frames
> +
> +@item interp_algo
> +The interpolation algorithm used for resizing. One of the following:
> +@table @option
> +@item nn
> +Nearest neighbour.
> +
> +@item linear
> +@item cubic
> +@item cubic2p_bspline
> +2-parameter cubic (B=1, C=0)
> +
> +@item cubic2p_catmullrom
> +2-parameter cubic (B=0, C=1/2)
> +
> +@item cubic2p_b05c03
> +2-parameter cubic (B=1/2, C=3/10)
> +
> +@item super
> +Supersampling
> +
> +@item lanczos
> +@end table
> +
> +@item force_original_aspect_ratio
> +Enable decreasing or increasing output video width or height if necessary to
> +keep the original aspect ratio. Possible values:
> +
> +@table @samp
> +@item disable
> +Scale the video as specified and disable this feature.
> +
> +@item decrease
> +The output video dimensions will automatically be decreased if needed.
> +
> +@item increase
> +The output video dimensions will automatically be increased if needed.
> +
> +@end table
> +
> +One useful instance of this option is that when you know a specific device's
> +maximum allowed resolution, you can use this to limit the output video to
> +that, while retaining the aspect ratio. For example, device A allows
> +1280x720 playback, and your video is 1920x800. Using this option (set it to
> +decrease) and specifying 1280x720 to the command line makes the output
> +1280x533.
> +
> +Please note that this is a different thing than specifying -1 for @option{w}
> +or @option{h}, you still need to specify the output resolution for this option
> +to work.
> +
> +@item force_divisible_by
> +Ensures that both the output dimensions, width and height, are divisible by the
> +given integer when used together with @option{force_original_aspect_ratio}. This
> +works similar to using @code{-n} in the @option{w} and @option{h} options.
> +
> +This option respects the value set for @option{force_original_aspect_ratio},
> +increasing or decreasing the resolution accordingly. The video's aspect ratio
> +may be slightly modified.
> +
> +This option can be handy if you need to have a video fit within or exceed
> +a defined resolution using @option{force_original_aspect_ratio} but also have
> +encoder restrictions on width or height divisibility.
> +
> +@item eval
> +Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
> +
> +@table @samp
> +@item init
> +Only evaluate expressions once during the filter initialization or when a command is processed.
> +
> +@item frame
> +Evaluate expressions for each incoming frame.
> +
> +@end table
> +
> +@end table
> +
> +The values of the @option{w} and @option{h} options are expressions
> +containing the following constants:
> +
> +@table @var
> +@item in_w
> +@item in_h
> +The input width and height
> +
> +@item iw
> +@item ih
> +These are the same as @var{in_w} and @var{in_h}.
> +
> +@item out_w
> +@item out_h
> +The output (scaled) width and height
> +
> +@item ow
> +@item oh
> +These are the same as @var{out_w} and @var{out_h}
> +
> +@item a
> +The same as @var{iw} / @var{ih}
> +
> +@item sar
> +input sample aspect ratio
> +
> +@item dar
> +The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
> +
> +@item n
> +The (sequential) number of the input frame, starting from 0.
> +Only available with @code{eval=frame}.
> +
> +@item t
> +The presentation timestamp of the input frame, expressed as a number of
> +seconds. Only available with @code{eval=frame}.
> +
> +@item pos
> +The position (byte offset) of the frame in the input stream, or NaN if
> +this information is unavailable and/or meaningless (for example in case of synthetic video).
> +Only available with @code{eval=frame}.
> +Deprecated, do not use.
> +@end table
> +
> +@subsection scale2ref_npp
> +
> +Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
> +video, based on a reference video.
> +
> +See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
> +but uses the reference video instead of the main input as basis. scale2ref_npp
> +also supports the following additional constants for the @option{w} and
> +@option{h} options:
> +
> +@table @var
> +@item main_w
> +@item main_h
> +The main input video's width and height
> +
> +@item main_a
> +The same as @var{main_w} / @var{main_h}
> +
> +@item main_sar
> +The main input video's sample aspect ratio
> +
> +@item main_dar, mdar
> +The main input video's display aspect ratio. Calculated from
> +@code{(main_w / main_h) * main_sar}.
> +
> +@item main_n
> +The (sequential) number of the main input frame, starting from 0.
> +Only available with @code{eval=frame}.
> +
> +@item main_t
> +The presentation timestamp of the main input frame, expressed as a number of
> +seconds. Only available with @code{eval=frame}.
> +
> +@item main_pos
> +The position (byte offset) of the frame in the main input stream, or NaN if
> +this information is unavailable and/or meaningless (for example in case of synthetic video).
> +Only available with @code{eval=frame}.
> +@end table
> +
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Scale a subtitle stream (b) to match the main video (a) in size before overlaying
> +@example
> +'scale2ref_npp[b][a];[a][b]overlay_cuda'
> +@end example
> +
> +@item
> +Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
> +@example
> +[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
> +@end example
> +@end itemize
> +
> +@subsection sharpen_npp
> +Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
> +border control.
> +
> +The following additional options are accepted:
> +@table @option
> +
> +@item border_type
> +Type of sampling to be used ad frame borders. One of the following:
> +@table @option
> +
> +@item replicate
> +Replicate pixel values.
> +
> +@end table
> +@end table
> +
> +@subsection transpose_npp
> +
> +Transpose rows with columns in the input video and optionally flip it.
> +For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +
> +@item dir
> +Specify the transposition direction.
> +
> +Can assume the following values:
> +@table @samp
> +@item cclock_flip
> +Rotate by 90 degrees counterclockwise and vertically flip. (default)
> +
> +@item clock
> +Rotate by 90 degrees clockwise.
> +
> +@item cclock
> +Rotate by 90 degrees counterclockwise.
> +
> +@item clock_flip
> +Rotate by 90 degrees clockwise and vertically flip.
> +@end table
> +
> +@item passthrough
> +Do not apply the transposition if the input geometry matches the one
> +specified by the specified value. It accepts the following values:
> +@table @samp
> +@item none
> +Always apply transposition. (default)
> +@item portrait
> +Preserve portrait geometry (when @var{height} >= @var{width}).
> +@item landscape
> +Preserve landscape geometry (when @var{width} >= @var{height}).
> +@end table
> +
> +@end table
> +
> +@c man end Cuda Video Filters
> +
>   @chapter OpenCL Video Filters
>   @c man begin OPENCL VIDEO FILTERS
>   

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [FFmpeg-devel] [PATCH 1/2] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-01-26 10:55 ` [FFmpeg-devel] [PATCH 1/1] [doc/filters] add nvidia cuda and cuda npp sections Gyan Doshi
@ 2025-01-28 19:12   ` Danil Iashchenko
  2025-01-28 19:12     ` [FFmpeg-devel] [PATCH 2/2] doc/filters: Remove redundant *_cuda and *_npp filters since they are already in CUDA Video Filters section Danil Iashchenko
  0 siblings, 1 reply; 12+ messages in thread
From: Danil Iashchenko @ 2025-01-28 19:12 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Danil Iashchenko

---
 doc/filters.texi | 687 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 687 insertions(+)

diff --git a/doc/filters.texi b/doc/filters.texi
index a14c7e7e77..c4f312d2b8 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -26890,6 +26890,693 @@ value.
 
 @c man end VIDEO FILTERS
 
+@chapter CUDA Video Filters
+@c man begin CUDA Video Filters
+
+To enable compilation of these filters you need to configure FFmpeg with
+@code{--enable-cuda-nvcc} and/or @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.
+
+Running CUDA filters requires you to initialize a hardware device and to pass that device to all filters in any filter graph.
+@table @option
+
+@item -init_hw_device cuda[=@var{name}][:@var{device}[,@var{key=value}...]]
+Initialise a new hardware device of type @var{cuda} called @var{name}, using the
+given device parameters.
+
+@item -filter_hw_device @var{name}
+Pass the hardware device called @var{name} to all filters in any filter graph.
+
+@end table
+
+For more detailed information see @url{https://www.ffmpeg.org/ffmpeg.html#Advanced-Video-options}
+
+@itemize
+@item
+Example of initializing second CUDA device on the system and running scale_cuda and bilateral_cuda filters.
+@example
+./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -init_hw_device cuda:1 -filter_complex \
+"[0:v]scale_cuda=format=yuv444p[scaled_video];[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+@end itemize
+
+Since CUDA filters operate exclusively on GPU memory, frame data must sometimes be uploaded (@ref{hwupload}) to hardware surfaces associated with the appropriate CUDA device before processing, and downloaded (@ref{hwdownload}) back to normal memory afterward, if required. Whether @ref{hwupload} or @ref{hwdownload} is necessary depends on the specific workflow:
+
+@itemize
+@item If the input frames are already in GPU memory (e.g., when using @code{-hwaccel cuda} or @code{-hwaccel_output_format cuda}), explicit use of @ref{hwupload} is not needed, as the data is already in the appropriate memory space.
+@item If the input frames are in CPU memory (e.g., software-decoded frames or frames processed by CPU-based filters), it is necessary to use @ref{hwupload} to transfer the data to GPU memory for CUDA processing.
+@item If the output of the CUDA filters needs to be further processed by software-based filters or saved in a format not supported by GPU-based encoders, @ref{hwdownload} is required to transfer the data back to CPU memory.
+@end itemize
+Note that @ref{hwupload} uploads data to a surface with the same layout as the software frame, so it may be necessary to add a @ref{format} filter immediately before @ref{hwupload} to ensure the input is in the correct format. Similarly, @ref{hwdownload} may not support all output formats, so an additional @ref{format} filter may need to be inserted immediately after @ref{hwdownload} in the filter graph to ensure compatibility.
+
+@section CUDA
+Below is a description of the currently available Nvidia CUDA video filters.
+
+To enable compilation of these filters you need to configure FFmpeg with
+@code{--enable-cuda-nvcc} and Nvidia CUDA Toolkit must be installed.
+
+@subsection bilateral_cuda
+CUDA accelerated bilateral filter, an edge preserving filter.
+This filter is mathematically accurate thanks to the use of GPU acceleration.
+For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
+
+The filter accepts the following options:
+@table @option
+@item sigmaS
+Set sigma of gaussian function to calculate spatial weight, also called sigma space.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item sigmaR
+Set sigma of gaussian function to calculate color range weight, also called sigma color.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item window_size
+Set window size of the bilateral function to determine the number of neighbours to loop on.
+If the number entered is even, one will be added automatically.
+Allowed range is 1 to 255. Default is 1.
+@end table
+@subsubsection Examples
+
+@itemize
+@item
+Apply the bilateral filter on a video.
+
+@example
+./ffmpeg -v verbose \
+-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
+-init_hw_device cuda \
+-filter_complex \
+" \
+[0:v]scale_cuda=format=yuv444p[scaled_video];
+[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+
+@end itemize
+
+@subsection bwdif_cuda
+
+Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+@table @option
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@end table
+
+The default value is @code{send_field}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@subsection chromakey_cuda
+CUDA accelerated YUV colorspace color/chroma keying.
+
+This filter works like normal chromakey filter but operates on CUDA frames.
+for more details and parameters see @ref{chromakey}.
+
+@subsubsection Examples
+
+@itemize
+@item
+Make all the green pixels in the input video transparent and use it as an overlay for another video:
+
+@example
+./ffmpeg \
+    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
+    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
+    -init_hw_device cuda \
+    -filter_complex \
+    " \
+        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
+        [1:v]scale_cuda=format=yuv420p[base]; \
+        [base][overlay_video]overlay_cuda" \
+    -an -sn -c:v h264_nvenc -cq 20 output.mp4
+@end example
+
+@item
+Process two software sources, explicitly uploading the frames:
+
+@example
+./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
+    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
+    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
+    -filter_complex \
+    " \
+        [0]hwupload[under]; \
+        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
+        [under][over]overlay_cuda" \
+    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
+@end example
+
+@end itemize
+
+@subsection colorspace_cuda
+
+CUDA accelerated implementation of the colorspace filter.
+
+It is by no means feature complete compared to the software colorspace filter,
+and at the current time only supports color range conversion between jpeg/full
+and mpeg/limited range.
+
+The filter accepts the following options:
+
+@table @option
+@item range
+Specify output color range.
+
+The accepted values are:
+@table @samp
+@item tv
+TV (restricted) range
+
+@item mpeg
+MPEG (restricted) range
+
+@item pc
+PC (full) range
+
+@item jpeg
+JPEG (full) range
+
+@end table
+
+@end table
+
+@anchor{overlay_cuda_section}
+@subsection overlay_cuda
+
+Overlay one video on top of another.
+
+This is the CUDA variant of the @ref{overlay} filter.
+It only accepts CUDA frames. The underlying input pixel formats have to match.
+
+It takes two inputs and has one output. The first input is the "main"
+video on which the second input is overlaid.
+
+It accepts the following parameters:
+
+@table @option
+@item x
+@item y
+Set expressions for the x and y coordinates of the overlaid video
+on the main video.
+
+They can contain the following parameters:
+
+@table @option
+
+@item main_w, W
+@item main_h, H
+The main input width and height.
+
+@item overlay_w, w
+@item overlay_h, h
+The overlay input width and height.
+
+@item x
+@item y
+The computed values for @var{x} and @var{y}. They are evaluated for
+each new frame.
+
+@item n
+The ordinal index of the main input frame, starting from 0.
+
+@item pos
+The byte offset position in the file of the main input frame, NAN if unknown.
+Deprecated, do not use.
+
+@item t
+The timestamp of the main input frame, expressed in seconds, NAN if unknown.
+
+@end table
+
+Default value is "0" for both expressions.
+
+@item eval
+Set when the expressions for @option{x} and @option{y} are evaluated.
+
+It accepts the following values:
+@table @option
+@item init
+Evaluate expressions once during filter initialization or
+when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame
+@end table
+
+Default value is @option{frame}.
+
+@item eof_action
+See @ref{framesync}.
+
+@item shortest
+See @ref{framesync}.
+
+@item repeatlast
+See @ref{framesync}.
+
+@end table
+
+This filter also supports the @ref{framesync} options.
+
+@anchor{scale_cuda_section}
+@subsection scale_cuda
+
+Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
+Setting the output width and height works in the same way as for the @ref{scale} filter.
+
+The filter accepts the following options:
+@table @option
+@item w
+@item h
+Set the output video dimension expression. Default value is the input dimension.
+
+Allows for the same expressions as the @ref{scale} filter.
+
+@item interp_algo
+Sets the algorithm used for scaling:
+
+@table @var
+@item nearest
+Nearest neighbour
+
+Used by default if input parameters match the desired output.
+
+@item bilinear
+Bilinear
+
+@item bicubic
+Bicubic
+
+This is the default.
+
+@item lanczos
+Lanczos
+
+@end table
+
+@item format
+Controls the output pixel format. By default, or if none is specified, the input
+pixel format is used.
+
+The filter does not support converting between YUV and RGB pixel formats.
+
+@item passthrough
+If set to 0, every frame is processed, even if no conversion is necessary.
+This mode can be useful to use the filter as a buffer for a downstream
+frame-consumer that exhausts the limited decoder frame pool.
+
+If set to 1, frames are passed through as-is if they match the desired output
+parameters. This is the default behaviour.
+
+@item param
+Algorithm-Specific parameter.
+
+Affects the curves of the bicubic algorithm.
+
+@item force_original_aspect_ratio
+@item force_divisible_by
+Work the same as the identical @ref{scale} filter options.
+
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
+
+@item
+Upscale to 4K using nearest neighbour algorithm.
+@example
+scale_cuda=4096:2160:interp_algo=nearest
+@end example
+
+@item
+Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
+This can be useful to deal with a filter and encode chain that otherwise exhausts the
+decoders frame pool.
+@example
+scale_cuda=passthrough=0
+@end example
+@end itemize
+
+@subsection yadif_cuda
+
+Deinterlace the input video using the @ref{yadif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+
+@table @option
+
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@item 2, send_frame_nospatial
+Like @code{send_frame}, but it skips the spatial interlacing check.
+@item 3, send_field_nospatial
+Like @code{send_field}, but it skips the spatial interlacing check.
+@end table
+
+The default value is @code{send_frame}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@section CUDA NPP
+Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
+
+To enable compilation of these filters you need to configure FFmpeg with @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.
+
+@anchor{scale_npp_section}
+@subsection scale_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
+format conversion on CUDA video frames. Setting the output width and height
+works in the same way as for the @var{scale} filter.
+
+The following additional options are accepted:
+@table @option
+@item format
+The pixel format of the output CUDA frames. If set to the string "same" (the
+default), the input format will be kept. Note that automatic format negotiation
+and conversion is not yet supported for hardware frames
+
+@item interp_algo
+The interpolation algorithm used for resizing. One of the following:
+@table @option
+@item nn
+Nearest neighbour.
+
+@item linear
+@item cubic
+@item cubic2p_bspline
+2-parameter cubic (B=1, C=0)
+
+@item cubic2p_catmullrom
+2-parameter cubic (B=0, C=1/2)
+
+@item cubic2p_b05c03
+2-parameter cubic (B=1/2, C=3/10)
+
+@item super
+Supersampling
+
+@item lanczos
+@end table
+
+@item force_original_aspect_ratio
+Enable decreasing or increasing output video width or height if necessary to
+keep the original aspect ratio. Possible values:
+
+@table @samp
+@item disable
+Scale the video as specified and disable this feature.
+
+@item decrease
+The output video dimensions will automatically be decreased if needed.
+
+@item increase
+The output video dimensions will automatically be increased if needed.
+
+@end table
+
+One useful instance of this option is that when you know a specific device's
+maximum allowed resolution, you can use this to limit the output video to
+that, while retaining the aspect ratio. For example, device A allows
+1280x720 playback, and your video is 1920x800. Using this option (set it to
+decrease) and specifying 1280x720 to the command line makes the output
+1280x533.
+
+Please note that this is a different thing than specifying -1 for @option{w}
+or @option{h}, you still need to specify the output resolution for this option
+to work.
+
+@item force_divisible_by
+Ensures that both the output dimensions, width and height, are divisible by the
+given integer when used together with @option{force_original_aspect_ratio}. This
+works similar to using @code{-n} in the @option{w} and @option{h} options.
+
+This option respects the value set for @option{force_original_aspect_ratio},
+increasing or decreasing the resolution accordingly. The video's aspect ratio
+may be slightly modified.
+
+This option can be handy if you need to have a video fit within or exceed
+a defined resolution using @option{force_original_aspect_ratio} but also have
+encoder restrictions on width or height divisibility.
+
+@item eval
+Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
+
+@table @samp
+@item init
+Only evaluate expressions once during the filter initialization or when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame.
+
+@end table
+
+@end table
+
+The values of the @option{w} and @option{h} options are expressions
+containing the following constants:
+
+@table @var
+@item in_w
+@item in_h
+The input width and height
+
+@item iw
+@item ih
+These are the same as @var{in_w} and @var{in_h}.
+
+@item out_w
+@item out_h
+The output (scaled) width and height
+
+@item ow
+@item oh
+These are the same as @var{out_w} and @var{out_h}
+
+@item a
+The same as @var{iw} / @var{ih}
+
+@item sar
+input sample aspect ratio
+
+@item dar
+The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
+
+@item n
+The (sequential) number of the input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item t
+The presentation timestamp of the input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item pos
+The position (byte offset) of the frame in the input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+Deprecated, do not use.
+@end table
+
+@subsection scale2ref_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
+video, based on a reference video.
+
+See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
+but uses the reference video instead of the main input as basis. scale2ref_npp
+also supports the following additional constants for the @option{w} and
+@option{h} options:
+
+@table @var
+@item main_w
+@item main_h
+The main input video's width and height
+
+@item main_a
+The same as @var{main_w} / @var{main_h}
+
+@item main_sar
+The main input video's sample aspect ratio
+
+@item main_dar, mdar
+The main input video's display aspect ratio. Calculated from
+@code{(main_w / main_h) * main_sar}.
+
+@item main_n
+The (sequential) number of the main input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item main_t
+The presentation timestamp of the main input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item main_pos
+The position (byte offset) of the frame in the main input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale a subtitle stream (b) to match the main video (a) in size before overlaying
+@example
+'scale2ref_npp[b][a];[a][b]overlay_cuda'
+@end example
+
+@item
+Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
+@example
+[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
+@end example
+@end itemize
+
+@subsection sharpen_npp
+Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
+border control.
+
+The following additional options are accepted:
+@table @option
+
+@item border_type
+Type of sampling to be used ad frame borders. One of the following:
+@table @option
+
+@item replicate
+Replicate pixel values.
+
+@end table
+@end table
+
+@subsection transpose_npp
+
+Transpose rows with columns in the input video and optionally flip it.
+For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
+
+It accepts the following parameters:
+
+@table @option
+
+@item dir
+Specify the transposition direction.
+
+Can assume the following values:
+@table @samp
+@item cclock_flip
+Rotate by 90 degrees counterclockwise and vertically flip. (default)
+
+@item clock
+Rotate by 90 degrees clockwise.
+
+@item cclock
+Rotate by 90 degrees counterclockwise.
+
+@item clock_flip
+Rotate by 90 degrees clockwise and vertically flip.
+@end table
+
+@item passthrough
+Do not apply the transposition if the input geometry matches the one
+specified by the specified value. It accepts the following values:
+@table @samp
+@item none
+Always apply transposition. (default)
+@item portrait
+Preserve portrait geometry (when @var{height} >= @var{width}).
+@item landscape
+Preserve landscape geometry (when @var{width} >= @var{height}).
+@end table
+
+@end table
+
+@c man end CUDA Video Filters
+
+
 @chapter OpenCL Video Filters
 @c man begin OPENCL VIDEO FILTERS
 
-- 
2.39.5 (Apple Git-154)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [FFmpeg-devel] [PATCH 2/2] doc/filters: Remove redundant *_cuda and *_npp filters since they are already in CUDA Video Filters section
  2025-01-28 19:12   ` [FFmpeg-devel] [PATCH 1/2] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters Danil Iashchenko
@ 2025-01-28 19:12     ` Danil Iashchenko
  2025-02-02  6:58       ` Gyan Doshi
  0 siblings, 1 reply; 12+ messages in thread
From: Danil Iashchenko @ 2025-01-28 19:12 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Danil Iashchenko

---
 doc/filters.texi | 630 +----------------------------------------------
 1 file changed, 10 insertions(+), 620 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index c4f312d2b8..28be8920fd 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
 
 This filter supports the all above options as @ref{commands}.
 
-@section bilateral_cuda
-CUDA accelerated bilateral filter, an edge preserving filter.
-This filter is mathematically accurate thanks to the use of GPU acceleration.
-For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
-
-The filter accepts the following options:
-@table @option
-@item sigmaS
-Set sigma of gaussian function to calculate spatial weight, also called sigma space.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item sigmaR
-Set sigma of gaussian function to calculate color range weight, also called sigma color.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item window_size
-Set window size of the bilateral function to determine the number of neighbours to loop on.
-If the number entered is even, one will be added automatically.
-Allowed range is 1 to 255. Default is 1.
-@end table
-@subsection Examples
-
-@itemize
-@item
-Apply the bilateral filter on a video.
-
-@example
-./ffmpeg -v verbose \
--hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
--init_hw_device cuda \
--filter_complex \
-" \
-[0:v]scale_cuda=format=yuv444p[scaled_video];
-[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
--an -sn -c:v h264_nvenc -cq 20 out.mp4
-@end example
-
-@end itemize
-
 @section bitplanenoise
 
 Show and measure bit plane noise.
@@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
 The default value is @code{all}.
 @end table
 
-@section bwdif_cuda
-
-Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-@table @option
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@end table
-
-The default value is @code{send_field}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
 @section ccrepack
 
 Repack CEA-708 closed captioning side data
@@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
 @end example
 @end itemize
 
-@section chromakey_cuda
-CUDA accelerated YUV colorspace color/chroma keying.
-
-This filter works like normal chromakey filter but operates on CUDA frames.
-for more details and parameters see @ref{chromakey}.
-
-@subsection Examples
-
-@itemize
-@item
-Make all the green pixels in the input video transparent and use it as an overlay for another video:
-
-@example
-./ffmpeg \
-    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
-    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
-    -init_hw_device cuda \
-    -filter_complex \
-    " \
-        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
-        [1:v]scale_cuda=format=yuv420p[base]; \
-        [base][overlay_video]overlay_cuda" \
-    -an -sn -c:v h264_nvenc -cq 20 output.mp4
-@end example
-
-@item
-Process two software sources, explicitly uploading the frames:
-
-@example
-./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
-    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
-    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
-    -filter_complex \
-    " \
-        [0]hwupload[under]; \
-        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
-        [under][over]overlay_cuda" \
-    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
-@end example
-
-@end itemize
-
 @section chromanr
 Reduce chrominance noise.
 
@@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
 colorspace=smpte240m
 @end example
 
-@section colorspace_cuda
-
-CUDA accelerated implementation of the colorspace filter.
-
-It is by no means feature complete compared to the software colorspace filter,
-and at the current time only supports color range conversion between jpeg/full
-and mpeg/limited range.
-
-The filter accepts the following options:
-
-@table @option
-@item range
-Specify output color range.
-
-The accepted values are:
-@table @samp
-@item tv
-TV (restricted) range
-
-@item mpeg
-MPEG (restricted) range
-
-@item pc
-PC (full) range
-
-@item jpeg
-JPEG (full) range
-
-@end table
-
-@end table
-
 @section colortemperature
 Adjust color temperature in video to simulate variations in ambient color temperature.
 
@@ -18977,84 +18812,6 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
 
 @end itemize
 
-@anchor{overlay_cuda}
-@section overlay_cuda
-
-Overlay one video on top of another.
-
-This is the CUDA variant of the @ref{overlay} filter.
-It only accepts CUDA frames. The underlying input pixel formats have to match.
-
-It takes two inputs and has one output. The first input is the "main"
-video on which the second input is overlaid.
-
-It accepts the following parameters:
-
-@table @option
-@item x
-@item y
-Set expressions for the x and y coordinates of the overlaid video
-on the main video.
-
-They can contain the following parameters:
-
-@table @option
-
-@item main_w, W
-@item main_h, H
-The main input width and height.
-
-@item overlay_w, w
-@item overlay_h, h
-The overlay input width and height.
-
-@item x
-@item y
-The computed values for @var{x} and @var{y}. They are evaluated for
-each new frame.
-
-@item n
-The ordinal index of the main input frame, starting from 0.
-
-@item pos
-The byte offset position in the file of the main input frame, NAN if unknown.
-Deprecated, do not use.
-
-@item t
-The timestamp of the main input frame, expressed in seconds, NAN if unknown.
-
-@end table
-
-Default value is "0" for both expressions.
-
-@item eval
-Set when the expressions for @option{x} and @option{y} are evaluated.
-
-It accepts the following values:
-@table @option
-@item init
-Evaluate expressions once during filter initialization or
-when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame
-@end table
-
-Default value is @option{frame}.
-
-@item eof_action
-See @ref{framesync}.
-
-@item shortest
-See @ref{framesync}.
-
-@item repeatlast
-See @ref{framesync}.
-
-@end table
-
-This filter also supports the @ref{framesync} options.
-
 @section owdenoise
 
 Apply Overcomplete Wavelet denoiser.
@@ -21479,75 +21236,14 @@ If the specified expression is not valid, it is kept at its current
 value.
 @end table
 
-@anchor{scale_cuda}
-@section scale_cuda
-
-Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
-Setting the output width and height works in the same way as for the @ref{scale} filter.
-
-The filter accepts the following options:
-@table @option
-@item w
-@item h
-Set the output video dimension expression. Default value is the input dimension.
+@subsection Examples
 
-Allows for the same expressions as the @ref{scale} filter.
-
-@item interp_algo
-Sets the algorithm used for scaling:
-
-@table @var
-@item nearest
-Nearest neighbour
-
-Used by default if input parameters match the desired output.
-
-@item bilinear
-Bilinear
-
-@item bicubic
-Bicubic
-
-This is the default.
-
-@item lanczos
-Lanczos
-
-@end table
-
-@item format
-Controls the output pixel format. By default, or if none is specified, the input
-pixel format is used.
-
-The filter does not support converting between YUV and RGB pixel formats.
-
-@item passthrough
-If set to 0, every frame is processed, even if no conversion is necessary.
-This mode can be useful to use the filter as a buffer for a downstream
-frame-consumer that exhausts the limited decoder frame pool.
-
-If set to 1, frames are passed through as-is if they match the desired output
-parameters. This is the default behaviour.
-
-@item param
-Algorithm-Specific parameter.
-
-Affects the curves of the bicubic algorithm.
-
-@item force_original_aspect_ratio
-@item force_divisible_by
-Work the same as the identical @ref{scale} filter options.
-
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
-@example
-scale_cuda=-2:720:format=yuv420p
-@end example
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
 
 @item
 Upscale to 4K using nearest neighbour algorithm.
@@ -21564,196 +21260,6 @@ scale_cuda=passthrough=0
 @end example
 @end itemize
 
-@anchor{scale_npp}
-@section scale_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
-format conversion on CUDA video frames. Setting the output width and height
-works in the same way as for the @var{scale} filter.
-
-The following additional options are accepted:
-@table @option
-@item format
-The pixel format of the output CUDA frames. If set to the string "same" (the
-default), the input format will be kept. Note that automatic format negotiation
-and conversion is not yet supported for hardware frames
-
-@item interp_algo
-The interpolation algorithm used for resizing. One of the following:
-@table @option
-@item nn
-Nearest neighbour.
-
-@item linear
-@item cubic
-@item cubic2p_bspline
-2-parameter cubic (B=1, C=0)
-
-@item cubic2p_catmullrom
-2-parameter cubic (B=0, C=1/2)
-
-@item cubic2p_b05c03
-2-parameter cubic (B=1/2, C=3/10)
-
-@item super
-Supersampling
-
-@item lanczos
-@end table
-
-@item force_original_aspect_ratio
-Enable decreasing or increasing output video width or height if necessary to
-keep the original aspect ratio. Possible values:
-
-@table @samp
-@item disable
-Scale the video as specified and disable this feature.
-
-@item decrease
-The output video dimensions will automatically be decreased if needed.
-
-@item increase
-The output video dimensions will automatically be increased if needed.
-
-@end table
-
-One useful instance of this option is that when you know a specific device's
-maximum allowed resolution, you can use this to limit the output video to
-that, while retaining the aspect ratio. For example, device A allows
-1280x720 playback, and your video is 1920x800. Using this option (set it to
-decrease) and specifying 1280x720 to the command line makes the output
-1280x533.
-
-Please note that this is a different thing than specifying -1 for @option{w}
-or @option{h}, you still need to specify the output resolution for this option
-to work.
-
-@item force_divisible_by
-Ensures that both the output dimensions, width and height, are divisible by the
-given integer when used together with @option{force_original_aspect_ratio}. This
-works similar to using @code{-n} in the @option{w} and @option{h} options.
-
-This option respects the value set for @option{force_original_aspect_ratio},
-increasing or decreasing the resolution accordingly. The video's aspect ratio
-may be slightly modified.
-
-This option can be handy if you need to have a video fit within or exceed
-a defined resolution using @option{force_original_aspect_ratio} but also have
-encoder restrictions on width or height divisibility.
-
-@item eval
-Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
-
-@table @samp
-@item init
-Only evaluate expressions once during the filter initialization or when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame.
-
-@end table
-
-@end table
-
-The values of the @option{w} and @option{h} options are expressions
-containing the following constants:
-
-@table @var
-@item in_w
-@item in_h
-The input width and height
-
-@item iw
-@item ih
-These are the same as @var{in_w} and @var{in_h}.
-
-@item out_w
-@item out_h
-The output (scaled) width and height
-
-@item ow
-@item oh
-These are the same as @var{out_w} and @var{out_h}
-
-@item a
-The same as @var{iw} / @var{ih}
-
-@item sar
-input sample aspect ratio
-
-@item dar
-The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
-
-@item n
-The (sequential) number of the input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item t
-The presentation timestamp of the input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item pos
-The position (byte offset) of the frame in the input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-Deprecated, do not use.
-@end table
-
-@section scale2ref_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
-video, based on a reference video.
-
-See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
-but uses the reference video instead of the main input as basis. scale2ref_npp
-also supports the following additional constants for the @option{w} and
-@option{h} options:
-
-@table @var
-@item main_w
-@item main_h
-The main input video's width and height
-
-@item main_a
-The same as @var{main_w} / @var{main_h}
-
-@item main_sar
-The main input video's sample aspect ratio
-
-@item main_dar, mdar
-The main input video's display aspect ratio. Calculated from
-@code{(main_w / main_h) * main_sar}.
-
-@item main_n
-The (sequential) number of the main input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item main_t
-The presentation timestamp of the main input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item main_pos
-The position (byte offset) of the frame in the main input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale a subtitle stream (b) to match the main video (a) in size before overlaying
-@example
-'scale2ref_npp[b][a];[a][b]overlay_cuda'
-@end example
-
-@item
-Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
-@example
-[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
-@end example
-@end itemize
-
 @section scale_vt
 
 Scale and convert the color parameters using VTPixelTransferSession.
@@ -22200,23 +21706,6 @@ Keep the same chroma location (default).
 @end table
 @end table
 
-@section sharpen_npp
-Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
-border control.
-
-The following additional options are accepted:
-@table @option
-
-@item border_type
-Type of sampling to be used ad frame borders. One of the following:
-@table @option
-
-@item replicate
-Replicate pixel values.
-
-@end table
-@end table
-
 @section shear
 Apply shear transform to input video.
 
@@ -24304,47 +23793,6 @@ The command above can also be specified as:
 transpose=1:portrait
 @end example
 
-@section transpose_npp
-
-Transpose rows with columns in the input video and optionally flip it.
-For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
-
-It accepts the following parameters:
-
-@table @option
-
-@item dir
-Specify the transposition direction.
-
-Can assume the following values:
-@table @samp
-@item cclock_flip
-Rotate by 90 degrees counterclockwise and vertically flip. (default)
-
-@item clock
-Rotate by 90 degrees clockwise.
-
-@item cclock
-Rotate by 90 degrees counterclockwise.
-
-@item clock_flip
-Rotate by 90 degrees clockwise and vertically flip.
-@end table
-
-@item passthrough
-Do not apply the transposition if the input geometry matches the one
-specified by the specified value. It accepts the following values:
-@table @samp
-@item none
-Always apply transposition. (default)
-@item portrait
-Preserve portrait geometry (when @var{height} >= @var{width}).
-@item landscape
-Preserve landscape geometry (when @var{width} >= @var{height}).
-@end table
-
-@end table
-
 @section trim
 Trim the input so that the output contains one continuous subpart of the input.
 
@@ -26362,64 +25810,6 @@ filter").
 It accepts the following parameters:
 
 
-@table @option
-
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@item 2, send_frame_nospatial
-Like @code{send_frame}, but it skips the spatial interlacing check.
-@item 3, send_field_nospatial
-Like @code{send_field}, but it skips the spatial interlacing check.
-@end table
-
-The default value is @code{send_frame}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
-@section yadif_cuda
-
-Deinterlace the input video using the @ref{yadif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-
 @table @option
 
 @item mode
@@ -27100,7 +26490,7 @@ JPEG (full) range
 
 @end table
 
-@anchor{overlay_cuda_section}
+@anchor{overlay_cuda}
 @subsection overlay_cuda
 
 Overlay one video on top of another.
@@ -27178,7 +26568,7 @@ See @ref{framesync}.
 
 This filter also supports the @ref{framesync} options.
 
-@anchor{scale_cuda_section}
+@anchor{scale_cuda}
 @subsection scale_cuda
 
 Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
@@ -27326,7 +26716,7 @@ Below is a description of the currently available NVIDIA Performance Primitives
 
 To enable compilation of these filters you need to configure FFmpeg with @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.
 
-@anchor{scale_npp_section}
+@anchor{scale_npp}
 @subsection scale_npp
 
 Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
-- 
2.39.5 (Apple Git-154)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [FFmpeg-devel] [PATCH 2/2] doc/filters: Remove redundant *_cuda and *_npp filters since they are already in CUDA Video Filters section
  2025-01-28 19:12     ` [FFmpeg-devel] [PATCH 2/2] doc/filters: Remove redundant *_cuda and *_npp filters since they are already in CUDA Video Filters section Danil Iashchenko
@ 2025-02-02  6:58       ` Gyan Doshi
  2025-02-02 13:23         ` [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters Danil Iashchenko
  0 siblings, 1 reply; 12+ messages in thread
From: Gyan Doshi @ 2025-02-02  6:58 UTC (permalink / raw)
  To: ffmpeg-devel



On 2025-01-29 12:42 am, Danil Iashchenko wrote:
> ---
>   doc/filters.texi | 630 +----------------------------------------------
>   1 file changed, 10 insertions(+), 620 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index c4f312d2b8..28be8920fd 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
>   
>   This filter supports the all above options as @ref{commands}.
The removal should happen together with the shifting.

Regards,
Gyan

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-02-02  6:58       ` Gyan Doshi
@ 2025-02-02 13:23         ` Danil Iashchenko
  2025-02-04  5:40           ` Gyan Doshi
  0 siblings, 1 reply; 12+ messages in thread
From: Danil Iashchenko @ 2025-02-02 13:23 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Danil Iashchenko

---
 doc/filters.texi | 1323 ++++++++++++++++++++++++----------------------
 1 file changed, 700 insertions(+), 623 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index c2817b2661..7460b7ef18 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
 
 This filter supports the all above options as @ref{commands}.
 
-@section bilateral_cuda
-CUDA accelerated bilateral filter, an edge preserving filter.
-This filter is mathematically accurate thanks to the use of GPU acceleration.
-For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
-
-The filter accepts the following options:
-@table @option
-@item sigmaS
-Set sigma of gaussian function to calculate spatial weight, also called sigma space.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item sigmaR
-Set sigma of gaussian function to calculate color range weight, also called sigma color.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item window_size
-Set window size of the bilateral function to determine the number of neighbours to loop on.
-If the number entered is even, one will be added automatically.
-Allowed range is 1 to 255. Default is 1.
-@end table
-@subsection Examples
-
-@itemize
-@item
-Apply the bilateral filter on a video.
-
-@example
-./ffmpeg -v verbose \
--hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
--init_hw_device cuda \
--filter_complex \
-" \
-[0:v]scale_cuda=format=yuv444p[scaled_video];
-[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
--an -sn -c:v h264_nvenc -cq 20 out.mp4
-@end example
-
-@end itemize
-
 @section bitplanenoise
 
 Show and measure bit plane noise.
@@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
 The default value is @code{all}.
 @end table
 
-@section bwdif_cuda
-
-Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-@table @option
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@end table
-
-The default value is @code{send_field}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
 @section ccrepack
 
 Repack CEA-708 closed captioning side data
@@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
 @end example
 @end itemize
 
-@section chromakey_cuda
-CUDA accelerated YUV colorspace color/chroma keying.
-
-This filter works like normal chromakey filter but operates on CUDA frames.
-for more details and parameters see @ref{chromakey}.
-
-@subsection Examples
-
-@itemize
-@item
-Make all the green pixels in the input video transparent and use it as an overlay for another video:
-
-@example
-./ffmpeg \
-    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
-    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
-    -init_hw_device cuda \
-    -filter_complex \
-    " \
-        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
-        [1:v]scale_cuda=format=yuv420p[base]; \
-        [base][overlay_video]overlay_cuda" \
-    -an -sn -c:v h264_nvenc -cq 20 output.mp4
-@end example
-
-@item
-Process two software sources, explicitly uploading the frames:
-
-@example
-./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
-    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
-    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
-    -filter_complex \
-    " \
-        [0]hwupload[under]; \
-        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
-        [under][over]overlay_cuda" \
-    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
-@end example
-
-@end itemize
-
 @section chromanr
 Reduce chrominance noise.
 
@@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
 colorspace=smpte240m
 @end example
 
-@section colorspace_cuda
-
-CUDA accelerated implementation of the colorspace filter.
-
-It is by no means feature complete compared to the software colorspace filter,
-and at the current time only supports color range conversion between jpeg/full
-and mpeg/limited range.
-
-The filter accepts the following options:
-
-@table @option
-@item range
-Specify output color range.
-
-The accepted values are:
-@table @samp
-@item tv
-TV (restricted) range
-
-@item mpeg
-MPEG (restricted) range
-
-@item pc
-PC (full) range
-
-@item jpeg
-JPEG (full) range
-
-@end table
-
-@end table
-
 @section colortemperature
 Adjust color temperature in video to simulate variations in ambient color temperature.
 
@@ -18977,84 +18812,6 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
 
 @end itemize
 
-@anchor{overlay_cuda}
-@section overlay_cuda
-
-Overlay one video on top of another.
-
-This is the CUDA variant of the @ref{overlay} filter.
-It only accepts CUDA frames. The underlying input pixel formats have to match.
-
-It takes two inputs and has one output. The first input is the "main"
-video on which the second input is overlaid.
-
-It accepts the following parameters:
-
-@table @option
-@item x
-@item y
-Set expressions for the x and y coordinates of the overlaid video
-on the main video.
-
-They can contain the following parameters:
-
-@table @option
-
-@item main_w, W
-@item main_h, H
-The main input width and height.
-
-@item overlay_w, w
-@item overlay_h, h
-The overlay input width and height.
-
-@item x
-@item y
-The computed values for @var{x} and @var{y}. They are evaluated for
-each new frame.
-
-@item n
-The ordinal index of the main input frame, starting from 0.
-
-@item pos
-The byte offset position in the file of the main input frame, NAN if unknown.
-Deprecated, do not use.
-
-@item t
-The timestamp of the main input frame, expressed in seconds, NAN if unknown.
-
-@end table
-
-Default value is "0" for both expressions.
-
-@item eval
-Set when the expressions for @option{x} and @option{y} are evaluated.
-
-It accepts the following values:
-@table @option
-@item init
-Evaluate expressions once during filter initialization or
-when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame
-@end table
-
-Default value is @option{frame}.
-
-@item eof_action
-See @ref{framesync}.
-
-@item shortest
-See @ref{framesync}.
-
-@item repeatlast
-See @ref{framesync}.
-
-@end table
-
-This filter also supports the @ref{framesync} options.
-
 @section owdenoise
 
 Apply Overcomplete Wavelet denoiser.
@@ -21479,75 +21236,14 @@ If the specified expression is not valid, it is kept at its current
 value.
 @end table
 
-@anchor{scale_cuda}
-@section scale_cuda
-
-Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
-Setting the output width and height works in the same way as for the @ref{scale} filter.
-
-The filter accepts the following options:
-@table @option
-@item w
-@item h
-Set the output video dimension expression. Default value is the input dimension.
+@subsection Examples
 
-Allows for the same expressions as the @ref{scale} filter.
-
-@item interp_algo
-Sets the algorithm used for scaling:
-
-@table @var
-@item nearest
-Nearest neighbour
-
-Used by default if input parameters match the desired output.
-
-@item bilinear
-Bilinear
-
-@item bicubic
-Bicubic
-
-This is the default.
-
-@item lanczos
-Lanczos
-
-@end table
-
-@item format
-Controls the output pixel format. By default, or if none is specified, the input
-pixel format is used.
-
-The filter does not support converting between YUV and RGB pixel formats.
-
-@item passthrough
-If set to 0, every frame is processed, even if no conversion is necessary.
-This mode can be useful to use the filter as a buffer for a downstream
-frame-consumer that exhausts the limited decoder frame pool.
-
-If set to 1, frames are passed through as-is if they match the desired output
-parameters. This is the default behaviour.
-
-@item param
-Algorithm-Specific parameter.
-
-Affects the curves of the bicubic algorithm.
-
-@item force_original_aspect_ratio
-@item force_divisible_by
-Work the same as the identical @ref{scale} filter options.
-
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
-@example
-scale_cuda=-2:720:format=yuv420p
-@end example
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
 
 @item
 Upscale to 4K using nearest neighbour algorithm.
@@ -21564,196 +21260,6 @@ scale_cuda=passthrough=0
 @end example
 @end itemize
 
-@anchor{scale_npp}
-@section scale_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
-format conversion on CUDA video frames. Setting the output width and height
-works in the same way as for the @var{scale} filter.
-
-The following additional options are accepted:
-@table @option
-@item format
-The pixel format of the output CUDA frames. If set to the string "same" (the
-default), the input format will be kept. Note that automatic format negotiation
-and conversion is not yet supported for hardware frames
-
-@item interp_algo
-The interpolation algorithm used for resizing. One of the following:
-@table @option
-@item nn
-Nearest neighbour.
-
-@item linear
-@item cubic
-@item cubic2p_bspline
-2-parameter cubic (B=1, C=0)
-
-@item cubic2p_catmullrom
-2-parameter cubic (B=0, C=1/2)
-
-@item cubic2p_b05c03
-2-parameter cubic (B=1/2, C=3/10)
-
-@item super
-Supersampling
-
-@item lanczos
-@end table
-
-@item force_original_aspect_ratio
-Enable decreasing or increasing output video width or height if necessary to
-keep the original aspect ratio. Possible values:
-
-@table @samp
-@item disable
-Scale the video as specified and disable this feature.
-
-@item decrease
-The output video dimensions will automatically be decreased if needed.
-
-@item increase
-The output video dimensions will automatically be increased if needed.
-
-@end table
-
-One useful instance of this option is that when you know a specific device's
-maximum allowed resolution, you can use this to limit the output video to
-that, while retaining the aspect ratio. For example, device A allows
-1280x720 playback, and your video is 1920x800. Using this option (set it to
-decrease) and specifying 1280x720 to the command line makes the output
-1280x533.
-
-Please note that this is a different thing than specifying -1 for @option{w}
-or @option{h}, you still need to specify the output resolution for this option
-to work.
-
-@item force_divisible_by
-Ensures that both the output dimensions, width and height, are divisible by the
-given integer when used together with @option{force_original_aspect_ratio}. This
-works similar to using @code{-n} in the @option{w} and @option{h} options.
-
-This option respects the value set for @option{force_original_aspect_ratio},
-increasing or decreasing the resolution accordingly. The video's aspect ratio
-may be slightly modified.
-
-This option can be handy if you need to have a video fit within or exceed
-a defined resolution using @option{force_original_aspect_ratio} but also have
-encoder restrictions on width or height divisibility.
-
-@item eval
-Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
-
-@table @samp
-@item init
-Only evaluate expressions once during the filter initialization or when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame.
-
-@end table
-
-@end table
-
-The values of the @option{w} and @option{h} options are expressions
-containing the following constants:
-
-@table @var
-@item in_w
-@item in_h
-The input width and height
-
-@item iw
-@item ih
-These are the same as @var{in_w} and @var{in_h}.
-
-@item out_w
-@item out_h
-The output (scaled) width and height
-
-@item ow
-@item oh
-These are the same as @var{out_w} and @var{out_h}
-
-@item a
-The same as @var{iw} / @var{ih}
-
-@item sar
-input sample aspect ratio
-
-@item dar
-The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
-
-@item n
-The (sequential) number of the input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item t
-The presentation timestamp of the input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item pos
-The position (byte offset) of the frame in the input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-Deprecated, do not use.
-@end table
-
-@section scale2ref_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
-video, based on a reference video.
-
-See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
-but uses the reference video instead of the main input as basis. scale2ref_npp
-also supports the following additional constants for the @option{w} and
-@option{h} options:
-
-@table @var
-@item main_w
-@item main_h
-The main input video's width and height
-
-@item main_a
-The same as @var{main_w} / @var{main_h}
-
-@item main_sar
-The main input video's sample aspect ratio
-
-@item main_dar, mdar
-The main input video's display aspect ratio. Calculated from
-@code{(main_w / main_h) * main_sar}.
-
-@item main_n
-The (sequential) number of the main input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item main_t
-The presentation timestamp of the main input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item main_pos
-The position (byte offset) of the frame in the main input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale a subtitle stream (b) to match the main video (a) in size before overlaying
-@example
-'scale2ref_npp[b][a];[a][b]overlay_cuda'
-@end example
-
-@item
-Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
-@example
-[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
-@end example
-@end itemize
-
 @section scale_vt
 
 Scale and convert the color parameters using VTPixelTransferSession.
@@ -22200,32 +21706,15 @@ Keep the same chroma location (default).
 @end table
 @end table
 
-@section sharpen_npp
-Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
-border control.
+@section shear
+Apply shear transform to input video.
 
-The following additional options are accepted:
-@table @option
+This filter supports the following options:
 
-@item border_type
-Type of sampling to be used ad frame borders. One of the following:
 @table @option
-
-@item replicate
-Replicate pixel values.
-
-@end table
-@end table
-
-@section shear
-Apply shear transform to input video.
-
-This filter supports the following options:
-
-@table @option
-@item shx
-Shear factor in X-direction. Default value is 0.
-Allowed range is from -2 to 2.
+@item shx
+Shear factor in X-direction. Default value is 0.
+Allowed range is from -2 to 2.
 
 @item shy
 Shear factor in Y-direction. Default value is 0.
@@ -24304,47 +23793,6 @@ The command above can also be specified as:
 transpose=1:portrait
 @end example
 
-@section transpose_npp
-
-Transpose rows with columns in the input video and optionally flip it.
-For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
-
-It accepts the following parameters:
-
-@table @option
-
-@item dir
-Specify the transposition direction.
-
-Can assume the following values:
-@table @samp
-@item cclock_flip
-Rotate by 90 degrees counterclockwise and vertically flip. (default)
-
-@item clock
-Rotate by 90 degrees clockwise.
-
-@item cclock
-Rotate by 90 degrees counterclockwise.
-
-@item clock_flip
-Rotate by 90 degrees clockwise and vertically flip.
-@end table
-
-@item passthrough
-Do not apply the transposition if the input geometry matches the one
-specified by the specified value. It accepts the following values:
-@table @samp
-@item none
-Always apply transposition. (default)
-@item portrait
-Preserve portrait geometry (when @var{height} >= @var{width}).
-@item landscape
-Preserve landscape geometry (when @var{width} >= @var{height}).
-@end table
-
-@end table
-
 @section trim
 Trim the input so that the output contains one continuous subpart of the input.
 
@@ -26362,64 +25810,6 @@ filter").
 It accepts the following parameters:
 
 
-@table @option
-
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@item 2, send_frame_nospatial
-Like @code{send_frame}, but it skips the spatial interlacing check.
-@item 3, send_field_nospatial
-Like @code{send_field}, but it skips the spatial interlacing check.
-@end table
-
-The default value is @code{send_frame}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
-@section yadif_cuda
-
-Deinterlace the input video using the @ref{yadif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-
 @table @option
 
 @item mode
@@ -26890,6 +26280,693 @@ value.
 
 @c man end VIDEO FILTERS
 
+@chapter CUDA Video Filters
+@c man begin CUDA Video Filters
+
+To enable compilation of these filters you need to configure FFmpeg with
+@code{--enable-cuda-nvcc} and/or @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.
+
+Running CUDA filters requires you to initialize a hardware device and to pass that device to all filters in any filter graph.
+@table @option
+
+@item -init_hw_device cuda[=@var{name}][:@var{device}[,@var{key=value}...]]
+Initialise a new hardware device of type @var{cuda} called @var{name}, using the
+given device parameters.
+
+@item -filter_hw_device @var{name}
+Pass the hardware device called @var{name} to all filters in any filter graph.
+
+@end table
+
+For more detailed information see @url{https://www.ffmpeg.org/ffmpeg.html#Advanced-Video-options}
+
+@itemize
+@item
+Example of initializing second CUDA device on the system and running scale_cuda and bilateral_cuda filters.
+@example
+./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -init_hw_device cuda:1 -filter_complex \
+"[0:v]scale_cuda=format=yuv444p[scaled_video];[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+@end itemize
+
+Since CUDA filters operate exclusively on GPU memory, frame data must sometimes be uploaded (@ref{hwupload}) to hardware surfaces associated with the appropriate CUDA device before processing, and downloaded (@ref{hwdownload}) back to normal memory afterward, if required. Whether @ref{hwupload} or @ref{hwdownload} is necessary depends on the specific workflow:
+
+@itemize
+@item If the input frames are already in GPU memory (e.g., when using @code{-hwaccel cuda} or @code{-hwaccel_output_format cuda}), explicit use of @ref{hwupload} is not needed, as the data is already in the appropriate memory space.
+@item If the input frames are in CPU memory (e.g., software-decoded frames or frames processed by CPU-based filters), it is necessary to use @ref{hwupload} to transfer the data to GPU memory for CUDA processing.
+@item If the output of the CUDA filters needs to be further processed by software-based filters or saved in a format not supported by GPU-based encoders, @ref{hwdownload} is required to transfer the data back to CPU memory.
+@end itemize
+Note that @ref{hwupload} uploads data to a surface with the same layout as the software frame, so it may be necessary to add a @ref{format} filter immediately before @ref{hwupload} to ensure the input is in the correct format. Similarly, @ref{hwdownload} may not support all output formats, so an additional @ref{format} filter may need to be inserted immediately after @ref{hwdownload} in the filter graph to ensure compatibility.
+
+@section CUDA
+Below is a description of the currently available Nvidia CUDA video filters.
+
+To enable compilation of these filters you need to configure FFmpeg with
+@code{--enable-cuda-nvcc} and Nvidia CUDA Toolkit must be installed.
+
+@subsection bilateral_cuda
+CUDA accelerated bilateral filter, an edge preserving filter.
+This filter is mathematically accurate thanks to the use of GPU acceleration.
+For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
+
+The filter accepts the following options:
+@table @option
+@item sigmaS
+Set sigma of gaussian function to calculate spatial weight, also called sigma space.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item sigmaR
+Set sigma of gaussian function to calculate color range weight, also called sigma color.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item window_size
+Set window size of the bilateral function to determine the number of neighbours to loop on.
+If the number entered is even, one will be added automatically.
+Allowed range is 1 to 255. Default is 1.
+@end table
+@subsubsection Examples
+
+@itemize
+@item
+Apply the bilateral filter on a video.
+
+@example
+./ffmpeg -v verbose \
+-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
+-init_hw_device cuda \
+-filter_complex \
+" \
+[0:v]scale_cuda=format=yuv444p[scaled_video];
+[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+
+@end itemize
+
+@subsection bwdif_cuda
+
+Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+@table @option
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@end table
+
+The default value is @code{send_field}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@subsection chromakey_cuda
+CUDA accelerated YUV colorspace color/chroma keying.
+
+This filter works like normal chromakey filter but operates on CUDA frames.
+for more details and parameters see @ref{chromakey}.
+
+@subsubsection Examples
+
+@itemize
+@item
+Make all the green pixels in the input video transparent and use it as an overlay for another video:
+
+@example
+./ffmpeg \
+    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
+    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
+    -init_hw_device cuda \
+    -filter_complex \
+    " \
+        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
+        [1:v]scale_cuda=format=yuv420p[base]; \
+        [base][overlay_video]overlay_cuda" \
+    -an -sn -c:v h264_nvenc -cq 20 output.mp4
+@end example
+
+@item
+Process two software sources, explicitly uploading the frames:
+
+@example
+./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
+    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
+    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
+    -filter_complex \
+    " \
+        [0]hwupload[under]; \
+        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
+        [under][over]overlay_cuda" \
+    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
+@end example
+
+@end itemize
+
+@subsection colorspace_cuda
+
+CUDA accelerated implementation of the colorspace filter.
+
+It is by no means feature complete compared to the software colorspace filter,
+and at the current time only supports color range conversion between jpeg/full
+and mpeg/limited range.
+
+The filter accepts the following options:
+
+@table @option
+@item range
+Specify output color range.
+
+The accepted values are:
+@table @samp
+@item tv
+TV (restricted) range
+
+@item mpeg
+MPEG (restricted) range
+
+@item pc
+PC (full) range
+
+@item jpeg
+JPEG (full) range
+
+@end table
+
+@end table
+
+@anchor{overlay_cuda}
+@subsection overlay_cuda
+
+Overlay one video on top of another.
+
+This is the CUDA variant of the @ref{overlay} filter.
+It only accepts CUDA frames. The underlying input pixel formats have to match.
+
+It takes two inputs and has one output. The first input is the "main"
+video on which the second input is overlaid.
+
+It accepts the following parameters:
+
+@table @option
+@item x
+@item y
+Set expressions for the x and y coordinates of the overlaid video
+on the main video.
+
+They can contain the following parameters:
+
+@table @option
+
+@item main_w, W
+@item main_h, H
+The main input width and height.
+
+@item overlay_w, w
+@item overlay_h, h
+The overlay input width and height.
+
+@item x
+@item y
+The computed values for @var{x} and @var{y}. They are evaluated for
+each new frame.
+
+@item n
+The ordinal index of the main input frame, starting from 0.
+
+@item pos
+The byte offset position in the file of the main input frame, NAN if unknown.
+Deprecated, do not use.
+
+@item t
+The timestamp of the main input frame, expressed in seconds, NAN if unknown.
+
+@end table
+
+Default value is "0" for both expressions.
+
+@item eval
+Set when the expressions for @option{x} and @option{y} are evaluated.
+
+It accepts the following values:
+@table @option
+@item init
+Evaluate expressions once during filter initialization or
+when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame
+@end table
+
+Default value is @option{frame}.
+
+@item eof_action
+See @ref{framesync}.
+
+@item shortest
+See @ref{framesync}.
+
+@item repeatlast
+See @ref{framesync}.
+
+@end table
+
+This filter also supports the @ref{framesync} options.
+
+@anchor{scale_cuda}
+@subsection scale_cuda
+
+Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
+Setting the output width and height works in the same way as for the @ref{scale} filter.
+
+The filter accepts the following options:
+@table @option
+@item w
+@item h
+Set the output video dimension expression. Default value is the input dimension.
+
+Allows for the same expressions as the @ref{scale} filter.
+
+@item interp_algo
+Sets the algorithm used for scaling:
+
+@table @var
+@item nearest
+Nearest neighbour
+
+Used by default if input parameters match the desired output.
+
+@item bilinear
+Bilinear
+
+@item bicubic
+Bicubic
+
+This is the default.
+
+@item lanczos
+Lanczos
+
+@end table
+
+@item format
+Controls the output pixel format. By default, or if none is specified, the input
+pixel format is used.
+
+The filter does not support converting between YUV and RGB pixel formats.
+
+@item passthrough
+If set to 0, every frame is processed, even if no conversion is necessary.
+This mode can be useful to use the filter as a buffer for a downstream
+frame-consumer that exhausts the limited decoder frame pool.
+
+If set to 1, frames are passed through as-is if they match the desired output
+parameters. This is the default behaviour.
+
+@item param
+Algorithm-Specific parameter.
+
+Affects the curves of the bicubic algorithm.
+
+@item force_original_aspect_ratio
+@item force_divisible_by
+Work the same as the identical @ref{scale} filter options.
+
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
+
+@item
+Upscale to 4K using nearest neighbour algorithm.
+@example
+scale_cuda=4096:2160:interp_algo=nearest
+@end example
+
+@item
+Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
+This can be useful to deal with a filter and encode chain that otherwise exhausts the
+decoders frame pool.
+@example
+scale_cuda=passthrough=0
+@end example
+@end itemize
+
+@subsection yadif_cuda
+
+Deinterlace the input video using the @ref{yadif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+
+@table @option
+
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@item 2, send_frame_nospatial
+Like @code{send_frame}, but it skips the spatial interlacing check.
+@item 3, send_field_nospatial
+Like @code{send_field}, but it skips the spatial interlacing check.
+@end table
+
+The default value is @code{send_frame}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@section CUDA NPP
+Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
+
+To enable compilation of these filters you need to configure FFmpeg with @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.
+
+@anchor{scale_npp}
+@subsection scale_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
+format conversion on CUDA video frames. Setting the output width and height
+works in the same way as for the @var{scale} filter.
+
+The following additional options are accepted:
+@table @option
+@item format
+The pixel format of the output CUDA frames. If set to the string "same" (the
+default), the input format will be kept. Note that automatic format negotiation
+and conversion is not yet supported for hardware frames
+
+@item interp_algo
+The interpolation algorithm used for resizing. One of the following:
+@table @option
+@item nn
+Nearest neighbour.
+
+@item linear
+@item cubic
+@item cubic2p_bspline
+2-parameter cubic (B=1, C=0)
+
+@item cubic2p_catmullrom
+2-parameter cubic (B=0, C=1/2)
+
+@item cubic2p_b05c03
+2-parameter cubic (B=1/2, C=3/10)
+
+@item super
+Supersampling
+
+@item lanczos
+@end table
+
+@item force_original_aspect_ratio
+Enable decreasing or increasing output video width or height if necessary to
+keep the original aspect ratio. Possible values:
+
+@table @samp
+@item disable
+Scale the video as specified and disable this feature.
+
+@item decrease
+The output video dimensions will automatically be decreased if needed.
+
+@item increase
+The output video dimensions will automatically be increased if needed.
+
+@end table
+
+One useful instance of this option is that when you know a specific device's
+maximum allowed resolution, you can use this to limit the output video to
+that, while retaining the aspect ratio. For example, device A allows
+1280x720 playback, and your video is 1920x800. Using this option (set it to
+decrease) and specifying 1280x720 to the command line makes the output
+1280x533.
+
+Please note that this is a different thing than specifying -1 for @option{w}
+or @option{h}, you still need to specify the output resolution for this option
+to work.
+
+@item force_divisible_by
+Ensures that both the output dimensions, width and height, are divisible by the
+given integer when used together with @option{force_original_aspect_ratio}. This
+works similar to using @code{-n} in the @option{w} and @option{h} options.
+
+This option respects the value set for @option{force_original_aspect_ratio},
+increasing or decreasing the resolution accordingly. The video's aspect ratio
+may be slightly modified.
+
+This option can be handy if you need to have a video fit within or exceed
+a defined resolution using @option{force_original_aspect_ratio} but also have
+encoder restrictions on width or height divisibility.
+
+@item eval
+Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
+
+@table @samp
+@item init
+Only evaluate expressions once during the filter initialization or when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame.
+
+@end table
+
+@end table
+
+The values of the @option{w} and @option{h} options are expressions
+containing the following constants:
+
+@table @var
+@item in_w
+@item in_h
+The input width and height
+
+@item iw
+@item ih
+These are the same as @var{in_w} and @var{in_h}.
+
+@item out_w
+@item out_h
+The output (scaled) width and height
+
+@item ow
+@item oh
+These are the same as @var{out_w} and @var{out_h}
+
+@item a
+The same as @var{iw} / @var{ih}
+
+@item sar
+input sample aspect ratio
+
+@item dar
+The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
+
+@item n
+The (sequential) number of the input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item t
+The presentation timestamp of the input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item pos
+The position (byte offset) of the frame in the input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+Deprecated, do not use.
+@end table
+
+@subsection scale2ref_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
+video, based on a reference video.
+
+See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
+but uses the reference video instead of the main input as basis. scale2ref_npp
+also supports the following additional constants for the @option{w} and
+@option{h} options:
+
+@table @var
+@item main_w
+@item main_h
+The main input video's width and height
+
+@item main_a
+The same as @var{main_w} / @var{main_h}
+
+@item main_sar
+The main input video's sample aspect ratio
+
+@item main_dar, mdar
+The main input video's display aspect ratio. Calculated from
+@code{(main_w / main_h) * main_sar}.
+
+@item main_n
+The (sequential) number of the main input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item main_t
+The presentation timestamp of the main input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item main_pos
+The position (byte offset) of the frame in the main input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale a subtitle stream (b) to match the main video (a) in size before overlaying
+@example
+'scale2ref_npp[b][a];[a][b]overlay_cuda'
+@end example
+
+@item
+Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
+@example
+[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
+@end example
+@end itemize
+
+@subsection sharpen_npp
+Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
+border control.
+
+The following additional options are accepted:
+@table @option
+
+@item border_type
+Type of sampling to be used ad frame borders. One of the following:
+@table @option
+
+@item replicate
+Replicate pixel values.
+
+@end table
+@end table
+
+@subsection transpose_npp
+
+Transpose rows with columns in the input video and optionally flip it.
+For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
+
+It accepts the following parameters:
+
+@table @option
+
+@item dir
+Specify the transposition direction.
+
+Can assume the following values:
+@table @samp
+@item cclock_flip
+Rotate by 90 degrees counterclockwise and vertically flip. (default)
+
+@item clock
+Rotate by 90 degrees clockwise.
+
+@item cclock
+Rotate by 90 degrees counterclockwise.
+
+@item clock_flip
+Rotate by 90 degrees clockwise and vertically flip.
+@end table
+
+@item passthrough
+Do not apply the transposition if the input geometry matches the one
+specified by the specified value. It accepts the following values:
+@table @samp
+@item none
+Always apply transposition. (default)
+@item portrait
+Preserve portrait geometry (when @var{height} >= @var{width}).
+@item landscape
+Preserve landscape geometry (when @var{width} >= @var{height}).
+@end table
+
+@end table
+
+@c man end CUDA Video Filters
+
+
 @chapter OpenCL Video Filters
 @c man begin OPENCL VIDEO FILTERS
 
-- 
2.39.5 (Apple Git-154)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-02-02 13:23         ` [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters Danil Iashchenko
@ 2025-02-04  5:40           ` Gyan Doshi
  2025-03-02 11:31             ` Danil Iashchenko
  0 siblings, 1 reply; 12+ messages in thread
From: Gyan Doshi @ 2025-02-04  5:40 UTC (permalink / raw)
  To: ffmpeg-devel

Hi Danil,

See notes below.

On 2025-02-02 06:53 pm, Danil Iashchenko wrote:
> ---
>   doc/filters.texi | 1323 ++++++++++++++++++++++++----------------------
>   1 file changed, 700 insertions(+), 623 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index c2817b2661..7460b7ef18 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi

> @@ -26890,6 +26280,693 @@ value.
>   
>   @c man end VIDEO FILTERS
>   
> +@chapter CUDA Video Filters
> +@c man begin CUDA Video Filters
> +
> +To enable compilation of these filters you need to configure FFmpeg with
> +@code{--enable-cuda-nvcc} and/or @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.

1) cuda-llvm also suffices and doesn't require non-free.
2) Identify what npp is and make clear that only npp filters require 
libnpp. For either cuda-nvcc and npp, nonfree is also required.

> +@section CUDA
> +Below is a description of the currently available Nvidia CUDA video filters.
> +
> +To enable compilation of these filters you need to configure FFmpeg with
> +@code{--enable-cuda-nvcc} and Nvidia CUDA Toolkit must be installed.

Same note about cuda-llvm.

> +@section CUDA NPP
> +Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
> +
> +To enable compilation of these filters you need to configure FFmpeg with @code{--enable-libnpp} and Nvidia CUDA Toolkit must be installed.
Note that it is "configure FFmpeg with libnpp" in addition to cuda, and 
with nonfree.
> +@end table
> +
> +@c man end CUDA Video Filters

Rest all looks ok.

Regards,
Gyan
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-02-04  5:40           ` Gyan Doshi
@ 2025-03-02 11:31             ` Danil Iashchenko
  2025-03-03 22:27               ` Michael Niedermayer
  0 siblings, 1 reply; 12+ messages in thread
From: Danil Iashchenko @ 2025-03-02 11:31 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Danil Iashchenko

---
 doc/filters.texi | 1353 ++++++++++++++++++++++++----------------------
 1 file changed, 713 insertions(+), 640 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 0ba7d3035f..b2e836e112 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
 
 This filter supports the all above options as @ref{commands}.
 
-@section bilateral_cuda
-CUDA accelerated bilateral filter, an edge preserving filter.
-This filter is mathematically accurate thanks to the use of GPU acceleration.
-For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
-
-The filter accepts the following options:
-@table @option
-@item sigmaS
-Set sigma of gaussian function to calculate spatial weight, also called sigma space.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item sigmaR
-Set sigma of gaussian function to calculate color range weight, also called sigma color.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item window_size
-Set window size of the bilateral function to determine the number of neighbours to loop on.
-If the number entered is even, one will be added automatically.
-Allowed range is 1 to 255. Default is 1.
-@end table
-@subsection Examples
-
-@itemize
-@item
-Apply the bilateral filter on a video.
-
-@example
-./ffmpeg -v verbose \
--hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
--init_hw_device cuda \
--filter_complex \
-" \
-[0:v]scale_cuda=format=yuv444p[scaled_video];
-[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
--an -sn -c:v h264_nvenc -cq 20 out.mp4
-@end example
-
-@end itemize
-
 @section bitplanenoise
 
 Show and measure bit plane noise.
@@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
 The default value is @code{all}.
 @end table
 
-@section bwdif_cuda
-
-Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-@table @option
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@end table
-
-The default value is @code{send_field}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
 @section ccrepack
 
 Repack CEA-708 closed captioning side data
@@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
 @end example
 @end itemize
 
-@section chromakey_cuda
-CUDA accelerated YUV colorspace color/chroma keying.
-
-This filter works like normal chromakey filter but operates on CUDA frames.
-for more details and parameters see @ref{chromakey}.
-
-@subsection Examples
-
-@itemize
-@item
-Make all the green pixels in the input video transparent and use it as an overlay for another video:
-
-@example
-./ffmpeg \
-    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
-    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
-    -init_hw_device cuda \
-    -filter_complex \
-    " \
-        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
-        [1:v]scale_cuda=format=yuv420p[base]; \
-        [base][overlay_video]overlay_cuda" \
-    -an -sn -c:v h264_nvenc -cq 20 output.mp4
-@end example
-
-@item
-Process two software sources, explicitly uploading the frames:
-
-@example
-./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
-    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
-    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
-    -filter_complex \
-    " \
-        [0]hwupload[under]; \
-        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
-        [under][over]overlay_cuda" \
-    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
-@end example
-
-@end itemize
-
 @section chromanr
 Reduce chrominance noise.
 
@@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
 colorspace=smpte240m
 @end example
 
-@section colorspace_cuda
-
-CUDA accelerated implementation of the colorspace filter.
-
-It is by no means feature complete compared to the software colorspace filter,
-and at the current time only supports color range conversion between jpeg/full
-and mpeg/limited range.
-
-The filter accepts the following options:
-
-@table @option
-@item range
-Specify output color range.
-
-The accepted values are:
-@table @samp
-@item tv
-TV (restricted) range
-
-@item mpeg
-MPEG (restricted) range
-
-@item pc
-PC (full) range
-
-@item jpeg
-JPEG (full) range
-
-@end table
-
-@end table
-
 @section colortemperature
 Adjust color temperature in video to simulate variations in ambient color temperature.
 
@@ -18988,84 +18823,6 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
 
 @end itemize
 
-@anchor{overlay_cuda}
-@section overlay_cuda
-
-Overlay one video on top of another.
-
-This is the CUDA variant of the @ref{overlay} filter.
-It only accepts CUDA frames. The underlying input pixel formats have to match.
-
-It takes two inputs and has one output. The first input is the "main"
-video on which the second input is overlaid.
-
-It accepts the following parameters:
-
-@table @option
-@item x
-@item y
-Set expressions for the x and y coordinates of the overlaid video
-on the main video.
-
-They can contain the following parameters:
-
-@table @option
-
-@item main_w, W
-@item main_h, H
-The main input width and height.
-
-@item overlay_w, w
-@item overlay_h, h
-The overlay input width and height.
-
-@item x
-@item y
-The computed values for @var{x} and @var{y}. They are evaluated for
-each new frame.
-
-@item n
-The ordinal index of the main input frame, starting from 0.
-
-@item pos
-The byte offset position in the file of the main input frame, NAN if unknown.
-Deprecated, do not use.
-
-@item t
-The timestamp of the main input frame, expressed in seconds, NAN if unknown.
-
-@end table
-
-Default value is "0" for both expressions.
-
-@item eval
-Set when the expressions for @option{x} and @option{y} are evaluated.
-
-It accepts the following values:
-@table @option
-@item init
-Evaluate expressions once during filter initialization or
-when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame
-@end table
-
-Default value is @option{frame}.
-
-@item eof_action
-See @ref{framesync}.
-
-@item shortest
-See @ref{framesync}.
-
-@item repeatlast
-See @ref{framesync}.
-
-@end table
-
-This filter also supports the @ref{framesync} options.
-
 @section owdenoise
 
 Apply Overcomplete Wavelet denoiser.
@@ -21516,287 +21273,6 @@ If the specified expression is not valid, it is kept at its current
 value.
 @end table
 
-@anchor{scale_cuda}
-@section scale_cuda
-
-Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
-Setting the output width and height works in the same way as for the @ref{scale} filter.
-
-The filter accepts the following options:
-@table @option
-@item w
-@item h
-Set the output video dimension expression. Default value is the input dimension.
-
-Allows for the same expressions as the @ref{scale} filter.
-
-@item interp_algo
-Sets the algorithm used for scaling:
-
-@table @var
-@item nearest
-Nearest neighbour
-
-Used by default if input parameters match the desired output.
-
-@item bilinear
-Bilinear
-
-@item bicubic
-Bicubic
-
-This is the default.
-
-@item lanczos
-Lanczos
-
-@end table
-
-@item format
-Controls the output pixel format. By default, or if none is specified, the input
-pixel format is used.
-
-The filter does not support converting between YUV and RGB pixel formats.
-
-@item passthrough
-If set to 0, every frame is processed, even if no conversion is necessary.
-This mode can be useful to use the filter as a buffer for a downstream
-frame-consumer that exhausts the limited decoder frame pool.
-
-If set to 1, frames are passed through as-is if they match the desired output
-parameters. This is the default behaviour.
-
-@item param
-Algorithm-Specific parameter.
-
-Affects the curves of the bicubic algorithm.
-
-@item force_original_aspect_ratio
-@item force_divisible_by
-Work the same as the identical @ref{scale} filter options.
-
-@item reset_sar
-Works the same as the identical @ref{scale} filter option.
-
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
-@example
-scale_cuda=-2:720:format=yuv420p
-@end example
-
-@item
-Upscale to 4K using nearest neighbour algorithm.
-@example
-scale_cuda=4096:2160:interp_algo=nearest
-@end example
-
-@item
-Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
-This can be useful to deal with a filter and encode chain that otherwise exhausts the
-decoders frame pool.
-@example
-scale_cuda=passthrough=0
-@end example
-@end itemize
-
-@anchor{scale_npp}
-@section scale_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
-format conversion on CUDA video frames. Setting the output width and height
-works in the same way as for the @var{scale} filter.
-
-The following additional options are accepted:
-@table @option
-@item format
-The pixel format of the output CUDA frames. If set to the string "same" (the
-default), the input format will be kept. Note that automatic format negotiation
-and conversion is not yet supported for hardware frames
-
-@item interp_algo
-The interpolation algorithm used for resizing. One of the following:
-@table @option
-@item nn
-Nearest neighbour.
-
-@item linear
-@item cubic
-@item cubic2p_bspline
-2-parameter cubic (B=1, C=0)
-
-@item cubic2p_catmullrom
-2-parameter cubic (B=0, C=1/2)
-
-@item cubic2p_b05c03
-2-parameter cubic (B=1/2, C=3/10)
-
-@item super
-Supersampling
-
-@item lanczos
-@end table
-
-@item force_original_aspect_ratio
-Enable decreasing or increasing output video width or height if necessary to
-keep the original aspect ratio. Possible values:
-
-@table @samp
-@item disable
-Scale the video as specified and disable this feature.
-
-@item decrease
-The output video dimensions will automatically be decreased if needed.
-
-@item increase
-The output video dimensions will automatically be increased if needed.
-
-@end table
-
-One useful instance of this option is that when you know a specific device's
-maximum allowed resolution, you can use this to limit the output video to
-that, while retaining the aspect ratio. For example, device A allows
-1280x720 playback, and your video is 1920x800. Using this option (set it to
-decrease) and specifying 1280x720 to the command line makes the output
-1280x533.
-
-Please note that this is a different thing than specifying -1 for @option{w}
-or @option{h}, you still need to specify the output resolution for this option
-to work.
-
-@item force_divisible_by
-Ensures that both the output dimensions, width and height, are divisible by the
-given integer when used together with @option{force_original_aspect_ratio}. This
-works similar to using @code{-n} in the @option{w} and @option{h} options.
-
-This option respects the value set for @option{force_original_aspect_ratio},
-increasing or decreasing the resolution accordingly. The video's aspect ratio
-may be slightly modified.
-
-This option can be handy if you need to have a video fit within or exceed
-a defined resolution using @option{force_original_aspect_ratio} but also have
-encoder restrictions on width or height divisibility.
-
-@item reset_sar
-Works the same as the identical @ref{scale} filter option.
-
-@item eval
-Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
-
-@table @samp
-@item init
-Only evaluate expressions once during the filter initialization or when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame.
-
-@end table
-
-@end table
-
-The values of the @option{w} and @option{h} options are expressions
-containing the following constants:
-
-@table @var
-@item in_w
-@item in_h
-The input width and height
-
-@item iw
-@item ih
-These are the same as @var{in_w} and @var{in_h}.
-
-@item out_w
-@item out_h
-The output (scaled) width and height
-
-@item ow
-@item oh
-These are the same as @var{out_w} and @var{out_h}
-
-@item a
-The same as @var{iw} / @var{ih}
-
-@item sar
-input sample aspect ratio
-
-@item dar
-The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
-
-@item n
-The (sequential) number of the input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item t
-The presentation timestamp of the input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item pos
-The position (byte offset) of the frame in the input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-Deprecated, do not use.
-@end table
-
-@section scale2ref_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
-video, based on a reference video.
-
-See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
-but uses the reference video instead of the main input as basis. scale2ref_npp
-also supports the following additional constants for the @option{w} and
-@option{h} options:
-
-@table @var
-@item main_w
-@item main_h
-The main input video's width and height
-
-@item main_a
-The same as @var{main_w} / @var{main_h}
-
-@item main_sar
-The main input video's sample aspect ratio
-
-@item main_dar, mdar
-The main input video's display aspect ratio. Calculated from
-@code{(main_w / main_h) * main_sar}.
-
-@item main_n
-The (sequential) number of the main input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item main_t
-The presentation timestamp of the main input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item main_pos
-The position (byte offset) of the frame in the main input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale a subtitle stream (b) to match the main video (a) in size before overlaying
-@example
-'scale2ref_npp[b][a];[a][b]overlay_cuda'
-@end example
-
-@item
-Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
-@example
-[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
-@end example
-@end itemize
-
 @section scale_vt
 
 Scale and convert the color parameters using VTPixelTransferSession.
@@ -22243,23 +21719,6 @@ Keep the same chroma location (default).
 @end table
 @end table
 
-@section sharpen_npp
-Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
-border control.
-
-The following additional options are accepted:
-@table @option
-
-@item border_type
-Type of sampling to be used ad frame borders. One of the following:
-@table @option
-
-@item replicate
-Replicate pixel values.
-
-@end table
-@end table
-
 @section shear
 Apply shear transform to input video.
 
@@ -24417,47 +23876,6 @@ The command above can also be specified as:
 transpose=1:portrait
 @end example
 
-@section transpose_npp
-
-Transpose rows with columns in the input video and optionally flip it.
-For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
-
-It accepts the following parameters:
-
-@table @option
-
-@item dir
-Specify the transposition direction.
-
-Can assume the following values:
-@table @samp
-@item cclock_flip
-Rotate by 90 degrees counterclockwise and vertically flip. (default)
-
-@item clock
-Rotate by 90 degrees clockwise.
-
-@item cclock
-Rotate by 90 degrees counterclockwise.
-
-@item clock_flip
-Rotate by 90 degrees clockwise and vertically flip.
-@end table
-
-@item passthrough
-Do not apply the transposition if the input geometry matches the one
-specified by the specified value. It accepts the following values:
-@table @samp
-@item none
-Always apply transposition. (default)
-@item portrait
-Preserve portrait geometry (when @var{height} >= @var{width}).
-@item landscape
-Preserve landscape geometry (when @var{width} >= @var{height}).
-@end table
-
-@end table
-
 @section trim
 Trim the input so that the output contains one continuous subpart of the input.
 
@@ -26644,64 +26062,6 @@ filter").
 It accepts the following parameters:
 
 
-@table @option
-
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@item 2, send_frame_nospatial
-Like @code{send_frame}, but it skips the spatial interlacing check.
-@item 3, send_field_nospatial
-Like @code{send_field}, but it skips the spatial interlacing check.
-@end table
-
-The default value is @code{send_frame}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
-@section yadif_cuda
-
-Deinterlace the input video using the @ref{yadif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-
 @table @option
 
 @item mode
@@ -27172,6 +26532,719 @@ value.
 
 @c man end VIDEO FILTERS
 
+@chapter CUDA Video Filters
+@c man begin CUDA Video Filters
+
+To enable CUDA and/or NPP filters please refer to configuration guidelines for @ref{CUDA} and for @ref{CUDA NPP} filters.
+
+Running CUDA filters requires you to initialize a hardware device and to pass that device to all filters in any filter graph.
+@table @option
+
+@item -init_hw_device cuda[=@var{name}][:@var{device}[,@var{key=value}...]]
+Initialise a new hardware device of type @var{cuda} called @var{name}, using the
+given device parameters.
+
+@item -filter_hw_device @var{name}
+Pass the hardware device called @var{name} to all filters in any filter graph.
+
+@end table
+
+For more detailed information see @url{https://www.ffmpeg.org/ffmpeg.html#Advanced-Video-options}
+
+@itemize
+@item
+Example of initializing second CUDA device on the system and running scale_cuda and bilateral_cuda filters.
+@example
+./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -init_hw_device cuda:1 -filter_complex \
+"[0:v]scale_cuda=format=yuv444p[scaled_video];[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+@end itemize
+
+Since CUDA filters operate exclusively on GPU memory, frame data must sometimes be uploaded (@ref{hwupload}) to hardware surfaces associated with the appropriate CUDA device before processing, and downloaded (@ref{hwdownload}) back to normal memory afterward, if required. Whether @ref{hwupload} or @ref{hwdownload} is necessary depends on the specific workflow:
+
+@itemize
+@item If the input frames are already in GPU memory (e.g., when using @code{-hwaccel cuda} or @code{-hwaccel_output_format cuda}), explicit use of @ref{hwupload} is not needed, as the data is already in the appropriate memory space.
+@item If the input frames are in CPU memory (e.g., software-decoded frames or frames processed by CPU-based filters), it is necessary to use @ref{hwupload} to transfer the data to GPU memory for CUDA processing.
+@item If the output of the CUDA filters needs to be further processed by software-based filters or saved in a format not supported by GPU-based encoders, @ref{hwdownload} is required to transfer the data back to CPU memory.
+@end itemize
+Note that @ref{hwupload} uploads data to a surface with the same layout as the software frame, so it may be necessary to add a @ref{format} filter immediately before @ref{hwupload} to ensure the input is in the correct format. Similarly, @ref{hwdownload} may not support all output formats, so an additional @ref{format} filter may need to be inserted immediately after @ref{hwdownload} in the filter graph to ensure compatibility.
+
+@anchor{CUDA}
+@section CUDA
+Below is a description of the currently available Nvidia CUDA video filters.
+
+Prerequisites:
+@itemize
+@item Install Nvidia CUDA Toolkit
+@end itemize
+
+Note: If FFmpeg detects the Nvidia CUDA Toolkit during configuration, it will enable CUDA filters automatically without requiring any additional flags. If you want to explicitly enable them, use the following options:
+
+@itemize
+@item Configure FFmpeg with @code{--enable-cuda-nvcc --enable-nonfree}.
+@item Configure FFmpeg with @code{--enable-cuda-llvm}. Additional requirement: @code{llvm} lib must be installed.
+@end itemize
+
+@subsection bilateral_cuda
+CUDA accelerated bilateral filter, an edge preserving filter.
+This filter is mathematically accurate thanks to the use of GPU acceleration.
+For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
+
+The filter accepts the following options:
+@table @option
+@item sigmaS
+Set sigma of gaussian function to calculate spatial weight, also called sigma space.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item sigmaR
+Set sigma of gaussian function to calculate color range weight, also called sigma color.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item window_size
+Set window size of the bilateral function to determine the number of neighbours to loop on.
+If the number entered is even, one will be added automatically.
+Allowed range is 1 to 255. Default is 1.
+@end table
+@subsubsection Examples
+
+@itemize
+@item
+Apply the bilateral filter on a video.
+
+@example
+./ffmpeg -v verbose \
+-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
+-init_hw_device cuda \
+-filter_complex \
+" \
+[0:v]scale_cuda=format=yuv444p[scaled_video];
+[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+
+@end itemize
+
+@subsection bwdif_cuda
+
+Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+@table @option
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@end table
+
+The default value is @code{send_field}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@subsection chromakey_cuda
+CUDA accelerated YUV colorspace color/chroma keying.
+
+This filter works like normal chromakey filter but operates on CUDA frames.
+for more details and parameters see @ref{chromakey}.
+
+@subsubsection Examples
+
+@itemize
+@item
+Make all the green pixels in the input video transparent and use it as an overlay for another video:
+
+@example
+./ffmpeg \
+    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
+    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
+    -init_hw_device cuda \
+    -filter_complex \
+    " \
+        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
+        [1:v]scale_cuda=format=yuv420p[base]; \
+        [base][overlay_video]overlay_cuda" \
+    -an -sn -c:v h264_nvenc -cq 20 output.mp4
+@end example
+
+@item
+Process two software sources, explicitly uploading the frames:
+
+@example
+./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
+    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
+    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
+    -filter_complex \
+    " \
+        [0]hwupload[under]; \
+        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
+        [under][over]overlay_cuda" \
+    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
+@end example
+
+@end itemize
+
+@subsection colorspace_cuda
+
+CUDA accelerated implementation of the colorspace filter.
+
+It is by no means feature complete compared to the software colorspace filter,
+and at the current time only supports color range conversion between jpeg/full
+and mpeg/limited range.
+
+The filter accepts the following options:
+
+@table @option
+@item range
+Specify output color range.
+
+The accepted values are:
+@table @samp
+@item tv
+TV (restricted) range
+
+@item mpeg
+MPEG (restricted) range
+
+@item pc
+PC (full) range
+
+@item jpeg
+JPEG (full) range
+
+@end table
+
+@end table
+
+@anchor{overlay_cuda}
+@subsection overlay_cuda
+
+Overlay one video on top of another.
+
+This is the CUDA variant of the @ref{overlay} filter.
+It only accepts CUDA frames. The underlying input pixel formats have to match.
+
+It takes two inputs and has one output. The first input is the "main"
+video on which the second input is overlaid.
+
+It accepts the following parameters:
+
+@table @option
+@item x
+@item y
+Set expressions for the x and y coordinates of the overlaid video
+on the main video.
+
+They can contain the following parameters:
+
+@table @option
+
+@item main_w, W
+@item main_h, H
+The main input width and height.
+
+@item overlay_w, w
+@item overlay_h, h
+The overlay input width and height.
+
+@item x
+@item y
+The computed values for @var{x} and @var{y}. They are evaluated for
+each new frame.
+
+@item n
+The ordinal index of the main input frame, starting from 0.
+
+@item pos
+The byte offset position in the file of the main input frame, NAN if unknown.
+Deprecated, do not use.
+
+@item t
+The timestamp of the main input frame, expressed in seconds, NAN if unknown.
+
+@end table
+
+Default value is "0" for both expressions.
+
+@item eval
+Set when the expressions for @option{x} and @option{y} are evaluated.
+
+It accepts the following values:
+@table @option
+@item init
+Evaluate expressions once during filter initialization or
+when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame
+@end table
+
+Default value is @option{frame}.
+
+@item eof_action
+See @ref{framesync}.
+
+@item shortest
+See @ref{framesync}.
+
+@item repeatlast
+See @ref{framesync}.
+
+@end table
+
+This filter also supports the @ref{framesync} options.
+
+@anchor{scale_cuda}
+@subsection scale_cuda
+
+Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
+Setting the output width and height works in the same way as for the @ref{scale} filter.
+
+The filter accepts the following options:
+@table @option
+@item w
+@item h
+Set the output video dimension expression. Default value is the input dimension.
+
+Allows for the same expressions as the @ref{scale} filter.
+
+@item interp_algo
+Sets the algorithm used for scaling:
+
+@table @var
+@item nearest
+Nearest neighbour
+
+Used by default if input parameters match the desired output.
+
+@item bilinear
+Bilinear
+
+@item bicubic
+Bicubic
+
+This is the default.
+
+@item lanczos
+Lanczos
+
+@end table
+
+@item format
+Controls the output pixel format. By default, or if none is specified, the input
+pixel format is used.
+
+The filter does not support converting between YUV and RGB pixel formats.
+
+@item passthrough
+If set to 0, every frame is processed, even if no conversion is necessary.
+This mode can be useful to use the filter as a buffer for a downstream
+frame-consumer that exhausts the limited decoder frame pool.
+
+If set to 1, frames are passed through as-is if they match the desired output
+parameters. This is the default behaviour.
+
+@item param
+Algorithm-Specific parameter.
+
+Affects the curves of the bicubic algorithm.
+
+@item force_original_aspect_ratio
+@item force_divisible_by
+Work the same as the identical @ref{scale} filter options.
+
+@item reset_sar
+Works the same as the identical @ref{scale} filter option.
+
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
+
+@item
+Upscale to 4K using nearest neighbour algorithm.
+@example
+scale_cuda=4096:2160:interp_algo=nearest
+@end example
+
+@item
+Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
+This can be useful to deal with a filter and encode chain that otherwise exhausts the
+decoders frame pool.
+@example
+scale_cuda=passthrough=0
+@end example
+@end itemize
+
+@subsection yadif_cuda
+
+Deinterlace the input video using the @ref{yadif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+
+@table @option
+
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@item 2, send_frame_nospatial
+Like @code{send_frame}, but it skips the spatial interlacing check.
+@item 3, send_field_nospatial
+Like @code{send_field}, but it skips the spatial interlacing check.
+@end table
+
+The default value is @code{send_frame}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@anchor {CUDA NPP}
+@section CUDA NPP
+Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
+
+Prerequisites:
+@itemize
+@item Install Nvidia CUDA Toolkit
+@item Install libnpp
+@end itemize
+
+To enable CUDA NPP filters:
+
+@itemize
+@item Configure FFmpeg with @code{--enable-nonfree --enable-libnpp}.
+@end itemize
+
+
+@anchor{scale_npp}
+@subsection scale_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
+format conversion on CUDA video frames. Setting the output width and height
+works in the same way as for the @var{scale} filter.
+
+The following additional options are accepted:
+@table @option
+@item format
+The pixel format of the output CUDA frames. If set to the string "same" (the
+default), the input format will be kept. Note that automatic format negotiation
+and conversion is not yet supported for hardware frames
+
+@item interp_algo
+The interpolation algorithm used for resizing. One of the following:
+@table @option
+@item nn
+Nearest neighbour.
+
+@item linear
+@item cubic
+@item cubic2p_bspline
+2-parameter cubic (B=1, C=0)
+
+@item cubic2p_catmullrom
+2-parameter cubic (B=0, C=1/2)
+
+@item cubic2p_b05c03
+2-parameter cubic (B=1/2, C=3/10)
+
+@item super
+Supersampling
+
+@item lanczos
+@end table
+
+@item force_original_aspect_ratio
+Enable decreasing or increasing output video width or height if necessary to
+keep the original aspect ratio. Possible values:
+
+@table @samp
+@item disable
+Scale the video as specified and disable this feature.
+
+@item decrease
+The output video dimensions will automatically be decreased if needed.
+
+@item increase
+The output video dimensions will automatically be increased if needed.
+
+@end table
+
+One useful instance of this option is that when you know a specific device's
+maximum allowed resolution, you can use this to limit the output video to
+that, while retaining the aspect ratio. For example, device A allows
+1280x720 playback, and your video is 1920x800. Using this option (set it to
+decrease) and specifying 1280x720 to the command line makes the output
+1280x533.
+
+Please note that this is a different thing than specifying -1 for @option{w}
+or @option{h}, you still need to specify the output resolution for this option
+to work.
+
+@item force_divisible_by
+Ensures that both the output dimensions, width and height, are divisible by the
+given integer when used together with @option{force_original_aspect_ratio}. This
+works similar to using @code{-n} in the @option{w} and @option{h} options.
+
+This option respects the value set for @option{force_original_aspect_ratio},
+increasing or decreasing the resolution accordingly. The video's aspect ratio
+may be slightly modified.
+
+This option can be handy if you need to have a video fit within or exceed
+a defined resolution using @option{force_original_aspect_ratio} but also have
+encoder restrictions on width or height divisibility.
+
+@item reset_sar
+Works the same as the identical @ref{scale} filter option.
+
+@item eval
+Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
+
+@table @samp
+@item init
+Only evaluate expressions once during the filter initialization or when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame.
+
+@end table
+
+@end table
+
+The values of the @option{w} and @option{h} options are expressions
+containing the following constants:
+
+@table @var
+@item in_w
+@item in_h
+The input width and height
+
+@item iw
+@item ih
+These are the same as @var{in_w} and @var{in_h}.
+
+@item out_w
+@item out_h
+The output (scaled) width and height
+
+@item ow
+@item oh
+These are the same as @var{out_w} and @var{out_h}
+
+@item a
+The same as @var{iw} / @var{ih}
+
+@item sar
+input sample aspect ratio
+
+@item dar
+The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
+
+@item n
+The (sequential) number of the input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item t
+The presentation timestamp of the input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item pos
+The position (byte offset) of the frame in the input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+Deprecated, do not use.
+@end table
+
+@subsection scale2ref_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
+video, based on a reference video.
+
+See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
+but uses the reference video instead of the main input as basis. scale2ref_npp
+also supports the following additional constants for the @option{w} and
+@option{h} options:
+
+@table @var
+@item main_w
+@item main_h
+The main input video's width and height
+
+@item main_a
+The same as @var{main_w} / @var{main_h}
+
+@item main_sar
+The main input video's sample aspect ratio
+
+@item main_dar, mdar
+The main input video's display aspect ratio. Calculated from
+@code{(main_w / main_h) * main_sar}.
+
+@item main_n
+The (sequential) number of the main input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item main_t
+The presentation timestamp of the main input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item main_pos
+The position (byte offset) of the frame in the main input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale a subtitle stream (b) to match the main video (a) in size before overlaying
+@example
+'scale2ref_npp[b][a];[a][b]overlay_cuda'
+@end example
+
+@item
+Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
+@example
+[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
+@end example
+@end itemize
+
+@subsection sharpen_npp
+Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
+border control.
+
+The following additional options are accepted:
+@table @option
+
+@item border_type
+Type of sampling to be used ad frame borders. One of the following:
+@table @option
+
+@item replicate
+Replicate pixel values.
+
+@end table
+@end table
+
+@subsection transpose_npp
+
+Transpose rows with columns in the input video and optionally flip it.
+For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
+
+It accepts the following parameters:
+
+@table @option
+
+@item dir
+Specify the transposition direction.
+
+Can assume the following values:
+@table @samp
+@item cclock_flip
+Rotate by 90 degrees counterclockwise and vertically flip. (default)
+
+@item clock
+Rotate by 90 degrees clockwise.
+
+@item cclock
+Rotate by 90 degrees counterclockwise.
+
+@item clock_flip
+Rotate by 90 degrees clockwise and vertically flip.
+@end table
+
+@item passthrough
+Do not apply the transposition if the input geometry matches the one
+specified by the specified value. It accepts the following values:
+@table @samp
+@item none
+Always apply transposition. (default)
+@item portrait
+Preserve portrait geometry (when @var{height} >= @var{width}).
+@item landscape
+Preserve landscape geometry (when @var{width} >= @var{height}).
+@end table
+
+@end table
+
+@c man end CUDA Video Filters
+
 @chapter OpenCL Video Filters
 @c man begin OPENCL VIDEO FILTERS
 
-- 
2.39.5 (Apple Git-154)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-03-02 11:31             ` Danil Iashchenko
@ 2025-03-03 22:27               ` Michael Niedermayer
  2025-03-10 10:38                 ` Danil Iashchenko
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Niedermayer @ 2025-03-03 22:27 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


[-- Attachment #1.1: Type: text/plain, Size: 731 bytes --]

Hi Danil

On Sun, Mar 02, 2025 at 11:31:54AM +0000, Danil Iashchenko wrote:
> ---
>  doc/filters.texi | 1353 ++++++++++++++++++++++++----------------------
>  1 file changed, 713 insertions(+), 640 deletions(-)

make doc/ffmpeg-filters.html
HTML	doc/ffmpeg-filters.html
*** Undefined node `CUDA NPP' in @ref (in doc/filters.texi l. 26538)
** Unknown command `@anchor' (left as is) (in doc/filters.texi l. 26978)
*** '{' without macro. Before: CUDA NPP} (in doc/filters.texi l. 26978)
*** '}' without opening '{' before:  (in doc/filters.texi l. 26978)


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are best at talking, realize last or never when they are wrong.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-03-03 22:27               ` Michael Niedermayer
@ 2025-03-10 10:38                 ` Danil Iashchenko
  2025-03-16 19:15                   ` Danil Iashchenko
  0 siblings, 1 reply; 12+ messages in thread
From: Danil Iashchenko @ 2025-03-10 10:38 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Danil Iashchenko

---
 doc/filters.texi | 1353 ++++++++++++++++++++++++----------------------
 1 file changed, 713 insertions(+), 640 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 0ba7d3035f..37b8674756 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
 
 This filter supports the all above options as @ref{commands}.
 
-@section bilateral_cuda
-CUDA accelerated bilateral filter, an edge preserving filter.
-This filter is mathematically accurate thanks to the use of GPU acceleration.
-For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
-
-The filter accepts the following options:
-@table @option
-@item sigmaS
-Set sigma of gaussian function to calculate spatial weight, also called sigma space.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item sigmaR
-Set sigma of gaussian function to calculate color range weight, also called sigma color.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item window_size
-Set window size of the bilateral function to determine the number of neighbours to loop on.
-If the number entered is even, one will be added automatically.
-Allowed range is 1 to 255. Default is 1.
-@end table
-@subsection Examples
-
-@itemize
-@item
-Apply the bilateral filter on a video.
-
-@example
-./ffmpeg -v verbose \
--hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
--init_hw_device cuda \
--filter_complex \
-" \
-[0:v]scale_cuda=format=yuv444p[scaled_video];
-[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
--an -sn -c:v h264_nvenc -cq 20 out.mp4
-@end example
-
-@end itemize
-
 @section bitplanenoise
 
 Show and measure bit plane noise.
@@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
 The default value is @code{all}.
 @end table
 
-@section bwdif_cuda
-
-Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-@table @option
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@end table
-
-The default value is @code{send_field}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
 @section ccrepack
 
 Repack CEA-708 closed captioning side data
@@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
 @end example
 @end itemize
 
-@section chromakey_cuda
-CUDA accelerated YUV colorspace color/chroma keying.
-
-This filter works like normal chromakey filter but operates on CUDA frames.
-for more details and parameters see @ref{chromakey}.
-
-@subsection Examples
-
-@itemize
-@item
-Make all the green pixels in the input video transparent and use it as an overlay for another video:
-
-@example
-./ffmpeg \
-    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
-    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
-    -init_hw_device cuda \
-    -filter_complex \
-    " \
-        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
-        [1:v]scale_cuda=format=yuv420p[base]; \
-        [base][overlay_video]overlay_cuda" \
-    -an -sn -c:v h264_nvenc -cq 20 output.mp4
-@end example
-
-@item
-Process two software sources, explicitly uploading the frames:
-
-@example
-./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
-    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
-    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
-    -filter_complex \
-    " \
-        [0]hwupload[under]; \
-        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
-        [under][over]overlay_cuda" \
-    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
-@end example
-
-@end itemize
-
 @section chromanr
 Reduce chrominance noise.
 
@@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
 colorspace=smpte240m
 @end example
 
-@section colorspace_cuda
-
-CUDA accelerated implementation of the colorspace filter.
-
-It is by no means feature complete compared to the software colorspace filter,
-and at the current time only supports color range conversion between jpeg/full
-and mpeg/limited range.
-
-The filter accepts the following options:
-
-@table @option
-@item range
-Specify output color range.
-
-The accepted values are:
-@table @samp
-@item tv
-TV (restricted) range
-
-@item mpeg
-MPEG (restricted) range
-
-@item pc
-PC (full) range
-
-@item jpeg
-JPEG (full) range
-
-@end table
-
-@end table
-
 @section colortemperature
 Adjust color temperature in video to simulate variations in ambient color temperature.
 
@@ -18988,84 +18823,6 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
 
 @end itemize
 
-@anchor{overlay_cuda}
-@section overlay_cuda
-
-Overlay one video on top of another.
-
-This is the CUDA variant of the @ref{overlay} filter.
-It only accepts CUDA frames. The underlying input pixel formats have to match.
-
-It takes two inputs and has one output. The first input is the "main"
-video on which the second input is overlaid.
-
-It accepts the following parameters:
-
-@table @option
-@item x
-@item y
-Set expressions for the x and y coordinates of the overlaid video
-on the main video.
-
-They can contain the following parameters:
-
-@table @option
-
-@item main_w, W
-@item main_h, H
-The main input width and height.
-
-@item overlay_w, w
-@item overlay_h, h
-The overlay input width and height.
-
-@item x
-@item y
-The computed values for @var{x} and @var{y}. They are evaluated for
-each new frame.
-
-@item n
-The ordinal index of the main input frame, starting from 0.
-
-@item pos
-The byte offset position in the file of the main input frame, NAN if unknown.
-Deprecated, do not use.
-
-@item t
-The timestamp of the main input frame, expressed in seconds, NAN if unknown.
-
-@end table
-
-Default value is "0" for both expressions.
-
-@item eval
-Set when the expressions for @option{x} and @option{y} are evaluated.
-
-It accepts the following values:
-@table @option
-@item init
-Evaluate expressions once during filter initialization or
-when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame
-@end table
-
-Default value is @option{frame}.
-
-@item eof_action
-See @ref{framesync}.
-
-@item shortest
-See @ref{framesync}.
-
-@item repeatlast
-See @ref{framesync}.
-
-@end table
-
-This filter also supports the @ref{framesync} options.
-
 @section owdenoise
 
 Apply Overcomplete Wavelet denoiser.
@@ -21516,287 +21273,6 @@ If the specified expression is not valid, it is kept at its current
 value.
 @end table
 
-@anchor{scale_cuda}
-@section scale_cuda
-
-Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
-Setting the output width and height works in the same way as for the @ref{scale} filter.
-
-The filter accepts the following options:
-@table @option
-@item w
-@item h
-Set the output video dimension expression. Default value is the input dimension.
-
-Allows for the same expressions as the @ref{scale} filter.
-
-@item interp_algo
-Sets the algorithm used for scaling:
-
-@table @var
-@item nearest
-Nearest neighbour
-
-Used by default if input parameters match the desired output.
-
-@item bilinear
-Bilinear
-
-@item bicubic
-Bicubic
-
-This is the default.
-
-@item lanczos
-Lanczos
-
-@end table
-
-@item format
-Controls the output pixel format. By default, or if none is specified, the input
-pixel format is used.
-
-The filter does not support converting between YUV and RGB pixel formats.
-
-@item passthrough
-If set to 0, every frame is processed, even if no conversion is necessary.
-This mode can be useful to use the filter as a buffer for a downstream
-frame-consumer that exhausts the limited decoder frame pool.
-
-If set to 1, frames are passed through as-is if they match the desired output
-parameters. This is the default behaviour.
-
-@item param
-Algorithm-Specific parameter.
-
-Affects the curves of the bicubic algorithm.
-
-@item force_original_aspect_ratio
-@item force_divisible_by
-Work the same as the identical @ref{scale} filter options.
-
-@item reset_sar
-Works the same as the identical @ref{scale} filter option.
-
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
-@example
-scale_cuda=-2:720:format=yuv420p
-@end example
-
-@item
-Upscale to 4K using nearest neighbour algorithm.
-@example
-scale_cuda=4096:2160:interp_algo=nearest
-@end example
-
-@item
-Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
-This can be useful to deal with a filter and encode chain that otherwise exhausts the
-decoders frame pool.
-@example
-scale_cuda=passthrough=0
-@end example
-@end itemize
-
-@anchor{scale_npp}
-@section scale_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
-format conversion on CUDA video frames. Setting the output width and height
-works in the same way as for the @var{scale} filter.
-
-The following additional options are accepted:
-@table @option
-@item format
-The pixel format of the output CUDA frames. If set to the string "same" (the
-default), the input format will be kept. Note that automatic format negotiation
-and conversion is not yet supported for hardware frames
-
-@item interp_algo
-The interpolation algorithm used for resizing. One of the following:
-@table @option
-@item nn
-Nearest neighbour.
-
-@item linear
-@item cubic
-@item cubic2p_bspline
-2-parameter cubic (B=1, C=0)
-
-@item cubic2p_catmullrom
-2-parameter cubic (B=0, C=1/2)
-
-@item cubic2p_b05c03
-2-parameter cubic (B=1/2, C=3/10)
-
-@item super
-Supersampling
-
-@item lanczos
-@end table
-
-@item force_original_aspect_ratio
-Enable decreasing or increasing output video width or height if necessary to
-keep the original aspect ratio. Possible values:
-
-@table @samp
-@item disable
-Scale the video as specified and disable this feature.
-
-@item decrease
-The output video dimensions will automatically be decreased if needed.
-
-@item increase
-The output video dimensions will automatically be increased if needed.
-
-@end table
-
-One useful instance of this option is that when you know a specific device's
-maximum allowed resolution, you can use this to limit the output video to
-that, while retaining the aspect ratio. For example, device A allows
-1280x720 playback, and your video is 1920x800. Using this option (set it to
-decrease) and specifying 1280x720 to the command line makes the output
-1280x533.
-
-Please note that this is a different thing than specifying -1 for @option{w}
-or @option{h}, you still need to specify the output resolution for this option
-to work.
-
-@item force_divisible_by
-Ensures that both the output dimensions, width and height, are divisible by the
-given integer when used together with @option{force_original_aspect_ratio}. This
-works similar to using @code{-n} in the @option{w} and @option{h} options.
-
-This option respects the value set for @option{force_original_aspect_ratio},
-increasing or decreasing the resolution accordingly. The video's aspect ratio
-may be slightly modified.
-
-This option can be handy if you need to have a video fit within or exceed
-a defined resolution using @option{force_original_aspect_ratio} but also have
-encoder restrictions on width or height divisibility.
-
-@item reset_sar
-Works the same as the identical @ref{scale} filter option.
-
-@item eval
-Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
-
-@table @samp
-@item init
-Only evaluate expressions once during the filter initialization or when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame.
-
-@end table
-
-@end table
-
-The values of the @option{w} and @option{h} options are expressions
-containing the following constants:
-
-@table @var
-@item in_w
-@item in_h
-The input width and height
-
-@item iw
-@item ih
-These are the same as @var{in_w} and @var{in_h}.
-
-@item out_w
-@item out_h
-The output (scaled) width and height
-
-@item ow
-@item oh
-These are the same as @var{out_w} and @var{out_h}
-
-@item a
-The same as @var{iw} / @var{ih}
-
-@item sar
-input sample aspect ratio
-
-@item dar
-The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
-
-@item n
-The (sequential) number of the input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item t
-The presentation timestamp of the input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item pos
-The position (byte offset) of the frame in the input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-Deprecated, do not use.
-@end table
-
-@section scale2ref_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
-video, based on a reference video.
-
-See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
-but uses the reference video instead of the main input as basis. scale2ref_npp
-also supports the following additional constants for the @option{w} and
-@option{h} options:
-
-@table @var
-@item main_w
-@item main_h
-The main input video's width and height
-
-@item main_a
-The same as @var{main_w} / @var{main_h}
-
-@item main_sar
-The main input video's sample aspect ratio
-
-@item main_dar, mdar
-The main input video's display aspect ratio. Calculated from
-@code{(main_w / main_h) * main_sar}.
-
-@item main_n
-The (sequential) number of the main input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item main_t
-The presentation timestamp of the main input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item main_pos
-The position (byte offset) of the frame in the main input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale a subtitle stream (b) to match the main video (a) in size before overlaying
-@example
-'scale2ref_npp[b][a];[a][b]overlay_cuda'
-@end example
-
-@item
-Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
-@example
-[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
-@end example
-@end itemize
-
 @section scale_vt
 
 Scale and convert the color parameters using VTPixelTransferSession.
@@ -22243,23 +21719,6 @@ Keep the same chroma location (default).
 @end table
 @end table
 
-@section sharpen_npp
-Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
-border control.
-
-The following additional options are accepted:
-@table @option
-
-@item border_type
-Type of sampling to be used ad frame borders. One of the following:
-@table @option
-
-@item replicate
-Replicate pixel values.
-
-@end table
-@end table
-
 @section shear
 Apply shear transform to input video.
 
@@ -24417,47 +23876,6 @@ The command above can also be specified as:
 transpose=1:portrait
 @end example
 
-@section transpose_npp
-
-Transpose rows with columns in the input video and optionally flip it.
-For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
-
-It accepts the following parameters:
-
-@table @option
-
-@item dir
-Specify the transposition direction.
-
-Can assume the following values:
-@table @samp
-@item cclock_flip
-Rotate by 90 degrees counterclockwise and vertically flip. (default)
-
-@item clock
-Rotate by 90 degrees clockwise.
-
-@item cclock
-Rotate by 90 degrees counterclockwise.
-
-@item clock_flip
-Rotate by 90 degrees clockwise and vertically flip.
-@end table
-
-@item passthrough
-Do not apply the transposition if the input geometry matches the one
-specified by the specified value. It accepts the following values:
-@table @samp
-@item none
-Always apply transposition. (default)
-@item portrait
-Preserve portrait geometry (when @var{height} >= @var{width}).
-@item landscape
-Preserve landscape geometry (when @var{width} >= @var{height}).
-@end table
-
-@end table
-
 @section trim
 Trim the input so that the output contains one continuous subpart of the input.
 
@@ -26644,64 +26062,6 @@ filter").
 It accepts the following parameters:
 
 
-@table @option
-
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@item 2, send_frame_nospatial
-Like @code{send_frame}, but it skips the spatial interlacing check.
-@item 3, send_field_nospatial
-Like @code{send_field}, but it skips the spatial interlacing check.
-@end table
-
-The default value is @code{send_frame}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
-@section yadif_cuda
-
-Deinterlace the input video using the @ref{yadif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-
 @table @option
 
 @item mode
@@ -27172,6 +26532,719 @@ value.
 
 @c man end VIDEO FILTERS
 
+@chapter CUDA Video Filters
+@c man begin CUDA Video Filters
+
+To enable CUDA and/or NPP filters please refer to configuration guidelines for @ref{CUDA} and for @ref{CUDA NPP} filters.
+
+Running CUDA filters requires you to initialize a hardware device and to pass that device to all filters in any filter graph.
+@table @option
+
+@item -init_hw_device cuda[=@var{name}][:@var{device}[,@var{key=value}...]]
+Initialise a new hardware device of type @var{cuda} called @var{name}, using the
+given device parameters.
+
+@item -filter_hw_device @var{name}
+Pass the hardware device called @var{name} to all filters in any filter graph.
+
+@end table
+
+For more detailed information see @url{https://www.ffmpeg.org/ffmpeg.html#Advanced-Video-options}
+
+@itemize
+@item
+Example of initializing second CUDA device on the system and running scale_cuda and bilateral_cuda filters.
+@example
+./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -init_hw_device cuda:1 -filter_complex \
+"[0:v]scale_cuda=format=yuv444p[scaled_video];[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+@end itemize
+
+Since CUDA filters operate exclusively on GPU memory, frame data must sometimes be uploaded (@ref{hwupload}) to hardware surfaces associated with the appropriate CUDA device before processing, and downloaded (@ref{hwdownload}) back to normal memory afterward, if required. Whether @ref{hwupload} or @ref{hwdownload} is necessary depends on the specific workflow:
+
+@itemize
+@item If the input frames are already in GPU memory (e.g., when using @code{-hwaccel cuda} or @code{-hwaccel_output_format cuda}), explicit use of @ref{hwupload} is not needed, as the data is already in the appropriate memory space.
+@item If the input frames are in CPU memory (e.g., software-decoded frames or frames processed by CPU-based filters), it is necessary to use @ref{hwupload} to transfer the data to GPU memory for CUDA processing.
+@item If the output of the CUDA filters needs to be further processed by software-based filters or saved in a format not supported by GPU-based encoders, @ref{hwdownload} is required to transfer the data back to CPU memory.
+@end itemize
+Note that @ref{hwupload} uploads data to a surface with the same layout as the software frame, so it may be necessary to add a @ref{format} filter immediately before @ref{hwupload} to ensure the input is in the correct format. Similarly, @ref{hwdownload} may not support all output formats, so an additional @ref{format} filter may need to be inserted immediately after @ref{hwdownload} in the filter graph to ensure compatibility.
+
+@anchor{CUDA}
+@section CUDA
+Below is a description of the currently available Nvidia CUDA video filters.
+
+Prerequisites:
+@itemize
+@item Install Nvidia CUDA Toolkit
+@end itemize
+
+Note: If FFmpeg detects the Nvidia CUDA Toolkit during configuration, it will enable CUDA filters automatically without requiring any additional flags. If you want to explicitly enable them, use the following options:
+
+@itemize
+@item Configure FFmpeg with @code{--enable-cuda-nvcc --enable-nonfree}.
+@item Configure FFmpeg with @code{--enable-cuda-llvm}. Additional requirement: @code{llvm} lib must be installed.
+@end itemize
+
+@subsection bilateral_cuda
+CUDA accelerated bilateral filter, an edge preserving filter.
+This filter is mathematically accurate thanks to the use of GPU acceleration.
+For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
+
+The filter accepts the following options:
+@table @option
+@item sigmaS
+Set sigma of gaussian function to calculate spatial weight, also called sigma space.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item sigmaR
+Set sigma of gaussian function to calculate color range weight, also called sigma color.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item window_size
+Set window size of the bilateral function to determine the number of neighbours to loop on.
+If the number entered is even, one will be added automatically.
+Allowed range is 1 to 255. Default is 1.
+@end table
+@subsubsection Examples
+
+@itemize
+@item
+Apply the bilateral filter on a video.
+
+@example
+./ffmpeg -v verbose \
+-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
+-init_hw_device cuda \
+-filter_complex \
+" \
+[0:v]scale_cuda=format=yuv444p[scaled_video];
+[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+
+@end itemize
+
+@subsection bwdif_cuda
+
+Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+@table @option
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@end table
+
+The default value is @code{send_field}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@subsection chromakey_cuda
+CUDA accelerated YUV colorspace color/chroma keying.
+
+This filter works like normal chromakey filter but operates on CUDA frames.
+for more details and parameters see @ref{chromakey}.
+
+@subsubsection Examples
+
+@itemize
+@item
+Make all the green pixels in the input video transparent and use it as an overlay for another video:
+
+@example
+./ffmpeg \
+    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
+    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
+    -init_hw_device cuda \
+    -filter_complex \
+    " \
+        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
+        [1:v]scale_cuda=format=yuv420p[base]; \
+        [base][overlay_video]overlay_cuda" \
+    -an -sn -c:v h264_nvenc -cq 20 output.mp4
+@end example
+
+@item
+Process two software sources, explicitly uploading the frames:
+
+@example
+./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
+    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
+    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
+    -filter_complex \
+    " \
+        [0]hwupload[under]; \
+        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
+        [under][over]overlay_cuda" \
+    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
+@end example
+
+@end itemize
+
+@subsection colorspace_cuda
+
+CUDA accelerated implementation of the colorspace filter.
+
+It is by no means feature complete compared to the software colorspace filter,
+and at the current time only supports color range conversion between jpeg/full
+and mpeg/limited range.
+
+The filter accepts the following options:
+
+@table @option
+@item range
+Specify output color range.
+
+The accepted values are:
+@table @samp
+@item tv
+TV (restricted) range
+
+@item mpeg
+MPEG (restricted) range
+
+@item pc
+PC (full) range
+
+@item jpeg
+JPEG (full) range
+
+@end table
+
+@end table
+
+@anchor{overlay_cuda}
+@subsection overlay_cuda
+
+Overlay one video on top of another.
+
+This is the CUDA variant of the @ref{overlay} filter.
+It only accepts CUDA frames. The underlying input pixel formats have to match.
+
+It takes two inputs and has one output. The first input is the "main"
+video on which the second input is overlaid.
+
+It accepts the following parameters:
+
+@table @option
+@item x
+@item y
+Set expressions for the x and y coordinates of the overlaid video
+on the main video.
+
+They can contain the following parameters:
+
+@table @option
+
+@item main_w, W
+@item main_h, H
+The main input width and height.
+
+@item overlay_w, w
+@item overlay_h, h
+The overlay input width and height.
+
+@item x
+@item y
+The computed values for @var{x} and @var{y}. They are evaluated for
+each new frame.
+
+@item n
+The ordinal index of the main input frame, starting from 0.
+
+@item pos
+The byte offset position in the file of the main input frame, NAN if unknown.
+Deprecated, do not use.
+
+@item t
+The timestamp of the main input frame, expressed in seconds, NAN if unknown.
+
+@end table
+
+Default value is "0" for both expressions.
+
+@item eval
+Set when the expressions for @option{x} and @option{y} are evaluated.
+
+It accepts the following values:
+@table @option
+@item init
+Evaluate expressions once during filter initialization or
+when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame
+@end table
+
+Default value is @option{frame}.
+
+@item eof_action
+See @ref{framesync}.
+
+@item shortest
+See @ref{framesync}.
+
+@item repeatlast
+See @ref{framesync}.
+
+@end table
+
+This filter also supports the @ref{framesync} options.
+
+@anchor{scale_cuda}
+@subsection scale_cuda
+
+Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
+Setting the output width and height works in the same way as for the @ref{scale} filter.
+
+The filter accepts the following options:
+@table @option
+@item w
+@item h
+Set the output video dimension expression. Default value is the input dimension.
+
+Allows for the same expressions as the @ref{scale} filter.
+
+@item interp_algo
+Sets the algorithm used for scaling:
+
+@table @var
+@item nearest
+Nearest neighbour
+
+Used by default if input parameters match the desired output.
+
+@item bilinear
+Bilinear
+
+@item bicubic
+Bicubic
+
+This is the default.
+
+@item lanczos
+Lanczos
+
+@end table
+
+@item format
+Controls the output pixel format. By default, or if none is specified, the input
+pixel format is used.
+
+The filter does not support converting between YUV and RGB pixel formats.
+
+@item passthrough
+If set to 0, every frame is processed, even if no conversion is necessary.
+This mode can be useful to use the filter as a buffer for a downstream
+frame-consumer that exhausts the limited decoder frame pool.
+
+If set to 1, frames are passed through as-is if they match the desired output
+parameters. This is the default behaviour.
+
+@item param
+Algorithm-Specific parameter.
+
+Affects the curves of the bicubic algorithm.
+
+@item force_original_aspect_ratio
+@item force_divisible_by
+Work the same as the identical @ref{scale} filter options.
+
+@item reset_sar
+Works the same as the identical @ref{scale} filter option.
+
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
+
+@item
+Upscale to 4K using nearest neighbour algorithm.
+@example
+scale_cuda=4096:2160:interp_algo=nearest
+@end example
+
+@item
+Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
+This can be useful to deal with a filter and encode chain that otherwise exhausts the
+decoders frame pool.
+@example
+scale_cuda=passthrough=0
+@end example
+@end itemize
+
+@subsection yadif_cuda
+
+Deinterlace the input video using the @ref{yadif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+
+@table @option
+
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@item 2, send_frame_nospatial
+Like @code{send_frame}, but it skips the spatial interlacing check.
+@item 3, send_field_nospatial
+Like @code{send_field}, but it skips the spatial interlacing check.
+@end table
+
+The default value is @code{send_frame}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@anchor{CUDA NPP}
+@section CUDA NPP
+Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
+
+Prerequisites:
+@itemize
+@item Install Nvidia CUDA Toolkit
+@item Install libnpp
+@end itemize
+
+To enable CUDA NPP filters:
+
+@itemize
+@item Configure FFmpeg with @code{--enable-nonfree --enable-libnpp}.
+@end itemize
+
+
+@anchor{scale_npp}
+@subsection scale_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
+format conversion on CUDA video frames. Setting the output width and height
+works in the same way as for the @var{scale} filter.
+
+The following additional options are accepted:
+@table @option
+@item format
+The pixel format of the output CUDA frames. If set to the string "same" (the
+default), the input format will be kept. Note that automatic format negotiation
+and conversion is not yet supported for hardware frames
+
+@item interp_algo
+The interpolation algorithm used for resizing. One of the following:
+@table @option
+@item nn
+Nearest neighbour.
+
+@item linear
+@item cubic
+@item cubic2p_bspline
+2-parameter cubic (B=1, C=0)
+
+@item cubic2p_catmullrom
+2-parameter cubic (B=0, C=1/2)
+
+@item cubic2p_b05c03
+2-parameter cubic (B=1/2, C=3/10)
+
+@item super
+Supersampling
+
+@item lanczos
+@end table
+
+@item force_original_aspect_ratio
+Enable decreasing or increasing output video width or height if necessary to
+keep the original aspect ratio. Possible values:
+
+@table @samp
+@item disable
+Scale the video as specified and disable this feature.
+
+@item decrease
+The output video dimensions will automatically be decreased if needed.
+
+@item increase
+The output video dimensions will automatically be increased if needed.
+
+@end table
+
+One useful instance of this option is that when you know a specific device's
+maximum allowed resolution, you can use this to limit the output video to
+that, while retaining the aspect ratio. For example, device A allows
+1280x720 playback, and your video is 1920x800. Using this option (set it to
+decrease) and specifying 1280x720 to the command line makes the output
+1280x533.
+
+Please note that this is a different thing than specifying -1 for @option{w}
+or @option{h}, you still need to specify the output resolution for this option
+to work.
+
+@item force_divisible_by
+Ensures that both the output dimensions, width and height, are divisible by the
+given integer when used together with @option{force_original_aspect_ratio}. This
+works similar to using @code{-n} in the @option{w} and @option{h} options.
+
+This option respects the value set for @option{force_original_aspect_ratio},
+increasing or decreasing the resolution accordingly. The video's aspect ratio
+may be slightly modified.
+
+This option can be handy if you need to have a video fit within or exceed
+a defined resolution using @option{force_original_aspect_ratio} but also have
+encoder restrictions on width or height divisibility.
+
+@item reset_sar
+Works the same as the identical @ref{scale} filter option.
+
+@item eval
+Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
+
+@table @samp
+@item init
+Only evaluate expressions once during the filter initialization or when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame.
+
+@end table
+
+@end table
+
+The values of the @option{w} and @option{h} options are expressions
+containing the following constants:
+
+@table @var
+@item in_w
+@item in_h
+The input width and height
+
+@item iw
+@item ih
+These are the same as @var{in_w} and @var{in_h}.
+
+@item out_w
+@item out_h
+The output (scaled) width and height
+
+@item ow
+@item oh
+These are the same as @var{out_w} and @var{out_h}
+
+@item a
+The same as @var{iw} / @var{ih}
+
+@item sar
+input sample aspect ratio
+
+@item dar
+The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
+
+@item n
+The (sequential) number of the input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item t
+The presentation timestamp of the input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item pos
+The position (byte offset) of the frame in the input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+Deprecated, do not use.
+@end table
+
+@subsection scale2ref_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
+video, based on a reference video.
+
+See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
+but uses the reference video instead of the main input as basis. scale2ref_npp
+also supports the following additional constants for the @option{w} and
+@option{h} options:
+
+@table @var
+@item main_w
+@item main_h
+The main input video's width and height
+
+@item main_a
+The same as @var{main_w} / @var{main_h}
+
+@item main_sar
+The main input video's sample aspect ratio
+
+@item main_dar, mdar
+The main input video's display aspect ratio. Calculated from
+@code{(main_w / main_h) * main_sar}.
+
+@item main_n
+The (sequential) number of the main input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item main_t
+The presentation timestamp of the main input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item main_pos
+The position (byte offset) of the frame in the main input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale a subtitle stream (b) to match the main video (a) in size before overlaying
+@example
+'scale2ref_npp[b][a];[a][b]overlay_cuda'
+@end example
+
+@item
+Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
+@example
+[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
+@end example
+@end itemize
+
+@subsection sharpen_npp
+Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
+border control.
+
+The following additional options are accepted:
+@table @option
+
+@item border_type
+Type of sampling to be used ad frame borders. One of the following:
+@table @option
+
+@item replicate
+Replicate pixel values.
+
+@end table
+@end table
+
+@subsection transpose_npp
+
+Transpose rows with columns in the input video and optionally flip it.
+For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
+
+It accepts the following parameters:
+
+@table @option
+
+@item dir
+Specify the transposition direction.
+
+Can assume the following values:
+@table @samp
+@item cclock_flip
+Rotate by 90 degrees counterclockwise and vertically flip. (default)
+
+@item clock
+Rotate by 90 degrees clockwise.
+
+@item cclock
+Rotate by 90 degrees counterclockwise.
+
+@item clock_flip
+Rotate by 90 degrees clockwise and vertically flip.
+@end table
+
+@item passthrough
+Do not apply the transposition if the input geometry matches the one
+specified by the specified value. It accepts the following values:
+@table @samp
+@item none
+Always apply transposition. (default)
+@item portrait
+Preserve portrait geometry (when @var{height} >= @var{width}).
+@item landscape
+Preserve landscape geometry (when @var{width} >= @var{height}).
+@end table
+
+@end table
+
+@c man end CUDA Video Filters
+
 @chapter OpenCL Video Filters
 @c man begin OPENCL VIDEO FILTERS
 
-- 
2.39.5 (Apple Git-154)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-03-10 10:38                 ` Danil Iashchenko
@ 2025-03-16 19:15                   ` Danil Iashchenko
  2025-03-17  5:55                     ` Gyan Doshi
  0 siblings, 1 reply; 12+ messages in thread
From: Danil Iashchenko @ 2025-03-16 19:15 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Danil Iashchenko


Hi Gyan and Michael,
Thank you for reviewing the patch and providing feedback!
I've addressed all the issues and resubmitting the patch (built and tested with Texinfo 7.1.1).

Per Gyan's suggestion, I'm resubmitting since Patchwork was down when I originally sent it on the 10th.

Please let me know if there's anything else I can clarify or improve.
Thanks again!

---
 doc/filters.texi | 1353 ++++++++++++++++++++++++----------------------
 1 file changed, 713 insertions(+), 640 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 0ba7d3035f..37b8674756 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
 
 This filter supports the all above options as @ref{commands}.
 
-@section bilateral_cuda
-CUDA accelerated bilateral filter, an edge preserving filter.
-This filter is mathematically accurate thanks to the use of GPU acceleration.
-For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
-
-The filter accepts the following options:
-@table @option
-@item sigmaS
-Set sigma of gaussian function to calculate spatial weight, also called sigma space.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item sigmaR
-Set sigma of gaussian function to calculate color range weight, also called sigma color.
-Allowed range is 0.1 to 512. Default is 0.1.
-
-@item window_size
-Set window size of the bilateral function to determine the number of neighbours to loop on.
-If the number entered is even, one will be added automatically.
-Allowed range is 1 to 255. Default is 1.
-@end table
-@subsection Examples
-
-@itemize
-@item
-Apply the bilateral filter on a video.
-
-@example
-./ffmpeg -v verbose \
--hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
--init_hw_device cuda \
--filter_complex \
-" \
-[0:v]scale_cuda=format=yuv444p[scaled_video];
-[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
--an -sn -c:v h264_nvenc -cq 20 out.mp4
-@end example
-
-@end itemize
-
 @section bitplanenoise
 
 Show and measure bit plane noise.
@@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
 The default value is @code{all}.
 @end table
 
-@section bwdif_cuda
-
-Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-@table @option
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@end table
-
-The default value is @code{send_field}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
 @section ccrepack
 
 Repack CEA-708 closed captioning side data
@@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
 @end example
 @end itemize
 
-@section chromakey_cuda
-CUDA accelerated YUV colorspace color/chroma keying.
-
-This filter works like normal chromakey filter but operates on CUDA frames.
-for more details and parameters see @ref{chromakey}.
-
-@subsection Examples
-
-@itemize
-@item
-Make all the green pixels in the input video transparent and use it as an overlay for another video:
-
-@example
-./ffmpeg \
-    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
-    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
-    -init_hw_device cuda \
-    -filter_complex \
-    " \
-        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
-        [1:v]scale_cuda=format=yuv420p[base]; \
-        [base][overlay_video]overlay_cuda" \
-    -an -sn -c:v h264_nvenc -cq 20 output.mp4
-@end example
-
-@item
-Process two software sources, explicitly uploading the frames:
-
-@example
-./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
-    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
-    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
-    -filter_complex \
-    " \
-        [0]hwupload[under]; \
-        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
-        [under][over]overlay_cuda" \
-    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
-@end example
-
-@end itemize
-
 @section chromanr
 Reduce chrominance noise.
 
@@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
 colorspace=smpte240m
 @end example
 
-@section colorspace_cuda
-
-CUDA accelerated implementation of the colorspace filter.
-
-It is by no means feature complete compared to the software colorspace filter,
-and at the current time only supports color range conversion between jpeg/full
-and mpeg/limited range.
-
-The filter accepts the following options:
-
-@table @option
-@item range
-Specify output color range.
-
-The accepted values are:
-@table @samp
-@item tv
-TV (restricted) range
-
-@item mpeg
-MPEG (restricted) range
-
-@item pc
-PC (full) range
-
-@item jpeg
-JPEG (full) range
-
-@end table
-
-@end table
-
 @section colortemperature
 Adjust color temperature in video to simulate variations in ambient color temperature.
 
@@ -18988,84 +18823,6 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
 
 @end itemize
 
-@anchor{overlay_cuda}
-@section overlay_cuda
-
-Overlay one video on top of another.
-
-This is the CUDA variant of the @ref{overlay} filter.
-It only accepts CUDA frames. The underlying input pixel formats have to match.
-
-It takes two inputs and has one output. The first input is the "main"
-video on which the second input is overlaid.
-
-It accepts the following parameters:
-
-@table @option
-@item x
-@item y
-Set expressions for the x and y coordinates of the overlaid video
-on the main video.
-
-They can contain the following parameters:
-
-@table @option
-
-@item main_w, W
-@item main_h, H
-The main input width and height.
-
-@item overlay_w, w
-@item overlay_h, h
-The overlay input width and height.
-
-@item x
-@item y
-The computed values for @var{x} and @var{y}. They are evaluated for
-each new frame.
-
-@item n
-The ordinal index of the main input frame, starting from 0.
-
-@item pos
-The byte offset position in the file of the main input frame, NAN if unknown.
-Deprecated, do not use.
-
-@item t
-The timestamp of the main input frame, expressed in seconds, NAN if unknown.
-
-@end table
-
-Default value is "0" for both expressions.
-
-@item eval
-Set when the expressions for @option{x} and @option{y} are evaluated.
-
-It accepts the following values:
-@table @option
-@item init
-Evaluate expressions once during filter initialization or
-when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame
-@end table
-
-Default value is @option{frame}.
-
-@item eof_action
-See @ref{framesync}.
-
-@item shortest
-See @ref{framesync}.
-
-@item repeatlast
-See @ref{framesync}.
-
-@end table
-
-This filter also supports the @ref{framesync} options.
-
 @section owdenoise
 
 Apply Overcomplete Wavelet denoiser.
@@ -21516,287 +21273,6 @@ If the specified expression is not valid, it is kept at its current
 value.
 @end table
 
-@anchor{scale_cuda}
-@section scale_cuda
-
-Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
-Setting the output width and height works in the same way as for the @ref{scale} filter.
-
-The filter accepts the following options:
-@table @option
-@item w
-@item h
-Set the output video dimension expression. Default value is the input dimension.
-
-Allows for the same expressions as the @ref{scale} filter.
-
-@item interp_algo
-Sets the algorithm used for scaling:
-
-@table @var
-@item nearest
-Nearest neighbour
-
-Used by default if input parameters match the desired output.
-
-@item bilinear
-Bilinear
-
-@item bicubic
-Bicubic
-
-This is the default.
-
-@item lanczos
-Lanczos
-
-@end table
-
-@item format
-Controls the output pixel format. By default, or if none is specified, the input
-pixel format is used.
-
-The filter does not support converting between YUV and RGB pixel formats.
-
-@item passthrough
-If set to 0, every frame is processed, even if no conversion is necessary.
-This mode can be useful to use the filter as a buffer for a downstream
-frame-consumer that exhausts the limited decoder frame pool.
-
-If set to 1, frames are passed through as-is if they match the desired output
-parameters. This is the default behaviour.
-
-@item param
-Algorithm-Specific parameter.
-
-Affects the curves of the bicubic algorithm.
-
-@item force_original_aspect_ratio
-@item force_divisible_by
-Work the same as the identical @ref{scale} filter options.
-
-@item reset_sar
-Works the same as the identical @ref{scale} filter option.
-
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
-@example
-scale_cuda=-2:720:format=yuv420p
-@end example
-
-@item
-Upscale to 4K using nearest neighbour algorithm.
-@example
-scale_cuda=4096:2160:interp_algo=nearest
-@end example
-
-@item
-Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
-This can be useful to deal with a filter and encode chain that otherwise exhausts the
-decoders frame pool.
-@example
-scale_cuda=passthrough=0
-@end example
-@end itemize
-
-@anchor{scale_npp}
-@section scale_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
-format conversion on CUDA video frames. Setting the output width and height
-works in the same way as for the @var{scale} filter.
-
-The following additional options are accepted:
-@table @option
-@item format
-The pixel format of the output CUDA frames. If set to the string "same" (the
-default), the input format will be kept. Note that automatic format negotiation
-and conversion is not yet supported for hardware frames
-
-@item interp_algo
-The interpolation algorithm used for resizing. One of the following:
-@table @option
-@item nn
-Nearest neighbour.
-
-@item linear
-@item cubic
-@item cubic2p_bspline
-2-parameter cubic (B=1, C=0)
-
-@item cubic2p_catmullrom
-2-parameter cubic (B=0, C=1/2)
-
-@item cubic2p_b05c03
-2-parameter cubic (B=1/2, C=3/10)
-
-@item super
-Supersampling
-
-@item lanczos
-@end table
-
-@item force_original_aspect_ratio
-Enable decreasing or increasing output video width or height if necessary to
-keep the original aspect ratio. Possible values:
-
-@table @samp
-@item disable
-Scale the video as specified and disable this feature.
-
-@item decrease
-The output video dimensions will automatically be decreased if needed.
-
-@item increase
-The output video dimensions will automatically be increased if needed.
-
-@end table
-
-One useful instance of this option is that when you know a specific device's
-maximum allowed resolution, you can use this to limit the output video to
-that, while retaining the aspect ratio. For example, device A allows
-1280x720 playback, and your video is 1920x800. Using this option (set it to
-decrease) and specifying 1280x720 to the command line makes the output
-1280x533.
-
-Please note that this is a different thing than specifying -1 for @option{w}
-or @option{h}, you still need to specify the output resolution for this option
-to work.
-
-@item force_divisible_by
-Ensures that both the output dimensions, width and height, are divisible by the
-given integer when used together with @option{force_original_aspect_ratio}. This
-works similar to using @code{-n} in the @option{w} and @option{h} options.
-
-This option respects the value set for @option{force_original_aspect_ratio},
-increasing or decreasing the resolution accordingly. The video's aspect ratio
-may be slightly modified.
-
-This option can be handy if you need to have a video fit within or exceed
-a defined resolution using @option{force_original_aspect_ratio} but also have
-encoder restrictions on width or height divisibility.
-
-@item reset_sar
-Works the same as the identical @ref{scale} filter option.
-
-@item eval
-Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
-
-@table @samp
-@item init
-Only evaluate expressions once during the filter initialization or when a command is processed.
-
-@item frame
-Evaluate expressions for each incoming frame.
-
-@end table
-
-@end table
-
-The values of the @option{w} and @option{h} options are expressions
-containing the following constants:
-
-@table @var
-@item in_w
-@item in_h
-The input width and height
-
-@item iw
-@item ih
-These are the same as @var{in_w} and @var{in_h}.
-
-@item out_w
-@item out_h
-The output (scaled) width and height
-
-@item ow
-@item oh
-These are the same as @var{out_w} and @var{out_h}
-
-@item a
-The same as @var{iw} / @var{ih}
-
-@item sar
-input sample aspect ratio
-
-@item dar
-The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
-
-@item n
-The (sequential) number of the input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item t
-The presentation timestamp of the input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item pos
-The position (byte offset) of the frame in the input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-Deprecated, do not use.
-@end table
-
-@section scale2ref_npp
-
-Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
-video, based on a reference video.
-
-See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
-but uses the reference video instead of the main input as basis. scale2ref_npp
-also supports the following additional constants for the @option{w} and
-@option{h} options:
-
-@table @var
-@item main_w
-@item main_h
-The main input video's width and height
-
-@item main_a
-The same as @var{main_w} / @var{main_h}
-
-@item main_sar
-The main input video's sample aspect ratio
-
-@item main_dar, mdar
-The main input video's display aspect ratio. Calculated from
-@code{(main_w / main_h) * main_sar}.
-
-@item main_n
-The (sequential) number of the main input frame, starting from 0.
-Only available with @code{eval=frame}.
-
-@item main_t
-The presentation timestamp of the main input frame, expressed as a number of
-seconds. Only available with @code{eval=frame}.
-
-@item main_pos
-The position (byte offset) of the frame in the main input stream, or NaN if
-this information is unavailable and/or meaningless (for example in case of synthetic video).
-Only available with @code{eval=frame}.
-@end table
-
-@subsection Examples
-
-@itemize
-@item
-Scale a subtitle stream (b) to match the main video (a) in size before overlaying
-@example
-'scale2ref_npp[b][a];[a][b]overlay_cuda'
-@end example
-
-@item
-Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
-@example
-[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
-@end example
-@end itemize
-
 @section scale_vt
 
 Scale and convert the color parameters using VTPixelTransferSession.
@@ -22243,23 +21719,6 @@ Keep the same chroma location (default).
 @end table
 @end table
 
-@section sharpen_npp
-Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
-border control.
-
-The following additional options are accepted:
-@table @option
-
-@item border_type
-Type of sampling to be used ad frame borders. One of the following:
-@table @option
-
-@item replicate
-Replicate pixel values.
-
-@end table
-@end table
-
 @section shear
 Apply shear transform to input video.
 
@@ -24417,47 +23876,6 @@ The command above can also be specified as:
 transpose=1:portrait
 @end example
 
-@section transpose_npp
-
-Transpose rows with columns in the input video and optionally flip it.
-For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
-
-It accepts the following parameters:
-
-@table @option
-
-@item dir
-Specify the transposition direction.
-
-Can assume the following values:
-@table @samp
-@item cclock_flip
-Rotate by 90 degrees counterclockwise and vertically flip. (default)
-
-@item clock
-Rotate by 90 degrees clockwise.
-
-@item cclock
-Rotate by 90 degrees counterclockwise.
-
-@item clock_flip
-Rotate by 90 degrees clockwise and vertically flip.
-@end table
-
-@item passthrough
-Do not apply the transposition if the input geometry matches the one
-specified by the specified value. It accepts the following values:
-@table @samp
-@item none
-Always apply transposition. (default)
-@item portrait
-Preserve portrait geometry (when @var{height} >= @var{width}).
-@item landscape
-Preserve landscape geometry (when @var{width} >= @var{height}).
-@end table
-
-@end table
-
 @section trim
 Trim the input so that the output contains one continuous subpart of the input.
 
@@ -26644,64 +26062,6 @@ filter").
 It accepts the following parameters:
 
 
-@table @option
-
-@item mode
-The interlacing mode to adopt. It accepts one of the following values:
-
-@table @option
-@item 0, send_frame
-Output one frame for each frame.
-@item 1, send_field
-Output one frame for each field.
-@item 2, send_frame_nospatial
-Like @code{send_frame}, but it skips the spatial interlacing check.
-@item 3, send_field_nospatial
-Like @code{send_field}, but it skips the spatial interlacing check.
-@end table
-
-The default value is @code{send_frame}.
-
-@item parity
-The picture field parity assumed for the input interlaced video. It accepts one
-of the following values:
-
-@table @option
-@item 0, tff
-Assume the top field is first.
-@item 1, bff
-Assume the bottom field is first.
-@item -1, auto
-Enable automatic detection of field parity.
-@end table
-
-The default value is @code{auto}.
-If the interlacing is unknown or the decoder does not export this information,
-top field first will be assumed.
-
-@item deint
-Specify which frames to deinterlace. Accepts one of the following
-values:
-
-@table @option
-@item 0, all
-Deinterlace all frames.
-@item 1, interlaced
-Only deinterlace frames marked as interlaced.
-@end table
-
-The default value is @code{all}.
-@end table
-
-@section yadif_cuda
-
-Deinterlace the input video using the @ref{yadif} algorithm, but implemented
-in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
-and/or nvenc.
-
-It accepts the following parameters:
-
-
 @table @option
 
 @item mode
@@ -27172,6 +26532,719 @@ value.
 
 @c man end VIDEO FILTERS
 
+@chapter CUDA Video Filters
+@c man begin CUDA Video Filters
+
+To enable CUDA and/or NPP filters please refer to configuration guidelines for @ref{CUDA} and for @ref{CUDA NPP} filters.
+
+Running CUDA filters requires you to initialize a hardware device and to pass that device to all filters in any filter graph.
+@table @option
+
+@item -init_hw_device cuda[=@var{name}][:@var{device}[,@var{key=value}...]]
+Initialise a new hardware device of type @var{cuda} called @var{name}, using the
+given device parameters.
+
+@item -filter_hw_device @var{name}
+Pass the hardware device called @var{name} to all filters in any filter graph.
+
+@end table
+
+For more detailed information see @url{https://www.ffmpeg.org/ffmpeg.html#Advanced-Video-options}
+
+@itemize
+@item
+Example of initializing second CUDA device on the system and running scale_cuda and bilateral_cuda filters.
+@example
+./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -init_hw_device cuda:1 -filter_complex \
+"[0:v]scale_cuda=format=yuv444p[scaled_video];[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+@end itemize
+
+Since CUDA filters operate exclusively on GPU memory, frame data must sometimes be uploaded (@ref{hwupload}) to hardware surfaces associated with the appropriate CUDA device before processing, and downloaded (@ref{hwdownload}) back to normal memory afterward, if required. Whether @ref{hwupload} or @ref{hwdownload} is necessary depends on the specific workflow:
+
+@itemize
+@item If the input frames are already in GPU memory (e.g., when using @code{-hwaccel cuda} or @code{-hwaccel_output_format cuda}), explicit use of @ref{hwupload} is not needed, as the data is already in the appropriate memory space.
+@item If the input frames are in CPU memory (e.g., software-decoded frames or frames processed by CPU-based filters), it is necessary to use @ref{hwupload} to transfer the data to GPU memory for CUDA processing.
+@item If the output of the CUDA filters needs to be further processed by software-based filters or saved in a format not supported by GPU-based encoders, @ref{hwdownload} is required to transfer the data back to CPU memory.
+@end itemize
+Note that @ref{hwupload} uploads data to a surface with the same layout as the software frame, so it may be necessary to add a @ref{format} filter immediately before @ref{hwupload} to ensure the input is in the correct format. Similarly, @ref{hwdownload} may not support all output formats, so an additional @ref{format} filter may need to be inserted immediately after @ref{hwdownload} in the filter graph to ensure compatibility.
+
+@anchor{CUDA}
+@section CUDA
+Below is a description of the currently available Nvidia CUDA video filters.
+
+Prerequisites:
+@itemize
+@item Install Nvidia CUDA Toolkit
+@end itemize
+
+Note: If FFmpeg detects the Nvidia CUDA Toolkit during configuration, it will enable CUDA filters automatically without requiring any additional flags. If you want to explicitly enable them, use the following options:
+
+@itemize
+@item Configure FFmpeg with @code{--enable-cuda-nvcc --enable-nonfree}.
+@item Configure FFmpeg with @code{--enable-cuda-llvm}. Additional requirement: @code{llvm} lib must be installed.
+@end itemize
+
+@subsection bilateral_cuda
+CUDA accelerated bilateral filter, an edge preserving filter.
+This filter is mathematically accurate thanks to the use of GPU acceleration.
+For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
+
+The filter accepts the following options:
+@table @option
+@item sigmaS
+Set sigma of gaussian function to calculate spatial weight, also called sigma space.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item sigmaR
+Set sigma of gaussian function to calculate color range weight, also called sigma color.
+Allowed range is 0.1 to 512. Default is 0.1.
+
+@item window_size
+Set window size of the bilateral function to determine the number of neighbours to loop on.
+If the number entered is even, one will be added automatically.
+Allowed range is 1 to 255. Default is 1.
+@end table
+@subsubsection Examples
+
+@itemize
+@item
+Apply the bilateral filter on a video.
+
+@example
+./ffmpeg -v verbose \
+-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
+-init_hw_device cuda \
+-filter_complex \
+" \
+[0:v]scale_cuda=format=yuv444p[scaled_video];
+[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
+-an -sn -c:v h264_nvenc -cq 20 out.mp4
+@end example
+
+@end itemize
+
+@subsection bwdif_cuda
+
+Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+@table @option
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@end table
+
+The default value is @code{send_field}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@subsection chromakey_cuda
+CUDA accelerated YUV colorspace color/chroma keying.
+
+This filter works like normal chromakey filter but operates on CUDA frames.
+for more details and parameters see @ref{chromakey}.
+
+@subsubsection Examples
+
+@itemize
+@item
+Make all the green pixels in the input video transparent and use it as an overlay for another video:
+
+@example
+./ffmpeg \
+    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
+    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
+    -init_hw_device cuda \
+    -filter_complex \
+    " \
+        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
+        [1:v]scale_cuda=format=yuv420p[base]; \
+        [base][overlay_video]overlay_cuda" \
+    -an -sn -c:v h264_nvenc -cq 20 output.mp4
+@end example
+
+@item
+Process two software sources, explicitly uploading the frames:
+
+@example
+./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
+    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
+    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
+    -filter_complex \
+    " \
+        [0]hwupload[under]; \
+        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
+        [under][over]overlay_cuda" \
+    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
+@end example
+
+@end itemize
+
+@subsection colorspace_cuda
+
+CUDA accelerated implementation of the colorspace filter.
+
+It is by no means feature complete compared to the software colorspace filter,
+and at the current time only supports color range conversion between jpeg/full
+and mpeg/limited range.
+
+The filter accepts the following options:
+
+@table @option
+@item range
+Specify output color range.
+
+The accepted values are:
+@table @samp
+@item tv
+TV (restricted) range
+
+@item mpeg
+MPEG (restricted) range
+
+@item pc
+PC (full) range
+
+@item jpeg
+JPEG (full) range
+
+@end table
+
+@end table
+
+@anchor{overlay_cuda}
+@subsection overlay_cuda
+
+Overlay one video on top of another.
+
+This is the CUDA variant of the @ref{overlay} filter.
+It only accepts CUDA frames. The underlying input pixel formats have to match.
+
+It takes two inputs and has one output. The first input is the "main"
+video on which the second input is overlaid.
+
+It accepts the following parameters:
+
+@table @option
+@item x
+@item y
+Set expressions for the x and y coordinates of the overlaid video
+on the main video.
+
+They can contain the following parameters:
+
+@table @option
+
+@item main_w, W
+@item main_h, H
+The main input width and height.
+
+@item overlay_w, w
+@item overlay_h, h
+The overlay input width and height.
+
+@item x
+@item y
+The computed values for @var{x} and @var{y}. They are evaluated for
+each new frame.
+
+@item n
+The ordinal index of the main input frame, starting from 0.
+
+@item pos
+The byte offset position in the file of the main input frame, NAN if unknown.
+Deprecated, do not use.
+
+@item t
+The timestamp of the main input frame, expressed in seconds, NAN if unknown.
+
+@end table
+
+Default value is "0" for both expressions.
+
+@item eval
+Set when the expressions for @option{x} and @option{y} are evaluated.
+
+It accepts the following values:
+@table @option
+@item init
+Evaluate expressions once during filter initialization or
+when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame
+@end table
+
+Default value is @option{frame}.
+
+@item eof_action
+See @ref{framesync}.
+
+@item shortest
+See @ref{framesync}.
+
+@item repeatlast
+See @ref{framesync}.
+
+@end table
+
+This filter also supports the @ref{framesync} options.
+
+@anchor{scale_cuda}
+@subsection scale_cuda
+
+Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
+Setting the output width and height works in the same way as for the @ref{scale} filter.
+
+The filter accepts the following options:
+@table @option
+@item w
+@item h
+Set the output video dimension expression. Default value is the input dimension.
+
+Allows for the same expressions as the @ref{scale} filter.
+
+@item interp_algo
+Sets the algorithm used for scaling:
+
+@table @var
+@item nearest
+Nearest neighbour
+
+Used by default if input parameters match the desired output.
+
+@item bilinear
+Bilinear
+
+@item bicubic
+Bicubic
+
+This is the default.
+
+@item lanczos
+Lanczos
+
+@end table
+
+@item format
+Controls the output pixel format. By default, or if none is specified, the input
+pixel format is used.
+
+The filter does not support converting between YUV and RGB pixel formats.
+
+@item passthrough
+If set to 0, every frame is processed, even if no conversion is necessary.
+This mode can be useful to use the filter as a buffer for a downstream
+frame-consumer that exhausts the limited decoder frame pool.
+
+If set to 1, frames are passed through as-is if they match the desired output
+parameters. This is the default behaviour.
+
+@item param
+Algorithm-Specific parameter.
+
+Affects the curves of the bicubic algorithm.
+
+@item force_original_aspect_ratio
+@item force_divisible_by
+Work the same as the identical @ref{scale} filter options.
+
+@item reset_sar
+Works the same as the identical @ref{scale} filter option.
+
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
+@example
+scale_cuda=-2:720:format=yuv420p
+@end example
+
+@item
+Upscale to 4K using nearest neighbour algorithm.
+@example
+scale_cuda=4096:2160:interp_algo=nearest
+@end example
+
+@item
+Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
+This can be useful to deal with a filter and encode chain that otherwise exhausts the
+decoders frame pool.
+@example
+scale_cuda=passthrough=0
+@end example
+@end itemize
+
+@subsection yadif_cuda
+
+Deinterlace the input video using the @ref{yadif} algorithm, but implemented
+in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
+and/or nvenc.
+
+It accepts the following parameters:
+
+
+@table @option
+
+@item mode
+The interlacing mode to adopt. It accepts one of the following values:
+
+@table @option
+@item 0, send_frame
+Output one frame for each frame.
+@item 1, send_field
+Output one frame for each field.
+@item 2, send_frame_nospatial
+Like @code{send_frame}, but it skips the spatial interlacing check.
+@item 3, send_field_nospatial
+Like @code{send_field}, but it skips the spatial interlacing check.
+@end table
+
+The default value is @code{send_frame}.
+
+@item parity
+The picture field parity assumed for the input interlaced video. It accepts one
+of the following values:
+
+@table @option
+@item 0, tff
+Assume the top field is first.
+@item 1, bff
+Assume the bottom field is first.
+@item -1, auto
+Enable automatic detection of field parity.
+@end table
+
+The default value is @code{auto}.
+If the interlacing is unknown or the decoder does not export this information,
+top field first will be assumed.
+
+@item deint
+Specify which frames to deinterlace. Accepts one of the following
+values:
+
+@table @option
+@item 0, all
+Deinterlace all frames.
+@item 1, interlaced
+Only deinterlace frames marked as interlaced.
+@end table
+
+The default value is @code{all}.
+@end table
+
+@anchor{CUDA NPP}
+@section CUDA NPP
+Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
+
+Prerequisites:
+@itemize
+@item Install Nvidia CUDA Toolkit
+@item Install libnpp
+@end itemize
+
+To enable CUDA NPP filters:
+
+@itemize
+@item Configure FFmpeg with @code{--enable-nonfree --enable-libnpp}.
+@end itemize
+
+
+@anchor{scale_npp}
+@subsection scale_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
+format conversion on CUDA video frames. Setting the output width and height
+works in the same way as for the @var{scale} filter.
+
+The following additional options are accepted:
+@table @option
+@item format
+The pixel format of the output CUDA frames. If set to the string "same" (the
+default), the input format will be kept. Note that automatic format negotiation
+and conversion is not yet supported for hardware frames
+
+@item interp_algo
+The interpolation algorithm used for resizing. One of the following:
+@table @option
+@item nn
+Nearest neighbour.
+
+@item linear
+@item cubic
+@item cubic2p_bspline
+2-parameter cubic (B=1, C=0)
+
+@item cubic2p_catmullrom
+2-parameter cubic (B=0, C=1/2)
+
+@item cubic2p_b05c03
+2-parameter cubic (B=1/2, C=3/10)
+
+@item super
+Supersampling
+
+@item lanczos
+@end table
+
+@item force_original_aspect_ratio
+Enable decreasing or increasing output video width or height if necessary to
+keep the original aspect ratio. Possible values:
+
+@table @samp
+@item disable
+Scale the video as specified and disable this feature.
+
+@item decrease
+The output video dimensions will automatically be decreased if needed.
+
+@item increase
+The output video dimensions will automatically be increased if needed.
+
+@end table
+
+One useful instance of this option is that when you know a specific device's
+maximum allowed resolution, you can use this to limit the output video to
+that, while retaining the aspect ratio. For example, device A allows
+1280x720 playback, and your video is 1920x800. Using this option (set it to
+decrease) and specifying 1280x720 to the command line makes the output
+1280x533.
+
+Please note that this is a different thing than specifying -1 for @option{w}
+or @option{h}, you still need to specify the output resolution for this option
+to work.
+
+@item force_divisible_by
+Ensures that both the output dimensions, width and height, are divisible by the
+given integer when used together with @option{force_original_aspect_ratio}. This
+works similar to using @code{-n} in the @option{w} and @option{h} options.
+
+This option respects the value set for @option{force_original_aspect_ratio},
+increasing or decreasing the resolution accordingly. The video's aspect ratio
+may be slightly modified.
+
+This option can be handy if you need to have a video fit within or exceed
+a defined resolution using @option{force_original_aspect_ratio} but also have
+encoder restrictions on width or height divisibility.
+
+@item reset_sar
+Works the same as the identical @ref{scale} filter option.
+
+@item eval
+Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
+
+@table @samp
+@item init
+Only evaluate expressions once during the filter initialization or when a command is processed.
+
+@item frame
+Evaluate expressions for each incoming frame.
+
+@end table
+
+@end table
+
+The values of the @option{w} and @option{h} options are expressions
+containing the following constants:
+
+@table @var
+@item in_w
+@item in_h
+The input width and height
+
+@item iw
+@item ih
+These are the same as @var{in_w} and @var{in_h}.
+
+@item out_w
+@item out_h
+The output (scaled) width and height
+
+@item ow
+@item oh
+These are the same as @var{out_w} and @var{out_h}
+
+@item a
+The same as @var{iw} / @var{ih}
+
+@item sar
+input sample aspect ratio
+
+@item dar
+The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
+
+@item n
+The (sequential) number of the input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item t
+The presentation timestamp of the input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item pos
+The position (byte offset) of the frame in the input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+Deprecated, do not use.
+@end table
+
+@subsection scale2ref_npp
+
+Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
+video, based on a reference video.
+
+See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
+but uses the reference video instead of the main input as basis. scale2ref_npp
+also supports the following additional constants for the @option{w} and
+@option{h} options:
+
+@table @var
+@item main_w
+@item main_h
+The main input video's width and height
+
+@item main_a
+The same as @var{main_w} / @var{main_h}
+
+@item main_sar
+The main input video's sample aspect ratio
+
+@item main_dar, mdar
+The main input video's display aspect ratio. Calculated from
+@code{(main_w / main_h) * main_sar}.
+
+@item main_n
+The (sequential) number of the main input frame, starting from 0.
+Only available with @code{eval=frame}.
+
+@item main_t
+The presentation timestamp of the main input frame, expressed as a number of
+seconds. Only available with @code{eval=frame}.
+
+@item main_pos
+The position (byte offset) of the frame in the main input stream, or NaN if
+this information is unavailable and/or meaningless (for example in case of synthetic video).
+Only available with @code{eval=frame}.
+@end table
+
+@subsubsection Examples
+
+@itemize
+@item
+Scale a subtitle stream (b) to match the main video (a) in size before overlaying
+@example
+'scale2ref_npp[b][a];[a][b]overlay_cuda'
+@end example
+
+@item
+Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
+@example
+[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
+@end example
+@end itemize
+
+@subsection sharpen_npp
+Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
+border control.
+
+The following additional options are accepted:
+@table @option
+
+@item border_type
+Type of sampling to be used ad frame borders. One of the following:
+@table @option
+
+@item replicate
+Replicate pixel values.
+
+@end table
+@end table
+
+@subsection transpose_npp
+
+Transpose rows with columns in the input video and optionally flip it.
+For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
+
+It accepts the following parameters:
+
+@table @option
+
+@item dir
+Specify the transposition direction.
+
+Can assume the following values:
+@table @samp
+@item cclock_flip
+Rotate by 90 degrees counterclockwise and vertically flip. (default)
+
+@item clock
+Rotate by 90 degrees clockwise.
+
+@item cclock
+Rotate by 90 degrees counterclockwise.
+
+@item clock_flip
+Rotate by 90 degrees clockwise and vertically flip.
+@end table
+
+@item passthrough
+Do not apply the transposition if the input geometry matches the one
+specified by the specified value. It accepts the following values:
+@table @samp
+@item none
+Always apply transposition. (default)
+@item portrait
+Preserve portrait geometry (when @var{height} >= @var{width}).
+@item landscape
+Preserve landscape geometry (when @var{width} >= @var{height}).
+@end table
+
+@end table
+
+@c man end CUDA Video Filters
+
 @chapter OpenCL Video Filters
 @c man begin OPENCL VIDEO FILTERS
 
-- 
2.39.5 (Apple Git-154)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-03-16 19:15                   ` Danil Iashchenko
@ 2025-03-17  5:55                     ` Gyan Doshi
  2025-03-17  7:14                       ` Gyan Doshi
  0 siblings, 1 reply; 12+ messages in thread
From: Gyan Doshi @ 2025-03-17  5:55 UTC (permalink / raw)
  To: ffmpeg-devel



On 2025-03-17 12:45 am, Danil Iashchenko wrote:
> Hi Gyan and Michael,
> Thank you for reviewing the patch and providing feedback!
> I've addressed all the issues and resubmitting the patch (built and tested with Texinfo 7.1.1).
>
> Per Gyan's suggestion, I'm resubmitting since Patchwork was down when I originally sent it on the 10th.
>
> Please let me know if there's anything else I can clarify or improve.
> Thanks again!

Generally, looks fine. I've a couple of minor gripes but I'll adjust the 
commit msg and apply this.
We can then address the minor points.

Regards,
Gyan


>
> ---
>   doc/filters.texi | 1353 ++++++++++++++++++++++++----------------------
>   1 file changed, 713 insertions(+), 640 deletions(-)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 0ba7d3035f..37b8674756 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -8619,45 +8619,6 @@ Set planes to filter. Default is first only.
>   
>   This filter supports the all above options as @ref{commands}.
>   
> -@section bilateral_cuda
> -CUDA accelerated bilateral filter, an edge preserving filter.
> -This filter is mathematically accurate thanks to the use of GPU acceleration.
> -For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
> -
> -The filter accepts the following options:
> -@table @option
> -@item sigmaS
> -Set sigma of gaussian function to calculate spatial weight, also called sigma space.
> -Allowed range is 0.1 to 512. Default is 0.1.
> -
> -@item sigmaR
> -Set sigma of gaussian function to calculate color range weight, also called sigma color.
> -Allowed range is 0.1 to 512. Default is 0.1.
> -
> -@item window_size
> -Set window size of the bilateral function to determine the number of neighbours to loop on.
> -If the number entered is even, one will be added automatically.
> -Allowed range is 1 to 255. Default is 1.
> -@end table
> -@subsection Examples
> -
> -@itemize
> -@item
> -Apply the bilateral filter on a video.
> -
> -@example
> -./ffmpeg -v verbose \
> --hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
> --init_hw_device cuda \
> --filter_complex \
> -" \
> -[0:v]scale_cuda=format=yuv444p[scaled_video];
> -[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
> --an -sn -c:v h264_nvenc -cq 20 out.mp4
> -@end example
> -
> -@end itemize
> -
>   @section bitplanenoise
>   
>   Show and measure bit plane noise.
> @@ -9243,58 +9204,6 @@ Only deinterlace frames marked as interlaced.
>   The default value is @code{all}.
>   @end table
>   
> -@section bwdif_cuda
> -
> -Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
> -in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> -and/or nvenc.
> -
> -It accepts the following parameters:
> -
> -@table @option
> -@item mode
> -The interlacing mode to adopt. It accepts one of the following values:
> -
> -@table @option
> -@item 0, send_frame
> -Output one frame for each frame.
> -@item 1, send_field
> -Output one frame for each field.
> -@end table
> -
> -The default value is @code{send_field}.
> -
> -@item parity
> -The picture field parity assumed for the input interlaced video. It accepts one
> -of the following values:
> -
> -@table @option
> -@item 0, tff
> -Assume the top field is first.
> -@item 1, bff
> -Assume the bottom field is first.
> -@item -1, auto
> -Enable automatic detection of field parity.
> -@end table
> -
> -The default value is @code{auto}.
> -If the interlacing is unknown or the decoder does not export this information,
> -top field first will be assumed.
> -
> -@item deint
> -Specify which frames to deinterlace. Accepts one of the following
> -values:
> -
> -@table @option
> -@item 0, all
> -Deinterlace all frames.
> -@item 1, interlaced
> -Only deinterlace frames marked as interlaced.
> -@end table
> -
> -The default value is @code{all}.
> -@end table
> -
>   @section ccrepack
>   
>   Repack CEA-708 closed captioning side data
> @@ -9408,48 +9317,6 @@ ffmpeg -f lavfi -i color=c=black:s=1280x720 -i video.mp4 -shortest -filter_compl
>   @end example
>   @end itemize
>   
> -@section chromakey_cuda
> -CUDA accelerated YUV colorspace color/chroma keying.
> -
> -This filter works like normal chromakey filter but operates on CUDA frames.
> -for more details and parameters see @ref{chromakey}.
> -
> -@subsection Examples
> -
> -@itemize
> -@item
> -Make all the green pixels in the input video transparent and use it as an overlay for another video:
> -
> -@example
> -./ffmpeg \
> -    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
> -    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
> -    -init_hw_device cuda \
> -    -filter_complex \
> -    " \
> -        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
> -        [1:v]scale_cuda=format=yuv420p[base]; \
> -        [base][overlay_video]overlay_cuda" \
> -    -an -sn -c:v h264_nvenc -cq 20 output.mp4
> -@end example
> -
> -@item
> -Process two software sources, explicitly uploading the frames:
> -
> -@example
> -./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
> -    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
> -    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
> -    -filter_complex \
> -    " \
> -        [0]hwupload[under]; \
> -        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
> -        [under][over]overlay_cuda" \
> -    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
> -@end example
> -
> -@end itemize
> -
>   @section chromanr
>   Reduce chrominance noise.
>   
> @@ -10427,38 +10294,6 @@ For example to convert the input to SMPTE-240M, use the command:
>   colorspace=smpte240m
>   @end example
>   
> -@section colorspace_cuda
> -
> -CUDA accelerated implementation of the colorspace filter.
> -
> -It is by no means feature complete compared to the software colorspace filter,
> -and at the current time only supports color range conversion between jpeg/full
> -and mpeg/limited range.
> -
> -The filter accepts the following options:
> -
> -@table @option
> -@item range
> -Specify output color range.
> -
> -The accepted values are:
> -@table @samp
> -@item tv
> -TV (restricted) range
> -
> -@item mpeg
> -MPEG (restricted) range
> -
> -@item pc
> -PC (full) range
> -
> -@item jpeg
> -JPEG (full) range
> -
> -@end table
> -
> -@end table
> -
>   @section colortemperature
>   Adjust color temperature in video to simulate variations in ambient color temperature.
>   
> @@ -18988,84 +18823,6 @@ testsrc=s=100x100, split=4 [in0][in1][in2][in3];
>   
>   @end itemize
>   
> -@anchor{overlay_cuda}
> -@section overlay_cuda
> -
> -Overlay one video on top of another.
> -
> -This is the CUDA variant of the @ref{overlay} filter.
> -It only accepts CUDA frames. The underlying input pixel formats have to match.
> -
> -It takes two inputs and has one output. The first input is the "main"
> -video on which the second input is overlaid.
> -
> -It accepts the following parameters:
> -
> -@table @option
> -@item x
> -@item y
> -Set expressions for the x and y coordinates of the overlaid video
> -on the main video.
> -
> -They can contain the following parameters:
> -
> -@table @option
> -
> -@item main_w, W
> -@item main_h, H
> -The main input width and height.
> -
> -@item overlay_w, w
> -@item overlay_h, h
> -The overlay input width and height.
> -
> -@item x
> -@item y
> -The computed values for @var{x} and @var{y}. They are evaluated for
> -each new frame.
> -
> -@item n
> -The ordinal index of the main input frame, starting from 0.
> -
> -@item pos
> -The byte offset position in the file of the main input frame, NAN if unknown.
> -Deprecated, do not use.
> -
> -@item t
> -The timestamp of the main input frame, expressed in seconds, NAN if unknown.
> -
> -@end table
> -
> -Default value is "0" for both expressions.
> -
> -@item eval
> -Set when the expressions for @option{x} and @option{y} are evaluated.
> -
> -It accepts the following values:
> -@table @option
> -@item init
> -Evaluate expressions once during filter initialization or
> -when a command is processed.
> -
> -@item frame
> -Evaluate expressions for each incoming frame
> -@end table
> -
> -Default value is @option{frame}.
> -
> -@item eof_action
> -See @ref{framesync}.
> -
> -@item shortest
> -See @ref{framesync}.
> -
> -@item repeatlast
> -See @ref{framesync}.
> -
> -@end table
> -
> -This filter also supports the @ref{framesync} options.
> -
>   @section owdenoise
>   
>   Apply Overcomplete Wavelet denoiser.
> @@ -21516,287 +21273,6 @@ If the specified expression is not valid, it is kept at its current
>   value.
>   @end table
>   
> -@anchor{scale_cuda}
> -@section scale_cuda
> -
> -Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
> -Setting the output width and height works in the same way as for the @ref{scale} filter.
> -
> -The filter accepts the following options:
> -@table @option
> -@item w
> -@item h
> -Set the output video dimension expression. Default value is the input dimension.
> -
> -Allows for the same expressions as the @ref{scale} filter.
> -
> -@item interp_algo
> -Sets the algorithm used for scaling:
> -
> -@table @var
> -@item nearest
> -Nearest neighbour
> -
> -Used by default if input parameters match the desired output.
> -
> -@item bilinear
> -Bilinear
> -
> -@item bicubic
> -Bicubic
> -
> -This is the default.
> -
> -@item lanczos
> -Lanczos
> -
> -@end table
> -
> -@item format
> -Controls the output pixel format. By default, or if none is specified, the input
> -pixel format is used.
> -
> -The filter does not support converting between YUV and RGB pixel formats.
> -
> -@item passthrough
> -If set to 0, every frame is processed, even if no conversion is necessary.
> -This mode can be useful to use the filter as a buffer for a downstream
> -frame-consumer that exhausts the limited decoder frame pool.
> -
> -If set to 1, frames are passed through as-is if they match the desired output
> -parameters. This is the default behaviour.
> -
> -@item param
> -Algorithm-Specific parameter.
> -
> -Affects the curves of the bicubic algorithm.
> -
> -@item force_original_aspect_ratio
> -@item force_divisible_by
> -Work the same as the identical @ref{scale} filter options.
> -
> -@item reset_sar
> -Works the same as the identical @ref{scale} filter option.
> -
> -@end table
> -
> -@subsection Examples
> -
> -@itemize
> -@item
> -Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
> -@example
> -scale_cuda=-2:720:format=yuv420p
> -@end example
> -
> -@item
> -Upscale to 4K using nearest neighbour algorithm.
> -@example
> -scale_cuda=4096:2160:interp_algo=nearest
> -@end example
> -
> -@item
> -Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
> -This can be useful to deal with a filter and encode chain that otherwise exhausts the
> -decoders frame pool.
> -@example
> -scale_cuda=passthrough=0
> -@end example
> -@end itemize
> -
> -@anchor{scale_npp}
> -@section scale_npp
> -
> -Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
> -format conversion on CUDA video frames. Setting the output width and height
> -works in the same way as for the @var{scale} filter.
> -
> -The following additional options are accepted:
> -@table @option
> -@item format
> -The pixel format of the output CUDA frames. If set to the string "same" (the
> -default), the input format will be kept. Note that automatic format negotiation
> -and conversion is not yet supported for hardware frames
> -
> -@item interp_algo
> -The interpolation algorithm used for resizing. One of the following:
> -@table @option
> -@item nn
> -Nearest neighbour.
> -
> -@item linear
> -@item cubic
> -@item cubic2p_bspline
> -2-parameter cubic (B=1, C=0)
> -
> -@item cubic2p_catmullrom
> -2-parameter cubic (B=0, C=1/2)
> -
> -@item cubic2p_b05c03
> -2-parameter cubic (B=1/2, C=3/10)
> -
> -@item super
> -Supersampling
> -
> -@item lanczos
> -@end table
> -
> -@item force_original_aspect_ratio
> -Enable decreasing or increasing output video width or height if necessary to
> -keep the original aspect ratio. Possible values:
> -
> -@table @samp
> -@item disable
> -Scale the video as specified and disable this feature.
> -
> -@item decrease
> -The output video dimensions will automatically be decreased if needed.
> -
> -@item increase
> -The output video dimensions will automatically be increased if needed.
> -
> -@end table
> -
> -One useful instance of this option is that when you know a specific device's
> -maximum allowed resolution, you can use this to limit the output video to
> -that, while retaining the aspect ratio. For example, device A allows
> -1280x720 playback, and your video is 1920x800. Using this option (set it to
> -decrease) and specifying 1280x720 to the command line makes the output
> -1280x533.
> -
> -Please note that this is a different thing than specifying -1 for @option{w}
> -or @option{h}, you still need to specify the output resolution for this option
> -to work.
> -
> -@item force_divisible_by
> -Ensures that both the output dimensions, width and height, are divisible by the
> -given integer when used together with @option{force_original_aspect_ratio}. This
> -works similar to using @code{-n} in the @option{w} and @option{h} options.
> -
> -This option respects the value set for @option{force_original_aspect_ratio},
> -increasing or decreasing the resolution accordingly. The video's aspect ratio
> -may be slightly modified.
> -
> -This option can be handy if you need to have a video fit within or exceed
> -a defined resolution using @option{force_original_aspect_ratio} but also have
> -encoder restrictions on width or height divisibility.
> -
> -@item reset_sar
> -Works the same as the identical @ref{scale} filter option.
> -
> -@item eval
> -Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
> -
> -@table @samp
> -@item init
> -Only evaluate expressions once during the filter initialization or when a command is processed.
> -
> -@item frame
> -Evaluate expressions for each incoming frame.
> -
> -@end table
> -
> -@end table
> -
> -The values of the @option{w} and @option{h} options are expressions
> -containing the following constants:
> -
> -@table @var
> -@item in_w
> -@item in_h
> -The input width and height
> -
> -@item iw
> -@item ih
> -These are the same as @var{in_w} and @var{in_h}.
> -
> -@item out_w
> -@item out_h
> -The output (scaled) width and height
> -
> -@item ow
> -@item oh
> -These are the same as @var{out_w} and @var{out_h}
> -
> -@item a
> -The same as @var{iw} / @var{ih}
> -
> -@item sar
> -input sample aspect ratio
> -
> -@item dar
> -The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
> -
> -@item n
> -The (sequential) number of the input frame, starting from 0.
> -Only available with @code{eval=frame}.
> -
> -@item t
> -The presentation timestamp of the input frame, expressed as a number of
> -seconds. Only available with @code{eval=frame}.
> -
> -@item pos
> -The position (byte offset) of the frame in the input stream, or NaN if
> -this information is unavailable and/or meaningless (for example in case of synthetic video).
> -Only available with @code{eval=frame}.
> -Deprecated, do not use.
> -@end table
> -
> -@section scale2ref_npp
> -
> -Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
> -video, based on a reference video.
> -
> -See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
> -but uses the reference video instead of the main input as basis. scale2ref_npp
> -also supports the following additional constants for the @option{w} and
> -@option{h} options:
> -
> -@table @var
> -@item main_w
> -@item main_h
> -The main input video's width and height
> -
> -@item main_a
> -The same as @var{main_w} / @var{main_h}
> -
> -@item main_sar
> -The main input video's sample aspect ratio
> -
> -@item main_dar, mdar
> -The main input video's display aspect ratio. Calculated from
> -@code{(main_w / main_h) * main_sar}.
> -
> -@item main_n
> -The (sequential) number of the main input frame, starting from 0.
> -Only available with @code{eval=frame}.
> -
> -@item main_t
> -The presentation timestamp of the main input frame, expressed as a number of
> -seconds. Only available with @code{eval=frame}.
> -
> -@item main_pos
> -The position (byte offset) of the frame in the main input stream, or NaN if
> -this information is unavailable and/or meaningless (for example in case of synthetic video).
> -Only available with @code{eval=frame}.
> -@end table
> -
> -@subsection Examples
> -
> -@itemize
> -@item
> -Scale a subtitle stream (b) to match the main video (a) in size before overlaying
> -@example
> -'scale2ref_npp[b][a];[a][b]overlay_cuda'
> -@end example
> -
> -@item
> -Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
> -@example
> -[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
> -@end example
> -@end itemize
> -
>   @section scale_vt
>   
>   Scale and convert the color parameters using VTPixelTransferSession.
> @@ -22243,23 +21719,6 @@ Keep the same chroma location (default).
>   @end table
>   @end table
>   
> -@section sharpen_npp
> -Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
> -border control.
> -
> -The following additional options are accepted:
> -@table @option
> -
> -@item border_type
> -Type of sampling to be used ad frame borders. One of the following:
> -@table @option
> -
> -@item replicate
> -Replicate pixel values.
> -
> -@end table
> -@end table
> -
>   @section shear
>   Apply shear transform to input video.
>   
> @@ -24417,47 +23876,6 @@ The command above can also be specified as:
>   transpose=1:portrait
>   @end example
>   
> -@section transpose_npp
> -
> -Transpose rows with columns in the input video and optionally flip it.
> -For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
> -
> -It accepts the following parameters:
> -
> -@table @option
> -
> -@item dir
> -Specify the transposition direction.
> -
> -Can assume the following values:
> -@table @samp
> -@item cclock_flip
> -Rotate by 90 degrees counterclockwise and vertically flip. (default)
> -
> -@item clock
> -Rotate by 90 degrees clockwise.
> -
> -@item cclock
> -Rotate by 90 degrees counterclockwise.
> -
> -@item clock_flip
> -Rotate by 90 degrees clockwise and vertically flip.
> -@end table
> -
> -@item passthrough
> -Do not apply the transposition if the input geometry matches the one
> -specified by the specified value. It accepts the following values:
> -@table @samp
> -@item none
> -Always apply transposition. (default)
> -@item portrait
> -Preserve portrait geometry (when @var{height} >= @var{width}).
> -@item landscape
> -Preserve landscape geometry (when @var{width} >= @var{height}).
> -@end table
> -
> -@end table
> -
>   @section trim
>   Trim the input so that the output contains one continuous subpart of the input.
>   
> @@ -26644,64 +26062,6 @@ filter").
>   It accepts the following parameters:
>   
>   
> -@table @option
> -
> -@item mode
> -The interlacing mode to adopt. It accepts one of the following values:
> -
> -@table @option
> -@item 0, send_frame
> -Output one frame for each frame.
> -@item 1, send_field
> -Output one frame for each field.
> -@item 2, send_frame_nospatial
> -Like @code{send_frame}, but it skips the spatial interlacing check.
> -@item 3, send_field_nospatial
> -Like @code{send_field}, but it skips the spatial interlacing check.
> -@end table
> -
> -The default value is @code{send_frame}.
> -
> -@item parity
> -The picture field parity assumed for the input interlaced video. It accepts one
> -of the following values:
> -
> -@table @option
> -@item 0, tff
> -Assume the top field is first.
> -@item 1, bff
> -Assume the bottom field is first.
> -@item -1, auto
> -Enable automatic detection of field parity.
> -@end table
> -
> -The default value is @code{auto}.
> -If the interlacing is unknown or the decoder does not export this information,
> -top field first will be assumed.
> -
> -@item deint
> -Specify which frames to deinterlace. Accepts one of the following
> -values:
> -
> -@table @option
> -@item 0, all
> -Deinterlace all frames.
> -@item 1, interlaced
> -Only deinterlace frames marked as interlaced.
> -@end table
> -
> -The default value is @code{all}.
> -@end table
> -
> -@section yadif_cuda
> -
> -Deinterlace the input video using the @ref{yadif} algorithm, but implemented
> -in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> -and/or nvenc.
> -
> -It accepts the following parameters:
> -
> -
>   @table @option
>   
>   @item mode
> @@ -27172,6 +26532,719 @@ value.
>   
>   @c man end VIDEO FILTERS
>   
> +@chapter CUDA Video Filters
> +@c man begin CUDA Video Filters
> +
> +To enable CUDA and/or NPP filters please refer to configuration guidelines for @ref{CUDA} and for @ref{CUDA NPP} filters.
> +
> +Running CUDA filters requires you to initialize a hardware device and to pass that device to all filters in any filter graph.
> +@table @option
> +
> +@item -init_hw_device cuda[=@var{name}][:@var{device}[,@var{key=value}...]]
> +Initialise a new hardware device of type @var{cuda} called @var{name}, using the
> +given device parameters.
> +
> +@item -filter_hw_device @var{name}
> +Pass the hardware device called @var{name} to all filters in any filter graph.
> +
> +@end table
> +
> +For more detailed information see @url{https://www.ffmpeg.org/ffmpeg.html#Advanced-Video-options}
> +
> +@itemize
> +@item
> +Example of initializing second CUDA device on the system and running scale_cuda and bilateral_cuda filters.
> +@example
> +./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -init_hw_device cuda:1 -filter_complex \
> +"[0:v]scale_cuda=format=yuv444p[scaled_video];[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
> +-an -sn -c:v h264_nvenc -cq 20 out.mp4
> +@end example
> +@end itemize
> +
> +Since CUDA filters operate exclusively on GPU memory, frame data must sometimes be uploaded (@ref{hwupload}) to hardware surfaces associated with the appropriate CUDA device before processing, and downloaded (@ref{hwdownload}) back to normal memory afterward, if required. Whether @ref{hwupload} or @ref{hwdownload} is necessary depends on the specific workflow:
> +
> +@itemize
> +@item If the input frames are already in GPU memory (e.g., when using @code{-hwaccel cuda} or @code{-hwaccel_output_format cuda}), explicit use of @ref{hwupload} is not needed, as the data is already in the appropriate memory space.
> +@item If the input frames are in CPU memory (e.g., software-decoded frames or frames processed by CPU-based filters), it is necessary to use @ref{hwupload} to transfer the data to GPU memory for CUDA processing.
> +@item If the output of the CUDA filters needs to be further processed by software-based filters or saved in a format not supported by GPU-based encoders, @ref{hwdownload} is required to transfer the data back to CPU memory.
> +@end itemize
> +Note that @ref{hwupload} uploads data to a surface with the same layout as the software frame, so it may be necessary to add a @ref{format} filter immediately before @ref{hwupload} to ensure the input is in the correct format. Similarly, @ref{hwdownload} may not support all output formats, so an additional @ref{format} filter may need to be inserted immediately after @ref{hwdownload} in the filter graph to ensure compatibility.
> +
> +@anchor{CUDA}
> +@section CUDA
> +Below is a description of the currently available Nvidia CUDA video filters.
> +
> +Prerequisites:
> +@itemize
> +@item Install Nvidia CUDA Toolkit
> +@end itemize
> +
> +Note: If FFmpeg detects the Nvidia CUDA Toolkit during configuration, it will enable CUDA filters automatically without requiring any additional flags. If you want to explicitly enable them, use the following options:
> +
> +@itemize
> +@item Configure FFmpeg with @code{--enable-cuda-nvcc --enable-nonfree}.
> +@item Configure FFmpeg with @code{--enable-cuda-llvm}. Additional requirement: @code{llvm} lib must be installed.
> +@end itemize
> +
> +@subsection bilateral_cuda
> +CUDA accelerated bilateral filter, an edge preserving filter.
> +This filter is mathematically accurate thanks to the use of GPU acceleration.
> +For best output quality, use one to one chroma subsampling, i.e. yuv444p format.
> +
> +The filter accepts the following options:
> +@table @option
> +@item sigmaS
> +Set sigma of gaussian function to calculate spatial weight, also called sigma space.
> +Allowed range is 0.1 to 512. Default is 0.1.
> +
> +@item sigmaR
> +Set sigma of gaussian function to calculate color range weight, also called sigma color.
> +Allowed range is 0.1 to 512. Default is 0.1.
> +
> +@item window_size
> +Set window size of the bilateral function to determine the number of neighbours to loop on.
> +If the number entered is even, one will be added automatically.
> +Allowed range is 1 to 255. Default is 1.
> +@end table
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Apply the bilateral filter on a video.
> +
> +@example
> +./ffmpeg -v verbose \
> +-hwaccel cuda -hwaccel_output_format cuda -i input.mp4  \
> +-init_hw_device cuda \
> +-filter_complex \
> +" \
> +[0:v]scale_cuda=format=yuv444p[scaled_video];
> +[scaled_video]bilateral_cuda=window_size=9:sigmaS=3.0:sigmaR=50.0" \
> +-an -sn -c:v h264_nvenc -cq 20 out.mp4
> +@end example
> +
> +@end itemize
> +
> +@subsection bwdif_cuda
> +
> +Deinterlace the input video using the @ref{bwdif} algorithm, but implemented
> +in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> +and/or nvenc.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item mode
> +The interlacing mode to adopt. It accepts one of the following values:
> +
> +@table @option
> +@item 0, send_frame
> +Output one frame for each frame.
> +@item 1, send_field
> +Output one frame for each field.
> +@end table
> +
> +The default value is @code{send_field}.
> +
> +@item parity
> +The picture field parity assumed for the input interlaced video. It accepts one
> +of the following values:
> +
> +@table @option
> +@item 0, tff
> +Assume the top field is first.
> +@item 1, bff
> +Assume the bottom field is first.
> +@item -1, auto
> +Enable automatic detection of field parity.
> +@end table
> +
> +The default value is @code{auto}.
> +If the interlacing is unknown or the decoder does not export this information,
> +top field first will be assumed.
> +
> +@item deint
> +Specify which frames to deinterlace. Accepts one of the following
> +values:
> +
> +@table @option
> +@item 0, all
> +Deinterlace all frames.
> +@item 1, interlaced
> +Only deinterlace frames marked as interlaced.
> +@end table
> +
> +The default value is @code{all}.
> +@end table
> +
> +@subsection chromakey_cuda
> +CUDA accelerated YUV colorspace color/chroma keying.
> +
> +This filter works like normal chromakey filter but operates on CUDA frames.
> +for more details and parameters see @ref{chromakey}.
> +
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Make all the green pixels in the input video transparent and use it as an overlay for another video:
> +
> +@example
> +./ffmpeg \
> +    -hwaccel cuda -hwaccel_output_format cuda -i input_green.mp4  \
> +    -hwaccel cuda -hwaccel_output_format cuda -i base_video.mp4 \
> +    -init_hw_device cuda \
> +    -filter_complex \
> +    " \
> +        [0:v]chromakey_cuda=0x25302D:0.1:0.12:1[overlay_video]; \
> +        [1:v]scale_cuda=format=yuv420p[base]; \
> +        [base][overlay_video]overlay_cuda" \
> +    -an -sn -c:v h264_nvenc -cq 20 output.mp4
> +@end example
> +
> +@item
> +Process two software sources, explicitly uploading the frames:
> +
> +@example
> +./ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
> +    -f lavfi -i color=size=800x600:color=white,format=yuv420p \
> +    -f lavfi -i yuvtestsrc=size=200x200,format=yuv420p \
> +    -filter_complex \
> +    " \
> +        [0]hwupload[under]; \
> +        [1]hwupload,chromakey_cuda=green:0.1:0.12[over]; \
> +        [under][over]overlay_cuda" \
> +    -c:v hevc_nvenc -cq 18 -preset slow output.mp4
> +@end example
> +
> +@end itemize
> +
> +@subsection colorspace_cuda
> +
> +CUDA accelerated implementation of the colorspace filter.
> +
> +It is by no means feature complete compared to the software colorspace filter,
> +and at the current time only supports color range conversion between jpeg/full
> +and mpeg/limited range.
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item range
> +Specify output color range.
> +
> +The accepted values are:
> +@table @samp
> +@item tv
> +TV (restricted) range
> +
> +@item mpeg
> +MPEG (restricted) range
> +
> +@item pc
> +PC (full) range
> +
> +@item jpeg
> +JPEG (full) range
> +
> +@end table
> +
> +@end table
> +
> +@anchor{overlay_cuda}
> +@subsection overlay_cuda
> +
> +Overlay one video on top of another.
> +
> +This is the CUDA variant of the @ref{overlay} filter.
> +It only accepts CUDA frames. The underlying input pixel formats have to match.
> +
> +It takes two inputs and has one output. The first input is the "main"
> +video on which the second input is overlaid.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item x
> +@item y
> +Set expressions for the x and y coordinates of the overlaid video
> +on the main video.
> +
> +They can contain the following parameters:
> +
> +@table @option
> +
> +@item main_w, W
> +@item main_h, H
> +The main input width and height.
> +
> +@item overlay_w, w
> +@item overlay_h, h
> +The overlay input width and height.
> +
> +@item x
> +@item y
> +The computed values for @var{x} and @var{y}. They are evaluated for
> +each new frame.
> +
> +@item n
> +The ordinal index of the main input frame, starting from 0.
> +
> +@item pos
> +The byte offset position in the file of the main input frame, NAN if unknown.
> +Deprecated, do not use.
> +
> +@item t
> +The timestamp of the main input frame, expressed in seconds, NAN if unknown.
> +
> +@end table
> +
> +Default value is "0" for both expressions.
> +
> +@item eval
> +Set when the expressions for @option{x} and @option{y} are evaluated.
> +
> +It accepts the following values:
> +@table @option
> +@item init
> +Evaluate expressions once during filter initialization or
> +when a command is processed.
> +
> +@item frame
> +Evaluate expressions for each incoming frame
> +@end table
> +
> +Default value is @option{frame}.
> +
> +@item eof_action
> +See @ref{framesync}.
> +
> +@item shortest
> +See @ref{framesync}.
> +
> +@item repeatlast
> +See @ref{framesync}.
> +
> +@end table
> +
> +This filter also supports the @ref{framesync} options.
> +
> +@anchor{scale_cuda}
> +@subsection scale_cuda
> +
> +Scale (resize) and convert (pixel format) the input video, using accelerated CUDA kernels.
> +Setting the output width and height works in the same way as for the @ref{scale} filter.
> +
> +The filter accepts the following options:
> +@table @option
> +@item w
> +@item h
> +Set the output video dimension expression. Default value is the input dimension.
> +
> +Allows for the same expressions as the @ref{scale} filter.
> +
> +@item interp_algo
> +Sets the algorithm used for scaling:
> +
> +@table @var
> +@item nearest
> +Nearest neighbour
> +
> +Used by default if input parameters match the desired output.
> +
> +@item bilinear
> +Bilinear
> +
> +@item bicubic
> +Bicubic
> +
> +This is the default.
> +
> +@item lanczos
> +Lanczos
> +
> +@end table
> +
> +@item format
> +Controls the output pixel format. By default, or if none is specified, the input
> +pixel format is used.
> +
> +The filter does not support converting between YUV and RGB pixel formats.
> +
> +@item passthrough
> +If set to 0, every frame is processed, even if no conversion is necessary.
> +This mode can be useful to use the filter as a buffer for a downstream
> +frame-consumer that exhausts the limited decoder frame pool.
> +
> +If set to 1, frames are passed through as-is if they match the desired output
> +parameters. This is the default behaviour.
> +
> +@item param
> +Algorithm-Specific parameter.
> +
> +Affects the curves of the bicubic algorithm.
> +
> +@item force_original_aspect_ratio
> +@item force_divisible_by
> +Work the same as the identical @ref{scale} filter options.
> +
> +@item reset_sar
> +Works the same as the identical @ref{scale} filter option.
> +
> +@end table
> +
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Scale input to 720p, keeping aspect ratio and ensuring the output is yuv420p.
> +@example
> +scale_cuda=-2:720:format=yuv420p
> +@end example
> +
> +@item
> +Upscale to 4K using nearest neighbour algorithm.
> +@example
> +scale_cuda=4096:2160:interp_algo=nearest
> +@end example
> +
> +@item
> +Don't do any conversion or scaling, but copy all input frames into newly allocated ones.
> +This can be useful to deal with a filter and encode chain that otherwise exhausts the
> +decoders frame pool.
> +@example
> +scale_cuda=passthrough=0
> +@end example
> +@end itemize
> +
> +@subsection yadif_cuda
> +
> +Deinterlace the input video using the @ref{yadif} algorithm, but implemented
> +in CUDA so that it can work as part of a GPU accelerated pipeline with nvdec
> +and/or nvenc.
> +
> +It accepts the following parameters:
> +
> +
> +@table @option
> +
> +@item mode
> +The interlacing mode to adopt. It accepts one of the following values:
> +
> +@table @option
> +@item 0, send_frame
> +Output one frame for each frame.
> +@item 1, send_field
> +Output one frame for each field.
> +@item 2, send_frame_nospatial
> +Like @code{send_frame}, but it skips the spatial interlacing check.
> +@item 3, send_field_nospatial
> +Like @code{send_field}, but it skips the spatial interlacing check.
> +@end table
> +
> +The default value is @code{send_frame}.
> +
> +@item parity
> +The picture field parity assumed for the input interlaced video. It accepts one
> +of the following values:
> +
> +@table @option
> +@item 0, tff
> +Assume the top field is first.
> +@item 1, bff
> +Assume the bottom field is first.
> +@item -1, auto
> +Enable automatic detection of field parity.
> +@end table
> +
> +The default value is @code{auto}.
> +If the interlacing is unknown or the decoder does not export this information,
> +top field first will be assumed.
> +
> +@item deint
> +Specify which frames to deinterlace. Accepts one of the following
> +values:
> +
> +@table @option
> +@item 0, all
> +Deinterlace all frames.
> +@item 1, interlaced
> +Only deinterlace frames marked as interlaced.
> +@end table
> +
> +The default value is @code{all}.
> +@end table
> +
> +@anchor{CUDA NPP}
> +@section CUDA NPP
> +Below is a description of the currently available NVIDIA Performance Primitives (libnpp) video filters.
> +
> +Prerequisites:
> +@itemize
> +@item Install Nvidia CUDA Toolkit
> +@item Install libnpp
> +@end itemize
> +
> +To enable CUDA NPP filters:
> +
> +@itemize
> +@item Configure FFmpeg with @code{--enable-nonfree --enable-libnpp}.
> +@end itemize
> +
> +
> +@anchor{scale_npp}
> +@subsection scale_npp
> +
> +Use the NVIDIA Performance Primitives (libnpp) to perform scaling and/or pixel
> +format conversion on CUDA video frames. Setting the output width and height
> +works in the same way as for the @var{scale} filter.
> +
> +The following additional options are accepted:
> +@table @option
> +@item format
> +The pixel format of the output CUDA frames. If set to the string "same" (the
> +default), the input format will be kept. Note that automatic format negotiation
> +and conversion is not yet supported for hardware frames
> +
> +@item interp_algo
> +The interpolation algorithm used for resizing. One of the following:
> +@table @option
> +@item nn
> +Nearest neighbour.
> +
> +@item linear
> +@item cubic
> +@item cubic2p_bspline
> +2-parameter cubic (B=1, C=0)
> +
> +@item cubic2p_catmullrom
> +2-parameter cubic (B=0, C=1/2)
> +
> +@item cubic2p_b05c03
> +2-parameter cubic (B=1/2, C=3/10)
> +
> +@item super
> +Supersampling
> +
> +@item lanczos
> +@end table
> +
> +@item force_original_aspect_ratio
> +Enable decreasing or increasing output video width or height if necessary to
> +keep the original aspect ratio. Possible values:
> +
> +@table @samp
> +@item disable
> +Scale the video as specified and disable this feature.
> +
> +@item decrease
> +The output video dimensions will automatically be decreased if needed.
> +
> +@item increase
> +The output video dimensions will automatically be increased if needed.
> +
> +@end table
> +
> +One useful instance of this option is that when you know a specific device's
> +maximum allowed resolution, you can use this to limit the output video to
> +that, while retaining the aspect ratio. For example, device A allows
> +1280x720 playback, and your video is 1920x800. Using this option (set it to
> +decrease) and specifying 1280x720 to the command line makes the output
> +1280x533.
> +
> +Please note that this is a different thing than specifying -1 for @option{w}
> +or @option{h}, you still need to specify the output resolution for this option
> +to work.
> +
> +@item force_divisible_by
> +Ensures that both the output dimensions, width and height, are divisible by the
> +given integer when used together with @option{force_original_aspect_ratio}. This
> +works similar to using @code{-n} in the @option{w} and @option{h} options.
> +
> +This option respects the value set for @option{force_original_aspect_ratio},
> +increasing or decreasing the resolution accordingly. The video's aspect ratio
> +may be slightly modified.
> +
> +This option can be handy if you need to have a video fit within or exceed
> +a defined resolution using @option{force_original_aspect_ratio} but also have
> +encoder restrictions on width or height divisibility.
> +
> +@item reset_sar
> +Works the same as the identical @ref{scale} filter option.
> +
> +@item eval
> +Specify when to evaluate @var{width} and @var{height} expression. It accepts the following values:
> +
> +@table @samp
> +@item init
> +Only evaluate expressions once during the filter initialization or when a command is processed.
> +
> +@item frame
> +Evaluate expressions for each incoming frame.
> +
> +@end table
> +
> +@end table
> +
> +The values of the @option{w} and @option{h} options are expressions
> +containing the following constants:
> +
> +@table @var
> +@item in_w
> +@item in_h
> +The input width and height
> +
> +@item iw
> +@item ih
> +These are the same as @var{in_w} and @var{in_h}.
> +
> +@item out_w
> +@item out_h
> +The output (scaled) width and height
> +
> +@item ow
> +@item oh
> +These are the same as @var{out_w} and @var{out_h}
> +
> +@item a
> +The same as @var{iw} / @var{ih}
> +
> +@item sar
> +input sample aspect ratio
> +
> +@item dar
> +The input display aspect ratio. Calculated from @code{(iw / ih) * sar}.
> +
> +@item n
> +The (sequential) number of the input frame, starting from 0.
> +Only available with @code{eval=frame}.
> +
> +@item t
> +The presentation timestamp of the input frame, expressed as a number of
> +seconds. Only available with @code{eval=frame}.
> +
> +@item pos
> +The position (byte offset) of the frame in the input stream, or NaN if
> +this information is unavailable and/or meaningless (for example in case of synthetic video).
> +Only available with @code{eval=frame}.
> +Deprecated, do not use.
> +@end table
> +
> +@subsection scale2ref_npp
> +
> +Use the NVIDIA Performance Primitives (libnpp) to scale (resize) the input
> +video, based on a reference video.
> +
> +See the @ref{scale_npp} filter for available options, scale2ref_npp supports the same
> +but uses the reference video instead of the main input as basis. scale2ref_npp
> +also supports the following additional constants for the @option{w} and
> +@option{h} options:
> +
> +@table @var
> +@item main_w
> +@item main_h
> +The main input video's width and height
> +
> +@item main_a
> +The same as @var{main_w} / @var{main_h}
> +
> +@item main_sar
> +The main input video's sample aspect ratio
> +
> +@item main_dar, mdar
> +The main input video's display aspect ratio. Calculated from
> +@code{(main_w / main_h) * main_sar}.
> +
> +@item main_n
> +The (sequential) number of the main input frame, starting from 0.
> +Only available with @code{eval=frame}.
> +
> +@item main_t
> +The presentation timestamp of the main input frame, expressed as a number of
> +seconds. Only available with @code{eval=frame}.
> +
> +@item main_pos
> +The position (byte offset) of the frame in the main input stream, or NaN if
> +this information is unavailable and/or meaningless (for example in case of synthetic video).
> +Only available with @code{eval=frame}.
> +@end table
> +
> +@subsubsection Examples
> +
> +@itemize
> +@item
> +Scale a subtitle stream (b) to match the main video (a) in size before overlaying
> +@example
> +'scale2ref_npp[b][a];[a][b]overlay_cuda'
> +@end example
> +
> +@item
> +Scale a logo to 1/10th the height of a video, while preserving its display aspect ratio.
> +@example
> +[logo-in][video-in]scale2ref_npp=w=oh*mdar:h=ih/10[logo-out][video-out]
> +@end example
> +@end itemize
> +
> +@subsection sharpen_npp
> +Use the NVIDIA Performance Primitives (libnpp) to perform image sharpening with
> +border control.
> +
> +The following additional options are accepted:
> +@table @option
> +
> +@item border_type
> +Type of sampling to be used ad frame borders. One of the following:
> +@table @option
> +
> +@item replicate
> +Replicate pixel values.
> +
> +@end table
> +@end table
> +
> +@subsection transpose_npp
> +
> +Transpose rows with columns in the input video and optionally flip it.
> +For more in depth examples see the @ref{transpose} video filter, which shares mostly the same options.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +
> +@item dir
> +Specify the transposition direction.
> +
> +Can assume the following values:
> +@table @samp
> +@item cclock_flip
> +Rotate by 90 degrees counterclockwise and vertically flip. (default)
> +
> +@item clock
> +Rotate by 90 degrees clockwise.
> +
> +@item cclock
> +Rotate by 90 degrees counterclockwise.
> +
> +@item clock_flip
> +Rotate by 90 degrees clockwise and vertically flip.
> +@end table
> +
> +@item passthrough
> +Do not apply the transposition if the input geometry matches the one
> +specified by the specified value. It accepts the following values:
> +@table @samp
> +@item none
> +Always apply transposition. (default)
> +@item portrait
> +Preserve portrait geometry (when @var{height} >= @var{width}).
> +@item landscape
> +Preserve landscape geometry (when @var{width} >= @var{height}).
> +@end table
> +
> +@end table
> +
> +@c man end CUDA Video Filters
> +
>   @chapter OpenCL Video Filters
>   @c man begin OPENCL VIDEO FILTERS
>   

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters.
  2025-03-17  5:55                     ` Gyan Doshi
@ 2025-03-17  7:14                       ` Gyan Doshi
  0 siblings, 0 replies; 12+ messages in thread
From: Gyan Doshi @ 2025-03-17  7:14 UTC (permalink / raw)
  To: ffmpeg-devel



On 2025-03-17 11:25 am, Gyan Doshi wrote:
>
>
> On 2025-03-17 12:45 am, Danil Iashchenko wrote:
>> Hi Gyan and Michael,
>> Thank you for reviewing the patch and providing feedback!
>> I've addressed all the issues and resubmitting the patch (built and 
>> tested with Texinfo 7.1.1).
>>
>> Per Gyan's suggestion, I'm resubmitting since Patchwork was down when 
>> I originally sent it on the 10th.
>>
>> Please let me know if there's anything else I can clarify or improve.
>> Thanks again!
>
> Generally, looks fine. I've a couple of minor gripes but I'll adjust 
> the commit msg and apply this.
> We can then address the minor points.

Pushed as a1c6ca1683708978c24ed8a632bb29fafc9dacdf

Regards,
Gyan

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-03-17  7:14 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20250116004751.56789-1-danyaschenko@gmail.com>
2025-01-26 10:55 ` [FFmpeg-devel] [PATCH 1/1] [doc/filters] add nvidia cuda and cuda npp sections Gyan Doshi
2025-01-28 19:12   ` [FFmpeg-devel] [PATCH 1/2] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters Danil Iashchenko
2025-01-28 19:12     ` [FFmpeg-devel] [PATCH 2/2] doc/filters: Remove redundant *_cuda and *_npp filters since they are already in CUDA Video Filters section Danil Iashchenko
2025-02-02  6:58       ` Gyan Doshi
2025-02-02 13:23         ` [FFmpeg-devel] [PATCH] doc/filters: Add CUDA Video Filters section for CUDA-based and CUDA+NPP based filters Danil Iashchenko
2025-02-04  5:40           ` Gyan Doshi
2025-03-02 11:31             ` Danil Iashchenko
2025-03-03 22:27               ` Michael Niedermayer
2025-03-10 10:38                 ` Danil Iashchenko
2025-03-16 19:15                   ` Danil Iashchenko
2025-03-17  5:55                     ` Gyan Doshi
2025-03-17  7:14                       ` Gyan Doshi

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git