* [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel @ 2022-07-11 8:53 Thilo Borgmann 2022-07-11 8:54 ` [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges Thilo Borgmann 2022-07-16 21:07 ` [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel Thilo Borgmann 0 siblings, 2 replies; 10+ messages in thread From: Thilo Borgmann @ 2022-07-11 8:53 UTC (permalink / raw) To: FFmpeg development discussions and patches [-- Attachment #1: Type: text/plain, Size: 112 bytes --] Hi, 1/2 adds 16 bit versions of ff_gaussian_blur and ff_sobel. 2/2 adds new mode to cropdetect. Thanks, Thilo [-- Attachment #2: v2-0001-lavfi-edge_common-Add-16bit-versions-of-gaussian_.patch --] [-- Type: text/plain, Size: 13032 bytes --] From fc8c179e2de4dee3d32d2e02684f3e3215af63c6 Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Sun, 10 Jul 2022 12:40:27 +0200 Subject: [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel --- libavfilter/edge_common.c | 134 ++++++++++++++++++++++++++++-------- libavfilter/edge_common.h | 14 +++- libavfilter/vf_blurdetect.c | 4 +- libavfilter/vf_edgedetect.c | 4 +- 4 files changed, 121 insertions(+), 35 deletions(-) diff --git a/libavfilter/edge_common.c b/libavfilter/edge_common.c index d72e8521cd..f0bf273b84 100644 --- a/libavfilter/edge_common.c +++ b/libavfilter/edge_common.c @@ -50,7 +50,7 @@ static int get_rounded_direction(int gx, int gy) void ff_sobel(int w, int h, uint16_t *dst, int dst_linesize, int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize) + const uint8_t *src, int src_linesize, int src_stride) { int i, j; @@ -60,13 +60,43 @@ void ff_sobel(int w, int h, src += src_linesize; for (i = 1; i < w - 1; i++) { const int gx = - -1*src[-src_linesize + i-1] + 1*src[-src_linesize + i+1] - -2*src[ i-1] + 2*src[ i+1] - -1*src[ src_linesize + i-1] + 1*src[ src_linesize + i+1]; + -1*src[-src_linesize + (i-1)*src_stride] + 1*src[-src_linesize + (i+1)*src_stride] + -2*src[ (i-1)*src_stride] + 2*src[ (i+1)*src_stride] + -1*src[ src_linesize + (i-1)*src_stride] + 1*src[ src_linesize + (i+1)*src_stride]; const int gy = - -1*src[-src_linesize + i-1] + 1*src[ src_linesize + i-1] - -2*src[-src_linesize + i ] + 2*src[ src_linesize + i ] - -1*src[-src_linesize + i+1] + 1*src[ src_linesize + i+1]; + -1*src[-src_linesize + (i-1)*src_stride] + 1*src[ src_linesize + (i-1)*src_stride] + -2*src[-src_linesize + (i )*src_stride] + 2*src[ src_linesize + (i )*src_stride] + -1*src[-src_linesize + (i+1)*src_stride] + 1*src[ src_linesize + (i+1)*src_stride]; + + dst[i] = FFABS(gx) + FFABS(gy); + dir[i] = get_rounded_direction(gx, gy); + } + } +} + +void ff_sobel16(int w, int h, + uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + int i, j; + uint16_t *src16 = (uint16_t *)src; + int src16_stride = src_stride / 2; + int src16_linesize = src_linesize / 2; + + for (j = 1; j < h - 1; j++) { + dst += dst_linesize; + dir += dir_linesize; + src16 += src16_linesize; + for (i = 1; i < w - 1; i++) { + const int gx = + -1*src16[-src16_linesize + (i-1)*src16_stride] + 1*src16[-src16_linesize + (i+1)*src16_stride] + -2*src16[ (i-1)*src16_stride] + 2*src16[ (i+1)*src16_stride] + -1*src16[ src16_linesize + (i-1)*src16_stride] + 1*src16[ src16_linesize + (i+1)*src16_stride]; + const int gy = + -1*src16[-src16_linesize + 
(i-1)*src16_stride] + 1*src16[ src16_linesize + (i-1)*src16_stride] + -2*src16[-src16_linesize + (i )*src16_stride] + 2*src16[ src16_linesize + (i )*src16_stride] + -1*src16[-src16_linesize + (i+1)*src16_stride] + 1*src16[ src16_linesize + (i+1)*src16_stride]; dst[i] = FFABS(gx) + FFABS(gy); dir[i] = get_rounded_direction(gx, gy); @@ -141,37 +171,37 @@ void ff_double_threshold(int low, int high, int w, int h, // Applies gaussian blur, using 5x5 kernels, sigma = 1.4 void ff_gaussian_blur(int w, int h, uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize) + const uint8_t *src, int src_linesize, int src_stride) { int i, j; memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; for (j = 2; j < h - 2; j++) { - dst[0] = src[0]; - dst[1] = src[1]; + dst[0] = src[(0)*src_stride]; + dst[1] = src[(1)*src_stride]; for (i = 2; i < w - 2; i++) { /* Gaussian mask of size 5x5 with sigma = 1.4 */ - dst[i] = ((src[-2*src_linesize + i-2] + src[2*src_linesize + i-2]) * 2 - + (src[-2*src_linesize + i-1] + src[2*src_linesize + i-1]) * 4 - + (src[-2*src_linesize + i ] + src[2*src_linesize + i ]) * 5 - + (src[-2*src_linesize + i+1] + src[2*src_linesize + i+1]) * 4 - + (src[-2*src_linesize + i+2] + src[2*src_linesize + i+2]) * 2 - - + (src[ -src_linesize + i-2] + src[ src_linesize + i-2]) * 4 - + (src[ -src_linesize + i-1] + src[ src_linesize + i-1]) * 9 - + (src[ -src_linesize + i ] + src[ src_linesize + i ]) * 12 - + (src[ -src_linesize + i+1] + src[ src_linesize + i+1]) * 9 - + (src[ -src_linesize + i+2] + src[ src_linesize + i+2]) * 4 - - + src[i-2] * 5 - + src[i-1] * 12 - + src[i ] * 15 - + src[i+1] * 12 - + src[i+2] * 5) / 159; + dst[i] = ((src[-2*src_linesize + (i-2)*src_stride] + src[2*src_linesize + (i-2)*src_stride]) * 2 + + (src[-2*src_linesize + (i-1)*src_stride] + src[2*src_linesize + (i-1)*src_stride]) * 4 + + (src[-2*src_linesize + (i )*src_stride] + src[2*src_linesize + (i )*src_stride]) * 5 + + (src[-2*src_linesize + (i+1)*src_stride] + src[2*src_linesize + (i+1)*src_stride]) * 4 + + (src[-2*src_linesize + (i+2)*src_stride] + src[2*src_linesize + (i+2)*src_stride]) * 2 + + + (src[ -src_linesize + (i-2)*src_stride] + src[ src_linesize + (i-2)*src_stride]) * 4 + + (src[ -src_linesize + (i-1)*src_stride] + src[ src_linesize + (i-1)*src_stride]) * 9 + + (src[ -src_linesize + (i )*src_stride] + src[ src_linesize + (i )*src_stride]) * 12 + + (src[ -src_linesize + (i+1)*src_stride] + src[ src_linesize + (i+1)*src_stride]) * 9 + + (src[ -src_linesize + (i+2)*src_stride] + src[ src_linesize + (i+2)*src_stride]) * 4 + + + src[(i-2)*src_stride] * 5 + + src[(i-1)*src_stride] * 12 + + src[(i )*src_stride] * 15 + + src[(i+1)*src_stride] * 12 + + src[(i+2)*src_stride] * 5) / 159; } - dst[i ] = src[i ]; - dst[i + 1] = src[i + 1]; + dst[i ] = src[(i )*src_stride]; + dst[i + 1] = src[(i + 1)*src_stride]; dst += dst_linesize; src += src_linesize; @@ -179,3 +209,49 @@ void ff_gaussian_blur(int w, int h, memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; memcpy(dst, src, w); } + +void ff_gaussian_blur16(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + int i, j; + uint16_t *src16 = (uint16_t *)src; + uint16_t *dst16 = (uint16_t *)dst; + int src16_stride = src_stride / 2; + int src16_linesize = src_linesize / 2; + int dst16_linesize = dst_linesize / 2; + + memcpy(dst16, src16, w*2); dst16 += dst16_linesize; src16 += src16_linesize; + memcpy(dst16, src16, w*2); 
dst16 += dst16_linesize; src16 += src16_linesize; + for (j = 2; j < h - 2; j++) { + dst16[0] = src16[(0)*src16_stride]; + dst16[1] = src16[(1)*src16_stride]; + for (i = 2; i < w - 2; i++) { + /* Gaussian mask of size 5x5 with sigma = 1.4 */ + dst16[i] = ((src16[-2*src16_linesize + (i-2)*src16_stride] + src16[2*src16_linesize + (i-2)*src16_stride]) * 2 + + (src16[-2*src16_linesize + (i-1)*src16_stride] + src16[2*src16_linesize + (i-1)*src16_stride]) * 4 + + (src16[-2*src16_linesize + (i )*src16_stride] + src16[2*src16_linesize + (i )*src16_stride]) * 5 + + (src16[-2*src16_linesize + (i+1)*src16_stride] + src16[2*src16_linesize + (i+1)*src16_stride]) * 4 + + (src16[-2*src16_linesize + (i+2)*src16_stride] + src16[2*src16_linesize + (i+2)*src16_stride]) * 2 + + + (src16[ -src16_linesize + (i-2)*src16_stride] + src16[ src16_linesize + (i-2)*src16_stride]) * 4 + + (src16[ -src16_linesize + (i-1)*src16_stride] + src16[ src16_linesize + (i-1)*src16_stride]) * 9 + + (src16[ -src16_linesize + (i )*src16_stride] + src16[ src16_linesize + (i )*src16_stride]) * 12 + + (src16[ -src16_linesize + (i+1)*src16_stride] + src16[ src16_linesize + (i+1)*src16_stride]) * 9 + + (src16[ -src16_linesize + (i+2)*src16_stride] + src16[ src16_linesize + (i+2)*src16_stride]) * 4 + + + src16[(i-2)*src16_stride] * 5 + + src16[(i-1)*src16_stride] * 12 + + src16[(i )*src16_stride] * 15 + + src16[(i+1)*src16_stride] * 12 + + src16[(i+2)*src16_stride] * 5) / 159; + } + dst16[i ] = src16[(i )*src16_stride]; + dst16[i + 1] = src16[(i + 1)*src16_stride]; + + dst16 += dst16_linesize; + src16 += src16_linesize; + } + memcpy(dst16, src16, w*2); dst16 += dst16_linesize; src16 += src16_linesize; + memcpy(dst16, src16, w*2); +} diff --git a/libavfilter/edge_common.h b/libavfilter/edge_common.h index 87c143f2b8..310d92a388 100644 --- a/libavfilter/edge_common.h +++ b/libavfilter/edge_common.h @@ -51,7 +51,13 @@ enum AVRoundedDirection { void ff_sobel(int w, int h, uint16_t *dst, int dst_linesize, int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize); + const uint8_t *src, int src_linesize, int src_stride); + +void ff_sobel16(int w, int h, + uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride); + /** * Filters rounded gradients to drop all non-maxima pixels in the magnitude image @@ -102,6 +108,10 @@ void ff_double_threshold(int low, int high, int w, int h, */ void ff_gaussian_blur(int w, int h, uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize); + const uint8_t *src, int src_linesize, int src_stride); + +void ff_gaussian_blur16(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride); #endif diff --git a/libavfilter/vf_blurdetect.c b/libavfilter/vf_blurdetect.c index 0e08ba96de..ed4fb29b31 100644 --- a/libavfilter/vf_blurdetect.c +++ b/libavfilter/vf_blurdetect.c @@ -285,10 +285,10 @@ static int blurdetect_filter_frame(AVFilterLink *inlink, AVFrame *in) // gaussian filter to reduce noise ff_gaussian_blur(w, h, filterbuf, w, - in->data[plane], in->linesize[plane]); + in->data[plane], in->linesize[plane], 1); // compute the 16-bits gradients and directions for the next step - ff_sobel(w, h, gradients, w, directions, w, filterbuf, w); + ff_sobel(w, h, gradients, w, directions, w, filterbuf, w, 1); // non_maximum_suppression() will actually keep & clip what's necessary and // ignore the rest, so we need a clean output buffer diff --git a/libavfilter/vf_edgedetect.c 
b/libavfilter/vf_edgedetect.c
index 90390ceb3e..10397fb8dc 100644
--- a/libavfilter/vf_edgedetect.c
+++ b/libavfilter/vf_edgedetect.c
@@ -193,13 +193,13 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
         /* gaussian filter to reduce noise */
         ff_gaussian_blur(width, height,
                          tmpbuf,      width,
-                         in->data[p], in->linesize[p]);
+                         in->data[p], in->linesize[p], 1);
 
         /* compute the 16-bits gradients and directions for the next step */
         ff_sobel(width, height,
                  gradients, width,
                  directions,width,
-                 tmpbuf,    width);
+                 tmpbuf,    width, 1);
 
         /* non_maximum_suppression() will actually keep & clip what's necessary and
          * ignore the rest, so we need a clean output buffer */
-- 
2.20.1 (Apple Git-117)
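For readers skimming the patch: the new src_stride argument is given in bytes, like src_linesize, and the 16-bit variants convert both to 16-bit sample units internally; the existing 8-bit callers in vf_blurdetect.c and vf_edgedetect.c simply pass 1. The sketch below shows how a caller inside the FFmpeg tree could pick between the two variants at runtime, mirroring what patch 2/2 does in vf_cropdetect.c. Function names follow this v2 revision (ff_gaussian_blur16, ff_sobel16); later revisions in this thread rename them to ff_gaussian_blur_8/_16 and ff_sobel_8/_16. The edge_map() helper and its parameter names are illustrative only, not part of the patch.

/* Illustrative sketch, not part of the patch. Assumes the FFmpeg source
 * tree with patch 1/2 applied and the source root on the include path. */
#include <stdint.h>
#include "libavfilter/edge_common.h"

typedef void (*gaussian_fn)(int w, int h,
                            uint8_t *dst, int dst_linesize,
                            const uint8_t *src, int src_linesize, int src_stride);
typedef void (*sobel_fn)(int w, int h,
                         uint16_t *dst, int dst_linesize,
                         int8_t *dir, int dir_linesize,
                         const uint8_t *src, int src_linesize, int src_stride);

void edge_map(const uint8_t *luma, int linesize, int w, int h,
              int bpp, /* pixel step of the first plane: 1 or 2 bytes */
              uint8_t *blurred, uint16_t *gradients, int8_t *directions)
{
    /* linesize and src_stride are in bytes; the *16 variants divide
     * them by two internally to get 16-bit sample units. */
    gaussian_fn blur  = (bpp == 2) ? ff_gaussian_blur16 : ff_gaussian_blur;
    sobel_fn    sobel = (bpp == 2) ? ff_sobel16         : ff_sobel;

    blur(w, h, blurred, w * bpp, luma, linesize, bpp);
    sobel(w, h, gradients, w, directions, w, blurred, w * bpp, bpp);
}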
* [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges
  2022-07-11  8:53 [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel Thilo Borgmann
@ 2022-07-11  8:54 ` Thilo Borgmann
  2022-07-16 21:09   ` Thilo Borgmann
  0 siblings, 1 reply; 10+ messages in thread
From: Thilo Borgmann @ 2022-07-11  8:54 UTC (permalink / raw)
To: ffmpeg-devel

[-- Attachment #1: Type: text/plain, Size: 16 bytes --]

$subject

-Thilo

[-- Attachment #2: v2-0002-lavfi-cropdetect-Add-new-mode-to-detect-crop-area.patch --]
[-- Type: text/plain, Size: 21009 bytes --]

From ccc2b5ab29c4ca00c0a59af318fc865d37832377 Mon Sep 17 00:00:00 2001
From: Thilo Borgmann <thilo.borgmann@mail.de>
Date: Mon, 11 Jul 2022 10:48:53 +0200
Subject: [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area
 based on motion vectors and edges

This filter allows crop detection even if the video is embedded in
non-black areas.
---
 doc/filters.texi                           |  42 +++-
 libavfilter/vf_cropdetect.c                | 211 ++++++++++++++++++++-
 tests/fate/filter-video.mak                |   8 +-
 tests/ref/fate/filter-metadata-cropdetect1 |   9 +
 tests/ref/fate/filter-metadata-cropdetect2 |   9 +
 5 files changed, 276 insertions(+), 3 deletions(-)
 create mode 100644 tests/ref/fate/filter-metadata-cropdetect1
 create mode 100644 tests/ref/fate/filter-metadata-cropdetect2

diff --git a/doc/filters.texi b/doc/filters.texi
index d65e83d4d0..bd2d2429d7 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -10075,7 +10075,7 @@ Auto-detect the crop size.
 
 It calculates the necessary cropping parameters and prints the
 recommended parameters via the logging system. The detected dimensions
-correspond to the non-black area of the input video.
+correspond to the non-black or video area of the input video according to @var{mode}.
 
 It accepts the following parameters:
 
@@ -10106,8 +10106,48 @@ detect the current optimal crop area. Default value is 0.
 
 This can be useful when channel logos distort the video area. 0
 indicates 'never reset', and returns the largest area encountered during
 playback.
+
+@item mv_threshold
+Set motion in pixel units as threshold for motion detection. It defaults to 8.
+
+@item low
+@item high
+Set low and high threshold values used by the Canny thresholding
+algorithm.
+
+The high threshold selects the "strong" edge pixels, which are then
+connected through 8-connectivity with the "weak" edge pixels selected
+by the low threshold.
+
+@var{low} and @var{high} threshold values must be chosen in the range
+[0,1], and @var{low} should be lesser or equal to @var{high}.
+
+Default value for @var{low} is @code{5/255}, and default value for @var{high}
+is @code{15/255}.
@end table +@subsection Examples + +@itemize +@item +Find video area surrounded by black borders: +@example +ffmpeg -i file.mp4 -vf cropdetect,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, generate motion vectors beforehand: +@example +ffmpeg -i file.mp4 -vf mestimate,cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, use motion vectors from decoder: +@example +ffmpeg -flags2 +export_mvs -i file.mp4 -vf cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example +@end itemize + @anchor{cue} @section cue diff --git a/libavfilter/vf_cropdetect.c b/libavfilter/vf_cropdetect.c index b887b9ecb1..68313064bd 100644 --- a/libavfilter/vf_cropdetect.c +++ b/libavfilter/vf_cropdetect.c @@ -26,11 +26,14 @@ #include "libavutil/imgutils.h" #include "libavutil/internal.h" #include "libavutil/opt.h" +#include "libavutil/motion_vector.h" +#include "libavutil/qsort.h" #include "avfilter.h" #include "formats.h" #include "internal.h" #include "video.h" +#include "edge_common.h" typedef struct CropDetectContext { const AVClass *class; @@ -42,6 +45,16 @@ typedef struct CropDetectContext { int frame_nb; int max_pixsteps[4]; int max_outliers; + int mode; + int window_size; + int mv_threshold; + float low, high; + uint8_t low_u8, high_u8; + uint8_t *filterbuf; + uint8_t *tmpbuf; + uint16_t *gradients; + char *directions; + int *bboxes[4]; } CropDetectContext; static const enum AVPixelFormat pix_fmts[] = { @@ -61,6 +74,17 @@ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_NONE }; +enum CropMode { + MODE_BLACK, + MODE_MV_EDGES, + MODE_NB +}; + +static int comp(const int *a,const int *b) +{ + return FFDIFFSIGN(*a, *b); +} + static int checkline(void *ctx, const unsigned char *src, int stride, int len, int bpp) { int total = 0; @@ -116,11 +140,43 @@ static int checkline(void *ctx, const unsigned char *src, int stride, int len, i return total; } +static int checkline_edge(void *ctx, const unsigned char *src, int stride, int len, int bpp) +{ + const uint16_t *src16 = (const uint16_t *)src; + + switch (bpp) { + case 1: + while (--len >= 0) { + if(src[0]) return 0; + src += stride; + } + break; + case 2: + stride >>= 1; + while (--len >= 0) { + if(src16[0]) return 0; + src16 += stride; + } + break; + case 3: + case 4: + while (--len >= 0) { + if(src[0] || src[1] || src[2]) return 0; + src += stride; + } + break; + } + + return 1; +} + static av_cold int init(AVFilterContext *ctx) { CropDetectContext *s = ctx->priv; s->frame_nb = -1 * s->skip; + s->low_u8 = s->low * 255. + .5; + s->high_u8 = s->high * 255. 
+ .5; av_log(ctx, AV_LOG_VERBOSE, "limit:%f round:%d skip:%d reset_count:%d\n", s->limit, s->round, s->skip, s->reset_count); @@ -128,11 +184,27 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold void uninit(AVFilterContext *ctx) +{ + CropDetectContext *s = ctx->priv; + + av_freep(&s->tmpbuf); + av_freep(&s->filterbuf); + av_freep(&s->gradients); + av_freep(&s->directions); + av_freep(&s->bboxes[0]); + av_freep(&s->bboxes[1]); + av_freep(&s->bboxes[2]); + av_freep(&s->bboxes[3]); +} + static int config_input(AVFilterLink *inlink) { AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format); + const int bufsize = inlink->w * inlink->h; + int bpp; av_image_fill_max_pixsteps(s->max_pixsteps, NULL, desc); @@ -144,6 +216,21 @@ static int config_input(AVFilterLink *inlink) s->x2 = 0; s->y2 = 0; + bpp = s->max_pixsteps[0]; + s->window_size = FFMAX(s->reset_count, 15); + s->tmpbuf = av_malloc(bufsize); + s->filterbuf = av_malloc(bufsize * s->max_pixsteps[0]); + s->gradients = av_calloc(bufsize, sizeof(*s->gradients)); + s->directions = av_malloc(bufsize); + s->bboxes[0] = av_malloc(s->window_size * sizeof(*s->bboxes[0])); + s->bboxes[1] = av_malloc(s->window_size * sizeof(*s->bboxes[1])); + s->bboxes[2] = av_malloc(s->window_size * sizeof(*s->bboxes[2])); + s->bboxes[3] = av_malloc(s->window_size * sizeof(*s->bboxes[3])); + + if (!s->tmpbuf || !s->filterbuf || !s->gradients || !s->directions || + !s->bboxes[0] || !s->bboxes[1] || !s->bboxes[2] || !s->bboxes[3]) + return AVERROR(ENOMEM); + return 0; } @@ -155,11 +242,28 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; int bpp = s->max_pixsteps[0]; - int w, h, x, y, shrink_by; + int w, h, x, y, shrink_by, i; AVDictionary **metadata; int outliers, last_y; int limit = lrint(s->limit); + const int inw = inlink->w; + const int inh = inlink->h; + uint8_t *tmpbuf = s->tmpbuf; + uint8_t *filterbuf = s->filterbuf; + uint16_t *gradients = s->gradients; + int8_t *directions = s->directions; + const AVFrameSideData *sd = NULL; + int scan_w, scan_h, bboff; + + void (*sobel)(int w, int h, uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? &ff_sobel16 : &ff_sobel; + void (*gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? 
&ff_gaussian_blur16 : &ff_gaussian_blur; + + // ignore first s->skip frames if (++s->frame_nb > 0) { metadata = &frame->metadata; @@ -185,11 +289,109 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) last_y = y INC;\ } + if (s->mode == MODE_BLACK) { FIND(s->y1, 0, y < s->y1, +1, frame->linesize[0], bpp, frame->width); FIND(s->y2, frame->height - 1, y > FFMAX(s->y2, s->y1), -1, frame->linesize[0], bpp, frame->width); FIND(s->x1, 0, y < s->x1, +1, bpp, frame->linesize[0], frame->height); FIND(s->x2, frame->width - 1, y > FFMAX(s->x2, s->x1), -1, bpp, frame->linesize[0], frame->height); + } else { // MODE_MV_EDGES + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS); + s->x1 = 0; + s->y1 = 0; + s->x2 = inw - 1; + s->y2 = inh - 1; + + if (!sd) { + av_log(ctx, AV_LOG_WARNING, "Cannot detect: no motion vectors available"); + } else { + // gaussian filter to reduce noise + gaussian_blur(inw, inh, + filterbuf, inw*bpp, + frame->data[0], frame->linesize[0], bpp); + + // compute the 16-bits gradients and directions for the next step + sobel(inw, inh, gradients, inw, directions, inw, filterbuf, inw*bpp, bpp); + + // non_maximum_suppression() will actually keep & clip what's necessary and + // ignore the rest, so we need a clean output buffer + memset(tmpbuf, 0, inw * inh); + ff_non_maximum_suppression(inw, inh, tmpbuf, inw, directions, inw, gradients, inw); + + + // keep high values, or low values surrounded by high values + ff_double_threshold(s->low_u8, s->high_u8, inw, inh, + tmpbuf, inw, tmpbuf, inw); + + // scan all MVs and store bounding box + s->x1 = inw - 1; + s->y1 = inh - 1; + s->x2 = 0; + s->y2 = 0; + for (i = 0; i < sd->size / sizeof(AVMotionVector); i++) { + const AVMotionVector *mv = (const AVMotionVector*)sd->data + i; + const int mx = mv->dst_x - mv->src_x; + const int my = mv->dst_y - mv->src_y; + + if (mv->dst_x >= 0 && mv->dst_x < inw && + mv->dst_y >= 0 && mv->dst_y < inh && + mv->src_x >= 0 && mv->src_x < inw && + mv->src_y >= 0 && mv->src_y < inh && + mx * mx + my * my >= s->mv_threshold * s->mv_threshold) { + s->x1 = mv->dst_x < s->x1 ? mv->dst_x : s->x1; + s->y1 = mv->dst_y < s->y1 ? mv->dst_y : s->y1; + s->x2 = mv->dst_x > s->x2 ? mv->dst_x : s->x2; + s->y2 = mv->dst_y > s->y2 ? 
mv->dst_y : s->y2; + } + } + + // assert x1<x2, y1<y2 + if (s->x1 > s->x2) FFSWAP(int, s->x1, s->x2); + if (s->y1 > s->y2) FFSWAP(int, s->y1, s->y2); + + // scan outward looking for 0-edge-lines in edge image + scan_w = s->x2 - s->x1; + scan_h = s->y2 - s->y1; + +#define FIND_EDGE(DST, FROM, NOEND, INC, STEP0, STEP1, LEN) \ + for (last_y = y = FROM; NOEND; y = y INC) { \ + if (checkline_edge(ctx, tmpbuf + STEP0 * y, STEP1, LEN, bpp)) { \ + if (last_y INC == y) { \ + DST = y; \ + break; \ + } else \ + last_y = y; \ + } \ + } \ + if (!(NOEND)) { \ + DST = y -(INC); \ + } + FIND_EDGE(s->y1, s->y1, y >= 0, -1, inw, bpp, scan_w); + FIND_EDGE(s->y2, s->y2, y < inh, +1, inw, bpp, scan_w); + FIND_EDGE(s->x1, s->x1, y >= 0, -1, bpp, inw, scan_h); + FIND_EDGE(s->x2, s->x2, y < inw, +1, bpp, inw, scan_h); + + // queue bboxes + bboff = (s->frame_nb - 1) % s->window_size; + s->bboxes[0][bboff] = s->x1; + s->bboxes[1][bboff] = s->x2; + s->bboxes[2][bboff] = s->y1; + s->bboxes[3][bboff] = s->y2; + + // sort queue + bboff = FFMIN(s->frame_nb, s->window_size); + AV_QSORT(s->bboxes[0], bboff, int, comp); + AV_QSORT(s->bboxes[1], bboff, int, comp); + AV_QSORT(s->bboxes[2], bboff, int, comp); + AV_QSORT(s->bboxes[3], bboff, int, comp); + + // return median of window_size elems + s->x1 = s->bboxes[0][bboff/2]; + s->x2 = s->bboxes[1][bboff/2]; + s->y1 = s->bboxes[2][bboff/2]; + s->y2 = s->bboxes[3][bboff/2]; + } + } // round x and y (up), important for yuv colorspaces // make sure they stay rounded! @@ -243,6 +445,12 @@ static const AVOption cropdetect_options[] = { { "skip", "Number of initial frames to skip", OFFSET(skip), AV_OPT_TYPE_INT, { .i64 = 2 }, 0, INT_MAX, FLAGS }, { "reset_count", "Recalculate the crop area after this many frames",OFFSET(reset_count),AV_OPT_TYPE_INT,{ .i64 = 0 }, 0, INT_MAX, FLAGS }, { "max_outliers", "Threshold count of outliers", OFFSET(max_outliers),AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, + { "mode", "set mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=MODE_BLACK}, 0, MODE_NB-1, FLAGS, "mode" }, + { "black", "detect black pixels surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_BLACK}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "mvedges", "detect motion and edged surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_MV_EDGES}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "high", "Set high threshold for edge detection", OFFSET(high), AV_OPT_TYPE_FLOAT, {.dbl=25/255.}, 0, 1, FLAGS }, + { "low", "Set low threshold for edge detection", OFFSET(low), AV_OPT_TYPE_FLOAT, {.dbl=15/255.}, 0, 1, FLAGS }, + { "mv_threshold", "motion vector threshold when estimating video window size", OFFSET(mv_threshold), AV_OPT_TYPE_INT, {.i64=8}, 0, 100, FLAGS}, { NULL } }; @@ -270,6 +478,7 @@ const AVFilter ff_vf_cropdetect = { .priv_size = sizeof(CropDetectContext), .priv_class = &cropdetect_class, .init = init, + .uninit = uninit, FILTER_INPUTS(avfilter_vf_cropdetect_inputs), FILTER_OUTPUTS(avfilter_vf_cropdetect_outputs), FILTER_PIXFMTS_ARRAY(pix_fmts), diff --git a/tests/fate/filter-video.mak b/tests/fate/filter-video.mak index faed832cd4..372c70bba7 100644 --- a/tests/fate/filter-video.mak +++ b/tests/fate/filter-video.mak @@ -641,11 +641,17 @@ FATE_METADATA_FILTER-$(call ALLYES, $(SCDET_DEPS)) += fate-filter-metadata-scdet fate-filter-metadata-scdet: SRC = $(TARGET_SAMPLES)/svq3/Vertical400kbit.sorenson3.mov fate-filter-metadata-scdet: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',scdet=s=1" -CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER 
CROPDETECT_FILTER \ +CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER MOVIE_FILTER MESTIMATE_FILTER CROPDETECT_FILTER \ SCALE_FILTER MOV_DEMUXER H264_DECODER FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect fate-filter-metadata-cropdetect: SRC = $(TARGET_SAMPLES)/filter/cropdetect.mp4 fate-filter-metadata-cropdetect: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',cropdetect=max_outliers=3" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect1 +fate-filter-metadata-cropdetect1: SRC = $(TARGET_SAMPLES)/filter/cropdetect1.mp4 +fate-filter-metadata-cropdetect1: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect2 +fate-filter-metadata-cropdetect2: SRC = $(TARGET_SAMPLES)/filter/cropdetect2.mp4 +fate-filter-metadata-cropdetect2: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" FREEZEDETECT_DEPS = LAVFI_INDEV MPTESTSRC_FILTER SCALE_FILTER FREEZEDETECT_FILTER FATE_METADATA_FILTER-$(call ALLYES, $(FREEZEDETECT_DEPS)) += fate-filter-metadata-freezedetect diff --git a/tests/ref/fate/filter-metadata-cropdetect1 b/tests/ref/fate/filter-metadata-cropdetect1 new file mode 100644 index 0000000000..892373cc11 --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect1 @@ -0,0 +1,9 @@ +pts=0 +pts=1001 +pts=2002|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=851|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=601|tag:lavfi.cropdetect.w=832|tag:lavfi.cropdetect.h=288|tag:lavfi.cropdetect.x=20|tag:lavfi.cropdetect.y=314 +pts=3003|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=4004|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=5005|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=6006|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=7007|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=8008|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 diff --git a/tests/ref/fate/filter-metadata-cropdetect2 b/tests/ref/fate/filter-metadata-cropdetect2 new file mode 100644 index 0000000000..6b433d17cb --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect2 @@ -0,0 +1,9 @@ +pts=0 +pts=512 
+pts=1024|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34
+pts=1536|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34
+pts=2048|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=2560|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=3072|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=3584|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=4096|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
-- 
2.20.1 (Apple Git-117)
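A note on the smoothing step in the patch above: mvedges mode does not report the per-frame rectangle directly. Each frame's x1/x2/y1/y2 is stored in a window of window_size entries (reset_count, at least 15) and the reported crop is the per-coordinate median of that window, which is what the four AV_QSORT calls implement. The standalone sketch below illustrates that idea; median_of_window(), cmp_int() and the sample values are illustrative and simplified (the filter sorts its window arrays in place rather than a copy).

/* Standalone illustration of median-over-window smoothing; compile with
 * any C compiler. Simplified sketch, not the filter's exact code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static int median_of_window(const int *window, int filled)
{
    int tmp[64];                        /* sketch assumes filled <= 64 */
    memcpy(tmp, window, filled * sizeof(*tmp));
    qsort(tmp, filled, sizeof(*tmp), cmp_int);
    return tmp[filled / 2];
}

int main(void)
{
    /* hypothetical per-frame x1 estimates with one outlier */
    int x1_window[15] = { 20, 20, 22, 0, 20, 21, 20 };
    int filled = 7;                     /* frames seen, capped at window size */

    printf("reported x1 = %d\n", median_of_window(x1_window, filled));
    return 0;
}

With the outlier 0 in the window, the reported x1 stays at 20, which is why a single noisy frame cannot move the detected crop rectangle.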
* Re: [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges 2022-07-11 8:54 ` [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges Thilo Borgmann @ 2022-07-16 21:09 ` Thilo Borgmann 2022-07-17 7:54 ` Thilo Borgmann 0 siblings, 1 reply; 10+ messages in thread From: Thilo Borgmann @ 2022-07-16 21:09 UTC (permalink / raw) To: ffmpeg-devel [-- Attachment #1: Type: text/plain, Size: 68 bytes --] Am 11.07.22 um 10:54 schrieb Thilo Borgmann: > $subject v3. -Thilo [-- Attachment #2: v3-0002-lavfi-cropdetect-Add-new-mode-to-detect-crop-area.patch --] [-- Type: text/plain, Size: 21015 bytes --] From 763c169d82395ec3fd59fa66ebc78c676f0f186d Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Sat, 16 Jul 2022 23:01:14 +0200 Subject: [PATCH v3 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges This filter allows crop detection even if the video is embedded in non-black areas. --- doc/filters.texi | 42 +++- libavfilter/vf_cropdetect.c | 211 ++++++++++++++++++++- tests/fate/filter-video.mak | 8 +- tests/ref/fate/filter-metadata-cropdetect1 | 9 + tests/ref/fate/filter-metadata-cropdetect2 | 9 + 5 files changed, 276 insertions(+), 3 deletions(-) create mode 100644 tests/ref/fate/filter-metadata-cropdetect1 create mode 100644 tests/ref/fate/filter-metadata-cropdetect2 diff --git a/doc/filters.texi b/doc/filters.texi index d65e83d4d0..bd2d2429d7 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -10075,7 +10075,7 @@ Auto-detect the crop size. It calculates the necessary cropping parameters and prints the recommended parameters via the logging system. The detected dimensions -correspond to the non-black area of the input video. +correspond to the non-black or video area of the input video according to @var{mode}. It accepts the following parameters: @@ -10106,8 +10106,48 @@ detect the current optimal crop area. Default value is 0. This can be useful when channel logos distort the video area. 0 indicates 'never reset', and returns the largest area encountered during playback. + +@item mv_threshold +Set motion in pixel units as threshold for motion detection. It defaults to 8. + +@item low +@item high +Set low and high threshold values used by the Canny thresholding +algorithm. + +The high threshold selects the "strong" edge pixels, which are then +connected through 8-connectivity with the "weak" edge pixels selected +by the low threshold. + +@var{low} and @var{high} threshold values must be chosen in the range +[0,1], and @var{low} should be lesser or equal to @var{high}. + +Default value for @var{low} is @code{5/255}, and default value for @var{high} +is @code{15/255}. 
@end table +@subsection Examples + +@itemize +@item +Find video area surrounded by black borders: +@example +ffmpeg -i file.mp4 -vf cropdetect,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, generate motion vectors beforehand: +@example +ffmpeg -i file.mp4 -vf mestimate,cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, use motion vectors from decoder: +@example +ffmpeg -flags2 +export_mvs -i file.mp4 -vf cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example +@end itemize + @anchor{cue} @section cue diff --git a/libavfilter/vf_cropdetect.c b/libavfilter/vf_cropdetect.c index b887b9ecb1..5e9e99ab17 100644 --- a/libavfilter/vf_cropdetect.c +++ b/libavfilter/vf_cropdetect.c @@ -26,11 +26,14 @@ #include "libavutil/imgutils.h" #include "libavutil/internal.h" #include "libavutil/opt.h" +#include "libavutil/motion_vector.h" +#include "libavutil/qsort.h" #include "avfilter.h" #include "formats.h" #include "internal.h" #include "video.h" +#include "edge_common.h" typedef struct CropDetectContext { const AVClass *class; @@ -42,6 +45,16 @@ typedef struct CropDetectContext { int frame_nb; int max_pixsteps[4]; int max_outliers; + int mode; + int window_size; + int mv_threshold; + float low, high; + uint8_t low_u8, high_u8; + uint8_t *filterbuf; + uint8_t *tmpbuf; + uint16_t *gradients; + char *directions; + int *bboxes[4]; } CropDetectContext; static const enum AVPixelFormat pix_fmts[] = { @@ -61,6 +74,17 @@ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_NONE }; +enum CropMode { + MODE_BLACK, + MODE_MV_EDGES, + MODE_NB +}; + +static int comp(const int *a,const int *b) +{ + return FFDIFFSIGN(*a, *b); +} + static int checkline(void *ctx, const unsigned char *src, int stride, int len, int bpp) { int total = 0; @@ -116,11 +140,43 @@ static int checkline(void *ctx, const unsigned char *src, int stride, int len, i return total; } +static int checkline_edge(void *ctx, const unsigned char *src, int stride, int len, int bpp) +{ + const uint16_t *src16 = (const uint16_t *)src; + + switch (bpp) { + case 1: + while (--len >= 0) { + if(src[0]) return 0; + src += stride; + } + break; + case 2: + stride >>= 1; + while (--len >= 0) { + if(src16[0]) return 0; + src16 += stride; + } + break; + case 3: + case 4: + while (--len >= 0) { + if(src[0] || src[1] || src[2]) return 0; + src += stride; + } + break; + } + + return 1; +} + static av_cold int init(AVFilterContext *ctx) { CropDetectContext *s = ctx->priv; s->frame_nb = -1 * s->skip; + s->low_u8 = s->low * 255. + .5; + s->high_u8 = s->high * 255. 
+ .5; av_log(ctx, AV_LOG_VERBOSE, "limit:%f round:%d skip:%d reset_count:%d\n", s->limit, s->round, s->skip, s->reset_count); @@ -128,11 +184,27 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold void uninit(AVFilterContext *ctx) +{ + CropDetectContext *s = ctx->priv; + + av_freep(&s->tmpbuf); + av_freep(&s->filterbuf); + av_freep(&s->gradients); + av_freep(&s->directions); + av_freep(&s->bboxes[0]); + av_freep(&s->bboxes[1]); + av_freep(&s->bboxes[2]); + av_freep(&s->bboxes[3]); +} + static int config_input(AVFilterLink *inlink) { AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format); + const int bufsize = inlink->w * inlink->h; + int bpp; av_image_fill_max_pixsteps(s->max_pixsteps, NULL, desc); @@ -144,6 +216,21 @@ static int config_input(AVFilterLink *inlink) s->x2 = 0; s->y2 = 0; + bpp = s->max_pixsteps[0]; + s->window_size = FFMAX(s->reset_count, 15); + s->tmpbuf = av_malloc(bufsize); + s->filterbuf = av_malloc(bufsize * s->max_pixsteps[0]); + s->gradients = av_calloc(bufsize, sizeof(*s->gradients)); + s->directions = av_malloc(bufsize); + s->bboxes[0] = av_malloc(s->window_size * sizeof(*s->bboxes[0])); + s->bboxes[1] = av_malloc(s->window_size * sizeof(*s->bboxes[1])); + s->bboxes[2] = av_malloc(s->window_size * sizeof(*s->bboxes[2])); + s->bboxes[3] = av_malloc(s->window_size * sizeof(*s->bboxes[3])); + + if (!s->tmpbuf || !s->filterbuf || !s->gradients || !s->directions || + !s->bboxes[0] || !s->bboxes[1] || !s->bboxes[2] || !s->bboxes[3]) + return AVERROR(ENOMEM); + return 0; } @@ -155,11 +242,28 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; int bpp = s->max_pixsteps[0]; - int w, h, x, y, shrink_by; + int w, h, x, y, shrink_by, i; AVDictionary **metadata; int outliers, last_y; int limit = lrint(s->limit); + const int inw = inlink->w; + const int inh = inlink->h; + uint8_t *tmpbuf = s->tmpbuf; + uint8_t *filterbuf = s->filterbuf; + uint16_t *gradients = s->gradients; + int8_t *directions = s->directions; + const AVFrameSideData *sd = NULL; + int scan_w, scan_h, bboff; + + void (*sobel)(int w, int h, uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? &ff_sobel_16 : &ff_sobel_8; + void (*gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? 
&ff_gaussian_blur_16 : &ff_gaussian_blur_8; + + // ignore first s->skip frames if (++s->frame_nb > 0) { metadata = &frame->metadata; @@ -185,11 +289,109 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) last_y = y INC;\ } + if (s->mode == MODE_BLACK) { FIND(s->y1, 0, y < s->y1, +1, frame->linesize[0], bpp, frame->width); FIND(s->y2, frame->height - 1, y > FFMAX(s->y2, s->y1), -1, frame->linesize[0], bpp, frame->width); FIND(s->x1, 0, y < s->x1, +1, bpp, frame->linesize[0], frame->height); FIND(s->x2, frame->width - 1, y > FFMAX(s->x2, s->x1), -1, bpp, frame->linesize[0], frame->height); + } else { // MODE_MV_EDGES + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS); + s->x1 = 0; + s->y1 = 0; + s->x2 = inw - 1; + s->y2 = inh - 1; + + if (!sd) { + av_log(ctx, AV_LOG_WARNING, "Cannot detect: no motion vectors available"); + } else { + // gaussian filter to reduce noise + gaussian_blur(inw, inh, + filterbuf, inw*bpp, + frame->data[0], frame->linesize[0], bpp); + + // compute the 16-bits gradients and directions for the next step + sobel(inw, inh, gradients, inw, directions, inw, filterbuf, inw*bpp, bpp); + + // non_maximum_suppression() will actually keep & clip what's necessary and + // ignore the rest, so we need a clean output buffer + memset(tmpbuf, 0, inw * inh); + ff_non_maximum_suppression(inw, inh, tmpbuf, inw, directions, inw, gradients, inw); + + + // keep high values, or low values surrounded by high values + ff_double_threshold(s->low_u8, s->high_u8, inw, inh, + tmpbuf, inw, tmpbuf, inw); + + // scan all MVs and store bounding box + s->x1 = inw - 1; + s->y1 = inh - 1; + s->x2 = 0; + s->y2 = 0; + for (i = 0; i < sd->size / sizeof(AVMotionVector); i++) { + const AVMotionVector *mv = (const AVMotionVector*)sd->data + i; + const int mx = mv->dst_x - mv->src_x; + const int my = mv->dst_y - mv->src_y; + + if (mv->dst_x >= 0 && mv->dst_x < inw && + mv->dst_y >= 0 && mv->dst_y < inh && + mv->src_x >= 0 && mv->src_x < inw && + mv->src_y >= 0 && mv->src_y < inh && + mx * mx + my * my >= s->mv_threshold * s->mv_threshold) { + s->x1 = mv->dst_x < s->x1 ? mv->dst_x : s->x1; + s->y1 = mv->dst_y < s->y1 ? mv->dst_y : s->y1; + s->x2 = mv->dst_x > s->x2 ? mv->dst_x : s->x2; + s->y2 = mv->dst_y > s->y2 ? 
mv->dst_y : s->y2; + } + } + + // assert x1<x2, y1<y2 + if (s->x1 > s->x2) FFSWAP(int, s->x1, s->x2); + if (s->y1 > s->y2) FFSWAP(int, s->y1, s->y2); + + // scan outward looking for 0-edge-lines in edge image + scan_w = s->x2 - s->x1; + scan_h = s->y2 - s->y1; + +#define FIND_EDGE(DST, FROM, NOEND, INC, STEP0, STEP1, LEN) \ + for (last_y = y = FROM; NOEND; y = y INC) { \ + if (checkline_edge(ctx, tmpbuf + STEP0 * y, STEP1, LEN, bpp)) { \ + if (last_y INC == y) { \ + DST = y; \ + break; \ + } else \ + last_y = y; \ + } \ + } \ + if (!(NOEND)) { \ + DST = y -(INC); \ + } + FIND_EDGE(s->y1, s->y1, y >= 0, -1, inw, bpp, scan_w); + FIND_EDGE(s->y2, s->y2, y < inh, +1, inw, bpp, scan_w); + FIND_EDGE(s->x1, s->x1, y >= 0, -1, bpp, inw, scan_h); + FIND_EDGE(s->x2, s->x2, y < inw, +1, bpp, inw, scan_h); + + // queue bboxes + bboff = (s->frame_nb - 1) % s->window_size; + s->bboxes[0][bboff] = s->x1; + s->bboxes[1][bboff] = s->x2; + s->bboxes[2][bboff] = s->y1; + s->bboxes[3][bboff] = s->y2; + + // sort queue + bboff = FFMIN(s->frame_nb, s->window_size); + AV_QSORT(s->bboxes[0], bboff, int, comp); + AV_QSORT(s->bboxes[1], bboff, int, comp); + AV_QSORT(s->bboxes[2], bboff, int, comp); + AV_QSORT(s->bboxes[3], bboff, int, comp); + + // return median of window_size elems + s->x1 = s->bboxes[0][bboff/2]; + s->x2 = s->bboxes[1][bboff/2]; + s->y1 = s->bboxes[2][bboff/2]; + s->y2 = s->bboxes[3][bboff/2]; + } + } // round x and y (up), important for yuv colorspaces // make sure they stay rounded! @@ -243,6 +445,12 @@ static const AVOption cropdetect_options[] = { { "skip", "Number of initial frames to skip", OFFSET(skip), AV_OPT_TYPE_INT, { .i64 = 2 }, 0, INT_MAX, FLAGS }, { "reset_count", "Recalculate the crop area after this many frames",OFFSET(reset_count),AV_OPT_TYPE_INT,{ .i64 = 0 }, 0, INT_MAX, FLAGS }, { "max_outliers", "Threshold count of outliers", OFFSET(max_outliers),AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, + { "mode", "set mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=MODE_BLACK}, 0, MODE_NB-1, FLAGS, "mode" }, + { "black", "detect black pixels surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_BLACK}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "mvedges", "detect motion and edged surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_MV_EDGES}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "high", "Set high threshold for edge detection", OFFSET(high), AV_OPT_TYPE_FLOAT, {.dbl=25/255.}, 0, 1, FLAGS }, + { "low", "Set low threshold for edge detection", OFFSET(low), AV_OPT_TYPE_FLOAT, {.dbl=15/255.}, 0, 1, FLAGS }, + { "mv_threshold", "motion vector threshold when estimating video window size", OFFSET(mv_threshold), AV_OPT_TYPE_INT, {.i64=8}, 0, 100, FLAGS}, { NULL } }; @@ -270,6 +478,7 @@ const AVFilter ff_vf_cropdetect = { .priv_size = sizeof(CropDetectContext), .priv_class = &cropdetect_class, .init = init, + .uninit = uninit, FILTER_INPUTS(avfilter_vf_cropdetect_inputs), FILTER_OUTPUTS(avfilter_vf_cropdetect_outputs), FILTER_PIXFMTS_ARRAY(pix_fmts), diff --git a/tests/fate/filter-video.mak b/tests/fate/filter-video.mak index faed832cd4..372c70bba7 100644 --- a/tests/fate/filter-video.mak +++ b/tests/fate/filter-video.mak @@ -641,11 +641,17 @@ FATE_METADATA_FILTER-$(call ALLYES, $(SCDET_DEPS)) += fate-filter-metadata-scdet fate-filter-metadata-scdet: SRC = $(TARGET_SAMPLES)/svq3/Vertical400kbit.sorenson3.mov fate-filter-metadata-scdet: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',scdet=s=1" -CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER 
CROPDETECT_FILTER \ +CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER MOVIE_FILTER MESTIMATE_FILTER CROPDETECT_FILTER \ SCALE_FILTER MOV_DEMUXER H264_DECODER FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect fate-filter-metadata-cropdetect: SRC = $(TARGET_SAMPLES)/filter/cropdetect.mp4 fate-filter-metadata-cropdetect: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',cropdetect=max_outliers=3" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect1 +fate-filter-metadata-cropdetect1: SRC = $(TARGET_SAMPLES)/filter/cropdetect1.mp4 +fate-filter-metadata-cropdetect1: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect2 +fate-filter-metadata-cropdetect2: SRC = $(TARGET_SAMPLES)/filter/cropdetect2.mp4 +fate-filter-metadata-cropdetect2: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" FREEZEDETECT_DEPS = LAVFI_INDEV MPTESTSRC_FILTER SCALE_FILTER FREEZEDETECT_FILTER FATE_METADATA_FILTER-$(call ALLYES, $(FREEZEDETECT_DEPS)) += fate-filter-metadata-freezedetect diff --git a/tests/ref/fate/filter-metadata-cropdetect1 b/tests/ref/fate/filter-metadata-cropdetect1 new file mode 100644 index 0000000000..892373cc11 --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect1 @@ -0,0 +1,9 @@ +pts=0 +pts=1001 +pts=2002|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=851|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=601|tag:lavfi.cropdetect.w=832|tag:lavfi.cropdetect.h=288|tag:lavfi.cropdetect.x=20|tag:lavfi.cropdetect.y=314 +pts=3003|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=4004|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=5005|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=6006|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=7007|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=8008|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 diff --git a/tests/ref/fate/filter-metadata-cropdetect2 b/tests/ref/fate/filter-metadata-cropdetect2 new file mode 100644 index 0000000000..6b433d17cb --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect2 @@ -0,0 +1,9 @@ +pts=0 +pts=512 
+pts=1024|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34
+pts=1536|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34
+pts=2048|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=2560|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=3072|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=3584|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
+pts=4096|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
-- 
2.20.1 (Apple Git-117)
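On the second stage of mvedges in the patch above: after the coarse bounding box is taken from the motion vectors, FIND_EDGE walks each side of the rectangle outwards through the Canny edge map with checkline_edge() and only fixes a border once it reaches edge-free lines. The sketch below is a simplified, self-contained version of that idea for an 8-bit edge map; grow_box(), row_has_edge() and col_has_edge() are illustrative names, and the real macro is stricter: it stops at two consecutive empty lines, scans only the extent of the motion-vector box, and also handles 2- and 3/4-byte pixel steps.

/* Simplified, standalone sketch of the outward boundary scan. */
#include <stdint.h>
#include <stdio.h>

static int row_has_edge(const uint8_t *edges, int w, int y)
{
    for (int x = 0; x < w; x++)
        if (edges[y * w + x])
            return 1;
    return 0;
}

static int col_has_edge(const uint8_t *edges, int w, int h, int x)
{
    for (int y = 0; y < h; y++)
        if (edges[y * w + x])
            return 1;
    return 0;
}

static void grow_box(const uint8_t *edges, int w, int h,
                     int *x1, int *y1, int *x2, int *y2)
{
    /* push each side outwards while the neighbouring line still has edges */
    while (*y1 > 0     && row_has_edge(edges, w, *y1 - 1))    (*y1)--;
    while (*y2 < h - 1 && row_has_edge(edges, w, *y2 + 1))    (*y2)++;
    while (*x1 > 0     && col_has_edge(edges, w, h, *x1 - 1)) (*x1)--;
    while (*x2 < w - 1 && col_has_edge(edges, w, h, *x2 + 1)) (*x2)++;
}

int main(void)
{
    enum { W = 8, H = 6 };
    uint8_t edges[W * H] = { 0 };
    int x1 = 3, y1 = 2, x2 = 4, y2 = 3;  /* coarse box from motion vectors */

    edges[1 * W + 2] = 255;              /* edge pixels just outside the box */
    edges[4 * W + 5] = 255;

    grow_box(edges, W, H, &x1, &y1, &x2, &y2);
    printf("x1=%d y1=%d x2=%d y2=%d\n", x1, y1, x2, y2);
    return 0;
}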
* Re: [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges 2022-07-16 21:09 ` Thilo Borgmann @ 2022-07-17 7:54 ` Thilo Borgmann 2022-07-18 14:15 ` Thilo Borgmann 0 siblings, 1 reply; 10+ messages in thread From: Thilo Borgmann @ 2022-07-17 7:54 UTC (permalink / raw) To: ffmpeg-devel [-- Attachment #1: Type: text/plain, Size: 125 bytes --] Am 16.07.22 um 23:09 schrieb Thilo Borgmann: > Am 11.07.22 um 10:54 schrieb Thilo Borgmann: >> $subject > > v3. v4. -Thilo [-- Attachment #2: v4-0002-lavfi-cropdetect-Add-new-mode-to-detect-crop-area.patch --] [-- Type: text/plain, Size: 21015 bytes --] From 9933d7d69781e1922b4a2ddc22777fdef588dbb2 Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Sun, 17 Jul 2022 09:52:01 +0200 Subject: [PATCH v4 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges This filter allows crop detection even if the video is embedded in non-black areas. --- doc/filters.texi | 42 +++- libavfilter/vf_cropdetect.c | 211 ++++++++++++++++++++- tests/fate/filter-video.mak | 8 +- tests/ref/fate/filter-metadata-cropdetect1 | 9 + tests/ref/fate/filter-metadata-cropdetect2 | 9 + 5 files changed, 276 insertions(+), 3 deletions(-) create mode 100644 tests/ref/fate/filter-metadata-cropdetect1 create mode 100644 tests/ref/fate/filter-metadata-cropdetect2 diff --git a/doc/filters.texi b/doc/filters.texi index d65e83d4d0..bd2d2429d7 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -10075,7 +10075,7 @@ Auto-detect the crop size. It calculates the necessary cropping parameters and prints the recommended parameters via the logging system. The detected dimensions -correspond to the non-black area of the input video. +correspond to the non-black or video area of the input video according to @var{mode}. It accepts the following parameters: @@ -10106,8 +10106,48 @@ detect the current optimal crop area. Default value is 0. This can be useful when channel logos distort the video area. 0 indicates 'never reset', and returns the largest area encountered during playback. + +@item mv_threshold +Set motion in pixel units as threshold for motion detection. It defaults to 8. + +@item low +@item high +Set low and high threshold values used by the Canny thresholding +algorithm. + +The high threshold selects the "strong" edge pixels, which are then +connected through 8-connectivity with the "weak" edge pixels selected +by the low threshold. + +@var{low} and @var{high} threshold values must be chosen in the range +[0,1], and @var{low} should be lesser or equal to @var{high}. + +Default value for @var{low} is @code{5/255}, and default value for @var{high} +is @code{15/255}. 
@end table +@subsection Examples + +@itemize +@item +Find video area surrounded by black borders: +@example +ffmpeg -i file.mp4 -vf cropdetect,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, generate motion vectors beforehand: +@example +ffmpeg -i file.mp4 -vf mestimate,cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, use motion vectors from decoder: +@example +ffmpeg -flags2 +export_mvs -i file.mp4 -vf cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example +@end itemize + @anchor{cue} @section cue diff --git a/libavfilter/vf_cropdetect.c b/libavfilter/vf_cropdetect.c index b887b9ecb1..5e9e99ab17 100644 --- a/libavfilter/vf_cropdetect.c +++ b/libavfilter/vf_cropdetect.c @@ -26,11 +26,14 @@ #include "libavutil/imgutils.h" #include "libavutil/internal.h" #include "libavutil/opt.h" +#include "libavutil/motion_vector.h" +#include "libavutil/qsort.h" #include "avfilter.h" #include "formats.h" #include "internal.h" #include "video.h" +#include "edge_common.h" typedef struct CropDetectContext { const AVClass *class; @@ -42,6 +45,16 @@ typedef struct CropDetectContext { int frame_nb; int max_pixsteps[4]; int max_outliers; + int mode; + int window_size; + int mv_threshold; + float low, high; + uint8_t low_u8, high_u8; + uint8_t *filterbuf; + uint8_t *tmpbuf; + uint16_t *gradients; + char *directions; + int *bboxes[4]; } CropDetectContext; static const enum AVPixelFormat pix_fmts[] = { @@ -61,6 +74,17 @@ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_NONE }; +enum CropMode { + MODE_BLACK, + MODE_MV_EDGES, + MODE_NB +}; + +static int comp(const int *a,const int *b) +{ + return FFDIFFSIGN(*a, *b); +} + static int checkline(void *ctx, const unsigned char *src, int stride, int len, int bpp) { int total = 0; @@ -116,11 +140,43 @@ static int checkline(void *ctx, const unsigned char *src, int stride, int len, i return total; } +static int checkline_edge(void *ctx, const unsigned char *src, int stride, int len, int bpp) +{ + const uint16_t *src16 = (const uint16_t *)src; + + switch (bpp) { + case 1: + while (--len >= 0) { + if(src[0]) return 0; + src += stride; + } + break; + case 2: + stride >>= 1; + while (--len >= 0) { + if(src16[0]) return 0; + src16 += stride; + } + break; + case 3: + case 4: + while (--len >= 0) { + if(src[0] || src[1] || src[2]) return 0; + src += stride; + } + break; + } + + return 1; +} + static av_cold int init(AVFilterContext *ctx) { CropDetectContext *s = ctx->priv; s->frame_nb = -1 * s->skip; + s->low_u8 = s->low * 255. + .5; + s->high_u8 = s->high * 255. 
+ .5; av_log(ctx, AV_LOG_VERBOSE, "limit:%f round:%d skip:%d reset_count:%d\n", s->limit, s->round, s->skip, s->reset_count); @@ -128,11 +184,27 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold void uninit(AVFilterContext *ctx) +{ + CropDetectContext *s = ctx->priv; + + av_freep(&s->tmpbuf); + av_freep(&s->filterbuf); + av_freep(&s->gradients); + av_freep(&s->directions); + av_freep(&s->bboxes[0]); + av_freep(&s->bboxes[1]); + av_freep(&s->bboxes[2]); + av_freep(&s->bboxes[3]); +} + static int config_input(AVFilterLink *inlink) { AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format); + const int bufsize = inlink->w * inlink->h; + int bpp; av_image_fill_max_pixsteps(s->max_pixsteps, NULL, desc); @@ -144,6 +216,21 @@ static int config_input(AVFilterLink *inlink) s->x2 = 0; s->y2 = 0; + bpp = s->max_pixsteps[0]; + s->window_size = FFMAX(s->reset_count, 15); + s->tmpbuf = av_malloc(bufsize); + s->filterbuf = av_malloc(bufsize * s->max_pixsteps[0]); + s->gradients = av_calloc(bufsize, sizeof(*s->gradients)); + s->directions = av_malloc(bufsize); + s->bboxes[0] = av_malloc(s->window_size * sizeof(*s->bboxes[0])); + s->bboxes[1] = av_malloc(s->window_size * sizeof(*s->bboxes[1])); + s->bboxes[2] = av_malloc(s->window_size * sizeof(*s->bboxes[2])); + s->bboxes[3] = av_malloc(s->window_size * sizeof(*s->bboxes[3])); + + if (!s->tmpbuf || !s->filterbuf || !s->gradients || !s->directions || + !s->bboxes[0] || !s->bboxes[1] || !s->bboxes[2] || !s->bboxes[3]) + return AVERROR(ENOMEM); + return 0; } @@ -155,11 +242,28 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; int bpp = s->max_pixsteps[0]; - int w, h, x, y, shrink_by; + int w, h, x, y, shrink_by, i; AVDictionary **metadata; int outliers, last_y; int limit = lrint(s->limit); + const int inw = inlink->w; + const int inh = inlink->h; + uint8_t *tmpbuf = s->tmpbuf; + uint8_t *filterbuf = s->filterbuf; + uint16_t *gradients = s->gradients; + int8_t *directions = s->directions; + const AVFrameSideData *sd = NULL; + int scan_w, scan_h, bboff; + + void (*sobel)(int w, int h, uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? &ff_sobel_16 : &ff_sobel_8; + void (*gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? 
&ff_gaussian_blur_16 : &ff_gaussian_blur_8; + + // ignore first s->skip frames if (++s->frame_nb > 0) { metadata = &frame->metadata; @@ -185,11 +289,109 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) last_y = y INC;\ } + if (s->mode == MODE_BLACK) { FIND(s->y1, 0, y < s->y1, +1, frame->linesize[0], bpp, frame->width); FIND(s->y2, frame->height - 1, y > FFMAX(s->y2, s->y1), -1, frame->linesize[0], bpp, frame->width); FIND(s->x1, 0, y < s->x1, +1, bpp, frame->linesize[0], frame->height); FIND(s->x2, frame->width - 1, y > FFMAX(s->x2, s->x1), -1, bpp, frame->linesize[0], frame->height); + } else { // MODE_MV_EDGES + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS); + s->x1 = 0; + s->y1 = 0; + s->x2 = inw - 1; + s->y2 = inh - 1; + + if (!sd) { + av_log(ctx, AV_LOG_WARNING, "Cannot detect: no motion vectors available"); + } else { + // gaussian filter to reduce noise + gaussian_blur(inw, inh, + filterbuf, inw*bpp, + frame->data[0], frame->linesize[0], bpp); + + // compute the 16-bits gradients and directions for the next step + sobel(inw, inh, gradients, inw, directions, inw, filterbuf, inw*bpp, bpp); + + // non_maximum_suppression() will actually keep & clip what's necessary and + // ignore the rest, so we need a clean output buffer + memset(tmpbuf, 0, inw * inh); + ff_non_maximum_suppression(inw, inh, tmpbuf, inw, directions, inw, gradients, inw); + + + // keep high values, or low values surrounded by high values + ff_double_threshold(s->low_u8, s->high_u8, inw, inh, + tmpbuf, inw, tmpbuf, inw); + + // scan all MVs and store bounding box + s->x1 = inw - 1; + s->y1 = inh - 1; + s->x2 = 0; + s->y2 = 0; + for (i = 0; i < sd->size / sizeof(AVMotionVector); i++) { + const AVMotionVector *mv = (const AVMotionVector*)sd->data + i; + const int mx = mv->dst_x - mv->src_x; + const int my = mv->dst_y - mv->src_y; + + if (mv->dst_x >= 0 && mv->dst_x < inw && + mv->dst_y >= 0 && mv->dst_y < inh && + mv->src_x >= 0 && mv->src_x < inw && + mv->src_y >= 0 && mv->src_y < inh && + mx * mx + my * my >= s->mv_threshold * s->mv_threshold) { + s->x1 = mv->dst_x < s->x1 ? mv->dst_x : s->x1; + s->y1 = mv->dst_y < s->y1 ? mv->dst_y : s->y1; + s->x2 = mv->dst_x > s->x2 ? mv->dst_x : s->x2; + s->y2 = mv->dst_y > s->y2 ? 
mv->dst_y : s->y2; + } + } + + // assert x1<x2, y1<y2 + if (s->x1 > s->x2) FFSWAP(int, s->x1, s->x2); + if (s->y1 > s->y2) FFSWAP(int, s->y1, s->y2); + + // scan outward looking for 0-edge-lines in edge image + scan_w = s->x2 - s->x1; + scan_h = s->y2 - s->y1; + +#define FIND_EDGE(DST, FROM, NOEND, INC, STEP0, STEP1, LEN) \ + for (last_y = y = FROM; NOEND; y = y INC) { \ + if (checkline_edge(ctx, tmpbuf + STEP0 * y, STEP1, LEN, bpp)) { \ + if (last_y INC == y) { \ + DST = y; \ + break; \ + } else \ + last_y = y; \ + } \ + } \ + if (!(NOEND)) { \ + DST = y -(INC); \ + } + FIND_EDGE(s->y1, s->y1, y >= 0, -1, inw, bpp, scan_w); + FIND_EDGE(s->y2, s->y2, y < inh, +1, inw, bpp, scan_w); + FIND_EDGE(s->x1, s->x1, y >= 0, -1, bpp, inw, scan_h); + FIND_EDGE(s->x2, s->x2, y < inw, +1, bpp, inw, scan_h); + + // queue bboxes + bboff = (s->frame_nb - 1) % s->window_size; + s->bboxes[0][bboff] = s->x1; + s->bboxes[1][bboff] = s->x2; + s->bboxes[2][bboff] = s->y1; + s->bboxes[3][bboff] = s->y2; + + // sort queue + bboff = FFMIN(s->frame_nb, s->window_size); + AV_QSORT(s->bboxes[0], bboff, int, comp); + AV_QSORT(s->bboxes[1], bboff, int, comp); + AV_QSORT(s->bboxes[2], bboff, int, comp); + AV_QSORT(s->bboxes[3], bboff, int, comp); + + // return median of window_size elems + s->x1 = s->bboxes[0][bboff/2]; + s->x2 = s->bboxes[1][bboff/2]; + s->y1 = s->bboxes[2][bboff/2]; + s->y2 = s->bboxes[3][bboff/2]; + } + } // round x and y (up), important for yuv colorspaces // make sure they stay rounded! @@ -243,6 +445,12 @@ static const AVOption cropdetect_options[] = { { "skip", "Number of initial frames to skip", OFFSET(skip), AV_OPT_TYPE_INT, { .i64 = 2 }, 0, INT_MAX, FLAGS }, { "reset_count", "Recalculate the crop area after this many frames",OFFSET(reset_count),AV_OPT_TYPE_INT,{ .i64 = 0 }, 0, INT_MAX, FLAGS }, { "max_outliers", "Threshold count of outliers", OFFSET(max_outliers),AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, + { "mode", "set mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=MODE_BLACK}, 0, MODE_NB-1, FLAGS, "mode" }, + { "black", "detect black pixels surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_BLACK}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "mvedges", "detect motion and edged surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_MV_EDGES}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "high", "Set high threshold for edge detection", OFFSET(high), AV_OPT_TYPE_FLOAT, {.dbl=25/255.}, 0, 1, FLAGS }, + { "low", "Set low threshold for edge detection", OFFSET(low), AV_OPT_TYPE_FLOAT, {.dbl=15/255.}, 0, 1, FLAGS }, + { "mv_threshold", "motion vector threshold when estimating video window size", OFFSET(mv_threshold), AV_OPT_TYPE_INT, {.i64=8}, 0, 100, FLAGS}, { NULL } }; @@ -270,6 +478,7 @@ const AVFilter ff_vf_cropdetect = { .priv_size = sizeof(CropDetectContext), .priv_class = &cropdetect_class, .init = init, + .uninit = uninit, FILTER_INPUTS(avfilter_vf_cropdetect_inputs), FILTER_OUTPUTS(avfilter_vf_cropdetect_outputs), FILTER_PIXFMTS_ARRAY(pix_fmts), diff --git a/tests/fate/filter-video.mak b/tests/fate/filter-video.mak index faed832cd4..372c70bba7 100644 --- a/tests/fate/filter-video.mak +++ b/tests/fate/filter-video.mak @@ -641,11 +641,17 @@ FATE_METADATA_FILTER-$(call ALLYES, $(SCDET_DEPS)) += fate-filter-metadata-scdet fate-filter-metadata-scdet: SRC = $(TARGET_SAMPLES)/svq3/Vertical400kbit.sorenson3.mov fate-filter-metadata-scdet: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',scdet=s=1" -CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER 
CROPDETECT_FILTER \ +CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER MOVIE_FILTER MESTIMATE_FILTER CROPDETECT_FILTER \ SCALE_FILTER MOV_DEMUXER H264_DECODER FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect fate-filter-metadata-cropdetect: SRC = $(TARGET_SAMPLES)/filter/cropdetect.mp4 fate-filter-metadata-cropdetect: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',cropdetect=max_outliers=3" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect1 +fate-filter-metadata-cropdetect1: SRC = $(TARGET_SAMPLES)/filter/cropdetect1.mp4 +fate-filter-metadata-cropdetect1: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect2 +fate-filter-metadata-cropdetect2: SRC = $(TARGET_SAMPLES)/filter/cropdetect2.mp4 +fate-filter-metadata-cropdetect2: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" FREEZEDETECT_DEPS = LAVFI_INDEV MPTESTSRC_FILTER SCALE_FILTER FREEZEDETECT_FILTER FATE_METADATA_FILTER-$(call ALLYES, $(FREEZEDETECT_DEPS)) += fate-filter-metadata-freezedetect diff --git a/tests/ref/fate/filter-metadata-cropdetect1 b/tests/ref/fate/filter-metadata-cropdetect1 new file mode 100644 index 0000000000..892373cc11 --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect1 @@ -0,0 +1,9 @@ +pts=0 +pts=1001 +pts=2002|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=851|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=601|tag:lavfi.cropdetect.w=832|tag:lavfi.cropdetect.h=288|tag:lavfi.cropdetect.x=20|tag:lavfi.cropdetect.y=314 +pts=3003|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=4004|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=5005|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=6006|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=7007|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=8008|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 diff --git a/tests/ref/fate/filter-metadata-cropdetect2 b/tests/ref/fate/filter-metadata-cropdetect2 new file mode 100644 index 0000000000..6b433d17cb --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect2 @@ -0,0 +1,9 @@ +pts=0 +pts=512 
+pts=1024|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34 +pts=1536|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34 +pts=2048|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=2560|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=3072|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=3584|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=4096|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 -- 2.20.1 (Apple Git-117) [-- Attachment #3: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread
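A note on the mvedges pass in the patch above: before any edge scanning happens, the filter derives an initial crop rectangle purely from the exported motion vectors, keeping the min/max destination coordinates of every vector whose displacement is at least mv_threshold pixels. The following is a minimal standalone sketch of just that step; the MotionVector struct and the function name are invented for illustration and are not the AVMotionVector side-data API:

#include <stdio.h>

/* Hypothetical, simplified stand-in for AVMotionVector (illustration only). */
typedef struct MotionVector {
    int src_x, src_y;   /* where the block came from */
    int dst_x, dst_y;   /* where the block ended up  */
} MotionVector;

/* Shrink-to-fit bounding box over all vectors moving at least `threshold` pixels. */
static void mv_bounding_box(const MotionVector *mv, int n, int w, int h,
                            int threshold, int *x1, int *y1, int *x2, int *y2)
{
    *x1 = w - 1; *y1 = h - 1; *x2 = 0; *y2 = 0;   /* start inverted */
    for (int i = 0; i < n; i++) {
        const int mx = mv[i].dst_x - mv[i].src_x;
        const int my = mv[i].dst_y - mv[i].src_y;
        if (mv[i].dst_x < 0 || mv[i].dst_x >= w || mv[i].dst_y < 0 || mv[i].dst_y >= h)
            continue;                              /* ignore vectors pointing outside */
        if (mx * mx + my * my < threshold * threshold)
            continue;                              /* too little motion */
        if (mv[i].dst_x < *x1) *x1 = mv[i].dst_x;
        if (mv[i].dst_y < *y1) *y1 = mv[i].dst_y;
        if (mv[i].dst_x > *x2) *x2 = mv[i].dst_x;
        if (mv[i].dst_y > *y2) *y2 = mv[i].dst_y;
    }
    /* No vector qualified: coordinates are still inverted, swap back to the full frame. */
    if (*x1 > *x2) { int t = *x1; *x1 = *x2; *x2 = t; }
    if (*y1 > *y2) { int t = *y1; *y1 = *y2; *y2 = t; }
}

int main(void)
{
    const MotionVector mvs[] = {
        { 100, 100, 120, 110 },   /* strong motion inside the embedded video */
        { 300, 200, 301, 200 },   /* below threshold, ignored                */
        { 500, 400, 480, 420 },   /* strong motion, extends the bounding box */
    };
    int x1, y1, x2, y2;
    mv_bounding_box(mvs, 3, 640, 480, 8, &x1, &y1, &x2, &y2);
    printf("bbox: (%d,%d)-(%d,%d)\n", x1, y1, x2, y2);   /* (120,110)-(480,420) */
    return 0;
}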
* Re: [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges 2022-07-17 7:54 ` Thilo Borgmann @ 2022-07-18 14:15 ` Thilo Borgmann 0 siblings, 0 replies; 10+ messages in thread From: Thilo Borgmann @ 2022-07-18 14:15 UTC (permalink / raw) To: ffmpeg-devel [-- Attachment #1: Type: text/plain, Size: 204 bytes --] Am 17.07.22 um 09:54 schrieb Thilo Borgmann: > Am 16.07.22 um 23:09 schrieb Thilo Borgmann: >> Am 11.07.22 um 10:54 schrieb Thilo Borgmann: >>> $subject >> >> v3. > > v4. v5 now with fixed docs. -Thilo [-- Attachment #2: v5-0002-lavfi-cropdetect-Add-new-mode-to-detect-crop-area.patch --] [-- Type: text/plain, Size: 21660 bytes --] From c5962d580dd103d0b9cc08724a6863e62383da7b Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Mon, 18 Jul 2022 16:10:12 +0200 Subject: [PATCH v5 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges This filter allows crop detection even if the video is embedded in non-black areas. --- doc/filters.texi | 53 +++++- libavfilter/vf_cropdetect.c | 211 ++++++++++++++++++++- tests/fate/filter-video.mak | 8 +- tests/ref/fate/filter-metadata-cropdetect1 | 9 + tests/ref/fate/filter-metadata-cropdetect2 | 9 + 5 files changed, 287 insertions(+), 3 deletions(-) create mode 100644 tests/ref/fate/filter-metadata-cropdetect1 create mode 100644 tests/ref/fate/filter-metadata-cropdetect2 diff --git a/doc/filters.texi b/doc/filters.texi index d65e83d4d0..19db8e05aa 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -10075,12 +10075,23 @@ Auto-detect the crop size. It calculates the necessary cropping parameters and prints the recommended parameters via the logging system. The detected dimensions -correspond to the non-black area of the input video. +correspond to the non-black or video area of the input video according to @var{mode}. It accepts the following parameters: @table @option +@item mode +Depending on @var{mode} crop detection is based on either the mere black value of surrounding pixels or a combination of motion vectors and edge pixels. + +@table @samp +@item black +Detect black pixels surrounding the playing video. For fine control use option @var{limit}. + +@item mvedges +Detect the playing video by the motion vectors inside the video and scanning for edge pixels typically forming the border of a playing video. +@end table + @item limit Set higher black value threshold, which can be optionally specified from nothing (0) to everything (255 for 8-bit based formats). An intensity @@ -10106,8 +10117,48 @@ detect the current optimal crop area. Default value is 0. This can be useful when channel logos distort the video area. 0 indicates 'never reset', and returns the largest area encountered during playback. + +@item mv_threshold +Set motion in pixel units as threshold for motion detection. It defaults to 8. + +@item low +@item high +Set low and high threshold values used by the Canny thresholding +algorithm. + +The high threshold selects the "strong" edge pixels, which are then +connected through 8-connectivity with the "weak" edge pixels selected +by the low threshold. + +@var{low} and @var{high} threshold values must be chosen in the range +[0,1], and @var{low} should be lesser or equal to @var{high}. + +Default value for @var{low} is @code{5/255}, and default value for @var{high} +is @code{15/255}. 
@end table +@subsection Examples + +@itemize +@item +Find video area surrounded by black borders: +@example +ffmpeg -i file.mp4 -vf cropdetect,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, generate motion vectors beforehand: +@example +ffmpeg -i file.mp4 -vf mestimate,cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, use motion vectors from decoder: +@example +ffmpeg -flags2 +export_mvs -i file.mp4 -vf cropdetect=mode=mvedges,metadata=mode=print -f null - +@end example +@end itemize + @anchor{cue} @section cue diff --git a/libavfilter/vf_cropdetect.c b/libavfilter/vf_cropdetect.c index b887b9ecb1..e920e671ab 100644 --- a/libavfilter/vf_cropdetect.c +++ b/libavfilter/vf_cropdetect.c @@ -26,11 +26,14 @@ #include "libavutil/imgutils.h" #include "libavutil/internal.h" #include "libavutil/opt.h" +#include "libavutil/motion_vector.h" +#include "libavutil/qsort.h" #include "avfilter.h" #include "formats.h" #include "internal.h" #include "video.h" +#include "edge_common.h" typedef struct CropDetectContext { const AVClass *class; @@ -42,6 +45,16 @@ typedef struct CropDetectContext { int frame_nb; int max_pixsteps[4]; int max_outliers; + int mode; + int window_size; + int mv_threshold; + float low, high; + uint8_t low_u8, high_u8; + uint8_t *filterbuf; + uint8_t *tmpbuf; + uint16_t *gradients; + char *directions; + int *bboxes[4]; } CropDetectContext; static const enum AVPixelFormat pix_fmts[] = { @@ -61,6 +74,17 @@ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_NONE }; +enum CropMode { + MODE_BLACK, + MODE_MV_EDGES, + MODE_NB +}; + +static int comp(const int *a,const int *b) +{ + return FFDIFFSIGN(*a, *b); +} + static int checkline(void *ctx, const unsigned char *src, int stride, int len, int bpp) { int total = 0; @@ -116,11 +140,43 @@ static int checkline(void *ctx, const unsigned char *src, int stride, int len, i return total; } +static int checkline_edge(void *ctx, const unsigned char *src, int stride, int len, int bpp) +{ + const uint16_t *src16 = (const uint16_t *)src; + + switch (bpp) { + case 1: + while (--len >= 0) { + if (src[0]) return 0; + src += stride; + } + break; + case 2: + stride >>= 1; + while (--len >= 0) { + if (src16[0]) return 0; + src16 += stride; + } + break; + case 3: + case 4: + while (--len >= 0) { + if (src[0] || src[1] || src[2]) return 0; + src += stride; + } + break; + } + + return 1; +} + static av_cold int init(AVFilterContext *ctx) { CropDetectContext *s = ctx->priv; s->frame_nb = -1 * s->skip; + s->low_u8 = s->low * 255. + .5; + s->high_u8 = s->high * 255. 
+ .5; av_log(ctx, AV_LOG_VERBOSE, "limit:%f round:%d skip:%d reset_count:%d\n", s->limit, s->round, s->skip, s->reset_count); @@ -128,11 +184,27 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold void uninit(AVFilterContext *ctx) +{ + CropDetectContext *s = ctx->priv; + + av_freep(&s->tmpbuf); + av_freep(&s->filterbuf); + av_freep(&s->gradients); + av_freep(&s->directions); + av_freep(&s->bboxes[0]); + av_freep(&s->bboxes[1]); + av_freep(&s->bboxes[2]); + av_freep(&s->bboxes[3]); +} + static int config_input(AVFilterLink *inlink) { AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format); + const int bufsize = inlink->w * inlink->h; + int bpp; av_image_fill_max_pixsteps(s->max_pixsteps, NULL, desc); @@ -144,6 +216,21 @@ static int config_input(AVFilterLink *inlink) s->x2 = 0; s->y2 = 0; + bpp = s->max_pixsteps[0]; + s->window_size = FFMAX(s->reset_count, 15); + s->tmpbuf = av_malloc(bufsize); + s->filterbuf = av_malloc(bufsize * s->max_pixsteps[0]); + s->gradients = av_calloc(bufsize, sizeof(*s->gradients)); + s->directions = av_malloc(bufsize); + s->bboxes[0] = av_malloc(s->window_size * sizeof(*s->bboxes[0])); + s->bboxes[1] = av_malloc(s->window_size * sizeof(*s->bboxes[1])); + s->bboxes[2] = av_malloc(s->window_size * sizeof(*s->bboxes[2])); + s->bboxes[3] = av_malloc(s->window_size * sizeof(*s->bboxes[3])); + + if (!s->tmpbuf || !s->filterbuf || !s->gradients || !s->directions || + !s->bboxes[0] || !s->bboxes[1] || !s->bboxes[2] || !s->bboxes[3]) + return AVERROR(ENOMEM); + return 0; } @@ -155,11 +242,28 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; int bpp = s->max_pixsteps[0]; - int w, h, x, y, shrink_by; + int w, h, x, y, shrink_by, i; AVDictionary **metadata; int outliers, last_y; int limit = lrint(s->limit); + const int inw = inlink->w; + const int inh = inlink->h; + uint8_t *tmpbuf = s->tmpbuf; + uint8_t *filterbuf = s->filterbuf; + uint16_t *gradients = s->gradients; + int8_t *directions = s->directions; + const AVFrameSideData *sd = NULL; + int scan_w, scan_h, bboff; + + void (*sobel)(int w, int h, uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? &ff_sobel_16 : &ff_sobel_8; + void (*gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) = (bpp == 2) ? 
&ff_gaussian_blur_16 : &ff_gaussian_blur_8; + + // ignore first s->skip frames if (++s->frame_nb > 0) { metadata = &frame->metadata; @@ -185,11 +289,109 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) last_y = y INC;\ } + if (s->mode == MODE_BLACK) { FIND(s->y1, 0, y < s->y1, +1, frame->linesize[0], bpp, frame->width); FIND(s->y2, frame->height - 1, y > FFMAX(s->y2, s->y1), -1, frame->linesize[0], bpp, frame->width); FIND(s->x1, 0, y < s->x1, +1, bpp, frame->linesize[0], frame->height); FIND(s->x2, frame->width - 1, y > FFMAX(s->x2, s->x1), -1, bpp, frame->linesize[0], frame->height); + } else { // MODE_MV_EDGES + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS); + s->x1 = 0; + s->y1 = 0; + s->x2 = inw - 1; + s->y2 = inh - 1; + + if (!sd) { + av_log(ctx, AV_LOG_WARNING, "Cannot detect: no motion vectors available"); + } else { + // gaussian filter to reduce noise + gaussian_blur(inw, inh, + filterbuf, inw*bpp, + frame->data[0], frame->linesize[0], bpp); + + // compute the 16-bits gradients and directions for the next step + sobel(inw, inh, gradients, inw, directions, inw, filterbuf, inw*bpp, bpp); + + // non_maximum_suppression() will actually keep & clip what's necessary and + // ignore the rest, so we need a clean output buffer + memset(tmpbuf, 0, inw * inh); + ff_non_maximum_suppression(inw, inh, tmpbuf, inw, directions, inw, gradients, inw); + + + // keep high values, or low values surrounded by high values + ff_double_threshold(s->low_u8, s->high_u8, inw, inh, + tmpbuf, inw, tmpbuf, inw); + + // scan all MVs and store bounding box + s->x1 = inw - 1; + s->y1 = inh - 1; + s->x2 = 0; + s->y2 = 0; + for (i = 0; i < sd->size / sizeof(AVMotionVector); i++) { + const AVMotionVector *mv = (const AVMotionVector*)sd->data + i; + const int mx = mv->dst_x - mv->src_x; + const int my = mv->dst_y - mv->src_y; + + if (mv->dst_x >= 0 && mv->dst_x < inw && + mv->dst_y >= 0 && mv->dst_y < inh && + mv->src_x >= 0 && mv->src_x < inw && + mv->src_y >= 0 && mv->src_y < inh && + mx * mx + my * my >= s->mv_threshold * s->mv_threshold) { + s->x1 = mv->dst_x < s->x1 ? mv->dst_x : s->x1; + s->y1 = mv->dst_y < s->y1 ? mv->dst_y : s->y1; + s->x2 = mv->dst_x > s->x2 ? mv->dst_x : s->x2; + s->y2 = mv->dst_y > s->y2 ? 
mv->dst_y : s->y2; + } + } + + // assert x1<x2, y1<y2 + if (s->x1 > s->x2) FFSWAP(int, s->x1, s->x2); + if (s->y1 > s->y2) FFSWAP(int, s->y1, s->y2); + + // scan outward looking for 0-edge-lines in edge image + scan_w = s->x2 - s->x1; + scan_h = s->y2 - s->y1; + +#define FIND_EDGE(DST, FROM, NOEND, INC, STEP0, STEP1, LEN) \ + for (last_y = y = FROM; NOEND; y = y INC) { \ + if (checkline_edge(ctx, tmpbuf + STEP0 * y, STEP1, LEN, bpp)) { \ + if (last_y INC == y) { \ + DST = y; \ + break; \ + } else \ + last_y = y; \ + } \ + } \ + if (!(NOEND)) { \ + DST = y -(INC); \ + } + FIND_EDGE(s->y1, s->y1, y >= 0, -1, inw, bpp, scan_w); + FIND_EDGE(s->y2, s->y2, y < inh, +1, inw, bpp, scan_w); + FIND_EDGE(s->x1, s->x1, y >= 0, -1, bpp, inw, scan_h); + FIND_EDGE(s->x2, s->x2, y < inw, +1, bpp, inw, scan_h); + + // queue bboxes + bboff = (s->frame_nb - 1) % s->window_size; + s->bboxes[0][bboff] = s->x1; + s->bboxes[1][bboff] = s->x2; + s->bboxes[2][bboff] = s->y1; + s->bboxes[3][bboff] = s->y2; + + // sort queue + bboff = FFMIN(s->frame_nb, s->window_size); + AV_QSORT(s->bboxes[0], bboff, int, comp); + AV_QSORT(s->bboxes[1], bboff, int, comp); + AV_QSORT(s->bboxes[2], bboff, int, comp); + AV_QSORT(s->bboxes[3], bboff, int, comp); + + // return median of window_size elems + s->x1 = s->bboxes[0][bboff/2]; + s->x2 = s->bboxes[1][bboff/2]; + s->y1 = s->bboxes[2][bboff/2]; + s->y2 = s->bboxes[3][bboff/2]; + } + } // round x and y (up), important for yuv colorspaces // make sure they stay rounded! @@ -243,6 +445,12 @@ static const AVOption cropdetect_options[] = { { "skip", "Number of initial frames to skip", OFFSET(skip), AV_OPT_TYPE_INT, { .i64 = 2 }, 0, INT_MAX, FLAGS }, { "reset_count", "Recalculate the crop area after this many frames",OFFSET(reset_count),AV_OPT_TYPE_INT,{ .i64 = 0 }, 0, INT_MAX, FLAGS }, { "max_outliers", "Threshold count of outliers", OFFSET(max_outliers),AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, + { "mode", "set mode", OFFSET(mode), AV_OPT_TYPE_INT, {.i64=MODE_BLACK}, 0, MODE_NB-1, FLAGS, "mode" }, + { "black", "detect black pixels surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_BLACK}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "mvedges", "detect motion and edged surrounding the video", 0, AV_OPT_TYPE_CONST, {.i64=MODE_MV_EDGES}, INT_MIN, INT_MAX, FLAGS, "mode" }, + { "high", "Set high threshold for edge detection", OFFSET(high), AV_OPT_TYPE_FLOAT, {.dbl=25/255.}, 0, 1, FLAGS }, + { "low", "Set low threshold for edge detection", OFFSET(low), AV_OPT_TYPE_FLOAT, {.dbl=15/255.}, 0, 1, FLAGS }, + { "mv_threshold", "motion vector threshold when estimating video window size", OFFSET(mv_threshold), AV_OPT_TYPE_INT, {.i64=8}, 0, 100, FLAGS}, { NULL } }; @@ -270,6 +478,7 @@ const AVFilter ff_vf_cropdetect = { .priv_size = sizeof(CropDetectContext), .priv_class = &cropdetect_class, .init = init, + .uninit = uninit, FILTER_INPUTS(avfilter_vf_cropdetect_inputs), FILTER_OUTPUTS(avfilter_vf_cropdetect_outputs), FILTER_PIXFMTS_ARRAY(pix_fmts), diff --git a/tests/fate/filter-video.mak b/tests/fate/filter-video.mak index faed832cd4..372c70bba7 100644 --- a/tests/fate/filter-video.mak +++ b/tests/fate/filter-video.mak @@ -641,11 +641,17 @@ FATE_METADATA_FILTER-$(call ALLYES, $(SCDET_DEPS)) += fate-filter-metadata-scdet fate-filter-metadata-scdet: SRC = $(TARGET_SAMPLES)/svq3/Vertical400kbit.sorenson3.mov fate-filter-metadata-scdet: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',scdet=s=1" -CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER 
CROPDETECT_FILTER \ +CROPDETECT_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER MOVIE_FILTER MESTIMATE_FILTER CROPDETECT_FILTER \ SCALE_FILTER MOV_DEMUXER H264_DECODER FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect fate-filter-metadata-cropdetect: SRC = $(TARGET_SAMPLES)/filter/cropdetect.mp4 fate-filter-metadata-cropdetect: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',cropdetect=max_outliers=3" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect1 +fate-filter-metadata-cropdetect1: SRC = $(TARGET_SAMPLES)/filter/cropdetect1.mp4 +fate-filter-metadata-cropdetect1: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata-cropdetect2 +fate-filter-metadata-cropdetect2: SRC = $(TARGET_SAMPLES)/filter/cropdetect2.mp4 +fate-filter-metadata-cropdetect2: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect=mode=mvedges,metadata=mode=print" FREEZEDETECT_DEPS = LAVFI_INDEV MPTESTSRC_FILTER SCALE_FILTER FREEZEDETECT_FILTER FATE_METADATA_FILTER-$(call ALLYES, $(FREEZEDETECT_DEPS)) += fate-filter-metadata-freezedetect diff --git a/tests/ref/fate/filter-metadata-cropdetect1 b/tests/ref/fate/filter-metadata-cropdetect1 new file mode 100644 index 0000000000..892373cc11 --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect1 @@ -0,0 +1,9 @@ +pts=0 +pts=1001 +pts=2002|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=851|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=601|tag:lavfi.cropdetect.w=832|tag:lavfi.cropdetect.h=288|tag:lavfi.cropdetect.x=20|tag:lavfi.cropdetect.y=314 +pts=3003|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=4004|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=5005|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=6006|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=7007|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=8008|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 diff --git a/tests/ref/fate/filter-metadata-cropdetect2 b/tests/ref/fate/filter-metadata-cropdetect2 new file mode 100644 index 0000000000..6b433d17cb --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect2 @@ -0,0 +1,9 @@ +pts=0 +pts=512 
+pts=1024|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34 +pts=1536|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34 +pts=2048|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=2560|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=3072|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=3584|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=4096|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 -- 2.20.1 (Apple Git-117) [-- Attachment #3: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread
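The tail of the mvedges path above (the bboxes[] queue sorted with AV_QSORT) is a per-coordinate running median over the last window_size frames, which keeps a single noisy frame from yanking the reported crop area around. Here is a standalone sketch of that smoothing idea, using plain qsort on a copy instead of sorting the live queue in place as the filter does; all names are invented for the example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WINDOW_SIZE 15   /* mirrors FFMAX(reset_count, 15) in the filter */

typedef struct MedianWindow {
    int history[WINDOW_SIZE];   /* ring buffer of the most recent values */
    int count;                  /* how many values are valid so far      */
} MedianWindow;

static int cmp_int(const void *a, const void *b)
{
    return (*(const int *)a > *(const int *)b) - (*(const int *)a < *(const int *)b);
}

/* Push this frame's value and return the median of the values seen so far
 * (at most the last WINDOW_SIZE of them). */
static int median_push(MedianWindow *w, int frame_nb, int value)
{
    int sorted[WINDOW_SIZE];
    int n;

    w->history[frame_nb % WINDOW_SIZE] = value;
    w->count = frame_nb + 1 < WINDOW_SIZE ? frame_nb + 1 : WINDOW_SIZE;
    n = w->count;

    memcpy(sorted, w->history, n * sizeof(int));
    qsort(sorted, n, sizeof(int), cmp_int);
    return sorted[n / 2];
}

int main(void)
{
    /* Noisy per-frame x1 estimates; the outlier at frame 3 is rejected. */
    const int x1_per_frame[] = { 20, 21, 20, 0, 22, 21, 20 };
    MedianWindow w = { {0}, 0 };

    for (int f = 0; f < 7; f++)
        printf("frame %d: raw x1=%d, smoothed x1=%d\n",
               f, x1_per_frame[f], median_push(&w, f, x1_per_frame[f]));
    return 0;
}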
* Re: [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel 2022-07-11 8:53 [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel Thilo Borgmann 2022-07-11 8:54 ` [FFmpeg-devel] [PATCH v2 2/2] lavfi/cropdetect: Add new mode to detect crop-area based on motion vectors and edges Thilo Borgmann @ 2022-07-16 21:07 ` Thilo Borgmann 2022-07-17 7:54 ` Thilo Borgmann 1 sibling, 1 reply; 10+ messages in thread From: Thilo Borgmann @ 2022-07-16 21:07 UTC (permalink / raw) To: ffmpeg-devel [-- Attachment #1: Type: text/plain, Size: 166 bytes --] Hi, > 1/2 adds 16 bit versions of ff_gaussian_blur and ff_sobel. > 2/2 adds new mode to cropdetect. v3 does it the template way for 1/2 as requested on IRC. -Thilo [-- Attachment #2: v3-0001-lavfi-edge_common-Templatify-ff_gaussian_blur-and.patch --] [-- Type: text/plain, Size: 13281 bytes --] From 5453c0e27cd2c54931b012d663178a7c0b5a9f5f Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Sat, 16 Jul 2022 22:59:57 +0200 Subject: [PATCH v3 1/2] lavfi/edge_common: Templatify ff_gaussian_blur and ff_sobel --- libavfilter/edge_common.c | 74 ++-------------------- libavfilter/edge_common.h | 22 ++++--- libavfilter/edge_template.c | 120 ++++++++++++++++++++++++++++++++++++ libavfilter/vf_blurdetect.c | 8 +-- libavfilter/vf_edgedetect.c | 14 ++--- 5 files changed, 152 insertions(+), 86 deletions(-) create mode 100644 libavfilter/edge_template.c diff --git a/libavfilter/edge_common.c b/libavfilter/edge_common.c index d72e8521cd..ebd47d7c53 100644 --- a/libavfilter/edge_common.c +++ b/libavfilter/edge_common.c @@ -46,33 +46,13 @@ static int get_rounded_direction(int gx, int gy) return DIRECTION_VERTICAL; } -// Simple sobel operator to get rounded gradients -void ff_sobel(int w, int h, - uint16_t *dst, int dst_linesize, - int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize) -{ - int i, j; - - for (j = 1; j < h - 1; j++) { - dst += dst_linesize; - dir += dir_linesize; - src += src_linesize; - for (i = 1; i < w - 1; i++) { - const int gx = - -1*src[-src_linesize + i-1] + 1*src[-src_linesize + i+1] - -2*src[ i-1] + 2*src[ i+1] - -1*src[ src_linesize + i-1] + 1*src[ src_linesize + i+1]; - const int gy = - -1*src[-src_linesize + i-1] + 1*src[ src_linesize + i-1] - -2*src[-src_linesize + i ] + 2*src[ src_linesize + i ] - -1*src[-src_linesize + i+1] + 1*src[ src_linesize + i+1]; +#undef DEPTH +#define DEPTH 8 +#include "edge_template.c" - dst[i] = FFABS(gx) + FFABS(gy); - dir[i] = get_rounded_direction(gx, gy); - } - } -} +#undef DEPTH +#define DEPTH 16 +#include "edge_template.c" // Filters rounded gradients to drop all non-maxima // Expects gradients generated by ff_sobel() @@ -137,45 +117,3 @@ void ff_double_threshold(int low, int high, int w, int h, src += src_linesize; } } - -// Applies gaussian blur, using 5x5 kernels, sigma = 1.4 -void ff_gaussian_blur(int w, int h, - uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize) -{ - int i, j; - - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - for (j = 2; j < h - 2; j++) { - dst[0] = src[0]; - dst[1] = src[1]; - for (i = 2; i < w - 2; i++) { - /* Gaussian mask of size 5x5 with sigma = 1.4 */ - dst[i] = ((src[-2*src_linesize + i-2] + src[2*src_linesize + i-2]) * 2 - + (src[-2*src_linesize + i-1] + src[2*src_linesize + i-1]) * 4 - + (src[-2*src_linesize + i ] + src[2*src_linesize + i ]) * 5 - + 
(src[-2*src_linesize + i+1] + src[2*src_linesize + i+1]) * 4 - + (src[-2*src_linesize + i+2] + src[2*src_linesize + i+2]) * 2 - - + (src[ -src_linesize + i-2] + src[ src_linesize + i-2]) * 4 - + (src[ -src_linesize + i-1] + src[ src_linesize + i-1]) * 9 - + (src[ -src_linesize + i ] + src[ src_linesize + i ]) * 12 - + (src[ -src_linesize + i+1] + src[ src_linesize + i+1]) * 9 - + (src[ -src_linesize + i+2] + src[ src_linesize + i+2]) * 4 - - + src[i-2] * 5 - + src[i-1] * 12 - + src[i ] * 15 - + src[i+1] * 12 - + src[i+2] * 5) / 159; - } - dst[i ] = src[i ]; - dst[i + 1] = src[i + 1]; - - dst += dst_linesize; - src += src_linesize; - } - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - memcpy(dst, src, w); -} diff --git a/libavfilter/edge_common.h b/libavfilter/edge_common.h index 87c143f2b8..cff4febd70 100644 --- a/libavfilter/edge_common.h +++ b/libavfilter/edge_common.h @@ -48,10 +48,14 @@ enum AVRoundedDirection { * @param src data pointers to source image * @param src_linesize linesizes for the source image */ -void ff_sobel(int w, int h, - uint16_t *dst, int dst_linesize, - int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize); +#define PROTO_SOBEL(depth) \ +void ff_sobel_##depth(int w, int h, \ + uint16_t *dst, int dst_linesize, \ + int8_t *dir, int dir_linesize, \ + const uint8_t *src, int src_linesize, int src_stride); + +PROTO_SOBEL(8) +PROTO_SOBEL(16) /** * Filters rounded gradients to drop all non-maxima pixels in the magnitude image @@ -100,8 +104,12 @@ void ff_double_threshold(int low, int high, int w, int h, * @param src data pointers to source image * @param src_linesize linesizes for the source image */ -void ff_gaussian_blur(int w, int h, - uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize); +#define PROTO_GAUSSIAN_BLUR(depth) \ +void ff_gaussian_blur_##depth(int w, int h, \ + uint8_t *dst, int dst_linesize, \ + const uint8_t *src, int src_linesize, int src_stride); + +PROTO_GAUSSIAN_BLUR(8) +PROTO_GAUSSIAN_BLUR(16) #endif diff --git a/libavfilter/edge_template.c b/libavfilter/edge_template.c new file mode 100644 index 0000000000..d3cf8221a4 --- /dev/null +++ b/libavfilter/edge_template.c @@ -0,0 +1,120 @@ +/* + * Copyright (c) 2022 Thilo Borgmann <thilo.borgmann _at_ mail.de> + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + * Redistribution and use in source and binary forms, with or without modification, + * are permitted provided that the following conditions are met: + */ + +#include "libavutil/avassert.h" +#include "avfilter.h" +#include "formats.h" +#include "internal.h" +#include "video.h" + +#undef pixel +#if DEPTH == 8 +#define pixel uint8_t +#else +#define pixel uint16_t +#endif + +#undef fn +#undef fn2 +#undef fn3 +#define fn3(a,b) ff_##a##_##b +#define fn2(a,b) fn3(a,b) +#define fn(a) fn2(a, DEPTH) + +void fn(sobel)(int w, int h, + uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + int i, j; + pixel *srcp = (pixel *)src; + + src_stride /= sizeof(pixel); + src_linesize /= sizeof(pixel); + dst_linesize /= sizeof(pixel); + + for (j = 1; j < h - 1; j++) { + dst += dst_linesize; + dir += dir_linesize; + srcp += src_linesize; + for (i = 1; i < w - 1; i++) { + const int gx = + -1*srcp[-src_linesize + (i-1)*src_stride] + 1*srcp[-src_linesize + (i+1)*src_stride] + -2*srcp[ (i-1)*src_stride] + 2*srcp[ (i+1)*src_stride] + -1*srcp[ src_linesize + (i-1)*src_stride] + 1*srcp[ src_linesize + (i+1)*src_stride]; + const int gy = + -1*srcp[-src_linesize + (i-1)*src_stride] + 1*srcp[ src_linesize + (i-1)*src_stride] + -2*srcp[-src_linesize + (i )*src_stride] + 2*srcp[ src_linesize + (i )*src_stride] + -1*srcp[-src_linesize + (i+1)*src_stride] + 1*srcp[ src_linesize + (i+1)*src_stride]; + + dst[i] = FFABS(gx) + FFABS(gy); + dir[i] = get_rounded_direction(gx, gy); + } + } +} + +void fn(gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + int i, j; + pixel *srcp = (pixel *)src; + pixel *dstp = (pixel *)dst; + + src_stride /= sizeof(pixel); + src_linesize /= sizeof(pixel); + dst_linesize /= sizeof(pixel); + + memcpy(dstp, srcp, w*2); dstp += dst_linesize; srcp += src_linesize; + memcpy(dstp, srcp, w*2); dstp += dst_linesize; srcp += src_linesize; + for (j = 2; j < h - 2; j++) { + dstp[0] = srcp[(0)*src_stride]; + dstp[1] = srcp[(1)*src_stride]; + for (i = 2; i < w - 2; i++) { + /* Gaussian mask of size 5x5 with sigma = 1.4 */ + dstp[i] = ((srcp[-2*src_linesize + (i-2)*src_stride] + srcp[2*src_linesize + (i-2)*src_stride]) * 2 + + (srcp[-2*src_linesize + (i-1)*src_stride] + srcp[2*src_linesize + (i-1)*src_stride]) * 4 + + (srcp[-2*src_linesize + (i )*src_stride] + srcp[2*src_linesize + (i )*src_stride]) * 5 + + (srcp[-2*src_linesize + (i+1)*src_stride] + srcp[2*src_linesize + (i+1)*src_stride]) * 4 + + (srcp[-2*src_linesize + (i+2)*src_stride] + srcp[2*src_linesize + (i+2)*src_stride]) * 2 + + + (srcp[ -src_linesize + (i-2)*src_stride] + srcp[ src_linesize + (i-2)*src_stride]) * 4 + + (srcp[ -src_linesize + (i-1)*src_stride] + srcp[ src_linesize + (i-1)*src_stride]) * 9 + + (srcp[ -src_linesize + (i )*src_stride] + srcp[ src_linesize + (i )*src_stride]) * 12 + + (srcp[ -src_linesize + (i+1)*src_stride] + srcp[ src_linesize + (i+1)*src_stride]) * 9 + + (srcp[ -src_linesize + (i+2)*src_stride] + srcp[ src_linesize + (i+2)*src_stride]) * 4 + + + srcp[(i-2)*src_stride] * 5 + + srcp[(i-1)*src_stride] * 12 + + srcp[(i )*src_stride] * 15 + + srcp[(i+1)*src_stride] * 12 + + srcp[(i+2)*src_stride] * 5) / 159; + } + dstp[i ] = srcp[(i )*src_stride]; + dstp[i + 1] = 
srcp[(i + 1)*src_stride]; + + dstp += dst_linesize; + srcp += src_linesize; + } + memcpy(dstp, srcp, w*2); dstp += dst_linesize; srcp += src_linesize; + memcpy(dstp, srcp, w*2); +} diff --git a/libavfilter/vf_blurdetect.c b/libavfilter/vf_blurdetect.c index 0e08ba96de..db06efcce7 100644 --- a/libavfilter/vf_blurdetect.c +++ b/libavfilter/vf_blurdetect.c @@ -283,12 +283,12 @@ static int blurdetect_filter_frame(AVFilterLink *inlink, AVFrame *in) nplanes++; // gaussian filter to reduce noise - ff_gaussian_blur(w, h, - filterbuf, w, - in->data[plane], in->linesize[plane]); + ff_gaussian_blur_8(w, h, + filterbuf, w, + in->data[plane], in->linesize[plane], 1); // compute the 16-bits gradients and directions for the next step - ff_sobel(w, h, gradients, w, directions, w, filterbuf, w); + ff_sobel_8(w, h, gradients, w, directions, w, filterbuf, w, 1); // non_maximum_suppression() will actually keep & clip what's necessary and // ignore the rest, so we need a clean output buffer diff --git a/libavfilter/vf_edgedetect.c b/libavfilter/vf_edgedetect.c index 90390ceb3e..603f06f141 100644 --- a/libavfilter/vf_edgedetect.c +++ b/libavfilter/vf_edgedetect.c @@ -191,15 +191,15 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) } /* gaussian filter to reduce noise */ - ff_gaussian_blur(width, height, - tmpbuf, width, - in->data[p], in->linesize[p]); + ff_gaussian_blur_8(width, height, + tmpbuf, width, + in->data[p], in->linesize[p], 1); /* compute the 16-bits gradients and directions for the next step */ - ff_sobel(width, height, - gradients, width, - directions,width, - tmpbuf, width); + ff_sobel_8(width, height, + gradients, width, + directions,width, + tmpbuf, width, 1); /* non_maximum_suppression() will actually keep & clip what's necessary and * ignore the rest, so we need a clean output buffer */ -- 2.20.1 (Apple Git-117) [-- Attachment #3: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel 2022-07-16 21:07 ` [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel Thilo Borgmann @ 2022-07-17 7:54 ` Thilo Borgmann 2022-07-18 14:15 ` Thilo Borgmann 0 siblings, 1 reply; 10+ messages in thread From: Thilo Borgmann @ 2022-07-17 7:54 UTC (permalink / raw) To: ffmpeg-devel [-- Attachment #1: Type: text/plain, Size: 274 bytes --] Am 16.07.22 um 23:07 schrieb Thilo Borgmann: > Hi, > >> 1/2 adds 16 bit versions of ff_gaussian_blur and ff_sobel. >> 2/2 adds new mode to cropdetect. > > v3 does it the template way for 1/2 as requested on IRC. v4 fixed bug in gaussian_blur. Otherwise identical. -Thilo [-- Attachment #2: v4-0001-lavfi-edge_common-Templatify-ff_gaussian_blur-and.patch --] [-- Type: text/plain, Size: 13329 bytes --] From 2cced42f8053c647384fe020cdb2e12f8b7b3d0a Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Sun, 17 Jul 2022 09:51:33 +0200 Subject: [PATCH v4 1/2] lavfi/edge_common: Templatify ff_gaussian_blur and ff_sobel --- libavfilter/edge_common.c | 74 ++-------------------- libavfilter/edge_common.h | 22 ++++--- libavfilter/edge_template.c | 120 ++++++++++++++++++++++++++++++++++++ libavfilter/vf_blurdetect.c | 8 +-- libavfilter/vf_edgedetect.c | 14 ++--- 5 files changed, 152 insertions(+), 86 deletions(-) create mode 100644 libavfilter/edge_template.c diff --git a/libavfilter/edge_common.c b/libavfilter/edge_common.c index d72e8521cd..ebd47d7c53 100644 --- a/libavfilter/edge_common.c +++ b/libavfilter/edge_common.c @@ -46,33 +46,13 @@ static int get_rounded_direction(int gx, int gy) return DIRECTION_VERTICAL; } -// Simple sobel operator to get rounded gradients -void ff_sobel(int w, int h, - uint16_t *dst, int dst_linesize, - int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize) -{ - int i, j; - - for (j = 1; j < h - 1; j++) { - dst += dst_linesize; - dir += dir_linesize; - src += src_linesize; - for (i = 1; i < w - 1; i++) { - const int gx = - -1*src[-src_linesize + i-1] + 1*src[-src_linesize + i+1] - -2*src[ i-1] + 2*src[ i+1] - -1*src[ src_linesize + i-1] + 1*src[ src_linesize + i+1]; - const int gy = - -1*src[-src_linesize + i-1] + 1*src[ src_linesize + i-1] - -2*src[-src_linesize + i ] + 2*src[ src_linesize + i ] - -1*src[-src_linesize + i+1] + 1*src[ src_linesize + i+1]; +#undef DEPTH +#define DEPTH 8 +#include "edge_template.c" - dst[i] = FFABS(gx) + FFABS(gy); - dir[i] = get_rounded_direction(gx, gy); - } - } -} +#undef DEPTH +#define DEPTH 16 +#include "edge_template.c" // Filters rounded gradients to drop all non-maxima // Expects gradients generated by ff_sobel() @@ -137,45 +117,3 @@ void ff_double_threshold(int low, int high, int w, int h, src += src_linesize; } } - -// Applies gaussian blur, using 5x5 kernels, sigma = 1.4 -void ff_gaussian_blur(int w, int h, - uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize) -{ - int i, j; - - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - for (j = 2; j < h - 2; j++) { - dst[0] = src[0]; - dst[1] = src[1]; - for (i = 2; i < w - 2; i++) { - /* Gaussian mask of size 5x5 with sigma = 1.4 */ - dst[i] = ((src[-2*src_linesize + i-2] + src[2*src_linesize + i-2]) * 2 - + (src[-2*src_linesize + i-1] + src[2*src_linesize + i-1]) * 4 - + (src[-2*src_linesize + i ] + src[2*src_linesize + i ]) * 5 - + (src[-2*src_linesize + i+1] + src[2*src_linesize 
+ i+1]) * 4 - + (src[-2*src_linesize + i+2] + src[2*src_linesize + i+2]) * 2 - - + (src[ -src_linesize + i-2] + src[ src_linesize + i-2]) * 4 - + (src[ -src_linesize + i-1] + src[ src_linesize + i-1]) * 9 - + (src[ -src_linesize + i ] + src[ src_linesize + i ]) * 12 - + (src[ -src_linesize + i+1] + src[ src_linesize + i+1]) * 9 - + (src[ -src_linesize + i+2] + src[ src_linesize + i+2]) * 4 - - + src[i-2] * 5 - + src[i-1] * 12 - + src[i ] * 15 - + src[i+1] * 12 - + src[i+2] * 5) / 159; - } - dst[i ] = src[i ]; - dst[i + 1] = src[i + 1]; - - dst += dst_linesize; - src += src_linesize; - } - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - memcpy(dst, src, w); -} diff --git a/libavfilter/edge_common.h b/libavfilter/edge_common.h index 87c143f2b8..cff4febd70 100644 --- a/libavfilter/edge_common.h +++ b/libavfilter/edge_common.h @@ -48,10 +48,14 @@ enum AVRoundedDirection { * @param src data pointers to source image * @param src_linesize linesizes for the source image */ -void ff_sobel(int w, int h, - uint16_t *dst, int dst_linesize, - int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize); +#define PROTO_SOBEL(depth) \ +void ff_sobel_##depth(int w, int h, \ + uint16_t *dst, int dst_linesize, \ + int8_t *dir, int dir_linesize, \ + const uint8_t *src, int src_linesize, int src_stride); + +PROTO_SOBEL(8) +PROTO_SOBEL(16) /** * Filters rounded gradients to drop all non-maxima pixels in the magnitude image @@ -100,8 +104,12 @@ void ff_double_threshold(int low, int high, int w, int h, * @param src data pointers to source image * @param src_linesize linesizes for the source image */ -void ff_gaussian_blur(int w, int h, - uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize); +#define PROTO_GAUSSIAN_BLUR(depth) \ +void ff_gaussian_blur_##depth(int w, int h, \ + uint8_t *dst, int dst_linesize, \ + const uint8_t *src, int src_linesize, int src_stride); + +PROTO_GAUSSIAN_BLUR(8) +PROTO_GAUSSIAN_BLUR(16) #endif diff --git a/libavfilter/edge_template.c b/libavfilter/edge_template.c new file mode 100644 index 0000000000..de8b318d91 --- /dev/null +++ b/libavfilter/edge_template.c @@ -0,0 +1,120 @@ +/* + * Copyright (c) 2022 Thilo Borgmann <thilo.borgmann _at_ mail.de> + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + * Redistribution and use in source and binary forms, with or without modification, + * are permitted provided that the following conditions are met: + */ + +#include "libavutil/avassert.h" +#include "avfilter.h" +#include "formats.h" +#include "internal.h" +#include "video.h" + +#undef pixel +#if DEPTH == 8 +#define pixel uint8_t +#else +#define pixel uint16_t +#endif + +#undef fn +#undef fn2 +#undef fn3 +#define fn3(a,b) ff_##a##_##b +#define fn2(a,b) fn3(a,b) +#define fn(a) fn2(a, DEPTH) + +void fn(sobel)(int w, int h, + uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + int i, j; + pixel *srcp = (pixel *)src; + + src_stride /= sizeof(pixel); + src_linesize /= sizeof(pixel); + dst_linesize /= sizeof(pixel); + + for (j = 1; j < h - 1; j++) { + dst += dst_linesize; + dir += dir_linesize; + srcp += src_linesize; + for (i = 1; i < w - 1; i++) { + const int gx = + -1*srcp[-src_linesize + (i-1)*src_stride] + 1*srcp[-src_linesize + (i+1)*src_stride] + -2*srcp[ (i-1)*src_stride] + 2*srcp[ (i+1)*src_stride] + -1*srcp[ src_linesize + (i-1)*src_stride] + 1*srcp[ src_linesize + (i+1)*src_stride]; + const int gy = + -1*srcp[-src_linesize + (i-1)*src_stride] + 1*srcp[ src_linesize + (i-1)*src_stride] + -2*srcp[-src_linesize + (i )*src_stride] + 2*srcp[ src_linesize + (i )*src_stride] + -1*srcp[-src_linesize + (i+1)*src_stride] + 1*srcp[ src_linesize + (i+1)*src_stride]; + + dst[i] = FFABS(gx) + FFABS(gy); + dir[i] = get_rounded_direction(gx, gy); + } + } +} + +void fn(gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + int i, j; + pixel *srcp = (pixel *)src; + pixel *dstp = (pixel *)dst; + + src_stride /= sizeof(pixel); + src_linesize /= sizeof(pixel); + dst_linesize /= sizeof(pixel); + + memcpy(dstp, srcp, w*sizeof(pixel)); dstp += dst_linesize; srcp += src_linesize; + memcpy(dstp, srcp, w*sizeof(pixel)); dstp += dst_linesize; srcp += src_linesize; + for (j = 2; j < h - 2; j++) { + dstp[0] = srcp[(0)*src_stride]; + dstp[1] = srcp[(1)*src_stride]; + for (i = 2; i < w - 2; i++) { + /* Gaussian mask of size 5x5 with sigma = 1.4 */ + dstp[i] = ((srcp[-2*src_linesize + (i-2)*src_stride] + srcp[2*src_linesize + (i-2)*src_stride]) * 2 + + (srcp[-2*src_linesize + (i-1)*src_stride] + srcp[2*src_linesize + (i-1)*src_stride]) * 4 + + (srcp[-2*src_linesize + (i )*src_stride] + srcp[2*src_linesize + (i )*src_stride]) * 5 + + (srcp[-2*src_linesize + (i+1)*src_stride] + srcp[2*src_linesize + (i+1)*src_stride]) * 4 + + (srcp[-2*src_linesize + (i+2)*src_stride] + srcp[2*src_linesize + (i+2)*src_stride]) * 2 + + + (srcp[ -src_linesize + (i-2)*src_stride] + srcp[ src_linesize + (i-2)*src_stride]) * 4 + + (srcp[ -src_linesize + (i-1)*src_stride] + srcp[ src_linesize + (i-1)*src_stride]) * 9 + + (srcp[ -src_linesize + (i )*src_stride] + srcp[ src_linesize + (i )*src_stride]) * 12 + + (srcp[ -src_linesize + (i+1)*src_stride] + srcp[ src_linesize + (i+1)*src_stride]) * 9 + + (srcp[ -src_linesize + (i+2)*src_stride] + srcp[ src_linesize + (i+2)*src_stride]) * 4 + + + srcp[(i-2)*src_stride] * 5 + + srcp[(i-1)*src_stride] * 12 + + srcp[(i )*src_stride] * 15 + + srcp[(i+1)*src_stride] * 12 + + srcp[(i+2)*src_stride] * 5) / 159; + } + dstp[i ] = srcp[(i 
)*src_stride]; + dstp[i + 1] = srcp[(i + 1)*src_stride]; + + dstp += dst_linesize; + srcp += src_linesize; + } + memcpy(dstp, srcp, w*sizeof(pixel)); dstp += dst_linesize; srcp += src_linesize; + memcpy(dstp, srcp, w*sizeof(pixel)); +} diff --git a/libavfilter/vf_blurdetect.c b/libavfilter/vf_blurdetect.c index 0e08ba96de..db06efcce7 100644 --- a/libavfilter/vf_blurdetect.c +++ b/libavfilter/vf_blurdetect.c @@ -283,12 +283,12 @@ static int blurdetect_filter_frame(AVFilterLink *inlink, AVFrame *in) nplanes++; // gaussian filter to reduce noise - ff_gaussian_blur(w, h, - filterbuf, w, - in->data[plane], in->linesize[plane]); + ff_gaussian_blur_8(w, h, + filterbuf, w, + in->data[plane], in->linesize[plane], 1); // compute the 16-bits gradients and directions for the next step - ff_sobel(w, h, gradients, w, directions, w, filterbuf, w); + ff_sobel_8(w, h, gradients, w, directions, w, filterbuf, w, 1); // non_maximum_suppression() will actually keep & clip what's necessary and // ignore the rest, so we need a clean output buffer diff --git a/libavfilter/vf_edgedetect.c b/libavfilter/vf_edgedetect.c index 90390ceb3e..603f06f141 100644 --- a/libavfilter/vf_edgedetect.c +++ b/libavfilter/vf_edgedetect.c @@ -191,15 +191,15 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) } /* gaussian filter to reduce noise */ - ff_gaussian_blur(width, height, - tmpbuf, width, - in->data[p], in->linesize[p]); + ff_gaussian_blur_8(width, height, + tmpbuf, width, + in->data[p], in->linesize[p], 1); /* compute the 16-bits gradients and directions for the next step */ - ff_sobel(width, height, - gradients, width, - directions,width, - tmpbuf, width); + ff_sobel_8(width, height, + gradients, width, + directions,width, + tmpbuf, width, 1); /* non_maximum_suppression() will actually keep & clip what's necessary and * ignore the rest, so we need a clean output buffer */ -- 2.20.1 (Apple Git-117) [-- Attachment #3: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread
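For context on the gaussian_blur fix mentioned above: the visible change between the v3 and v4 templates is the border-row copy, which went from a hard-coded w*2 bytes to w*sizeof(pixel). With DEPTH 16 the two are equal, but once the same template is also instantiated with DEPTH 8 (pixel == uint8_t) the byte count has to track the element size. A tiny standalone illustration of that point; the macro and variable names are generic, not the FFmpeg functions:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy one w-element border row, as the template does for the top/bottom rows.
 * The byte count must scale with the element type, not be a constant. */
#define COPY_ROW(dst, src, w) memcpy((dst), (src), (w) * sizeof(*(src)))

int main(void)
{
    uint8_t  src8[4]  = { 1, 2, 3, 4 },     dst8[4]  = { 0 };
    uint16_t src16[4] = { 10, 20, 30, 40 }, dst16[4] = { 0 };

    COPY_ROW(dst8,  src8,  4);   /* copies 4 * 1 = 4 bytes */
    COPY_ROW(dst16, src16, 4);   /* copies 4 * 2 = 8 bytes */

    printf("8-bit:  %d %d %d %d\n",  dst8[0],  dst8[1],  dst8[2],  dst8[3]);
    printf("16-bit: %d %d %d %d\n", dst16[0], dst16[1], dst16[2], dst16[3]);
    return 0;
}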
* Re: [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel 2022-07-17 7:54 ` Thilo Borgmann @ 2022-07-18 14:15 ` Thilo Borgmann 2022-07-29 13:11 ` Thilo Borgmann 0 siblings, 1 reply; 10+ messages in thread From: Thilo Borgmann @ 2022-07-18 14:15 UTC (permalink / raw) To: ffmpeg-devel [-- Attachment #1: Type: text/plain, Size: 403 bytes --] Am 17.07.22 um 09:54 schrieb Thilo Borgmann: > Am 16.07.22 um 23:07 schrieb Thilo Borgmann: >> Hi, >> >>> 1/2 adds 16 bit versions of ff_gaussian_blur and ff_sobel. >>> 2/2 adds new mode to cropdetect. >> >> v3 does it the template way for 1/2 as requested on IRC. > > v4 fixed bug in gaussian_blur. Otherwise identical. v5 fixes minor things mentioned on IRC and another bug found on the way. -Thilo [-- Attachment #2: v5-0001-lavfi-edge_common-Templatify-ff_gaussian_blur-and.patch --] [-- Type: text/plain, Size: 13315 bytes --] From 74ed982d46acb980d97ec8ba969036504fdbe777 Mon Sep 17 00:00:00 2001 From: Thilo Borgmann <thilo.borgmann@mail.de> Date: Mon, 18 Jul 2022 16:09:46 +0200 Subject: [PATCH v5 1/2] lavfi/edge_common: Templatify ff_gaussian_blur and ff_sobel --- libavfilter/edge_common.c | 74 ++-------------------- libavfilter/edge_common.h | 22 ++++--- libavfilter/edge_template.c | 118 ++++++++++++++++++++++++++++++++++++ libavfilter/vf_blurdetect.c | 8 +-- libavfilter/vf_edgedetect.c | 14 ++--- 5 files changed, 150 insertions(+), 86 deletions(-) create mode 100644 libavfilter/edge_template.c diff --git a/libavfilter/edge_common.c b/libavfilter/edge_common.c index d72e8521cd..ebd47d7c53 100644 --- a/libavfilter/edge_common.c +++ b/libavfilter/edge_common.c @@ -46,33 +46,13 @@ static int get_rounded_direction(int gx, int gy) return DIRECTION_VERTICAL; } -// Simple sobel operator to get rounded gradients -void ff_sobel(int w, int h, - uint16_t *dst, int dst_linesize, - int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize) -{ - int i, j; - - for (j = 1; j < h - 1; j++) { - dst += dst_linesize; - dir += dir_linesize; - src += src_linesize; - for (i = 1; i < w - 1; i++) { - const int gx = - -1*src[-src_linesize + i-1] + 1*src[-src_linesize + i+1] - -2*src[ i-1] + 2*src[ i+1] - -1*src[ src_linesize + i-1] + 1*src[ src_linesize + i+1]; - const int gy = - -1*src[-src_linesize + i-1] + 1*src[ src_linesize + i-1] - -2*src[-src_linesize + i ] + 2*src[ src_linesize + i ] - -1*src[-src_linesize + i+1] + 1*src[ src_linesize + i+1]; +#undef DEPTH +#define DEPTH 8 +#include "edge_template.c" - dst[i] = FFABS(gx) + FFABS(gy); - dir[i] = get_rounded_direction(gx, gy); - } - } -} +#undef DEPTH +#define DEPTH 16 +#include "edge_template.c" // Filters rounded gradients to drop all non-maxima // Expects gradients generated by ff_sobel() @@ -137,45 +117,3 @@ void ff_double_threshold(int low, int high, int w, int h, src += src_linesize; } } - -// Applies gaussian blur, using 5x5 kernels, sigma = 1.4 -void ff_gaussian_blur(int w, int h, - uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize) -{ - int i, j; - - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - for (j = 2; j < h - 2; j++) { - dst[0] = src[0]; - dst[1] = src[1]; - for (i = 2; i < w - 2; i++) { - /* Gaussian mask of size 5x5 with sigma = 1.4 */ - dst[i] = ((src[-2*src_linesize + i-2] + src[2*src_linesize + i-2]) * 2 - + (src[-2*src_linesize + i-1] + src[2*src_linesize + i-1]) * 4 - + (src[-2*src_linesize + i ] + src[2*src_linesize + i ]) * 5 - + 
(src[-2*src_linesize + i+1] + src[2*src_linesize + i+1]) * 4 - + (src[-2*src_linesize + i+2] + src[2*src_linesize + i+2]) * 2 - - + (src[ -src_linesize + i-2] + src[ src_linesize + i-2]) * 4 - + (src[ -src_linesize + i-1] + src[ src_linesize + i-1]) * 9 - + (src[ -src_linesize + i ] + src[ src_linesize + i ]) * 12 - + (src[ -src_linesize + i+1] + src[ src_linesize + i+1]) * 9 - + (src[ -src_linesize + i+2] + src[ src_linesize + i+2]) * 4 - - + src[i-2] * 5 - + src[i-1] * 12 - + src[i ] * 15 - + src[i+1] * 12 - + src[i+2] * 5) / 159; - } - dst[i ] = src[i ]; - dst[i + 1] = src[i + 1]; - - dst += dst_linesize; - src += src_linesize; - } - memcpy(dst, src, w); dst += dst_linesize; src += src_linesize; - memcpy(dst, src, w); -} diff --git a/libavfilter/edge_common.h b/libavfilter/edge_common.h index 87c143f2b8..cff4febd70 100644 --- a/libavfilter/edge_common.h +++ b/libavfilter/edge_common.h @@ -48,10 +48,14 @@ enum AVRoundedDirection { * @param src data pointers to source image * @param src_linesize linesizes for the source image */ -void ff_sobel(int w, int h, - uint16_t *dst, int dst_linesize, - int8_t *dir, int dir_linesize, - const uint8_t *src, int src_linesize); +#define PROTO_SOBEL(depth) \ +void ff_sobel_##depth(int w, int h, \ + uint16_t *dst, int dst_linesize, \ + int8_t *dir, int dir_linesize, \ + const uint8_t *src, int src_linesize, int src_stride); + +PROTO_SOBEL(8) +PROTO_SOBEL(16) /** * Filters rounded gradients to drop all non-maxima pixels in the magnitude image @@ -100,8 +104,12 @@ void ff_double_threshold(int low, int high, int w, int h, * @param src data pointers to source image * @param src_linesize linesizes for the source image */ -void ff_gaussian_blur(int w, int h, - uint8_t *dst, int dst_linesize, - const uint8_t *src, int src_linesize); +#define PROTO_GAUSSIAN_BLUR(depth) \ +void ff_gaussian_blur_##depth(int w, int h, \ + uint8_t *dst, int dst_linesize, \ + const uint8_t *src, int src_linesize, int src_stride); + +PROTO_GAUSSIAN_BLUR(8) +PROTO_GAUSSIAN_BLUR(16) #endif diff --git a/libavfilter/edge_template.c b/libavfilter/edge_template.c new file mode 100644 index 0000000000..af33c178af --- /dev/null +++ b/libavfilter/edge_template.c @@ -0,0 +1,118 @@ +/* + * Copyright (c) 2022 Thilo Borgmann <thilo.borgmann _at_ mail.de> + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + * Redistribution and use in source and binary forms, with or without modification, + * are permitted provided that the following conditions are met: + */ + +#include "libavutil/avassert.h" +#include "avfilter.h" +#include "formats.h" +#include "internal.h" +#include "video.h" + +#undef pixel +#if DEPTH == 8 +#define pixel uint8_t +#else +#define pixel uint16_t +#endif + +#undef fn +#undef fn2 +#undef fn3 +#define fn3(a,b) ff_##a##_##b +#define fn2(a,b) fn3(a,b) +#define fn(a) fn2(a, DEPTH) + +void fn(sobel)(int w, int h, + uint16_t *dst, int dst_linesize, + int8_t *dir, int dir_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + pixel *srcp = (pixel *)src; + + src_stride /= sizeof(pixel); + src_linesize /= sizeof(pixel); + dst_linesize /= sizeof(pixel); + + for (int j = 1; j < h - 1; j++) { + dst += dst_linesize; + dir += dir_linesize; + srcp += src_linesize; + for (int i = 1; i < w - 1; i++) { + const int gx = + -1*srcp[-src_linesize + (i-1)*src_stride] + 1*srcp[-src_linesize + (i+1)*src_stride] + -2*srcp[ (i-1)*src_stride] + 2*srcp[ (i+1)*src_stride] + -1*srcp[ src_linesize + (i-1)*src_stride] + 1*srcp[ src_linesize + (i+1)*src_stride]; + const int gy = + -1*srcp[-src_linesize + (i-1)*src_stride] + 1*srcp[ src_linesize + (i-1)*src_stride] + -2*srcp[-src_linesize + (i )*src_stride] + 2*srcp[ src_linesize + (i )*src_stride] + -1*srcp[-src_linesize + (i+1)*src_stride] + 1*srcp[ src_linesize + (i+1)*src_stride]; + + dst[i] = FFABS(gx) + FFABS(gy); + dir[i] = get_rounded_direction(gx, gy); + } + } +} + +void fn(gaussian_blur)(int w, int h, + uint8_t *dst, int dst_linesize, + const uint8_t *src, int src_linesize, int src_stride) +{ + pixel *srcp = (pixel *)src; + pixel *dstp = (pixel *)dst; + + src_stride /= sizeof(pixel); + src_linesize /= sizeof(pixel); + dst_linesize /= sizeof(pixel); + + memcpy(dstp, srcp, w*sizeof(pixel)); dstp += dst_linesize; srcp += src_linesize; + memcpy(dstp, srcp, w*sizeof(pixel)); dstp += dst_linesize; srcp += src_linesize; + for (int j = 2; j < h - 2; j++) { + dstp[0] = srcp[(0)*src_stride]; + dstp[1] = srcp[(1)*src_stride]; + for (int i = 2; i < w - 2; i++) { + /* Gaussian mask of size 5x5 with sigma = 1.4 */ + dstp[i] = ((srcp[-2*src_linesize + (i-2)*src_stride] + srcp[2*src_linesize + (i-2)*src_stride]) * 2 + + (srcp[-2*src_linesize + (i-1)*src_stride] + srcp[2*src_linesize + (i-1)*src_stride]) * 4 + + (srcp[-2*src_linesize + (i )*src_stride] + srcp[2*src_linesize + (i )*src_stride]) * 5 + + (srcp[-2*src_linesize + (i+1)*src_stride] + srcp[2*src_linesize + (i+1)*src_stride]) * 4 + + (srcp[-2*src_linesize + (i+2)*src_stride] + srcp[2*src_linesize + (i+2)*src_stride]) * 2 + + + (srcp[ -src_linesize + (i-2)*src_stride] + srcp[ src_linesize + (i-2)*src_stride]) * 4 + + (srcp[ -src_linesize + (i-1)*src_stride] + srcp[ src_linesize + (i-1)*src_stride]) * 9 + + (srcp[ -src_linesize + (i )*src_stride] + srcp[ src_linesize + (i )*src_stride]) * 12 + + (srcp[ -src_linesize + (i+1)*src_stride] + srcp[ src_linesize + (i+1)*src_stride]) * 9 + + (srcp[ -src_linesize + (i+2)*src_stride] + srcp[ src_linesize + (i+2)*src_stride]) * 4 + + + srcp[(i-2)*src_stride] * 5 + + srcp[(i-1)*src_stride] * 12 + + srcp[(i )*src_stride] * 15 + + srcp[(i+1)*src_stride] * 12 + + srcp[(i+2)*src_stride] * 5) / 159; + } + dstp[w - 2] = srcp[(w - 
2)*src_stride]; + dstp[w - 1] = srcp[(w - 1)*src_stride]; + + dstp += dst_linesize; + srcp += src_linesize; + } + memcpy(dstp, srcp, w*sizeof(pixel)); dstp += dst_linesize; srcp += src_linesize; + memcpy(dstp, srcp, w*sizeof(pixel)); +} diff --git a/libavfilter/vf_blurdetect.c b/libavfilter/vf_blurdetect.c index 0e08ba96de..db06efcce7 100644 --- a/libavfilter/vf_blurdetect.c +++ b/libavfilter/vf_blurdetect.c @@ -283,12 +283,12 @@ static int blurdetect_filter_frame(AVFilterLink *inlink, AVFrame *in) nplanes++; // gaussian filter to reduce noise - ff_gaussian_blur(w, h, - filterbuf, w, - in->data[plane], in->linesize[plane]); + ff_gaussian_blur_8(w, h, + filterbuf, w, + in->data[plane], in->linesize[plane], 1); // compute the 16-bits gradients and directions for the next step - ff_sobel(w, h, gradients, w, directions, w, filterbuf, w); + ff_sobel_8(w, h, gradients, w, directions, w, filterbuf, w, 1); // non_maximum_suppression() will actually keep & clip what's necessary and // ignore the rest, so we need a clean output buffer diff --git a/libavfilter/vf_edgedetect.c b/libavfilter/vf_edgedetect.c index 90390ceb3e..603f06f141 100644 --- a/libavfilter/vf_edgedetect.c +++ b/libavfilter/vf_edgedetect.c @@ -191,15 +191,15 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) } /* gaussian filter to reduce noise */ - ff_gaussian_blur(width, height, - tmpbuf, width, - in->data[p], in->linesize[p]); + ff_gaussian_blur_8(width, height, + tmpbuf, width, + in->data[p], in->linesize[p], 1); /* compute the 16-bits gradients and directions for the next step */ - ff_sobel(width, height, - gradients, width, - directions,width, - tmpbuf, width); + ff_sobel_8(width, height, + gradients, width, + directions,width, + tmpbuf, width, 1); /* non_maximum_suppression() will actually keep & clip what's necessary and * ignore the rest, so we need a clean output buffer */ -- 2.20.1 (Apple Git-117) [-- Attachment #3: Type: text/plain, Size: 251 bytes --] _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread
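
As a side note on the template mechanics: the fn()/fn2()/fn3() triple in edge_template.c is the usual token-pasting indirection -- the extra fn2() level forces DEPTH to expand to 8 or 16 before ## glues it onto the function name. A minimal standalone illustration follows (plain C, no FFmpeg dependencies; the demo_ names are made up, and both expansions are written out by hand instead of re-including a template file):

#include <stdint.h>
#include <stdio.h>

#define fn3(a, b) demo_##a##_##b
#define fn2(a, b) fn3(a, b)          /* expands DEPTH before pasting */
#define fn(a)     fn2(a, DEPTH)

#define DEPTH 8
#define pixel uint8_t
static int fn(maxval)(const uint8_t *src, int n)  /* -> demo_maxval_8 */
{
    const pixel *s = (const pixel *)src;
    int m = 0;
    for (int i = 0; i < n; i++)
        if (s[i] > m)
            m = s[i];
    return m;
}
#undef DEPTH
#undef pixel

#define DEPTH 16
#define pixel uint16_t
static int fn(maxval)(const uint8_t *src, int n)  /* -> demo_maxval_16 */
{
    const pixel *s = (const pixel *)src;
    int m = 0;
    for (int i = 0; i < n; i++)
        if (s[i] > m)
            m = s[i];
    return m;
}
#undef DEPTH
#undef pixel

int main(void)
{
    uint8_t  a8[3]  = { 1, 200, 7 };
    uint16_t a16[3] = { 1, 60000, 7 };
    printf("%d %d\n", demo_maxval_8(a8, 3),
                      demo_maxval_16((const uint8_t *)a16, 3)); /* 200 60000 */
    return 0;
}

Compiling and running this prints "200 60000"; the same indirection is what lets the single body in edge_template.c come out as ff_sobel_8/ff_sobel_16 and ff_gaussian_blur_8/ff_gaussian_blur_16.
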
* Re: [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel 2022-07-18 14:15 ` Thilo Borgmann @ 2022-07-29 13:11 ` Thilo Borgmann 2022-07-30 11:22 ` Thilo Borgmann 0 siblings, 1 reply; 10+ messages in thread From: Thilo Borgmann @ 2022-07-29 13:11 UTC (permalink / raw) To: ffmpeg-devel Hi, >>>> 1/2 adds 16 bit versions of ff_gaussian_blur and ff_sobel. >>>> 2/2 adds new mode to cropdetect. >>> >>> v3 does it the template way for 1/2 as requested on IRC. >> >> v4 fixed bug in gaussian_blur. Otherwise identical. > > v5 fixes minor things mentioned on IRC and another bug found on the way. Will push the v5 patchset soon if there are no more comments. -Thilo _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [PATCH v2 1/2] lavfi/edge_common: Add 16bit versions of gaussian_blur and sobel 2022-07-29 13:11 ` Thilo Borgmann @ 2022-07-30 11:22 ` Thilo Borgmann 0 siblings, 0 replies; 10+ messages in thread From: Thilo Borgmann @ 2022-07-30 11:22 UTC (permalink / raw) To: ffmpeg-devel On 29.07.22 at 15:11, Thilo Borgmann wrote: > Hi, > >>>>> 1/2 adds 16 bit versions of ff_gaussian_blur and ff_sobel. >>>>> 2/2 adds new mode to cropdetect. >>>> >>>> v3 does it the template way for 1/2 as requested on IRC. >>> >>> v4 fixed bug in gaussian_blur. Otherwise identical. >> >> v5 fixes minor things mentioned on IRC and another bug found on the way. > > Will push the v5 patchset soon if there are no more comments. Patchset OK'd on IRC and pushed. Thanks! -Thilo _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". ^ permalink raw reply [flat|nested] 10+ messages in thread