* [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
@ 2023-04-06 10:36 wongwwz
2023-04-08 21:31 ` Jean-Baptiste Kempf
0 siblings, 1 reply; 7+ messages in thread
From: wongwwz @ 2023-04-06 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: wenzhe.wang
From: "wenzhe.wang" <wongwwz@foxmail.com>
PaddlePaddle (PArallel Distributed Deep LEarning) is a simple, efficient and extensible deep learning framework that accelerates the path from research prototyping to production deployment and provides many industrial AI solutions. Official website: https://www.paddlepaddle.org.cn/en. We use its inference library, Paddle Inference, to provide high-performance inference capability. To build FFmpeg with Paddle Inference, take the following steps as a reference:
1. download the Paddle C library from https://www.paddlepaddle.org.cn/inference/v2.4/guides/install/download_lib.html#id1 and select the options that match your environment.
2. extract the archive into a directory of your own, e.g. tar xzf paddle_inference_c.tgz -C your_dir
3. add paddle_root/paddle/include to $C_INCLUDE_PATH and paddle_root/paddle/lib to $LD_LIBRARY_PATH
4. configure FFmpeg with ../configure --enable-libpaddle --extra-cflags="-I/paddle_root/paddle/include/ /paddle_root/third_party/install/paddle2onnx/lib/libpaddle2onnx.so /paddle_root/third_party/install/onnxruntime/lib/libonnxruntime.so /paddle_root/third_party/install/mklml/lib/libiomp5.so /paddle_root/third_party/install/mkldnn/lib/libdnnl.so.2 /paddle_root/paddle/lib/libpaddle_inference_c.so"
5. make
To run FFmpeg DNN inference with paddle backend: ./ffmpeg -i /input.jpg -vf dnn_processing=dnn_backend=3:model=paddle_model/model:input=input_name:output=output_name:options="input_layout=NCHW" -y output.jpg
The Paddle model dir must contain model.pdmodel and model.pdiparams files.
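For reviewers who have not used the Paddle Inference C API before, here is a rough standalone sketch (not part of the patch) of the call sequence the new backend wraps. The model file names, the fixed 1x3x224x224 NCHW input shape and the use of the first input/output name are illustrative assumptions only, and error handling is reduced to early returns:

    /* Standalone sketch of the Paddle Inference C API flow (illustration only,
     * not part of the patch). Compile roughly as:
     *   gcc sketch.c -lpaddle_inference_c
     * with the include/library paths set up as in steps 3 and 4 above. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <paddle/pd_inference_api.h>

    int main(void)
    {
        /* point the config at the <name>.pdmodel / <name>.pdiparams pair */
        PD_Config *config = PD_ConfigCreate();
        PD_ConfigSetModel(config, "model.pdmodel", "model.pdiparams");
        if (!PD_ConfigIsValid(config))
            return 1;
        PD_Predictor *predictor = PD_PredictorCreate(config);

        /* fill the first input tensor (assumed to take NCHW float data) */
        PD_OneDimArrayCstr *in_names = PD_PredictorGetInputNames(predictor);
        PD_Tensor *in = PD_PredictorGetInputHandle(predictor, in_names->data[0]);
        int32_t in_shape[4] = {1, 3, 224, 224};
        float *in_data = calloc(1 * 3 * 224 * 224, sizeof(float));
        PD_TensorReshape(in, 4, in_shape);
        PD_TensorCopyFromCpuFloat(in, in_data);

        /* run inference */
        if (!PD_PredictorRun(predictor))
            return 1;

        /* copy the first output back to host memory */
        PD_OneDimArrayCstr *out_names = PD_PredictorGetOutputNames(predictor);
        PD_Tensor *out = PD_PredictorGetOutputHandle(predictor, out_names->data[0]);
        size_t nb_elems = 1;
        /* the shape array is queried inline for brevity, as the patch does */
        for (size_t i = 0; i < PD_TensorGetShape(out)->size; i++)
            nb_elems *= PD_TensorGetShape(out)->data[i];
        float *out_data = malloc(nb_elems * sizeof(float));
        PD_TensorCopyToCpuFloat(out, out_data);
        printf("first output element: %f\n", out_data[0]);

        /* cleanup (failure paths are left out for brevity) */
        free(in_data);
        free(out_data);
        PD_TensorDestroy(in);
        PD_TensorDestroy(out);
        PD_OneDimArrayCstrDestroy(in_names);
        PD_OneDimArrayCstrDestroy(out_names);
        PD_PredictorDestroy(predictor);
        return 0;
    }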
---
configure | 6 +-
libavfilter/dnn/Makefile | 1 +
libavfilter/dnn/dnn_backend_common.c | 2 +-
libavfilter/dnn/dnn_backend_pd.c | 840 +++++++++++++++++++++++++++
libavfilter/dnn/dnn_backend_pd.h | 38 ++
libavfilter/dnn/dnn_interface.c | 15 +-
libavfilter/dnn/dnn_io_proc.c | 8 +
libavfilter/dnn_interface.h | 2 +-
libavfilter/vf_dnn_detect.c | 82 +++
libavfilter/vf_dnn_processing.c | 3 +
10 files changed, 993 insertions(+), 4 deletions(-)
create mode 100644 libavfilter/dnn/dnn_backend_pd.c
create mode 100644 libavfilter/dnn/dnn_backend_pd.h
diff --git a/configure b/configure
index 6e363eb470..41cc8f99e2 100755
--- a/configure
+++ b/configure
@@ -276,6 +276,8 @@ External library support:
--enable-libsvtav1 enable AV1 encoding via SVT [no]
--enable-libtensorflow enable TensorFlow as a DNN module backend
for DNN based filters like sr [no]
+ --enable-libpaddle enable PaddlePaddle as a DNN module backend
+ for DNN based filters like sr [no]
--enable-libtesseract enable Tesseract, needed for ocr filter [no]
--enable-libtheora enable Theora encoding via libtheora [no]
--enable-libtls enable LibreSSL (via libtls), needed for https support
@@ -1855,6 +1857,7 @@ EXTERNAL_LIBRARY_LIST="
libssh
libsvtav1
libtensorflow
+ libpaddle
libtesseract
libtheora
libtwolame
@@ -2717,7 +2720,7 @@ dct_select="rdft"
deflate_wrapper_deps="zlib"
dirac_parse_select="golomb"
dovi_rpu_select="golomb"
-dnn_suggest="libtensorflow libopenvino"
+dnn_suggest="libtensorflow libopenvino libpaddle"
dnn_deps="avformat swscale"
error_resilience_select="me_cmp"
faandct_deps="faan"
@@ -6695,6 +6698,7 @@ enabled libspeex && require_pkg_config libspeex speex speex/speex.h spe
enabled libsrt && require_pkg_config libsrt "srt >= 1.3.0" srt/srt.h srt_socket
enabled libsvtav1 && require_pkg_config libsvtav1 "SvtAv1Enc >= 0.9.0" EbSvtAv1Enc.h svt_av1_enc_init_handle
enabled libtensorflow && require libtensorflow tensorflow/c/c_api.h TF_Version -ltensorflow
+enabled libpaddle && require libpaddle pd_inference_api.h PD_GetVersion $CFLAGS
enabled libtesseract && require_pkg_config libtesseract tesseract tesseract/capi.h TessBaseAPICreate
enabled libtheora && require libtheora theora/theoraenc.h th_info_init -ltheoraenc -ltheoradec -logg
enabled libtls && require_pkg_config libtls libtls tls.h tls_configure
diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
index 4cfbce0efc..b9fc8e4a30 100644
--- a/libavfilter/dnn/Makefile
+++ b/libavfilter/dnn/Makefile
@@ -16,5 +16,6 @@ OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_mat
DNN-OBJS-$(CONFIG_LIBTENSORFLOW) += dnn/dnn_backend_tf.o
DNN-OBJS-$(CONFIG_LIBOPENVINO) += dnn/dnn_backend_openvino.o
+DNN-OBJS-$(CONFIG_LIBPADDLE) += dnn/dnn_backend_pd.o
OBJS-$(CONFIG_DNN) += $(DNN-OBJS-yes)
diff --git a/libavfilter/dnn/dnn_backend_common.c b/libavfilter/dnn/dnn_backend_common.c
index 91a4a3c4bf..5adeb6bb3b 100644
--- a/libavfilter/dnn/dnn_backend_common.c
+++ b/libavfilter/dnn/dnn_backend_common.c
@@ -43,7 +43,7 @@ int ff_check_exec_params(void *ctx, DNNBackendType backend, DNNFunctionType func
return AVERROR(EINVAL);
}
- if (exec_params->nb_output != 1 && backend != DNN_TF) {
+ if (exec_params->nb_output != 1 && backend != DNN_TF && backend != DNN_PD) {
// currently, the filter does not need multiple outputs,
// so we just pending the support until we really need it.
avpriv_report_missing_feature(ctx, "multiple outputs");
diff --git a/libavfilter/dnn/dnn_backend_pd.c b/libavfilter/dnn/dnn_backend_pd.c
new file mode 100644
index 0000000000..b397b945b1
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_pd.c
@@ -0,0 +1,840 @@
+/*
+ * Copyright (c) 2023
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * DNN paddle backend implementation.
+ */
+
+#include "dnn_backend_pd.h"
+#include "dnn_backend_native.h"
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/cpu.h"
+#include "../internal.h"
+#include "dnn_io_proc.h"
+#include "dnn_backend_common.h"
+#include "safe_queue.h"
+#include <paddle/pd_inference_api.h>
+
+typedef struct PDOptions {
+ float im_height, im_width;
+ float scale_factorH, scale_factorW;
+ char *input_layout;
+ uint8_t async;
+ uint32_t nireq;
+} PDOptions;
+
+typedef struct PDContext {
+ const AVClass *class;
+ PDOptions options;
+} PDContext;
+
+typedef struct PDModel {
+ PDContext ctx;
+ DNNModel *model;
+ PD_Config *config;
+ PD_Predictor *predictor;
+ PD_Bool status;
+ SafeQueue *request_queue;
+ Queue *lltask_queue;
+ Queue *task_queue;
+} PDModel;
+/**
+ * Stores execution parameters for a single
+ * call to the PaddlePaddle Inference C API
+ */
+typedef struct PDInferRequest {
+
+ PD_OneDimArrayCstr *input_names;
+ PD_OneDimArrayCstr *output_names;
+ PD_Tensor **output_tensors;
+ PD_Tensor *input_tensor;
+} PDInferRequest;
+
+typedef struct PDRequestItem {
+ PDInferRequest *infer_request;
+ LastLevelTaskItem *lltask;
+ PD_Bool status;
+ DNNAsyncExecModule exec_module;
+} PDRequestItem;
+
+#define OFFSET(x) offsetof(PDContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM
+static const AVOption dnn_paddle_options[] = {
+ {"im_height", "image shape(H,W)", OFFSET(options.im_height), AV_OPT_TYPE_FLOAT, {.dbl = 320}, 0, 10000,
+ FLAGS},
+ {"im_width", "image shape(H,W)", OFFSET(options.im_width), AV_OPT_TYPE_FLOAT, {.dbl = 320}, 0, 10000,
+ FLAGS},
+ {"scale_factorH", "scalar factor for height", OFFSET(options.scale_factorH), AV_OPT_TYPE_FLOAT, {.dbl = 1.0},
+ 0, 10000,
+ FLAGS},
+ {"scale_factorW", "scalar factor for height", OFFSET(options.scale_factorW), AV_OPT_TYPE_FLOAT, {.dbl = 1.0},
+ 0, 10000,
+ FLAGS},
+ {"input_layout", "NHWC or NCHW", OFFSET(options.input_layout), AV_OPT_TYPE_STRING, {.str = "NCHW"}, 0, 0,
+ FLAGS},
+ DNN_BACKEND_COMMON_OPTIONS
+ {NULL}
+};
+
+AVFILTER_DEFINE_CLASS(dnn_paddle);
+
+static int execute_model_pd(PDRequestItem *request, Queue *lltask_queue);
+
+static void infer_completion_callback(void *args);
+
+static inline void destroy_request_item(PDRequestItem **arg);
+
+
+/**
+ * Free the contents of Paddle inference request.
+ * It does not free the PDInferRequest instance.
+ *
+ * @param request pointer to PDInferRequest instance.
+ * NULL pointer is allowed.
+ */
+static void pd_free_request(PDInferRequest *request) {
+ if (!request)
+ return;
+ if (request->input_tensor) {
+ PD_TensorDestroy(request->input_tensor);
+ request->input_tensor = NULL;
+ }
+ av_freep(&request->input_names);
+ av_freep(&request->output_names);
+ if (request->output_tensors) {
+ int nb_output = sizeof(*request->output_tensors) / sizeof(request->output_tensors[0]);
+ for (uint32_t i = 0; i < nb_output; ++i) {
+ if (request->output_tensors[i]) {
+ PD_TensorDestroy(request->output_tensors[i]);
+ request->output_tensors[i] = NULL;
+ }
+ }
+ av_freep(&request->output_tensors);
+ }
+}
+
+/**
+ * Free the PDRequestItem completely.
+ *
+ * @param arg Address of the PDRequestItem instance.
+ */
+static inline void destroy_request_item(PDRequestItem **arg) {
+ PDRequestItem *request;
+ if (!arg) {
+ return;
+ }
+ request = *arg;
+ pd_free_request(request->infer_request);
+ av_freep(&request->infer_request);
+ av_freep(&request->lltask);
+ ff_dnn_async_module_cleanup(&request->exec_module);
+ av_freep(arg);
+}
+
+/**
+ * Create a Paddle inference request. All properties
+ * are initially unallocated and set as NULL.
+ *
+ * @return pointer to the allocated PDInferRequest instance.
+ */
+static PDInferRequest *pd_create_inference_request(void) {
+ PDInferRequest *infer_request = av_malloc(sizeof(PDInferRequest));
+ if (!infer_request) {
+ return NULL;
+ }
+ infer_request->input_names = NULL;
+ infer_request->output_names = NULL;
+ infer_request->input_tensor = NULL;
+ infer_request->output_tensors = NULL;
+ return infer_request;
+}
+
+static int load_pd_model(PDModel *pd_model, const char *model_filename) {
+
+ PDContext *ctx = &pd_model->ctx;
+ char *model_path = (char *) malloc(strlen(model_filename) + strlen(".pdmodel")+1);
+ char *params_path = (char *) malloc(strlen(model_filename) + strlen(".pdiparams")+1);
+ pd_model->config = PD_ConfigCreate();
+ strcpy(model_path, model_filename);
+ strcat(model_path, ".pdmodel");
+ strcpy(params_path, model_filename);
+ strcat(params_path, ".pdiparams");
+ PD_ConfigSetModel(pd_model->config, model_path, params_path);
+ free(model_path);
+ free(params_path);
+ pd_model->status = PD_ConfigIsValid(pd_model->config);
+ pd_model->predictor = PD_PredictorCreate(pd_model->config);
+ if (!pd_model->status) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to read model \"%s\" graph\n", model_filename);
+ PD_ConfigDestroy(pd_model->config);
+ PD_PredictorDestroy(pd_model->predictor);
+ return DNN_GENERIC_ERROR;
+ }
+ return 0;
+}
+
+static float *transposeNHWC2NCHW(float *data, const int32_t shape[4]) {
+ // the shape layout is NCHW
+ int N = shape[0];
+ int H = shape[2];
+ int W = shape[3];
+ int C = shape[1];
+ float *transposed = calloc(shape[0] * shape[1] * shape[2] * shape[3], sizeof(float));
+ // [N,H,W,C] -> [N,C,H,W]
+ for (int n = 0; n < N; ++n) {
+ for (int c = 0; c < C; ++c) {
+ for (int h = 0; h < H; ++h) {
+ for (int w = 0; w < W; ++w) {
+ int old_index = n * H * W * C + h * W * C + w * C + c;
+ int new_index = n * C * H * W + c * H * W + h * W + w;
+ transposed[new_index] = data[old_index];
+ }
+ }
+ }
+ }
+ memcpy(data, transposed, shape[0] * shape[1] * shape[2] * shape[3] * sizeof(float));
+ free(transposed);
+ return data;
+}
+
+static float *transposeNCHW2NHWC(float *data, const int32_t shape[4]) {
+ // the shape layout is NCHW
+ int N = shape[0];
+ int C = shape[1];
+ int H = shape[2];
+ int W = shape[3];
+ float *transposed = calloc(shape[0] * shape[1] * shape[2] * shape[3], sizeof(float));
+ // [N,C,H,W] -> [N,H,W,C]
+ for (int n = 0; n < N; ++n) {
+ for (int h = 0; h < H; ++h) {
+ for (int w = 0; w < W; ++w) {
+ for (int c = 0; c < C; ++c) {
+ int old_index = n * C * H * W + c * H * W + h * W + w;
+ int new_index = n * H * W * C + h * W * C + w * C + c;
+ transposed[new_index] = data[old_index];
+ }
+ }
+ }
+ }
+ memcpy(data, transposed, shape[0] * shape[1] * shape[2] * shape[3] * sizeof(float));
+ free(transposed);
+ return data;
+}
+
+static int get_name_index(PDModel *pd_model, TaskItem *task) {
+ int name_index = -1;
+ PD_OneDimArrayCstr *pd_input_names = PD_PredictorGetInputNames(pd_model->predictor);
+ for (int i = 0; i < pd_input_names->size; ++i) {
+ if (strcmp(pd_input_names->data[i], task->input_name) == 0) {
+ name_index = i;
+ }
+ }
+ PD_OneDimArrayCstrDestroy(pd_input_names);
+ if (name_index == -1) {
+ av_log(&pd_model->ctx, AV_LOG_ERROR, "Could not find \"%s\" in model\n", task->input_name);
+ return AVERROR(EINVAL);
+ }
+ return name_index;
+}
+
+static int pd_start_inference(void *args) {
+ DNNData input;
+ PDRequestItem *request = args;
+ PDInferRequest *infer_request = request->infer_request;
+ LastLevelTaskItem *lltask = request->lltask;
+ TaskItem *task = lltask->task;
+ PDModel *pd_model = task->model;
+ // get input data nhwc
+ PD_Tensor *input_tensor = infer_request->input_tensor;
+ int32_t input_shape[4] = {1, -1, -1, -1};
+
+ for (int i = 0; i < infer_request->input_names->size; ++i) {
+
+ if (strcmp(infer_request->input_names->data[i], "im_shape") == 0) {
+ PD_Tensor *im_shape_tensor = PD_PredictorGetInputHandle(pd_model->predictor,
+ infer_request->input_names->data[i]);
+ int32_t im_shape_shape[2] = {1, 2};
+ float im_shape_data[2] = {pd_model->ctx.options.im_height, pd_model->ctx.options.im_width};
+ PD_TensorReshape(im_shape_tensor, 2, im_shape_shape);
+ PD_TensorCopyFromCpuFloat(im_shape_tensor, im_shape_data);
+ } else if (strcmp(infer_request->input_names->data[i], "scale_factor") == 0) {
+ PD_Tensor *scale_factor_tensor = PD_PredictorGetInputHandle(pd_model->predictor,
+ infer_request->input_names->data[i]);
+ int32_t scale_factor_shape[2] = {1, 2};
+ float scale_factor_data[2] = {pd_model->ctx.options.scale_factorH, pd_model->ctx.options.scale_factorW};
+ PD_TensorReshape(scale_factor_tensor, 2, scale_factor_shape);
+ PD_TensorCopyFromCpuFloat(scale_factor_tensor, scale_factor_data);
+ }
+ }
+
+ if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+ input_shape[1] = 3;
+ input_shape[2] = task->in_frame->height;
+ input_shape[3] = task->in_frame->width;
+ } else if (strcmp(pd_model->ctx.options.input_layout, "NHWC") == 0) {
+ input_shape[1] = task->in_frame->height;
+ input_shape[2] = task->in_frame->width;
+ input_shape[3] = 3;
+ } else {
+ av_log(&pd_model->ctx, AV_LOG_ERROR, "The input layout should be NCHW or NHWC\n");
+ }
+ float *in_data = (float *) calloc(1 * input_shape[1] * input_shape[2] * input_shape[3], sizeof(float));
+ PD_TensorCopyToCpuFloat(input_tensor, in_data);
+ if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+ in_data = transposeNHWC2NCHW(in_data, input_shape);
+ }
+
+ PD_TensorReshape(input_tensor, 4, input_shape);
+ PD_TensorCopyFromCpuFloat(input_tensor, in_data);
+ free(in_data);
+
+ request->status = PD_PredictorRun(pd_model->predictor);
+
+ if (!request->status) {
+ av_log(&pd_model->ctx, AV_LOG_ERROR, "%s", "paddlepaddle predictor run fail!");
+ pd_free_request(infer_request);
+ if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+ destroy_request_item(&request);
+ }
+ return DNN_GENERIC_ERROR;
+ }
+ return 0;
+}
+
+static void infer_completion_callback(void *args) {
+ PDRequestItem *request = args;
+ LastLevelTaskItem *lltask = request->lltask;
+ TaskItem *task = lltask->task;
+ DNNData *outputs;
+ PDInferRequest *infer_request = request->infer_request;
+ PDModel *pd_model = task->model;
+ PDContext *ctx = &pd_model->ctx;
+
+ outputs = av_malloc_array(task->nb_output, sizeof(*outputs));
+ if (!outputs) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for *outputs\n");
+ goto err;
+ }
+
+ for (uint32_t i = 0; i < task->nb_output; ++i) {
+ const size_t shape_size = PD_TensorGetShape(infer_request->output_tensors[i])->size;
+ int32_t length = 1;
+ PD_DataType out_dt = PD_TensorGetDataType(infer_request->output_tensors[i]);
+ size_t size;
+ float *out_data;
+
+ if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+ outputs[i].height = PD_TensorGetShape(infer_request->output_tensors[i])->data[2];
+ outputs[i].width = PD_TensorGetShape(infer_request->output_tensors[i])->data[3];
+ outputs[i].channels = PD_TensorGetShape(infer_request->output_tensors[i])->data[1];
+ } else {
+ outputs[i].height = PD_TensorGetShape(infer_request->output_tensors[i])->data[1];
+ outputs[i].width = PD_TensorGetShape(infer_request->output_tensors[i])->data[2];
+ outputs[i].channels = PD_TensorGetShape(infer_request->output_tensors[i])->data[3];
+ }
+
+ for (int j = 0; j < shape_size; ++j) {
+ length *= PD_TensorGetShape(infer_request->output_tensors[i])->data[j];
+ }
+
+ if (out_dt != PD_DATA_FLOAT32) {
+ av_log(&pd_model->ctx, AV_LOG_ERROR, "The model output datatype has to be float.\n");
+ goto err;
+ }
+ outputs[i].dt = DNN_FLOAT;
+ size = sizeof(float);
+ out_data = (float *) malloc(length * size);
+ PD_TensorCopyToCpuFloat(infer_request->output_tensors[i], out_data);
+
+ if (shape_size == 4 && (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0)) {
+ int32_t output_shape[4] = {PD_TensorGetShape(infer_request->output_tensors[i])->data[0],
+ PD_TensorGetShape(infer_request->output_tensors[i])->data[1],
+ PD_TensorGetShape(infer_request->output_tensors[i])->data[2],
+ PD_TensorGetShape(infer_request->output_tensors[i])->data[3]};
+ out_data = transposeNCHW2NHWC(out_data, output_shape);
+ }
+
+ outputs[i].order = DCO_BGR;
+ outputs[i].data = out_data;
+ }
+ switch (pd_model->model->func_type) {
+ case DFT_PROCESS_FRAME:
+ // it only supports 1 output if it's frame in & frame out
+ if (task->do_ioproc) {
+ if (pd_model->model->frame_post_proc != NULL) {
+ pd_model->model->frame_post_proc(task->out_frame, outputs, pd_model->model->filter_ctx);
+ } else {
+ ff_proc_from_dnn_to_frame(task->out_frame, outputs, ctx);
+ }
+ } else {
+ task->out_frame->width = outputs[0].width;
+ task->out_frame->height = outputs[0].height;
+ }
+ break;
+ case DFT_ANALYTICS_DETECT:
+ if (!pd_model->model->detect_post_proc) {
+ av_log(ctx, AV_LOG_ERROR, "Detect filter needs provide post proc\n");
+ return;
+ }
+ pd_model->model->detect_post_proc(task->in_frame, outputs, task->nb_output, pd_model->model->filter_ctx);
+ break;
+ default:
+ av_log(ctx, AV_LOG_ERROR, "Paddle Inference backend does not support this kind of dnn filter now\n");
+ goto err;
+ }
+ task->inference_done++;
+ err:
+ pd_free_request(infer_request);
+ av_freep(&outputs);
+ if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+ destroy_request_item(&request);
+ av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
+ }
+}
+
+static int extract_lltask_from_task(TaskItem *task, Queue *lltask_queue) {
+ PDModel *pd_model = task->model;
+ PDContext *ctx = &pd_model->ctx;
+ LastLevelTaskItem *lltask = av_malloc(sizeof(*lltask));
+ if (!lltask) {
+ av_log(ctx, AV_LOG_ERROR, "Unable to allocate space for LastLevelTaskItem\n");
+ return AVERROR(ENOMEM);
+ }
+ task->inference_todo = 1;
+ task->inference_done = 0;
+ lltask->task = task;
+ if (ff_queue_push_back(lltask_queue, lltask) < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to push back lltask_queue.\n");
+ av_freep(&lltask);
+ return AVERROR(ENOMEM);
+ }
+ return 0;
+}
+
+static int get_input_pd(void *model, DNNData *input, const char *input_name) {
+ PDModel *pd_model = model;
+ PDContext *ctx = &pd_model->ctx;
+ int has_name = -1;
+ PD_OneDimArrayCstr *pd_input_names = PD_PredictorGetInputNames(pd_model->predictor);
+ for (int i = 0; i < pd_input_names->size; ++i) {
+ if (strcmp(pd_input_names->data[i], input_name) == 0) {
+ has_name = i;
+ break;
+ }
+ }
+ PD_OneDimArrayCstrDestroy(pd_input_names);
+ if (has_name == -1) {
+ av_log(ctx, AV_LOG_ERROR, "Could not find \"%s\" in model\n", input_name);
+ return AVERROR(EINVAL);
+ }
+ input->dt = DNN_FLOAT;
+ input->order = DCO_RGB;
+ input->height = -1;
+ input->width = -1;
+ input->channels = 3;
+ return 0;
+}
+
+static int get_output_pd(void *model, const char *input_name, int input_width, int input_height,
+ const char *output_name, int *output_width, int *output_height) {
+ int ret = 0;
+ PDModel *pd_model = model;
+ PDContext *ctx = &pd_model->ctx;
+ TaskItem task;
+ PDRequestItem *request;
+ DNNExecBaseParams exec_params = {
+ .input_name = input_name,
+ .output_names = &output_name,
+ .nb_output = 1,
+ .in_frame = NULL,
+ .out_frame = NULL,
+ };
+
+ ret = ff_dnn_fill_gettingoutput_task(&task, &exec_params, pd_model, input_height, input_width, ctx);
+ if (ret != 0) {
+ goto err;
+ }
+
+ ret = extract_lltask_from_task(&task, pd_model->lltask_queue);
+ if (ret != 0) {
+ av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+ goto err;
+ }
+
+ request = ff_safe_queue_pop_front(pd_model->request_queue);
+ if (!request) {
+ av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+ ret = AVERROR(EINVAL);
+ goto err;
+ }
+
+ ret = execute_model_pd(request, pd_model->lltask_queue);
+ *output_width = task.out_frame->width;
+ *output_height = task.out_frame->height;
+
+ err:
+ av_frame_free(&task.out_frame);
+ av_frame_free(&task.in_frame);
+ return ret;
+}
+
+DNNModel *ff_dnn_load_model_pd(const char *model_filename, DNNFunctionType func_type, const char *options,
+ AVFilterContext *filter_ctx) {
+ DNNModel *model = NULL;
+ PDModel *pd_model = NULL;
+ PDRequestItem *item = NULL;
+ PDContext *ctx = NULL;
+
+ model = av_mallocz(sizeof(DNNModel));
+ if (!model) {
+ return NULL;
+ }
+
+ pd_model = av_mallocz(sizeof(PDModel));
+ if (!pd_model) {
+ av_freep(&model);
+ return NULL;
+ }
+ pd_model->model = model;
+ ctx = &pd_model->ctx;
+ ctx->class = &dnn_paddle_class;
+
+ //parse options
+ av_opt_set_defaults(ctx);
+ if (av_opt_set_from_string(ctx, options, NULL, "=", "&") < 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to parse options \"%s\"\n", options);
+ goto err;
+ }
+
+ if (load_pd_model(pd_model, model_filename) != 0) {
+ goto err;
+ }
+
+ if (ctx->options.nireq <= 0) {
+ ctx->options.nireq = av_cpu_count() / 2 + 1;
+ }
+
+#if !HAVE_PTHREAD_CANCEL
+ if (ctx->options.async) {
+ ctx->options.async = 0;
+ av_log(filter_ctx, AV_LOG_WARNING, "pthread is not supported, roll back to sync.\n");
+ }
+#endif
+
+ pd_model->request_queue = ff_safe_queue_create();
+ if (!pd_model->request_queue) {
+ goto err;
+ }
+
+ for (int i = 0; i < ctx->options.nireq; i++) {
+ PDRequestItem *item = av_mallocz(sizeof(*item));
+ if (!item) {
+ goto err;
+ }
+ item->lltask = NULL;
+ item->infer_request = pd_create_inference_request();
+ if (!item->infer_request) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for Paddle inference request\n");
+ av_freep(&item);
+ goto err;
+ }
+ item->exec_module.start_inference = &pd_start_inference;
+ item->exec_module.callback = &infer_completion_callback;
+ item->exec_module.args = item;
+
+ if (ff_safe_queue_push_back(pd_model->request_queue, item) < 0) {
+ destroy_request_item(&item);
+ goto err;
+ }
+ }
+
+ pd_model->lltask_queue = ff_queue_create();
+ if (!pd_model->lltask_queue) {
+ goto err;
+ }
+
+ pd_model->task_queue = ff_queue_create();
+ if (!pd_model->task_queue) {
+ goto err;
+ }
+
+ model->model = pd_model;
+ model->get_input = &get_input_pd;
+ model->get_output = &get_output_pd;
+ model->options = options;
+ model->filter_ctx = filter_ctx;
+ model->func_type = func_type;
+
+ return model;
+ err:
+ ff_dnn_free_model_pd(&model);
+ return NULL;
+}
+
+static int fill_model_input_pd(PDModel *pd_model, PDRequestItem *request) {
+ DNNData input;
+ LastLevelTaskItem *lltask;
+ TaskItem *task;
+ PDInferRequest *infer_request;
+ PDContext *ctx = &pd_model->ctx;
+ int ret = 0;
+ int32_t input_shape[4] = {1, -1, -1, -1};
+
+ lltask = ff_queue_pop_front(pd_model->lltask_queue);
+ av_assert0(lltask);
+ task = lltask->task;
+ request->lltask = lltask;
+
+ ret = get_input_pd(pd_model, &input, task->input_name);
+ if (ret != 0) {
+ goto err;
+ }
+
+ infer_request = request->infer_request;
+ input.height = task->in_frame->height;
+ input.width = task->in_frame->width;
+
+ infer_request->input_names = PD_PredictorGetInputNames(pd_model->predictor);
+ if (!infer_request->input_names) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to get input names from the model\n");
+ ret = AVERROR(ENOMEM);
+ goto err;
+ }
+
+ int name_index = get_name_index(pd_model, task);
+
+ infer_request->input_tensor = PD_PredictorGetInputHandle(pd_model->predictor,
+ infer_request->input_names->data[name_index]);
+ if (!infer_request->input_tensor) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for input tensor\n");
+ ret = AVERROR(ENOMEM);
+ goto err;
+ }
+
+ if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+ input_shape[1] = input.channels;
+ input_shape[2] = input.height;
+ input_shape[3] = input.width;
+ } else if (strcmp(pd_model->ctx.options.input_layout, "NHWC") == 0) {
+ input_shape[1] = input.height;
+ input_shape[2] = input.width;
+ input_shape[3] = input.channels;
+ } else {
+ av_log(ctx, AV_LOG_ERROR, "The input layout should be NCHW or NHWC\n");
+ }
+ float *in_data = (float *) calloc(1 * input_shape[1] * input_shape[2] * input_shape[3], sizeof(float));
+ PD_TensorReshape(infer_request->input_tensor, 4, input_shape);
+ input.data = in_data;
+ PD_TensorCopyFromCpuFloat(infer_request->input_tensor, input.data);
+
+ switch (pd_model->model->func_type) {
+ case DFT_PROCESS_FRAME:
+ if (task->do_ioproc) {
+ if (pd_model->model->frame_pre_proc != NULL) {
+ pd_model->model->frame_pre_proc(task->in_frame, &input, pd_model->model->filter_ctx);
+ } else {
+ ff_proc_from_frame_to_dnn(task->in_frame, &input, ctx);
+ }
+ PD_TensorCopyFromCpuFloat(infer_request->input_tensor, input.data);
+ }
+ break;
+ case DFT_ANALYTICS_DETECT:
+ ff_proc_from_frame_to_dnn(task->in_frame, &input, ctx);
+ PD_TensorCopyFromCpuFloat(infer_request->input_tensor, input.data);
+ break;
+ default:
+ avpriv_report_missing_feature(ctx, "model function type %d", pd_model->model->func_type);
+ break;
+ }
+
+ infer_request->output_names = PD_PredictorGetOutputNames(pd_model->predictor);
+ if (infer_request->output_names == NULL) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to get output names from the model\n");
+ ret = AVERROR(ENOMEM);
+ goto err;
+ }
+
+ infer_request->output_tensors = av_calloc(task->nb_output, sizeof(*infer_request->output_tensors));
+ if (!infer_request->output_tensors) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for output tensor\n");
+ ret = AVERROR(ENOMEM);
+ goto err;
+ }
+
+
+ for (int i = 0; i < task->nb_output; ++i) {
+ infer_request->output_tensors[i] = PD_PredictorGetOutputHandle(pd_model->predictor,
+ infer_request->output_names->data[i]);
+ if (strcmp(infer_request->output_names->data[i], task->output_names[i]) != 0) {
+ av_log(ctx, AV_LOG_ERROR, "Could not find output \"%s\" in model\n", task->output_names[i]);
+ ret = DNN_GENERIC_ERROR;
+ goto err;
+ }
+ }
+ return 0;
+ err:
+ pd_free_request(infer_request);
+ return ret;
+}
+
+static int execute_model_pd(PDRequestItem *request, Queue *lltask_queue) {
+ PDModel *pd_model;
+ PDContext *ctx;
+ LastLevelTaskItem *lltask;
+ TaskItem *task;
+ int ret;
+
+ if (ff_queue_size(lltask_queue) == 0) {
+ destroy_request_item(&request);
+ return 0;
+ }
+
+ lltask = ff_queue_peek_front(lltask_queue);
+ task = lltask->task;
+ pd_model = task->model;
+ ctx = &pd_model->ctx;
+
+ ret = fill_model_input_pd(pd_model, request);
+ if (ret != 0) {
+ goto err;
+ }
+
+ ret = pd_start_inference(request);
+ if (ret != 0) {
+ goto err;
+ }
+ infer_completion_callback(request);
+ return (task->inference_done == task->inference_todo) ? 0 : DNN_GENERIC_ERROR;
+
+ err:
+ pd_free_request(request->infer_request);
+ if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+ destroy_request_item(&request);
+ }
+ return ret;
+}
+
+int ff_dnn_execute_model_pd(const DNNModel *model, DNNExecBaseParams *exec_params) {
+ PDModel *pd_model = model->model;
+ PDContext *ctx = &pd_model->ctx;
+ TaskItem *task;
+ PDRequestItem *request;
+ int ret = 0;
+
+ ret = ff_check_exec_params(ctx, DNN_PD, model->func_type, exec_params);
+ if (ret != 0) {
+ return ret;
+ }
+
+ task = av_malloc(sizeof(*task));
+ if (!task) {
+ av_log(ctx, AV_LOG_ERROR, "unable to alloc memory for task item.\n");
+ return AVERROR(ENOMEM);
+ }
+
+ ret = ff_dnn_fill_task(task, exec_params, pd_model, ctx->options.async, 1);
+ if (ret != 0) {
+ av_freep(&task);
+ return ret;
+ }
+
+ if (ff_queue_push_back(pd_model->task_queue, task) < 0) {
+ av_freep(&task);
+ av_log(ctx, AV_LOG_ERROR, "unable to push back task_queue.\n");
+ return AVERROR(ENOMEM);
+ }
+
+ ret = extract_lltask_from_task(task, pd_model->lltask_queue);
+ if (ret != 0) {
+ av_log(ctx, AV_LOG_ERROR, "unable to extract last level task from task.\n");
+ return ret;
+ }
+
+ request = ff_safe_queue_pop_front(pd_model->request_queue);
+ if (!request) {
+ av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+ return AVERROR(EINVAL);
+ }
+ return execute_model_pd(request, pd_model->lltask_queue);
+}
+
+int ff_dnn_flush_pd(const DNNModel *model) {
+ PDModel *pd_model = model->model;
+ PDContext *ctx = &pd_model->ctx;
+ PDRequestItem *request;
+ int ret;
+
+ if (ff_queue_size(pd_model->lltask_queue) == 0) {
+ // no pending task need to flush
+ return 0;
+ }
+
+ request = ff_safe_queue_pop_front(pd_model->request_queue);
+ if (!request) {
+ av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+ return AVERROR(EINVAL);
+ }
+
+ ret = fill_model_input_pd(pd_model, request);
+ if (ret != 0) {
+ av_log(ctx, AV_LOG_ERROR, "Failed to fill model input.\n");
+ if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+ destroy_request_item(&request);
+ }
+ return ret;
+ }
+ return execute_model_pd(request, pd_model->lltask_queue);
+}
+
+void ff_dnn_free_model_pd(DNNModel **model) {
+ PDModel *pd_model;
+
+ if (*model) {
+ pd_model = (*model)->model;
+ while (ff_safe_queue_size(pd_model->request_queue) != 0) {
+ PDRequestItem *item = ff_safe_queue_pop_front(pd_model->request_queue);
+ destroy_request_item(&item);
+ }
+ ff_safe_queue_destroy(pd_model->request_queue);
+
+ while (ff_queue_size(pd_model->lltask_queue) != 0) {
+ LastLevelTaskItem *item = (LastLevelTaskItem *)ff_queue_pop_front(pd_model->lltask_queue);
+ av_freep(&item);
+ }
+ ff_queue_destroy(pd_model->lltask_queue);
+
+ while (ff_queue_size(pd_model->task_queue) != 0) {
+ TaskItem *item = ff_queue_pop_front(pd_model->task_queue);
+ av_frame_free(&item->in_frame);
+ av_frame_free(&item->out_frame);
+ av_freep(&item);
+ }
+ ff_queue_destroy(pd_model->task_queue);
+ av_freep(&pd_model);
+ av_freep(model);
+ }
+}
+
+DNNAsyncStatusType ff_dnn_get_result_pd(const DNNModel *model, AVFrame **in, AVFrame **out) {
+ PDModel *pd_model = model->model;
+ return ff_dnn_get_result_common(pd_model->task_queue, in, out);
+}
diff --git a/libavfilter/dnn/dnn_backend_pd.h b/libavfilter/dnn/dnn_backend_pd.h
new file mode 100644
index 0000000000..67dd3c986f
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_pd.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2023
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * DNN inference functions interface for paddle backend.
+ */
+
+#ifndef FFMPEG_DNN_BACKEND_PD_H
+#define FFMPEG_DNN_BACKEND_PD_H
+#include "../dnn_interface.h"
+
+DNNModel *ff_dnn_load_model_pd(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx);
+
+int ff_dnn_execute_model_pd(const DNNModel *model, DNNExecBaseParams *exec_params);
+DNNAsyncStatusType ff_dnn_get_result_pd(const DNNModel *model, AVFrame **in, AVFrame **out);
+int ff_dnn_flush_pd(const DNNModel *model);
+
+void ff_dnn_free_model_pd(DNNModel **model);
+
+#endif //FFMPEG_DNN_BACKEND_PD_H
diff --git a/libavfilter/dnn/dnn_interface.c b/libavfilter/dnn/dnn_interface.c
index 554a36b0dc..e8d86bbc3a 100644
--- a/libavfilter/dnn/dnn_interface.c
+++ b/libavfilter/dnn/dnn_interface.c
@@ -27,6 +27,7 @@
#include "dnn_backend_native.h"
#include "dnn_backend_tf.h"
#include "dnn_backend_openvino.h"
+#include "dnn_backend_pd.h"
#include "libavutil/mem.h"
DNNModule *ff_get_dnn_module(DNNBackendType backend_type)
@@ -70,8 +71,20 @@ DNNModule *ff_get_dnn_module(DNNBackendType backend_type)
return NULL;
#endif
break;
+ case DNN_PD:
+ #if (CONFIG_LIBPADDLE == 1)
+ dnn_module->load_model = &ff_dnn_load_model_pd;
+ dnn_module->execute_model = &ff_dnn_execute_model_pd;
+ dnn_module->get_result = &ff_dnn_get_result_pd;
+ dnn_module->flush = &ff_dnn_flush_pd;
+ dnn_module->free_model = &ff_dnn_free_model_pd;
+ #else
+ av_freep(&dnn_module);
+ return NULL;
+ #endif
+ break;
default:
- av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or tensorflow\n");
+ av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or tensorflow or paddlepaddle\n");
av_freep(&dnn_module);
return NULL;
}
diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c
index 7961bf6b95..d7a8904e9c 100644
--- a/libavfilter/dnn/dnn_io_proc.c
+++ b/libavfilter/dnn/dnn_io_proc.c
@@ -184,6 +184,14 @@ static enum AVPixelFormat get_pixel_format(DNNData *data)
av_assert0(!"unsupported data pixel format.\n");
return AV_PIX_FMT_BGR24;
}
+ } else if (data->dt == DNN_FLOAT) {
+ switch (data->order) {
+ case DCO_RGB:
+ return AV_PIX_FMT_BGR24;
+ default:
+ av_assert0(!"unsupported data pixel format.\n");
+ return AV_PIX_FMT_BGR24;
+ }
}
av_assert0(!"unsupported data type.\n");
diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
index ef8d7ae66f..22fe721e4c 100644
--- a/libavfilter/dnn_interface.h
+++ b/libavfilter/dnn_interface.h
@@ -32,7 +32,7 @@
#define DNN_GENERIC_ERROR FFERRTAG('D','N','N','!')
-typedef enum {DNN_NATIVE, DNN_TF, DNN_OV} DNNBackendType;
+typedef enum {DNN_NATIVE, DNN_TF, DNN_OV, DNN_PD} DNNBackendType;
typedef enum {DNN_FLOAT = 1, DNN_UINT8 = 4} DNNDataType;
diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c
index 7e133f6af5..6658dfbfc7 100644
--- a/libavfilter/vf_dnn_detect.c
+++ b/libavfilter/vf_dnn_detect.c
@@ -210,6 +210,79 @@ static int dnn_detect_post_proc_tf(AVFrame *frame, DNNData *output, AVFilterCont
return 0;
}
+static int dnn_detect_post_proc_pd(AVFrame *frame, DNNData *output, AVFilterContext *filter_ctx)
+{
+ DnnDetectContext *ctx = filter_ctx->priv;
+ int proposal_count;
+ float conf_threshold = ctx->confidence;
+ float *box_info;
+ float x0, y0, x1, y1;
+ int nb_bboxes = 0;
+ AVFrameSideData *sd;
+ AVDetectionBBox *bbox;
+ AVDetectionBBoxHeader *header;
+
+ proposal_count = *(int *) (output[1].data);
+ box_info = output[0].data;
+
+ sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
+ if (sd) {
+ av_log(filter_ctx, AV_LOG_ERROR, "already have dnn bounding boxes in side data.\n");
+ return -1;
+ }
+
+ for (int i = 0; i < proposal_count; ++i) {
+ if (box_info[i * 6 + 1] < conf_threshold)
+ continue;
+ nb_bboxes++;
+ }
+
+ if (nb_bboxes == 0) {
+ av_log(filter_ctx, AV_LOG_VERBOSE, "nothing detected in this frame.\n");
+ return 0;
+ }
+
+ header = av_detection_bbox_create_side_data(frame, nb_bboxes);
+ if (!header) {
+ av_log(filter_ctx, AV_LOG_ERROR, "failed to create side data with %d bounding boxes\n", nb_bboxes);
+ return -1;
+ }
+
+ av_strlcpy(header->source, ctx->dnnctx.model_filename, sizeof(header->source));
+
+ for (int i = 0; i < proposal_count; ++i) {
+ if (box_info[i * 6 + 1] < conf_threshold) {
+ continue;
+ }
+ x0 = box_info[i * 6 + 2];
+ y0 = box_info[i * 6 + 3];
+ x1 = box_info[i * 6 + 4];
+ y1 = box_info[i * 6 + 5];
+
+ bbox = av_get_detection_bbox(header, nb_bboxes - 1);
+
+ bbox->x = (int)x0;
+ bbox->w = (int)x1 - bbox->x;
+ bbox->y = (int)y0;
+ bbox->h = (int)y1 - bbox->y;
+
+ bbox->detect_confidence = av_make_q((int) (box_info[i * 6 + 1] * 10000), 10000);
+ bbox->classify_count = 0;
+
+ if (ctx->labels && box_info[i * 6] < ctx->label_count) {
+ av_strlcpy(bbox->detect_label, ctx->labels[(int) box_info[i * 6]], sizeof(bbox->detect_label));
+ } else {
+ snprintf(bbox->detect_label, sizeof(bbox->detect_label), "%d", (int) box_info[i * 6]);
+ }
+
+ nb_bboxes--;
+ if (nb_bboxes == 0) {
+ break;
+ }
+ }
+ return 0;
+}
+
static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AVFilterContext *filter_ctx)
{
DnnDetectContext *ctx = filter_ctx->priv;
@@ -219,6 +292,8 @@ static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AV
return dnn_detect_post_proc_ov(frame, output, filter_ctx);
case DNN_TF:
return dnn_detect_post_proc_tf(frame, output, filter_ctx);
+ case DNN_PD:
+ return dnn_detect_post_proc_pd(frame, output, filter_ctx);
default:
avpriv_report_missing_feature(filter_ctx, "Current dnn backend does not support detect filter\n");
return AVERROR(EINVAL);
@@ -309,6 +384,13 @@ static int check_output_nb(DnnDetectContext *ctx, DNNBackendType backend_type, i
return AVERROR(EINVAL);
}
return 0;
+ case DNN_PD:
+ if (output_nb != 2) {
+ av_log(ctx, AV_LOG_ERROR, "Dnn detect filter with paddle backend needs 2 output only, \
+ but get %d instead\n", output_nb);
+ return AVERROR(EINVAL);
+ }
+ return 0;
default:
avpriv_report_missing_feature(ctx, "Dnn detect filter does not support current backend\n");
return AVERROR(EINVAL);
diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
index 4462915073..260e62c59b 100644
--- a/libavfilter/vf_dnn_processing.c
+++ b/libavfilter/vf_dnn_processing.c
@@ -52,6 +52,9 @@ static const AVOption dnn_processing_options[] = {
#endif
#if (CONFIG_LIBOPENVINO == 1)
{ "openvino", "openvino backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 2 }, 0, 0, FLAGS, "backend" },
+#endif
+#if (CONFIG_LIBPADDLE == 1)
+ { "paddle", "paddle backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 3 }, 0, 0, FLAGS, "backend" },
#endif
DNN_COMMON_OPTIONS
{ NULL }
--
2.25.1
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
2023-04-06 10:36 [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend wongwwz
@ 2023-04-08 21:31 ` Jean-Baptiste Kempf
2023-04-11 15:03 ` WenzheWang
0 siblings, 1 reply; 7+ messages in thread
From: Jean-Baptiste Kempf @ 2023-04-08 21:31 UTC (permalink / raw)
To: ffmpeg-devel
On Thu, 6 Apr 2023, at 12:36, wongwwz@foxmail.com wrote:
> PaddlePaddle (PArallel Distributed Deep LEarning) is a simple,
> efficient and extensible deep learning framework that accelerates the
Please don't add another DNN backend.
--
Jean-Baptiste Kempf - President
+33 672 704 734
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
2023-04-08 21:31 ` Jean-Baptiste Kempf
@ 2023-04-11 15:03 ` WenzheWang
2023-05-10 2:25 ` WenzheWang
0 siblings, 1 reply; 7+ messages in thread
From: WenzheWang @ 2023-04-11 15:03 UTC (permalink / raw)
To: ffmpeg-devel
Could you please briefly explain the reason for not adding any more DNN backends?
Do you have any plan for the maintenance and development of the DNN backends in the future? From my understanding, the DNN module currently has TensorFlow, OpenVINO and native backends, but this cannot meet the needs of users.
Thus, I believe adding other DNN backends will be great for user experience, user growth, and industrial applications. In particular, different backends suit different application environments, and there are emerging inference engines that are faster and stronger, such as PyTorch and Paddle. From a practical point of view, it is not difficult for a deep learning practitioner to learn and use a framework; when choosing one and applying it in practice, people pay more attention to the results (recall and precision) and to easy deployment, that is, high inference efficiency. The main reason why Paddle is relatively mainstream, and why I want to add a Paddle backend, is that it has very high efficiency and performance. Several projects maintained by Paddle, such as PaddleDetection, PaddleSeg, PaddleGAN, PaddleOCR and PaddleCls, provide many good pre-trained models that migrate well to one's own data and perform excellently.
Secondly, in terms of inference efficiency, Paddle supports many platforms and chips. Models trained with the Paddle framework can be deployed directly, and custom device interfaces are open for independent development on one's own hardware.
FFmpeg itself already has very extensive codec support. If FFmpeg could support deploying more inference backends, it would have even wider applications.
In general, I hope that FFmpeg could support the Paddle backend or more. Wherever my code is not mature or proper, I would be grateful if professionals like you could offer suggestions and comments. I will be absolutely honored if I could contribute to this project :)
Best,
Wenzhe Wang
WenzheWang
wongwwz@foxmail.com
------------------ Original ------------------
From: "FFmpeg development discussions and patches" <jb@videolan.org>;
Date: Sun, Apr 9, 2023 05:31 AM
To: "ffmpeg-devel"<ffmpeg-devel@ffmpeg.org>;
Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
On Thu, 6 Apr 2023, at 12:36, wongwwz@foxmail.com wrote:
> PaddlePaddle (PArallel Distributed Deep LEarning) is a simple,
> efficient and extensible deep learning framework that accelerates the
Please don't add another DNN backend.
--
Jean-Baptiste Kempf - President
+33 672 704 734
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
2023-04-11 15:03 ` WenzheWang
@ 2023-05-10 2:25 ` WenzheWang
2023-05-10 4:08 ` "zhilizhao(赵志立)"
0 siblings, 1 reply; 7+ messages in thread
From: WenzheWang @ 2023-05-10 2:25 UTC (permalink / raw)
To: ffmpeg-devel
Dear Madam or Sir,
Hope this email finds you well.
I am writing this email since I recently found that FFmpeg removed the DNN native backend, and I will be really grateful if you could let me know whether there is any new plan for libavfilter/dnn.
I would like to explain to you again the reasons for adding a DNN Paddle backend.
At present, FFmpeg only supports the OpenVINO and TensorFlow backends. Among the current deep learning frameworks, TensorFlow is the most active in development: TensorFlow has 174k stars on GitHub and PyTorch has 66.5k, while OpenVINO has 4.2k and the models that OpenVINO can run are relatively few. In terms of attention on GitHub, there is no doubt that TensorFlow and PyTorch are more promising. Currently, the Paddle framework has reached 20.2k stars on GitHub and is much more widely used and active than frameworks such as MXNet and Caffe.
TensorFlow has a very rich ecosystem. The TensorFlow models library is updated very quickly and has existing examples of deep learning applications for image classification, object detection, image-to-text generation, and generative adversarial network models. Support for the TensorFlow backend is undoubtedly very necessary for the DNN libavfilter module. But the complexity of the TensorFlow API and of its training process is almost prohibitive, making it a love-hate framework.
The PyTorch framework tends to be used for fast academic prototyping, and its performance in industrial applications is not good. For example, when a PyTorch model is made to run on a server, an Android phone or an embedded system, its performance is poor compared with other deep learning frameworks.
PaddlePaddle is an open-source framework from Baidu, which is also used by many people in China. It fits developers' usage habits very well, although the practicality of its API still needs to be strengthened further. However, Paddle is the only deep learning framework I have ever used that does not require configuring any third-party libraries and can be built and used directly after cloning. Besides, Paddle occupies a small amount of memory and is fast. It also serves a considerable number of projects inside Baidu and is very strong in industrial applications. And PaddlePaddle supports multi-machine, multi-card training.
The choice among deep learning frameworks is a personal one, and the reason why most of us chose Paddle is its better support for embedded development and different hardware platforms, and because the community is very active and has contributed industrial improvements and implementations of some advanced models. Especially for GPUs, it supports CUDA and OpenCL, which means we can optimize the model no matter what kind of graphics card is used. In my opinion, more backend support can only improve the DNN libavfilter module.
If there are any new changes in the DNN libavfilter module, I will be very willing to adjust our implementation to the new plan and provide continuous maintenance.
Best Regards,
Wenzhe Wang
WenzheWang
wongwwz@foxmail.com
------------------ Original ------------------
From: "WenzheWang" <wongwwz@foxmail.com>;
Date: Tue, Apr 11, 2023 11:03 PM
To: "ffmpeg-devel"<ffmpeg-devel@ffmpeg.org>;
Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
Could you please briefly introduce the reason why not adding any dnn backend?
Do you have any plan for the maintenance and development of the dnn backend in the future? From my understanding, the current backend of dnn has tensoflow, openvino and native, but this cannot meet the needs of users.
Thus, I believe adding other dnn backends will be great for user experience, user growth, and industrial applications. In particular, various dnn backend can be adapted to different application environments, and there are some emerging inference engines that are faster and stronger, such as Pytorch and Paddle. In addition, from the practical point of view, it is not difficult for a deep learning practitioner to learn and use this framework, but how to choose a framework and apply it in practice, people pay more attention to the effect (recall and precision), and easy deployment, that is, high reasoning performance efficiency. The main reason why Paddle is relatively mainstream and why I want to add paddle backend is that it has a very high efficiency and performance. There are several projects maintained by Paddle, such as paddleDetection, paddleSeg, paddleGAN, paddleOCR and paddleCls have a lot of good pre-training models that migrate well to their own data and has excellent perform
ance. Secondly, in terms of reasoning efficiency, Paddle supports many platforms and chips. Models trained using Paddle framework can be directly deployed, and custom device interfaces are open for independent development based on one's own hardware.
FFmpeg itself already has very extensive support for codec. If FFmpeg could support the deployment of more reasoning model backend, it would have a wider application.
In general, I hope that ffmpeg could support the backend of paddle or more. In any case that my code is not mature or proper, I would be grateful if professionals like you could offer me suggestions and comments. I will be absolutely honored if I could contribute to this project :)
Best,
Wenzhe Wang
WenzheWang
wongwwz@foxmail.com
------------------ Original ------------------
From: "FFmpeg development discussions and patches" <jb@videolan.org>;
Date: Sun, Apr 9, 2023 05:31 AM
To: "ffmpeg-devel"<ffmpeg-devel@ffmpeg.org>;
Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
On Thu, 6 Apr 2023, at 12:36, wongwwz@foxmail.com wrote:
> PaddlePaddle (PArallel Distributed Deep LEarning) is a simple,
> efficient and extensible deep learning framework that accelerates the
Please don't add another DNN backend.
--
Jean-Baptiste Kempf - President
+33 672 704 734
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
2023-05-10 2:25 ` WenzheWang
@ 2023-05-10 4:08 ` "zhilizhao(赵志立)"
2023-05-10 11:30 ` Guo, Yejun
0 siblings, 1 reply; 7+ messages in thread
From: "zhilizhao(赵志立)" @ 2023-05-10 4:08 UTC (permalink / raw)
To: FFmpeg development discussions and patches
> On May 10, 2023, at 10:25, WenzheWang <wongwwz@foxmail.com> wrote:
>
> Dear Madam or Sir,
>
>
> Hope this email finds you well.
>
>
> I am writing this email since i recently found FFmepg remove DNN native backend, and i will be really grateful if you let me know if there is any new plan on libavfilter/dnn.
>
>
> I would like to explain to you again about the addition of dnn paddle backend.
>
> At present, ffmpeg only supports openvino and tensorflow backend. Among the current deep learning frameworks, TensorFlow is the most active in development. TensorFlow has 174k stars and pytorch has 66.5k. openvino is 4.2k, and the models that openvino can implement are relatively few. But in terms of attention on GitHub, there's no doubt that TensorFlow and pytorch are more promising. Currently, the paddle framework has reached 20.2k stars on github, which is much more widely used and active than frameworks such as mxnet and caffe.
Stars don't matter much here.
Just for reference, there is a thread before:
https://patchwork.ffmpeg.org/project/ffmpeg/patch/20220523092918.9548-2-ting.fu@intel.com/
>
> Tensoflow has a very rich ecosystem. The TensorFlow models library updates very quickly and has existing examples of deep learning applications for image classification, object detection, image generation text, and generation of adversus-network models. The dnn libavfilter module is undoubtedly very necessary for tensorflow backend to support. But the complexity of the TensorFlow API and the complexity of the training are almost prohibitive, making it a love-hate framework.
>
> PyTorch framework tends to be applied to academic fast implementation, and its industrial application performance is not good. For example, Pytorch framework makes a model to run on a server, Android phone or embedded system, and its performance is poor compared with other deep learning frameworks.
>
>
> PaddlePadddle is an open source framework of Baidu, which is also used by many people in China. It is very consistent with the usage habits of developers, but the practicability of the API still needs to be further strengthened. However, Paddle is the only deep learning framework I have ever used, which does not configure any third-party libraries and can be used directly by cloning make. Besides, Paddle occupies a small amount of memory and is fast. It also serves a considerable number of projects inside Baidu, which is very strong in industrial application. And PaddlePaddle supports multiple machine and multiple card training.
>
>
> Users' choice of different deep learning frameworks is a personal choice, and the reason why most of us chose paddle is because of its better support for embedded development and different hardware platforms and because the community is very active and has proposed industrial improvements and implementations for some advanced models. Especially for the GPU, it supports cuda and opencl, which means we can optimize the model no matter what kind of graphics card is used. In my opinion, more backend support can better improve dnn libavfilter modules.
>
> If there are any new changes in dnn libavfilter module, I will be very willing to adjust our implementation with the new planning and provide continuous maintenance.
>
>
>
>
> Best Regards,
> Wenzhe Wang
>
>
>
>
>
>
> WenzheWang
> wongwwz@foxmail.com
>
>
>
>
>
>
>
> ------------------ Original ------------------
> From: "WenzheWang" <wongwwz@foxmail.com>;
> Date: Tue, Apr 11, 2023 11:03 PM
> To: "ffmpeg-devel"<ffmpeg-devel@ffmpeg.org>;
>
> Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
>
>
>
>
> Could you please briefly introduce the reason why not adding any dnn backend?
>
>
>
>
> Do you have any plan for the maintenance and development of the dnn backend in the future? From my understanding, the current backend of dnn has tensoflow, openvino and native, but this cannot meet the needs of users.
>
>
>
>
> Thus, I believe adding other dnn backends will be great for user experience, user growth, and industrial applications. In particular, various dnn backend can be adapted to different application environments, and there are some emerging inference engines that are faster and stronger, such as Pytorch and Paddle. In addition, from the practical point of view, it is not difficult for a deep learning practitioner to learn and use this framework, but how to choose a framework and apply it in practice, people pay more attention to the effect (recall and precision), and easy deployment, that is, high reasoning performance efficiency. The main reason why Paddle is relatively mainstream and why I want to add paddle backend is that it has a very high efficiency and performance. There are several projects maintained by Paddle, such as paddleDetection, paddleSeg, paddleGAN, paddleOCR and paddleCls have a lot of good pre-training models that migrate well to their own data and has excellent perfo
rm
> ance. Secondly, in terms of reasoning efficiency, Paddle supports many platforms and chips. Models trained using Paddle framework can be directly deployed, and custom device interfaces are open for independent development based on one's own hardware.
>
> FFmpeg itself already has very extensive support for codec. If FFmpeg could support the deployment of more reasoning model backend, it would have a wider application.
>
>
>
>
> In general, I hope that ffmpeg could support the backend of paddle or more. In any case that my code is not mature or proper, I would be grateful if professionals like you could offer me suggestions and comments. I will be absolutely honored if I could contribute to this project :)
>
>
>
>
> Best,
>
> Wenzhe Wang
>
>
>
>
> WenzheWang
> wongwwz@foxmail.com
>
>
>
>
>
>
>
> ------------------ Original ------------------
> From: "FFmpeg development discussions and patches" <jb@videolan.org>;
> Date: Sun, Apr 9, 2023 05:31 AM
> To: "ffmpeg-devel"<ffmpeg-devel@ffmpeg.org>;
>
> Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
>
>
> On Thu, 6 Apr 2023, at 12:36, wongwwz@foxmail.com wrote:
> > PaddlePaddle (PArallel Distributed Deep LEarning) is a simple,
> > efficient and extensible deep learning framework that accelerates the
>
> Please don't add another DNN backend.
>
> --
> Jean-Baptiste Kempf - President
> +33 672 704 734
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
2023-05-10 4:08 ` "zhilizhao(赵志立)"
@ 2023-05-10 11:30 ` Guo, Yejun
2023-05-11 7:53 ` [FFmpeg-devel] Re: [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend WenzheWang
0 siblings, 1 reply; 7+ messages in thread
From: Guo, Yejun @ 2023-05-10 11:30 UTC (permalink / raw)
To: FFmpeg development discussions and patches
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> "zhilizhao(赵志立)"
> Sent: Wednesday, May 10, 2023 12:09 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as
> one of DNN backend
>
>
>
> > On May 10, 2023, at 10:25, WenzheWang <wongwwz@foxmail.com> wrote:
> >
> > Dear Madam or Sir,
> >
> >
> > Hope this email finds you well.
> >
> >
> > I am writing this email since I recently found that FFmpeg removed the DNN
> > native backend, and I will be really grateful if you could let me know
> > whether there is any new plan for libavfilter/dnn.
> >
> >
> > I would like to explain to you again why I propose adding the DNN Paddle
> > backend.
> >
> > At present, FFmpeg only supports the OpenVINO and TensorFlow backends.
> > Among the current deep learning frameworks, TensorFlow is the most actively
> > developed: TensorFlow has 174k stars on GitHub and PyTorch has 66.5k, while
> > OpenVINO has 4.2k and supports relatively few models. In terms of attention
> > on GitHub, there is no doubt that TensorFlow and PyTorch are the more
> > promising frameworks. The Paddle framework has currently reached 20.2k
> > stars on GitHub, and it is much more widely used and active than frameworks
> > such as MXNet and Caffe.
>
> Stars don't matter much here.
>
> Just for reference, there is a thread before:
>
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20220523092918.9548-
> 2-ting.fu@intel.com/
>
> >
> > TensorFlow has a very rich ecosystem. The TensorFlow models library is
> > updated very quickly and already provides examples of deep learning
> > applications for image classification, object detection, image-to-text
> > generation, and generative adversarial networks. Supporting a TensorFlow
> > backend is undoubtedly necessary for the dnn libavfilter module. But the
> > complexity of the TensorFlow API and of its training workflow is almost
> > prohibitive, making it a love-hate framework.
> >
> > The PyTorch framework tends to be used for fast academic prototyping,
> > and its performance in industrial applications is not as good. For example,
> > when a PyTorch model is made to run on a server, an Android phone or an
> > embedded system, its performance is poor compared with other deep learning
> > frameworks.
> >
> >
> > PaddlePaddle is an open-source framework from Baidu that is also used
> > by many people in China. It fits developers' habits well, although the
> > usability of its API could still be improved. It is, however, the only
> > deep learning framework I have used that needs no third-party libraries
> > to be configured: it can be used directly after cloning and running make.
> > Paddle also has a small memory footprint and is fast, serves a
> > considerable number of projects inside Baidu, is very strong in
> > industrial applications, and supports multi-machine, multi-card training.
IMO, we can add one or two DNN backends, as discussed at
http://ffmpeg.org/pipermail/ffmpeg-devel/2022-December/304534.html
The background is that we see good models coming from different deep learning
frameworks, and most frameworks do not support models developed with another
framework because the model formats differ. IMO, we should support several popular frameworks.
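
Different frameworks indeed serialize models differently (for example TensorFlow's .pb graphs, OpenVINO's .xml/.bin IR, Paddle's .pdmodel/.pdiparams pair), so a glue layer that hides each framework behind a common set of function pointers is the natural shape for the dnn module. The sketch below shows only that dispatch pattern in a self-contained form; the type and function names (BackendType, BackendOps, get_backend, run_paddle) are made up for illustration, and the real interface in libavfilter/dnn/dnn_interface.h is considerably richer (model options, async execution, frame I/O).

    #include <stdio.h>

    /* One enum value per framework and one table of function pointers per
     * backend; callers never touch framework-specific code directly. */
    typedef enum { BACKEND_TF, BACKEND_OPENVINO, BACKEND_PADDLE } BackendType;

    typedef struct BackendOps {
        const char *name;
        /* A real backend would load the framework-specific model file here
         * and run inference on frames; this sketch only prints. */
        int (*run)(const char *model_path);
    } BackendOps;

    static int run_paddle(const char *model_path)
    {
        printf("would run Paddle Inference on %s\n", model_path);
        return 0;
    }

    static const BackendOps paddle_ops = { "paddle", run_paddle };

    /* The glue layer: select a backend by enum value. */
    static const BackendOps *get_backend(BackendType type)
    {
        switch (type) {
        case BACKEND_PADDLE: return &paddle_ops;
        default:             return NULL; /* other backends omitted in this sketch */
        }
    }

    int main(void)
    {
        const BackendOps *ops = get_backend(BACKEND_PADDLE);
        return ops ? ops->run("model.pdmodel") : 1;
    }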
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
* [FFmpeg-devel] Re: [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
2023-05-10 11:30 ` Guo, Yejun
@ 2023-05-11 7:53 ` WenzheWang
0 siblings, 0 replies; 7+ messages in thread
From: WenzheWang @ 2023-05-11 7:53 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Thank you for your reply; I am very glad to receive your opinion.
For me, GitHub stars are not very important; I only mentioned them to show how many people are currently using each deep learning framework. In any case, I have now learned about the DNN module's plans for other frameworks.
I think adding a glue layer would be a good idea, and I hope to see it implemented soon. Would I have the opportunity to participate in the development of this module?
Thank you again for answering my question.
best,
Wenzhe
WenzheWang
wongwwz@foxmail.com
------------------ Original ------------------
From: "FFmpeg development discussions and patches" <yejun.guo-at-intel.com@ffmpeg.org>;
Date: Wed, May 10, 2023 7:30 PM
To: "FFmpeg development discussions and patches"<ffmpeg-devel@ffmpeg.org>;
Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> "zhilizhao(ÕÔÖ¾Á¢)"
> Sent: Wednesday, May 10, 2023 12:09 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as
> one of DNN backend
>
>
>
> > On May 10, 2023, at 10:25, WenzheWang <wongwwz@foxmail.com> wrote:
> >
> > Dear Madam or Sir,
> >
> >
> > Hope this email finds you well.
> >
> >
> > I am writing this email since I recently found that FFmpeg removed the DNN
> > native backend, and I will be really grateful if you could let me know
> > whether there is any new plan for libavfilter/dnn.
> >
> >
> > I would like to explain to you again why I propose adding the DNN Paddle
> > backend.
> >
> > At present, FFmpeg only supports the OpenVINO and TensorFlow backends.
> > Among the current deep learning frameworks, TensorFlow is the most actively
> > developed: TensorFlow has 174k stars on GitHub and PyTorch has 66.5k, while
> > OpenVINO has 4.2k and supports relatively few models. In terms of attention
> > on GitHub, there is no doubt that TensorFlow and PyTorch are the more
> > promising frameworks. The Paddle framework has currently reached 20.2k
> > stars on GitHub, and it is much more widely used and active than frameworks
> > such as MXNet and Caffe.
>
> Stars don't matter much here.
>
> Just for reference, there is a thread before:
>
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20220523092918.9548-
> 2-ting.fu@intel.com/
>
> >
> > TensorFlow has a very rich ecosystem. The TensorFlow models library is
> > updated very quickly and already provides examples of deep learning
> > applications for image classification, object detection, image-to-text
> > generation, and generative adversarial networks. Supporting a TensorFlow
> > backend is undoubtedly necessary for the dnn libavfilter module. But the
> > complexity of the TensorFlow API and of its training workflow is almost
> > prohibitive, making it a love-hate framework.
> >
> > The PyTorch framework tends to be used for fast academic prototyping,
> > and its performance in industrial applications is not as good. For example,
> > when a PyTorch model is made to run on a server, an Android phone or an
> > embedded system, its performance is poor compared with other deep learning
> > frameworks.
> >
> >
> > PaddlePaddle is an open-source framework from Baidu that is also used
> > by many people in China. It fits developers' habits well, although the
> > usability of its API could still be improved. It is, however, the only
> > deep learning framework I have used that needs no third-party libraries
> > to be configured: it can be used directly after cloning and running make.
> > Paddle also has a small memory footprint and is fast, serves a
> > considerable number of projects inside Baidu, is very strong in
> > industrial applications, and supports multi-machine, multi-card training.
IMO, we can add one or two DNN backends, as discussed at
http://ffmpeg.org/pipermail/ffmpeg-devel/2022-December/304534.html
The background is that we see good models coming from different deep learning
frameworks, and most frameworks do not support models developed with another
framework because the model formats differ. IMO, we should support several popular frameworks.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-05-11 7:54 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-06 10:36 [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend wongwwz
2023-04-08 21:31 ` Jean-Baptiste Kempf
2023-04-11 15:03 ` WenzheWang
2023-05-10 2:25 ` WenzheWang
2023-05-10 4:08 ` "zhilizhao(赵志立)"
2023-05-10 11:30 ` Guo, Yejun
2023-05-11 7:53 ` [FFmpeg-devel] Re: [PATCH v1] libavfi/dnn: add Paddle Inference as one of DNN backend WenzheWang
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This inbox may be cloned and mirrored by anyone:
git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git
# If you have public-inbox 1.1+ installed, you may
# initialize and index your mirror using the following commands:
public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
ffmpegdev@gitmailbox.com
public-inbox-index ffmpegdev
Example config snippet for mirrors.
AGPL code for this site: git clone https://public-inbox.org/public-inbox.git