From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Chen, Wenbin"
To: FFmpeg development discussions and patches
Date: Wed, 9 Feb 2022 06:22:28 +0000
Subject: Re: [FFmpeg-devel] [PATCH V3 3/3] libavcodec/vaapi_encode: Add async_depth to vaapi_encoder to increase performance
References: <20220208030549.340748-1-wenbin.chen@intel.com> <20220208030549.340748-3-wenbin.chen@intel.com>
In-Reply-To: <20220208030549.340748-3-wenbin.chen@intel.com>
Reply-To: FFmpeg development discussions and patches
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

> Add async_depth to increase encoder's performance. Reuse encode_fifo as
> async buffer. Encoder puts all reordered frame to HW and then check
> fifo size. If fifo < async_depth and the top frame is not ready, it will
> return AVERROR(EAGAIN) to require more frames.
>
> 1080p transcoding (no B frames) with -async_depth=4 can increase 20%
> performance on my environment.
> The async increases performance but also introduces frame delay.
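
For readers following the thread, the rule described in the commit message above can be modelled roughly as below. This is only a sketch under stated assumptions, not the code from the patch: async_gate(), its parameters and the GATE_* names are hypothetical, and AVERROR(EAGAIN) is modelled as -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the encoder itself returns AVERROR(EAGAIN). */
enum { GATE_OUTPUT = 0, GATE_NEED_INPUT = -EAGAIN };

/*
 * Decide whether the receive side should hand out the oldest queued frame
 * or ask the caller for more input first.
 *   queued        - frames already submitted to the hardware
 *   async_depth   - number of frames allowed in flight
 *   head_ready    - whether the oldest queued frame has finished encoding
 *   end_of_stream - whether the caller has signalled that no input remains
 */
static int async_gate(size_t queued, size_t async_depth,
                      bool head_ready, bool end_of_stream)
{
    if (queued == 0)
        return GATE_NEED_INPUT;      /* nothing to output yet */
    if (queued < async_depth && !end_of_stream && !head_ready)
        return GATE_NEED_INPUT;      /* keep filling the pipeline */
    return GATE_OUTPUT;              /* take the head frame, waiting on it if needed */
}

/* Small self-check of the cases named in the commit message. */
static void async_gate_examples(void)
{
    assert(async_gate(1, 4, false, false) == GATE_NEED_INPUT);
    assert(async_gate(1, 4, true,  false) == GATE_OUTPUT);
    assert(async_gate(4, 4, false, false) == GATE_OUTPUT);
    assert(async_gate(1, 4, false, true)  == GATE_OUTPUT);
}
```

With this rule the pipeline keeps up to async_depth frames in flight, which is also where the extra frame delay mentioned above comes from.
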
>
> Signed-off-by: Wenbin Chen
> ---
>  libavcodec/vaapi_encode.c | 16 ++++++++++++----
>  libavcodec/vaapi_encode.h | 12 ++++++++++--
>  2 files changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
> index 15ddbbaa4a..432abf31f7 100644
> --- a/libavcodec/vaapi_encode.c
> +++ b/libavcodec/vaapi_encode.c
> @@ -1158,7 +1158,8 @@ static int vaapi_encode_send_frame(AVCodecContext *avctx, AVFrame *frame)
>          if (ctx->input_order == ctx->decode_delay)
>              ctx->dts_pts_diff = pic->pts - ctx->first_pts;
>          if (ctx->output_delay > 0)
> -            ctx->ts_ring[ctx->input_order % (3 * ctx->output_delay)] = pic->pts;
> +            ctx->ts_ring[ctx->input_order %
> +                        (3 * ctx->output_delay + ctx->async_depth)] = pic->pts;
>
>          pic->display_order = ctx->input_order;
>          ++ctx->input_order;
> @@ -1214,7 +1215,7 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
>
>  #if VA_CHECK_VERSION(1, 9, 0)
>      if (ctx->has_sync_buffer_func) {
> -        while (av_fifo_can_read(ctx->encode_fifo) <= MAX_PICTURE_REFERENCES) {
> +        while (av_fifo_can_read(ctx->encode_fifo) <= MAX_ASYNC_DEPTH) {

There is a mistake here: the comparison should be "<" rather than "<=".
I can also use av_fifo_can_write() here instead. I will update it.

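
To illustrate the off-by-one in the loop condition quoted above, here is a minimal sketch with a toy queue. ToyFifo, toy_can_read(), toy_can_write(), toy_write() and fill_fifo() are hypothetical names used only for illustration, not the AVFifo API; the sketch assumes a fixed-capacity FIFO without auto-grow, sized like the av_fifo_alloc2(MAX_ASYNC_DEPTH, ...) call in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for AVFifo: it only tracks fill level and capacity. */
typedef struct ToyFifo {
    size_t nb_elems;  /* elements currently queued */
    size_t capacity;  /* fixed size, no auto-grow */
} ToyFifo;

static size_t toy_can_read(const ToyFifo *f)  { return f->nb_elems; }
static size_t toy_can_write(const ToyFifo *f) { return f->capacity - f->nb_elems; }

/* Queue one element; returns 0 on success, -1 if the FIFO is already full. */
static int toy_write(ToyFifo *f)
{
    if (toy_can_write(f) == 0)
        return -1;
    f->nb_elems++;
    return 0;
}

enum LoopCond { COND_LE, COND_LT, COND_CAN_WRITE };

/* Fill the FIFO under the given loop condition and count rejected writes. */
static int fill_fifo(ToyFifo *f, enum LoopCond cond)
{
    int rejected = 0;
    for (;;) {
        int go;
        if (cond == COND_LE)
            go = toy_can_read(f) <= f->capacity;  /* condition as posted */
        else if (cond == COND_LT)
            go = toy_can_read(f) < f->capacity;   /* corrected comparison */
        else
            go = toy_can_write(f) > 0;            /* equivalent, states the intent */
        if (!go)
            break;
        if (toy_write(f) < 0) {
            rejected++;
            break;  /* stop here so the demonstration terminates */
        }
    }
    return rejected;
}

/* Small self-check of the three loop conditions. */
static void fill_fifo_examples(void)
{
    ToyFifo a = { 0, 64 }, b = { 0, 64 }, c = { 0, 64 };
    assert(fill_fifo(&a, COND_LE) == 1);        /* one write attempted on a full FIFO */
    assert(fill_fifo(&b, COND_LT) == 0);
    assert(fill_fifo(&c, COND_CAN_WRITE) == 0);
    assert(b.nb_elems == 64 && c.nb_elems == 64);
}
```

With "<=" the loop still runs once when the queue already holds capacity elements, so one extra write is attempted; "<" and the can-write check both stop exactly at capacity.
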
>              pic = NULL;
>              err = vaapi_encode_pick_next(avctx, &pic);
>              if (err < 0)
> @@ -1232,6 +1233,13 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
>          }
>          if (!av_fifo_can_read(ctx->encode_fifo))
>              return err;
> +        if (av_fifo_can_read(ctx->encode_fifo) < ctx->async_depth &&
> +            !ctx->end_of_stream) {
> +            av_fifo_peek(ctx->encode_fifo, &pic, 1, 0);
> +            err = vaapi_encode_wait(avctx, pic, 0);
> +            if (err < 0)
> +                return err;
> +        }
>          av_fifo_read(ctx->encode_fifo, &pic, 1);
>          ctx->encode_order = pic->encode_order + 1;
>      } else
> @@ -1267,7 +1275,7 @@ int ff_vaapi_encode_receive_packet(AVCodecContext *avctx, AVPacket *pkt)
>          pkt->dts = ctx->ts_ring[pic->encode_order] - ctx->dts_pts_diff;
>      } else {
>          pkt->dts = ctx->ts_ring[(pic->encode_order - ctx->decode_delay) %
> -                                (3 * ctx->output_delay)];
> +                                (3 * ctx->output_delay + ctx->async_depth)];
>      }
>      av_log(avctx, AV_LOG_DEBUG, "Output packet: pts %"PRId64" dts %"PRId64".\n",
>             pkt->pts, pkt->dts);
> @@ -2588,7 +2596,7 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
>      vas = vaSyncBuffer(ctx->hwctx->display, 0, 0);
>      if (vas != VA_STATUS_ERROR_UNIMPLEMENTED) {
>          ctx->has_sync_buffer_func = 1;
> -        ctx->encode_fifo = av_fifo_alloc2(MAX_PICTURE_REFERENCES + 1,
> +        ctx->encode_fifo = av_fifo_alloc2(MAX_ASYNC_DEPTH,
>                                            sizeof(VAAPIEncodePicture *),
>                                            0);
>          if (!ctx->encode_fifo)
> diff --git a/libavcodec/vaapi_encode.h b/libavcodec/vaapi_encode.h
> index d33a486cb8..691521387d 100644
> --- a/libavcodec/vaapi_encode.h
> +++ b/libavcodec/vaapi_encode.h
> @@ -48,6 +48,7 @@ enum {
>      MAX_TILE_ROWS          = 22,
>      // A.4.1: table A.6 allows at most 20 tile columns for any level.
>      MAX_TILE_COLS          = 20,
> +    MAX_ASYNC_DEPTH        = 64,
>  };
>
>  extern const AVCodecHWConfigInternal *const ff_vaapi_encode_hw_configs[];
> @@ -298,7 +299,8 @@ typedef struct VAAPIEncodeContext {
>      // Timestamp handling.
>      int64_t         first_pts;
>      int64_t         dts_pts_diff;
> -    int64_t         ts_ring[MAX_REORDER_DELAY * 3];
> +    int64_t         ts_ring[MAX_REORDER_DELAY * 3 +
> +                            MAX_ASYNC_DEPTH];
>
>      // Slice structure.
>      int slice_block_rows;
> @@ -350,6 +352,8 @@ typedef struct VAAPIEncodeContext {
>      AVFifo *encode_fifo;
>      //Whether the driver support vaSyncBuffer
>      int has_sync_buffer_func;
> +    //Max number of frame buffered in encoder.
> +    int async_depth;
>  } VAAPIEncodeContext;
>
>  enum {
> @@ -460,7 +464,11 @@ int ff_vaapi_encode_close(AVCodecContext *avctx);
>      { "b_depth", \
>        "Maximum B-frame reference depth", \
>        OFFSET(common.desired_b_depth), AV_OPT_TYPE_INT, \
> -      { .i64 = 1 }, 1, INT_MAX, FLAGS }
> +      { .i64 = 1 }, 1, INT_MAX, FLAGS }, \
> +    { "async_depth", "Maximum processing parallelism. " \
> +      "Increase this to improve single channel performance", \
> +      OFFSET(common.async_depth), AV_OPT_TYPE_INT, \
> +      { .i64 = 4 }, 0, MAX_ASYNC_DEPTH, FLAGS }
>
>  #define VAAPI_ENCODE_RC_MODE(name, desc) \
>      { #name, desc, 0, AV_OPT_TYPE_CONST, { .i64 = RC_MODE_ ## name }, \
> --
> 2.32.0
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".