From: Andreas Rheinhardt
Date: Mon, 19 Sep 2022 16:32:08 +0200
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v2 1/2] swscale/input: Avoid calls to av_pix_fmt_desc_get()

Andreas Rheinhardt:
> Up until now, libswscale/input.c used a macro to read
> an input pixel which involved a call to av_pix_fmt_desc_get()
> to find out whether the input pixel format is BE or LE
> despite this being known at compile-time (there are templates
> per pixfmt). Even worse, these calls are made in a loop,
> so that e.g. there are six calls to av_pix_fmt_desc_get()
> for every pair of UV pixel processed in
> rgb64ToUV_half_c_template().
>
> This commit modifies these macros to ensure that isBE()
> is evaluated at compile-time. This saved 9743B of .text
> for me (GCC 11.2, -O3). For a simple RGB64LE->YUV420P
> transformation like
> ffmpeg -f lavfi -i haldclutsrc,format=rgba64le -pix_fmt yuv420p \
> -threads 1 -t 1:00 -f null -
> the amount of decicycles spent in rgb64LEToUV_half_c
> (which is created via the template mentioned above)
> decreases from 19751 to 5341; for RGBA64BE the number
> went down from 11945 to 5393. For shared builds (where
> the call to av_pix_fmt_desc_get() is indirect) the old numbers
> are 15230 for RGBA64BE and 27502 for RGBA64LE, whereas
> the numbers with this patch are indistinguishable from
> the numbers from a static build.
>
> Also make the macros that are touched conform to the
> usual convention of using uppercase names while just at it.
>
> Signed-off-by: Andreas Rheinhardt
> ---
>  libswscale/input.c | 122 +++++++++++++++++++++++++--------------------
>  1 file changed, 68 insertions(+), 54 deletions(-)
>
> diff --git a/libswscale/input.c b/libswscale/input.c
> index 88e318e664..7ff7bfaa01 100644
> --- a/libswscale/input.c
> +++ b/libswscale/input.c
> @@ -28,14 +28,21 @@
>  #include "config.h"
>  #include "swscale_internal.h"
>
> -#define input_pixel(pos) (isBE(origin) ? AV_RB16(pos) : AV_RL16(pos))
> +#define input_pixel(pos) (is_be ? AV_RB16(pos) : AV_RL16(pos))
> +
> +#define IS_BE_LE 0
> +#define IS_BE_BE 1
> +#define IS_BE_   0
> +/* ENDIAN_IDENTIFIER needs to be "BE", "LE" or "". The latter is intended
> + * for single-byte cases where the concept of endianness does not apply. */
> +#define IS_BE(ENDIAN_IDENTIFIER) IS_BE_ ## ENDIAN_IDENTIFIER
>
>  #define r ((origin == AV_PIX_FMT_BGR48BE || origin == AV_PIX_FMT_BGR48LE || origin == AV_PIX_FMT_BGRA64BE || origin == AV_PIX_FMT_BGRA64LE) ? b_r : r_b)
>  #define b ((origin == AV_PIX_FMT_BGR48BE || origin == AV_PIX_FMT_BGR48LE || origin == AV_PIX_FMT_BGRA64BE || origin == AV_PIX_FMT_BGRA64LE) ? r_b : b_r)
>
>  static av_always_inline void
>  rgb64ToY_c_template(uint16_t *dst, const uint16_t *src, int width,
> -                    enum AVPixelFormat origin, int32_t *rgb2yuv)
> +                    enum AVPixelFormat origin, int32_t *rgb2yuv, int is_be)
>  {
>      int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
>      int i;
> @@ -51,7 +58,7 @@ rgb64ToY_c_template(uint16_t *dst, const uint16_t *src, int width,
>  static av_always_inline void
>  rgb64ToUV_c_template(uint16_t *dstU, uint16_t *dstV,
>                      const uint16_t *src1, const uint16_t *src2,
> -                    int width, enum AVPixelFormat origin, int32_t *rgb2yuv)
> +                    int width, enum AVPixelFormat origin, int32_t *rgb2yuv, int is_be)
>  {
>      int i;
>      int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
> @@ -70,7 +77,7 @@ rgb64ToUV_c_template(uint16_t *dstU, uint16_t *dstV,
>  static av_always_inline void
>  rgb64ToUV_half_c_template(uint16_t *dstU, uint16_t *dstV,
>                            const uint16_t *src1, const uint16_t *src2,
> -                          int width, enum AVPixelFormat origin, int32_t *rgb2yuv)
> +                          int width, enum AVPixelFormat origin, int32_t *rgb2yuv, int is_be)
>  {
>      int i;
>      int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
> @@ -86,13 +93,13 @@ rgb64ToUV_half_c_template(uint16_t *dstU, uint16_t *dstV,
>      }
>  }
>
> -#define rgb64funcs(pattern, BE_LE, origin) \
> +#define RGB64FUNCS_EXT(pattern, BE_LE, origin, is_be) \
>  static void pattern ## 64 ## BE_LE ## ToY_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0, const uint8_t *unused1,\
>                                      int width, uint32_t *rgb2yuv, void *opq) \
>  { \
>      const uint16_t *src = (const uint16_t *) _src; \
>      uint16_t *dst = (uint16_t *) _dst; \
> -    rgb64ToY_c_template(dst, src, width, origin, rgb2yuv); \
> +    rgb64ToY_c_template(dst, src, width, origin, rgb2yuv, is_be); \
>  } \
>  \
>  static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \
> @@ -102,7 +109,7 @@ static void pattern ## 64 ## BE_LE ## ToUV_c(uint8_t *_dstU, uint8_t *_dstV, \
>      const uint16_t *src1 = (const uint16_t *) _src1, \
>                     *src2 = (const uint16_t *) _src2; \
>      uint16_t *dstU = (uint16_t *) _dstU, *dstV = (uint16_t *) _dstV; \
> -    rgb64ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv); \
> +    rgb64ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, is_be); \
>  } \
>  \
>  static void pattern ## 64 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV, \
> @@ -112,18 +119,20 @@ static void pattern ## 64 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, uint8_t *_dstV
>      const uint16_t *src1 = (const uint16_t *) _src1, \
>                     *src2 = (const uint16_t *) _src2; \
>      uint16_t *dstU = (uint16_t *) _dstU, *dstV = (uint16_t *) _dstV; \
> -    rgb64ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv); \
> +    rgb64ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, is_be); \
>  }
> +#define RGB64FUNCS(pattern, endianness, base_fmt) \
> +    RGB64FUNCS_EXT(pattern, endianness, base_fmt ## endianness, IS_BE(endianness))
>
> -rgb64funcs(rgb, LE, AV_PIX_FMT_RGBA64LE)
> -rgb64funcs(rgb, BE, AV_PIX_FMT_RGBA64BE)
> -rgb64funcs(bgr, LE, AV_PIX_FMT_BGRA64LE)
> -rgb64funcs(bgr, BE, AV_PIX_FMT_BGRA64BE)
> +RGB64FUNCS(rgb, LE, AV_PIX_FMT_RGBA64)
> +RGB64FUNCS(rgb, BE, AV_PIX_FMT_RGBA64)
> +RGB64FUNCS(bgr, LE, AV_PIX_FMT_BGRA64)
> +RGB64FUNCS(bgr, BE, AV_PIX_FMT_BGRA64)
>
>  static av_always_inline void rgb48ToY_c_template(uint16_t *dst,
>                                                   const uint16_t *src, int width,
>                                                   enum AVPixelFormat origin,
> -                                                 int32_t *rgb2yuv)
> +                                                 int32_t *rgb2yuv, int is_be)
>  {
>      int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
>      int i;
> @@ -142,7 +151,7 @@ static av_always_inline void rgb48ToUV_c_template(uint16_t *dstU,
>                                                    const uint16_t *src2,
>                                                    int width,
>                                                    enum AVPixelFormat origin,
> -                                                  int32_t *rgb2yuv)
> +                                                  int32_t *rgb2yuv, int is_be)
>  {
>      int i;
>      int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
> @@ -164,7 +173,7 @@ static av_always_inline void rgb48ToUV_half_c_template(uint16_t *dstU,
>                                                         const uint16_t *src2,
>                                                         int width,
>                                                         enum AVPixelFormat origin,
> -                                                       int32_t *rgb2yuv)
> +                                                       int32_t *rgb2yuv, int is_be)
>  {
>      int i;
>      int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
> @@ -187,7 +196,7 @@ static av_always_inline void rgb48ToUV_half_c_template(uint16_t *dstU,
>  #undef b
>  #undef input_pixel
>
> -#define rgb48funcs(pattern, BE_LE, origin) \
> +#define RGB48FUNCS_EXT(pattern, BE_LE, origin, is_be) \
>  static void pattern ## 48 ## BE_LE ## ToY_c(uint8_t *_dst, \
>                                              const uint8_t *_src, \
>                                              const uint8_t *unused0, const uint8_t *unused1,\
> @@ -197,7 +206,7 @@ static void pattern ## 48 ## BE_LE ## ToY_c(uint8_t *_dst, \
>  { \
>      const uint16_t *src = (const uint16_t *)_src; \
>      uint16_t *dst = (uint16_t *)_dst; \
> -    rgb48ToY_c_template(dst, src, width, origin, rgb2yuv); \
> +    rgb48ToY_c_template(dst, src, width, origin, rgb2yuv, is_be); \
>  } \
>  \
>  static void pattern ## 48 ## BE_LE ## ToUV_c(uint8_t *_dstU, \
> @@ -213,7 +222,7 @@ static void pattern ## 48 ## BE_LE ## ToUV_c(uint8_t *_dstU, \
>                     *src2 = (const uint16_t *)_src2; \
>      uint16_t *dstU = (uint16_t *)_dstU, \
>               *dstV = (uint16_t *)_dstV; \
> -    rgb48ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv); \
> +    rgb48ToUV_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, is_be); \
>  } \
>  \
>  static void pattern ## 48 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, \
> @@ -229,13 +238,15 @@ static void pattern ## 48 ## BE_LE ## ToUV_half_c(uint8_t *_dstU, \
>                     *src2 = (const uint16_t *)_src2; \
>      uint16_t *dstU = (uint16_t *)_dstU, \
>               *dstV = (uint16_t *)_dstV; \
> -    rgb48ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv); \
> +    rgb48ToUV_half_c_template(dstU, dstV, src1, src2, width, origin, rgb2yuv, is_be); \
>  }
> +#define RGB48FUNCS(pattern, endianness, base_fmt) \
> +    RGB48FUNCS_EXT(pattern, endianness, base_fmt ## endianness, IS_BE(endianness))
>
> -rgb48funcs(rgb, LE, AV_PIX_FMT_RGB48LE)
> -rgb48funcs(rgb, BE, AV_PIX_FMT_RGB48BE)
> -rgb48funcs(bgr, LE, AV_PIX_FMT_BGR48LE)
> -rgb48funcs(bgr, BE, AV_PIX_FMT_BGR48BE)
> +RGB48FUNCS(rgb, LE, AV_PIX_FMT_RGB48)
> +RGB48FUNCS(rgb, BE, AV_PIX_FMT_RGB48)
> +RGB48FUNCS(bgr, LE, AV_PIX_FMT_BGR48)
> +RGB48FUNCS(bgr, BE, AV_PIX_FMT_BGR48)
>
>  #define input_pixel(i) ((origin == AV_PIX_FMT_RGBA || \
>                           origin == AV_PIX_FMT_BGRA || \
> @@ -245,7 +256,7 @@ rgb48funcs(bgr, BE, AV_PIX_FMT_BGR48BE)
>                          : ((origin == AV_PIX_FMT_X2RGB10LE || \
>                              origin == AV_PIX_FMT_X2BGR10LE) \
>                             ? AV_RL32(&src[(i) * 4]) \
> -                           : (isBE(origin) ? AV_RB16(&src[(i) * 2]) \
> +                           : (is_be ? AV_RB16(&src[(i) * 2]) \
>                                            : AV_RL16(&src[(i) * 2]))))
>
>  static av_always_inline void rgb16_32ToY_c_template(int16_t *dst,
> @@ -257,7 +268,7 @@ static av_always_inline void rgb16_32ToY_c_template(int16_t *dst,
>                                                      int maskr, int maskg,
>                                                      int maskb, int rsh,
>                                                      int gsh, int bsh, int S,
> -                                                    int32_t *rgb2yuv)
> +                                                    int32_t *rgb2yuv, int is_be)
>  {
>      const int ry = rgb2yuv[RY_IDX]<<rsh, gy = rgb2yuv[GY_IDX]<<gsh, by = rgb2yuv[BY_IDX]<<bsh;
>      const unsigned rnd = (32<<((S)-1)) + (1<<(S-7));
> @@ -283,7 +294,7 @@ static av_always_inline void rgb16_32ToUV_c_template(int16_t *dstU,
>                                                       int maskr, int maskg,
>                                                       int maskb, int rsh,
>                                                       int gsh, int bsh, int S,
> -                                                     int32_t *rgb2yuv)
> +                                                     int32_t *rgb2yuv, int is_be)
>  {
>      const int ru = rgb2yuv[RU_IDX] * (1 << rsh), gu = rgb2yuv[GU_IDX] * (1 << gsh), bu = rgb2yuv[BU_IDX] * (1 << bsh),
>                rv = rgb2yuv[RV_IDX] * (1 << rsh), gv = rgb2yuv[GV_IDX] * (1 << gsh), bv = rgb2yuv[BV_IDX] * (1 << bsh);
> @@ -311,7 +322,7 @@ static av_always_inline void rgb16_32ToUV_half_c_template(int16_t *dstU,
>                                                            int maskr, int maskg,
>                                                            int maskb, int rsh,
>                                                            int gsh, int bsh, int S,
> -                                                          int32_t *rgb2yuv)
> +                                                          int32_t *rgb2yuv, int is_be)
>  {
>      const int ru = rgb2yuv[RU_IDX] * (1 << rsh), gu = rgb2yuv[GU_IDX] * (1 << gsh), bu = rgb2yuv[BU_IDX] * (1 << bsh),
>                rv = rgb2yuv[RV_IDX] * (1 << rsh), gv = rgb2yuv[GV_IDX] * (1 << gsh), bv = rgb2yuv[BV_IDX] * (1 << bsh),
> @@ -345,13 +356,13 @@ static av_always_inline void rgb16_32ToUV_half_c_template(int16_t *dstU,
>
>  #undef input_pixel
>
> -#define rgb16_32_wrapper(fmt, name, shr, shg, shb, shp, maskr, \
> -                         maskg, maskb, rsh, gsh, bsh, S) \
> +#define RGB16_32FUNCS_EXT(fmt, name, shr, shg, shb, shp, maskr, \
> +                          maskg, maskb, rsh, gsh, bsh, S, is_be) \
>  static void name ## ToY_c(uint8_t *dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2, \
>                            int width, uint32_t *tab, void *opq) \
>  { \
>      rgb16_32ToY_c_template((int16_t*)dst, src, width, fmt, shr, shg, shb, shp, \
> -                           maskr, maskg, maskb, rsh, gsh, bsh, S, tab); \
> +                           maskr, maskg, maskb, rsh, gsh, bsh, S, tab, is_be); \
>  } \
>  \
>  static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV, \
> @@ -360,7 +371,7 @@ static void name ## ToUV_c(uint8_t *dstU, uint8_t *dstV, \
>  { \
>      rgb16_32ToUV_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt, \
>                              shr, shg, shb, shp, \
> -                            maskr, maskg, maskb, rsh, gsh, bsh, S, tab);\
> +                            maskr, maskg, maskb, rsh, gsh, bsh, S, tab, is_be); \
>  } \
>  \
>  static void name ## ToUV_half_c(uint8_t *dstU, uint8_t *dstV, \
> @@ -371,27 +382,32 @@ static void name ## ToUV_half_c(uint8_t *dstU, uint8_t *dstV, \
>      rgb16_32ToUV_half_c_template((int16_t*)dstU, (int16_t*)dstV, src, width, fmt, \
>                                   shr, shg, shb, shp, \
>                                   maskr, maskg, maskb, \
> -                                 rsh, gsh, bsh, S, tab); \
> -}
> -
> -rgb16_32_wrapper(AV_PIX_FMT_BGR32, bgr32, 16, 0, 0, 0, 0xFF0000, 0xFF00, 0x00FF, 8, 0, 8, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR32_1, bgr321, 16, 0, 0, 8, 0xFF0000, 0xFF00, 0x00FF, 8, 0, 8, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB32, rgb32, 0, 0, 16, 0, 0x00FF, 0xFF00, 0xFF0000, 8, 0, 8, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB32_1, rgb321, 0, 0, 16, 8, 0x00FF, 0xFF00, 0xFF0000, 8, 0, 8, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR565LE, bgr16le, 0, 0, 0, 0, 0x001F, 0x07E0, 0xF800, 11, 5, 0, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR555LE, bgr15le, 0, 0, 0, 0, 0x001F, 0x03E0, 0x7C00, 10, 5, 0, RGB2YUV_SHIFT + 7)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR444LE, bgr12le, 0, 0, 0, 0, 0x000F, 0x00F0, 0x0F00, 8, 4, 0, RGB2YUV_SHIFT + 4)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB565LE, rgb16le, 0, 0, 0, 0, 0xF800, 0x07E0, 0x001F, 0, 5, 11, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB555LE, rgb15le, 0, 0, 0, 0, 0x7C00, 0x03E0, 0x001F, 0, 5, 10, RGB2YUV_SHIFT + 7)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB444LE, rgb12le, 0, 0, 0, 0, 0x0F00, 0x00F0, 0x000F, 0, 4, 8, RGB2YUV_SHIFT + 4)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR565BE, bgr16be, 0, 0, 0, 0, 0x001F, 0x07E0, 0xF800, 11, 5, 0, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR555BE, bgr15be, 0, 0, 0, 0, 0x001F, 0x03E0, 0x7C00, 10, 5, 0, RGB2YUV_SHIFT + 7)
> -rgb16_32_wrapper(AV_PIX_FMT_BGR444BE, bgr12be, 0, 0, 0, 0, 0x000F, 0x00F0, 0x0F00, 8, 4, 0, RGB2YUV_SHIFT + 4)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB565BE, rgb16be, 0, 0, 0, 0, 0xF800, 0x07E0, 0x001F, 0, 5, 11, RGB2YUV_SHIFT + 8)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB555BE, rgb15be, 0, 0, 0, 0, 0x7C00, 0x03E0, 0x001F, 0, 5, 10, RGB2YUV_SHIFT + 7)
> -rgb16_32_wrapper(AV_PIX_FMT_RGB444BE, rgb12be, 0, 0, 0, 0, 0x0F00, 0x00F0, 0x000F, 0, 4, 8, RGB2YUV_SHIFT + 4)
> -rgb16_32_wrapper(AV_PIX_FMT_X2RGB10LE, rgb30le, 16, 6, 0, 0, 0x3FF00000, 0xFFC00, 0x3FF, 0, 0, 4, RGB2YUV_SHIFT + 6)
> -rgb16_32_wrapper(AV_PIX_FMT_X2BGR10LE, bgr30le, 0, 6, 16, 0, 0x3FF, 0xFFC00, 0x3FF00000, 4, 0, 0, RGB2YUV_SHIFT + 6)
> +                                 rsh, gsh, bsh, S, tab, is_be); \
> +}
> +
> +#define RGB16_32FUNCS(base_fmt, endianness, name, shr, shg, shb, shp, maskr, \
> +                      maskg, maskb, rsh, gsh, bsh, S) \
> +    RGB16_32FUNCS_EXT(base_fmt ## endianness, name, shr, shg, shb, shp, maskr, \
> +                      maskg, maskb, rsh, gsh, bsh, S, IS_BE(endianness))
> +
> +RGB16_32FUNCS(AV_PIX_FMT_BGR32, , bgr32, 16, 0, 0, 0, 0xFF0000, 0xFF00, 0x00FF, 8, 0, 8, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR32_1, , bgr321, 16, 0, 0, 8, 0xFF0000, 0xFF00, 0x00FF, 8, 0, 8, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB32, , rgb32, 0, 0, 16, 0, 0x00FF, 0xFF00, 0xFF0000, 8, 0, 8, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB32_1, , rgb321, 0, 0, 16, 8, 0x00FF, 0xFF00, 0xFF0000, 8, 0, 8, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR565, LE, bgr16le, 0, 0, 0, 0, 0x001F, 0x07E0, 0xF800, 11, 5, 0, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR555, LE, bgr15le, 0, 0, 0, 0, 0x001F, 0x03E0, 0x7C00, 10, 5, 0, RGB2YUV_SHIFT + 7)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR444, LE, bgr12le, 0, 0, 0, 0, 0x000F, 0x00F0, 0x0F00, 8, 4, 0, RGB2YUV_SHIFT + 4)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB565, LE, rgb16le, 0, 0, 0, 0, 0xF800, 0x07E0, 0x001F, 0, 5, 11, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB555, LE, rgb15le, 0, 0, 0, 0, 0x7C00, 0x03E0, 0x001F, 0, 5, 10, RGB2YUV_SHIFT + 7)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB444, LE, rgb12le, 0, 0, 0, 0, 0x0F00, 0x00F0, 0x000F, 0, 4, 8, RGB2YUV_SHIFT + 4)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR565, BE, bgr16be, 0, 0, 0, 0, 0x001F, 0x07E0, 0xF800, 11, 5, 0, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR555, BE, bgr15be, 0, 0, 0, 0, 0x001F, 0x03E0, 0x7C00, 10, 5, 0, RGB2YUV_SHIFT + 7)
> +RGB16_32FUNCS(AV_PIX_FMT_BGR444, BE, bgr12be, 0, 0, 0, 0, 0x000F, 0x00F0, 0x0F00, 8, 4, 0, RGB2YUV_SHIFT + 4)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB565, BE, rgb16be, 0, 0, 0, 0, 0xF800, 0x07E0, 0x001F, 0, 5, 11, RGB2YUV_SHIFT + 8)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB555, BE, rgb15be, 0, 0, 0, 0, 0x7C00, 0x03E0, 0x001F, 0, 5, 10, RGB2YUV_SHIFT + 7)
> +RGB16_32FUNCS(AV_PIX_FMT_RGB444, BE, rgb12be, 0, 0, 0, 0, 0x0F00, 0x00F0, 0x000F, 0, 4, 8, RGB2YUV_SHIFT + 4)
> +RGB16_32FUNCS(AV_PIX_FMT_X2RGB10, LE, rgb30le, 16, 6, 0, 0, 0x3FF00000, 0xFFC00, 0x3FF, 0, 0, 4, RGB2YUV_SHIFT + 6)
> +RGB16_32FUNCS(AV_PIX_FMT_X2BGR10, LE, bgr30le, 0, 6, 16, 0, 0x3FF, 0xFFC00, 0x3FF00000, 4, 0, 0, RGB2YUV_SHIFT + 6)
>
>  static void gbr24pToUV_half_c(uint8_t *_dstU, uint8_t *_dstV,
>                                const uint8_t *gsrc, const uint8_t *bsrc, const uint8_t *rsrc,
> @@ -832,8 +848,6 @@ p01x_wrapper(10, 6)
>  p01x_wrapper(12, 4)
>  p01x_uv_wrapper(16, 0)
>
> -#define input_pixel(pos) (isBE(origin) ? AV_RB16(pos) : AV_RL16(pos))
> -
>  static void bgr24ToY_c(uint8_t *_dst, const uint8_t *src, const uint8_t *unused1, const uint8_t *unused2,
>                         int width, uint32_t *rgb2yuv, void *opq)
>  {

Will apply this patchset tomorrow unless there are objections.

- Andreas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".