From: Andreas Rheinhardt
Date: Fri, 8 Sep 2023 10:36:45 +0200
To: ffmpeg-devel@ffmpeg.org
In-Reply-To: <20230908081508.510-1-christophe.gisquet@gmail.com>
References: <20230908081508.510-1-christophe.gisquet@gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH 1/7] proresdec2: port and fix for cached reader

Christophe Gisquet:
> Summary of changes
> - move back to regular, non-macro, get_bits API
> - reduce the lookup to switch the coding method
> - shorter reads wherever possible, in particular for the end of bitstream
>   (16 bits instead of 32, as per the above)
>
> There are cases that really need longer lengths (larger EG codes) of up
> to 27 bits.
>
> Win64: 6.10 -> 4.87 (~20% speedup)
>
> Reference for an hypothetical 32bits version of the cached reader:
> Win32: 11.4 -> 9.8 (14%, because iDCT is not SIMDed)
> ---

The commit message claims to fix something; what does it fix? Also,
changing to the non-macro API should be done in a separate commit to the
one changing the type of bitstream reader. Furthermore, you should also
provide information about the code size impact when switching the type
of reader.

>  libavcodec/proresdec2.c | 53 ++++++++++++++++++-----------------------
>  1 file changed, 23 insertions(+), 30 deletions(-)
>
> diff --git a/libavcodec/proresdec2.c b/libavcodec/proresdec2.c
> index 9297860946..6e243cfc17 100644
> --- a/libavcodec/proresdec2.c
> +++ b/libavcodec/proresdec2.c
> @@ -24,9 +24,7 @@
>   * Known FOURCCs: 'apch' (HQ), 'apcn' (SD), 'apcs' (LT), 'apco' (Proxy), 'ap4h' (4444), 'ap4x' (4444 XQ)
>   */
>
> -//#define DEBUG
> -
> -#define LONG_BITSTREAM_READER
> +#define CACHED_BITSTREAM_READER 1
>
>  #include "config_components.h"
>
> @@ -422,35 +420,37 @@ static int decode_picture_header(AVCodecContext *avctx, const uint8_t *buf, cons
>      return pic_data_size;
>  }
>
> -#define DECODE_CODEWORD(val, codebook, SKIP) \
> +/* bitstream_read may fail on 32bits ARCHS for >24 bits, so use long version there */
> +#if 0 //BITSTREAM_BITS == 32
> +# define READ_BITS get_bits_long
> +#else
> +# define READ_BITS get_bits
> +#endif
> +
> +#define DECODE_CODEWORD(val, codebook) \
>      do { \
>          unsigned int rice_order, exp_order, switch_bits; \
>          unsigned int q, buf, bits; \
>  \
> -        UPDATE_CACHE(re, gb); \
> -        buf = GET_CACHE(re, gb); \
> +        buf = show_bits(gb, 14); \
>  \
>          /* number of bits to switch between rice and exp golomb */ \
>          switch_bits = codebook & 3; \
>          rice_order = codebook >> 5; \
>          exp_order = (codebook >> 2) & 7; \
>  \
> -        q = 31 - av_log2(buf); \
> +        q = 13 - av_log2(buf); \
>  \
>          if (q > switch_bits) { /* exp golomb */ \
>              bits = exp_order - switch_bits + (q<<1); \
> -            if (bits > FFMIN(MIN_CACHE_BITS, 31)) \
> -                return AVERROR_INVALIDDATA; \
> -            val = SHOW_UBITS(re, gb, bits) - (1 << exp_order) + \
> +            val = READ_BITS(gb, bits) - (1 << exp_order) + \
>                  ((switch_bits + 1) << rice_order); \
> -            SKIP(re, gb, bits); \
>          } else if (rice_order) { \
> -            SKIP_BITS(re, gb, q+1); \
> -            val = (q << rice_order) + SHOW_UBITS(re, gb, rice_order); \
> -            SKIP(re, gb, rice_order); \
> +            skip_remaining(gb, q+1); \
> +            val = (q << rice_order) + get_bits(gb, rice_order); \
>          } else { \
>              val = q; \
> -            SKIP(re, gb, q+1); \
> +            skip_remaining(gb, q+1); \
>          } \
>      } while (0)
>
> @@ -466,9 +466,7 @@ static av_always_inline int decode_dc_coeffs(GetBitContext *gb, int16_t *out,
>      int16_t prev_dc;
>      int code, i, sign;
>
> -    OPEN_READER(re, gb);
> -
> -    DECODE_CODEWORD(code, FIRST_DC_CB, LAST_SKIP_BITS);
> +    DECODE_CODEWORD(code, FIRST_DC_CB);
>      prev_dc = TOSIGNED(code);
>      out[0] = prev_dc;
>
> @@ -477,13 +475,12 @@ static av_always_inline int decode_dc_coeffs(GetBitContext *gb, int16_t *out,
>      code = 5;
>      sign = 0;
>      for (i = 1; i < blocks_per_slice; i++, out += 64) {
> -        DECODE_CODEWORD(code, dc_codebook[FFMIN(code, 6U)], LAST_SKIP_BITS);
> +        DECODE_CODEWORD(code, dc_codebook[FFMIN(code, 6U)]);
>          if(code) sign ^= -(code & 1);
>          else sign = 0;
>          prev_dc += (((code + 1) >> 1) ^ sign) - sign;
>          out[0] = prev_dc;
>      }
> -    CLOSE_READER(re, gb);
>      return 0;
>  }
>
> @@ -497,11 +494,9 @@ static av_always_inline int decode_ac_coeffs(AVCodecContext *avctx, GetBitContex
>      const ProresContext *ctx = avctx->priv_data;
>      int block_mask, sign;
>      unsigned pos, run, level;
> -    int max_coeffs, i, bits_left;
> +    int max_coeffs, i, bits_rem;
>      int log2_block_count = av_log2(blocks_per_slice);
>
> -    OPEN_READER(re, gb);
> -    UPDATE_CACHE(re, gb); \
>      run = 4;
>      level = 2;
>
> @@ -509,28 +504,26 @@ static av_always_inline int decode_ac_coeffs(AVCodecContext *avctx, GetBitContex
>      block_mask = blocks_per_slice - 1;
>
>      for (pos = block_mask;;) {
> -        bits_left = gb->size_in_bits - re_index;
> -        if (!bits_left || (bits_left < 32 && !SHOW_UBITS(re, gb, bits_left)))
> +        bits_rem = get_bits_left(gb);
> +        if (!bits_rem || (bits_rem < 16 && !show_bits(gb, bits_rem)))
>              break;
>
> -        DECODE_CODEWORD(run, run_to_cb[FFMIN(run, 15)], LAST_SKIP_BITS);
> +        DECODE_CODEWORD(run, run_to_cb[FFMIN(run, 15)]);
>          pos += run + 1;
>          if (pos >= max_coeffs) {
>              av_log(avctx, AV_LOG_ERROR, "ac tex damaged %d, %d\n", pos, max_coeffs);
>              return AVERROR_INVALIDDATA;
>          }
>
> -        DECODE_CODEWORD(level, lev_to_cb[FFMIN(level, 9)], SKIP_BITS);
> +        DECODE_CODEWORD(level, lev_to_cb[FFMIN(level, 9)]);
>          level += 1;
>
>          i = pos >> log2_block_count;
>
> -        sign = SHOW_SBITS(re, gb, 1);
> -        SKIP_BITS(re, gb, 1);
> +        sign = -get_bits1(gb);
>          out[((pos & block_mask) << 6) + ctx->scan[i]] = ((level ^ sign) - sign);
>      }
>
> -    CLOSE_READER(re, gb);
>      return 0;
>  }
>
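
For anyone following the thread without the FFmpeg tree at hand, the
hybrid Rice/exp-Golomb decode that the patched DECODE_CODEWORD performs
can be sketched standalone. This is only an illustration under stated
assumptions: BitReader, br_show, br_skip, br_get, ilog2, decode_codeword
and apply_sign are hypothetical names and not FFmpeg API, and the two
guards (all-zero 14-bit window, bits > 32) are additions of this sketch
that the patch itself does not contain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal MSB-first bit reader; NOT FFmpeg's GetBitContext. */
typedef struct {
    const uint8_t *data;
    unsigned size_bits;   /* total number of valid bits in data */
    unsigned pos;         /* index of the next bit to read */
} BitReader;

/* Peek n bits (0..32) without consuming them; bits past the end read as 0. */
static uint32_t br_show(const BitReader *br, unsigned n)
{
    uint32_t v = 0;
    assert(n <= 32);
    for (unsigned i = 0; i < n; i++) {
        unsigned p = br->pos + i;
        unsigned bit = p < br->size_bits
                     ? (br->data[p >> 3] >> (7 - (p & 7))) & 1u : 0u;
        v = (v << 1) | bit;
    }
    return v;
}

static void br_skip(BitReader *br, unsigned n) { br->pos += n; }

static uint32_t br_get(BitReader *br, unsigned n)
{
    uint32_t v = br_show(br, n);
    br_skip(br, n);
    return v;
}

/* floor(log2(v)) for v > 0; yields 0 for v == 0. */
static int ilog2(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Decode one codeword. Returns 0 on success, -1 on a rejected prefix. */
static int decode_codeword(BitReader *br, unsigned codebook, unsigned *val)
{
    unsigned switch_bits = codebook & 3;        /* prefix length at which coding switches */
    unsigned rice_order  = codebook >> 5;
    unsigned exp_order   = (codebook >> 2) & 7;
    uint32_t window      = br_show(br, 14);     /* same 14-bit peek as the patch */
    unsigned q, bits;

    if (!window)
        return -1;                              /* sketch-only guard: no 1 bit in the window */
    q = 13 - ilog2(window);                     /* count of leading zero bits */

    if (q > switch_bits) {                      /* exp golomb */
        bits = exp_order - switch_bits + (q << 1);
        if (bits > 32)
            return -1;                          /* sketch-only guard */
        *val = br_get(br, bits) - (1u << exp_order) +
               ((switch_bits + 1) << rice_order);
    } else if (rice_order) {                    /* unary prefix plus rice remainder */
        br_skip(br, q + 1);
        *val = (q << rice_order) + br_get(br, rice_order);
    } else {                                    /* unary prefix only */
        *val = q;
        br_skip(br, q + 1);
    }
    return 0;
}

/* Branch-free conditional negation, as in the AC loop: sign_bit is 0 or 1. */
static int apply_sign(int level, int sign_bit)
{
    int sign = -sign_bit;                       /* 0 keeps level, -1 negates it */
    return (level ^ sign) - sign;
}
```

As a worked case, take the codebook byte 0xB8 (switch_bits 0, rice_order
5, exp_order 6; chosen here purely for illustration) and the two bytes
0xB5 0x14. The first call takes the Rice branch and consumes 6 bits; the
second sees one leading zero, takes the exp-Golomb branch and reads 8
bits in a single call, which is the kind of longer read the commit
message refers to.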