From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sun, 17 Mar 2024 15:47:14 +0100
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH] avcodec/mips/aaccoder_mips: Remove MIPS-specific aaccoder
X-Microsoft-Original-Message-ID: <02ae6d53-e2c3-42f0-b24d-b5edb1079369@outlook.com>
User-Agent: Mozilla Thunderbird
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Andreas Rheinhardt:
> ff_aac_coder_init_mips() modifies a static const structure of
> function pointers. This will crash if the binary uses relro
> and is a data race in any case.
>
> Furthermore it points to a maintainability issue: The
> AACCoefficientsEncoder structures have been constified
> in commit fd9212f2edfe9b107c3c08ba2df5fd2cba5ab9e3,
> a Libav commit merged in 318778de9ebec276cb9dfc65509231ca56590d13.
> Libav did not have the MIPS-specific AAC code and so this was
> fine for them; yet FFmpeg had them, but this was not recognized.
>
> Commit 75a099fc734a4ee2b1347d0a3d8c53d883b95174 points to another
> maintainability issue: Contrary to ordinary DSP code, this code
> here is way more complex and needs to be constantly kept in sync
> with the ordinary code which it mimics and replaces.
> Said commit
> is the only commit actually changing aaccoder.c in the last few
> years and the same change has not been performed for the MIPS
> clone; before that, it even happened several times that the mips
> code was broken due to changes of the generic code (see commits
> 97437bd17a8c5d4135b2f3b1b299bd7bb72ce02c and
> de262d018d7d7d9c967af1dfd1b861c4b9eb2a60 or
> 860dbe0275e57cbf4228f3f653f872ff66ca596b or
> 933309a6ca0f18bf1d40e917fff455221f57fb4b or
> b65ffa316e377213c29736929beba584d0d80d7c). This might even lead
> to scenarios where someone changing non-dsp aacenc code would
> have to modify mips inline asm in order to keep them in sync.
> This is obviously a significant burden (if the AAC encoder were
> actively developed).
>
> Finally, the code does not even compile here due to errors like
> "Error: float register should be even, was 1".
>
> Signed-off-by: Andreas Rheinhardt
> ---
>  libavcodec/aacenc.c             |    4 -
>  libavcodec/aacenc.h             |    1 -
>  libavcodec/mips/Makefile        |    1 -
>  libavcodec/mips/aaccoder_mips.c | 2503 -------------------------------
>  4 files changed, 2509 deletions(-)
>  delete mode 100644 libavcodec/mips/aaccoder_mips.c
>
> diff --git a/libavcodec/aacenc.c b/libavcodec/aacenc.c
> index 3f99188be4..55fa307809 100644
> --- a/libavcodec/aacenc.c
> +++ b/libavcodec/aacenc.c
> @@ -1383,10 +1383,6 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
>
>      ff_aacenc_dsp_init(&s->aacdsp);
>
> -#if HAVE_MIPSDSP
> -    ff_aac_coder_init_mips(s);
> -#endif
> -
>      ff_af_queue_init(avctx, &s->afq);
>
>      return 0;
> diff --git a/libavcodec/aacenc.h b/libavcodec/aacenc.h
> index c18e828905..8899f90ac7 100644
> --- a/libavcodec/aacenc.h
> +++ b/libavcodec/aacenc.h
> @@ -241,7 +241,6 @@ typedef struct AACEncContext {
>      } buffer;
>  } AACEncContext;
>
> -void ff_aac_coder_init_mips(AACEncContext *c);
>  void ff_quantize_band_cost_cache_init(struct AACEncContext *s);
>
>
> diff --git a/libavcodec/mips/Makefile b/libavcodec/mips/Makefile
> index 45c56e8ad9..50fe38a50e 100644
> --- a/libavcodec/mips/Makefile
> +++ b/libavcodec/mips/Makefile
> @@ -19,7 +19,6 @@ OBJS-$(CONFIG_AAC_DECODER) += mips/aacdec_mips.o \
>                                mips/aacsbr_mips.o \
>                                mips/sbrdsp_mips.o \
>                                mips/aacpsdsp_mips.o
> -MIPSDSP-OBJS-$(CONFIG_AAC_ENCODER) += mips/aaccoder_mips.o
>  MIPSFPU-OBJS-$(CONFIG_AAC_ENCODER) += mips/iirfilter_mips.o
>  OBJS-$(CONFIG_HEVC_DECODER) += mips/hevcdsp_init_mips.o \
>                                 mips/hevcpred_init_mips.o
> diff --git a/libavcodec/mips/aaccoder_mips.c b/libavcodec/mips/aaccoder_mips.c
> deleted file mode 100644
> index dd9661fbdd..0000000000
> --- a/libavcodec/mips/aaccoder_mips.c
> +++ /dev/null
> @@ -1,2503 +0,0 @@
> -/*
> - * Copyright (c) 2012
> - * MIPS Technologies, Inc., California.
> - *
> - * Redistribution and use in source and binary forms, with or without
> - * modification, are permitted provided that the following conditions
> - * are met:
> - * 1. Redistributions of source code must retain the above copyright
> - * notice, this list of conditions and the following disclaimer.
> - * 2. Redistributions in binary form must reproduce the above copyright
> - * notice, this list of conditions and the following disclaimer in the
> - * documentation and/or other materials provided with the distribution.
> - * 3. Neither the name of the MIPS Technologies, Inc., nor the names of its
> - * contributors may be used to endorse or promote products derived from
> - * this software without specific prior written permission.
> - *
> - * THIS SOFTWARE IS PROVIDED BY THE MIPS TECHNOLOGIES, INC. ``AS IS'' AND
> - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> - * ARE DISCLAIMED. IN NO EVENT SHALL THE MIPS TECHNOLOGIES, INC. BE LIABLE
> - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> - * SUCH DAMAGE.
> - *
> - * Author: Stanislav Ocovaj (socovaj@mips.com)
> - * Szabolcs Pal (sabolc@mips.com)
> - *
> - * AAC coefficients encoder optimized for MIPS floating-point architecture
> - *
> - * This file is part of FFmpeg.
> - *
> - * FFmpeg is free software; you can redistribute it and/or
> - * modify it under the terms of the GNU Lesser General Public
> - * License as published by the Free Software Foundation; either
> - * version 2.1 of the License, or (at your option) any later version.
> - *
> - * FFmpeg is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> - * Lesser General Public License for more details.
> - *
> - * You should have received a copy of the GNU Lesser General Public
> - * License along with FFmpeg; if not, write to the Free Software
> - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> - */
> -
> -/**
> - * @file
> - * Reference: libavcodec/aaccoder.c
> - */
> -
> -#include "libavutil/libm.h"
> -
> -#include
> -#include "libavutil/mathematics.h"
> -#include "libavcodec/avcodec.h"
> -#include "libavcodec/put_bits.h"
> -#include "libavcodec/aac.h"
> -#include "libavcodec/aacenc.h"
> -#include "libavcodec/aacencdsp.h"
> -#include "libavcodec/aactab.h"
> -#include "libavcodec/aacenctab.h"
> -#include "libavcodec/aacenc_utils.h"
> -
> -#if HAVE_INLINE_ASM
> -#if !HAVE_MIPS32R6 && !HAVE_MIPS64R6
> -typedef struct BandCodingPath {
> -    int prev_idx;
> -    float cost;
> -    int run;
> -} BandCodingPath;
> -
> -static const uint8_t uquad_sign_bits[81] = {
> -    0, 1, 1, 1, 2, 2, 1, 2, 2,
> -    1, 2, 2, 2, 3, 3, 2, 3, 3,
> -    1, 2, 2, 2, 3, 3, 2, 3, 3,
> -    1, 2, 2, 2, 3, 3, 2, 3, 3,
> -    2, 3, 3, 3, 4, 4, 3, 4, 4,
> -    2, 3, 3, 3, 4, 4, 3, 4, 4,
> -    1, 2, 2, 2, 3, 3, 2, 3, 3,
> -    2, 3, 3, 3, 4, 4, 3, 4, 4,
> -    2, 3, 3, 3, 4, 4, 3, 4, 4
> -};
> -
> -static const uint8_t upair7_sign_bits[64] = {
> -    0, 1, 1, 1, 1, 1, 1, 1,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2,
> -};
> -
> -static const uint8_t upair12_sign_bits[169] = {
> -    0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2
> -};
> -
> -static const uint8_t esc_sign_bits[289] = {
> -    0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> -    1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2
> -};
> -
> -/**
> - * Functions developed from template function and optimized for quantizing and encoding band
> - */
> -static void quantize_and_encode_band_cost_SQUAD_mips(struct AACEncContext *s,
> -                                                     PutBitContext *pb, const float *in, float *out,
> -                                                     const float *scaled, int size, int scale_idx,
> -                                                     int cb, const float lambda, const float uplim,
> -                                                     int *bits, float *energy, const float ROUNDING)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ  = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    int qc1, qc2, qc3, qc4;
> -    float qenergy = 0.0f;
> -
> -    uint8_t  *p_bits  = (uint8_t *)ff_aac_spectral_bits[cb-1];
> -    uint16_t *p_codes = (uint16_t *)ff_aac_spectral_codes[cb-1];
> -    float    *p_vec   = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    abs_pow34_v(s->scoefs, in, size);
> -    scaled = s->scoefs;
> -    for (i = 0; i < size; i += 4) {
> -        int curidx;
> -        int *in_int = (int *)&in[i];
> -        int t0, t1, t2, t3, t4, t5, t6, t7;
> -        const float *vec;
> -
> -        qc1 = scaled[i  ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "slt     %[qc1], $zero, %[qc1] \n\t"
> -            "slt     %[qc2], $zero, %[qc2] \n\t"
> -            "slt     %[qc3], $zero, %[qc3] \n\t"
> -            "slt     %[qc4], $zero, %[qc4] \n\t"
> -            "lw      %[t0], 0(%[in_int]) \n\t"
> -            "lw      %[t1], 4(%[in_int]) \n\t"
> -            "lw      %[t2], 8(%[in_int]) \n\t"
> -            "lw      %[t3], 12(%[in_int]) \n\t"
> -            "srl     %[t0], %[t0], 31 \n\t"
> -            "srl     %[t1], %[t1], 31 \n\t"
> -            "srl     %[t2], %[t2], 31 \n\t"
> -            "srl     %[t3], %[t3], 31 \n\t"
> -            "subu    %[t4], $zero, %[qc1] \n\t"
> -            "subu    %[t5], $zero, %[qc2] \n\t"
> -            "subu    %[t6], $zero, %[qc3] \n\t"
> -            "subu    %[t7], $zero, %[qc4] \n\t"
> -            "movn    %[qc1], %[t4], %[t0] \n\t"
> -            "movn    %[qc2], %[t5], %[t1] \n\t"
> -            "movn    %[qc3], %[t6], %[t2] \n\t"
> -            "movn    %[qc4], %[t7], %[t3] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4), [t5]"=&r"(t5), [t6]"=&r"(t6), [t7]"=&r"(t7)
> -            : [in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx = qc1;
> -        curidx *= 3;
> -        curidx += qc2;
> -        curidx *= 3;
> -        curidx += qc3;
> -        curidx *= 3;
> -        curidx += qc4;
> -        curidx += 40;
> -
> -        put_bits(pb, p_bits[curidx], p_codes[curidx]);
> -
> -        if (out || energy) {
> -            float e1,e2,e3,e4;
> -            vec = &p_vec[curidx*4];
> -            e1 = vec[0] * IQ;
> -            e2 = vec[1] * IQ;
> -            e3 = vec[2] * IQ;
> -            e4 = vec[3] * IQ;
> -            if (out) {
> -                out[i+0] = e1;
> -                out[i+1] = e2;
> -                out[i+2] = e3;
> -                out[i+3] = e4;
> -            }
> -            if (energy)
> -                qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4);
> -        }
> -    }
> -    if (energy)
> -        *energy = qenergy;
> -}
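As a reading aid for the inline asm in quantize_and_encode_band_cost_SQUAD_mips() quoted above, here is a minimal portable C sketch of what that block appears to compute: each quantized magnitude is clamped to 0 or 1 (slt), negated when the matching input coefficient has its sign bit set (srl/subu/movn), and the four results are folded into a base-3 codebook index offset by 40. The helper name squad_index() and its array parameters are hypothetical and not part of FFmpeg; the sketch only mirrors the quoted code.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper, not an FFmpeg function: portable equivalent of the
 * asm block plus the curidx arithmetic in the quoted SQUAD function.
 * in[] holds the four input coefficients, q[] the four non-negative
 * quantized magnitudes (qc1..qc4 before the asm runs). */
static int squad_index(const float in[4], const int q[4])
{
    int idx = 0;
    for (int k = 0; k < 4; k++) {
        int qc = q[k] > 0;          /* slt: clamp magnitude to {0, 1}     */
        if (signbit(in[k]))         /* srl 31: test the IEEE sign bit     */
            qc = -qc;               /* subu + movn: conditional negation  */
        idx = idx * 3 + qc;         /* base-3 accumulation of curidx      */
    }
    return idx + 40;                /* shift the range [-40, 40] to [0, 80] */
}
```

With all four magnitudes zero the index is 40, the middle of the 81-entry signed-quad codebook; four positive ones give 80 and four negative ones give 0.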
> -
> -static void quantize_and_encode_band_cost_UQUAD_mips(struct AACEncContext *s,
> -                                                     PutBitContext *pb, const float *in, float *out,
> -                                                     const float *scaled, int size, int scale_idx,
> -                                                     int cb, const float lambda, const float uplim,
> -                                                     int *bits, float *energy, const float ROUNDING)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ  = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    int qc1, qc2, qc3, qc4;
> -    float qenergy = 0.0f;
> -
> -    uint8_t  *p_bits  = (uint8_t *)ff_aac_spectral_bits[cb-1];
> -    uint16_t *p_codes = (uint16_t *)ff_aac_spectral_codes[cb-1];
> -    float    *p_vec   = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    abs_pow34_v(s->scoefs, in, size);
> -    scaled = s->scoefs;
> -    for (i = 0; i < size; i += 4) {
> -        int curidx, sign, count;
> -        int *in_int = (int *)&in[i];
> -        uint8_t v_bits;
> -        unsigned int v_codes;
> -        int t0, t1, t2, t3, t4;
> -        const float *vec;
> -
> -        qc1 = scaled[i  ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori     %[t4], $zero, 2 \n\t"
> -            "ori     %[sign], $zero, 0 \n\t"
> -            "slt     %[t0], %[t4], %[qc1] \n\t"
> -            "slt     %[t1], %[t4], %[qc2] \n\t"
> -            "slt     %[t2], %[t4], %[qc3] \n\t"
> -            "slt     %[t3], %[t4], %[qc4] \n\t"
> -            "movn    %[qc1], %[t4], %[t0] \n\t"
> -            "movn    %[qc2], %[t4], %[t1] \n\t"
> -            "movn    %[qc3], %[t4], %[t2] \n\t"
> -            "movn    %[qc4], %[t4], %[t3] \n\t"
> -            "lw      %[t0], 0(%[in_int]) \n\t"
> -            "lw      %[t1], 4(%[in_int]) \n\t"
> -            "lw      %[t2], 8(%[in_int]) \n\t"
> -            "lw      %[t3], 12(%[in_int]) \n\t"
> -            "slt     %[t0], %[t0], $zero \n\t"
> -            "movn    %[sign], %[t0], %[qc1] \n\t"
> -            "slt     %[t1], %[t1], $zero \n\t"
> -            "slt     %[t2], %[t2], $zero \n\t"
> -            "slt     %[t3], %[t3], $zero \n\t"
> -            "sll     %[t0], %[sign], 1 \n\t"
> -            "or      %[t0], %[t0], %[t1] \n\t"
> -            "movn    %[sign], %[t0], %[qc2] \n\t"
> -            "slt     %[t4], $zero, %[qc1] \n\t"
> -            "slt     %[t1], $zero, %[qc2] \n\t"
> -            "slt     %[count], $zero, %[qc3] \n\t"
> -            "sll     %[t0], %[sign], 1 \n\t"
> -            "or      %[t0], %[t0], %[t2] \n\t"
> -            "movn    %[sign], %[t0], %[qc3] \n\t"
> -            "slt     %[t2], $zero, %[qc4] \n\t"
> -            "addu    %[count], %[count], %[t4] \n\t"
> -            "addu    %[count], %[count], %[t1] \n\t"
> -            "sll     %[t0], %[sign], 1 \n\t"
> -            "or      %[t0], %[t0], %[t3] \n\t"
> -            "movn    %[sign], %[t0], %[qc4] \n\t"
> -            "addu    %[count], %[count], %[t2] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [sign]"=&r"(sign), [count]"=&r"(count),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4)
> -            : [in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx = qc1;
> -        curidx *= 3;
> -        curidx += qc2;
> -        curidx *= 3;
> -        curidx += qc3;
> -        curidx *= 3;
> -        curidx += qc4;
> -
> -        v_codes = (p_codes[curidx] << count) | (sign & ((1 << count) - 1));
> -        v_bits = p_bits[curidx] + count;
> -        put_bits(pb, v_bits, v_codes);
> -
> -        if (out || energy) {
> -            float e1,e2,e3,e4;
> -            vec = &p_vec[curidx*4];
> -            e1 = copysignf(vec[0] * IQ, in[i+0]);
> -            e2 = copysignf(vec[1] * IQ, in[i+1]);
> -            e3 = copysignf(vec[2] * IQ, in[i+2]);
> -            e4 = copysignf(vec[3] * IQ, in[i+3]);
> -            if (out) {
> -                out[i+0] = e1;
> -                out[i+1] = e2;
> -                out[i+2] = e3;
> -                out[i+3] = e4;
> -            }
> -            if (energy)
> -                qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4);
> -        }
> -    }
> -    if (energy)
> -        *energy = qenergy;
> -}
> -
> -static void quantize_and_encode_band_cost_SPAIR_mips(struct AACEncContext *s,
> -                                                     PutBitContext *pb, const float *in, float *out,
> -                                                     const float *scaled, int size, int scale_idx,
> -                                                     int cb, const float lambda, const float uplim,
> -                                                     int *bits, float *energy, const float ROUNDING)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ  = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    int qc1, qc2, qc3, qc4;
> -    float qenergy = 0.0f;
> -
> -    uint8_t  *p_bits  = (uint8_t *)ff_aac_spectral_bits[cb-1];
> -    uint16_t *p_codes = (uint16_t *)ff_aac_spectral_codes[cb-1];
> -    float    *p_vec   = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    abs_pow34_v(s->scoefs, in, size);
> -    scaled = s->scoefs;
> -    for (i = 0; i < size; i += 4) {
> -        int curidx, curidx2;
> -        int *in_int = (int *)&in[i];
> -        uint8_t v_bits;
> -        unsigned int v_codes;
> -        int t0, t1, t2, t3, t4, t5, t6, t7;
> -        const float *vec1, *vec2;
> -
> -        qc1 = scaled[i  ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori     %[t4], $zero, 4 \n\t"
> -            "slt     %[t0], %[t4], %[qc1] \n\t"
> -            "slt     %[t1], %[t4], %[qc2] \n\t"
> -            "slt     %[t2], %[t4], %[qc3] \n\t"
> -            "slt     %[t3], %[t4], %[qc4] \n\t"
> -            "movn    %[qc1], %[t4], %[t0] \n\t"
> -            "movn    %[qc2], %[t4], %[t1] \n\t"
> -            "movn    %[qc3], %[t4], %[t2] \n\t"
> -            "movn    %[qc4], %[t4], %[t3] \n\t"
> -            "lw      %[t0], 0(%[in_int]) \n\t"
> -            "lw      %[t1], 4(%[in_int]) \n\t"
> -            "lw      %[t2], 8(%[in_int]) \n\t"
> -            "lw      %[t3], 12(%[in_int]) \n\t"
> -            "srl     %[t0], %[t0], 31 \n\t"
> -            "srl     %[t1], %[t1], 31 \n\t"
> -            "srl     %[t2], %[t2], 31 \n\t"
> -            "srl     %[t3], %[t3], 31 \n\t"
> -            "subu    %[t4], $zero, %[qc1] \n\t"
> -            "subu    %[t5], $zero, %[qc2] \n\t"
> -            "subu    %[t6], $zero, %[qc3] \n\t"
> -            "subu    %[t7], $zero, %[qc4] \n\t"
> -            "movn    %[qc1], %[t4], %[t0] \n\t"
> -            "movn    %[qc2], %[t5], %[t1] \n\t"
> -            "movn    %[qc3], %[t6], %[t2] \n\t"
> -            "movn    %[qc4], %[t7], %[t3] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4), [t5]"=&r"(t5), [t6]"=&r"(t6), [t7]"=&r"(t7)
> -            : [in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx = 9 * qc1;
> -        curidx += qc2 + 40;
> -
> -        curidx2 = 9 * qc3;
> -        curidx2 += qc4 + 40;
> -
> -        v_codes = (p_codes[curidx] << p_bits[curidx2]) | (p_codes[curidx2]);
> -        v_bits = p_bits[curidx] + p_bits[curidx2];
> -        put_bits(pb, v_bits, v_codes);
> -
> -        if (out || energy) {
> -            float e1,e2,e3,e4;
> -            vec1 = &p_vec[curidx*2 ];
> -            vec2 = &p_vec[curidx2*2];
> -            e1 = vec1[0] * IQ;
> -            e2 = vec1[1] * IQ;
> -            e3 = vec2[0] * IQ;
> -            e4 = vec2[1] * IQ;
> -            if (out) {
> -                out[i+0] = e1;
> -                out[i+1] = e2;
> -                out[i+2] = e3;
> -                out[i+3] = e4;
> -            }
> -            if (energy)
> -                qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4);
> -        }
> -    }
> -    if (energy)
> -        *energy = qenergy;
> -}
> -
> -static void quantize_and_encode_band_cost_UPAIR7_mips(struct AACEncContext *s,
> -                                                      PutBitContext *pb, const float *in, float *out,
> -                                                      const float *scaled, int size, int scale_idx,
> -                                                      int cb, const float lambda, const float uplim,
> -                                                      int *bits, float *energy, const float ROUNDING)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ  = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    int qc1, qc2, qc3, qc4;
> -    float qenergy = 0.0f;
> -
> -    uint8_t  *p_bits  = (uint8_t*) ff_aac_spectral_bits[cb-1];
> -    uint16_t *p_codes = (uint16_t*)ff_aac_spectral_codes[cb-1];
> -    float    *p_vec   = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    abs_pow34_v(s->scoefs, in, size);
> -    scaled = s->scoefs;
> -    for (i = 0; i < size; i += 4) {
> -        int curidx1, curidx2, sign1, count1, sign2, count2;
> -        int *in_int = (int *)&in[i];
> -        uint8_t v_bits;
> -        unsigned int v_codes;
> -        int t0, t1, t2, t3, t4;
> -        const float *vec1, *vec2;
> -
> -        qc1 = scaled[i  ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori     %[t4], $zero, 7 \n\t"
> -            "ori     %[sign1], $zero, 0 \n\t"
> -            "ori     %[sign2], $zero, 0 \n\t"
> -            "slt     %[t0], %[t4], %[qc1] \n\t"
> -            "slt     %[t1], %[t4], %[qc2] \n\t"
> -            "slt     %[t2], %[t4], %[qc3] \n\t"
> -            "slt     %[t3], %[t4], %[qc4] \n\t"
> -            "movn    %[qc1], %[t4], %[t0] \n\t"
> -            "movn    %[qc2], %[t4], %[t1] \n\t"
> -            "movn    %[qc3], %[t4], %[t2] \n\t"
> -            "movn    %[qc4], %[t4], %[t3] \n\t"
> -            "lw      %[t0], 0(%[in_int]) \n\t"
> -            "lw      %[t1], 4(%[in_int]) \n\t"
> -            "lw      %[t2], 8(%[in_int]) \n\t"
> -            "lw      %[t3], 12(%[in_int]) \n\t"
> -            "slt     %[t0], %[t0], $zero \n\t"
> -            "movn    %[sign1], %[t0], %[qc1] \n\t"
> -            "slt     %[t2], %[t2], $zero \n\t"
> -            "movn    %[sign2], %[t2], %[qc3] \n\t"
> -            "slt     %[t1], %[t1], $zero \n\t"
> -            "sll     %[t0], %[sign1], 1 \n\t"
> -            "or      %[t0], %[t0], %[t1] \n\t"
> -            "movn    %[sign1], %[t0], %[qc2] \n\t"
> -            "slt     %[t3], %[t3], $zero \n\t"
> -            "sll     %[t0], %[sign2], 1 \n\t"
> -            "or      %[t0], %[t0], %[t3] \n\t"
> -            "movn    %[sign2], %[t0], %[qc4] \n\t"
> -            "slt     %[count1], $zero, %[qc1] \n\t"
> -            "slt     %[t1], $zero, %[qc2] \n\t"
> -            "slt     %[count2], $zero, %[qc3] \n\t"
> -            "slt     %[t2], $zero, %[qc4] \n\t"
> -            "addu    %[count1], %[count1], %[t1] \n\t"
> -            "addu    %[count2], %[count2], %[t2] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [sign1]"=&r"(sign1), [count1]"=&r"(count1),
> -              [sign2]"=&r"(sign2), [count2]"=&r"(count2),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4)
> -            : [in_int]"r"(in_int)
> -            : "t0", "t1", "t2", "t3", "t4",
> -              "memory"
> -        );
> -
> -        curidx1 = 8 * qc1;
> -        curidx1 += qc2;
> -
> -        v_codes = (p_codes[curidx1] << count1) | sign1;
> -        v_bits = p_bits[curidx1] + count1;
> -        put_bits(pb, v_bits, v_codes);
> -
> -        curidx2 = 8 * qc3;
> -        curidx2 += qc4;
> -
> -        v_codes = (p_codes[curidx2] << count2) | sign2;
> -        v_bits = p_bits[curidx2] + count2;
> -        put_bits(pb, v_bits, v_codes);
> -
> -        if (out || energy) {
> -            float e1,e2,e3,e4;
> -            vec1 = &p_vec[curidx1*2];
> -            vec2 = &p_vec[curidx2*2];
> -            e1 = copysignf(vec1[0] * IQ, in[i+0]);
> -            e2 = copysignf(vec1[1] * IQ, in[i+1]);
> -            e3 = copysignf(vec2[0] * IQ, in[i+2]);
> -            e4 = copysignf(vec2[1] * IQ, in[i+3]);
> -            if (out) {
> -                out[i+0] = e1;
> -                out[i+1] = e2;
> -                out[i+2] = e3;
> -                out[i+3] = e4;
> -            }
> -            if (energy)
> -                qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4);
> -        }
> -    }
> -    if (energy)
> -        *energy = qenergy;
> -}
> -
> -static void quantize_and_encode_band_cost_UPAIR12_mips(struct AACEncContext *s,
> -                                                       PutBitContext *pb, const float *in, float *out,
> -                                                       const float *scaled, int size, int scale_idx,
> -                                                       int cb, const float lambda, const float uplim,
> -                                                       int *bits, float *energy, const float ROUNDING)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ  = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    int qc1, qc2, qc3, qc4;
> -    float qenergy = 0.0f;
> -
> -    uint8_t  *p_bits  = (uint8_t*) ff_aac_spectral_bits[cb-1];
> -    uint16_t *p_codes = (uint16_t*)ff_aac_spectral_codes[cb-1];
> -    float    *p_vec   = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    abs_pow34_v(s->scoefs, in, size);
> -    scaled = s->scoefs;
> -    for (i = 0; i < size; i += 4) {
> -        int curidx1, curidx2, sign1, count1, sign2, count2;
> -        int *in_int = (int *)&in[i];
> -        uint8_t v_bits;
> -        unsigned int v_codes;
> -        int t0, t1, t2, t3, t4;
> -        const float *vec1, *vec2;
> -
> -        qc1 = scaled[i  ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori     %[t4], $zero, 12 \n\t"
> -            "ori     %[sign1], $zero, 0 \n\t"
> -            "ori     %[sign2], $zero, 0 \n\t"
> -            "slt     %[t0], %[t4], %[qc1] \n\t"
> -            "slt     %[t1], %[t4], %[qc2] \n\t"
> -            "slt     %[t2], %[t4], %[qc3] \n\t"
> -            "slt     %[t3], %[t4], %[qc4] \n\t"
> -            "movn    %[qc1], %[t4], %[t0] \n\t"
> -            "movn    %[qc2], %[t4], %[t1] \n\t"
> -            "movn    %[qc3], %[t4], %[t2] \n\t"
> -            "movn    %[qc4], %[t4], %[t3] \n\t"
> -            "lw      %[t0], 0(%[in_int]) \n\t"
> -            "lw      %[t1], 4(%[in_int]) \n\t"
> -            "lw      %[t2], 8(%[in_int]) \n\t"
> -            "lw      %[t3], 12(%[in_int]) \n\t"
> -            "slt     %[t0], %[t0], $zero \n\t"
> -            "movn    %[sign1], %[t0], %[qc1] \n\t"
> -            "slt     %[t2], %[t2], $zero \n\t"
> -            "movn    %[sign2], %[t2], %[qc3] \n\t"
> -            "slt     %[t1], %[t1], $zero \n\t"
> -            "sll     %[t0], %[sign1], 1 \n\t"
> -            "or      %[t0], %[t0], %[t1] \n\t"
> -            "movn    %[sign1], %[t0], %[qc2] \n\t"
> -            "slt     %[t3], %[t3], $zero \n\t"
> -            "sll     %[t0], %[sign2], 1 \n\t"
> -            "or      %[t0], %[t0], %[t3] \n\t"
> -            "movn    %[sign2], %[t0], %[qc4] \n\t"
> -            "slt     %[count1], $zero, %[qc1] \n\t"
> -            "slt     %[t1], $zero, %[qc2] \n\t"
> -            "slt     %[count2], $zero, %[qc3] \n\t"
> -            "slt     %[t2], $zero, %[qc4] \n\t"
> -            "addu    %[count1], %[count1], %[t1] \n\t"
> -            "addu    %[count2], %[count2], %[t2] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [sign1]"=&r"(sign1), [count1]"=&r"(count1),
> -              [sign2]"=&r"(sign2), [count2]"=&r"(count2),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4)
> -            : [in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx1 = 13 * qc1;
> -        curidx1 += qc2;
> -
> -        v_codes = (p_codes[curidx1] << count1) | sign1;
> -        v_bits = p_bits[curidx1] + count1;
> -        put_bits(pb, v_bits, v_codes);
> -
> -        curidx2 = 13 * qc3;
> -        curidx2 += qc4;
> -
> -        v_codes = (p_codes[curidx2] << count2) | sign2;
> -        v_bits = p_bits[curidx2] + count2;
> -        put_bits(pb, v_bits, v_codes);
> -
> -        if (out || energy) {
> -            float e1,e2,e3,e4;
> -            vec1 = &p_vec[curidx1*2];
> -            vec2 = &p_vec[curidx2*2];
> -            e1 = copysignf(vec1[0] * IQ, in[i+0]);
> -            e2 = copysignf(vec1[1] * IQ, in[i+1]);
> -            e3 = copysignf(vec2[0] * IQ, in[i+2]);
> -            e4 = copysignf(vec2[1] * IQ, in[i+3]);
> -            if (out) {
> -                out[i+0] = e1;
> -                out[i+1] = e2;
> -                out[i+2] = e3;
> 
- out[i+3] = e4; > - } > - if (energy) > - qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4); > - } > - } > - if (energy) > - *energy = qenergy; > -} > - > -static void quantize_and_encode_band_cost_ESC_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, float *out, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy, const float ROUNDING) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512]; > - int i; > - int qc1, qc2, qc3, qc4; > - float qenergy = 0.0f; > - > - uint8_t *p_bits = (uint8_t* )ff_aac_spectral_bits[cb-1]; > - uint16_t *p_codes = (uint16_t*)ff_aac_spectral_codes[cb-1]; > - float *p_vectors = (float* )ff_aac_codebook_vectors[cb-1]; > - > - abs_pow34_v(s->scoefs, in, size); > - scaled = s->scoefs; > - > - if (cb < 11) { > - for (i = 0; i < size; i += 4) { > - int curidx, curidx2, sign1, count1, sign2, count2; > - int *in_int = (int *)&in[i]; > - uint8_t v_bits; > - unsigned int v_codes; > - int t0, t1, t2, t3, t4; > - const float *vec1, *vec2; > - > - qc1 = scaled[i ] * Q34 + ROUNDING; > - qc2 = scaled[i+1] * Q34 + ROUNDING; > - qc3 = scaled[i+2] * Q34 + ROUNDING; > - qc4 = scaled[i+3] * Q34 + ROUNDING; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 16 \n\t" > - "ori %[sign1], $zero, 0 \n\t" > - "ori %[sign2], $zero, 0 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - "lw %[t0], 0(%[in_int]) \n\t" > - "lw %[t1], 4(%[in_int]) \n\t" > - "lw %[t2], 8(%[in_int]) \n\t" > - "lw %[t3], 12(%[in_int]) \n\t" > - "slt %[t0], 
%[t0], $zero \n\t" > - "movn %[sign1], %[t0], %[qc1] \n\t" > - "slt %[t2], %[t2], $zero \n\t" > - "movn %[sign2], %[t2], %[qc3] \n\t" > - "slt %[t1], %[t1], $zero \n\t" > - "sll %[t0], %[sign1], 1 \n\t" > - "or %[t0], %[t0], %[t1] \n\t" > - "movn %[sign1], %[t0], %[qc2] \n\t" > - "slt %[t3], %[t3], $zero \n\t" > - "sll %[t0], %[sign2], 1 \n\t" > - "or %[t0], %[t0], %[t3] \n\t" > - "movn %[sign2], %[t0], %[qc4] \n\t" > - "slt %[count1], $zero, %[qc1] \n\t" > - "slt %[t1], $zero, %[qc2] \n\t" > - "slt %[count2], $zero, %[qc3] \n\t" > - "slt %[t2], $zero, %[qc4] \n\t" > - "addu %[count1], %[count1], %[t1] \n\t" > - "addu %[count2], %[count2], %[t2] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [sign1]"=&r"(sign1), [count1]"=&r"(count1), > - [sign2]"=&r"(sign2), [count2]"=&r"(count2), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4) > - : [in_int]"r"(in_int) > - : "memory" > - ); > - > - curidx = 17 * qc1; > - curidx += qc2; > - curidx2 = 17 * qc3; > - curidx2 += qc4; > - > - v_codes = (p_codes[curidx] << count1) | sign1; > - v_bits = p_bits[curidx] + count1; > - put_bits(pb, v_bits, v_codes); > - > - v_codes = (p_codes[curidx2] << count2) | sign2; > - v_bits = p_bits[curidx2] + count2; > - put_bits(pb, v_bits, v_codes); > - > - if (out || energy) { > - float e1,e2,e3,e4; > - vec1 = &p_vectors[curidx*2 ]; > - vec2 = &p_vectors[curidx2*2]; > - e1 = copysignf(vec1[0] * IQ, in[i+0]); > - e2 = copysignf(vec1[1] * IQ, in[i+1]); > - e3 = copysignf(vec2[0] * IQ, in[i+2]); > - e4 = copysignf(vec2[1] * IQ, in[i+3]); > - if (out) { > - out[i+0] = e1; > - out[i+1] = e2; > - out[i+2] = e3; > - out[i+3] = e4; > - } > - if (energy) > - qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4); > - } > - } > - } else { > - for (i = 0; i < size; i += 4) { > - int curidx, curidx2, sign1, count1, sign2, count2; > - int *in_int = (int *)&in[i]; > - uint8_t v_bits; > - unsigned int v_codes; > - int c1, 
c2, c3, c4; > - int t0, t1, t2, t3, t4; > - > - qc1 = scaled[i ] * Q34 + ROUNDING; > - qc2 = scaled[i+1] * Q34 + ROUNDING; > - qc3 = scaled[i+2] * Q34 + ROUNDING; > - qc4 = scaled[i+3] * Q34 + ROUNDING; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 16 \n\t" > - "ori %[sign1], $zero, 0 \n\t" > - "ori %[sign2], $zero, 0 \n\t" > - "shll_s.w %[c1], %[qc1], 18 \n\t" > - "shll_s.w %[c2], %[qc2], 18 \n\t" > - "shll_s.w %[c3], %[qc3], 18 \n\t" > - "shll_s.w %[c4], %[qc4], 18 \n\t" > - "srl %[c1], %[c1], 18 \n\t" > - "srl %[c2], %[c2], 18 \n\t" > - "srl %[c3], %[c3], 18 \n\t" > - "srl %[c4], %[c4], 18 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - "lw %[t0], 0(%[in_int]) \n\t" > - "lw %[t1], 4(%[in_int]) \n\t" > - "lw %[t2], 8(%[in_int]) \n\t" > - "lw %[t3], 12(%[in_int]) \n\t" > - "slt %[t0], %[t0], $zero \n\t" > - "movn %[sign1], %[t0], %[qc1] \n\t" > - "slt %[t2], %[t2], $zero \n\t" > - "movn %[sign2], %[t2], %[qc3] \n\t" > - "slt %[t1], %[t1], $zero \n\t" > - "sll %[t0], %[sign1], 1 \n\t" > - "or %[t0], %[t0], %[t1] \n\t" > - "movn %[sign1], %[t0], %[qc2] \n\t" > - "slt %[t3], %[t3], $zero \n\t" > - "sll %[t0], %[sign2], 1 \n\t" > - "or %[t0], %[t0], %[t3] \n\t" > - "movn %[sign2], %[t0], %[qc4] \n\t" > - "slt %[count1], $zero, %[qc1] \n\t" > - "slt %[t1], $zero, %[qc2] \n\t" > - "slt %[count2], $zero, %[qc3] \n\t" > - "slt %[t2], $zero, %[qc4] \n\t" > - "addu %[count1], %[count1], %[t1] \n\t" > - "addu %[count2], %[count2], %[t2] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [sign1]"=&r"(sign1), [count1]"=&r"(count1), > - [sign2]"=&r"(sign2), [count2]"=&r"(count2), > - [c1]"=&r"(c1), [c2]"=&r"(c2), > 
- [c3]"=&r"(c3), [c4]"=&r"(c4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4) > - : [in_int]"r"(in_int) > - : "memory" > - ); > - > - curidx = 17 * qc1; > - curidx += qc2; > - > - curidx2 = 17 * qc3; > - curidx2 += qc4; > - > - v_codes = (p_codes[curidx] << count1) | sign1; > - v_bits = p_bits[curidx] + count1; > - put_bits(pb, v_bits, v_codes); > - > - if (p_vectors[curidx*2 ] == 64.0f) { > - int len = av_log2(c1); > - v_codes = (((1 << (len - 3)) - 2) << len) | (c1 & ((1 << len) - 1)); > - put_bits(pb, len * 2 - 3, v_codes); > - } > - if (p_vectors[curidx*2+1] == 64.0f) { > - int len = av_log2(c2); > - v_codes = (((1 << (len - 3)) - 2) << len) | (c2 & ((1 << len) - 1)); > - put_bits(pb, len*2-3, v_codes); > - } > - > - v_codes = (p_codes[curidx2] << count2) | sign2; > - v_bits = p_bits[curidx2] + count2; > - put_bits(pb, v_bits, v_codes); > - > - if (p_vectors[curidx2*2 ] == 64.0f) { > - int len = av_log2(c3); > - v_codes = (((1 << (len - 3)) - 2) << len) | (c3 & ((1 << len) - 1)); > - put_bits(pb, len* 2 - 3, v_codes); > - } > - if (p_vectors[curidx2*2+1] == 64.0f) { > - int len = av_log2(c4); > - v_codes = (((1 << (len - 3)) - 2) << len) | (c4 & ((1 << len) - 1)); > - put_bits(pb, len * 2 - 3, v_codes); > - } > - > - if (out || energy) { > - float e1, e2, e3, e4; > - e1 = copysignf(c1 * cbrtf(c1) * IQ, in[i+0]); > - e2 = copysignf(c2 * cbrtf(c2) * IQ, in[i+1]); > - e3 = copysignf(c3 * cbrtf(c3) * IQ, in[i+2]); > - e4 = copysignf(c4 * cbrtf(c4) * IQ, in[i+3]); > - if (out) { > - out[i+0] = e1; > - out[i+1] = e2; > - out[i+2] = e3; > - out[i+3] = e4; > - } > - if (energy) > - qenergy += (e1*e1 + e2*e2) + (e3*e3 + e4*e4); > - } > - } > - } > - if (energy) > - *energy = qenergy; > -} > - > -static void quantize_and_encode_band_cost_NONE_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, float *out, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int 
*bits, float *energy, const float ROUNDING) { > - av_assert0(0); > -} > - > -static void quantize_and_encode_band_cost_ZERO_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, float *out, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy, const float ROUNDING) { > - int i; > - if (bits) > - *bits = 0; > - if (out) { > - for (i = 0; i < size; i += 4) { > - out[i ] = 0.0f; > - out[i+1] = 0.0f; > - out[i+2] = 0.0f; > - out[i+3] = 0.0f; > - } > - } > - if (energy) > - *energy = 0.0f; > -} > - > -static void (*const quantize_and_encode_band_cost_arr[])(struct AACEncContext *s, > - PutBitContext *pb, const float *in, float *out, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy, const float ROUNDING) = { > - quantize_and_encode_band_cost_ZERO_mips, > - quantize_and_encode_band_cost_SQUAD_mips, > - quantize_and_encode_band_cost_SQUAD_mips, > - quantize_and_encode_band_cost_UQUAD_mips, > - quantize_and_encode_band_cost_UQUAD_mips, > - quantize_and_encode_band_cost_SPAIR_mips, > - quantize_and_encode_band_cost_SPAIR_mips, > - quantize_and_encode_band_cost_UPAIR7_mips, > - quantize_and_encode_band_cost_UPAIR7_mips, > - quantize_and_encode_band_cost_UPAIR12_mips, > - quantize_and_encode_band_cost_UPAIR12_mips, > - quantize_and_encode_band_cost_ESC_mips, > - quantize_and_encode_band_cost_NONE_mips, /* cb 12 doesn't exist */ > - quantize_and_encode_band_cost_ZERO_mips, > - quantize_and_encode_band_cost_ZERO_mips, > - quantize_and_encode_band_cost_ZERO_mips, > -}; > - > -#define quantize_and_encode_band_cost( \ > - s, pb, in, out, scaled, size, scale_idx, cb, \ > - lambda, uplim, bits, energy, ROUNDING) \ > - quantize_and_encode_band_cost_arr[cb]( \ > - s, pb, in, out, scaled, size, scale_idx, cb, \ > - lambda, uplim, bits, energy, ROUNDING) > - > -static void quantize_and_encode_band_mips(struct 
AACEncContext *s, PutBitContext *pb, > - const float *in, float *out, int size, int scale_idx, > - int cb, const float lambda, int rtz) > -{ > - quantize_and_encode_band_cost(s, pb, in, out, NULL, size, scale_idx, cb, lambda, > - INFINITY, NULL, NULL, (rtz) ? ROUND_TO_ZERO : ROUND_STANDARD); > -} > - > -/** > - * Functions developed from template function and optimized for getting the number of bits > - */ > -static float get_band_numbits_ZERO_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - return 0; > -} > - > -static float get_band_numbits_NONE_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - av_assert0(0); > - return 0; > -} > - > -static float get_band_numbits_SQUAD_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - int i; > - int qc1, qc2, qc3, qc4; > - int curbits = 0; > - > - uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - int curidx; > - int *in_int = (int *)&in[i]; > - int t0, t1, t2, t3, t4, t5, t6, t7; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "slt %[qc1], $zero, %[qc1] \n\t" > - "slt %[qc2], $zero, %[qc2] \n\t" > - "slt %[qc3], $zero, %[qc3] \n\t" > - "slt %[qc4], $zero, %[qc4] \n\t" > - "lw %[t0], 0(%[in_int]) \n\t" > - "lw %[t1], 4(%[in_int]) \n\t" > 
- "lw %[t2], 8(%[in_int]) \n\t" > - "lw %[t3], 12(%[in_int]) \n\t" > - "srl %[t0], %[t0], 31 \n\t" > - "srl %[t1], %[t1], 31 \n\t" > - "srl %[t2], %[t2], 31 \n\t" > - "srl %[t3], %[t3], 31 \n\t" > - "subu %[t4], $zero, %[qc1] \n\t" > - "subu %[t5], $zero, %[qc2] \n\t" > - "subu %[t6], $zero, %[qc3] \n\t" > - "subu %[t7], $zero, %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t5], %[t1] \n\t" > - "movn %[qc3], %[t6], %[t2] \n\t" > - "movn %[qc4], %[t7], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4), [t5]"=&r"(t5), [t6]"=&r"(t6), [t7]"=&r"(t7) > - : [in_int]"r"(in_int) > - : "memory" > - ); > - > - curidx = qc1; > - curidx *= 3; > - curidx += qc2; > - curidx *= 3; > - curidx += qc3; > - curidx *= 3; > - curidx += qc4; > - curidx += 40; > - > - curbits += p_bits[curidx]; > - } > - return curbits; > -} > - > -static float get_band_numbits_UQUAD_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - int i; > - int curbits = 0; > - int qc1, qc2, qc3, qc4; > - > - uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - int curidx; > - int t0, t1, t2, t3, t4; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 2 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] 
\n\t" > - "movn %[qc2], %[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4) > - ); > - > - curidx = qc1; > - curidx *= 3; > - curidx += qc2; > - curidx *= 3; > - curidx += qc3; > - curidx *= 3; > - curidx += qc4; > - > - curbits += p_bits[curidx]; > - curbits += uquad_sign_bits[curidx]; > - } > - return curbits; > -} > - > -static float get_band_numbits_SPAIR_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - int i; > - int qc1, qc2, qc3, qc4; > - int curbits = 0; > - > - uint8_t *p_bits = (uint8_t*)ff_aac_spectral_bits[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - int curidx, curidx2; > - int *in_int = (int *)&in[i]; > - int t0, t1, t2, t3, t4, t5, t6, t7; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 4 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - "lw %[t0], 0(%[in_int]) \n\t" > - "lw %[t1], 4(%[in_int]) \n\t" > - "lw %[t2], 8(%[in_int]) \n\t" > - "lw %[t3], 12(%[in_int]) \n\t" > - "srl %[t0], %[t0], 31 \n\t" > - "srl %[t1], %[t1], 31 \n\t" > - "srl %[t2], %[t2], 31 \n\t" > - "srl %[t3], %[t3], 31 \n\t" > - 
"subu %[t4], $zero, %[qc1] \n\t" > - "subu %[t5], $zero, %[qc2] \n\t" > - "subu %[t6], $zero, %[qc3] \n\t" > - "subu %[t7], $zero, %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t5], %[t1] \n\t" > - "movn %[qc3], %[t6], %[t2] \n\t" > - "movn %[qc4], %[t7], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4), [t5]"=&r"(t5), [t6]"=&r"(t6), [t7]"=&r"(t7) > - : [in_int]"r"(in_int) > - : "memory" > - ); > - > - curidx = 9 * qc1; > - curidx += qc2 + 40; > - > - curidx2 = 9 * qc3; > - curidx2 += qc4 + 40; > - > - curbits += p_bits[curidx] + p_bits[curidx2]; > - } > - return curbits; > -} > - > -static float get_band_numbits_UPAIR7_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - int i; > - int qc1, qc2, qc3, qc4; > - int curbits = 0; > - > - uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - int curidx, curidx2; > - int t0, t1, t2, t3, t4; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 7 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > 
- [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4) > - ); > - > - curidx = 8 * qc1; > - curidx += qc2; > - > - curidx2 = 8 * qc3; > - curidx2 += qc4; > - > - curbits += p_bits[curidx] + > - upair7_sign_bits[curidx] + > - p_bits[curidx2] + > - upair7_sign_bits[curidx2]; > - } > - return curbits; > -} > - > -static float get_band_numbits_UPAIR12_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - int i; > - int qc1, qc2, qc3, qc4; > - int curbits = 0; > - > - uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - int curidx, curidx2; > - int t0, t1, t2, t3, t4; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 12 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4) > - ); > - > - curidx = 13 * qc1; > - curidx += qc2; > - > - curidx2 = 13 * qc3; > - curidx2 += qc4; > - > - curbits += p_bits[curidx] + > - p_bits[curidx2] + > - upair12_sign_bits[curidx] + > - upair12_sign_bits[curidx2]; > - } > - return curbits; > -} > - > -static float get_band_numbits_ESC_mips(struct AACEncContext *s, > - PutBitContext 
*pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - int i; > - int qc1, qc2, qc3, qc4; > - int curbits = 0; > - > - uint8_t *p_bits = (uint8_t*)ff_aac_spectral_bits[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - int curidx, curidx2; > - int cond0, cond1, cond2, cond3; > - int c1, c2, c3, c4; > - int t4, t5; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 15 \n\t" > - "ori %[t5], $zero, 16 \n\t" > - "shll_s.w %[c1], %[qc1], 18 \n\t" > - "shll_s.w %[c2], %[qc2], 18 \n\t" > - "shll_s.w %[c3], %[qc3], 18 \n\t" > - "shll_s.w %[c4], %[qc4], 18 \n\t" > - "srl %[c1], %[c1], 18 \n\t" > - "srl %[c2], %[c2], 18 \n\t" > - "srl %[c3], %[c3], 18 \n\t" > - "srl %[c4], %[c4], 18 \n\t" > - "slt %[cond0], %[t4], %[qc1] \n\t" > - "slt %[cond1], %[t4], %[qc2] \n\t" > - "slt %[cond2], %[t4], %[qc3] \n\t" > - "slt %[cond3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t5], %[cond0] \n\t" > - "movn %[qc2], %[t5], %[cond1] \n\t" > - "movn %[qc3], %[t5], %[cond2] \n\t" > - "movn %[qc4], %[t5], %[cond3] \n\t" > - "ori %[t5], $zero, 31 \n\t" > - "clz %[c1], %[c1] \n\t" > - "clz %[c2], %[c2] \n\t" > - "clz %[c3], %[c3] \n\t" > - "clz %[c4], %[c4] \n\t" > - "subu %[c1], %[t5], %[c1] \n\t" > - "subu %[c2], %[t5], %[c2] \n\t" > - "subu %[c3], %[t5], %[c3] \n\t" > - "subu %[c4], %[t5], %[c4] \n\t" > - "sll %[c1], %[c1], 1 \n\t" > - "sll %[c2], %[c2], 1 \n\t" > - "sll %[c3], %[c3], 1 \n\t" > - "sll %[c4], %[c4], 1 \n\t" > - "addiu %[c1], %[c1], -3 \n\t" > - "addiu %[c2], %[c2], -3 \n\t" > - "addiu %[c3], %[c3], -3 \n\t" > - "addiu %[c4], %[c4], -3 \n\t" > - "subu 
%[cond0], $zero, %[cond0] \n\t" > - "subu %[cond1], $zero, %[cond1] \n\t" > - "subu %[cond2], $zero, %[cond2] \n\t" > - "subu %[cond3], $zero, %[cond3] \n\t" > - "and %[c1], %[c1], %[cond0] \n\t" > - "and %[c2], %[c2], %[cond1] \n\t" > - "and %[c3], %[c3], %[cond2] \n\t" > - "and %[c4], %[c4], %[cond3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [cond0]"=&r"(cond0), [cond1]"=&r"(cond1), > - [cond2]"=&r"(cond2), [cond3]"=&r"(cond3), > - [c1]"=&r"(c1), [c2]"=&r"(c2), > - [c3]"=&r"(c3), [c4]"=&r"(c4), > - [t4]"=&r"(t4), [t5]"=&r"(t5) > - ); > - > - curidx = 17 * qc1; > - curidx += qc2; > - > - curidx2 = 17 * qc3; > - curidx2 += qc4; > - > - curbits += p_bits[curidx]; > - curbits += esc_sign_bits[curidx]; > - curbits += p_bits[curidx2]; > - curbits += esc_sign_bits[curidx2]; > - > - curbits += c1; > - curbits += c2; > - curbits += c3; > - curbits += c4; > - } > - return curbits; > -} > - > -static float (*const get_band_numbits_arr[])(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits) = { > - get_band_numbits_ZERO_mips, > - get_band_numbits_SQUAD_mips, > - get_band_numbits_SQUAD_mips, > - get_band_numbits_UQUAD_mips, > - get_band_numbits_UQUAD_mips, > - get_band_numbits_SPAIR_mips, > - get_band_numbits_SPAIR_mips, > - get_band_numbits_UPAIR7_mips, > - get_band_numbits_UPAIR7_mips, > - get_band_numbits_UPAIR12_mips, > - get_band_numbits_UPAIR12_mips, > - get_band_numbits_ESC_mips, > - get_band_numbits_NONE_mips, /* cb 12 doesn't exist */ > - get_band_numbits_ZERO_mips, > - get_band_numbits_ZERO_mips, > - get_band_numbits_ZERO_mips, > -}; > - > -#define get_band_numbits( \ > - s, pb, in, scaled, size, scale_idx, cb, \ > - lambda, uplim, bits) \ > - get_band_numbits_arr[cb]( \ > - s, pb, in, scaled, size, scale_idx, cb, \ > - lambda, uplim, bits) > - > -static float 
quantize_band_cost_bits(struct AACEncContext *s, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy) > -{ > - return get_band_numbits(s, NULL, in, scaled, size, scale_idx, cb, lambda, uplim, bits); > -} > - > -/** > - * Functions developed from template function and optimized for getting the band cost > - */ > -#if HAVE_MIPSFPU > -static float get_band_cost_ZERO_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy) > -{ > - int i; > - float cost = 0; > - > - for (i = 0; i < size; i += 4) { > - cost += in[i ] * in[i ]; > - cost += in[i+1] * in[i+1]; > - cost += in[i+2] * in[i+2]; > - cost += in[i+3] * in[i+3]; > - } > - if (bits) > - *bits = 0; > - if (energy) > - *energy = 0.0f; > - return cost * lambda; > -} > - > -static float get_band_cost_NONE_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy) > -{ > - av_assert0(0); > - return 0; > -} > - > -static float get_band_cost_SQUAD_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512]; > - int i; > - float cost = 0; > - float qenergy = 0.0f; > - int qc1, qc2, qc3, qc4; > - int curbits = 0; > - > - uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1]; > - float *p_codes = (float *)ff_aac_codebook_vectors[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - const float *vec; > - int curidx; 
> - int *in_int = (int *)&in[i]; > - float *in_pos = (float *)&in[i]; > - float di0, di1, di2, di3; > - int t0, t1, t2, t3, t4, t5, t6, t7; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "slt %[qc1], $zero, %[qc1] \n\t" > - "slt %[qc2], $zero, %[qc2] \n\t" > - "slt %[qc3], $zero, %[qc3] \n\t" > - "slt %[qc4], $zero, %[qc4] \n\t" > - "lw %[t0], 0(%[in_int]) \n\t" > - "lw %[t1], 4(%[in_int]) \n\t" > - "lw %[t2], 8(%[in_int]) \n\t" > - "lw %[t3], 12(%[in_int]) \n\t" > - "srl %[t0], %[t0], 31 \n\t" > - "srl %[t1], %[t1], 31 \n\t" > - "srl %[t2], %[t2], 31 \n\t" > - "srl %[t3], %[t3], 31 \n\t" > - "subu %[t4], $zero, %[qc1] \n\t" > - "subu %[t5], $zero, %[qc2] \n\t" > - "subu %[t6], $zero, %[qc3] \n\t" > - "subu %[t7], $zero, %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], %[t5], %[t1] \n\t" > - "movn %[qc3], %[t6], %[t2] \n\t" > - "movn %[qc4], %[t7], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4), [t5]"=&r"(t5), [t6]"=&r"(t6), [t7]"=&r"(t7) > - : [in_int]"r"(in_int) > - : "memory" > - ); > - > - curidx = qc1; > - curidx *= 3; > - curidx += qc2; > - curidx *= 3; > - curidx += qc3; > - curidx *= 3; > - curidx += qc4; > - curidx += 40; > - > - curbits += p_bits[curidx]; > - vec = &p_codes[curidx*4]; > - > - qenergy += vec[0]*vec[0] + vec[1]*vec[1] > - + vec[2]*vec[2] + vec[3]*vec[3]; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "lwc1 $f0, 0(%[in_pos]) \n\t" > - "lwc1 $f1, 0(%[vec]) \n\t" > - "lwc1 $f2, 4(%[in_pos]) \n\t" > - "lwc1 $f3, 4(%[vec]) \n\t" > - "lwc1 $f4, 8(%[in_pos]) \n\t" > - "lwc1 $f5, 8(%[vec]) \n\t" > - "lwc1 $f6, 12(%[in_pos]) \n\t" > 
- "lwc1 $f7, 12(%[vec]) \n\t" > - "nmsub.s %[di0], $f0, $f1, %[IQ] \n\t" > - "nmsub.s %[di1], $f2, $f3, %[IQ] \n\t" > - "nmsub.s %[di2], $f4, $f5, %[IQ] \n\t" > - "nmsub.s %[di3], $f6, $f7, %[IQ] \n\t" > - > - ".set pop \n\t" > - > - : [di0]"=&f"(di0), [di1]"=&f"(di1), > - [di2]"=&f"(di2), [di3]"=&f"(di3) > - : [in_pos]"r"(in_pos), [vec]"r"(vec), > - [IQ]"f"(IQ) > - : "$f0", "$f1", "$f2", "$f3", > - "$f4", "$f5", "$f6", "$f7", > - "memory" > - ); > - > - cost += di0 * di0 + di1 * di1 > - + di2 * di2 + di3 * di3; > - } > - > - if (bits) > - *bits = curbits; > - if (energy) > - *energy = qenergy * (IQ*IQ); > - return cost * lambda + curbits; > -} > - > -static float get_band_cost_UQUAD_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512]; > - const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512]; > - int i; > - float cost = 0; > - float qenergy = 0.0f; > - int curbits = 0; > - int qc1, qc2, qc3, qc4; > - > - uint8_t *p_bits = (uint8_t*)ff_aac_spectral_bits[cb-1]; > - float *p_codes = (float *)ff_aac_codebook_vectors[cb-1]; > - > - for (i = 0; i < size; i += 4) { > - const float *vec; > - int curidx; > - float *in_pos = (float *)&in[i]; > - float di0, di1, di2, di3; > - int t0, t1, t2, t3, t4; > - > - qc1 = scaled[i ] * Q34 + ROUND_STANDARD; > - qc2 = scaled[i+1] * Q34 + ROUND_STANDARD; > - qc3 = scaled[i+2] * Q34 + ROUND_STANDARD; > - qc4 = scaled[i+3] * Q34 + ROUND_STANDARD; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "ori %[t4], $zero, 2 \n\t" > - "slt %[t0], %[t4], %[qc1] \n\t" > - "slt %[t1], %[t4], %[qc2] \n\t" > - "slt %[t2], %[t4], %[qc3] \n\t" > - "slt %[t3], %[t4], %[qc4] \n\t" > - "movn %[qc1], %[t4], %[t0] \n\t" > - "movn %[qc2], 
%[t4], %[t1] \n\t" > - "movn %[qc3], %[t4], %[t2] \n\t" > - "movn %[qc4], %[t4], %[t3] \n\t" > - > - ".set pop \n\t" > - > - : [qc1]"+r"(qc1), [qc2]"+r"(qc2), > - [qc3]"+r"(qc3), [qc4]"+r"(qc4), > - [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3), > - [t4]"=&r"(t4) > - ); > - > - curidx = qc1; > - curidx *= 3; > - curidx += qc2; > - curidx *= 3; > - curidx += qc3; > - curidx *= 3; > - curidx += qc4; > - > - curbits += p_bits[curidx]; > - curbits += uquad_sign_bits[curidx]; > - vec = &p_codes[curidx*4]; > - > - qenergy += vec[0]*vec[0] + vec[1]*vec[1] > - + vec[2]*vec[2] + vec[3]*vec[3]; > - > - __asm__ volatile ( > - ".set push \n\t" > - ".set noreorder \n\t" > - > - "lwc1 %[di0], 0(%[in_pos]) \n\t" > - "lwc1 %[di1], 4(%[in_pos]) \n\t" > - "lwc1 %[di2], 8(%[in_pos]) \n\t" > - "lwc1 %[di3], 12(%[in_pos]) \n\t" > - "abs.s %[di0], %[di0] \n\t" > - "abs.s %[di1], %[di1] \n\t" > - "abs.s %[di2], %[di2] \n\t" > - "abs.s %[di3], %[di3] \n\t" > - "lwc1 $f0, 0(%[vec]) \n\t" > - "lwc1 $f1, 4(%[vec]) \n\t" > - "lwc1 $f2, 8(%[vec]) \n\t" > - "lwc1 $f3, 12(%[vec]) \n\t" > - "nmsub.s %[di0], %[di0], $f0, %[IQ] \n\t" > - "nmsub.s %[di1], %[di1], $f1, %[IQ] \n\t" > - "nmsub.s %[di2], %[di2], $f2, %[IQ] \n\t" > - "nmsub.s %[di3], %[di3], $f3, %[IQ] \n\t" > - > - ".set pop \n\t" > - > - : [di0]"=&f"(di0), [di1]"=&f"(di1), > - [di2]"=&f"(di2), [di3]"=&f"(di3) > - : [in_pos]"r"(in_pos), [vec]"r"(vec), > - [IQ]"f"(IQ) > - : "$f0", "$f1", "$f2", "$f3", > - "memory" > - ); > - > - cost += di0 * di0 + di1 * di1 > - + di2 * di2 + di3 * di3; > - } > - > - if (bits) > - *bits = curbits; > - if (energy) > - *energy = qenergy * (IQ*IQ); > - return cost * lambda + curbits; > -} > - > -static float get_band_cost_SPAIR_mips(struct AACEncContext *s, > - PutBitContext *pb, const float *in, > - const float *scaled, int size, int scale_idx, > - int cb, const float lambda, const float uplim, > - int *bits, float *energy) > -{ > - const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx 
+ SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    float cost = 0;
> -    float qenergy = 0.0f;
> -    int qc1, qc2, qc3, qc4;
> -    int curbits = 0;
> -
> -    uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1];
> -    float *p_codes = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    for (i = 0; i < size; i += 4) {
> -        const float *vec, *vec2;
> -        int curidx, curidx2;
> -        int *in_int = (int *)&in[i];
> -        float *in_pos = (float *)&in[i];
> -        float di0, di1, di2, di3;
> -        int t0, t1, t2, t3, t4, t5, t6, t7;
> -
> -        qc1 = scaled[i ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori %[t4], $zero, 4 \n\t"
> -            "slt %[t0], %[t4], %[qc1] \n\t"
> -            "slt %[t1], %[t4], %[qc2] \n\t"
> -            "slt %[t2], %[t4], %[qc3] \n\t"
> -            "slt %[t3], %[t4], %[qc4] \n\t"
> -            "movn %[qc1], %[t4], %[t0] \n\t"
> -            "movn %[qc2], %[t4], %[t1] \n\t"
> -            "movn %[qc3], %[t4], %[t2] \n\t"
> -            "movn %[qc4], %[t4], %[t3] \n\t"
> -            "lw %[t0], 0(%[in_int]) \n\t"
> -            "lw %[t1], 4(%[in_int]) \n\t"
> -            "lw %[t2], 8(%[in_int]) \n\t"
> -            "lw %[t3], 12(%[in_int]) \n\t"
> -            "srl %[t0], %[t0], 31 \n\t"
> -            "srl %[t1], %[t1], 31 \n\t"
> -            "srl %[t2], %[t2], 31 \n\t"
> -            "srl %[t3], %[t3], 31 \n\t"
> -            "subu %[t4], $zero, %[qc1] \n\t"
> -            "subu %[t5], $zero, %[qc2] \n\t"
> -            "subu %[t6], $zero, %[qc3] \n\t"
> -            "subu %[t7], $zero, %[qc4] \n\t"
> -            "movn %[qc1], %[t4], %[t0] \n\t"
> -            "movn %[qc2], %[t5], %[t1] \n\t"
> -            "movn %[qc3], %[t6], %[t2] \n\t"
> -            "movn %[qc4], %[t7], %[t3] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4), [t5]"=&r"(t5), [t6]"=&r"(t6), [t7]"=&r"(t7)
> -            :
[in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx = 9 * qc1;
> -        curidx += qc2 + 40;
> -
> -        curidx2 = 9 * qc3;
> -        curidx2 += qc4 + 40;
> -
> -        curbits += p_bits[curidx];
> -        curbits += p_bits[curidx2];
> -
> -        vec = &p_codes[curidx*2];
> -        vec2 = &p_codes[curidx2*2];
> -
> -        qenergy += vec[0]*vec[0] + vec[1]*vec[1]
> -                + vec2[0]*vec2[0] + vec2[1]*vec2[1];
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "lwc1 $f0, 0(%[in_pos]) \n\t"
> -            "lwc1 $f1, 0(%[vec]) \n\t"
> -            "lwc1 $f2, 4(%[in_pos]) \n\t"
> -            "lwc1 $f3, 4(%[vec]) \n\t"
> -            "lwc1 $f4, 8(%[in_pos]) \n\t"
> -            "lwc1 $f5, 0(%[vec2]) \n\t"
> -            "lwc1 $f6, 12(%[in_pos]) \n\t"
> -            "lwc1 $f7, 4(%[vec2]) \n\t"
> -            "nmsub.s %[di0], $f0, $f1, %[IQ] \n\t"
> -            "nmsub.s %[di1], $f2, $f3, %[IQ] \n\t"
> -            "nmsub.s %[di2], $f4, $f5, %[IQ] \n\t"
> -            "nmsub.s %[di3], $f6, $f7, %[IQ] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [di0]"=&f"(di0), [di1]"=&f"(di1),
> -              [di2]"=&f"(di2), [di3]"=&f"(di3)
> -            : [in_pos]"r"(in_pos), [vec]"r"(vec),
> -              [vec2]"r"(vec2), [IQ]"f"(IQ)
> -            : "$f0", "$f1", "$f2", "$f3",
> -              "$f4", "$f5", "$f6", "$f7",
> -              "memory"
> -        );
> -
> -        cost += di0 * di0 + di1 * di1
> -                + di2 * di2 + di3 * di3;
> -    }
> -
> -    if (bits)
> -        *bits = curbits;
> -    if (energy)
> -        *energy = qenergy * (IQ*IQ);
> -    return cost * lambda + curbits;
> -}
> -
> -static float get_band_cost_UPAIR7_mips(struct AACEncContext *s,
> -                                       PutBitContext *pb, const float *in,
> -                                       const float *scaled, int size, int scale_idx,
> -                                       int cb, const float lambda, const float uplim,
> -                                       int *bits, float *energy)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    float cost = 0;
> -    float qenergy = 0.0f;
> -    int qc1, qc2, qc3, qc4;
> -    int curbits = 0;
> -
> -    uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1];
> -    float *p_codes = (float
*)ff_aac_codebook_vectors[cb-1];
> -
> -    for (i = 0; i < size; i += 4) {
> -        const float *vec, *vec2;
> -        int curidx, curidx2, sign1, count1, sign2, count2;
> -        int *in_int = (int *)&in[i];
> -        float *in_pos = (float *)&in[i];
> -        float di0, di1, di2, di3;
> -        int t0, t1, t2, t3, t4;
> -
> -        qc1 = scaled[i ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori %[t4], $zero, 7 \n\t"
> -            "ori %[sign1], $zero, 0 \n\t"
> -            "ori %[sign2], $zero, 0 \n\t"
> -            "slt %[t0], %[t4], %[qc1] \n\t"
> -            "slt %[t1], %[t4], %[qc2] \n\t"
> -            "slt %[t2], %[t4], %[qc3] \n\t"
> -            "slt %[t3], %[t4], %[qc4] \n\t"
> -            "movn %[qc1], %[t4], %[t0] \n\t"
> -            "movn %[qc2], %[t4], %[t1] \n\t"
> -            "movn %[qc3], %[t4], %[t2] \n\t"
> -            "movn %[qc4], %[t4], %[t3] \n\t"
> -            "lw %[t0], 0(%[in_int]) \n\t"
> -            "lw %[t1], 4(%[in_int]) \n\t"
> -            "lw %[t2], 8(%[in_int]) \n\t"
> -            "lw %[t3], 12(%[in_int]) \n\t"
> -            "slt %[t0], %[t0], $zero \n\t"
> -            "movn %[sign1], %[t0], %[qc1] \n\t"
> -            "slt %[t2], %[t2], $zero \n\t"
> -            "movn %[sign2], %[t2], %[qc3] \n\t"
> -            "slt %[t1], %[t1], $zero \n\t"
> -            "sll %[t0], %[sign1], 1 \n\t"
> -            "or %[t0], %[t0], %[t1] \n\t"
> -            "movn %[sign1], %[t0], %[qc2] \n\t"
> -            "slt %[t3], %[t3], $zero \n\t"
> -            "sll %[t0], %[sign2], 1 \n\t"
> -            "or %[t0], %[t0], %[t3] \n\t"
> -            "movn %[sign2], %[t0], %[qc4] \n\t"
> -            "slt %[count1], $zero, %[qc1] \n\t"
> -            "slt %[t1], $zero, %[qc2] \n\t"
> -            "slt %[count2], $zero, %[qc3] \n\t"
> -            "slt %[t2], $zero, %[qc4] \n\t"
> -            "addu %[count1], %[count1], %[t1] \n\t"
> -            "addu %[count2], %[count2], %[t2] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [sign1]"=&r"(sign1), [count1]"=&r"(count1),
> -              [sign2]"=&r"(sign2), [count2]"=&r"(count2),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1),
[t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4)
> -            : [in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx = 8 * qc1;
> -        curidx += qc2;
> -
> -        curidx2 = 8 * qc3;
> -        curidx2 += qc4;
> -
> -        curbits += p_bits[curidx];
> -        curbits += upair7_sign_bits[curidx];
> -        vec = &p_codes[curidx*2];
> -
> -        curbits += p_bits[curidx2];
> -        curbits += upair7_sign_bits[curidx2];
> -        vec2 = &p_codes[curidx2*2];
> -
> -        qenergy += vec[0]*vec[0] + vec[1]*vec[1]
> -                + vec2[0]*vec2[0] + vec2[1]*vec2[1];
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "lwc1 %[di0], 0(%[in_pos]) \n\t"
> -            "lwc1 %[di1], 4(%[in_pos]) \n\t"
> -            "lwc1 %[di2], 8(%[in_pos]) \n\t"
> -            "lwc1 %[di3], 12(%[in_pos]) \n\t"
> -            "abs.s %[di0], %[di0] \n\t"
> -            "abs.s %[di1], %[di1] \n\t"
> -            "abs.s %[di2], %[di2] \n\t"
> -            "abs.s %[di3], %[di3] \n\t"
> -            "lwc1 $f0, 0(%[vec]) \n\t"
> -            "lwc1 $f1, 4(%[vec]) \n\t"
> -            "lwc1 $f2, 0(%[vec2]) \n\t"
> -            "lwc1 $f3, 4(%[vec2]) \n\t"
> -            "nmsub.s %[di0], %[di0], $f0, %[IQ] \n\t"
> -            "nmsub.s %[di1], %[di1], $f1, %[IQ] \n\t"
> -            "nmsub.s %[di2], %[di2], $f2, %[IQ] \n\t"
> -            "nmsub.s %[di3], %[di3], $f3, %[IQ] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [di0]"=&f"(di0), [di1]"=&f"(di1),
> -              [di2]"=&f"(di2), [di3]"=&f"(di3)
> -            : [in_pos]"r"(in_pos), [vec]"r"(vec),
> -              [vec2]"r"(vec2), [IQ]"f"(IQ)
> -            : "$f0", "$f1", "$f2", "$f3",
> -              "memory"
> -        );
> -
> -        cost += di0 * di0 + di1 * di1
> -                + di2 * di2 + di3 * di3;
> -    }
> -
> -    if (bits)
> -        *bits = curbits;
> -    if (energy)
> -        *energy = qenergy * (IQ*IQ);
> -    return cost * lambda + curbits;
> -}
> -
> -static float get_band_cost_UPAIR12_mips(struct AACEncContext *s,
> -                                        PutBitContext *pb, const float *in,
> -                                        const float *scaled, int size, int scale_idx,
> -                                        int cb, const float lambda, const float uplim,
> -                                        int *bits, float *energy)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO +
scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    int i;
> -    float cost = 0;
> -    float qenergy = 0.0f;
> -    int qc1, qc2, qc3, qc4;
> -    int curbits = 0;
> -
> -    uint8_t *p_bits = (uint8_t *)ff_aac_spectral_bits[cb-1];
> -    float *p_codes = (float *)ff_aac_codebook_vectors[cb-1];
> -
> -    for (i = 0; i < size; i += 4) {
> -        const float *vec, *vec2;
> -        int curidx, curidx2;
> -        int sign1, count1, sign2, count2;
> -        int *in_int = (int *)&in[i];
> -        float *in_pos = (float *)&in[i];
> -        float di0, di1, di2, di3;
> -        int t0, t1, t2, t3, t4;
> -
> -        qc1 = scaled[i ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori %[t4], $zero, 12 \n\t"
> -            "ori %[sign1], $zero, 0 \n\t"
> -            "ori %[sign2], $zero, 0 \n\t"
> -            "slt %[t0], %[t4], %[qc1] \n\t"
> -            "slt %[t1], %[t4], %[qc2] \n\t"
> -            "slt %[t2], %[t4], %[qc3] \n\t"
> -            "slt %[t3], %[t4], %[qc4] \n\t"
> -            "movn %[qc1], %[t4], %[t0] \n\t"
> -            "movn %[qc2], %[t4], %[t1] \n\t"
> -            "movn %[qc3], %[t4], %[t2] \n\t"
> -            "movn %[qc4], %[t4], %[t3] \n\t"
> -            "lw %[t0], 0(%[in_int]) \n\t"
> -            "lw %[t1], 4(%[in_int]) \n\t"
> -            "lw %[t2], 8(%[in_int]) \n\t"
> -            "lw %[t3], 12(%[in_int]) \n\t"
> -            "slt %[t0], %[t0], $zero \n\t"
> -            "movn %[sign1], %[t0], %[qc1] \n\t"
> -            "slt %[t2], %[t2], $zero \n\t"
> -            "movn %[sign2], %[t2], %[qc3] \n\t"
> -            "slt %[t1], %[t1], $zero \n\t"
> -            "sll %[t0], %[sign1], 1 \n\t"
> -            "or %[t0], %[t0], %[t1] \n\t"
> -            "movn %[sign1], %[t0], %[qc2] \n\t"
> -            "slt %[t3], %[t3], $zero \n\t"
> -            "sll %[t0], %[sign2], 1 \n\t"
> -            "or %[t0], %[t0], %[t3] \n\t"
> -            "movn %[sign2], %[t0], %[qc4] \n\t"
> -            "slt %[count1], $zero, %[qc1] \n\t"
> -            "slt %[t1], $zero, %[qc2] \n\t"
> -            "slt %[count2], $zero, %[qc3] \n\t"
> -            "slt %[t2], $zero, %[qc4] \n\t"
> -            "addu %[count1], %[count1], %[t1] \n\t"
> -            "addu %[count2], %[count2], %[t2]
\n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [sign1]"=&r"(sign1), [count1]"=&r"(count1),
> -              [sign2]"=&r"(sign2), [count2]"=&r"(count2),
> -              [t0]"=&r"(t0), [t1]"=&r"(t1), [t2]"=&r"(t2), [t3]"=&r"(t3),
> -              [t4]"=&r"(t4)
> -            : [in_int]"r"(in_int)
> -            : "memory"
> -        );
> -
> -        curidx = 13 * qc1;
> -        curidx += qc2;
> -
> -        curidx2 = 13 * qc3;
> -        curidx2 += qc4;
> -
> -        curbits += p_bits[curidx];
> -        curbits += p_bits[curidx2];
> -        curbits += upair12_sign_bits[curidx];
> -        curbits += upair12_sign_bits[curidx2];
> -        vec = &p_codes[curidx*2];
> -        vec2 = &p_codes[curidx2*2];
> -
> -        qenergy += vec[0]*vec[0] + vec[1]*vec[1]
> -                + vec2[0]*vec2[0] + vec2[1]*vec2[1];
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "lwc1 %[di0], 0(%[in_pos]) \n\t"
> -            "lwc1 %[di1], 4(%[in_pos]) \n\t"
> -            "lwc1 %[di2], 8(%[in_pos]) \n\t"
> -            "lwc1 %[di3], 12(%[in_pos]) \n\t"
> -            "abs.s %[di0], %[di0] \n\t"
> -            "abs.s %[di1], %[di1] \n\t"
> -            "abs.s %[di2], %[di2] \n\t"
> -            "abs.s %[di3], %[di3] \n\t"
> -            "lwc1 $f0, 0(%[vec]) \n\t"
> -            "lwc1 $f1, 4(%[vec]) \n\t"
> -            "lwc1 $f2, 0(%[vec2]) \n\t"
> -            "lwc1 $f3, 4(%[vec2]) \n\t"
> -            "nmsub.s %[di0], %[di0], $f0, %[IQ] \n\t"
> -            "nmsub.s %[di1], %[di1], $f1, %[IQ] \n\t"
> -            "nmsub.s %[di2], %[di2], $f2, %[IQ] \n\t"
> -            "nmsub.s %[di3], %[di3], $f3, %[IQ] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [di0]"=&f"(di0), [di1]"=&f"(di1),
> -              [di2]"=&f"(di2), [di3]"=&f"(di3)
> -            : [in_pos]"r"(in_pos), [vec]"r"(vec),
> -              [vec2]"r"(vec2), [IQ]"f"(IQ)
> -            : "$f0", "$f1", "$f2", "$f3",
> -              "memory"
> -        );
> -
> -        cost += di0 * di0 + di1 * di1
> -                + di2 * di2 + di3 * di3;
> -    }
> -
> -    if (bits)
> -        *bits = curbits;
> -    if (energy)
> -        *energy = qenergy * (IQ*IQ);
> -    return cost * lambda + curbits;
> -}
> -
> -static float get_band_cost_ESC_mips(struct AACEncContext *s,
> -                                    PutBitContext *pb, const float *in,
> -                                    const float *scaled, int size, int scale_idx,
> -                                    int
cb, const float lambda, const float uplim,
> -                                    int *bits, float *energy)
> -{
> -    const float Q34 = ff_aac_pow34sf_tab[POW_SF2_ZERO - scale_idx + SCALE_ONE_POS - SCALE_DIV_512];
> -    const float IQ = ff_aac_pow2sf_tab [POW_SF2_ZERO + scale_idx - SCALE_ONE_POS + SCALE_DIV_512];
> -    const float CLIPPED_ESCAPE = 165140.0f * IQ;
> -    int i;
> -    float cost = 0;
> -    float qenergy = 0.0f;
> -    int qc1, qc2, qc3, qc4;
> -    int curbits = 0;
> -
> -    uint8_t *p_bits = (uint8_t*)ff_aac_spectral_bits[cb-1];
> -    float *p_codes = (float* )ff_aac_codebook_vectors[cb-1];
> -
> -    for (i = 0; i < size; i += 4) {
> -        const float *vec, *vec2;
> -        int curidx, curidx2;
> -        float t1, t2, t3, t4, V;
> -        float di1, di2, di3, di4;
> -        int cond0, cond1, cond2, cond3;
> -        int c1, c2, c3, c4;
> -        int t6, t7;
> -
> -        qc1 = scaled[i ] * Q34 + ROUND_STANDARD;
> -        qc2 = scaled[i+1] * Q34 + ROUND_STANDARD;
> -        qc3 = scaled[i+2] * Q34 + ROUND_STANDARD;
> -        qc4 = scaled[i+3] * Q34 + ROUND_STANDARD;
> -
> -        __asm__ volatile (
> -            ".set push \n\t"
> -            ".set noreorder \n\t"
> -
> -            "ori %[t6], $zero, 15 \n\t"
> -            "ori %[t7], $zero, 16 \n\t"
> -            "shll_s.w %[c1], %[qc1], 18 \n\t"
> -            "shll_s.w %[c2], %[qc2], 18 \n\t"
> -            "shll_s.w %[c3], %[qc3], 18 \n\t"
> -            "shll_s.w %[c4], %[qc4], 18 \n\t"
> -            "srl %[c1], %[c1], 18 \n\t"
> -            "srl %[c2], %[c2], 18 \n\t"
> -            "srl %[c3], %[c3], 18 \n\t"
> -            "srl %[c4], %[c4], 18 \n\t"
> -            "slt %[cond0], %[t6], %[qc1] \n\t"
> -            "slt %[cond1], %[t6], %[qc2] \n\t"
> -            "slt %[cond2], %[t6], %[qc3] \n\t"
> -            "slt %[cond3], %[t6], %[qc4] \n\t"
> -            "movn %[qc1], %[t7], %[cond0] \n\t"
> -            "movn %[qc2], %[t7], %[cond1] \n\t"
> -            "movn %[qc3], %[t7], %[cond2] \n\t"
> -            "movn %[qc4], %[t7], %[cond3] \n\t"
> -
> -            ".set pop \n\t"
> -
> -            : [qc1]"+r"(qc1), [qc2]"+r"(qc2),
> -              [qc3]"+r"(qc3), [qc4]"+r"(qc4),
> -              [cond0]"=&r"(cond0), [cond1]"=&r"(cond1),
> -              [cond2]"=&r"(cond2), [cond3]"=&r"(cond3),
> -              [c1]"=&r"(c1), [c2]"=&r"(c2),
> -              [c3]"=&r"(c3), [c4]"=&r"(c4),
> -              [t6]"=&r"(t6),
[t7]"=&r"(t7)
> -        );
> -
> -        curidx = 17 * qc1;
> -        curidx += qc2;
> -
> -        curidx2 = 17 * qc3;
> -        curidx2 += qc4;
> -
> -        curbits += p_bits[curidx];
> -        curbits += esc_sign_bits[curidx];
> -        vec = &p_codes[curidx*2];
> -
> -        curbits += p_bits[curidx2];
> -        curbits += esc_sign_bits[curidx2];
> -        vec2 = &p_codes[curidx2*2];
> -
> -        curbits += (av_log2(c1) * 2 - 3) & (-cond0);
> -        curbits += (av_log2(c2) * 2 - 3) & (-cond1);
> -        curbits += (av_log2(c3) * 2 - 3) & (-cond2);
> -        curbits += (av_log2(c4) * 2 - 3) & (-cond3);
> -
> -        t1 = fabsf(in[i ]);
> -        t2 = fabsf(in[i+1]);
> -        t3 = fabsf(in[i+2]);
> -        t4 = fabsf(in[i+3]);
> -
> -        if (cond0) {
> -            if (t1 >= CLIPPED_ESCAPE) {
> -                di1 = t1 - CLIPPED_ESCAPE;
> -                qenergy += CLIPPED_ESCAPE*CLIPPED_ESCAPE;
> -            } else {
> -                di1 = t1 - (V = c1 * cbrtf(c1) * IQ);
> -                qenergy += V*V;
> -            }
> -        } else {
> -            di1 = t1 - (V = vec[0] * IQ);
> -            qenergy += V*V;
> -        }
> -
> -        if (cond1) {
> -            if (t2 >= CLIPPED_ESCAPE) {
> -                di2 = t2 - CLIPPED_ESCAPE;
> -                qenergy += CLIPPED_ESCAPE*CLIPPED_ESCAPE;
> -            } else {
> -                di2 = t2 - (V = c2 * cbrtf(c2) * IQ);
> -                qenergy += V*V;
> -            }
> -        } else {
> -            di2 = t2 - (V = vec[1] * IQ);
> -            qenergy += V*V;
> -        }
> -
> -        if (cond2) {
> -            if (t3 >= CLIPPED_ESCAPE) {
> -                di3 = t3 - CLIPPED_ESCAPE;
> -                qenergy += CLIPPED_ESCAPE*CLIPPED_ESCAPE;
> -            } else {
> -                di3 = t3 - (V = c3 * cbrtf(c3) * IQ);
> -                qenergy += V*V;
> -            }
> -        } else {
> -            di3 = t3 - (V = vec2[0] * IQ);
> -            qenergy += V*V;
> -        }
> -
> -        if (cond3) {
> -            if (t4 >= CLIPPED_ESCAPE) {
> -                di4 = t4 - CLIPPED_ESCAPE;
> -                qenergy += CLIPPED_ESCAPE*CLIPPED_ESCAPE;
> -            } else {
> -                di4 = t4 - (V = c4 * cbrtf(c4) * IQ);
> -                qenergy += V*V;
> -            }
> -        } else {
> -            di4 = t4 - (V = vec2[1]*IQ);
> -            qenergy += V*V;
> -        }
> -
> -        cost += di1 * di1 + di2 * di2
> -                + di3 * di3 + di4 * di4;
> -    }
> -
> -    if (bits)
> -        *bits = curbits;
> -    return cost * lambda + curbits;
> -}
> -
> -static float (*const get_band_cost_arr[])(struct AACEncContext *s,
> -
PutBitContext *pb, const float *in,
> -                                          const float *scaled, int size, int scale_idx,
> -                                          int cb, const float lambda, const float uplim,
> -                                          int *bits, float *energy) = {
> -    get_band_cost_ZERO_mips,
> -    get_band_cost_SQUAD_mips,
> -    get_band_cost_SQUAD_mips,
> -    get_band_cost_UQUAD_mips,
> -    get_band_cost_UQUAD_mips,
> -    get_band_cost_SPAIR_mips,
> -    get_band_cost_SPAIR_mips,
> -    get_band_cost_UPAIR7_mips,
> -    get_band_cost_UPAIR7_mips,
> -    get_band_cost_UPAIR12_mips,
> -    get_band_cost_UPAIR12_mips,
> -    get_band_cost_ESC_mips,
> -    get_band_cost_NONE_mips, /* cb 12 doesn't exist */
> -    get_band_cost_ZERO_mips,
> -    get_band_cost_ZERO_mips,
> -    get_band_cost_ZERO_mips,
> -};
> -
> -#define get_band_cost( \
> -                                s, pb, in, scaled, size, scale_idx, cb, \
> -                                lambda, uplim, bits, energy) \
> -    get_band_cost_arr[cb]( \
> -                                s, pb, in, scaled, size, scale_idx, cb, \
> -                                lambda, uplim, bits, energy)
> -
> -static float quantize_band_cost(struct AACEncContext *s, const float *in,
> -                                const float *scaled, int size, int scale_idx,
> -                                int cb, const float lambda, const float uplim,
> -                                int *bits, float *energy)
> -{
> -    return get_band_cost(s, NULL, in, scaled, size, scale_idx, cb, lambda, uplim, bits, energy);
> -}
> -
> -#include "libavcodec/aacenc_quantization_misc.h"
> -
> -#include "libavcodec/aaccoder_twoloop.h"
> -
> -static void search_for_ms_mips(AACEncContext *s, ChannelElement *cpe)
> -{
> -    int start = 0, i, w, w2, g, sid_sf_boost, prev_mid, prev_side;
> -    uint8_t nextband0[128], nextband1[128];
> -    float M[128], S[128];
> -    float *L34 = s->scoefs, *R34 = s->scoefs + 128, *M34 = s->scoefs + 128*2, *S34 = s->scoefs + 128*3;
> -    const float lambda = s->lambda;
> -    const float mslambda = FFMIN(1.0f, lambda / 120.f);
> -    SingleChannelElement *sce0 = &cpe->ch[0];
> -    SingleChannelElement *sce1 = &cpe->ch[1];
> -    if (!cpe->common_window)
> -        return;
> -
> -    /** Scout out next nonzero bands */
> -    ff_init_nextband_map(sce0, nextband0);
> -    ff_init_nextband_map(sce1,
nextband1);
> -
> -    prev_mid = sce0->sf_idx[0];
> -    prev_side = sce1->sf_idx[0];
> -    for (w = 0; w < sce0->ics.num_windows; w += sce0->ics.group_len[w]) {
> -        start = 0;
> -        for (g = 0; g < sce0->ics.num_swb; g++) {
> -            float bmax = bval2bmax(g * 17.0f / sce0->ics.num_swb) / 0.0045f;
> -            if (!cpe->is_mask[w*16+g])
> -                cpe->ms_mask[w*16+g] = 0;
> -            if (!sce0->zeroes[w*16+g] && !sce1->zeroes[w*16+g] && !cpe->is_mask[w*16+g]) {
> -                float Mmax = 0.0f, Smax = 0.0f;
> -
> -                /* Must compute mid/side SF and book for the whole window group */
> -                for (w2 = 0; w2 < sce0->ics.group_len[w]; w2++) {
> -                    for (i = 0; i < sce0->ics.swb_sizes[g]; i++) {
> -                        M[i] = (sce0->coeffs[start+(w+w2)*128+i]
> -                              + sce1->coeffs[start+(w+w2)*128+i]) * 0.5;
> -                        S[i] = M[i]
> -                              - sce1->coeffs[start+(w+w2)*128+i];
> -                    }
> -                    abs_pow34_v(M34, M, sce0->ics.swb_sizes[g]);
> -                    abs_pow34_v(S34, S, sce0->ics.swb_sizes[g]);
> -                    for (i = 0; i < sce0->ics.swb_sizes[g]; i++ ) {
> -                        Mmax = FFMAX(Mmax, M34[i]);
> -                        Smax = FFMAX(Smax, S34[i]);
> -                    }
> -                }
> -
> -                for (sid_sf_boost = 0; sid_sf_boost < 4; sid_sf_boost++) {
> -                    float dist1 = 0.0f, dist2 = 0.0f;
> -                    int B0 = 0, B1 = 0;
> -                    int minidx;
> -                    int mididx, sididx;
> -                    int midcb, sidcb;
> -
> -                    minidx = FFMIN(sce0->sf_idx[w*16+g], sce1->sf_idx[w*16+g]);
> -                    mididx = av_clip(minidx, 0, SCALE_MAX_POS - SCALE_DIV_512);
> -                    sididx = av_clip(minidx - sid_sf_boost * 3, 0, SCALE_MAX_POS - SCALE_DIV_512);
> -                    if (sce0->band_type[w*16+g] != NOISE_BT && sce1->band_type[w*16+g] != NOISE_BT
> -                        && ( !ff_sfdelta_can_replace(sce0, nextband0, prev_mid, mididx, w*16+g)
> -                            || !ff_sfdelta_can_replace(sce1, nextband1, prev_side, sididx, w*16+g))) {
> -                        /* scalefactor range violation, bad stuff, will decrease quality unacceptably */
> -                        continue;
> -                    }
> -
> -                    midcb = find_min_book(Mmax, mididx);
> -                    sidcb = find_min_book(Smax, sididx);
> -
> -                    /* No CB can be zero */
> -                    midcb = FFMAX(1,midcb);
> -                    sidcb = FFMAX(1,sidcb);
> -
> -                    for (w2 = 0; w2 < sce0->ics.group_len[w];
w2++) {
> -                        FFPsyBand *band0 = &s->psy.ch[s->cur_channel+0].psy_bands[(w+w2)*16+g];
> -                        FFPsyBand *band1 = &s->psy.ch[s->cur_channel+1].psy_bands[(w+w2)*16+g];
> -                        float minthr = FFMIN(band0->threshold, band1->threshold);
> -                        int b1,b2,b3,b4;
> -                        for (i = 0; i < sce0->ics.swb_sizes[g]; i++) {
> -                            M[i] = (sce0->coeffs[start+(w+w2)*128+i]
> -                                  + sce1->coeffs[start+(w+w2)*128+i]) * 0.5;
> -                            S[i] = M[i]
> -                                  - sce1->coeffs[start+(w+w2)*128+i];
> -                        }
> -
> -                        abs_pow34_v(L34, sce0->coeffs+start+(w+w2)*128, sce0->ics.swb_sizes[g]);
> -                        abs_pow34_v(R34, sce1->coeffs+start+(w+w2)*128, sce0->ics.swb_sizes[g]);
> -                        abs_pow34_v(M34, M, sce0->ics.swb_sizes[g]);
> -                        abs_pow34_v(S34, S, sce0->ics.swb_sizes[g]);
> -                        dist1 += quantize_band_cost(s, &sce0->coeffs[start + (w+w2)*128],
> -                                                    L34,
> -                                                    sce0->ics.swb_sizes[g],
> -                                                    sce0->sf_idx[w*16+g],
> -                                                    sce0->band_type[w*16+g],
> -                                                    lambda / band0->threshold, INFINITY, &b1, NULL);
> -                        dist1 += quantize_band_cost(s, &sce1->coeffs[start + (w+w2)*128],
> -                                                    R34,
> -                                                    sce1->ics.swb_sizes[g],
> -                                                    sce1->sf_idx[w*16+g],
> -                                                    sce1->band_type[w*16+g],
> -                                                    lambda / band1->threshold, INFINITY, &b2, NULL);
> -                        dist2 += quantize_band_cost(s, M,
> -                                                    M34,
> -                                                    sce0->ics.swb_sizes[g],
> -                                                    mididx,
> -                                                    midcb,
> -                                                    lambda / minthr, INFINITY, &b3, NULL);
> -                        dist2 += quantize_band_cost(s, S,
> -                                                    S34,
> -                                                    sce1->ics.swb_sizes[g],
> -                                                    sididx,
> -                                                    sidcb,
> -                                                    mslambda / (minthr * bmax), INFINITY, &b4, NULL);
> -                        B0 += b1+b2;
> -                        B1 += b3+b4;
> -                        dist1 -= b1+b2;
> -                        dist2 -= b3+b4;
> -                    }
> -                    cpe->ms_mask[w*16+g] = dist2 <= dist1 && B1 < B0;
> -                    if (cpe->ms_mask[w*16+g]) {
> -                        if (sce0->band_type[w*16+g] != NOISE_BT && sce1->band_type[w*16+g] != NOISE_BT) {
> -                            sce0->sf_idx[w*16+g] = mididx;
> -                            sce1->sf_idx[w*16+g] = sididx;
> -                            sce0->band_type[w*16+g] = midcb;
> -                            sce1->band_type[w*16+g] = sidcb;
> -                        } else if ((sce0->band_type[w*16+g] != NOISE_BT) ^ (sce1->band_type[w*16+g] != NOISE_BT)) {
> -                            /* ms_mask unneeded, and it confuses some decoders */
> -
cpe->ms_mask[w*16+g] = 0;
> -                        }
> -                        break;
> -                    } else if (B1 > B0) {
> -                        /* More boost won't fix this */
> -                        break;
> -                    }
> -                }
> -            }
> -            if (!sce0->zeroes[w*16+g] && sce0->band_type[w*16+g] < RESERVED_BT)
> -                prev_mid = sce0->sf_idx[w*16+g];
> -            if (!sce1->zeroes[w*16+g] && !cpe->is_mask[w*16+g] && sce1->band_type[w*16+g] < RESERVED_BT)
> -                prev_side = sce1->sf_idx[w*16+g];
> -            start += sce0->ics.swb_sizes[g];
> -        }
> -    }
> -}
> -#endif /*HAVE_MIPSFPU */
> -
> -#include "libavcodec/aaccoder_trellis.h"
> -
> -#endif /* !HAVE_MIPS32R6 && !HAVE_MIPS64R6 */
> -#endif /* HAVE_INLINE_ASM */
> -
> -void ff_aac_coder_init_mips(AACEncContext *c) {
> -#if HAVE_INLINE_ASM
> -#if !HAVE_MIPS32R6 && !HAVE_MIPS64R6
> -    AACCoefficientsEncoder *e = c->coder;
> -    int option = c->options.coder;
> -
> -    if (option == 2) {
> -        e->quantize_and_encode_band = quantize_and_encode_band_mips;
> -        e->encode_window_bands_info = codebook_trellis_rate;
> -#if HAVE_MIPSFPU
> -        e->search_for_quantizers = search_for_quantizers_twoloop;
> -#endif /* HAVE_MIPSFPU */
> -    }
> -#if HAVE_MIPSFPU
> -    e->search_for_ms = search_for_ms_mips;
> -#endif /* HAVE_MIPSFPU */
> -#endif /* !HAVE_MIPS32R6 && !HAVE_MIPS64R6 */
> -#endif /* HAVE_INLINE_ASM */
> -}

Will apply tomorrow unless there are objections.

- Andreas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".