From: Henrik Gramner
Date: Fri, 21 Oct 2022 15:57:54 +0200
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] RFC: v210enc optimisations and initial AVX-512

On Fri, Oct 21, 2022 at 5:41 AM Kieran Kunhya wrote:
>
> Hi,
>
> Please see attached an attempt to optimise the 8-bit input to v210enc to
> reduce the number of shuffles.
> This comes at the cost of having to extract the middle element and perform
> a DWORD shift on it and then reinserting it.
> I have added a few comments but any other ideas are welcome.

Random untested idea:

A: db 32,  0, 48, -1,  1, 33,  2, -1, 49,  3, 34, -1,  4, 50,  5, -1
   db 35,  6, 51, -1,  7, 36,  8, -1, 52,  9, 37, -1, 10, 53, 11, -1
   db 38, 12, 54, -1, 13, 39, 14, -1, 55, 15, 40, -1, 16, 56, 17, -1
   db 41, 18, 57, -1, 19, 42, 20, -1, 58, 21, 43, -1, 22, 59, 23, -1
B: db 1, 0, 16, 0
C: dd 0x0003fc00

[...]
    mova           m2, [A]
    vpbroadcastd   m3, [B]
    vpbroadcastd   m6, [C]
[...]
.loop:
    movu          ym1, [yq]
    vinserti32x4   m1, [uq], 2
    vinserti32x4   m1, [vq], 3
    CLIPUB         m1, m4, m5
    vpermb         m1, m2, m1
    pmaddubsw      m0, m1, m3
    pslld          m1, 2
    vpternlogd     m0, m1, m6, 0xca
    movu       [dstq], m0

I guess it could also be scaled to ymm if you're a big Skylake fan :P
(in which case you'd probably want to reorder the shuffle indices so
that chroma comes first, i.e. movq [u] + movhps [v] + vinserti32x4 [y])
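For anyone wanting to check the constants before testing the asm, here is a minimal scalar sketch in C of what one 32-bit lane goes through after the byte shuffle: the multiply-add that places the outer two samples, the dword shift that lines up the middle sample, and the masked merge. The function names (pack_lane, pack_lane_posted, check_pack_lane) are made up for illustration and are not part of FFmpeg. The merge is written as a plain mask select, which is an assumption about what the vpternlogd step is meant to do; the immediate itself is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 32-bit lane. After the byte shuffle the lane holds
 * { lo, mid, hi, junk } in bytes 0..3, where junk is whatever byte the -1
 * index happened to fetch. All names here are hypothetical. */
static uint32_t pack_lane(uint8_t lo, uint8_t mid, uint8_t hi, uint8_t junk,
                          uint8_t mul_lo, uint8_t mul_hi,
                          unsigned shift, uint32_t mask)
{
    /* pmaddubsw with multipliers { mul_lo, 0, mul_hi, 0 }: every 16-bit word
     * is the sum of two byte products, so mid and junk are multiplied by 0.
     * The products used below stay well under the signed 16-bit limit, so
     * saturation is not modelled. */
    uint32_t word0 = (uint32_t)lo * mul_lo + (uint32_t)mid  * 0;
    uint32_t word1 = (uint32_t)hi * mul_hi + (uint32_t)junk * 0;
    uint32_t outer = word0 | (word1 << 16);

    /* pslld: shift the whole shuffled lane left to position the middle byte. */
    uint32_t lane = (uint32_t)lo | ((uint32_t)mid << 8) |
                    ((uint32_t)hi << 16) | ((uint32_t)junk << 24);
    uint32_t shifted = lane << shift;

    /* Merge (assumed intent): shifted lane where the mask is set,
     * multiply-add result everywhere else. */
    return (shifted & mask) | (outer & ~mask);
}

/* Constants as posted: B = { 1, 0, 16, 0 }, shift 2, C = 0x0003fc00. */
static uint32_t pack_lane_posted(uint8_t lo, uint8_t mid, uint8_t hi,
                                 uint8_t junk)
{
    return pack_lane(lo, mid, hi, junk, 1, 16, 2, 0x0003fc00u);
}

/* Self-check of the two properties the sketch is meant to show. */
static void check_pack_lane(void)
{
    /* The three samples land at bit offsets 0, 10 and 20. */
    assert(pack_lane_posted(0x12, 0x34, 0x56, 0x00) ==
           (0x12u | (0x34u << 10) | (0x56u << 20)));
    /* The junk byte never reaches the output. */
    assert(pack_lane_posted(0x12, 0x34, 0x56, 0xff) ==
           pack_lane_posted(0x12, 0x34, 0x56, 0x00));
}
```

With the posted constants the model gives lo | mid << 10 | hi << 20, i.e. the 8-bit values sit unscaled in the three 10-bit fields. If the samples are supposed to be expanded to the 10-bit range, calling pack_lane with multipliers 4 and 64, shift 4 and mask 0x000ff000 produces that instead. That is an observation about this model, not part of the posted idea, and worth verifying against the C reference.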