From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTP id 222B149CD1 for ; Fri, 8 Mar 2024 09:46:27 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2023368CE33; Fri, 8 Mar 2024 11:46:25 +0200 (EET) Received: from mail-qv1-f47.google.com (mail-qv1-f47.google.com [209.85.219.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 4064D68CD6D for ; Fri, 8 Mar 2024 11:46:18 +0200 (EET) Received: by mail-qv1-f47.google.com with SMTP id 6a1803df08f44-690b8788b12so1252326d6.2 for ; Fri, 08 Mar 2024 01:46:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1709891176; x=1710495976; darn=ffmpeg.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=KmfcB5qVkT2tsJ4mf2YN8xT7NTpqgEKTKh57qE2N2HY=; b=MP0WhN70c++gj9OxYjIlbKj509CtLRYlwfGriP4rI6a20O0Xpstncddq8AhtPX3u6W /leqIyrdHw0V2qPzy/+kv2aTwvIQmr/Yusk6ojenuoQCB8Zm3+psSQ3v7pC0GDTWHQVv HzXYrvKa5AKpRdKR9o0ZW9e8/RqdwFhxpss5Em6RGbJ3aKpO140rXvqhgEPysgeXsMwx 5+hqSj+vApb6WKuYcZGHpGIFJXft87KmV5WGzLJTaMsjJ9z9mkPtDbKF3FD0yhmJ3IRP /gmgIfkXfH+a5ue92r1ILlwWc7ACz48iqMTo+3+VJIuRcBbJLW/txJndZA6xmqUzSJyN vrgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709891176; x=1710495976; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=KmfcB5qVkT2tsJ4mf2YN8xT7NTpqgEKTKh57qE2N2HY=; b=E1sSd1ulheNZhf4zZvl1y8PujAK7MgY3aku0JUh4lHMQOpF8Lux4JIgRi+Tc5RgQjb 3Tyyab1qQiGyw6D4o5sE1s3TnqSVJLWJ+2h6cmrDO6NEu2eMIWT8m9CJQfJW4YdxhmIR QecAT9nNlEiPB+G0f61WjFUd8yHW9gOx3/QgsbNxTiHls6jkqSel3WLSFRtsmdYA5kYC lO1SQEtlKzC3cxGuPWOd6mEBjLh7Xrko51D/SD84AOFHV9eGBHeu2sWftwr0ZZXFEouf CJc+VEgTdGH/dp3A2tdv7iGDi/VZWL8EnLhWehy+nGYbTfoBHCVuZ7rwFpfg0xvdBIbi yG6A== 
X-Gm-Message-State: AOJu0YxhV/mbe5dWoMXCnqBFRqIZCF/Sd6wSRyUHJxMR8Al+ZXGXoxCO VlZp+20eUO8lAHHSzql+cCo3VqiTo0lAkgqlezm+f6AmiBkO9g3QERUPkX8BO55SQ4jfaJntbKV d+6obCmkWyOLIEE1y+Z5477APPD3Kjxc26qYRV4zP X-Google-Smtp-Source: AGHT+IF0tRtG3v/IIBLY3ktiPQ3BeYZbtv5Nauhz4Lw8TLuap4inO9FejFoHASWLVopJDGUbfHpBGARXiLvd4isbaz0= X-Received: by 2002:a0c:ec04:0:b0:690:af7c:e9e0 with SMTP id y4-20020a0cec04000000b00690af7ce9e0mr2346657qvo.55.1709891176304; Fri, 08 Mar 2024 01:46:16 -0800 (PST) MIME-Version: 1.0 References: <2096762.rT49G5IHzF@basile.remlab.net> <2D1B8B00-6E58-4C67-B3C9-7C49517E65FF@remlab.net> In-Reply-To: <2D1B8B00-6E58-4C67-B3C9-7C49517E65FF@remlab.net> From: flow gg Date: Fri, 8 Mar 2024 17:46:05 +0800 Message-ID: To: FFmpeg development discussions and patches Content-Type: multipart/mixed; boundary="0000000000002b4f8506132310e2" X-Content-Filtered-By: Mailman/MimeDel 2.1.29 Subject: Re: [FFmpeg-devel] [PATCH 2/2] lavc/vc1dsp: R-V V mspel_pixels X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Archived-At: List-Archive: List-Post: --0000000000002b4f8506132310e2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Alright, I switched to m8, but for now I have not added code to avoid the
address-register dependencies in the loops, since they have only a minor
impact. The updated patch is attached to this reply.

On Fri, 8 Mar 2024 at 17:08, Rémi Denis-Courmont wrote:

>
> On 8 March 2024 02:45:46 GMT+02:00, flow gg wrote:
> >> Isn't it also faster to max LMUL for the adds here?
> >
> >It requires the use of one more vset, making the time slightly longer:
> >147.7 (m1), 148.7 (m8 + vset).
>
> A variation of 0.6% on a single set of kernels will end up below
> measurement noise in real overall codec usage. And then reducing the
> I-cache contention can improve performance in other ways. Larger LMUL
> should also improve performance on bigger cores with more ALUs. So it's not
> all black and white.
>
> My personal preference is to keep the code small if it makes almost no
> difference but I'm not BDFL.
>
> >Also this might not be much noticeable on C908, but avoiding sequential
> >dependencies on the address registers may help. I mean, avoid using as
> >address operand a value that was calculated by the immediate previous
> >instruction.
> >
> >> Okay, but the test results haven't changed..
> >It would add more than ten lines of code, perhaps shorter code would be
> >better?
>
> I don't know. There are definitely in-order vector cores coming, and data
> dependencies will hurt them. But I don't know if anyone will care about
> FFmpeg on those.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
> --0000000000002b4f8506132310e2 Content-Type: text/x-patch; charset="US-ASCII"; name="0002-lavc-vc1dsp-R-V-V-mspel_pixels.patch" Content-Disposition: attachment; filename="0002-lavc-vc1dsp-R-V-V-mspel_pixels.patch" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_ltih1f1y0 RnJvbSA0N2FlMjMzZTZiYjhmNTJkZDdkOTJhYzA2MmJlZDFhYzg1YWM0OWEwIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBzdW55dWVjaGkgPHN1bnl1ZWNoaUBpc2Nhcy5hYy5jbj4KRGF0 ZTogV2VkLCAyOCBGZWIgMjAyNCAxNjozMjozOSArMDgwMApTdWJqZWN0OiBbUEFUQ0ggMi8yXSBs YXZjL3ZjMWRzcDogUi1WIFYgbXNwZWxfcGl4ZWxzCgp2YzFkc3AuYXZnX3ZjMV9tc3BlbF9waXhl bHNfdGFiWzBdWzBdX2M6IDg2OS43CnZjMWRzcC5hdmdfdmMxX21zcGVsX3BpeGVsc190YWJbMF1b MF1fcnZ2X2kzMjogMTQ4LjcKdmMxZHNwLmF2Z192YzFfbXNwZWxfcGl4ZWxzX3RhYlsxXVswXV9j OiAyMjAuNQp2YzFkc3AuYXZnX3ZjMV9tc3BlbF9waXhlbHNfdGFiWzFdWzBdX3J2dl9pNjQ6IDU2 LjIKdmMxZHNwLnB1dF92YzFfbXNwZWxfcGl4ZWxzX3RhYlswXVswXV9jOiA1MjMuNwp2YzFkc3Au cHV0X3ZjMV9tc3BlbF9waXhlbHNfdGFiWzBdWzBdX3J2dl9pMzI6IDgyLjAKdmMxZHNwLnB1dF92 YzFfbXNwZWxfcGl4ZWxzX3RhYlsxXVswXV9jOiAxMzguNQp2YzFkc3AucHV0X3ZjMV9tc3BlbF9w aXhlbHNfdGFiWzFdWzBdX3J2dl9pNjQ6IDIzLjcKCnZjMXRtcAotLS0KIGxpYmF2Y29kZWMvcmlz Y3YvdmMxZHNwX2luaXQuYyB8ICA4ICsrKysrCiBsaWJhdmNvZGVjL3Jpc2N2L3ZjMWRzcF9ydnYu UyAgfCA2NiArKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrCiAyIGZpbGVzIGNoYW5n ZWQsIDc0IGluc2VydGlvbnMoKykKCmRpZmYgLS1naXQgYS9saWJhdmNvZGVjL3Jpc2N2L3ZjMWRz cF9pbml0LmMgYi9saWJhdmNvZGVjL3Jpc2N2L3ZjMWRzcF9pbml0LmMKaW5kZXggZTQ3YjY0NGY4 MC4uNjEwYzQzYTFhMyAxMDA2NDQKLS0tIGEvbGliYXZjb2RlYy9yaXNjdi92YzFkc3BfaW5pdC5j CisrKyBiL2xpYmF2Y29kZWMvcmlzY3YvdmMxZHNwX2luaXQuYwpAQCAtMjksNiArMjksMTAgQEAg dm9pZCBmZl92YzFfaW52X3RyYW5zXzh4OF9kY19ydnYodWludDhfdCAqZGVzdCwgcHRyZGlmZl90 IHN0cmlkZSwgaW50MTZfdCAqYmxvY2sKIHZvaWQgZmZfdmMxX2ludl90cmFuc180eDhfZGNfcnZ2 KHVpbnQ4X3QgKmRlc3QsIHB0cmRpZmZfdCBzdHJpZGUsIGludDE2X3QgKmJsb2NrKTsKIHZvaWQg ZmZfdmMxX2ludl90cmFuc184eDRfZGNfcnZ2KHVpbnQ4X3QgKmRlc3QsIHB0cmRpZmZfdCBzdHJp ZGUsIGludDE2X3QgKmJsb2NrKTsKIHZvaWQgZmZfdmMxX2ludl90cmFuc180eDRfZGNfcnZ2KHVp 
bnQ4X3QgKmRlc3QsIHB0cmRpZmZfdCBzdHJpZGUsIGludDE2X3QgKmJsb2NrKTsKK3ZvaWQgZmZf cHV0X3BpeGVsczE2eDE2X3J2dih1aW50OF90ICpkc3QsIGNvbnN0IHVpbnQ4X3QgKnNyYywgcHRy ZGlmZl90IGxpbmVfc2l6ZSwgaW50IHJuZCk7Cit2b2lkIGZmX3B1dF9waXhlbHM4eDhfcnZ2KHVp bnQ4X3QgKmRzdCwgY29uc3QgdWludDhfdCAqc3JjLCBwdHJkaWZmX3QgbGluZV9zaXplLCBpbnQg cm5kKTsKK3ZvaWQgZmZfYXZnX3BpeGVsczE2eDE2X3J2dih1aW50OF90ICpkc3QsIGNvbnN0IHVp bnQ4X3QgKnNyYywgcHRyZGlmZl90IGxpbmVfc2l6ZSwgaW50IHJuZCk7Cit2b2lkIGZmX2F2Z19w aXhlbHM4eDhfcnZ2KHVpbnQ4X3QgKmRzdCwgY29uc3QgdWludDhfdCAqc3JjLCBwdHJkaWZmX3Qg bGluZV9zaXplLCBpbnQgcm5kKTsKIAogYXZfY29sZCB2b2lkIGZmX3ZjMWRzcF9pbml0X3Jpc2N2 KFZDMURTUENvbnRleHQgKmRzcCkKIHsKQEAgLTM4LDkgKzQyLDEzIEBAIGF2X2NvbGQgdm9pZCBm Zl92YzFkc3BfaW5pdF9yaXNjdihWQzFEU1BDb250ZXh0ICpkc3ApCiAgICAgaWYgKGZsYWdzICYg QVZfQ1BVX0ZMQUdfUlZWX0kzMiAmJiBmZl9nZXRfcnZfdmxlbmIoKSA+PSAxNikgewogICAgICAg ICBkc3AtPnZjMV9pbnZfdHJhbnNfNHg4X2RjID0gZmZfdmMxX2ludl90cmFuc180eDhfZGNfcnZ2 OwogICAgICAgICBkc3AtPnZjMV9pbnZfdHJhbnNfNHg0X2RjID0gZmZfdmMxX2ludl90cmFuc180 eDRfZGNfcnZ2OworICAgICAgICBkc3AtPnB1dF92YzFfbXNwZWxfcGl4ZWxzX3RhYlswXVswXSA9 IGZmX3B1dF9waXhlbHMxNngxNl9ydnY7CisgICAgICAgIGRzcC0+YXZnX3ZjMV9tc3BlbF9waXhl bHNfdGFiWzBdWzBdID0gZmZfYXZnX3BpeGVsczE2eDE2X3J2djsKICAgICAgICAgaWYgKGZsYWdz ICYgQVZfQ1BVX0ZMQUdfUlZWX0k2NCkgewogICAgICAgICAgICAgZHNwLT52YzFfaW52X3RyYW5z Xzh4OF9kYyA9IGZmX3ZjMV9pbnZfdHJhbnNfOHg4X2RjX3J2djsKICAgICAgICAgICAgIGRzcC0+ dmMxX2ludl90cmFuc184eDRfZGMgPSBmZl92YzFfaW52X3RyYW5zXzh4NF9kY19ydnY7CisgICAg ICAgICAgICBkc3AtPnB1dF92YzFfbXNwZWxfcGl4ZWxzX3RhYlsxXVswXSA9IGZmX3B1dF9waXhl bHM4eDhfcnZ2OworICAgICAgICAgICAgZHNwLT5hdmdfdmMxX21zcGVsX3BpeGVsc190YWJbMV1b MF0gPSBmZl9hdmdfcGl4ZWxzOHg4X3J2djsKICAgICAgICAgfQogICAgIH0KICNlbmRpZgpkaWZm IC0tZ2l0IGEvbGliYXZjb2RlYy9yaXNjdi92YzFkc3BfcnZ2LlMgYi9saWJhdmNvZGVjL3Jpc2N2 L3ZjMWRzcF9ydnYuUwppbmRleCA0YTAwOTQ1ZWFkLi40ODI0NGY5MWFhIDEwMDY0NAotLS0gYS9s aWJhdmNvZGVjL3Jpc2N2L3ZjMWRzcF9ydnYuUworKysgYi9saWJhdmNvZGVjL3Jpc2N2L3ZjMWRz 
cF9ydnYuUwpAQCAtMTExLDMgKzExMSw2OSBAQCBmdW5jIGZmX3ZjMV9pbnZfdHJhbnNfNHg0X2Rj X3J2diwgenZlMzJ4CiAgICAgICAgIHZzc2UzMi52ICAgICAgdjAsIChhMCksIGExCiAgICAgICAg IHJldAogZW5kZnVuYworCitmdW5jIGZmX3B1dF9waXhlbHMxNngxNl9ydnYsIHp2ZTMyeAorICAg ICAgICB2c2V0aXZsaSAgICAgIHplcm8sIDE2LCBlOCwgbTEsIHRhLCBtYQorICAgICAgICAuaXJw IG4gMTYsIDE3LCAxOCwgMTksIDIwLCAyMSwgMjIsIDIzLCAyNCwgMjUsIDI2LCAyNywgMjgsIDI5 LCAzMAorICAgICAgICB2bGU4LnYgICAgICAgIHZcbiwgKGExKQorICAgICAgICBhZGQgICAgICAg ICAgIGExLCBhMSwgYTIKKyAgICAgICAgLmVuZHIKKyAgICAgICAgdmxlOC52ICAgICAgICB2MzEs IChhMSkKKyAgICAgICAgLmlycCBuIDE2LCAxNywgMTgsIDE5LCAyMCwgMjEsIDIyLCAyMywgMjQs IDI1LCAyNiwgMjcsIDI4LCAyOSwgMzAKKyAgICAgICAgdnNlOC52ICAgICAgICB2XG4sIChhMCkK KyAgICAgICAgYWRkICAgICAgICAgICBhMCwgYTAsIGEyCisgICAgICAgIC5lbmRyCisgICAgICAg IHZzZTgudiAgICAgICAgdjMxLCAoYTApCisKKyAgICAgICAgcmV0CitlbmRmdW5jCisKK2Z1bmMg ZmZfcHV0X3BpeGVsczh4OF9ydnYsIHp2ZTY0eAorICAgICAgICB2c2V0aXZsaSAgICAgIHplcm8s IDgsIGU4LCBtZjIsIHRhLCBtYQorICAgICAgICB2bHNlNjQudiAgICAgIHY4LCAoYTEpLCBhMgor ICAgICAgICB2c3NlNjQudiAgICAgIHY4LCAoYTApLCBhMgorCisgICAgICAgIHJldAorZW5kZnVu YworCitmdW5jIGZmX2F2Z19waXhlbHMxNngxNl9ydnYsIHp2ZTMyeAorICAgICAgICBjc3J3aSAg ICAgICAgIHZ4cm0sIDAKKyAgICAgICAgdnNldGl2bGkgICAgICB6ZXJvLCAxNiwgZTgsIG0xLCB0 YSwgbWEKKyAgICAgICAgbGkgICAgICAgICAgICB0MCwgMTI4CisKKyAgICAgICAgLmlycCBuIDE2 LCAxNywgMTgsIDE5LCAyMCwgMjEsIDIyLCAyMywgMjQsIDI1LCAyNiwgMjcsIDI4LCAyOSwgMzAK KyAgICAgICAgdmxlOC52ICAgICAgICB2XG4sIChhMSkKKyAgICAgICAgYWRkICAgICAgICAgICBh MSwgYTEsIGEyCisgICAgICAgIC5lbmRyCisgICAgICAgIHZsZTgudiAgICAgICAgdjMxLCAoYTEp CisgICAgICAgIC5pcnAgbiAwLCAxLCAyLCAzLCA0LCA1LCA2LCA3LCA4LCA5LCAxMCwgMTEsIDEy LCAxMywgMTQKKyAgICAgICAgdmxlOC52ICAgICAgICB2XG4sIChhMCkKKyAgICAgICAgYWRkICAg ICAgICAgICBhMCwgYTAsIGEyCisgICAgICAgIC5lbmRyCisgICAgICAgIHZsZTgudiAgICAgICAg djE1LCAoYTApCisgICAgICAgIHZzZXR2bGkgICAgICAgemVybywgdDAsIGU4LCBtOCwgdGEsIG1h CisgICAgICAgIHZhYWRkdS52diAgICAgdjAsIHYwLCB2MTYKKyAgICAgICAgdmFhZGR1LnZ2ICAg 
ICB2OCwgdjgsIHYyNAorICAgICAgICB2c2V0aXZsaSAgICAgIHplcm8sIDE2LCBlOCwgbTEsIHRh LCBtYQorICAgICAgICAuaXJwIG4gIDE1LCAxNCwgMTMsIDEyLCAxMSwgMTAsIDksIDgsIDcsIDYs IDUsIDQsIDMsIDIsIDEKKyAgICAgICAgdnNlOC52ICAgICAgICB2XG4sIChhMCkKKyAgICAgICAg c3ViICAgICAgICAgICBhMCwgYTAsIGEyCisgICAgICAgIC5lbmRyCisgICAgICAgIHZzZTgudiAg ICAgICAgdjAsIChhMCkKKworICAgICAgICByZXQKK2VuZGZ1bmMKKworZnVuYyBmZl9hdmdfcGl4 ZWxzOHg4X3J2diwgenZlNjR4CisgICAgICAgIGNzcndpICAgICAgICAgdnhybSwgMAorICAgICAg ICBsaSAgICAgICAgICAgIHQwLCA2NAorICAgICAgICB2c2V0aXZsaSAgICAgIHplcm8sIDgsIGU4 LCBtZjIsIHRhLCBtYQorICAgICAgICB2bHNlNjQudiAgICAgIHYxNiwgKGExKSwgYTIKKyAgICAg ICAgdmxzZTY0LnYgICAgICB2OCwgKGEwKSwgYTIKKyAgICAgICAgdnNldHZsaSAgICAgICB6ZXJv LCB0MCwgZTgsIG00LCB0YSwgbWEKKyAgICAgICAgdmFhZGR1LnZ2ICAgICB2MTYsIHYxNiwgdjgK KyAgICAgICAgdnNldGl2bGkgICAgICB6ZXJvLCA4LCBlOCwgbWYyLCB0YSwgbWEKKyAgICAgICAg dnNzZTY0LnYgICAgICB2MTYsIChhMCksIGEyCisKKyAgICAgICAgcmV0CitlbmRmdW5jCi0tIAoy LjQ0LjAKCg== --0000000000002b4f8506132310e2 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --0000000000002b4f8506132310e2--
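
Editorial note for readers following the thread without decoding the attachment: the "adds" whose LMUL is being discussed are the vaaddu.vv instructions in the avg kernels of the attached patch, which run after "csrwi vxrm, 0". With vxrm set to 0 (round-to-nearest-up), vaaddu.vv computes a rounding average per byte. The following is a minimal scalar sketch of that operation in C. The names avg_rnu and avg_pixels_ref and the simplified signature are assumptions made for illustration; this is not FFmpeg's C reference and not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounding average of two unsigned bytes: (a + b + 1) >> 1.
 * This matches the per-element result of vaaddu.vv when vxrm is 0
 * (round-to-nearest-up). The helper name is hypothetical. */
static inline uint8_t avg_rnu(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Hypothetical scalar reference: average a size x size block of src
 * into dst in place. Both buffers are assumed to share one stride,
 * as the single line_size argument in the patch suggests. */
static void avg_pixels_ref(uint8_t *dst, const uint8_t *src,
                           ptrdiff_t stride, int size)
{
    assert(size == 8 || size == 16);
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = avg_rnu(dst[x], src[x]);
        dst += stride;  /* advance both pointers one row */
        src += stride;
    }
}
```

In the vector version, the m1 versus m8 question is only about how many of these row-wise averages a single vaaddu.vv covers. The arithmetic per pixel is the same either way.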