From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.ffmpeg.org (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTPS id 4846E48BAA for ; Thu, 16 Oct 2025 19:58:43 +0000 (UTC) Authentication-Results: ffbox; dkim=fail (body hash mismatch (got b'4bdxqWwJbjX9kVfQjya7gwCKBw1X6vFKKN8UOgAwLi8=', expected b'szgDBuVgWHNzYqkp1M4iQNtehIOiBKpOFEwGznZ5LSg=')) header.d=rson.riker.me header.a=rsa-sha256 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ffmpeg.org; i=@ffmpeg.org; q=dns/txt; s=mail; t=1760644716; h=date : to : message-id : mime-version : reply-to : subject : list-id : list-archive : list-archive : list-help : list-owner : list-post : list-subscribe : list-unsubscribe : from : cc : content-type : from; bh=7ZBRgrlzeTf4WrZDfzNfxCDqC/ehftukpJ+rQIAiR4g=; b=nlAKguRjZzZZv8zvEwdmWehfaeRjuOSZQfj/h5jzxCsZBel9wwIZaKon28zs+sxnWg4pe C+/W7etcY87AxpBoI2TXqvLj3raKP2CKHK4iO4KHZ0oQakQniAEkayBl1WnPWPEfL2o2Vr2 dI4zMxnBloX2yLO1LMrCVuIlDf52C0I7xqhaAKC8LqPUSnLjDPnDUuMJX17t0hj6Rx+HBrl 5dIKGB8Q4dcUhIOJfaEHzXDIW3k8mxzqCrtbXMuk94MD1gP3Mfc3uOeUS7mzoPd72FOTPdB fPUccQw0yAL9E7zBKk8d8z3816nrqzC4T6cI7oYUg8bbQRsQxlz7tq/zCuPA== Received: from [172.19.0.2] (unknown [172.19.0.2]) by ffbox0-bg.ffmpeg.org (Postfix) with ESMTP id 0E4C568F4E6; Thu, 16 Oct 2025 22:58:36 +0300 (EEST) ARC-Seal: i=1; cv=none; a=rsa-sha256; d=ffmpeg.org; s=arc; t=1760644699; b=WLPIjKwAzv2rmTXzNuc+uIgSG919mBZXCUDXNcICPYP4X32BsIxb2rCuaSEEDuTF1gD2P a3kPAsr4ffJDA0sSiuk9iEYmszEYAYpNr3TGEfGZWxCaEkC4aaAK0DUEYPCpgiuvXlctfb5 2eUBod7WwHN6ZwC9kNTu7C0w2FVnvl+m4LgncWPN+DHMOdi01Td4tfy6Q0x7UKRIKSl5ZVB 3v7BnUy+fzoQZaO3S1tqP6CoIW3JaVtAHNvR4SPsSp5PmniM/TtBkS/are1G+ty3DJhlP8Q PbKgCX6CVrZA0Um8q8nICNitaXEN0IwHFJxYdy5KxGs3+tD6H5r5sYfxXmGg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=ffmpeg.org; s=arc; t=1760644699; h=from : sender : reply-to : subject : date : message-id : to : cc : mime-version : content-type : content-transfer-encoding : content-id : 
content-description : resent-date : resent-from : resent-sender : resent-to : resent-cc : resent-message-id : in-reply-to : references : list-id : list-help : list-unsubscribe : list-subscribe : list-post : list-owner : list-archive; bh=4bdxqWwJbjX9kVfQjya7gwCKBw1X6vFKKN8UOgAwLi8=; b=dBSZaVX8XWfvduGtDGDtMXv8dewm0vdziZz/N/3TWESv1lgvnctNHccSgvTm67liSFHGa K6r/074Gb/f88xUlDNZZsuYECr51dKqporiPM5PLPW32396cGe1GxFloqAUx5KSsf38Bkm0 ZfCasUkL3O5hUsz3PQxzDJCkR3Yn3OUNbX+Fi0A1W7k3hZn7IyOXXFgtMV0yyhJY1ppXFyd vCn/mU0HBeEya/jc3zZ2pXBJhG5qPZeO/mJgxB9bg6WMRcdXeLlvOrNb+yxv757EoACXHdP NiHKQonrvbnXdaljDgpLOsCBFIY5HhuQ/u6UiobxCnYTOWORLb/Gc8+wvSGA== ARC-Authentication-Results: i=1; ffmpeg.org; dkim=pass header.d=rson.riker.me; arc=none; dmarc=pass header.from=rson.riker.me policy.dmarc=quarantine Authentication-Results: ffmpeg.org; dkim=pass header.d=rson.riker.me; arc=none (Message is not ARC signed); dmarc=pass (Used From Domain Record) header.from=rson.riker.me policy.dmarc=quarantine Received: from mail-10627.protonmail.ch (mail-10627.protonmail.ch [79.135.106.27]) by ffbox0-bg.ffmpeg.org (Postfix) with ESMTPS id 6A74168F44F for ; Thu, 16 Oct 2025 22:58:06 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rson.riker.me; s=protonmail; t=1760644683; x=1760903883; bh=szgDBuVgWHNzYqkp1M4iQNtehIOiBKpOFEwGznZ5LSg=; h=Date:To:From:Subject:Message-ID:Feedback-ID:From:To:Cc:Date: Subject:Reply-To:Feedback-ID:Message-ID:BIMI-Selector; b=Q492VeSok2EoFNlURfsVgUZq2Zq6JATWPQ6rxpGxu9w9YZfNI40ilXFHjGHqUG+C2 sHVj2Y53YIPMbbDj9kS9JlZKlKo4lgvqHPZe9JtAsHUQMNJ+ONi2766px2eGeefL3I FaiYDG7qDXRtyATxxS1a9lkK/BoiRuSihk+SnkANxhySx0fdscrok7zX81Oou0uGtU 0+1cedpW6JEz0lXKAwLBEsMICVF8Fukiu+fQMwu8+2nUgLRYw5cvfvn2Ykt1sHGoqQ onPO7S8kspDpEm1DNP4r6v9UTXJMiXooKFZ5UgfRVKFtn531tPNAkamAcC49hn0Cjv PVngGQ9te8amA== Date: Thu, 16 Oct 2025 19:57:58 +0000 To: "ffmpeg-devel@ffmpeg.org" Message-ID: Feedback-ID: 27303859:user:proton X-Pm-Message-ID: 06c15ccab7ff8e295ce5d0e66e417ed974f904b9 MIME-Version: 1.0 
Message-ID-Hash: FOZA5VRM4RLH743V6MZPZBYK6ITU7NZZ X-Message-ID-Hash: FOZA5VRM4RLH743V6MZPZBYK6ITU7NZZ X-MailFrom: SRS0=2v8a=4Z=rson.riker.me=c@ffmpeg.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; header-match-ffmpeg-devel.ffmpeg.org-0; header-match-ffmpeg-devel.ffmpeg.org-1; header-match-ffmpeg-devel.ffmpeg.org-2; header-match-ffmpeg-devel.ffmpeg.org-3; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list Reply-To: FFmpeg development discussions and patches Subject: [FFmpeg-devel] Updates to RBSP Depayloading List-Id: FFmpeg development discussions and patches Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: From: Carson Riker via ffmpeg-devel Cc: Carson Riker Content-Type: multipart/mixed; boundary="===============8041075946092224227==" Archived-At: List-Archive: List-Post: This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --===============8041075946092224227== Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha512; boundary="------092303cd9098fd700bdb5f83b09ae61eb497207189c6c4b80b9c36b18194b708"; charset=utf-8 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --------092303cd9098fd700bdb5f83b09ae61eb497207189c6c4b80b9c36b18194b708 Content-Type: multipart/mixed;boundary=---------------------5599a86dfca019acf59ee521479d0984 -----------------------5599a86dfca019acf59ee521479d0984 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain;charset=utf-8

Hello!

I am preparing to submit a patch for ff_h2645_extract_rbsp in
libavcodec/h2645_parse.c. I believe there are worthwhile performance gains
to be made by tweaking the large-stride RBSP escape code detection logic.
This would be my first contribution to ffmpeg, so I have several questions
about best practice. Please forgive me if this is the incorrect venue for
these questions.

The current approach uses unaligned 64-bit (8-byte) loads and some bit
tricks to detect zero bytes. Because an RBSP escape sequence is 0x00, 0x00,
0x03, it always contains two adjacent zero bytes, so you only need to test
every other byte. This is paired with a 9-byte stride, to minimize the
number of bytes inspected.

My own testing suggests this is shortsighted: most platforms that support
unaligned 64-bit loads appear to perform two aligned loads and mask them
together. Combined with the odd stride, this means we are performing
roughly twice the required number of loads. My assumptions were supported
by some anecdotal testing on my own CPU, but I am not confident that my
statement is true for all targets.

Updating the scan to seek to an 8-byte-aligned offset, and then proceed in
aligned 8-byte strides, appears to confer a significant performance
improvement. My tests showed a 32% performance improvement, but this was a
custom harness that just does RBSP depayloading, and again, my CPU (AMD
EPYC 9754) is not necessarily a representative example.

In general, though, I expect the performance improvements to hold for
_modern_ machines, but they could hurt performance on 'older' targets. My
rough understanding is that lookahead fetch engines became common ~2010.
Targets without robust lookahead fetchers may see a slight performance hit.

My questions are twofold:

1. What validation of the performance improvement should I perform before
   writing and submitting a patch? Assuming I am confident that the patch
   yields a performance improvement on most modern targets, would ffmpeg
   welcome this change? Or is the performance stability of legacy
   platforms too important?

2. How should I approach big-endian support? As my patch would involve a
   lot of 64-bit literals, I could either:
   - (1) prioritize code size, providing one implementation, wrapping
     loads with a macro that converts BE to LE as needed. This would be
     cleanest imho, but incur a slight penalty on big-endian targets.
   - (2) prioritize performance, providing two mostly-identical
     implementations, one for each endianness.

Thanks for the time,
Carson Riker

-----------------------5599a86dfca019acf59ee521479d0984-- --------092303cd9098fd700bdb5f83b09ae61eb497207189c6c4b80b9c36b18194b708 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: ProtonMail wsC5BAEBCgBtBYJo8U44CRAki1CC1Jg4hkUUAAAAAAAcACBzYWx0QG5vdGF0 aW9ucy5vcGVucGdwanMub3JnSIFLicdSo9xfKtWKPTRHFuwvvLOhL35eLo57 KJlWzxMWIQRsScoBwwFPaNeVcFUki1CC1Jg4hgAAj2kH/RR8sSbl/7UxNx1F bcbt9/ykIs7Po9297gO8xtTiG2jFIsWMdB8XwqN1oBEkomKnq1KtJ8VJLLuJ UCT2bPTqDpxWy0LBnhsjKaPwK7g19V6dnCyrev1nbmMbb19LkJbHL6eezaVh 6Xk/2b2ma5TXR2uOzZ0f7zQIRqeojmDdKHpC51J39Bysk+fJfAwlBM2kA0DB I8pyQ2pH2ziqwDU5iTcl0NBMbokFPC7itMbCUoS26sK2jS5/6YaWCHXB5oYU 9Z8trsrsWMrhm9xrl/UPntFJDG4Tc4NMPkKU6VqNxzFcKhyDvS/sWxyfdm7E D8XgsmL5jN3Cc/7BHKdZyMfGYZI= =iXYO -----END PGP SIGNATURE----- --------092303cd9098fd700bdb5f83b09ae61eb497207189c6c4b80b9c36b18194b708-- --===============8041075946092224227== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org --===============8041075946092224227==--