From: Marcos Del Sol
To: Tomas Härdin
Cc: FFmpeg development discussions and patches
Date: Tue, 10 Jun 2025 11:42:40 +0000 (UTC)
Subject: Re: [FFmpeg-devel] [PATCH] avformat/webvttdec: improve WebVTT parsing
Message-ID: <125650946.251443723.1749555760249.JavaMail.zimbra@orca.pet>
In-Reply-To: <711117349.249383674.1749513064741.JavaMail.zimbra@orca.pet>

> WebVTT is supposed to be an extensible format. Limiting to a small set of
> known values and silently aborting when anything new is introduced does
> not seem like the best option to me. Web browsers do not stop rendering
> pages when they see a new, unknown HTML tag or CSS option.

About this: for interoperability reasons I have checked several other
WebVTT parser implementations, and they all silently ignore unknown chunks:

- WebKit (https://github.com/WebKit/WebKit/blob/main/Source/WebCore/html/track/WebVTTParser.cpp#L224-L227)
- Chromium's Blink parser (https://chromium.googlesource.com/chromium/src/+/main/third_party/blink/renderer/core/html/track/vtt/vtt_parser.cc#199)
- Mozilla's Gecko (https://searchfox.org/mozilla-central/source/dom/media/webvtt/vtt.sys.mjs#1674-1677)
- VLC (https://code.videolan.org/videolan/vlc/-/blob/master/modules/codec/webvtt/webvtt.c)

The only one that reports an error, although it still keeps parsing
afterwards, is W3C's own WebVTT.js parser
(https://github.com/w3c/webvtt.js/blob/main/parser.js#L150-L153).

If that's not enough, part 2.1 of the WebVTT specification says, quoted
verbatim:

"The parsing rules are more tolerant to author errors than the syntax
allows, in order to provide for extensibility and to still render cues
that have some syntax errors."

Basically, the standard asks implementations to follow Postel's law: "Be
conservative in what you send, be liberal in what you accept". FFmpeg
should do that.
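
To make the "silently ignore unknown chunks" behaviour concrete, here is a
minimal Python sketch. It is neither FFmpeg code nor the parsing algorithm
from the specification: the function name parse_webvtt_lenient, the
KNOWN_BLOCKS tuple, and the blank-line block splitting are simplifying
assumptions chosen for illustration. It only shows the general idea of
keeping the cues and skipping any block the parser does not recognise
instead of aborting.

```python
import re

# Block keywords this sketch recognises but does not interpret (assumption:
# a real parser would handle each of these properly).
KNOWN_BLOCKS = ("NOTE", "STYLE", "REGION")


def parse_webvtt_lenient(text):
    """Return (cues, skipped) for a WebVTT document.

    cues    -- list of (start, end, payload) string tuples
    skipped -- first line of every unknown block that was ignored
    """
    # Normalise line endings, then split into blocks on blank lines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    blocks = [b for b in re.split(r"\n{2,}", text.strip("\n")) if b.strip()]
    if not blocks or not blocks[0].startswith("WEBVTT"):
        raise ValueError("missing WEBVTT signature")

    cues, skipped = [], []
    for block in blocks[1:]:
        lines = block.split("\n")
        # A cue carries its timing line ("-->") on the first line, or on the
        # second line when the first one is a cue identifier.
        timing_idx = next(
            (i for i, line in enumerate(lines[:2]) if "-->" in line), None
        )
        if timing_idx is not None:
            start, _, rest = lines[timing_idx].partition("-->")
            fields = rest.split()
            end = fields[0] if fields else ""
            payload = "\n".join(lines[timing_idx + 1:])
            cues.append((start.strip(), end, payload))
        elif lines[0].split(" ")[0] in KNOWN_BLOCKS:
            continue  # recognised non-cue block, nothing to extract here
        else:
            # Unknown block: remember it for diagnostics and keep parsing.
            skipped.append(lines[0])
    return cues, skipped
```

With this approach a file containing a block type introduced by some
future extension still yields all of its cues, and the caller can decide
whether to log the skipped blocks (as WebVTT.js does) or drop them
silently (as the browser parsers do).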