From mboxrd@z Thu Jan  1 00:00:00 1970
From: Leon Grutters <gruttersleonbot2@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 20 Mar 2025 08:24:50 +0100
Message-ID: <20250320072450.1164-1-gruttersleonbot2@gmail.com>
X-Mailer: git-send-email 2.49.0
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH] avcodec/webvttdec: strip classes
Cc: Leon Grutters <gruttersleonbot2@gmail.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

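A supported tag that carries a class, e.g. "<i.bold>", is currently
ignored entirely: "<i.bold>Hello</i>" is converted to "Hello{\i0}"
instead of the intended "{\i1}Hello{\i0}".

Strip the class from the tag name before looking it up, so that such
tags are converted like their class-less counterparts.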

Signed-off-by: Leon Grutters <gruttersleonbot2@gmail.com>
---
 libavcodec/webvttdec.c | 51 +++++++++++++++++++++++++++++++++---------
 1 file changed, 40 insertions(+), 11 deletions(-)
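
A minimal standalone sketch of the lookup idea, for illustration only and
not part of the commit. It assumes plain C string handling instead of
av_strndup()/av_strtok(); the names map_tag and valid_tags are
hypothetical. Unlike av_strtok(), it does not skip leading '.' characters,
so it is not byte-for-byte equivalent to the patch.

```c
#include <string.h>

/* Hypothetical stand-in for the patch's webvtt_valid_tags table. */
static const struct {
    const char *from;
    const char *to;
} valid_tags[] = {
    {"i", "{\\i1}"}, {"/i", "{\\i0}"},
    {"b", "{\\b1}"}, {"/b", "{\\b0}"},
    {"u", "{\\u1}"}, {"/u", "{\\u0}"},
};

/*
 * Map the text between '<' and '>' (body_len bytes at body) to an ASS
 * override code, ignoring everything from the first '.' onwards.
 * Returns NULL for empty or unsupported tags.
 */
static const char *map_tag(const char *body, size_t body_len)
{
    size_t name_len = 0, i;

    /* The tag name ends at the first class separator, if any. */
    while (name_len < body_len && body[name_len] != '.')
        name_len++;

    for (i = 0; i < sizeof(valid_tags) / sizeof(valid_tags[0]); i++) {
        const char *from = valid_tags[i].from;
        if (strlen(from) == name_len && !memcmp(from, body, name_len))
            return valid_tags[i].to;
    }
    return NULL;
}
```

With this sketch, map_tag("i.bold", 6) yields "{\i1}", while an empty tag
body or an unsupported tag such as "c.red" yields NULL and would simply be
dropped from the output.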

diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
index 35bdbe805d..4111d138c4 100644
--- a/libavcodec/webvttdec.c
+++ b/libavcodec/webvttdec.c
@@ -29,25 +29,53 @@
 #include "ass.h"
 #include "codec_internal.h"
 #include "libavutil/bprint.h"
+#include "libavutil/mem.h"
 
 static const struct {
     const char *from;
     const char *to;
 } webvtt_tag_replace[] = {
-    {"<i>", "{\\i1}"}, {"</i>", "{\\i0}"},
-    {"<b>", "{\\b1}"}, {"</b>", "{\\b0}"},
-    {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
     {"{", "\\{{}"}, {"\\", "\\\xe2\x81\xa0"}, // escape to avoid ASS markup conflicts
     {"&gt;", ">"}, {"&lt;", "<"},
     {"&lrm;", "\xe2\x80\x8e"}, {"&rlm;", "\xe2\x80\x8f"},
     {"&amp;", "&"}, {"&nbsp;", "\\h"},
 };
+static const struct {
+    const char *from;
+    const char *to;
+} webvtt_valid_tags[] = {
+    {"i", "{\\i1}"}, {"/i", "{\\i0}"},
+    {"b", "{\\b1}"}, {"/b", "{\\b0}"},
+    {"u", "{\\u1}"}, {"/u", "{\\u0}"},
+};
 
 static int webvtt_event_to_ass(AVBPrint *buf, const char *p)
 {
-    int i, again = 0, skip = 0;
+    int i, again = 0;
 
     while (*p) {
+        if (*p == '<') {
+            const char *tag_end = strchr(p, '>');
+            char *tag_body, *tag_name, *saveptr = NULL;
+            ptrdiff_t len;
+            if (!tag_end)
+                break;
+            len = tag_end - p + 1;
+            tag_body = av_strndup(p + 1, len - 2);
+            if (!tag_body)
+                return AVERROR(ENOMEM);
+            tag_name = av_strtok(tag_body, ".", &saveptr);
+            for (i = 0; i < FF_ARRAY_ELEMS(webvtt_valid_tags); i++) {
+                const char *from = webvtt_valid_tags[i].from;
+                if (tag_name && !strcmp(tag_name, from)) {
+                    av_bprintf(buf, "%s", webvtt_valid_tags[i].to);
+                    break;
+                }
+            }
+            p += len;
+            again = 1;
+            av_freep(&tag_body);
+        }
 
         for (i = 0; i < FF_ARRAY_ELEMS(webvtt_tag_replace); i++) {
             const char *from = webvtt_tag_replace[i].from;
@@ -59,21 +87,22 @@ static int webvtt_event_to_ass(AVBPrint *buf, const char *p)
                 break;
             }
         }
+
         if (!*p)
             break;
 
         if (again) {
             again = 0;
-            skip = 0;
+            // skip = 0;
             continue;
         }
-        if (*p == '<')
-            skip = 1;
-        else if (*p == '>')
-            skip = 0;
-        else if (p[0] == '\n' && p[1])
+        // if (*p == '<')
+        //     skip = 1;
+        // else if (*p == '>')
+        //     skip = 0;
+        if (p[0] == '\n' && p[1])
             av_bprintf(buf, "\\N");
-        else if (!skip && *p != '\r')
+        else if (*p != '\r')
             av_bprint_chars(buf, *p, 1);
         p++;
     }
-- 
2.49.0
