From: Soft Works <softworkz-at-hotmail.com@ffmpeg.org>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>, Marth64 <marth64@proxyid.net>
Subject: Re: [FFmpeg-devel] [PATCH v2 00/11] fix broken CC detection and ffprobe fields (cover letter)
Date: Mon, 27 Jan 2025 09:04:57 +0000
Message-ID: <DM8P223MB036555A6DC63C9880D5DA06CBAEC2@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM>
In-Reply-To: <CA+28BfAJ7Juo1csX1Nojb2=S=tfKoQByFqHwyJM5oyRytvQHxA@mail.gmail.com>

Hi Marth64,

first of all, thanks for taking on this long-standing broken feature.

My earlier submission (https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=5031) didn't find much love, so I had given up proposing it, yet I can say that multiple millions of files have been probed with it in the meantime. I'll get back later to the reason for mentioning this.

Following the conversation about your submission, I could see that your earlier approaches were similar to mine, but unfortunately you have been talked out of them.

Comments like "Strong no. Internal properties are internal...." without proposing a better, more agreeable way - without even caring about what you are trying to achieve, even though it's about fixing a long-standing important bug - are a textbook example of what I mentioned recently: that the predominant understanding of "democratic" participation - whether as a member of a committee or of the GA - is about having the right to say "no" to something. This doesn't move things forward, and it's not what leadership should look like; nor can it work successfully this way.

(This is meant to address everyone giving answers like that; the selection is purely accidental, and it doesn't matter who actually said it.)

This serves as a good example of how these things are doing the project no good. In earlier days, the tagline was something like "world's fastest audio and video encoder", but this is rather heading for "world's slowest media prober".

Try this for example:

  ffprobe -loglevel verbose -analyze_frames -show_entries stream=closed_captions:disposition=:side_data= "http://streams.videolan.org/streams/ts/CC/NewsStream-608-ac3.ts"

This correctly indicates CC availability with 66,147,986 bytes read (file size: 60,428,088).

Without analyze_frames: 5,715,518 bytes read.

So, the actual probing requires 5.7 MB. To get similar coverage, I set -read_intervals "%+3" to analyze frames from the first 3s. But now it reads 11 MB, because it processes the data twice.

For comparison, with the patch I had submitted in 2021, it correctly detects CC with only 5,718,438 bytes read - i.e. without any extra processing needed.

What's happening?

When running ffprobe with show_frames (and now also with analyze_frames), it (also) does the actual probing first. This part (#1) stops roughly "as soon as it has all required information" or when the analyzeduration is exceeded. Then, for (#2), it seeks back to the start and iterates through the frames to determine the presence of CC.

But #2 isn't needed at all, because the information about CC data is already available after #1 is done. ffprobe just cannot access it, because somebody said it must be private: well done!

Normally, everything in ffmpeg is optimized to the max with SIMD and assembly code (which is great), but here, ffprobe takes many times longer to execute than it needs to (see the byte counts above) - because of somebody's idealized picture of which data has to be private.

That's one of the reasons why I'm saying that stronger leadership is needed - for recognizing and clearing up such nonsense in a timely manner, for example. Calling for a "democratic" vote each time for every little bit - who wants that? As a GA member, I wouldn't want to have to vote, and as a contributor I would rather give up (like I did) or choose an inferior way (like Marth64 did).

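To put the byte counts quoted above side by side, here is a small arithmetic sketch. It uses only the four figures reported in this message; the constant and function names are made up for illustration and are not ffprobe output fields.

```python
# Byte counts as reported in this message (ffprobe -loglevel verbose).
# The names below are illustrative only.
FILE_SIZE = 60_428_088        # size of NewsStream-608-ac3.ts
PLAIN_PROBE = 5_715_518       # probing only (pass #1)
ANALYZE_FRAMES = 66_147_986   # probing + full frame iteration (pass #1 + #2)
PATCH_2021 = 5_718_438        # probing with the 2021 patch applied


def overhead(bytes_read, baseline=PLAIN_PROBE):
    """Return (extra bytes, ratio) relative to the plain probing pass."""
    return bytes_read - baseline, bytes_read / baseline


extra_af, ratio_af = overhead(ANALYZE_FRAMES)
extra_patch, ratio_patch = overhead(PATCH_2021)

print(f"-analyze_frames: +{extra_af:,} bytes, {ratio_af:.2f}x the plain probe")
print(f"2021 patch:      +{extra_patch:,} bytes, {ratio_patch:.4f}x the plain probe")
# How far the extra bytes of -analyze_frames are from one full file read:
print(f"extra bytes minus file size: {extra_af - FILE_SIZE:,}")
```

The extra bytes read with -analyze_frames come out within a few kilobytes of the file size, which would be consistent with pass #2 reading essentially the whole file again after pass #1 has finished.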
Notes

It has been argued that the CC indication in the codec properties would not be reliable, as it might not have analyzed a sufficient number of frames "to be sure":

While this is a very valid concern for some kinds of frame side data, it does not apply to CC data. It's either in every frame or in none. If a provider generally broadcasts CC, then it's always present in every frame, even during programs for which no CC is available - it's always there. Like I mentioned at the top, we're using the properties field from the codec (via codec_par), and there hasn't been a single case reported where the CC detection this way would have been incorrect.

Why not just use the first video frame, then? There are many reasons why this is not reliable:

- A broadcast stream can have multiple programs and video streams
- The tuner might not have fully locked at the start, and packets can be corrupted. Audio packets are smaller and less prone to corruption, so you might get valid audio packets before getting valid video frames; the same goes for SD vs. HD, etc.
- Some programs might be scrambled and get decrypted after some delay, while FTA (free-to-air) programs work instantly

So which value to choose for -read_intervals? You just can't be sure, so you need to choose a value that puts you on the safe side - maybe 5s - just for doing something which has been done already.

Any value you choose won't ever hit the right point (and even if it did, it would still be doubled work), because the right point is exactly when the probing (#1) is complete. That point is determined dynamically, up to the configured limit, and then you can be sure that when a video stream has been successfully probed, the CC information is valid; when it has not, it doesn't matter anyway.

That's why the right place for CC detection is clearly in the codec and not in a duplicate sequence of analyzing frames in ffprobe.

How that information can find its way from the codec to ffprobe is surely debatable, but a comment like "Strong no" or a plain "This data doesn't belong there" is not a debate - it's only destructive. It would be fine if this were the first part of an answer that suggests an alternative way (like Marton's reply on Nov 28), but this happens too rarely.

Again, I'm not after blaming specific persons. This kind of toxicity is infectious, and I'm sure that many don't even want to be like that and probably don't even behave like that elsewhere. It seems more like an unhealthy adaptation process, where one has learned that it's safer here to point at something wrong than to make a suggestion which might then be instantly bashed by others.

Rather than complaining and blaming, I would like to motivate people not to be like that, to try to break those chains of behavior and to go forward with a more positive attitude, ignoring the negativism of others and no longer feeding it. 😊

On the subject:

I like the -analyze_frames method in general; I actually had something similar in mind, because there's other data (like HDR10+ or DVB subtitle presence and screen size) which can occur at any position. I would like to extend this so that you can specify what you are actually looking for and also exit once all of it has been found.

I don't know anything about the recurrence pattern of the film-grain data, but at least for the CC detection there has to be a better way, using the field from the decoder. When probing libraries with thousands of files, the ffprobe execution time matters a lot.

Best
sw
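The extension idea mentioned above (specify what you are looking for, and exit once all of it has been found) can be sketched in a few lines. This is only a toy model under stated assumptions: each frame is represented as a set of labels, and the labels ("cc", "hdr10plus", "film_grain") and the function name are hypothetical. They are not ffprobe options or side-data identifiers.

```python
from typing import Iterable, Set, Tuple


def scan_until_found(frames: Iterable[Set[str]],
                     wanted: Set[str]) -> Tuple[Set[str], int]:
    """Scan frames in order and stop once every wanted label has been seen.

    Each frame is modelled as the set of (hypothetical) side-data labels
    attached to it. Returns (labels found, number of frames examined).
    """
    remaining = set(wanted)
    found: Set[str] = set()
    examined = 0
    if not remaining:
        return found, examined
    for labels in frames:
        examined += 1
        hits = remaining & labels
        found |= hits
        remaining -= hits
        if not remaining:
            break  # everything requested has been seen: no need to read on
    return found, examined


# Illustrative input: CC in most frames, HDR10+ first seen in the third one.
frames = [{"cc"}, set(), {"cc", "hdr10plus"}, {"cc"}, {"cc"}]
print(scan_until_found(frames, {"cc"}))
print(scan_until_found(frames, {"cc", "hdr10plus"}))
print(scan_until_found(frames, {"film_grain"}))
```

When only "cc" is requested, the scan stops after the first frame. When something is requested that never occurs, it still has to run to the end of the interval, which is the case where an upper limit such as -read_intervals remains necessary.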