* [FFmpeg-devel] Enhancement layers in FFmpeg
From: Niklas Haas @ 2022-08-01 11:24 UTC
To: ffmpeg-devel

Hey,

We need to think about possible ways to implement reasonably-transparent
support for enhancement layers in FFmpeg (SVC, Dolby Vision, ...). There
are more open questions than answers here.

From what I can tell, these are basically separate bitstreams that carry
some amount of auxiliary information needed to reconstruct the
high-quality bitstream. That is, they are not independent, but need to
be merged with the original bitstream somehow.

How do we architecturally fit this into FFmpeg? Do we define a new codec
ID for each (common/relevant) combination of base codec and enhancement
layer, e.g. HEVC+DoVi, H.264+SVC, ..., or do we transparently handle it
for the base codec ID and control it via a flag? Do the enhancement
layer packets already make their way to the codec, and if not, how do we
ensure that this is the case?

Can the decoder itself recursively initialize a sub-decoder for the
second bitstream? And if so, does the decoder apply the actual
transformation, or does it merely attach the EL data to the AVFrame
somehow in a way that can be used by further filters or end users?

(What about the case of Dolby Vision, which IIRC requires handling the
DoVi RPU metadata before the EL can be applied? What about instances
where the user wants the DoVi/EL application to happen on GPU, e.g. via
libplacebo in mpv/vlc?)

How does this metadata need to be attached? A second AVFrame reference
inside the AVFrame? Raw data in a big side data struct?
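To make the last question concrete, a minimal sketch of the "big side
data struct" option; AV_FRAME_DATA_DOVI_EL does not exist in FFmpeg's
headers and only stands in for whatever type such a design would add:

    #include <string.h>

    #include "libavutil/error.h"
    #include "libavutil/frame.h"

    /* Hypothetical: attach the raw EL payload to the decoded BL frame
     * as side data, to be consumed by later filters or the API user. */
    static int attach_el(AVFrame *bl, const uint8_t *el_data, size_t el_size)
    {
        AVFrameSideData *sd =
            av_frame_new_side_data(bl, AV_FRAME_DATA_DOVI_EL, el_size);
        if (!sd)
            return AVERROR(ENOMEM);
        memcpy(sd->data, el_data, el_size);
        return 0;
    }

The "second AVFrame reference" option would instead add a new field or
side-data type holding an av_frame_clone() of the decoded EL frame.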
* Re: [FFmpeg-devel] Enhancement layers in FFmpeg
From: Soft Works @ 2022-08-01 13:17 UTC
To: FFmpeg development discussions and patches

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Niklas Haas
> Sent: Monday, August 1, 2022 1:25 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] Enhancement layers in FFmpeg
>
> Hey,
>
> We need to think about possible ways to implement
> reasonably-transparent support for enhancement layers in FFmpeg
> (SVC, Dolby Vision, ...). There are more open questions than answers
> here.
>
> From what I can tell, these are basically separate bitstreams that
> carry some amount of auxiliary information needed to reconstruct the
> high-quality bitstream. That is, they are not independent, but need
> to be merged with the original bitstream somehow.
>
> How do we architecturally fit this into FFmpeg? Do we define a new
> codec ID for each (common/relevant) combination of base codec and
> enhancement layer, e.g. HEVC+DoVi, H.264+SVC, ..., or do we
> transparently handle it for the base codec ID and control it via a
> flag? Do the enhancement layer packets already make their way to the
> codec, and if not, how do we ensure that this is the case?
>
> Can the decoder itself recursively initialize a sub-decoder for the
> second bitstream? And if so, does the decoder apply the actual
> transformation, or does it merely attach the EL data to the AVFrame
> somehow in a way that can be used by further filters or end users?

From my (rather limited) angle of view, my thoughts are these:

When decoding these kinds of sources, a user would typically not only
want to do the processing in hardware but the decoding as well.

I think we cannot realistically expect that any of the hw decoders
will add support for this in the near future. As we cannot modify
those ourselves, the only way to do such processing would be a
hardware filter. I think the EL data would need to be attached to
frames as some kind of side data (or similar) and get uploaded by the
hw filter (internally), which will apply the EL data.

(I have no useful thoughts for sw decoding.)

> (What about the case of Dolby Vision, which IIRC requires handling
> the DoVi RPU metadata before the EL can be applied? What about
> instances where the user wants the DoVi/EL application to happen on
> GPU, e.g. via libplacebo in mpv/vlc?)

IMO it would be desirable if both of these things could be done in a
single operation.

> How does this metadata need to be attached? A second AVFrame
> reference inside the AVFrame? Raw data in a big side data struct?

As long as it doesn't have its own format, its own start time,
resolution, duration, color space/transfer/primaries, etc., I wouldn't
say that it's a frame.

Best regards,
softworkz
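On the consuming side, a hw filter along those lines might look roughly
like this; again AV_FRAME_DATA_DOVI_EL is a made-up type, and
upload_and_apply_el() is a placeholder for the actual GPU upload and
application work:

    #include "libavutil/frame.h"

    /* Hypothetical hw-filter step: pick the attached EL payload off
     * the incoming frame and hand it to the GPU. */
    static int apply_el_side_data(AVFrame *in)
    {
        AVFrameSideData *sd =
            av_frame_get_side_data(in, AV_FRAME_DATA_DOVI_EL);
        if (!sd)
            return 0; /* no EL attached: plain pass-through */
        return upload_and_apply_el(in, sd->data, sd->size);
    }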
* Re: [FFmpeg-devel] Enhancement layers in FFmpeg
From: Niklas Haas @ 2022-08-01 13:58 UTC
To: FFmpeg development discussions and patches

On Mon, 01 Aug 2022 13:17:12 +0000 Soft Works <softworkz@hotmail.com>
wrote:
> From my (rather limited) angle of view, my thoughts are these:
>
> When decoding these kinds of sources, a user would typically not only
> want to do the processing in hardware but the decoding as well.
>
> I think we cannot realistically expect that any of the hw decoders
> will add support for this in the near future. As we cannot modify
> those ourselves, the only way to do such processing would be a
> hardware filter. I think the EL data would need to be attached to
> frames as some kind of side data (or similar) and get uploaded by the
> hw filter (internally), which will apply the EL data.

If both the BL and the EL are separate, fully coded bitstreams, then
could we instantiate two independent HW decoder instances to decode the
respective planes?

> IMO it would be desirable if both of these things could be done in a
> single operation.

For Dolby Vision we have little choice in the matter. The EL
application needs to happen *after* chroma interpolation, PQ
linearization, IPT matrix application, and poly/MMR reshaping. These
are currently all on-GPU processes in the relevant video output
codebases.

So for Dolby Vision that locks us into the design where we merely
expose the EL planes as part of the AVFrame and leave it to be the
user's problem (or the problem of filters like `vf_libplacebo`).

An open question (for me) is whether or not this is required for
SVC-H264, SHVC, AV1-SVC, etc.

> As long as it doesn't have its own format, its own start time,
> resolution, duration, color space/transfer/primaries, etc., I
> wouldn't say that it's a frame.

Indeed, it seems like the EL data is tied directly to the BL data for
the formats I have seen so far. So they are just like extra planes on
the AVFrame - and indeed, we could simply use extra data pointers here
(we already have room for 8).
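For what it's worth, a sketch of the two-instance idea against the
public avcodec API, assuming both layers are fully coded HEVC
bitstreams; packet routing from the demuxer is left out, and either
context could additionally be set up with a hwaccel:

    #include "libavcodec/avcodec.h"
    #include "libavutil/error.h"

    /* Open two unrelated decoder contexts, one for the BL and one for
     * the EL. Nothing couples the two; hw/sw can be mixed freely. */
    static int open_layer_decoders(AVCodecContext **bl, AVCodecContext **el)
    {
        const AVCodec *dec = avcodec_find_decoder(AV_CODEC_ID_HEVC);
        int ret;

        if (!dec)
            return AVERROR_DECODER_NOT_FOUND;

        *bl = avcodec_alloc_context3(dec);
        *el = avcodec_alloc_context3(dec);
        if (!*bl || !*el)
            return AVERROR(ENOMEM);

        if ((ret = avcodec_open2(*bl, dec, NULL)) < 0)
            return ret;
        return avcodec_open2(*el, dec, NULL);
    }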
* Re: [FFmpeg-devel] Enhancement layers in FFmpeg
From: Soft Works @ 2022-08-01 14:26 UTC
To: FFmpeg development discussions and patches

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> Niklas Haas
> Sent: Monday, August 1, 2022 3:59 PM
> To: FFmpeg development discussions and patches
> <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] Enhancement layers in FFmpeg
>
> On Mon, 01 Aug 2022 13:17:12 +0000 Soft Works
> <softworkz@hotmail.com> wrote:
> > From my (rather limited) angle of view, my thoughts are these:
> >
> > When decoding these kinds of sources, a user would typically not
> > only want to do the processing in hardware but the decoding as
> > well.
> >
> > I think we cannot realistically expect that any of the hw decoders
> > will add support for this in the near future. As we cannot modify
> > those ourselves, the only way to do such processing would be a
> > hardware filter. I think the EL data would need to be attached to
> > frames as some kind of side data (or similar) and get uploaded by
> > the hw filter (internally), which will apply the EL data.
>
> If both the BL and the EL are separate, fully coded bitstreams, then
> could we instantiate two independent HW decoder instances to decode
> the respective planes?

Sure. TBH, I didn't know that the EL data is encoded in the same way.
I wonder what those frames would look like when viewed standalone.

> > IMO it would be desirable if both of these things could be done in
> > a single operation.
>
> For Dolby Vision we have little choice in the matter. The EL
> application needs to happen *after* chroma interpolation, PQ
> linearization, IPT matrix application, and poly/MMR reshaping. These
> are currently all on-GPU processes in the relevant video output
> codebases.
>
> So for Dolby Vision that locks us into the design where we merely
> expose the EL planes as part of the AVFrame and leave it to be the
> user's problem

If ffmpeg cannot apply it, then I don't think there will be many users
being able to make some use of it :-)

> (or the problem of filters like `vf_libplacebo`).

Something I always wanted to ask you: is it even thinkable to port
this to a CPU implementation (with reasonable performance)?

> An open question (for me) is whether or not this is required for
> SVC-H264, SHVC, AV1-SVC, etc.
>
> > As long as it doesn't have its own format, its own start time,
> > resolution, duration, color space/transfer/primaries, etc., I
> > wouldn't say that it's a frame.
>
> Indeed, it seems like the EL data is tied directly to the BL data
> for the formats I have seen so far. So they are just like extra
> planes on the AVFrame - and indeed, we could simply use extra data
> pointers here (we already have room for 8).

Hendrik's idea makes sense to me if this is not just some data but
real frames, decoded with a regular decoder. Yet I don't know anything
about the other enhancement cases either.

Best regards,
softworkz
* Re: [FFmpeg-devel] Enhancement layers in FFmpeg
From: Hendrik Leppkes @ 2022-08-01 13:45 UTC
To: FFmpeg development discussions and patches

On Mon, Aug 1, 2022 at 1:25 PM Niklas Haas <ffmpeg@haasn.xyz> wrote:
>
> Hey,
>
> We need to think about possible ways to implement
> reasonably-transparent support for enhancement layers in FFmpeg
> (SVC, Dolby Vision, ...). There are more open questions than answers
> here.
>
> From what I can tell, these are basically separate bitstreams that
> carry some amount of auxiliary information needed to reconstruct the
> high-quality bitstream. That is, they are not independent, but need
> to be merged with the original bitstream somehow.
>
> How do we architecturally fit this into FFmpeg? Do we define a new
> codec ID for each (common/relevant) combination of base codec and
> enhancement layer, e.g. HEVC+DoVi, H.264+SVC, ..., or do we
> transparently handle it for the base codec ID and control it via a
> flag? Do the enhancement layer packets already make their way to the
> codec, and if not, how do we ensure that this is the case?

The EL on Blu-rays is a separate stream, so that would need to be
handled in some fashion. Unless it wouldn't; see below.

> Can the decoder itself recursively initialize a sub-decoder for the
> second bitstream? And if so, does the decoder apply the actual
> transformation, or does it merely attach the EL data to the AVFrame
> somehow in a way that can be used by further filters or end users?

My main question is: how closely related are those streams? I know
that the Dolby EL can be decoded basically entirely separately from
the main video stream, but the EL might be the special case here. I
have no experience with SVC.

If the enhancement layer is entirely independent, like the Dolby EL,
does avcodec need to do anything at all? It _can_ decode the stream
today; a user application could write code that decodes both the main
stream and the EL stream and links them together, without any changes
in avcodec. Do we need to complicate this situation by forcing it
into avcodec?

Decoding them in entirely separate decoder instances has the advantage
of being able to use hardware for the main one and software for the
EL, or both in hardware, or whatever one prefers.

Of course this applies to the special situation of the Dolby EL, which
is entirely independent, at least in its primary source - Blu-ray. I
think MKV might mix both into one stream, which is an unfortunate
design decision on their part.

avfilter, for example, is already set up to synchronize two incoming
streams (e.g. for overlay), so the same mechanic could be used to pass
the EL to a processing filter.

> (What about the case of Dolby Vision, which IIRC requires handling
> the DoVi RPU metadata before the EL can be applied? What about
> instances where the user wants the DoVi/EL application to happen on
> GPU, e.g. via libplacebo in mpv/vlc?)

Yes, processing should be left to dedicated filters.

> How does this metadata need to be attached? A second AVFrame
> reference inside the AVFrame? Raw data in a big side data struct?

For the Dolby EL, no attachment is necessary if we follow the above
concept of just not having avcodec care.

- Hendrik
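As a rough sketch of that application-side linking, purely for
illustration: two unrelated decoder instances whose outputs get paired
on pts. Packet feeding is elided, apply_el() is hypothetical, and a
real implementation would buffer an EL frame that runs ahead instead
of dropping it:

    #include "libavcodec/avcodec.h"
    #include "libavutil/frame.h"

    /* Match BL and EL frames coming out of two independent decoder
     * instances by pts, then hand each pair to a processing step. */
    static void pair_layers(AVCodecContext *bl_ctx, AVCodecContext *el_ctx)
    {
        AVFrame *bl = av_frame_alloc();
        AVFrame *el = av_frame_alloc();

        if (!bl || !el)
            goto done;

        while (avcodec_receive_frame(bl_ctx, bl) >= 0) {
            /* Advance the EL side until it reaches the current BL pts;
             * anything older than the BL frame is stale and dropped. */
            while (avcodec_receive_frame(el_ctx, el) >= 0 &&
                   el->pts < bl->pts)
                ;
            if (el->pts == bl->pts)
                apply_el(bl, el); /* hypothetical processing filter */
            av_frame_unref(bl);
            av_frame_unref(el);
        }

    done:
        av_frame_free(&bl);
        av_frame_free(&el);
    }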