Hi Jerome

On Thu, Mar 06, 2025 at 10:20:58PM +0100, Jerome Martinez wrote:
> Le 06/03/2025 à 17:37, Michael Niedermayer a écrit :
> > On Thu, Mar 06, 2025 at 03:14:49AM +0100, Lynne wrote:
> > > On 06/03/2025 01:15, Michael Niedermayer wrote:
> > > > Hi everyone
> > > > 
> > > > Current FFv1 code with my patchset supports 16bit floats. The implementation
> > > > is quite simple. Which is good
> > > > 
> > > > I have tried improving compression; the gains are incremental, but they are not
> > > > large overall. For example, a 44% space gain on the remapping table is just
> > > > 0.1% overall.
> > > > 
> > > > I have few float samples. It's mainly one high quality slideshow of unrelated
> > > > 16bit float images. These exercise a wide range of things including negative
> > > > color values.
> > > > I think I have only one single (non slideshow) video of float 16 samples.
> > > > 
> > > > It turns out the most efficient way to store floats that I found is to remap
> > > > them to integers, not to store them as sign, exponent, mantissa; I tried that
> > > > for many hours.
> 
> I also tried some stuff but IMO the remap to integer is currently
> the best for keeping the spec simple & with good compression, and a more
> complex way is not needed until we find something which really makes a
> difference in terms of compression ratio.
> So let's keep it simple, just a remap to int, and we already get all the
> advantages of e.g. the incoming GPU implementation.

Yes, I fully agree.
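
For illustration, a minimal Python sketch of one way to view 16-bit floats as
plain integers before any remap table is applied. The bit-flip trick and the
function names are assumptions made for this example, not taken from the
patchset; it only shows that float16 values can be handled as 16-bit integers
without splitting them into sign, exponent and mantissa.

```python
import struct

def half_to_sortable_int(x: float) -> int:
    """Map a float16-representable value to a 16-bit integer whose
    unsigned order matches the numeric order of the floats."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    if bits & 0x8000:
        # Negative: invert all bits so larger magnitudes sort lower.
        return bits ^ 0xFFFF
    # Non-negative: set the top bit so these sort above all negatives.
    return bits | 0x8000

def sortable_int_to_half(key: int) -> float:
    """Inverse of half_to_sortable_int."""
    bits = key ^ 0x8000 if key & 0x8000 else key ^ 0xFFFF
    (x,) = struct.unpack("<e", struct.pack("<H", bits))
    return x

# Hypothetical sample values, including negative colors.
samples = [-2.0, -0.5, 0.0, 0.25, 1.0, 1.5]
keys = [half_to_sortable_int(v) for v in samples]
```

The resulting keys are ordinary integers in 0..65535, so whatever is done with
integer samples (including a remap table) applies unchanged.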


> 
> 
> > > > [...]
> > > > What about the mapping itself? It uses a rather simple RLE coder. I've spent
> > > > most of the day today tuning that and failing to really beat it.
> > > > Using context from the previous image didn't work with the slideshow material
> > > > I have, nor with the one video I have. I tried using sign and exponent as context,
> > > > tried previous run, relations of runs, low bits and many more, but no or only
> > > > insignificant improvements over yesterday's implementation were achieved.
> > > I did mention this once before, but you should look into the Daala/Opus way
> > > of storing rawbits into a separate chunk at the end of the bitstream. It
> > > avoids polluting the range context with equiprobable bits.
> > This may fit into my LSB work but that is not float specific and it's not
> > needed for float support.
> > 
> > I originally thought we would do floats through RAW storing LSB, and special
> > handling the "always" 0 and 1 bits. It makes sense but just remapping only
> > the used values to a compact list eliminates all the constant bits for free
> > and we never have more than 16bits per sample
> > 
> > We can still implement storing LSB raw. I have written 2 implementations;
> > the first, without range coder, was this:
> > https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2024-October/334718.html
> > 
> > I didn't like it, but it seems more popular
> 
> I am still in favor of this simple and independent way to store the LSB
> bits; the other attempts seem to only add complexity and performance
> issues while not providing enough gain.
> 
> 
> > 
> > Also, given that the raw bits have a known and constant length, it could make
> > sense to put them first if they are stored non interleaved.
> 
> Putting them last is better IMO, we can compute the byte offset with the
> constant length while storing some config in the slice header.
> Actually, there are many possibilities, I suggest:
> - permitting both MSB & LSB, because we may have MSB 0 and we would not store
> them (see below)
> - after the count of bits not compressed, we add a flag indicating if we
> store them or not (meaning they are 0)
> - permitting a configuration per stream (Configuration Record) or per slice
> (Slice Header); if a decision is to keep only one place for keeping it
> simple, it would be in the slice header.
> 
> The rationale is that we may have:
> - 0 padding for the LSB is when we compress integer from DCP, we have input
> with 16-bit TIFF but only 12 relevant bits, and we need to keep the bit
> depth of 16-bit if we want to say that we do it lossless and that we can
> reverse, especially because 99.999% of frames are with 4 bits of padding but
> you may have a frame for whatever reason (often a bug, but if we do lossless
> we need to store buggy frames) which is not 0, so a performant storage of
> 0-filled LSB while permitting to store the actual bits if not 0.
> - 0 padding for the MSB is when we compress float mapped to integer and
> float range is from 0.0 to 1.0 only, sign & most of mantissa are always 0 so
> we can store them as a flag, but we prepare the stream to have values out of
> range (<0 or > 1).
> - and we don't know that before we start to compress each frame/slice

We have a generic remap table with the float code.
1. This is not really float specific, it could be used with integers
2. It's per slice, so each slice can have a different table or none

Any bits that are always 0 or 1 will be perfectly optimized out using
that table, no matter if they are LSB, MSB or in the middle.
Even if every color is a multiple of 13, the remap table will still perfectly
remove that and produce a continuous range that is one thirteenth the size.
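
To make that concrete, here is a small Python sketch of a remap table of this
kind. It is only an illustration under my own assumptions: the names
build_remap and undo_remap are hypothetical, the sample data is made up, and
nothing here reflects how the table is actually coded in the bitstream.

```python
def build_remap(samples):
    """Return (table, indices): table lists the distinct sample values in
    ascending order, indices replaces each sample by its position in table."""
    table = sorted(set(samples))
    position = {value: i for i, value in enumerate(table)}
    return table, [position[value] for value in samples]

def undo_remap(table, indices):
    """Inverse of build_remap: look every index up in the table again."""
    return [table[i] for i in indices]

# Hypothetical slice where every sample is a multiple of 13.
ks = [0, 5, 2, 5, 7, 1, 3, 4, 6]
mult13 = [13 * k for k in ks]
table13, idx13 = build_remap(mult13)

# Hypothetical slice of 12-bit data padded to 16 bits (low 4 bits always 0).
padded = [v << 4 for v in (100, 4095, 0, 100, 2048)]
table_pad, idx_pad = build_remap(padded)
```

In the first case the indices come out as 0..7 with no gaps, because every
multiple of 13 in that range is used; in the second the four padding bits
disappear without any LSB specific handling, and a single odd frame with
non-zero low bits would simply get a different table.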

If you disagree, please explain how that's not already solving every
variant and more?

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are too smart to engage in politics are punished by being
governed by those who are dumber. -- Plato