Re: [FFmpeg-devel] [PATCH v2] avcodec/jpeg2000dec: Add support for placeholder passes, CAP, and CPF markers

From: WATANABE Osamu <owatanab@es.takushoku-u.ac.jp>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Pierre-Anthony Lemieux <pal@sandflow.com>
Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/jpeg2000dec: Add support for placeholder passes, CAP, and CPF markers
Date: Wed, 19 Jun 2024 05:51:22 +0000
Message-ID: <2F5DC492-B51B-4E93-9F45-CFCD813A6CBE@es.takushoku-u.ac.jp> (raw)
In-Reply-To: <c80c072d5df8eaa495f4caf2e40c217092f8a552.camel@haerdin.se>

First of all, I appreciate your kind review.
I'm writing to discuss the changes and would like to hear your feedback on these.

> On Jun 18, 2024, at 23:20, Tomas Hardin <git@haerdin.se> wrote:
> 
> 
> It seems this patch combines a lot of things that might be better to
> split into separate patches for easier review

Agree. I will split this patch into several patches.
For example, the set of patches includes changes:
- only for HTJ2K (JPEG 2000 Part 15)
- only for J2K (JPEG 2000 Part 1)
- for both J2K and HTJ2K.

Do you think it makes sense?

> 
>> @@ -382,6 +391,9 @@ static int get_siz(Jpeg2000DecoderContext *s)
>>          } else if (ncomponents == 1 && s->precision == 8) {
>>              s->avctx->pix_fmt = AV_PIX_FMT_GRAY8;
>>              i = 0;
>> +        } else if (ncomponents == 1 && s->precision == 12) {
>> +            s->avctx->pix_fmt = AV_PIX_FMT_GRAY16LE;
>> +            i = 0;
> 
> Could we handle 9 <= precision <= 16 while we're at it?
> 

Yes. The native J2K decoder for both lossless and lossy J2K/HTJ2K
codestreams can handle bpp from 9 to 16. This change is required to
produce the decoded images for the ISO test codestreams defined in
Part 4 (Conformance testing).

>> @@ -408,6 +420,73 @@ static int get_siz(Jpeg2000DecoderContext *s)
>>      s->avctx->bits_per_raw_sample = s->precision;
>>      return 0;
>>  }
>> +/* get extended capabilities (CAP) marker segment */
>> +static int get_cap(Jpeg2000DecoderContext *s, Jpeg2000CodingStyle
>> *c)
>> +{
>> +    uint32_t Pcap;
>> +    uint16_t Ccap_i[32] = { 0 };
>> +    uint16_t Ccap_15;
>> +    uint8_t P;
>> +
>> +    if (bytestream2_get_bytes_left(&s->g) < 6) {
>> +        av_log(s->avctx, AV_LOG_ERROR, "Insufficient space for
>> CAP\n");
>> +        return AVERROR_INVALIDDATA;
>> +    }
>> +
>> +    Pcap = bytestream2_get_be32u(&s->g);
>> +    s->isHT = (Pcap >> (31 - (15 - 1))) & 1;
>> +    for (int i = 0; i < 32; i++) {
>> +        if ((Pcap >> (31 - i)) & 1)
>> +            Ccap_i[i] = bytestream2_get_be16u(&s->g);
>> +    }
>> +    Ccap_15 = Ccap_i[14];
>> +    if (s->isHT == 1) {
>> +    av_log(s->avctx, AV_LOG_INFO, "This is an HTJ2K codestream.\n");
>> +    // Bits 14-15
>> +    switch ((Ccap_15 >> 14) & 0x3) {
> 
> Missing indentation
> 
>> +        case 0x3:
>> +            s->Ccap15_b14_15 = HTJ2K_MIXED;
>> +            break;
>> +        case 0x1:
>> +            s->Ccap15_b14_15 = HTJ2K_HTDECLARED;
>> +            break;
>> +        case 0x0:
>> +            s->Ccap15_b14_15 = HTJ2K_HTONLY;
>> +            break;
>> +        default:
>> +                av_log(s->avctx, AV_LOG_ERROR, "Unknown CCap
>> value.\n");
>> +            return AVERROR(EINVAL);
>> +            break;
>> +    }
>> +    // Bit 13
>> +    if ((Ccap_15 >> 13) & 1) {
>> +        av_log(s->avctx, AV_LOG_ERROR, "MULTIHT set is not
>> supported.\n");
>> +        return AVERROR_PATCHWELCOME;
>> +    }
>> +    // Bit 12
>> +    s->Ccap15_b12 = (Ccap_15 >> 12) & 1;
>> +    // Bit 11
>> +    s->Ccap15_b11 = (Ccap_15 >> 11) & 1;
>> +    // Bit 5
>> +    s->Ccap15_b05 = (Ccap_15 >> 5) & 1;
>> +    // Bit 0-4
>> +    P = Ccap_15 & 0x1F;
>> +    if (!P)
>> +        s->HT_MAGB = 8;
>> +    else if (P < 20)
>> +        s->HT_MAGB = P + 8;
>> +    else if (P < 31)
>> +        s->HT_MAGB = 4 * (P - 19) + 27;
>> +    else
>> +        s->HT_MAGB = 74;
>> +
>> +    if (s->HT_MAGB > 31) {
>> +            av_log(s->avctx, AV_LOG_ERROR, "Available internal
>> precision is exceeded (MAGB> 31).\n");
>> +        return AVERROR_PATCHWELCOME;
>> +    }
>> +    }
> 
> Weird indentation
> 

Thank you for catching these. I will fix them in the patch set.

>> @@ -802,6 +881,15 @@ static int read_crg(Jpeg2000DecoderContext *s,
>> int n)
>>      bytestream2_skip(&s->g, n - 2);
>>      return 0;
>>  }
>> +
>> +static int read_cpf(Jpeg2000DecoderContext *s, int n)
>> +{
>> +    if (bytestream2_get_bytes_left(&s->g) < (n - 2))
>> +        return AVERROR_INVALIDDATA;
>> +    bytestream2_skip(&s->g, n - 2);
>> +    return 0;
>> +}
> 
> Don't we already have code for skipping markers we don't care about?
> 

The `read_cpf()` function was added for consistency with the `read_crg()` function.
We already have `bytestream2_skip(GetByteContext *g, unsigned int size)` that skips `size`
bytes from the compressed data. 
Do you think it is better to replace those functions (= `read_cpf()` and `read_crg()`)
with `bytestream2_skip()`?

>> +
>>  /* Tile-part lengths: see ISO 15444-1:2002, section A.7.1
>>   * Used to know the number of tile parts and lengths.
>>   * There may be multiple TLMs in the header.
>> @@ -965,6 +1053,10 @@ static int init_tile(Jpeg2000DecoderContext *s,
>> int tileno)
>>              comp->roi_shift = s->roi_shift[compno];
>>          if (!codsty->init)
>>              return AVERROR_INVALIDDATA;
>> +        if (s->isHT && (!s->Ccap15_b05) && (!codsty->transform))
>> +            av_log(s->avctx, AV_LOG_WARNING, "Transformation = 0
>> (lossy DWT) is found in HTREV HT set\n");
>> +        if (s->isHT && s->Ccap15_b14_15 != (codsty->cblk_style >> 6)
>> && s->Ccap15_b14_15 != HTJ2K_HTONLY)
>> +            av_log(s->avctx, AV_LOG_WARNING, "SPcod/SPcoc value does
>> not match bit 14-15 values of Ccap15\n");
> 
> Do you have samples demonstrating the need to accept such broken files?
> If not then we should probably error out

Does `error out` mean that
- Should we exit decoding here?
- or should we replace AV_LOG_WARNING with AV_LOG_ERROR?

> 
>> @@ -1704,7 +1989,7 @@ static int decode_cblk(const
>> Jpeg2000DecoderContext *s, Jpeg2000CodingStyle *cod
>>                         Jpeg2000T1Context *t1, Jpeg2000Cblk *cblk,
>>                         int width, int height, int bandpos, uint8_t
>> roi_shift)
>>  {
>> -    int passno = cblk->npasses, pass_t = 2, bpno = cblk->nonzerobits
>> - 1 + roi_shift;
>> +    int passno = cblk->npasses, pass_t = 2, bpno = cblk->nonzerobits
>> - 1;
> 
> Won't this break files with ROI? I see there's some ROI stuff further
> down so maybe not

It won't break ROI decoding, and this change is mandatory
when handling codestreams with placeholder passes.

> 
>> @@ -2187,22 +2472,42 @@ static int
>> jpeg2000_read_main_headers(Jpeg2000DecoderContext *s)
>>              if (!s->tile)
>>                  s->numXtiles = s->numYtiles = 0;
>>              break;
>> +        case JPEG2000_CAP:
>> +            if (!s->ncomponents) {
>> +                av_log(s->avctx, AV_LOG_WARNING, "CAP marker segment
>> shall come after SIZ\n");
> 
> SHALL -> we should be able to safely reject. Similarly with the other
> errors. Unless we know of an encoder that produces broken files then
> there's no reason to be lenient. And if such a broken encoder exists we
> could try to get it fixed

Does `safely reject` mean that we should replace AV_LOG_WARNING with
AV_LOG_ERROR? or we should stop decoding here?

> 
>> @@ -112,6 +112,13 @@ typedef struct Jpeg2000DecoderContext {
>>      Jpeg2000Tile    *tile;
>>      Jpeg2000DSPContext dsp;
>>  
>> +    uint8_t         isHT; // HTJ2K?
> 
> Isn't it possible to mix Part 1 and HT in the same file? I know HTONLY
> exists also

It is possible to mix Part 1 and HT in the same tile-component.
This mode is defined as MIXED in the spec of Part 15.
There are three types of HT codestreams:
- HTONLY
- HTDECLARED
- MIXED

The spec says - 
"The HTONLY set is the set of HTJ2K codestreams where all code-blocks
are HT code-blocks. The HTDECLARED set is the set of HTJ2K codestreams
where all code-blocks within a given tile-component are either
a) HT code-blocks, or b) code-blocks as specified in 
Rec. ITU-T T.800 | ISO/IEC 15444-1. The MIXED set is the set of all
HTJ2K codestreams that are not in the HTDECLARED set."

> 
>> @@ -406,6 +420,7 @@ static void recover_mag_sgn(StateVars *mag_sgn,
>> uint8_t pos, uint16_t q, int32_t
>>              E[n] = 32 - ff_clz(v[pos][i] | 1);
>>              mu_n[n] = (v[pos][i] >> 1) + 1;
>>              mu_n[n] <<= pLSB;
>> +            mu_n[n] |= (1 << (pLSB - 1)); // Add 0.5 (reconstruction
>> parameter = 1/2)
> 
> Aren't there conformance files to catch these kinds of errors? I
> remember looked at J2K a while back, and I think we should add such
> files to FATE
> 

Do you mean "errors" are the difference in pixel values between
uncompressed and lossy compressed images?

There are no specific conformance files to catch the difference
in reconstruction parameter values. 

The value of the reconstruction parameter r is not limited to 1/2.
We can use other values.
However, the spec of Part 4 says the use of reconstruction parameter
r = 1/2 will typically increase the ease of passing.

Looking forward to seeing your thoughts.

Best,
Osamu

> /Tomas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".