Re: [FFmpeg-devel] [PATCH] libavcodec: add bit-rate support to RoQ video encoder

From: "Tomas Härdin" <git@haerdin.se>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] libavcodec: add bit-rate support to RoQ video encoder
Date: Tue, 23 Jan 2024 18:44:03 +0100
Message-ID: <0e7aec235bb0481be4abb314a615210484914ad9.camel@haerdin.se>
In-Reply-To: <CAOSHGWPo8vgUB_Zs3-mY+ApPzCxxxD0s=zgQWCYrR-RYF+XmrQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4626 bytes --]

On Tue, 2024-01-23 at 11:33 +0300, Victor Luchitz wrote:
> Re-posting the patch as an attachment. Sorry for the inconvenience!

Yep, this one applies and also passes FATE.

I notice ffplay_buffer whines a lot when playing RoQ files:

> [swscaler @ 0x7fcb44043240] deprecated pixel format used, make sure you did set range correctly
> [ffplay_buffer @ 0x7fcb44032fc0] filter context - w: 256 h: 256 fmt: 14 csp: unknown range: unknown, incoming frame - w: 256 h: 256 fmt: 14 csp: unknown range: pc pts_time: 0
> [ffplay_buffer @ 0x7fcb44032fc0] Changing video frame properties on the fly is not supported by all filters.

I don't think this problem relates to this patch however.

Anyway, using -b:v 100k causes the encoder to effectively get stuck
on the first frame: it cannot go below 621 kbps and increases qscale
very slowly. You mentioned this already, of course. Perhaps there
should be a faster "startup" phase? Since subsequent frames are
P-frames, the average may still end up hitting the target bitrate.

Even when using 1 Mbps the encoding is very slow:

./ffmpeg -r:v 30 -t 1 -f lavfi -i testsrc2 -s 256x256 -b:v 1000k -y \
    foo-1000k.roq

> [roqvideo @ 0x3112980] 
> Generated a frame too big for desired bit rate (1489 kbps), now switching to a bigger qscale value (257).
> [roqvideo @ 0x3112980] 
> Generated a frame too big for desired bit rate (1489 kbps), now switching to a bigger qscale value (259).
> [...]
> Generated a frame too big for desired bit rate (1100 kbps), now switching to a bigger qscale value (196863).
> [roqvideo @ 0x3112980] .0 size=       0kB time=N/A bitrate=N/A speed=N/A    
> Generated a frame too big for desired bit rate (1069 kbps), now switching to a bigger qscale value (262399).
> [roqvideo @ 0x3112980] 
> Generated a frame too small for desired bit rate (601 kbps), reverting lambda and using smaller inc on qscale (196863).
> [roqvideo @ 0x3112980] 0.0 size=      17kB time=00:00:00.16 bitrate= 812.7kbits/s speed=0.0476x    
> Generated a frame too small for desired bit rate (915 kbps), qscale value cannot be lowered any further (1).
> [...]
> [roqvideo @ 0x3112980] 
> Generated a frame too big for desired bit rate (1084 kbps), now switching to a bigger qscale value (327680).
> [roqvideo @ 0x3112980] 
> Generated a frame too big for desired bit rate (1068 kbps), now switching to a bigger qscale value (393216).
> [roqvideo @ 0x3112980] 0.0 size=      42kB time=00:00:00.40 bitrate= 866.5kbits/s speed=0.0667x    
> Generated a frame too small for desired bit rate (514 kbps), reverting lambda and using smaller inc on qscale (327680).
> [roqvideo @ 0x3112980] 0.0 size=      62kB time=00:00:00.56 bitrate= 896.2kbits/s speed=0.0708x    
> Generated a frame too small for desired bit rate (901 kbps), qscale value cannot be lowered any further (1).

This suggests an unstable regulation loop. Amazingly, it also makes the
encoder slower than the Cinepak encoder: 59.6 vs 21.57 seconds for
encoding 90 frames of dimension 256x256.

-bt should default to 0 and use default logic similar to that of -b.
The default of bitrate/20 also seems excessively small. If I change
the default to be the same as the bitrate, the encoding time above
shrinks to 11.7 seconds.
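
To illustrate why the tolerance matters so much, here is a minimal
sketch of the acceptance window. It is not the encoder's actual code:
the function names are made up for illustration, and the lower bound
(bit_rate - tol) is an assumption, since only the upper-bound check is
visible in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a tolerance of 0 means "unset", in which case
 * fall back to 5% of the bit rate (bit_rate / 20). */
int64_t effective_tolerance(int64_t bit_rate, int64_t tol)
{
    assert(bit_rate > 0 && tol >= 0);
    return tol ? tol : bit_rate / 20;
}

/* Returns 1 if the frame is too big, -1 if it is too small and 0 if it
 * lies inside the window. The lower bound is an assumption. */
int classify_frame(int64_t frame_bit_rate, int64_t bit_rate, int64_t tol)
{
    tol = effective_tolerance(bit_rate, tol);
    if (frame_bit_rate > bit_rate + tol)
        return 1;
    if (frame_bit_rate < bit_rate - tol)
        return -1;
    return 0;
}
```

With -b:v 1000k and the 5% fallback, only 950..1050 kbps is accepted,
so the 1489 kbps frame from the log above triggers a retry. With the
tolerance equal to the bitrate the window becomes 0..2000 kbps and the
same frame is accepted on the first try.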

There is an extraneous newline visible here:

> +                   "\nGenerated a frame too big for desired bit rate (%d kbps), "

The same occurs elsewhere.

Using non-integer framerates causes the encoder to go bananas. For
example, 24000/1001 fps makes the ratecontrol think it's running at
24000 fps. An easy fix is to reject avctx->time_base.num != 1 on init.
roq_write_header() already rejects non-integer framerates.
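
A small sketch of the failure mode and the proposed check, assuming
the ratecontrol takes the frame rate from time_base.den alone. The
Rational type is a hypothetical stand-in for AVRational so that the
example builds without the FFmpeg headers, and the function names are
made up.

```c
#include <assert.h>

/* Hypothetical stand-in for AVRational (the encoder would look at
 * avctx->time_base instead). */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Frame rate as seen by code that only reads the denominator. */
int naive_fps(Rational time_base)
{
    assert(time_base.den > 0);
    return time_base.den;
}

/* Proposed init-time check: accept only time bases of the form 1/N. */
int is_integer_framerate(Rational time_base)
{
    return time_base.num == 1 && time_base.den > 0;
}
```

For 24000/1001 fps the time base is 1001/24000, so naive_fps() reports
24000 and is_integer_framerate() returns 0. For 30 fps the time base
is 1/30, which passes the check.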

I'm also curious about this hunk:

> -        if (enc->lambda > 100000) {
> +        if (enc->lambda > 100000000) {
>              av_log(roq->logctx, AV_LOG_ERROR, "Cannot encode video in Quake compatible form\n");
>              return AVERROR(EINVAL);
>          }

Where in the Quake (3?) source code is this limitation? It seems that
there should rather be a retry limit. There's probably no harm in
continuing as long as the packet doesn't end up above the size limit.

A quick logarithmic regression on bitrate vs qscale suggests that the
bitrate scales as somewhere between qscale^-0.2833 and qscale^-0.2842.
Conversely, to hit a specific bitrate, try scaling qscale by the
bitrate ratio raised to around 3.5 (roughly 1/0.28). A more
conservative exponent like 2 is probably also fine. See patch attached.

A more systematic approach might be to do a binary search for qscale,
with an initial guess per the formula above.
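
A rough sketch of such a search, under the assumption that the bitrate
never increases as qscale grows. The callback type, the toy model and
all names are hypothetical. In the encoder, each measurement would be
one pass through the retry loop, and the initial guess would be used
to pick a tight [lo, hi] bracket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: bitrate obtained when the current frame is
 * encoded with the given qscale. Assumed non-increasing in qscale. */
typedef int64_t (*bitrate_fn)(int64_t qscale, void *opaque);

/* Toy model used only to exercise the search, not a RoQ measurement. */
int64_t toy_bitrate(int64_t qscale, void *opaque)
{
    (void)opaque;
    return 1000000 / qscale;
}

/* Finds the smallest qscale in [lo, hi] whose bitrate is <= target,
 * using at most max_tries measurements. The caller must make sure that
 * hi itself meets the target. If the tries run out, the best known
 * upper bound is returned. */
int64_t search_qscale(bitrate_fn measure, void *opaque,
                      int64_t lo, int64_t hi,
                      int64_t target, int max_tries)
{
    assert(lo >= 1 && lo <= hi);
    while (lo < hi && max_tries-- > 0) {
        int64_t mid = lo + (hi - lo) / 2;
        if (measure(mid, opaque) > target)
            lo = mid + 1; /* frame too big: need a larger qscale */
        else
            hi = mid;     /* frame fits: try a smaller qscale */
    }
    return hi;
}
```

With the toy model, searching [1, 1048576] for a target of 1000 takes
about 20 measurements, compared with the thousands of small qscale
steps seen in the log output earlier in this mail.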

/Tomas


[-- Attachment #2: 0001-Fast-ratecontrol-could-be-better.patch --]
[-- Type: text/x-patch, Size: 6460 bytes --]

From 5995119843f57ba68dc8d66cb6c53cd8ec386ab9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <git@haerdin.se>
Date: Tue, 23 Jan 2024 18:40:55 +0100
Subject: [PATCH] Fast ratecontrol, could be better

---
 libavcodec/roqvideoenc.c | 48 +++++++++++++++++++++++++++-------------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/libavcodec/roqvideoenc.c b/libavcodec/roqvideoenc.c
index 976015d918..0fd70e85c4 100644
--- a/libavcodec/roqvideoenc.c
+++ b/libavcodec/roqvideoenc.c
@@ -894,7 +894,7 @@ static int roq_encode_video(AVCodecContext *avctx)
     RoqEncContext *const enc = avctx->priv_data;
     RoqTempData *const tempData = &enc->tmp_data;
     RoqContext *const roq = &enc->common;
-    int ret;
+    int ret, num_fast_qscale_deltas = 0, num_tries = 0;
 
     memset(tempData, 0, sizeof(*tempData));
 
@@ -908,12 +908,16 @@ static int roq_encode_video(AVCodecContext *avctx)
     }
 
  retry_encode:
+    num_tries++;
     for (int i = 0; i < roq->width * roq->height / 64; i++)
         gather_data_for_cel(enc->cel_evals + i, enc);
 
     /* Quake 3 can't handle chunks bigger than 65535 bytes */
+#define MAX_LAMBDA_DELTA 3000000
+#define MAX_TRIES 30
+#define MAX_FAST_TRIES 5
     if (tempData->mainChunkSize/8 > 65535 && enc->quake3_compat) {
-        if (enc->lambda > 100000000) {
+        if (num_tries > MAX_TRIES) {
             av_log(roq->logctx, AV_LOG_ERROR, "Cannot encode video in Quake compatible form\n");
             return AVERROR(EINVAL);
         }
@@ -941,28 +945,40 @@ static int roq_encode_video(AVCodecContext *avctx)
         int64_t tol = avctx->bit_rate_tolerance;
 
         /* tolerance > bit rate, set to 5% of the bit rate */
-        if (tol > avctx->bit_rate)
+        if (!tol)
             tol = avctx->bit_rate / 20;
 
         av_log(roq->logctx, AV_LOG_VERBOSE,
-               "\nDesired bit rate (%d kbps), "
+               "Desired bit rate (%d kbps), "
+               "Current bit rate (%d kbps), "
                "Bit rate tolerance (%d), "
                "Frame rate (%d)\n",
-               (int)avctx->bit_rate, (int)tol, avctx->time_base.den);
+               (int)avctx->bit_rate / 1000, ftotal / 1000, (int)tol, avctx->time_base.den);
 
         if (ftotal > (avctx->bit_rate + tol)) {
             /* frame is too big - increase qscale */
-            if (enc->lambda > 100000000) {
-                av_log(roq->logctx, AV_LOG_ERROR, "\nCannot encode video at desired bitrate\n");
-                return AVERROR(EINVAL);
+            if (num_tries > MAX_TRIES) {
+                av_log(roq->logctx, AV_LOG_WARNING, "Cannot encode video at desired bitrate (got %d kbps)\n", ftotal / 1000);
+                // don't fail hard
+                goto keepgoing;
+            }
+            enc->lambda_delta = enc->lambda_delta <= 0 ? 1 : enc->lambda_delta < MAX_LAMBDA_DELTA/2 ? enc->lambda_delta*2 : MAX_LAMBDA_DELTA;
+            if (num_fast_qscale_deltas < MAX_FAST_TRIES) {
+                float ratio;
+                int fast_lambda_delta;
+                ratio = powf((float)ftotal / avctx->bit_rate, 2) - 1; // -1 because it's a delta
+                av_log(roq->logctx, AV_LOG_VERBOSE, "ratio = %f\n", ratio);
+                fast_lambda_delta = FFMAX(1, FFMIN(enc->lambda * ratio, MAX_LAMBDA_DELTA));
+                // sometimes ratio ends up quite small, in which case we fall back to the simple doubling logic
+                enc->lambda_delta = FFMAX(enc->lambda_delta, fast_lambda_delta);
+                num_fast_qscale_deltas++;
             }
-            enc->lambda_delta = enc->lambda_delta <= 0 ? 1 : enc->lambda_delta < 65536 ? enc->lambda_delta*2 : 65536;
             enc->last_lambda = enc->lambda;
             enc->lambda += enc->lambda_delta;
             av_log(roq->logctx, AV_LOG_INFO,
-                   "\nGenerated a frame too big for desired bit rate (%d kbps), "
-                   "now switching to a bigger qscale value (%d).\n",
-                   ftotal / 1000, (int)enc->lambda);
+                   "Generated a frame too big for desired bit rate (%d kbps), "
+                   "now switching to a bigger qscale value (%d -> %d).\n",
+                   ftotal / 1000, (int)enc->last_lambda, (int)enc->lambda);
             tempData->mainChunkSize = 0;
             memset(tempData->used_option, 0, sizeof(tempData->used_option));
             memset(tempData->codebooks.usedCB4, 0,
@@ -975,12 +991,12 @@ static int roq_encode_video(AVCodecContext *avctx)
             /* frame is too small - decrease qscale */
             if (enc->lambda <= 1) {
                 av_log(roq->logctx, AV_LOG_WARNING,
-                       "\nGenerated a frame too small for desired bit rate (%d kbps), "
+                       "Generated a frame too small for desired bit rate (%d kbps), "
                        "qscale value cannot be lowered any further (%d).\n",
                        ftotal / 1000, (int)enc->lambda);
             } else if ((enc->lambda - enc->last_lambda) == 1) {
                 av_log(roq->logctx, AV_LOG_WARNING,
-                       "\nCannot find qscale that gives desired bit rate within desired tolerance, "
+                       "Cannot find qscale that gives desired bit rate within desired tolerance, "
                        "using lower bitrate (%d kbps) with higher qscale value (%d).\n",
                        ftotal / 1000, (int)enc->lambda);
             } else {
@@ -992,7 +1008,7 @@ static int roq_encode_video(AVCodecContext *avctx)
                     enc->lambda = enc->last_lambda;
                     //enc->lambda *= (float)(tempData->mainChunkSize * avctx->time_base.den) / avctx->bit_rate;
                     av_log(roq->logctx, AV_LOG_INFO,
-                           "\nGenerated a frame too small for desired bit rate (%d kbps), "
+                           "Generated a frame too small for desired bit rate (%d kbps), "
                            "reverting lambda and using smaller inc on qscale (%d).\n",
                            ftotal / 1000, (int)enc->lambda);
                 }
@@ -1007,6 +1023,7 @@ static int roq_encode_video(AVCodecContext *avctx)
             }
         }
     }
+keepgoing:
 
     write_codebooks(enc);
 
@@ -1200,6 +1217,7 @@ static const AVClass roq_class = {
 
 static const FFCodecDefault roq_defaults[] = {
     { "b",                "0" },
+    { "bt",               "0" },
     { NULL },
 };
 
-- 
2.39.2

