From: "Tomas Härdin" <tjoppen@acc.umu.se>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v6 1/2] libavcodec: Added DFPWM1a codec
Date: Mon, 14 Mar 2022 11:02:57 +0100
Message-ID: <c81308704449c1f312e311feb5ca3fd9863ca5f9.camel@acc.umu.se>
In-Reply-To: <a298aac8-a8f8-b83a-bfd8-48bee8fd3fad@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2489 bytes --]

On Mon, 2022-03-07 at 22:04 -0500, Jack Bruienne wrote:
> On 3/7/22 06:03, Tomas Härdin wrote:
>
> > On Thu, 2022-03-03 at 10:44 -0500, Jack Bruienne wrote:
> > > From the wiki page (https://wiki.vexatos.com/dfpwm):
> > > > DFPWM (Dynamic Filter Pulse Width Modulation) is an audio codec
> > > > created by Ben “GreaseMonkey” Russell in 2012, originally to be
> > > > used as a voice codec for asiekierka's pixmess, a C remake of
> > > > 64pixels. It is a 1-bit-per-sample codec which uses a
> > > > dynamic-strength one-pole low-pass filter as a predictor. Due to
> > > > the fact that a raw DFPWM decoding creates a high-pitched whine,
> > > > it is often followed by some post-processing filters to make the
> > > > stream more listenable.
> > This sounds similar to something I wrote for the Atari 2600 a number
> > of years ago (https://www.pouet.net/prod.php?which=59283 )
> >
> > I found an encoder online for DFPWM, and it seems to suffer from the
> > same "beeping" that my debeeping hack fixes (suppressing 0xAA and
> > 0x55 bytes). Perhaps a similar hack could be useful in dfpwmenc.c
> I'm curious how this works. Do you just cut out those bytes from the
> encoder output, or is it modified in some way? Wouldn't removing the
> data entirely eventually cause much of the audio to be lost?

The source code is included in the release. Look at audioquant.cpp.
I've attached it for convenience.

The codec is based on a state machine where each state is a 5-bit PCM
value that can go to either of two states, also 5-bit PCM values. Hence
1 bit per sample. I also have a low-pass filter in the decoder.

I penalize state machines which result in 0x55 and 0xAA being overly
represented. This is done via computing a histogram of the output bytes
and scaling the RMS error according to how many of those bytes are in
the (tentative) output.

Another approach could be to detect and blank excessive runs of 0x55
and 0xAA bytes.

> > From experience it's usually better to be strict when it comes to
> > stuff like this. The ComputerCraft people should work out a standard
> > for this, preferably a container. We've had a similar problem in
> > FreeDV where which codec was being used was implicit most of the
> > time, which has been resolved with the .c2 format.
>
> I think the best standardized container for DFPWM will be WAV.

I agree, and I see this was already pushed.

/Tomas

[-- Attachment #2: audioquant.cpp --]
[-- Type: text/x-c++src, Size: 9402 bytes --]

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <math.h>
#include <stdint.h>
#include <vector>

using namespace std;

typedef short sample_t;
typedef int64_t err_t;

static vector<int16_t> samples;
static int rate;

#define DESIRED_RMS 20000   //can be tweaked slightly for better utilization of dynamic range depending on material
#define BPS 1
#define DELTA 10            //try table values +- this every pass
                            //higher values = more exhaustive search
#define PRESHAPE
//#define POSTSHAPE
//#define VERBOSE

static void normalize(sample_t *data, int n) {
    float rms = 0;
    int x;

    for (x = 0; x < n; x++)
        rms += data[x]*data[x];

    rms = sqrtf(rms / n);
    fprintf(stderr, "RMS amplitude prior to normalization: %f\n", rms);

    for (x = 0; x < n; x++) {
        int value = data[x] * DESIRED_RMS / rms;

        if (value < -32768) value = -32768;
        if (value > 32767) value = 32767;

        data[x] = value;
    }
}

static void
quantize(const sample_t *input, sample_t *output, int n) {
    int x, y, last_error = 0;
    int counts[31][31];

    memset(counts, 0, sizeof(counts));

    for (x = 0; x < n; x++) {
#ifdef PRESHAPE
        //shape using feedback. not sure how correct this is,
        //but quiet parts appear to receive less noise
        output[x] = ((input[x] + 3 * last_error / 4) / 2048) + 15;
#else
        output[x] = (input[x] / 2048) + 15;
#endif

        if (output[x] < 0) output[x] = 0;
        if (output[x] >= 31) output[x] = 30;

        last_error = (output[x] - 15) * 2048 - input[x];

        if (x > 0)
            counts[output[x-1]][output[x]]++;
    }

    for (y = 0; y < 31; y++) {
        for (x = 0; x < 31; x++)
            fprintf(stderr, "%3i ", counts[y][x]);
        fprintf(stderr, "\n");
    }
}

//table is k*31 entries, where k=2^N
static err_t table_adpcm_work(sample_t *data, int n, int *table, int k, int do_output, unsigned char *bits) {
    int x, last = 15, last_error = 0;
    err_t ret = 0;
    int hist[256] = {0};
    float factor;
    int byte = 0;

    for (x = 0; x < n; x++) {
        int y;
        int best, best_error, best_value;

        for (y = 0; y < k; y++) {
            int value = table[y + last*k];
            int error;

#ifdef POSTSHAPE
            //again, may not be entirely correct,
            //but the output sounds roughly right
            error = value - data[x] - last_error / 2;
#else
            error = value - data[x];
#endif
            error *= error;

            if (y == 0 || error < best_error) {
                best = y;
                best_error = error;
                best_value = value;
            }
        }

        last_error = best_value - data[x];
        last = best_value;
        ret += best_error;

        if ((x & 7) == 0)
            byte = 0;
        byte |= best << (x & 7);
        if ((x & 7) == 7)
            hist[byte]++;

        if (do_output) {
            data[x] = best_value;

            //NOTE: Only supports k == 2 (1-bit) properly atm
            if (bits && best)
                bits[x >> 3] |= 1 << (x & 7);
        }
    }

    /* expect x/8/256 of each 0x55 and 0xAA */
    factor = (hist[0x55] + hist[0xAA]) / (float)(n/8/128);
    factor -= 9;

    if (factor < 1)
        factor = 1;

    //fprintf(stderr, "factor = %.2f\tn = %i\n", factor, n);

    return ret * factor;
}

static void decode_to_samples(int *table, unsigned char *bits, int bytes) {
    int last = 15, x;

    for (x = 0; x < bytes*8; x++)
        samples.push_back(((last = table[(last << 1) | ((bits[x >> 3] >> (x & 7)) & 1)]) - 15) * 2048);
}

static void table_adpcm(sample_t *data, int n, int bps, int *best_table, unsigned char *bits, int entries, int k) {
    int x, y;
    int *table = (int*)malloc(entries*sizeof(int));
    err_t e, ebest;
    int pass, changes;

    //speed up processing by only looking at the first tenth of the file
    //this seems to work well enough
    int n2 = n;

    //initialize table with reasonable values
    for (y = 0; y < 31; y++)
        for (x = 0; x < k; x++) {
            if (bps == 1)
                table[x + y*k] = y + 7*(x - k/2) + 4;
            else
                table[x + y*k] = y + 3*(x - k/2) + 3;

            if (table[x + y*k] < 0)
                table[x + y*k] = 0;
            if (table[x + y*k] > 30)
                table[x + y*k] = 30;
        }

    ebest = table_adpcm_work(data, n2, table, k, 0, NULL);
    memcpy(best_table, table, entries*sizeof(int));

    fprintf(stderr, "initial error: %li\n", ebest);
    fprintf(stderr, "initial rms: %.2f\n", sqrtf((float)ebest/n2));

    changes = 1;
    for (pass = 1; changes; pass++) {
        changes = 0;

        for (y = 0; y < entries; y++) {
            int delta = DELTA;
            int min = best_table[y] - delta, max = best_table[y] + delta;

            memcpy(table, best_table, entries*sizeof(int));

#ifdef VERBOSE
            fprintf(stderr, "table[% 4i/% 4i] = % 3i\n", y, entries, best_table[y]);
#endif

            if (min < 0)
                min = 0;
            if (max > 30)   //31 isn't a legal value
                max = 30;

            for (x = min; x < max; x++) {
                table[y] = x;
                e = table_adpcm_work(data, n2, table, k, 0, NULL);

                if (e < ebest) {
                    float rms = sqrtf((float)ebest/n2);
                    memcpy(best_table, table, entries*sizeof(int));
                    ebest = e;
                    changes++;

#ifdef VERBOSE
                    fprintf(stderr, " -> % 3i -> %li ", x, ebest);
                    //for some reason printf()ing rms above doesn't work..
                    fprintf(stderr, "(%.2f)\n", rms);
#endif
                }
            }
        }

        fprintf(stderr, "pass %i: %i changes, rms = %.2f\n", pass, changes, sqrtf((float)ebest/n2));
    }

    e = table_adpcm_work(data, n, best_table, k, 1, bits);
    fprintf(stderr, "final rms: %.2f\n", sqrtf((float)e/n));

    free(table);
}

#define BOOT 26
#define HARMONY 900 //space needed for Harmony's F4 driver.
//would be available on a real F4 cart
#define EFFECTS 100
#define IMAGE 1536

//<= 4 KiB per page, we need space for the player
static int pagesizes[8] = {
    3920 - BOOT - HARMONY - EFFECTS - IMAGE,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3919 - EFFECTS,
};

static void write_l32(FILE *f, uint32_t a) {
    putc(a, f);
    putc(a>>8, f);
    putc(a>>16, f);
    putc(a>>24, f);
}

static void write_wav() {
    fprintf(stderr, "Writing %li samples to output.wav\n", samples.size());
    FILE *wav = fopen("output.wav", "wb");
    fprintf(wav, "RIFF");
    write_l32(wav, samples.size()*2 + 36);
    fprintf(wav, "WAVEfmt ");
    write_l32(wav, 16);
    write_l32(wav, 0x00010001);
    write_l32(wav, rate);
    write_l32(wav, rate*2);
    write_l32(wav, 0x00100002);
    fprintf(wav, "data");
    write_l32(wav, samples.size()*2);
    fwrite(&samples[0], samples.size()*2, 1, wav);
    fclose(wav);
}

int main(int argc, char **argv) {
    FILE *f;
    unsigned char header[44];
    int size, samples;
    sample_t *input, *output;
    int bps;
    uint8_t pagedata[8][4096];
    int x, ofs, bytes, k, entries, iter;
    int *best_table;
    unsigned char *bits;

    if (!(f = fopen(argv[1], "rb")))
        return 1;

    fread(header, 44, 1, f);
    rate = *(int*)&header[24];
    size = *(int*)&header[40];

    /* HACK: don't change output just because pagesizes change */
    samples = 237136;
    fprintf(stderr, "length: %.2f seconds\n", (float)samples / rate);

    if (size / 2 > samples)
        samples = size / 2;

    fprintf(stderr, "rate: %i\n", rate);
    fprintf(stderr, "size: %i = %i samples = %f seconds\n", size, samples, (float)samples/rate);

    input = (sample_t*)malloc(samples*sizeof(sample_t));
    memset(input, 0, samples*2);
    output = (sample_t*)malloc(samples*sizeof(sample_t));
    memset(output, 0, samples*2);
    fread(input, size, 1, f);

    bps = BPS;
    k = 1 << bps;
    entries = 31 * k;
    bytes = (samples + 7) / 8;
    best_table = (int*)malloc(entries*sizeof(int));
    bits = (unsigned char*)malloc(bytes);
    memset(bits, 0, bytes);

    normalize(input, samples);
    quantize(input, output, samples);
    table_adpcm(output,
                samples, bps, best_table, bits, entries, k);

    for (iter = ofs = 0; iter < 8; iter++) {
        int bytes = pagesizes[iter];

        printf("\tMAC TABLE%i\n", iter);
        printf("\t;%i entries\n", entries);
        printf("\t;%i bits per sample\n", bps);
        printf("ADPCMTable%i\n", iter);

        for (x = 0; x < entries; x++)
            printf("\t.byte %i\n", best_table[x]);

        printf("\tENDM\n");
        printf("\tMAC SAMPLES%i\n", iter);
        printf("SampleData%i\t;%i bytes\n", iter, bytes);

        for (x = ofs; x < ofs+bytes; x++)
            printf("\t.byte %i\n", bits[x]);

        printf("SampleEnd%i\n", iter);
        printf("\tENDM\n");

        ofs += bytes;
    }

    decode_to_samples(best_table, bits, bytes);

    fprintf(stderr, "encoded size: %.2f KiB (%i bps)\n", (samples * bps) / 8192.f, bps);

    free(input);
    free(output);
    fclose(f);
    write_wav();
    free(best_table);
    free(bits);

    return 0;
}

[-- Attachment #3: Type: text/plain, Size: 251 bytes --]

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
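
Note on the 0x55/0xAA penalty: the histogram-based scaling that the message describes is buried inside table_adpcm_work() in the attachment. The following is a minimal sketch that isolates just that step. It is not part of audioquant.cpp: the name beep_penalty, the std::vector interface, and the early return for very short inputs are assumptions made for illustration. The arithmetic (count of 0x55 and 0xAA bytes divided by byte count / 128, minus 9, floored at 1) follows the attachment, assuming the sample count is a multiple of 8.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not in audioquant.cpp): returns the multiplier that
// the attachment applies to the summed squared error of a candidate table.
// `bytes` is the packed 1-bit output, 8 samples per byte.
static float beep_penalty(const std::vector<uint8_t> &bytes)
{
    size_t hist[256] = {0};

    // Histogram of the output bytes.
    for (uint8_t b : bytes)
        hist[b]++;

    // With uniformly distributed bytes, 0x55 and 0xAA together would make up
    // size/128 of the output. The attachment computes this as n/8/128.
    size_t expected = bytes.size() / 128;
    if (expected == 0)
        return 1.0f;    // assumption: too little data to judge, no penalty

    // Ratio of observed to expected count, minus 9, floored at 1.
    float factor = (hist[0x55] + hist[0xAA]) / (float)expected;
    factor -= 9;

    return factor < 1 ? 1.0f : factor;
}
```

Because of the "minus 9" term, the factor stays at 1 until those two byte values occur more than ten times as often as a uniform distribution would give. Beyond that point the error of the candidate table is scaled up, so the search prefers tables that produce fewer of the alternating-bit bytes.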