From: Tomas Härdin
To: FFmpeg development discussions and patches
Date: Mon, 14 Mar 2022 11:02:57 +0100
Subject: Re: [FFmpeg-devel] [PATCH v6 1/2] libavcodec: Added DFPWM1a codec
References: <0692ba87-361f-498a-dc08-85d771f6bdaf@gmail.com>
 <8a650bd727d8a69c74f6b6789161552def5beaa9.camel@acc.umu.se>
User-Agent: Evolution 3.38.3-1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-w+MEKCmS8MvCC/7dLgKw"

--=-w+MEKCmS8MvCC/7dLgKw
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Mon, 2022-03-07 at 22:04 -0500, Jack Bruienne wrote:
> On 3/7/22 06:03, Tomas Härdin wrote:
> 
> > On Thu, 2022-03-03 at 10:44 -0500, Jack Bruienne wrote:
> > >   From the wiki page (https://wiki.vexatos.com/dfpwm):
> > > > DFPWM (Dynamic Filter Pulse Width Modulation) is an audio codec
> > > > created by Ben “GreaseMonkey” Russell in 2012, originally to be
> > > > used as a voice codec for asiekierka's pixmess, a C remake of
> > > > 64pixels. It is a 1-bit-per-sample codec which uses a
> > > > dynamic-strength one-pole low-pass filter as a predictor. Due to
> > > > the fact that a raw DFPWM decoding creates a high-pitched whine,
> > > > it is often followed by some post-processing filters to make the
> > > > stream more listenable.
> > This sounds similar to something I wrote for the Atari 2600 a number
> > of years ago (https://www.pouet.net/prod.php?which=59283)
> > 
> > I found an encoder online for DFPWM, and it seems to suffer from the
> > same "beeping" that my debeeping hack fixes (suppressing 0xAA and
> > 0x55 bytes). Perhaps a similar hack could be useful in dfpwmenc.c
> 
> I'm curious how this works. Do you just cut out those bytes from the
> encoder output, or is it modified in some way? Wouldn't removing the
> data entirely eventually cause much of the audio to be lost?

The source code is included in the release. Look at audioquant.cpp. I've
attached it for convenience.

The codec is based on a state machine where each state is a 5-bit PCM
value that can go to either of two states, also 5-bit PCM values. Hence
1 bit per sample. I also have a low-pass filter in the decoder.

I penalize state machines which result in 0x55 and 0xAA being
overrepresented. This is done by computing a histogram of the output
bytes and scaling the RMS error according to how many of those bytes are
in the (tentative) output.

Another approach could be to detect and blank excessive runs of 0x55 and
0xAA bytes.

> > From experience it's usually better to be strict when it comes to
> > stuff like this. The ComputerCraft people should work out a standard
> > for this, preferably a container. We've had a similar problem in
> > FreeDV where which codec was being used was implicit most of the
> > time, which has been resolved with the .c2 format.
> 
> I think the best standardized container for DFPWM will be WAV.

I agree, and I see this was already pushed.
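
Regarding the run-blanking idea mentioned above: a rough sketch of the
detection half might look like the following. This is only an
illustration under assumptions. The name find_idle_runs and the min_run
threshold are hypothetical and appear neither in audioquant.cpp nor in
dfpwmenc.c, and what "blanking" a detected run should actually do is
left open.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not taken from audioquant.cpp or dfpwmenc.c.
// Scans encoded 1-bit data and returns half-open [start, end) byte
// ranges in which 0x55/0xAA bytes occur at least min_run times in a
// row. min_run is an assumed tuning knob, not a known-good value.
static std::vector<std::pair<size_t, size_t>>
find_idle_runs(const std::vector<uint8_t> &bits, size_t min_run)
{
    std::vector<std::pair<size_t, size_t>> runs;
    size_t start = 0;
    bool in_run = false;

    // Iterate one step past the end so a run touching the last byte
    // is closed by the same code path as any other run.
    for (size_t i = 0; i <= bits.size(); i++) {
        bool idle = i < bits.size() &&
                    (bits[i] == 0x55 || bits[i] == 0xAA);

        if (idle && !in_run) {
            start = i;
            in_run = true;
        } else if (!idle && in_run) {
            if (i - start >= min_run)
                runs.push_back({start, i});
            in_run = false;
        }
    }

    return runs;
}
```

An encoder could call this on its tentative output and then decide per
range whether to re-encode, attenuate or leave the run alone. Short runs
are ignored on purpose, since isolated 0x55/0xAA bytes also show up in
ordinary material.
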
/Tomas

--=-w+MEKCmS8MvCC/7dLgKw
Content-Disposition: attachment; filename="audioquant.cpp"
Content-Type: text/x-c++src; name="audioquant.cpp"; charset="us-ascii"
Content-Transfer-Encoding: 7bit

#define _CRT_SECURE_NO_WARNINGS
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <vector>

using namespace std;

typedef short sample_t;
typedef int64_t err_t;

static vector<sample_t> samples;
static int rate;

#define DESIRED_RMS 20000   //can be tweaked slightly for better utilization of dynamic range depending on material
#define BPS 1
#define DELTA 10            //try table values +- this every pass
                            //higher values = more exhaustive search
#define PRESHAPE
//#define POSTSHAPE
//#define VERBOSE

static void normalize(sample_t *data, int n) {
    float rms = 0;
    int x;

    for (x = 0; x < n; x++)
        rms += data[x]*data[x];

    rms = sqrtf(rms / n);
    fprintf(stderr, "RMS amplitude prior to normalization: %f\n", rms);

    for (x = 0; x < n; x++) {
        int value = data[x] * DESIRED_RMS / rms;
        if (value < -32768) value = -32768;
        if (value > 32767) value = 32767;
        data[x] = value;
    }
}

static void quantize(const sample_t *input, sample_t *output, int n) {
    int x, y, last_error = 0;
    int counts[31][31];

    memset(counts, 0, sizeof(counts));

    for (x = 0; x < n; x++) {
#ifdef PRESHAPE
        //shape using feedback. not sure how correct this is,
        //but quiet parts appear to receive less noise
        output[x] = ((input[x] + 3 * last_error / 4) / 2048) + 15;
#else
        output[x] = (input[x] / 2048) + 15;
#endif
        if (output[x] < 0) output[x] = 0;
        if (output[x] >= 31) output[x] = 30;

        last_error = (output[x] - 15) * 2048 - input[x];

        if (x > 0)
            counts[output[x-1]][output[x]]++;
    }

    for (y = 0; y < 31; y++) {
        for (x = 0; x < 31; x++)
            fprintf(stderr, "%3i ", counts[y][x]);
        fprintf(stderr, "\n");
    }
}

//table is k*31 entries, where k=2^N
static err_t table_adpcm_work(sample_t *data, int n, int *table, int k, int do_output, unsigned char *bits) {
    int x, last = 15, last_error = 0;
    err_t ret = 0;
    int hist[256] = {0};
    float factor;
    int byte = 0;

    for (x = 0; x < n; x++) {
        int y;
        int best, best_error, best_value;

        for (y = 0; y < k; y++) {
            int value = table[y + last*k];
            int error;

#ifdef POSTSHAPE
            //again, may not be entirely correct,
            //but the output sounds roughly right
            error = value - data[x] - last_error / 2;
#else
            error = value - data[x];
#endif
            error *= error;

            if (y == 0 || error < best_error) {
                best = y;
                best_error = error;
                best_value = value;
            }
        }

        last_error = best_value - data[x];
        last = best_value;
        ret += best_error;

        if ((x & 7) == 0)
            byte = 0;

        byte |= best << (x & 7);

        if ((x & 7) == 7)
            hist[byte]++;

        if (do_output) {
            data[x] = best_value;

            //NOTE: Only supports k == 2 (1-bit) properly atm
            if (bits && best)
                bits[x >> 3] |= 1 << (x & 7);
        }
    }

    /* expect x/8/256 of each 0x55 and 0xAA */
    factor = (hist[0x55] + hist[0xAA]) / (float)(n/8/128);
    factor -= 9;

    if (factor < 1)
        factor = 1;

    //fprintf(stderr, "factor = %.2f\tn = %i\n", factor, n);

    return ret * factor;
}

static void decode_to_samples(int *table, unsigned char *bits, int bytes) {
    int last = 15, x;

    for (x = 0; x < bytes*8; x++)
        samples.push_back(((last = table[(last << 1) | ((bits[x >> 3] >> (x & 7)) & 1)]) - 15) * 2048);
}

static void table_adpcm(sample_t *data, int n, int bps, int *best_table, unsigned char *bits, int entries, int k) {
    int x, y;
    int *table = (int*)malloc(entries*sizeof(int));
    err_t e, ebest;
    int pass, changes;
    //speed up processing by only looking at the first tenth of the file
    //this seems to work well enough
    int n2 = n;

    //initialize table with reasonable values
    for (y = 0; y < 31; y++)
        for (x = 0; x < k; x++) {
            if (bps == 1)
                table[x + y*k] = y + 7*(x - k/2) + 4;
            else
                table[x + y*k] = y + 3*(x - k/2) + 3;

            if (table[x + y*k] < 0) table[x + y*k] = 0;
            if (table[x + y*k] > 30) table[x + y*k] = 30;
        }

    ebest = table_adpcm_work(data, n2, table, k, 0, NULL);
    memcpy(best_table, table, entries*sizeof(int));
    fprintf(stderr, "initial error: %li\n", ebest);
    fprintf(stderr, "initial rms: %.2f\n", sqrtf((float)ebest/n2));
    changes = 1;

    for (pass = 1; changes; pass++) {
        changes = 0;

        for (y = 0; y < entries; y++) {
            int delta = DELTA;
            int min = best_table[y] - delta, max = best_table[y] + delta;

            memcpy(table, best_table, entries*sizeof(int));
#ifdef VERBOSE
            fprintf(stderr, "table[% 4i/% 4i] = % 3i\n", y, entries, best_table[y]);
#endif

            if (min < 0)
                min = 0;
            if (max > 30) //31 isn't a legal value
                max = 30;

            for (x = min; x < max; x++) {
                table[y] = x;
                e = table_adpcm_work(data, n2, table, k, 0, NULL);

                if (e < ebest) {
                    float rms = sqrtf((float)ebest/n2);
                    memcpy(best_table, table, entries*sizeof(int));
                    ebest = e;
                    changes++;
#ifdef VERBOSE
                    fprintf(stderr, " -> % 3i -> %li ", x, ebest);
                    //for some reason printf()ing rms above doesn't work..
                    fprintf(stderr, "(%.2f)\n", rms);
#endif
                }
            }
        }

        fprintf(stderr, "pass %i: %i changes, rms = %.2f\n", pass, changes, sqrtf((float)ebest/n2));
    }

    e = table_adpcm_work(data, n, best_table, k, 1, bits);
    fprintf(stderr, "final rms: %.2f\n", sqrtf((float)e/n));
    free(table);
}

#define BOOT    26
#define HARMONY 900     //space needed for Harmony's F4 driver. would be available on a real F4 cart
#define EFFECTS 100
#define IMAGE   1536

//<= 4 KiB per page, we need space for the player
static int pagesizes[8] = {
    3920 - BOOT - HARMONY - EFFECTS - IMAGE,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3920 - EFFECTS,
    3919 - EFFECTS,
};

static void write_l32(FILE *f, uint32_t a) {
    putc(a, f);
    putc(a>>8, f);
    putc(a>>16, f);
    putc(a>>24, f);
}

static void write_wav() {
    fprintf(stderr, "Writing %li samples to output.wav\n", samples.size());
    FILE *wav = fopen("output.wav", "wb");

    fprintf(wav, "RIFF");
    write_l32(wav, samples.size()*2 + 36);
    fprintf(wav, "WAVEfmt ");
    write_l32(wav, 16);
    write_l32(wav, 0x00010001);
    write_l32(wav, rate);
    write_l32(wav, rate*2);
    write_l32(wav, 0x00100002);
    fprintf(wav, "data");
    write_l32(wav, samples.size()*2);
    fwrite(&samples[0], samples.size()*2, 1, wav);
    fclose(wav);
}

int main(int argc, char **argv) {
    FILE *f;
    unsigned char header[44];
    int size, samples;
    sample_t *input, *output;
    int bps;
    uint8_t pagedata[8][4096];
    int x, ofs, bytes, k, entries, iter;
    int *best_table;
    unsigned char *bits;

    if (!(f = fopen(argv[1], "rb")))
        return 1;

    fread(header, 44, 1, f);
    rate = *(int*)&header[24];
    size = *(int*)&header[40];

    /* HACK: don't change output just because pagesizes change */
    samples = 237136;
    fprintf(stderr, "length: %.2f seconds\n", (float)samples / rate);

    if (size / 2 > samples)
        samples = size / 2;

    fprintf(stderr, "rate: %i\n", rate);
    fprintf(stderr, "size: %i = %i samples = %f seconds\n", size, samples, (float)samples/rate);

    input = (sample_t*)malloc(samples*sizeof(sample_t));
    memset(input, 0, samples*2);
    output = (sample_t*)malloc(samples*sizeof(sample_t));
    memset(output, 0, samples*2);
    fread(input, size, 1, f);

    bps = BPS;
    k = 1 << bps;
    entries = 31 * k;
    bytes = (samples + 7) / 8;
    best_table = (int*)malloc(entries*sizeof(int));
    bits = (unsigned char*)malloc(bytes);
    memset(bits, 0, bytes);

    normalize(input, samples);
    quantize(input, output, samples);
    table_adpcm(output, samples, bps, best_table, bits, entries, k);

    for (iter = ofs = 0; iter < 8; iter++) {
        int bytes = pagesizes[iter];

        printf("\tMAC TABLE%i\n", iter);
        printf("\t;%i entries\n", entries);
        printf("\t;%i bits per sample\n", bps);
        printf("ADPCMTable%i\n", iter);

        for (x = 0; x < entries; x++)
            printf("\t.byte %i\n", best_table[x]);

        printf("\tENDM\n");
        printf("\tMAC SAMPLES%i\n", iter);
        printf("SampleData%i\t;%i bytes\n", iter, bytes);

        for (x = ofs; x < ofs+bytes; x++)
            printf("\t.byte %i\n", bits[x]);

        printf("SampleEnd%i\n", iter);
        printf("\tENDM\n");

        ofs += bytes;
    }

    decode_to_samples(best_table, bits, bytes);
    fprintf(stderr, "encoded size: %.2f KiB (%i bps)\n", (samples * bps) / 8192.f, bps);

    free(input);
    free(output);
    fclose(f);

    write_wav();

    free(best_table);
    free(bits);

    return 0;
}

--=-w+MEKCmS8MvCC/7dLgKw--