From: "Большой Человек" <cheloveck2.0@gmail.com> To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] How to upsample and then encode audio Date: Thu, 2 Jun 2022 16:29:40 +0600 Message-ID: <CAN1V-9AqoYbgOvfpOmE38ou6oVOK6HAf5kRvSKhosrURDXozYw@mail.gmail.com> (raw) Basically after transcoding pcm_alaw 8khz to mp3 44.1khz, I can hear only some brief or even swift sound in first 1-2 seconds, unrecognizable sound. So something is wrong with pts/dts, packed to planar convertion, or upsampling. My application does transcoding rtsp camera stream to file. Video and audio. Video works fine and audio remuxing as well. Now I have pcm_alaw 8khz audio stream and want to transcode it to mp4 file along with video. Code is quite cumbersome to construct reproducible part, so firstly I want to know if my logic is right. Here is my draft process (assume all error are checked and handled): create encoder: ``` codec_ = avcodec_find_encoder(AV_CODEC_ID_MP3); enc_ctx_ = avcodec_alloc_context3(codec_); enc_ctx_->bit_rate = 64000; enc_ctx_->codec_type = AVMEDIA_TYPE_AUDIO; enc_ctx_->sample_fmt = codec_->sample_fmts ? codec_->sample_fmts[0] : AV_SAMPLE_FMT_S32P; // functions from here https://www.ffmpeg.org/doxygen/4.1/encode_audio_8c-example.html enc_ctx_->sample_rate = select_sample_rate(codec_); enc_ctx_->channel_layout = select_channel_layout(codec_); enc_ctx_->channels = av_get_channel_layout_nb_channels(enc_ctx_->channel_layout); enc_ctx_->time_base = (AVRational){1, enc_ctx_->sample_rate}; enc_ctx_->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL; if (is_global_header) { enc_ctx_->flags |= AV_CODEC_FLAG_GLOBAL_HEADER; } avcodec_open2(enc_ctx_, codec_, nullptr); ``` create resampler (in_frame): ``` audio_fifo_ = av_audio_fifo_alloc(enc_ctx_->sample_fmt, enc_ctx_->channels, 1)); in_ch_layout_ = in_frame->channel_layout; in_sample_fmt = in_frame->format; in_sample_rate_ = in_frame->sample_rate; swr_ctx_ = swr_alloc_set_opts(NULL, // we're allocating a new context enc_ctx_->channel_layout, // out_ch_layout enc_ctx_->sample_fmt, // out_sample_fmt enc_ctx_->sample_rate, // out_sample_rate in_frame->channel_layout, // in_ch_layout (AVSampleFormat)in_frame->format, // in_sample_fmt in_frame->sample_rate, // in_sample_rate 0, // log_offset NULL); // log_ctx swr_init(swr_ctx_); ``` resample (in_frame, start_pts, start_dts): ``` auto resampled_frame = av_frame_alloc(); auto dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx_, in_frame->sample_rate) + in_frame->nb_samples, enc_ctx_->sample_rate, in_frame->sample_rate, AV_ROUND_UP); // resampled_frame->nb_samples = dst_nb_samples; resampled_frame->format = enc_ctx_->sample_fmt; resampled_frame->channel_layout = enc_ctx_->channel_layout; // resampled_frame->channels = enc_ctx_->channels; resampled_frame->sample_rate = enc_ctx_->sample_rate; error = swr_convert_frame(swr_ctx_, resampled_frame, in_frame); /* Make the FIFO as large as it needs to be to hold both, * the old and the new samples. */ if (av_audio_fifo_size(audio_fifo_) < dst_nb_samples) { av_audio_fifo_realloc(audio_fifo_, dst_nb_samples); } /* Store the new samples in the FIFO buffer. */ auto nb_samples = av_audio_fifo_write(audio_fifo_, reinterpret_cast<void **>(resampled_frame->extended_data), resampled_frame->nb_samples); int delay = 0; // trying to split resampled frame to desired chunks while (av_audio_fifo_size(audio_fifo_) > 0) { const int frame_size = FFMIN(av_audio_fifo_size(audio_fifo_), enc_ctx_->frame_size); auto out_frame = av_frame_alloc(); out_frame->nb_samples = frame_size; out_frame->format = enc_ctx_->sample_fmt; out_frame->channel_layout = enc_ctx_->channel_layout; out_frame->channels = enc_ctx_->channels; out_frame->sample_rate = enc_ctx_->sample_rate; av_frame_get_buffer(out_frame, 0); av_audio_fifo_read(audio_fifo_, (void **)out_frame->data, frame_size) < frame_size); // ***** tried both cases out_frame->pts = in_frame->pts + delay; out_frame->pkt_dts = in_frame->pkt_dts + delay; // swr_next_pts(swr_ctx_, in_frame->pts) + delay; // swr_next_pts(swr_ctx_, in_frame->pkt_dts) + delay; result.push_back(out_frame); delay += frame_size; } return result; ``` encoding and muxing (in_frame): ``` bool DoesNeedResample(const AVFrame * in_frame) { assert(("DoesNeedResample: in_frame is empty", in_frame)); assert(("DoesNeedResample: encoder is not started", is_init_)); if (in_frame->sample_rate != enc_ctx_->sample_rate || in_frame->channel_layout != enc_ctx_->channel_layout || in_frame->channels != enc_ctx_->channels || in_frame->format != enc_ctx_->sample_fmt) { return true; } return false; } av_frame_make_writable(in_frame); streamserver::AVFrames encoding_frames; if (DoesNeedResample(in_frame)) { encoding_frames = Resample(in_frame, av_rescale_q(in_frame->pts, in_audio_stream_timebase_, out_audio_stream_->time_base), av_rescale_q(in_frame->pkt_dts, in_audio_stream_timebase_, out_audio_stream_->time_base)); } else { encoding_frames.push_back(av_frame_clone(in_frame)); } for (auto frame : encoding_frames) { if ((err = avcodec_send_frame(encoder_ctx, frame)) < 0) { AVFrameFree(&frame); } while (err >= 0) { pkt_->data = NULL; pkt_->size = 0; av_init_packet(pkt_); err = avcodec_receive_packet(encoder_ctx, pkt_); if (err == AVERROR(EAGAIN) || err == AVERROR_EOF) { break; } else if (err < 0) { break; } pkt_->stream_index = out_audio_stream_->index; av_interleaved_write_frame(ofmt_ctx_, pkt_); } av_packet_unref(pkt_); } ``` Sound in result video is corrupted, see first paragraph for description. In https://www.ffmpeg.org/doxygen/4.1/transcode_aac_8c-example.html there are lines: ``` /* * Perform a sanity check so that the number of converted samples is * not greater than the number of samples to be converted. * If the sample rates differ, this case has to be handled differently */ av_assert0(output_codec_context->sample_rate == input_codec_context->sample_rate); ``` How to handle such cases? I tried to split resampled frames via fifo in example above! _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next reply other threads:[~2022-06-02 10:31 UTC|newest] Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-06-02 10:29 Большой Человек [this message] 2022-06-02 13:41 ` Leo Izen
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CAN1V-9AqoYbgOvfpOmE38ou6oVOK6HAf5kRvSKhosrURDXozYw@mail.gmail.com \ --to=cheloveck2.0@gmail.com \ --cc=ffmpeg-devel@ffmpeg.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git