From: Massimo Eynard <eynard.massimo@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH] avcodec/mlpdec: Add decoding of object audio data
Date: Mon, 24 Mar 2025 20:07:38 +0100
Message-ID: <fb1381fc-346f-44ba-964d-2b45668b30a9@gmail.com>
In-Reply-To: <bde807f1-bff3-49c3-acdd-28459811ad7a@gmail.com>

On 24/03/2025 00:00, James Almer wrote:
> On 3/23/2025 6:47 PM, Hendrik Leppkes wrote:
>> On Sun, Mar 23, 2025 at 9:35 PM James Almer <jamrial@gmail.com> wrote:
>>>
>>> On 3/23/2025 4:33 PM, Massimo Eynard wrote:
>>>> On 23/03/2025 20:01, James Almer wrote:
>>>>> On 3/22/2025 2:49 PM, Massimo Eynard wrote:
>>>>>> This patch adds support for decoding the fourth MLP substream
>>>>>> which contains the 16-channel presentation used for Atmos
>>>>>> audio objects.
>>>>>>
>>>>>> By default only the first three substreams are decoded
>>>>>> unless the new extract_objects flag is enabled as the resulting
>>>>>> presentation contains audio object feeds instead of classic
>>>>>> loudspeaker feeds.
>>>>>>
>>>>>> As this introduces interpolation of primitive matrices, precision
>>>>>> has been increased to 2.18 fixed point. Therefore this requires
>>>>>> DSP code upgrade which has been done for C and x86 implementations
>>>>>> but not the ARM implementation.
>>>>>>
>>>>>> Adds two FATE tests using existing atmos.thd sample to reflect
>>>>>> changes.
>>>>>>
>>>>>> Signed-off-by: Massimo Eynard <eynard.massimo@gmail.com>
>>>>>> ---
>>>>>>  libavcodec/arm/mlpdsp_armv5te.S  |   2 +-
>>>>>>  libavcodec/arm/mlpdsp_init_arm.c |   3 +-
>>>>>>  libavcodec/mlp.h                 |  10 +-
>>>>>>  libavcodec/mlp_parse.c           |  31 ++-
>>>>>>  libavcodec/mlp_parse.h           |   1 +
>>>>>>  libavcodec/mlp_parser.c          |  11 +-
>>>>>>  libavcodec/mlpdec.c              | 389 +++++++++++++++++++++++++++----
>>>>>>  libavcodec/mlpdsp.c              |  50 +++-
>>>>>>  libavcodec/mlpdsp.h              |  25 ++
>>>>>>  libavcodec/x86/mlpdsp.asm        |  19 +-
>>>>>>  tests/fate/truehd.mak            |  10 +
>>>>>>  11 files changed, 476 insertions(+), 75 deletions(-)
>>>>>
>>>>> With atmos.thd i get:
>>>>>
>>>>>> [aist#0:0/truehd @ 00000209caf3ee00] Guessed Channel Layout: 7.1.4
>>>>>> Input #0, truehd, from '../samples/truehd/atmos.thd':
>>>>>> Duration: N/A, start: 0.000000, bitrate: N/A
>>>>>> Stream #0:0: Audio: truehd (Dolby TrueHD + Dolby Atmos), 48000 Hz, 7.1.4, s32 (24 bit)
>>>>>
>>>>> Which is unlikely to be correct. The file has 11 (or 12) objects, which is exported as 12 channels in an unspecified layout, and automatically assumed to be a 7.1.4 fixed layout.
>>>>>
>>>>
>>>> This is caused by `guess_input_channel_layout` (in `ffmpeg_demux.c`) which tries to assume a layout.
>>>> Would using `AV_CHANNEL_ORDER_CUSTOM` with all channels set to `AV_CHAN_UNKNOWN` (for unknown position, except LFE if present) be a better solution?
>>>
>>> Possibly, but it may make the stream undecodable unless you remap the
>>> channels (probably with a filter in the filterchain).
>>>
>>> Is there no better representation for the output? What are these 12
>>> channels the sample exports? 16 channels (as you say the MLP substream
>>> contains) would match Ambisonics 3rd order, but i assume that doesn't
>>> apply here, unless you should also be outputting something else.
>>>
>>
>> Its object-based audio. Every extra "channel" represents an audio
>> object at any arbitrary position in space, as defined by separate
>> metadata, which you are then supposed to mix together for your final
>> speaker configuration.
>> Typically, the "bed" channels (eg. the base 7.1) will contain audio
>> that doesn't require much localization information, music, background
>> noises, and the objects will contain audio which is more relevant to
>> have full spatial localization. A mixer is then tasked based on the
>> spatial metadata and knowledge of the physical speaker configuration
>> to mix the objects for ideal spatial representation.
>>
>> We don't have a channel layout that would identify this sort of setup
>> as of yet, nevermind a mixer that could actually deal with it, or even
>> exporting the metadata from the TrueHD stream, but baby steps I
>> suppose.
>
> So we'd need a new layout (or pseudo-channel) where you set arbitrary coordinates? Sort of like what Apple defined in https://developer.apple.com/documentation/coreaudiotypes/audio-channel-coordinates
>

That would be the best approach I guess.
Atmos in TrueHD is the same as in E-AC-3 (except for the audio coding part of course) which is described in section 4 of ETSI TS 103 420.
In the specification, the audio "channels" for objects are called "audio object essences" which are supplied to a mixer/renderer alongside the metadata.
Section 4.4 describes the metadata interface.

However the purpose of this patch is only to decode the essences.
What should I do for now?

>>
>> FWIW, taking all this into account, I fully agree that it should by
>> default output the 7.1 representation that everyone can actually
>> process, because the bed+objects representation is rather unexpected
>> and unhandleable at this time.
> Agree.
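
Side note on the "2.18 fixed point" wording in the quoted commit message: a coefficient in such a format would be stored as a signed integer scaled by 2^18 (2 integer bits including the sign, 18 fractional bits). The sketch below only illustrates that idea and is not code from the patch. The function names (coeff_to_fixed, interp_coeff, apply_row) and the plain linear interpolation between two coefficient values are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed for illustration: 18 fractional bits, as in "2.18". */
#define COEFF_FRAC_BITS 18

/* Convert a real coefficient in [-2, 2) to 2.18 fixed point,
 * rounding to the nearest representable value. */
static int32_t coeff_to_fixed(double c)
{
    double scaled = c * (double)(1 << COEFF_FRAC_BITS);
    return (int32_t)(scaled + (scaled >= 0.0 ? 0.5 : -0.5));
}

/* Linearly interpolate a coefficient between `start` and `end`
 * at sample position `pos` out of `len` (hypothetical scheme). */
static int32_t interp_coeff(int32_t start, int32_t end, int pos, int len)
{
    return start + (int32_t)(((int64_t)(end - start) * pos) / len);
}

/* Apply one matrix row to a vector of samples: accumulate in 64 bits,
 * then drop the fractional bits. The right shift of a negative value
 * is assumed to be arithmetic, which holds on common compilers. */
static int32_t apply_row(const int32_t *coeffs, const int32_t *samples, int n)
{
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int64_t)coeffs[i] * samples[i];
    return (int32_t)(acc >> COEFF_FRAC_BITS);
}
```

With 18 fractional bits, 0.5 is stored as 131072, and a row with coefficients 0.5 and 0.25 applied to the samples 1000 and 2000 yields 1000.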
>
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".