From: Zhao Zhili <quinkblack@foxmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: jeebjp@gmail.com
Subject: Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout
Date: Tue, 31 Oct 2023 11:15:36 +0800
Message-ID: <tencent_EF0079690A7C3AACDE397DC309B42C69520A@qq.com>
In-Reply-To: <CAEu79SYaqCt0jdOXgrEjOGUozmwwKD6Ujo6sqmd-=B_2LS1hOQ@mail.gmail.com>

> On Feb 24, 2023, at 21:49, Jan Ekström <jeebjp@gmail.com> wrote:
>
> On Fri, Feb 24, 2023 at 6:25 AM Zhao Zhili <quinkblack@foxmail.com> wrote:
>>
>> From: Zhao Zhili <zhilizhao@tencent.com>
>>
>> Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
>
> Hah, I actually happened to recently start coding uncompressed audio
> support in mp4 myself, but what this commit is handling is what
> basically killed my version off since the channel layout box is
> required.
>
> If you're interested you can check my take over at
> https://github.com/jeeb/ffmpeg/commits/pcmc_parsing_improvements .
>
> Will comment on some things.

I only have an old copy of the spec, so I may have missed some comments
and made some mistakes. Please let me know on the mailing list or by
personal email (this address) if I did something wrong. I have network
issues with IRC and can only read the archives when I get the time. I
don’t work on open source as my day job.

>
>> ---
>>  libavformat/mov.c      |  79 +++++++++++-
>>  libavformat/mov_chan.c | 265 +++++++++++++++++++++++++++++++++++++++++
>>  libavformat/mov_chan.h |  26 ++++
>>  3 files changed, 369 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>> index b125343f84..1db869aa2e 100644
>> --- a/libavformat/mov.c
>> +++ b/libavformat/mov.c
>> @@ -940,6 +940,82 @@ static int mov_read_chan(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>      return 0;
>>  }
>>
>> +static int mov_read_chnl(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>> +{
>> +    int64_t end = av_sat_add64(avio_tell(pb), atom.size);
>> +    int stream_structure;
>> +    int ret = 0;
>> +    AVStream *st;
>> +
>> +    if (c->fc->nb_streams < 1)
>> +        return 0;
>> +    st = c->fc->streams[c->fc->nb_streams-1];
>> +
>> +    /* skip version and flags */
>> +    avio_skip(pb, 4);
>
> We should really not do this any more. Various FullBoxes have multiple
> versions or depend on the flags. See how I have added FullBox things
> recently, although I would prefer us to have a generic macro/function
> setup for this where you then get the version and flags as arguments
> or whatever in the future.
>
> For this specific box, there are now versions 0 and 1 defined since
> circa 2018-2019 or so (visible at least in 14496-12 2022)
>
> Since ISO/IEC has changed the rules for free specifications (against
> the wishes of various spec authors) and all that jazz, this is how
> it's defined in what I have on hand:
>
> 12.2.4 Channel layout
>
> 12.2.4.1 Definition
>
> Box Types: 'chnl'
> Container: Audio sample entry
> Mandatory: No
> Quantity: Zero or one
>
> This box may appear in an audio sample entry to document the
> assignment of channels in the audio stream. It is recommended to use
> this box to convey the base channel count for the DownMixInstructions
> box and other DRC-related boxes specified in ISO/IEC 23003-4.
> The channel layout can be all or part of a standard layout (from an
> enumerated list), or a custom layout (which also allows a track to
> contribute part of an overall layout).
> A stream may contain channels, objects, neither, or both. A stream
> that is neither channel nor object structured can implicitly be
> rendered in a variety of ways.
>
> 12.2.4.2 Syntax
>
> aligned(8) class ChannelLayout extends FullBox('chnl', version, flags=0) {
>     if (version==0) {
>         unsigned int(8) stream_structure;
>         if (stream_structure & channelStructured) {
>             unsigned int(8) definedLayout;
>             if (definedLayout==0) {
>                 for (i = 1 ; i <= layout_channel_count ; i++) {
>                     // layout_channel_count comes from the sample entry
>                     unsigned int(8) speaker_position;
>                     if (speaker_position == 126) { // explicit position
>                         signed int (16) azimuth;
>                         signed int (8) elevation;
>                     }
>                 }
>             } else {
>                 unsigned int(64) omittedChannelsMap;
>                 // a ‘1’ bit indicates ‘not in this track’
>             }
>         }
>         if (stream_structure & objectStructured) {
>             unsigned int(8) object_count;
>         }
>     } else {
>         unsigned int(4) stream_structure;
>         unsigned int(4) format_ordering;
>         unsigned int(8) baseChannelCount;
>         if (stream_structure & channelStructured) {
>             unsigned int(8) definedLayout;
>             if (definedLayout==0) {
>                 unsigned int(8) layout_channel_count;
>                 for (i = 1 ; i <= layout_channel_count ; i++) {
>                     unsigned int(8) speaker_position;
>                     if (speaker_position == 126) { // explicit position
>                         signed int (16) azimuth;
>                         signed int (8) elevation;
>                     }
>                 }
>             } else {
>                 int(4) reserved = 0;
>                 unsigned int(3) channel_order_definition;
>                 unsigned int(1) omitted_channels_present;
>                 if (omitted_channels_present == 1) {
>                     unsigned int(64) omittedChannelsMap;
>                     // a ‘1’ bit indicates ‘not in this track’
>                 }
>             }
>         }
>         if (stream_structure & objectStructured) {
>             // object_count is derived from baseChannelCount
>         }
>     }
> }
>
> 12.2.4.3 Semantics
>
> version is an integer that specifies the version of this box (0 or 1).
>     When authoring, version 1 should be preferred over version 0.
>     Version 1 conveys the channel ordering, which is not always the
>     case for version 0. Version 1 should be used to convey the base
>     channel count for DRC.
>
> stream_structure is a field of flags that define whether the stream
>     has channel or object structure (or both, or neither); the
>     following flags are defined, all other values are reserved:
>         1 the stream carries channels
>         2 the stream carries objects
>
> format_ordering indicates the order of formats in the stream starting
>     from the lowest channel index (see Table). Each format shall only
>     use contiguous channel indices.
>
>     format_ordering  Order
>     0                unknown
>     1                Channels, possibly followed by Objects
>     2                Objects, possibly followed by Channels
>     Remaining values are reserved
>
> definedLayout is a ChannelConfiguration from ISO/IEC 23091-3.
>
> speaker_position is an OutputChannelPosition from ISO/IEC 23091-3. If
>     an explicit position is used, then the azimuth and elevation are
>     as defined as for speakers in ISO/IEC 23091-3. The channel order
>     corresponds to the order of speaker positions.
>
> azimuth is a signed value in degrees, as defined for
>     LoudspeakerAzimuth in ISO/IEC 23091-3.
>
> elevation is a signed value, in degrees, as defined for
>     LoudspeakerElevation in ISO/IEC 23091-3.
>
> channel_order_definition indicates where the ordering of the audio
>     channels for the definedLayout are specified (see Table).
>
>     channel_order_definition  Channel order specification
>     0                         as listed for the ChannelConfigurations in
>                               ISO/IEC 23091-3
>     1                         Default order of audio codec specification
>     2                         Channel ordering #2 of audio codec specification
>     3                         Channel ordering #3 of audio codec specification
>     4                         Channel ordering #4 of audio codec specification
>     Remaining values are reserved
>
> omitted_channels_present is a flag that indicates if it is set to 1
>     that the omittedChannelsMap is present.
>
> omittedChannelsMap is a bit-map of omitted channels; the bits in the
>     channel map are numbered from least-significant to
>     most-significant, and correspond in that ordering with the order
>     of the channels for the configuration as documented in
>     ISO/IEC 23091-3 ChannelConfiguration. 1-bits in the channel map
>     mean that a channel is absent. A zero value of the map therefore
>     always means that the given standard layout is fully present. The
>     default value is 0.
>
> layout_channel_count is the count of channels for the channel layout.
>     The default value is 0 if stream_structure indicates that no
>     channel structure is present. Otherwise, the value is the number
>     of channels of the defined layout, if present, otherwise it is the
>     value from the sample entry.
>
> object_count is the count of channels that contain audio objects. The
>     default value is 0. For version 1 and if the objectStructured flag
>     is set, the value is computed as baseChannelCount minus the
>     channel count of the channel structure.
>
> baseChannelCount represents the combined channel count of the channel
>     layout and the object count. The value must match the base channel
>     count for DRC (see ISO/IEC 23003-4).
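
To make the version 0 / version 1 difference in the quoted syntax concrete, here is a minimal sketch that reads the FullBox version and flags instead of skipping them, then the leading 'chnl' fields. It works on a plain byte buffer rather than an AVIOContext; `ChnlHeader` and `parse_chnl_header` are hypothetical names for illustration, not FFmpeg API. It also assumes the usual MSB-first bit packing, so the 4-bit stream_structure of version 1 sits in the upper nibble.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the leading 'chnl' fields; not an FFmpeg type. */
typedef struct ChnlHeader {
    uint8_t  version;
    uint32_t flags;              /* 24-bit FullBox flags */
    uint8_t  stream_structure;   /* bit 0: channels, bit 1: objects */
    uint8_t  format_ordering;    /* version 1 only, 0 otherwise */
    uint8_t  base_channel_count; /* version 1 only, 0 otherwise */
} ChnlHeader;

/* buf points at the box payload (just after size and type).
 * Returns 0 on success, -1 on truncated data or an unknown version. */
static int parse_chnl_header(const uint8_t *buf, size_t size, ChnlHeader *h)
{
    if (size < 5)
        return -1;

    h->version = buf[0];
    h->flags   = ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) | buf[3];
    h->format_ordering    = 0;
    h->base_channel_count = 0;

    if (h->version == 0) {
        /* version 0: stream_structure is a full byte */
        h->stream_structure = buf[4];
        return 0;
    }
    if (h->version == 1) {
        if (size < 6)
            return -1;
        /* version 1: two 4-bit fields (assumed MSB first), then a count */
        h->stream_structure   = buf[4] >> 4;
        h->format_ordering    = buf[4] & 0x0F;
        h->base_channel_count = buf[5];
        return 0;
    }
    return -1; /* versions other than 0 and 1 are not described above */
}
```

For the payload bytes `01 00 00 00 31 06` this would report version 1, stream_structure 3 (channels and objects), format_ordering 1 and baseChannelCount 6. Rejecting unknown versions rather than guessing keeps a reader from misinterpreting fields added later.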
>
>
>> +
>> +    stream_structure = avio_r8(pb);
>> +
>> +    // stream carries channels
>> +    if (stream_structure & 1) {
>> +        int layout = avio_r8(pb);
>> +
>> +        av_log(c->fc, AV_LOG_TRACE, "'chnl' layout %d\n", layout);
>> +        if (!layout) {
>> +            uint8_t positions[64] = {};
>> +            int enable = 1;
>> +
>> +            for (int i = 0; i < st->codecpar->ch_layout.nb_channels; i++) {
>> +                int speaker_pos = avio_r8(pb);
>> +
>> +                av_log(c->fc, AV_LOG_TRACE, "speaker_position %d\n", speaker_pos);
>> +                if (speaker_pos == 126) { // explicit position
>> +                    int16_t azimuth = avio_rb16(pb);
>> +                    int8_t elevation = avio_r8(pb);
>> +
>> +                    av_log(c->fc, AV_LOG_TRACE, "azimuth %d, elevation %d\n",
>> +                           azimuth, elevation);
>> +                    // Don't support explicit position
>> +                    enable = 0;
>> +                } else if (i < FF_ARRAY_ELEMS(positions)) {
>> +                    positions[i] = speaker_pos;
>> +                } else {
>> +                    // number of channel out of our supported range
>> +                    enable = 0;
>> +                }
>> +            }
>> +
>> +            if (enable) {
>> +                ret = ff_mov_get_layout_from_channel_positions(positions,
>> +                        st->codecpar->ch_layout.nb_channels,
>> +                        &st->codecpar->ch_layout);
>> +                if (ret) {
>> +                    av_log(c->fc, AV_LOG_WARNING, "unsupported speaker positions\n");
>> +                    ret = 0;
>> +                }
>> +            }
>> +        } else {
>> +            uint64_t omitted_channel_map = avio_rb64(pb);
>> +
>> +            if (omitted_channel_map) {
>> +                avpriv_request_sample(c->fc, "omitted_channel_map 0x%" PRIx64 " != 0",
>> +                                      omitted_channel_map);
>> +                return AVERROR_PATCHWELCOME;
>> +            }
>> +            ff_mov_get_channel_layout_from_config(layout, &st->codecpar->ch_layout);
>> +        }
>> +    }
>> +
>> +    // stream carries objects
>> +    if (stream_structure & 2) {
>> +        int obj_count = avio_r8(pb);
>> +        av_log(c->fc, AV_LOG_TRACE, "'chnl' with object_count %d\n", obj_count);
>> +    }
>> +
>> +    avio_seek(pb, end, SEEK_SET);
>> +    return ret;
>> +}
>> +
>>  static int mov_read_wfex(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>>  {
>>      AVStream *st;
>> @@ -7784,7 +7860,8 @@ static const MOVParseTableEntry mov_default_parse_table[] = {
>>  { MKTAG('w','i','d','e'), mov_read_wide }, /* place holder */
>>  { MKTAG('w','f','e','x'), mov_read_wfex },
>>  { MKTAG('c','m','o','v'), mov_read_cmov },
>> -{ MKTAG('c','h','a','n'), mov_read_chan }, /* channel layout */
>> +{ MKTAG('c','h','a','n'), mov_read_chan }, /* channel layout from quicktime */
>> +{ MKTAG('c','h','n','l'), mov_read_chnl }, /* channel layout from ISO-14496-12 */
>>  { MKTAG('d','v','c','1'), mov_read_dvc1 },
>>  { MKTAG('s','g','p','d'), mov_read_sgpd },
>>  { MKTAG('s','b','g','p'), mov_read_sbgp },
>> diff --git a/libavformat/mov_chan.c b/libavformat/mov_chan.c
>> index f66bf0df7f..10ebcdc08f 100644
>> --- a/libavformat/mov_chan.c
>> +++ b/libavformat/mov_chan.c
>> @@ -551,3 +551,268 @@ int ff_mov_read_chan(AVFormatContext *s, AVIOContext *pb, AVStream *st,
>>
>>      return 0;
>>  }
>> +
>> +/* ISO/IEC 23001-8, 8.2 */
>> +static const AVChannelLayout iso_channel_configuration[] = {
>> +    // 0: any setup
>> +    {},
>> +
>
> I think the better naming for this would be CICP channel configuration
> since the specification is called "common independent coding points"
> (for video this is shared with ITU-T H.273 which is free).
>
> Also do note that a whole bunch of these are not in the channel order
> that FFmpeg wants after stereo :<
>
> Thankfully with manual mapping FFmpeg native channel layouts' channel
> order should be writable and readable.
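
As a rough illustration of the manual mapping idea, the sketch below builds a permutation from the order in which channels are stored in the file to a "native" order. The channel IDs are hypothetical stand-ins, not FFmpeg's AVChannel values, and the only assumption about the native order is that a smaller ID comes first. `build_native_map` is likewise a made-up helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical channel IDs; a smaller value means earlier in native order. */
enum { CH_FL, CH_FR, CH_FC, CH_LFE, CH_FLC, CH_FRC, CH_SL, CH_SR };

/* src_order[i] is the ID of the i-th channel as stored in the file.
 * On success map[slot] holds the file index that feeds native slot `slot`.
 * Returns 0 on success, -1 on a bad count, bad ID or repeated ID. */
static int build_native_map(const int *src_order, int nb, int *map)
{
    uint64_t seen = 0;

    if (nb < 0 || nb > 64)
        return -1;

    /* validate: every ID in range and used at most once */
    for (int i = 0; i < nb; i++) {
        int id = src_order[i];
        if (id < 0 || id >= 64 || ((seen >> id) & 1))
            return -1;
        seen |= UINT64_C(1) << id;
    }

    /* native slot of a channel = number of present IDs smaller than it */
    for (int i = 0; i < nb; i++) {
        int slot = 0;
        for (int j = 0; j < nb; j++)
            if (src_order[j] < src_order[i])
                slot++;
        map[slot] = i;
    }
    return 0;
}
```

With a file order of FL FR FC LFE SL SR FLC FRC, the resulting map is {0, 1, 2, 3, 6, 7, 4, 5}: native slots 4 and 5 take the front-center pair from file indices 6 and 7. Which CICP speaker positions correspond to which native channels is a separate question that this sketch does not answer.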
>
> The channel orders for various CICP layouts can be found both in the
> referenced specifications, as well as in the comments from Apple's
> headers for example
>
>     // ISO/IEC 23091-3, channels w/orderings
>     kAudioChannelLayoutTag_CICP_1 = kAudioChannelLayoutTag_MPEG_1_0,
>         ///< C
>     kAudioChannelLayoutTag_CICP_2 = kAudioChannelLayoutTag_MPEG_2_0,
>         ///< L R
>     kAudioChannelLayoutTag_CICP_3 = kAudioChannelLayoutTag_MPEG_3_0_A,
>         ///< L R C
>     kAudioChannelLayoutTag_CICP_4 = kAudioChannelLayoutTag_MPEG_4_0_A,
>         ///< L R C Cs
>     kAudioChannelLayoutTag_CICP_5 = kAudioChannelLayoutTag_MPEG_5_0_A,
>         ///< L R C Ls Rs
>     kAudioChannelLayoutTag_CICP_6 = kAudioChannelLayoutTag_MPEG_5_1_A,
>         ///< L R C LFE Ls Rs
>     kAudioChannelLayoutTag_CICP_7 = kAudioChannelLayoutTag_MPEG_7_1_B,
>         ///< L R C LFE Ls Rs Lc Rc
>
>     kAudioChannelLayoutTag_CICP_9 = kAudioChannelLayoutTag_ITU_2_1,
>         ///< L R Cs
>     kAudioChannelLayoutTag_CICP_10 = kAudioChannelLayoutTag_ITU_2_2,
>         ///< L R Ls Rs
>     kAudioChannelLayoutTag_CICP_11 = kAudioChannelLayoutTag_MPEG_6_1_A,
>         ///< L R C LFE Ls Rs Cs
>     kAudioChannelLayoutTag_CICP_12 = kAudioChannelLayoutTag_MPEG_7_1_C,
>         ///< L R C LFE Ls Rs Rls Rrs
>     kAudioChannelLayoutTag_CICP_13 = (204U<<16) | 24,
>         ///< Lc Rc C LFE2 Rls Rrs L R Cs LFE3 Lss Rss Vhl Vhr Vhc Ts
>         ///< Ltr Rtr Ltm Rtm Ctr Cb Lb Rb
>
>     kAudioChannelLayoutTag_CICP_14 = (205U<<16) | 8,
>         ///< L R C LFE Ls Rs Vhl Vhr
>     kAudioChannelLayoutTag_CICP_15 = (206U<<16) | 12,
>         ///< L R C LFE2 Rls Rrs LFE3 Lss Rss Vhl Vhr Ctr
>
>     kAudioChannelLayoutTag_CICP_16 = (207U<<16) | 10,
>         ///< L R C LFE Ls Rs Vhl Vhr Lts Rts
>     kAudioChannelLayoutTag_CICP_17 = (208U<<16) | 12,
>         ///< L R C LFE Ls Rs Vhl Vhr Vhc Lts Rts Ts
>     kAudioChannelLayoutTag_CICP_18 = (209U<<16) | 14,
>         ///< L R C LFE Ls Rs Lbs Rbs Vhl Vhr Vhc Lts Rts Ts
>
>     kAudioChannelLayoutTag_CICP_19 = (210U<<16) | 12,
>         ///< L R C LFE Rls Rrs Lss Rss Vhl Vhr Ltr Rtr
>     kAudioChannelLayoutTag_CICP_20 = (211U<<16) | 14,
>         ///< L R C LFE Rls Rrs Lss Rss Vhl Vhr Ltr Rtr Leos Reos
>
> Best regards,
> Jan
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".