January 7, 2002

MP4V2 LIBRARY INTERNALS
=======================

This document provides an overview of the internals of the mp4v2 library
to aid those who wish to modify and extend it. Before reading this document,
I recommend familiarizing yourself with the MP4 (or QuickTime) file format
standard and the mp4v2 library API. The API is described in a set of man pages
in mpeg4ip/doc/mp4v2, or, if you prefer, in mp4.h itself.

All the library code is written in C++; however, the library API uses
C calling conventions and hence is linkable by both C and C++ programs. The
library has been compiled and used on Linux, BSD, Windows, and Mac OS X.
Other than libc, the library has no external dependencies, and hence can
be used independently of the mpeg4ip package if desired. The library is
used for both real-time recording and playback in mpeg4ip, and its runtime
performance is up to those tasks. On the IA32 architecture compiled with gcc,
the stripped library is approximately 600 KB of code and initialized data.

It is useful to think of the mp4v2 library as consisting of four layers:
infrastructure, file format, generic tracks, and type specific track helpers.
A description of each layer follows, from the fundamental to the optional.


Infrastructure
==============

The infrastructure layer provides basic file I/O, memory allocation,
error handling, string utilities, and protected arrays. The source files
for this layer are mp4file_io, mp4util, and mp4array.

Note that the array classes use preprocessor macros instead of C++
templates. The rationale for this is to increase portability given the
sometimes incomplete support by some compilers for templates.


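As a rough illustration of the macro approach, the sketch below generates
a bounds-checked array class per element type without using templates.
The macro name DECLARE_CHECKED_ARRAY and the generated class names are
hypothetical and are not taken from mp4array; the sketch also assumes
plain-old-data element types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

/* Hypothetical macro (not the library's own): expands to a complete
 * bounds-checked, growable array class for one plain-old-data type. */
#define DECLARE_CHECKED_ARRAY(Type, Name)                                  \
    class Name {                                                           \
    public:                                                                \
        Name() : m_elements(NULL), m_count(0), m_capacity(0) {}            \
        ~Name() { std::free(m_elements); }                                 \
        std::size_t Size() const { return m_count; }                       \
        void Add(Type value) {                                             \
            if (m_count == m_capacity) {                                   \
                std::size_t newCap = m_capacity ? m_capacity * 2 : 8;      \
                void* p = std::realloc(m_elements, newCap * sizeof(Type)); \
                if (p == NULL) throw std::bad_alloc();                     \
                m_elements = static_cast<Type*>(p);                        \
                m_capacity = newCap;                                       \
            }                                                              \
            assert(m_count < m_capacity);                                  \
            m_elements[m_count++] = value;                                 \
        }                                                                  \
        Type& operator[](std::size_t index) {                              \
            if (index >= m_count) throw std::out_of_range(#Name " index"); \
            return m_elements[index];                                      \
        }                                                                  \
    private:                                                               \
        Name(const Name&);            /* copying disabled */               \
        Name& operator=(const Name&); /* copying disabled */               \
        Type* m_elements;                                                  \
        std::size_t m_count;                                               \
        std::size_t m_capacity;                                            \
    }

// Each use of the macro stamps out an independent class.
DECLARE_CHECKED_ARRAY(int, IntArray);
DECLARE_CHECKED_ARRAY(double, DoubleArray);
```

An out-of-range access such as a[a.Size()] throws instead of reading
past the end, which is the "protected" behaviour referred to above.
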
File Format
===========

The file format layer provides the translation from the on-disk MP4 file
format to in-memory C++ structures and back to disk. It is intended
to exactly match the MP4 specification in syntax and semantics. It
represents the majority of the code.

There are three key structures at the file format layer: atoms, properties,
and descriptors.

Atoms are the primary containers within an mp4 file. They can contain
any combination of properties, other atoms, or descriptors.

The mp4atom files contain the base class for all the atoms, and provide
generic functions that cover most cases. Atoms that only specify their
properties, or, in the case of a container atom, their possible child
atoms, are located in atom_standard.cpp; this covers most atoms.

Atoms that have special read, generate, or write needs are implemented
as subclasses in the file atom_<name>.cpp, where <name> is the four letter
name of the atom defined in the MP4 specification. In these more
specialized cases the atom specific file provides routines to initialize,
read, or write the atom.

Properties are the atomic pieces of information. The basic types of
properties are integers, floats, strings, and byte arrays. For integers
and floats there are subclasses that represent the different storage sizes,
e.g. 8, 16, 24, 32, and 64 bit integers. For strings, there is one property
class with a number of options regarding exact storage details, e.g. null
terminated, fixed length, counted.

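To make the storage sizes concrete: MP4 stores multi-byte integers in
big-endian order, and a size such as 24 bits has no native C++ type, so
it has to be decoded a byte at a time into a wider one. The sketch below
shows one way to do that; the function names ReadBigEndian and
WriteBigEndian are hypothetical and are not the property classes' actual
methods.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode an unsigned big-endian integer stored in nbytes bytes (1..8),
// e.g. nbytes == 3 for a 24 bit property.
uint64_t ReadBigEndian(const uint8_t* bytes, std::size_t nbytes)
{
    assert(bytes != NULL && nbytes >= 1 && nbytes <= 8);
    uint64_t value = 0;
    for (std::size_t i = 0; i < nbytes; i++) {
        value = (value << 8) | bytes[i];   // most significant byte first
    }
    return value;
}

// Encode the low nbytes bytes of value in big-endian order.
void WriteBigEndian(uint64_t value, uint8_t* bytes, std::size_t nbytes)
{
    assert(bytes != NULL && nbytes >= 1 && nbytes <= 8);
    for (std::size_t i = 0; i < nbytes; i++) {
        bytes[nbytes - 1 - i] = static_cast<uint8_t>(value & 0xFF);
        value >>= 8;
    }
}
```

With this shape, one routine pair can serve every integer width, and the
per-size subclasses only need to differ in the byte count they pass.
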
For implementation reasons, there are also two special properties, table
and descriptor, that are actually containers for groups of properties.
Because these containers provide a property interface, much code can
be written in a generic fashion.

The mp4property files contain all the property related classes.

Descriptors are containers that derive from the MPEG conventions and use
different encoding rules than the atoms derived from the QuickTime file
format. This means more use of bitfields and conditional existence with
an emphasis on bit efficiency at the cost of encoding/decoding complexity.
Descriptors can contain other descriptors and/or properties.

The mp4descriptor files contain the generic base class for descriptors.
Also the mp4property files have a descriptor wrapper class that allows a
descriptor to behave as if it were a property. The specific descriptors
are implemented as subclasses of the base class descriptor in a manner
similar to that of atoms. The descriptors, ocidescriptors, and qosqualifiers
files contain these implementations.

Each atom/property/descriptor has a name closely related to that in the
MP4 specification. The difference is that the mp4v2 library doesn't
use '-' or '_' in property names and instead capitalizes the first letter
of each word after the first, e.g. "thisIsAPropertyName". A complete name
specifies the complete container path. The names follow the C/C++ syntax
for elements and array indices.

Examples are:
    "moov.mvhd.duration"
    "moov.trak[2].tkhd.duration"
    "moov.trak[3].mdia.minf.stbl.stsz[101].sampleSize"

Note that "*" can be used as a wildcard for an atom name (only). This is
most useful when dealing with the stsd atom, which contains child atoms
with various names but shared property names.

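The naming scheme can be illustrated with a small parser that splits a
complete name into its path segments and matches an atom name against
the "*" wildcard. This is only a sketch of the syntax described above;
PathSegment, ParsePropertyPath, and NameMatches are hypothetical names,
and the library's own lookup code may be organized quite differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// One step of a path: an element name plus an optional array index.
struct PathSegment {
    std::string name;
    bool hasIndex;
    unsigned long index;
};

// Split e.g. "moov.trak[2].tkhd.duration" into segments.
// Returns false if the path is empty or malformed.
bool ParsePropertyPath(const std::string& path, std::vector<PathSegment>& out)
{
    out.clear();
    std::string::size_type start = 0;
    while (start <= path.size()) {
        std::string::size_type dot = path.find('.', start);
        if (dot == std::string::npos) dot = path.size();
        std::string token = path.substr(start, dot - start);
        if (token.empty()) return false;

        PathSegment seg;
        seg.hasIndex = false;
        seg.index = 0;
        std::string::size_type open = token.find('[');
        if (open == std::string::npos) {
            seg.name = token;
        } else {
            // Expect the form name[digits].
            if (open == 0 || token[token.size() - 1] != ']') return false;
            std::string digits = token.substr(open + 1, token.size() - open - 2);
            if (digits.empty() ||
                digits.find_first_not_of("0123456789") != std::string::npos) {
                return false;
            }
            seg.name = token.substr(0, open);
            seg.hasIndex = true;
            seg.index = std::strtoul(digits.c_str(), NULL, 10);
        }
        assert(!seg.name.empty());
        out.push_back(seg);
        start = dot + 1;
    }
    return !out.empty();
}

// "*" matches any atom name; anything else must match exactly.
bool NameMatches(const std::string& pattern, const std::string& atomName)
{
    return pattern == "*" || pattern == atomName;
}
```

For instance, under these assumptions a pattern segment "*" would match
both an "mp4a" and an "mp4v" child of stsd, so one path can address a
property they share.
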
Note that internally when performance matters the code looks up a property
by name once, and then stores the returned pointer to the property class.

To add an atom, first check whether an existing atom can be used. If not,
decide whether the atom needs special read, write, or generate handling;
for example, a property in the atom may change which other properties
are present (adding or removing them). If there are no special cases,
add the atom properties to atom_standard.cpp. If there are special cases,
add a new file, add a new class to atoms.h, and add the class to
MP4Atom::CreateAtom in mp4atom.cpp.



Generic Tracks
==============

The two entities at this level are the mp4 file as a whole and the tracks
which are contained within it. The mp4file and mp4track files contain the
implementation.

The critical work done by this layer is to map the collection of atoms,
properties, and descriptors that represent a media track into a useful
and consistent set of operations. For example, reading or writing a media
sample of a track is a relatively simple operation from the library API
perspective. However, there are numerous pieces of information in the mp4
file that need to be properly used and updated to do this. This layer
handles all those details.

Given familiarity with the mp4 spec, the code should be straightforward.
What may not be immediately obvious are the functions to handle chunks of
media samples. These exist to allow optimization of the mp4 file layout by
reordering the chunks on disk to interleave the media sample chunks of
multiple tracks in time order. (See the MP4Optimize API doc.)


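The idea of interleaving chunks in time order can be sketched as a sort
over chunk descriptions gathered from every track. This is an
illustration only and is not how MP4Optimize is implemented; ChunkRef
and PlanInterleavedOrder are hypothetical names, and the sketch assumes
the start times have already been converted to one common timescale.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one chunk of media samples.
struct ChunkRef {
    uint32_t trackId;
    uint32_t chunkId;
    uint64_t startTime;   // assumed to be in a timescale shared by all tracks
};

// Order by start time; break ties by track, then by chunk number,
// so the resulting layout is deterministic.
static bool StartsEarlier(const ChunkRef& a, const ChunkRef& b)
{
    if (a.startTime != b.startTime) return a.startTime < b.startTime;
    if (a.trackId != b.trackId) return a.trackId < b.trackId;
    return a.chunkId < b.chunkId;
}

// Given the chunks of all tracks, return the order in which they would
// be written to disk so that tracks are interleaved in time.
std::vector<ChunkRef> PlanInterleavedOrder(std::vector<ChunkRef> chunks)
{
    std::sort(chunks.begin(), chunks.end(), StartsEarlier);
    assert(chunks.empty() || !StartsEarlier(chunks.back(), chunks.front()));
    return chunks;
}
```

A real rewrite would also have to copy the chunk data and update the
chunk offset tables of each track to match the new positions.
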
Type Specific Track Helpers
===========================

This specialized code goes beyond the meta-information about tracks in
the mp4 file to understanding and manipulating the information in the
track samples. There are currently two helpers in the library:
the MPEG-4 Systems Helper, and the RTP Hint Track Helper.

The MPEG-4 Systems Helper is currently limited to creating the OD, BIFS,
and SDP information about a minimal audio/video scene consistent with
the Internet Streaming Media Alliance (ISMA) specifications. We will be
evaluating how best to generalize the library's helper functions for
MPEG-4 Systems without overburdening the implementation. The code for
this helper is found in the isma and odcommands files.

The RTP Hint Track Helper is more extensive in its support. The hint
tracks contain the track packetization information needed to build
RTP packets for streaming. The library can construct RTP packets based
on the hint track, making RTP based servers significantly easier to write.

All code related to RTP hint tracks is in the rtphint files. It would also
be useful to look at test/mp4broadcaster and mpeg4ip/server/mp4creator for
examples of how this part of the library API can be used.


Library API
===========

The library API is defined and implemented in the mp4 files. The API uses
C linkage conventions, and the mp4.h file adapts itself according to whether
C or C++ is the compilation mode.

All API calls are implemented in mp4.cpp and are essentially pass-throughs
to the MP4File member functions. This ensures that the library has internal
access to the same functions as are available via the API. All the calls in
mp4.cpp use C++ try/catch blocks to protect against any runtime errors in
the library. Upon error the library will print a diagnostic message if the
verbosity level has MP4_DETAILS_ERROR set, and return a distinguished error
value, typically 0 or -1.

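The pass-through pattern can be sketched as follows. Everything in this
example is a stand-in: DemoFile and Demo_GetSampleSize are hypothetical
names, the sketch throws standard exceptions rather than the library's
own error type, and it always prints the diagnostic instead of
consulting a verbosity level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in for the internal C++ file class.
class DemoFile {
public:
    explicit DemoFile(uint32_t numSamples) : m_numSamples(numSamples) {}

    // Throws on an invalid sample id, as internal code is free to do.
    uint32_t GetSampleSize(uint32_t sampleId) const {
        if (sampleId < 1 || sampleId > m_numSamples) {
            throw std::out_of_range("invalid sample id");
        }
        uint32_t index = sampleId - 1;
        assert(index < m_numSamples);
        return 100 + index;   // placeholder for a real table lookup
    }

private:
    uint32_t m_numSamples;
};

// API layer: C linkage, opaque handle, and no exception may escape.
// Errors are reported through the return value (0 in this sketch).
extern "C" uint32_t Demo_GetSampleSize(void* handle, uint32_t sampleId)
{
    try {
        if (handle == NULL) throw std::invalid_argument("null file handle");
        return static_cast<DemoFile*>(handle)->GetSampleSize(sampleId);
    } catch (const std::exception& e) {
        std::fprintf(stderr, "Demo_GetSampleSize: %s\n", e.what());
        return 0;
    } catch (...) {
        return 0;
    }
}
```

Keeping the wrapper this thin means a C caller only ever sees plain
return values, while internal code can call the member function directly
and let errors propagate as exceptions.
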
The test and util subdirectories contain useful examples of how to
use the library. Also the mp4creator and mp4live programs within
mpeg4ip demonstrate more complete usage of the library API.


Debugging
=========

Since mp4 files are fairly complicated, extensive debugging support is
built into the library. Multi-level diagnostic messages are available
under the control of a verbosity bitmask described in the API.

Also the library provides the MP4Dump() call, which produces an ASCII
version of the mp4 file meta-information. The mp4dump utility is a
wrapper executable around this function.

The mp4extract program, provided in the utilities directory, is useful
for extracting a track from an mp4 file and putting the media data back
into its own file. It can also extract each sample of a track into its
own file if that is desired.

When all else fails, mp4 files are amenable to debugging by direct
examination. Since the atom names are four letter ASCII codes, finding
reference points in a hex dump is feasible. On UNIX, the od command
is your friend: "od -t x1z -A x [-j 0xXXXXXX] foo.mp4" will print
a hex and ASCII dump, with hex addresses, starting optionally from
a specified offset. The library diagnostic messages can provide
information on where the library is reading or writing.


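When reading such a dump it helps to know that each atom begins with a
32 bit big-endian size (covering the whole atom, header included)
followed by the four letter type. The sketch below walks the top-level
atoms of an in-memory buffer on that basis. AtomHeader and
ListTopLevelAtoms are hypothetical names, and the special size values
0 (atom extends to end of file) and 1 (a 64 bit size follows) are
deliberately not handled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct AtomHeader {
    std::size_t offset;   // where the atom starts in the buffer
    uint32_t size;        // total size in bytes, including the 8 byte header
    std::string type;     // four letter code, e.g. "moov"
};

static uint32_t ReadU32(const uint8_t* p)
{
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) |
            static_cast<uint32_t>(p[3]);
}

// List the top-level atoms in data[0..length). Stops at the first header
// whose size is not a plain 32 bit size that fits in the buffer.
std::vector<AtomHeader> ListTopLevelAtoms(const uint8_t* data, std::size_t length)
{
    assert(data != NULL || length == 0);
    std::vector<AtomHeader> atoms;
    std::size_t pos = 0;
    while (length - pos >= 8) {
        AtomHeader h;
        h.offset = pos;
        h.size = ReadU32(data + pos);
        h.type.assign(reinterpret_cast<const char*>(data + pos + 4), 4);
        if (h.size < 8 || h.size > length - pos) break;
        atoms.push_back(h);
        pos += h.size;   // the next sibling starts right after this atom
    }
    return atoms;
}
```

The offsets this produces correspond to the hex addresses printed by od,
so the two can be compared side by side.
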
General caveats
===============

The coding convention is to use the C++ throw operator whenever an
unrecoverable error occurs. This throw is caught at the API layer
in mp4.cpp and translated into an error value.

Be careful about indices. Internally, we follow the C/C++ convention
of using zero-based indices. However, the MP4 spec uses one-based indices
for things like samples, and hence the library API uses this convention.

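One way to keep the two conventions from leaking into each other is to
convert in exactly one place, at the boundary between API and internal
code. The sketch below assumes that; SampleId, kNoSample, ToIndex,
ToSampleId, and SampleSizeAt are hypothetical names, and treating id 0
as "no sample" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint32_t SampleId;         // one-based id, as seen by API callers
const SampleId kNoSample = 0;      // assumption: 0 never names a sample

// Convert an API-level id to an internal array index, and back.
inline std::size_t ToIndex(SampleId id)
{
    assert(id != kNoSample);
    return static_cast<std::size_t>(id) - 1;
}

inline SampleId ToSampleId(std::size_t index)
{
    return static_cast<SampleId>(index + 1);
}

// Look up a sample size by one-based id in a zero-based table.
// Returns 0 for an id that does not name a sample.
uint32_t SampleSizeAt(const uint32_t* sizes, std::size_t count, SampleId id)
{
    if (id == kNoSample || id > count) return 0;
    return sizes[ToIndex(id)];
}
```

With a three entry table, ids 1 through 3 are valid and map to indices
0 through 2; ids 0 and 4 are rejected before any array access happens.
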