diff --git a/CHANGELOG.md b/CHANGELOG.md index c911c7a2..375c4064 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,13 +13,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - PrftBox Info output - Removed ReplaceChild method of StsdBox - CreateHdlr name for timed metadata +- extension .m4s is interpreted as MP4 file in mp4ff-pslister ### Added - NTP64 struct with methods to convert to time.Time - Constants for PrftBox flags -- Unittest to all commands and examples - +- Unittest to all programs in [cmd](cmd) and [examples](examples). +- Documentation in doc.go files for all packages ### Fixed diff --git a/Makefile b/Makefile index 4eb25cab..556d3514 100644 --- a/Makefile +++ b/Makefile @@ -12,9 +12,9 @@ mp4ff-crop mp4ff-decrypt mp4ff-encrypt mp4ff-info mp4ff-nallister mp4ff-pslister go build -ldflags "-X github.com/Eyevinn/mp4ff/mp4.commitVersion=$$(git describe --tags HEAD) -X github.com/Eyevinn/mp4ff/mp4.commitDate=$$(git log -1 --format=%ct)" -o out/$@ ./cmd/$@/main.go .PHONY: examples -examples: initcreator multitrack resegmenter segmenter +examples: add-sidx combine-segs initcreator multitrack resegmenter segmenter -initcreator multitrack resegmenter segmenter: +add-sidx combine-segs initcreator multitrack resegmenter segmenter: go build -o examples-out/$@ ./examples/$@ .PHONY: test @@ -25,6 +25,12 @@ test: prepare testsum: prepare gotestsum +.PHONY: open-docs +open-docs: + echo "If needed: go install golang.org/x/pkgsite/cmd/pkgsite@latest" + pkgsite -http localhost:9999 + # open http://localhost:9999/pkg/github.com/Eyevinn/mp4ff/ + .PHONY: coverage coverage: # Ignore (allow) packages without any tests diff --git a/README.md b/README.md index 61875cfa..b050bc97 100644 --- a/README.md +++ b/README.md @@ -7,75 +7,89 @@ [![Go Report Card](https://goreportcard.com/badge/github.com/Eyevinn/mp4ff)](https://goreportcard.com/report/github.com/Eyevinn/mp4ff) 
[![license](https://img.shields.io/github/license/Eyevinn/mp4ff.svg)](https://github.com/Eyevinn/mp4ff/blob/master/LICENSE) -Package mp4ff implements MP4 media file parsing and writing for AVC and HEVC video, AAC and AC-3 audio, and stpp and wvtt subtitles. -It is focused on fragmented files as used for streaming in DASH, MSS and HLS fMP4, but can also decode and encode all boxes needed for -progressive MP4 files. In particular, the tool `mp4ff-crop` can be -used to crop a progressive file. +Module mp4ff implements MP4 media file parsing and writing for AVC and HEVC video, AAC and AC-3 audio, stpp and wvtt subtitles, and +timed metadata tracks. +It is focused on fragmented files as used for streaming in MPEG-DASH, MSS and HLS fMP4, but can also decode and encode all +boxes needed for progressive MP4 files. ## Command Line Tools -Some useful command line tools are available in `cmd`. +Some useful command line tools are available in [cmd](cmd) directory. -1. `mp4ff-info` prints a tree of the box hierarchy of a mp4 file with information - about the boxes. The level of detail can be increased with the option `-l`, like `-l all:1` for all boxes - or `-l trun:1,stss:1` for specific boxes. -2. `mp4ff-pslister` extracts and displays SPS and PPS for AVC or HEVC in a mp4 or a bytestream (Annex B) file. +1. [mp4ff-info](cmd/mp4ff-info) prints a tree of the box hierarchy of a mp4 file with information + about the boxes. +2. [mp4ff-pslister](cmd/mp4ff-pslister) extracts and displays SPS and PPS for AVC or HEVC in a mp4 or a bytestream (Annex B) file. Partial information is printed for HEVC. -3. `mp4ff-nallister` lists NALUs and picture types for video in progressive or fragmented file -4. `mp4ff-subslister` lists details of wvtt or stpp (WebVTT or TTML in ISOBMFF) subtitle samples -5. `mp4ff-crop` shortens a progressive mp4 file to a specified duration -6. `mp4ff-encrypt` encrypts a fragmented file using cenc or cbcs Common Encryption scheme -7. 
`mp4ff-decrypt` decrypts a fragmented file encrypted using cenc or cbcs Common Encryption scheme +3. [mp4ff-nallister](cmd/mp4ff-nallister) lists NALUs and picture types for video in progressive or fragmented file +4. [mp4ff-subslister](cmd/mp4ff-subslister) lists details of wvtt or stpp (WebVTT or TTML in ISOBMFF) subtitle samples +5. [mp4ff-crop](cmd/mp4ff-crop) crops a **progressive** mp4 file to a specified duration +6. [mp4ff-encrypt](cmd/mp4ff-encrypt) encrypts a fragmented file using cenc or cbcs Common Encryption scheme +7. [mp4ff-decrypt](cmd/mp4ff-decrypt) decrypts a fragmented file encrypted using cenc or cbcs Common Encryption scheme You can install these tools by going to their respective directory and run `go install .` or directly from the repo with go install github.com/Eyevinn/mp4ff/cmd/mp4ff-info@latest + go install github.com/Eyevinn/mp4ff/cmd/mp4ff-encrypt@latest + ... + +for each individual tool. ## Example code -Example code is available in the `examples` directory. +Example code for some common use cases is available in the [examples](examples) directory. The examples and their functions are: -1. `initcreator` creates typical init segments (ftyp + moov) for video and audio -2. `resegmenter` reads a segmented file (CMAF track) and resegments it with other - segment durations using `fullSample` -3. `segmenter` takes a progressive mp4 file and creates init and media segments from it. +1. [initcreator](examples/initcreator) creates typical init segments (ftyp + moov) for different video and + audio codecs +2. [resegmenter](examples/resegmenter) reads a segmented file (CMAF track) and resegments it with other + segment durations using `FullSample` +3. [segmenter](examples/segmenter) takes a progressive mp4 file and creates init and media segments from it. This tool has been extended to support generation of segments with multiple tracks as well as reading and writing `mdat` in lazy mode -4. 
`multitrack` parses a fragmented file with multiple tracks -5. `combine-segs` combines single-track init and media segments into multi-track segments +4. [multitrack](examples/multitrack) parses a fragmented file with multiple tracks +5. [combine-segs](examples/combine-segs) combines single-track init and media segments into multi-track segments +6. [add-sidx](examples/add-sidx) adds a top-level sidx box describing the segments of a fragmented file. + +## Packages -## Library +The top-level packages in the mp4ff module are -The library has functions for parsing (called Decode) and writing (Encode) in the package `mp4ff/mp4`. -It also contains codec specific parsing of AVC/H.264 including complete parsing of -SPS and PPS in the package `mp4ff.avc`. HEVC/H.265 parsing is less complete, and available as `mp4ff.hevc`. -Supplementary Enhancement Information can be parsed and written using the package `mp4ff.sei`. +1. [mp4](mp4) provides support for parsing (called Decode) and writing (Encode) a plethora of mp4 boxes. + It also contains helper functions for extracting, encrypting, decrypting samples and a lot more. +2. [avc](avc) deals with AVC (aka H.264) video in the `mp4ff/avc` package including parsing of SPS and PPS, + and finding start-codes in Annex B byte streams. +3. [hevc](hevc) provides structures and functions for dealing with HEVC video and its packaging +4. [sei](sei) provides support for handling Supplementary Enhancement Information (SEI) such as timestamps + for AVC and HEVC video. +5. [av1](av1) provides basic support for AV1 video packaging +6. [aac](aac) provides support for AAC audio. This includes handling ADTS headers which is common + for AAC inside MPEG-2 TS streams. +7. [bits](bits) provides bit-wise and byte-wise readers and writers used by the other packages. -Traditional multiplexed non-fragmented mp4 files can be parsed and decoded, but the focus is on fragmented mp4 files -as used in DASH, HLS, and CMAF. 
+## Structure and usage -Beyond single-track fragmented files, support has been added to parse and generate multi-track -fragmented files as can be seen in `examples/segment` and `examples/multitrack`. +### mp4.File and its composition The top level structure for both non-fragmented and fragmented mp4 files is `mp4.File`. -In a progressive (non-fragmented) `mp4.File`, the top level attributes Ftyp, Moov, and Mdat points to the corresponding boxes. +In a progressive (non-fragmented) `mp4.File`, the top-level attributes Ftyp, Moov, and Mdat point to the corresponding boxes. A fragmented `mp4.File` can be more or less complete, like a single init segment, -one or more media segments, or a combination of both like a CMAF track which renders +one or more media segments, or a combination of both, like a CMAF track which renders into a playable one-track asset. It can also have multiple tracks. For fragmented files, the following high-level attributes are used: * `Init` contains a `ftyp` and a `moov` box and provides the general metadata for a fragmented file. It corresponds to a CMAF header. It can also contain one or more `sidx` boxes. -* `Segments` is a slice of `MediaSegment` which start with an optional `styp` box, possibly one or more `sidx boxes -and then one or more`Fragment`s. +* `Segments` is a slice of `MediaSegment` which start with an optional `styp` box, possibly one or more `sidx` + boxes and then one or more `Fragment`s. * `Fragment` is a mp4 fragment with exactly one `moof` box followed by a `mdat` box where the latter contains the media data. It can have one or more `trun` boxes containing the metadata - for the samples. + for the samples. The fragment can start with one or more `emsg` boxes. + +It should be noted that it is sometimes hard to decide what should belong to a Segment or Fragment. 
-All child boxes of container box such as `MoovBox` are listed in the `Children` attribute, but the +All child boxes of container boxes such as `MoovBox` are listed in the `Children` attribute, but the most prominent child boxes have direct links with names which makes it possible to write a path such as @@ -92,9 +106,10 @@ fragment.Moof.Trafs[1].Trun[1] to get the second `trun` of the second `traf` box (provided that they exist). Care must be taken to assert that none of the intermediate pointers are nil to avoid `panic`. -## Creating new fragmented files +### Creating new fragmented files -A typical use case is to a fragment consisting of an init segment followed by a series of media segments. +A typical use case is to generate a fragmented file consisting of an init segment +followed by a series of media segments. The first step is to create the init segment. This is done in three steps as can be seen in `examples/initcreator`: @@ -110,11 +125,13 @@ Multiple tracks are also available via the slice attribute `Traks` instead of `T The second step is to start producing media segments. They should use the timescale that was set when creating the init segment. Generally, that timescale should be chosen so that the -sample durations have exact values without rounding errors. +sample durations have exact values without rounding errors, e.g. 48000 for 48kHz audio. A media segment contains one or more fragments, where each fragment has a `moof` and a `mdat` box. If all samples are available before the segment is created, one can use a single fragment in each segment. Example code for this can be found in `examples/segmenter`. +For low-latency MPEG-DASH generation, short-duration fragments are added to the segment as the +corresponding media samples become available. A simple, but not optimal, way of creating a media segment is to first create a slice of `FullSample` with the data needed. 
The definition of `mp4.FullSample` is @@ -154,15 +171,21 @@ This segment can finally be output to a `w io.Writer` as err := seg.Encode(w) ``` +or to a `sw bits.SliceWriter` as + +```go +err := seg.EncodeSW(sw) +``` + For multi-track segments, the code is a bit more involved. Please have a look at `examples/segmenter` to see how it is done. A more optimal way of handling media sample is -to handle them lazily, as explained next. +to handle them lazily, or using intervals, as explained next. ### Lazy decoding and writing of mdat data For video and audio, the dominating part of a mp4 file is the media data which is stored in one or more `mdat` boxes. In some cases, for example when segmenting large progressive -files, it is much more memory efficient to just read the movie or fragment data +files, it is much more memory efficient to just read the movie or fragment metadata from the `moov` or `moof` box and defer the reading of the media data from the `mdat` box to later. @@ -172,7 +195,7 @@ For decoding, this is supported by running `mp4.DecodeFile()` in lazy mode as parsedMp4, err = mp4.DecodeFile(ifd, mp4.WithDecodeMode(mp4.DecModeLazyMdat)) ``` -In this case, the media data of the `mdat` box will not be read, but only its size is being set. +In this case, the media data of the `mdat` box will not be read, but only its size is being saved. To read or copy the actual data corresponding to a sample, one must calculate the corresponding byte range and either call @@ -189,7 +212,7 @@ func (m *MdatBox) CopyData(start, size int64, rs io.ReadSeeker, w io.Writer) (nr Example code for this, including lazy writing of `mdat`, can be found in `examples/segmenter` with the `lazy` mode set. -## More efficient I/O using SliceReader and SliceWriter +### More efficient I/O using SliceReader and SliceWriter The use of the interfaces `io.Reader` and `io.Writer` for reading and writing boxes gives a lot of flexibility, but is not optimal when it comes to memory allocation. 
In particular, the @@ -198,8 +221,8 @@ lot of allocations and copying of data. In order to achieve better performance, it is advantageous to read the full top level boxes into one, or a few, slices and decode these. -To enable that mode, version 0.27 of the code introduced `DecodeX(sr bits.SliceReader)` -methods to every box X where `mp4ff.bits.SliceReader` is an interface. +To enable that mode, version 0.27 of the code introduced `Decode<X>SR(sr bits.SliceReader)` +methods to every box `<X>` where `mp4ff.bits.SliceReader` is an interface. For example, the `TrunBox` gets the method `DecodeTrunSR(sr bits.SliceReader)` in addition to its old `DecodeTrun(r io.Reader)` method. The `bits.SliceReader` interface provides methods to read all kinds of data structures from an underlying slice of bytes. It has an implementation `bits.FixedSliceReader` @@ -209,10 +232,10 @@ which would get its data from some external source. The memory allocation and speed improvements achieved by this may vary, but should be substantial, especially compared to versions before 0.27 which used an extra `io.LimitReader` layer. -Fur further reduction of memory allocation when reading the ´mdat` data of a progressive file, some -sort of buffered reader should be used. +For further reduction of memory allocation, use a buffered top-level reader, especially when +reading the `mdat` box of a progressive file. -### Benchmarks +#### Benchmarks To investigate the efficiency of the new SliceReader and SliceWriter methods, benchmarks have been done. The benchmarks are defined in @@ -255,19 +278,28 @@ EncodeFile/prog_8s.mp4-16 | 6.84kB | 8.30kB | 0.05kB | |EncodeFile/1.m4s-16 | 15.0 | 15.0 | 3.0 | |EncodeFile/prog_8s.mp4-16 | 101 | 86 | 1 | -## Box structure and interface +## More about mp4 boxes + +The `mp4ff.mp4` package contains a lot of box implementations. 
+### Box structure and interface Most boxes have their own file named after the box, but in some cases, there may be multiple boxes that have the same content, and the code file then has a generic name like `mp4/visualsampleentry.go`. -The Box interface is specified in `mp4/box.go`. It does not contain decode (parsing) methods which have -distinct names for each box type and are dispatched, +There is an interface for boxes: `Box` specified in `mp4/box.go`. + +The interface defines common Box methods including encode (writing), +but not the decode (parsing) methods which have distinct names for each box type and are +dispatched from the parsed box name. -The mapping for decoding dispatch is given in the table `mp4.decoders` for the -`io.Reader` methods and in `mp4.decodersSR` for the `mp4ff.bits.SliceReader` methods. +That dispatch based on box name is defined by the tables `mp4.decodersSR` and `mp4.decoders` +for the functions `mp4.DecodeBoxSR()` and `mp4.DecodeBox()`, respectively. +The `SR` variant should normally be used for better performance. +If a box name is unknown, it will result in an `UnknownBox` being created. -## How to implement a new box +### How to implement a new box To implement a new box `fooo`, the following is needed. @@ -279,16 +311,16 @@ Create a file `fooo.go` and create a struct type `FoooBox`. Type() Size() Encode(w io.Writer) -EncodeSW(sw bits.SliceWriter) // new in v0.27.0 +EncodeSW(sw bits.SliceWriter) Info() ``` -It also needs its own decode method `DecodeFooo`, which must be added in the `decoders` map in `box.go`, -and the new in v0.27.0 `DecodeFoooSR` method in `decodersSR`. +It also needs its own decode methods `DecodeFoooSR` and `DecodeFooo`, +which must be added in the `decodersSR` map and `decoders` map, respectively. For a simple example, look at the `PrftBox` in `prft.go`. 
-A test file `fooo_test.go` should also have a test using the method `boxDiffAfterEncodeAndDecode` to check that -the box information is equal after encoding and decoding. +A test file `fooo_test.go` should also have a test using the method `boxDiffAfterEncodeAndDecode` +to check that the box information is equal after encoding and decoding. ## Direct changes of attributes @@ -300,7 +332,7 @@ create inconsistent states in the boxes. As an example, container boxes such as `TrafBox` have a method `AddChild` which adds a box to `Children`, its slice of children boxes, but also sets a specific member reference such as `Tfdt` to point to that box. If `Children` is manipulated -directly, that link may not be valid. +directly, that link may no longer be valid. ## Encoding modes and optimizations @@ -317,8 +349,7 @@ Note that this may change the size of all ancestor boxes of `trun`. Following the ISOBMFF standard, sample numbers and other numbers start at 1 (one-based). This applies to arguments of functions and methods. -The actual storage in slices is zero-based, so -sample nr 1 has index 0 in the corresponding slice. +The actual storage in slices is zero-based, so sample nr 1 has index 0 in the corresponding slice. ## Stability @@ -327,7 +358,7 @@ The APIs should be fairly stable, but minor non-backwards-compatible changes may ## Specifications The main specification for the MP4 file format is the ISO Base Media File Format (ISOBMFF) standard -ISO/IEC 14496-12 6th edition 2020. Some boxes are specified in other standards, as should be commented +ISO/IEC 14496-12 7th edition 2021. Some boxes are specified in other standards, as should be commented in the code. ## LICENSE diff --git a/aac/doc.go b/aac/doc.go index 47154d7a..b8fff3a2 100644 --- a/aac/doc.go +++ b/aac/doc.go @@ -1,4 +1,4 @@ /* -Package aac - parse and generate AAC meta data including ADTS headers. +Package aac parses and generates AAC meta data including ADTS headers. 
*/ package aac diff --git a/av1/doc.go b/av1/doc.go index 80eb25b7..0f7de676 100644 --- a/av1/doc.go +++ b/av1/doc.go @@ -1,4 +1,4 @@ /* -Package av1 - parsing of av1 AV1CodecConfigurationRecord. +Package av1 decodes (parses) and encodes (writes) AV1 CodecConfigurationRecord. */ package av1 diff --git a/avc/doc.go b/avc/doc.go index 6e2c21db..ae33f1eb 100644 --- a/avc/doc.go +++ b/avc/doc.go @@ -1,4 +1,4 @@ /* -Package avc - parse AVC(H.264) NAL unit headers, slice headers and complete SPS and PPS. +Package avc parses AVC (H.264) NAL unit headers, slice headers and complete SPS and PPS. */ package avc diff --git a/bits/doc.go b/bits/doc.go index c6c1ff2b..e9ad7d72 100644 --- a/bits/doc.go +++ b/bits/doc.go @@ -1,9 +1,17 @@ /* -Package bits - bit and bytes reading and writing including Golomb codes and EBSP. +Package bits provides bit and bytes reading and writing including Golomb codes and EBSP as used by MPEG video standards. All readers and writers accumulate errors in the sense that they will stop reading or writing at the first error. -The first error, if any, can be retrieved with AccError(). +The first error, if any, can be retrieved with an AccError() method. -Beyond plain bit reading and writing, reading and writing of ebsp (Encapsulated Byte Sequence Packets) is supported. -EBSP uses insertion of start-code emulation prevention bytes 0x03 and is used in MPEG video standards from AVC (H.264) and forward. +EBSP (Encapsulated Byte Sequence Packets) uses insertion of start-code emulation prevention bytes 0x03 and is +used in MPEG video standards from AVC (H.264) and forward. 
The main types are: + + - [Reader] reads bits and bytes from an underlying [io.Reader] with accumulated error + - [Writer] writes bits and bytes to an underlying [io.Writer] with accumulated error + - [EBSPReader] reads EBSP from an underlying [io.Reader] with accumulated error + - [EBSPWriter] writes EBSP to an underlying [io.Writer] with accumulated error + - [ByteWriter] writes byte-based structures to an underlying [io.Writer] with accumulated error + - [FixedSliceReader] reads various byte-based structures from a fixed slice with accumulated error + - [FixedSliceWriter] writes various byte-based structures to a fixed slice with accumulated error */ package bits diff --git a/cmd/doc.go b/cmd/doc.go index da89baad..df80b4f8 100644 --- a/cmd/doc.go +++ b/cmd/doc.go @@ -1,12 +1,15 @@ /* -Package cmd - command line tools built using mp4ff. +Package cmd provides command line tools built using mp4ff. Install like go install ./... -or remotely as +or directly from the repo +as - go get -u github.com/Eyevinn/mp4ff/cmd/mp4ff-info + go install github.com/Eyevinn/mp4ff/cmd/mp4ff-info@latest + go install github.com/Eyevinn/mp4ff/cmd/mp4ff-subslister@latest + ... */ package cmd diff --git a/cmd/mp4ff-crop/doc.go b/cmd/mp4ff-crop/doc.go new file mode 100644 index 00000000..a7289dd6 --- /dev/null +++ b/cmd/mp4ff-crop/doc.go @@ -0,0 +1,17 @@ +/* +mp4ff-crop crops a (progressive) mp4 file to just before a sync frame after specified number of milliseconds. +The goal is to leave the file structure intact except for cropping of samples and +moving mdat to the end of the file, if not already there. 
+ + Usage of mp4ff-crop: + + mp4ff-crop [options] + + options: + + -d uint + Duration in milliseconds (default 1000) + -version + Get mp4ff version +*/ +package main diff --git a/cmd/mp4ff-crop/main.go b/cmd/mp4ff-crop/main.go index d886f67d..3283743d 100644 --- a/cmd/mp4ff-crop/main.go +++ b/cmd/mp4ff-crop/main.go @@ -1,8 +1,3 @@ -/* -mp4ff-crop crops a (progressive) mp4 file to just before a sync frame after specified number of milliseconds. -The intension is that the structure of the file shall be left intact except for cropping of samples and -moving mdat to the end of the file, if not already there. -*/ package main import ( @@ -20,11 +15,11 @@ const ( appName = "mp4ff-crop" ) -var usg = `Usage of %s: - -%s crops a (progressive) mp4 file to just before a sync frame after specified number of milliseconds. +var usg = `%s crops a (progressive) mp4 file to just before a sync frame after specified number of milliseconds. The goal is to leave the file structure intact except for cropping of samples and moving mdat to the end of the file, if not already there. 
+ +Usage of %s: ` type options struct { @@ -61,7 +56,6 @@ func run(args []string, stdout io.Writer) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/cmd/mp4ff-crop/main_test.go b/cmd/mp4ff-crop/main_test.go index 21cd435e..534ab598 100644 --- a/cmd/mp4ff-crop/main_test.go +++ b/cmd/mp4ff-crop/main_test.go @@ -21,6 +21,7 @@ func TestCommandLines(t *testing.T) { {desc: "unknown args", args: []string{appName, "-x"}, expectedErr: true}, {desc: "duration = 0", args: []string{appName, "-d", "0", "dummy.mp4", "dummy.mp4"}, expectedErr: true}, {desc: "non-existing infile", args: []string{appName, "-d", "1000", "notExists.mp4", "dummy.mp4"}, expectedErr: true}, + {desc: "bad infile", args: []string{appName, "-d", "1000", "main.go", "dummy.mp4"}, expectedErr: true}, } for _, c := range cases { t.Run(c.desc, func(t *testing.T) { diff --git a/cmd/mp4ff-decrypt/doc.go b/cmd/mp4ff-decrypt/doc.go new file mode 100644 index 00000000..91a1b601 --- /dev/null +++ b/cmd/mp4ff-decrypt/doc.go @@ -0,0 +1,18 @@ +/* +mp4ff-decrypt decrypts a fragmented mp4 file encrypted with Common Encryption scheme cenc or cbcs. +For a media segment, it needs an init segment with encryption information. + + Usage of mp4ff-decrypt: + + mp4ff-decrypt [options] infile outfile + + options: + + -init string + Path to init file with encryption info (scheme, kid, pssh) + -k string + Required: key (hex) + -version + Get mp4ff version +*/ +package main diff --git a/cmd/mp4ff-decrypt/main.go b/cmd/mp4ff-decrypt/main.go index 27215039..7c7ed7ab 100644 --- a/cmd/mp4ff-decrypt/main.go +++ b/cmd/mp4ff-decrypt/main.go @@ -1,4 +1,3 @@ -// mp4ff-decrypt decrypts a fragmented mp4 file encrypted with Common Encryption scheme cenc or cbcs. package main import ( @@ -16,11 +15,10 @@ const ( appName = "mp4ff-decrypt" ) -var usg = `Usage of %s: - -%s decrypts a fragmented mp4 file encrypted with Common Encryption scheme cenc or cbcs. 
+var usg = `%s decrypts a fragmented mp4 file encrypted with Common Encryption scheme cenc or cbcs. For a media segment, it needs an init segment with encryption information. +Usage of %s: ` type options struct { @@ -57,7 +55,6 @@ func run(args []string) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/cmd/mp4ff-decrypt/main_test.go b/cmd/mp4ff-decrypt/main_test.go index b05f6981..ce18955f 100644 --- a/cmd/mp4ff-decrypt/main_test.go +++ b/cmd/mp4ff-decrypt/main_test.go @@ -24,8 +24,9 @@ func TestNonRunningOptionCases(t *testing.T) { {desc: "unknown args", args: []string{"mp4ff-decrypt", "-x"}, err: true}, {desc: "no outfile", args: []string{"mp4ff-decrypt", "infile.mp4"}, err: true}, {desc: "no key", args: []string{"mp4ff-decrypt", "infile.mp4", outFile}, err: true}, - {desc: "bad infile", args: []string{"mp4ff-decrypt", "-k", key, "infile.mp4", outFile}, err: true}, - {desc: "bad initfile", args: []string{"mp4ff-decrypt", "-init", "init.mp4", "-k", key, infile, outFile}, err: true}, + {desc: "non-existing infile", args: []string{"mp4ff-decrypt", "-k", key, "infile.mp4", outFile}, err: true}, + {desc: "non-existing initfile", args: []string{"mp4ff-decrypt", "-init", "init.mp4", "-k", key, infile, outFile}, err: true}, + {desc: "bad infile", args: []string{"mp4ff-decrypt", "-k", key, "main.go", outFile}, err: true}, {desc: "short key", args: []string{"mp4ff-decrypt", "-k", "ab", infile, outFile}, err: true}, {desc: "bad key", args: []string{"mp4ff-decrypt", "-k", badKey, infile, outFile}, err: true}, {desc: "non-encrypted file", args: []string{"mp4ff-decrypt", "-k", key, nonEncryptedFile, outFile}, err: false}, diff --git a/cmd/mp4ff-encrypt/doc.go b/cmd/mp4ff-encrypt/doc.go new file mode 100644 index 00000000..87e4b1b5 --- /dev/null +++ b/cmd/mp4ff-encrypt/doc.go @@ -0,0 +1,27 @@ +/* +mp4ff-encrypt encrypts a fragmented mp4 file using Common Encryption with cenc or cbcs scheme. 
+A combined fragmented file with init segment and media segment(s) will be encrypted. +For a pure media segment, an init segment with encryption information is needed. + + Usage of mp4ff-encrypt: + + mp4ff-encrypt [options] infile outfile + + options: + + -init string + Path to init file with encryption info (scheme, kid, pssh) + -iv string + Required: iv (16 or 32 hex chars) + -key string + Required: key (32 hex chars) + -kid string + key id (32 hex chars). Required if initFilePath empty + -pssh string + file with one or more pssh box(es) in binary format. Will be added at end of moov box + -scheme string + cenc or cbcs. Required if initFilePath empty (default "cenc") + -version + Get mp4ff version +*/ +package main diff --git a/cmd/mp4ff-encrypt/main.go b/cmd/mp4ff-encrypt/main.go index 73247136..8918bb7b 100644 --- a/cmd/mp4ff-encrypt/main.go +++ b/cmd/mp4ff-encrypt/main.go @@ -1,4 +1,3 @@ -// mp4ff-encrypt encrypts a fragmented mp4 file using Common Encryption using cenc or cbcs scheme. package main import ( @@ -16,12 +15,11 @@ const ( appName = "mp4ff-encrypt" ) -var usg = `Usage of %s: - -%s encrypts a fragmented mp4 file using Common Encryption with cenc or cbcs scheme. +var usg = `%s encrypts a fragmented mp4 file using Common Encryption with cenc or cbcs scheme. A combined fragmented file with init segment and media segment(s) will be encrypted. For a pure media segment, an init segment with encryption information is needed. 
+Usage of %s: ` type options struct { @@ -68,7 +66,6 @@ func run(args []string) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/cmd/mp4ff-encrypt/main_test.go b/cmd/mp4ff-encrypt/main_test.go index b7f6d19d..2eefb648 100644 --- a/cmd/mp4ff-encrypt/main_test.go +++ b/cmd/mp4ff-encrypt/main_test.go @@ -39,6 +39,9 @@ func TestOptionCases(t *testing.T) { {desc: "non-existing initFile", args: []string{appName, "-key", key, "-iv", iv, "-init", "init.mp4", inSeg, outFile}, err: true}, + {desc: "bad initFile", + args: []string{appName, "-key", key, "-iv", iv, "-init", "main.go", inSeg, outFile}, + err: true}, {desc: "too short iv ", args: []string{appName, "-key", key, "-iv", "00", "-init", init, inSeg, outFile}, err: true}, diff --git a/cmd/mp4ff-info/doc.go b/cmd/mp4ff-info/doc.go new file mode 100644 index 00000000..0f8f378b --- /dev/null +++ b/cmd/mp4ff-info/doc.go @@ -0,0 +1,15 @@ +/* +mp4ff-info prints the box tree of input mp4 (ISOBMFF) file. + + Usage of mp4ff-info: + + mp4ff-info [options] infile + + options: + + -l string + level of details, e.g. all:1 or trun:1,subs:1 + -version + Get mp4ff version +*/ +package main diff --git a/cmd/mp4ff-info/main.go b/cmd/mp4ff-info/main.go index 18ea36d8..6d84ed4e 100644 --- a/cmd/mp4ff-info/main.go +++ b/cmd/mp4ff-info/main.go @@ -1,10 +1,10 @@ -// mp4ff-info prints the box tree of input mp4 (ISOBMFF) file. package main import ( "errors" "flag" "fmt" + "io" "os" "github.com/Eyevinn/mp4ff/mp4" @@ -14,9 +14,9 @@ const ( appName = "mp4ff-info" ) -var usg = `Usage of %s: +var usg = `%s prints the box tree of input mp4 (ISOBMFF) file. -%s prints the box tree of input mp4 (ISOBMFF) file. 
+Usage of %s: ` type options struct { @@ -41,19 +41,18 @@ func parseOptions(fs *flag.FlagSet, args []string) (*options, error) { } func main() { - if err := run(os.Args); err != nil { + if err := run(os.Args, os.Stdout); err != nil { fmt.Fprintf(os.Stderr, "error: %v\n", err) os.Exit(1) } } -func run(args []string) error { +func run(args []string, w io.Writer) error { fs := flag.NewFlagSet(appName, flag.ContinueOnError) opts, err := parseOptions(fs, args) if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err @@ -79,7 +78,7 @@ func run(args []string) error { if err != nil { return fmt.Errorf("could not parse input file: %w", err) } - err = parsedMp4.Info(os.Stdout, opts.levels, "", " ") + err = parsedMp4.Info(w, opts.levels, "", " ") if err != nil { return fmt.Errorf("could not print info: %w", err) } diff --git a/cmd/mp4ff-info/main_test.go b/cmd/mp4ff-info/main_test.go index b4e5a1a7..fa2b1ee8 100644 --- a/cmd/mp4ff-info/main_test.go +++ b/cmd/mp4ff-info/main_test.go @@ -1,6 +1,8 @@ package main import ( + "io" + "os" "testing" ) @@ -8,20 +10,22 @@ func TestOptions(t *testing.T) { cases := []struct { desc string args []string + w io.Writer err bool }{ - {desc: "no args", args: []string{appName}, err: true}, - {desc: "unknown args", args: []string{appName, "-x"}, err: true}, - {desc: "non-existing file", args: []string{appName, "infile.mp4"}, err: true}, - {desc: "bad file", args: []string{appName, "main.go"}, err: true}, - {desc: "good file", args: []string{appName, "../../mp4/testdata/init.mp4"}, err: false}, - {desc: "good with details", args: []string{appName, "-l", "all:1", "../../mp4/testdata/init.mp4"}, err: false}, - {desc: "version", args: []string{appName, "-version"}, err: false}, - {desc: "help", args: []string{appName, "-h"}, err: false}, + {desc: "no args", args: []string{appName}, w: os.Stdout, err: true}, + {desc: "unknown args", args: []string{appName, "-x"}, w: os.Stdout, err: true}, + {desc: "non-existing file", 
args: []string{appName, "infile.mp4"}, w: os.Stdout, err: true}, + {desc: "bad file", args: []string{appName, "main.go"}, w: os.Stdout, err: true}, + {desc: "bad writer", args: []string{appName, "../../mp4/testdata/init.mp4"}, w: &badWriter{}, err: true}, + {desc: "good file", args: []string{appName, "../../mp4/testdata/init.mp4"}, w: os.Stdout, err: false}, + {desc: "good with details", args: []string{appName, "-l", "all:1", "../../mp4/testdata/init.mp4"}, w: os.Stdout, err: false}, + {desc: "version", args: []string{appName, "-version"}, w: os.Stdout, err: false}, + {desc: "help", args: []string{appName, "-h"}, w: os.Stdout, err: false}, } for _, c := range cases { t.Run(c.desc, func(t *testing.T) { - err := run(c.args) + err := run(c.args, c.w) if c.err && err == nil { t.Error("expected error but got nil") } @@ -31,3 +35,9 @@ func TestOptions(t *testing.T) { }) } } + +type badWriter struct{} + +func (w *badWriter) Write(p []byte) (n int, err error) { + return 0, os.ErrClosed +} diff --git a/cmd/mp4ff-nallister/doc.go b/cmd/mp4ff-nallister/doc.go new file mode 100644 index 00000000..39ce5bcf --- /dev/null +++ b/cmd/mp4ff-nallister/doc.go @@ -0,0 +1,32 @@ +/* +mp4ff-nallister lists NAL units and slice types of AVC or HEVC tracks of an mp4 (ISOBMFF) file +or a file containing a byte stream in Annex B format. + +Takes first video track in a progressive file and the first track in a fragmented file. +It can also output information about SEI NAL units. + +The parameter-sets can be further +analyzed using mp4ff-pslister. 
+ + Usage of mp4ff-nallister: + + mp4ff-nallister [options] infile + + options: + + -annexb + Input is Annex B stream file + -c string + Codec to parse (avc or hevc) (default "avc") + -m int + Max nr of samples to parse (default -1) + -ps + Print parameter sets in hex + -raw int + nr raw NAL unit bytes to print + -sei int + Level of SEI information (1 is interpret, 2 is dump hex) + -version + Get mp4ff version +*/ +package main diff --git a/cmd/mp4ff-nallister/main.go b/cmd/mp4ff-nallister/main.go index d961260e..a74157fb 100644 --- a/cmd/mp4ff-nallister/main.go +++ b/cmd/mp4ff-nallister/main.go @@ -1,4 +1,3 @@ -// mp4ff-nallister - list NAL units and slice types of first AVC or HEVC track of an mp4 (ISOBMFF) or bytestream (Annex B) file. package main import ( @@ -20,15 +19,15 @@ const ( appName = "mp4ff-nallister" ) -var usg = `Usage of %s: - -%s lists NAL units and slice types of AVC or HEVC tracks of an mp4 (ISOBMFF) file +var usg = `%s lists NAL units and slice types of AVC or HEVC tracks of an mp4 (ISOBMFF) file or a file containing a byte stream in Annex B format. Takes first video track in a progressive file and the first track in a fragmented file. It can also output information about SEI NAL units. The parameter-sets can be further analyzed using mp4ff-pslister. 
+ +Usage of %s: ` type options struct { @@ -75,7 +74,6 @@ func run(args []string, stdout io.Writer) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/cmd/mp4ff-nallister/main_test.go b/cmd/mp4ff-nallister/main_test.go index 6cf41c3c..1fe4b4e4 100644 --- a/cmd/mp4ff-nallister/main_test.go +++ b/cmd/mp4ff-nallister/main_test.go @@ -30,6 +30,8 @@ func TestOptions(t *testing.T) { goldenOut: "testdata/golden_h264_mp4.txt", expectedErr: false}, {desc: "annexBHEVC", args: []string{appName, "-annexb", "-c", "hevc", "testdata/hevc.265"}, goldenOut: "testdata/golden_hevc_265.txt", expectedErr: false}, + {desc: "annexBHEVC with SEI", args: []string{appName, "-annexb", "-c", "hevc", "-sei", "2", "testdata/hevc.265"}, + goldenOut: "", expectedErr: false}, {desc: "mp4HEVC", args: []string{appName, "testdata/hevc.mp4"}, goldenOut: "testdata/golden_hevc_mp4.txt", expectedErr: false}, {desc: "h264 frag mp4 raw", args: []string{appName, "-m", "6", "-raw", "4", "../../mp4/testdata/prog_8s_dec_dashinit.mp4"}, diff --git a/cmd/mp4ff-pslister/doc.go b/cmd/mp4ff-pslister/doc.go new file mode 100644 index 00000000..0d4f8932 --- /dev/null +++ b/cmd/mp4ff-pslister/doc.go @@ -0,0 +1,25 @@ +/* +mp4ff-pslister lists parameter sets for AVC/H.264 or HEVC/H.265 from mp4 sample description, bytestream, or hex input. +It prints them as hex and in verbose mode it also prints details in JSON format. + + Usage of mp4ff-pslister: + + mp4ff-pslister [options] + + options: + + -c string + Codec to parse (avc or hevc) (default "avc") + -i string + Input file (mp4 or byte stream) (alternative to sps and pps in hex format) + -pps string + PPS in hex format + -sps string + SPS in hex format, alternative to infile + -v Verbose output -> details. 
On for hex input + -version + Get mp4ff version + -vps string + VPS in hex format (HEVC only) +*/ +package main diff --git a/cmd/mp4ff-pslister/main.go b/cmd/mp4ff-pslister/main.go index 3fd46fc4..3a90c09d 100644 --- a/cmd/mp4ff-pslister/main.go +++ b/cmd/mp4ff-pslister/main.go @@ -1,6 +1,3 @@ -// mp4ff-pslister - list parameter sets for AVC(H.264) and HEVC(H.265) video in mp4 files. -// -// Print them as hex and with verbose mode provided details in JSON format. package main import ( @@ -22,11 +19,10 @@ const ( appName = "mp4ff-pslister" ) -var usg = `Usage of %s: - -%s lists parameter sets for AVC/H.264 or HEVC/H.265 from mp4 sample description, bytestream, or hex input. +var usg = `%s lists parameter sets for AVC/H.264 or HEVC/H.265 from mp4 sample description, bytestream, or hex input. It prints them as hex and in verbose mode it also prints details in JSON format. +Usage of %s: ` type options struct { @@ -48,7 +44,7 @@ func parseOptions(fs *flag.FlagSet, args []string) (*options, error) { opts := options{} - fs.StringVar(&opts.inFile, "i", "", "Input file (mp4 or byte stream) (alternative to sps") + fs.StringVar(&opts.inFile, "i", "", "Input file (mp4 or byte stream) (alternative to sps and pps in hex format)") fs.StringVar(&opts.codec, "c", "avc", "Codec to parse (avc or hevc)") fs.StringVar(&opts.vpsHex, "vps", "", "VPS in hex format (HEVC only)") fs.StringVar(&opts.spsHex, "sps", "", "SPS in hex format, alternative to infile") @@ -73,7 +69,6 @@ func run(args []string, stdout io.Writer) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err @@ -108,7 +103,7 @@ func run(args []string, stdout io.Writer) error { return fmt.Errorf("could not open file %s: %w", o.inFile, err) } defer ifd.Close() - mp4Extensions := []string{".mp4", ".m4v", ".cmfv"} + mp4Extensions := []string{".mp4", ".m4v", ".cmfv", ".m4s"} for _, ext := range mp4Extensions { if strings.HasSuffix(o.inFile, ext) { return parseMp4File(stdout, ifd, o.codec, 
o.verbose) diff --git a/cmd/mp4ff-pslister/main_test.go b/cmd/mp4ff-pslister/main_test.go index 074daad6..27313504 100644 --- a/cmd/mp4ff-pslister/main_test.go +++ b/cmd/mp4ff-pslister/main_test.go @@ -21,6 +21,8 @@ func TestCommandLines(t *testing.T) { expectedErr bool goldenOut string }{ + {desc: "h264 segment without PS", args: []string{appName, "-v", "-i", "../../mp4/testdata/1.m4s"}, + expectedErr: true}, {desc: "help", args: []string{appName, "-h"}, expectedErr: false}, {desc: "version", args: []string{appName, "-version"}, expectedErr: false}, {desc: "no args", args: []string{appName}, expectedErr: true}, diff --git a/cmd/mp4ff-subslister/doc.go b/cmd/mp4ff-subslister/doc.go new file mode 100644 index 00000000..b2aeddf1 --- /dev/null +++ b/cmd/mp4ff-subslister/doc.go @@ -0,0 +1,19 @@ +/* +mp4ff-subslister lists and displays content of wvtt or stpp samples. +These corresponds to WebVTT or TTML subtitles in ISOBMFF files. +Uses track with given non-zero track ID or first subtitle track found in an asset. + + Usage of mp4ff-subslister: + + mp4ff-subslister [options] + + options: + + -m int + Max nr of samples to parse (default -1) + -t int + trackID to extract (0 is unspecified) + -version + Get mp4ff version +*/ +package main diff --git a/cmd/mp4ff-subslister/main.go b/cmd/mp4ff-subslister/main.go index b0d80539..f06a14d0 100644 --- a/cmd/mp4ff-subslister/main.go +++ b/cmd/mp4ff-subslister/main.go @@ -1,4 +1,3 @@ -// mp4ff-subslister - list wvtt or stpp (WebVTT or TTML in ISOBMFF) samples package main import ( @@ -16,11 +15,11 @@ const ( appName = "mp4ff-subslister" ) -var usg = `Usage of %s: - -%s lists and displays content of wvtt or stpp samples. +var usg = `%s lists and displays content of wvtt or stpp samples. These corresponds to WebVTT or TTML subtitles in ISOBMFF files. Uses track with given non-zero track ID or first subtitle track found in an asset. 
+ +Usage of %s: ` type options struct { @@ -59,7 +58,6 @@ func run(args []string, stdout io.Writer) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/cmd/mp4ff-subslister/main_test.go b/cmd/mp4ff-subslister/main_test.go index 102ecc0c..601dad68 100644 --- a/cmd/mp4ff-subslister/main_test.go +++ b/cmd/mp4ff-subslister/main_test.go @@ -132,6 +132,7 @@ func TestSubsLister(t *testing.T) { wanted string }{ {desc: "help", args: []string{appName, "-h"}, expectedErr: false}, + {desc: "unknown flag", args: []string{appName, "-x"}, expectedErr: true}, {desc: "version", args: []string{appName, "-version"}, expectedErr: false}, {desc: "no args", args: []string{appName}, expectedErr: true}, {desc: "non-existent file", args: []string{appName, "notExisting.mp4"}, expectedErr: true}, diff --git a/doc.go b/doc.go index b4143647..5b27db05 100644 --- a/doc.go +++ b/doc.go @@ -1,31 +1,86 @@ /* -Package mp4ff - MP4 media file parser and writer for AVC and HEVC video, AAC audio and stpp/wvtt subtitles. -Focused on fragmented files as used for streaming in DASH, MSS and HLS fMP4. +Module mp4ff implements MP4 media file parsing and writing for AVC and HEVC video, AAC and AC-3 audio, stpp and wvtt subtitles, and +timed metadata tracks. +It is focused on fragmented files as used for streaming in MPEG-DASH, MSS and HLS fMP4, but can also decode and encode all +boxes needed for progressive MP4 files. -MP4 library +# Command Line Tools -The mp4 library has functions for parsing (called Decode) and writing (called Encode). -It is focused on fragmented files as used for streaming in DASH, MSS and HLS fMP4. -mp4.File is a representation of a "File" which can be more or less complete, -but should have some top layer boxes. It can include +Some useful command line tools are available in [cmd](cmd) directory. + 1. [mp4ff-info] prints a tree of the box hierarchy of a mp4 file with information + about the boxes. + 2. 
[mp4ff-pslister] extracts and displays SPS and PPS for AVC or HEVC in a mp4 or a bytestream (Annex B) file. + Partial information is printed for HEVC. + 3. [mp4ff-nallister] lists NALUs and picture types for video in progressive or fragmented file + 4. [mp4ff-subslister] lists details of wvtt or stpp (WebVTT or TTML in ISOBMFF) subtitle samples + 5. [mp4ff-crop] crops a **progressive** mp4 file to a specified duration + 6. [mp4ff-encrypt] encrypts a fragmented file using cenc or cbcs Common Encryption scheme + 7. [mp4ff-decrypt] decrypts a fragmented file encrypted using cenc or cbcs Common Encryption scheme - * InitSegment (ftyp + moov boxes) - * One or more segments - * Each segment has an optional styp box followed by one or more fragments - * A fragment must always consist of one moof box followed by one mdat box. +You can install these tools by going to their respective directory and running `go install .` or directly from the repo with -The typical child boxes are exported so that one can write paths such as + go install github.com/Eyevinn/mp4ff/cmd/mp4ff-info@latest + go install github.com/Eyevinn/mp4ff/cmd/mp4ff-encrypt@latest - moof.Traf.Trun +for each individual tool. -to access the (only) trun box of a moof box. +# Example code -Command Line Tools +Example code for some common use cases is available in the [examples](examples) directory. +The examples and their functions are: -Some simple command line tools are available in cmd directory. + 1. [initcreator] creates typical init segments (ftyp + moov) for different video and + audio codecs + 2. [resegmenter] reads a segmented file (CMAF track) and resegments it with other + segment durations using `FullSample` + 3. [segmenter] takes a progressive mp4 file and creates init and media segments from it. + This tool has been extended to support generation of segments with multiple tracks as well + as reading and writing `mdat` in lazy mode + 4. [multitrack] parses a fragmented file with multiple tracks + 5.
[combine-segs] combines single-track init and media segments into multi-track segments + 6. [add-sidx] adds a top-level sidx box describing the segments of a fragmented file. -Example code +# Packages -Example code is available in the examples directory. +The top-level packages in the mp4ff module are + + 1. [mp4] provides support for parsing (called Decode) and writing (Encode) a plethora of mp4 boxes. + It also contains helper functions for extracting, encrypting, decrypting samples and a lot more. + 2. [avc] deals with AVC (aka H.264) video in the `mp4ff/avc` package including parsing of SPS and PPS, + and finding start-codes in Annex B byte streams. + 3. [hevc] provides structures and functions for dealing with HEVC video and its packaging + 4. [sei] provides support for handling Supplemental Enhancement Information (SEI) such as timestamps + for AVC and HEVC video. + 5. [av1] provides basic support for AV1 video packaging + 6. [aac] provides support for AAC audio. This includes handling ADTS headers which are common + for AAC inside MPEG-2 TS streams. + 7. [bits] provides bit-wise and byte-wise readers and writers used by the other packages. + +# Specifications + +The main specification for the MP4 file format is the ISO Base Media File Format (ISOBMFF) standard +ISO/IEC 14496-12 7th edition 2021. Some boxes are specified in other standards, as should be commented +in the code.
+ +[mp4]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/mp4 +[avc]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/avc +[hevc]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/hevc +[sei]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/sei +[av1]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/av1 +[aac]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/aac +[bits]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/bits +[initcreator]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/examples/initcreator +[resegmenter]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/examples/resegmenter +[segmenter]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/examples/segmenter +[multitrack]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/examples/multitrack +[combine-segs]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/examples/combine-segs +[add-sidx]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/examples/add-sidx +[mp4ff-info]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-info +[mp4ff-pslister]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-pslister +[mp4ff-nallister]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-nallister +[mp4ff-subslister]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-subslister +[mp4ff-crop]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-crop +[mp4ff-encrypt]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-encrypt +[mp4ff-decrypt]: https://pkg.go.dev/github.com/Eyevinn/mp4ff/cmd/mp4ff-decrypt */ package mp4ff diff --git a/examples/add-sidx/doc.go b/examples/add-sidx/doc.go new file mode 100644 index 00000000..f6aa766f --- /dev/null +++ b/examples/add-sidx/doc.go @@ -0,0 +1,23 @@ +/* +add-sidx shows how to add a top-level sidx box to a fragmented file provided it does not exist. +Segments are identified by styp boxes if they exist, otherwise by +the start of moof or emsg boxes. It is possible to interpret +every moof box as the start of a new segment, by specifying the "-startSegOnMoof" option. 
+One can further remove unused encryption boxes with the "-removeEnc" option. + + Usage of add-sidx: + + add-sidx [options] infile outfile + + options: + + -nzEPT + Use non-zero earliestPresentationTime + -removeEnc + Remove unused encryption boxes + -startSegOnMoof + Start a new segment on every moof + -version + Get mp4ff version +*/ +package main diff --git a/examples/add-sidx/main.go b/examples/add-sidx/main.go index ebe350a4..082316e1 100644 --- a/examples/add-sidx/main.go +++ b/examples/add-sidx/main.go @@ -1,7 +1,3 @@ -// add-sidx adds a top-level sidx box describing the segments of a fragmented files. -// -// Segments are identified by styp boxes if they exist, otherwise by -// the start of moof or emsg boxes. package main import ( @@ -18,14 +14,13 @@ const ( appName = "add-sidx" ) -var usg = `Usage of %s: - -%s adds a top-level sidx box to a fragmented file provided it does not exist. -If styp boxes are present, they signal new segments. It is possible to interpret +var usg = `%s shows how to add a top-level sidx box to a fragmented file provided it does not exist. +Segments are identified by styp boxes if they exist, otherwise by +the start of moof or emsg boxes. It is possible to interpret every moof box as the start of a new segment, by specifying the "-startSegOnMoof" option. One can further remove unused encryption boxes with the "-removeEnc" option. - +Usage of %s: ` type options struct { @@ -66,7 +61,6 @@ func run(args []string, stdout io.Writer) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/examples/combine-segs/README.md b/examples/combine-segs/doc.go similarity index 73% rename from examples/combine-segs/README.md rename to examples/combine-segs/doc.go index 7efdcad1..2957a14a 100644 --- a/examples/combine-segs/README.md +++ b/examples/combine-segs/doc.go @@ -1,11 +1,10 @@ -# Multiplex example +/* +combine-segs provides an example of multiplexing tracks from fragmented MP4 files. 
+It combines init and media segments from two different files into a single +multitrack init or media segment. The functions -This folder provides an example of multiplexing tracks from fragmented MP4 files. - -The functions - - combineInitSegments - combineMediaSegments + combineInitSegments + combineMediaSegments combines the tracks from two or more different init or media segments. The combined tracks get unique track names as specified using parameters. @@ -20,3 +19,5 @@ The data is written to the combined `mdat` box track by track, so that there is exactly one `trun` box for each track. Interleaving data into more `trun` boxes should be possible by writing fractions of samples from the different input files. +*/ +package main diff --git a/examples/combine-segs/main.go b/examples/combine-segs/main.go index b90aa7a6..481840a8 100644 --- a/examples/combine-segs/main.go +++ b/examples/combine-segs/main.go @@ -1,5 +1,3 @@ -// combine-segs is a simple example that demonstrates how to combine init and media segments -// from two different files into a single multitrack init or media segment. package main import ( diff --git a/examples/doc.go b/examples/doc.go new file mode 100644 index 00000000..b2f8f28a --- /dev/null +++ b/examples/doc.go @@ -0,0 +1,4 @@ +/* +Package examples provides example programs built using mp4ff. +*/ +package cmd diff --git a/examples/initcreator/doc.go b/examples/initcreator/doc.go new file mode 100644 index 00000000..27e1a9b4 --- /dev/null +++ b/examples/initcreator/doc.go @@ -0,0 +1,5 @@ +/* +initcreator shows how one can create init segments for video, audio, and subtitles. +The codecs used are AVC, HEVC, AAC, AC3, EC3, WebVTT, and TTML. 
+*/ +package main diff --git a/examples/initcreator/main.go b/examples/initcreator/main.go index 95df8044..656da5ea 100644 --- a/examples/initcreator/main.go +++ b/examples/initcreator/main.go @@ -1,4 +1,3 @@ -// initcreator - create example init segments for video, audio, and subtitles package main import ( diff --git a/examples/multitrack/doc.go b/examples/multitrack/doc.go new file mode 100644 index 00000000..d73a2222 --- /dev/null +++ b/examples/multitrack/doc.go @@ -0,0 +1,6 @@ +/* +multitrack is an example showing how to decode a multitrack fragmented file with +video and closed caption (CEA-608) tracks. It extracts and prints information about the +video NAL units and the closed caption samples. +*/ +package main diff --git a/examples/resegmenter/doc.go b/examples/resegmenter/doc.go new file mode 100644 index 00000000..227594b2 --- /dev/null +++ b/examples/resegmenter/doc.go @@ -0,0 +1,17 @@ +/* +resegmenter is an example on how to resegment a fragmented file to a new target segment duration. +The duration is given in ticks (in the track timescale). +If no init segment in the input, the trex defaults will not be known which may cause an issue. +The input must be a fragmented file. + + Usage of resegmenter: + resegmenter [options] infile outfile + + options: + -d uint + Required: chunk duration (ticks) + -v Verbose output + +resegmenter is an example on how to resegment mp4 files into concatenated segments with new duration. +*/ +package main diff --git a/examples/resegmenter/main.go b/examples/resegmenter/main.go index 96768b7f..699701cd 100644 --- a/examples/resegmenter/main.go +++ b/examples/resegmenter/main.go @@ -1,7 +1,3 @@ -// resegmenter - example on how to resegment mp4 files into concatenated segments with new duration. -// If no init segment in the input, the trex defaults will not be known which may cause issue if -// needed. -// The input must be a fragmented file. 
package main import ( @@ -18,11 +14,13 @@ const ( appName = "resegmenter" ) -var usg = `Usage of %s: - -%s resegments a file input fragemtns to a new duration, as closely as possible. +var usg = `%s is an example on how to resegment a fragmented file to a new target segment duration. The duration is given in ticks (in the track timescale). +If no init segment in the input, the trex defaults will not be known which may cause an issue. +The input must be a fragmented file. + +Usage of %s: ` type options struct { @@ -59,7 +57,6 @@ func run(args []string, w io.Writer) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/examples/segmenter/doc.go b/examples/segmenter/doc.go new file mode 100644 index 00000000..d79022e5 --- /dev/null +++ b/examples/segmenter/doc.go @@ -0,0 +1,30 @@ +/* +segmenter segments a progressive mp4 file into init and media segments. +The output is either single-track segments, or muxed multi-track segments. +With the -lazy mode, mdat is read and written lazily. The lazy write +is only for single-track segments, to provide a comparison with the multi-track +implementation. + +There should be at most one audio and one video track in the input. +The output files will be named as +init segments: _a.mp4 and _v.mp4 +media segments: _a_.m4s and _v_.m4s where n >= 1 +or init.mp4 and media_.m4s + +Codecs supported are AVC and HEVC for video and AAC +and AC-3 for audio. + + Usage of segmenter: + + segmenter [options] infile outfilePrefix + + options: + + -d uint + Required: segment duration (milliseconds). 
The segments will start at syncSamples with decoded time >= n*segDur + -lazy + Read/write mdat lazily + -m Output multiplexed segments + -v Verbose output +*/ +package main diff --git a/examples/segmenter/main.go b/examples/segmenter/main.go index d5d2740f..4ebf92f9 100644 --- a/examples/segmenter/main.go +++ b/examples/segmenter/main.go @@ -1,19 +1,3 @@ -// segmenter - segments a progressive mp4 file into init and media segments -// -// The output is either single-track segments, or muxed multi-track segments. -// With the -lazy mode, mdat is read and written lazily. The lazy write -// is only for single-track segments, so that it can be compared with multi-track -// implementation. -// There should be at most one audio and one video track in the input. -// The output files will be named as -// init segments: _a.mp4 and _v.mp4 -// media segments: _a_.m4s and _v_.m4s where n >= 1 -// -// or -// -// init.mp4 and media_.m4s -// -// Codecs supported are AVC and HEVC for video and AAC and AC-3 for audio package main import ( @@ -30,13 +14,11 @@ const ( appName = "segmenter" ) -var usg = `Usage of %s: - -%s segments a progressive mp4 file into init and media segments. +var usg = `%s segments a progressive mp4 file into init and media segments. The output is either single-track segments, or muxed multi-track segments. With the -lazy mode, mdat is read and written lazily. The lazy write -is only for single-track segments, so that it can be compared with multi-track +is only for single-track segments, to provide a comparison with the multi-track implementation. There should be at most one audio and one video track in the input. The output files will be named as @@ -46,6 +28,7 @@ or init.mp4 and media_.m4s Codecs supported are AVC and HEVC for video and AAC and AC-3 for audio. 
+Usage of %s: ` type options struct { @@ -86,7 +69,6 @@ func run(args []string, outDir string) error { if err != nil { if errors.Is(err, flag.ErrHelp) { - fs.Usage() return nil } return err diff --git a/mp4/doc.go b/mp4/doc.go index d599ba20..be838ab9 100644 --- a/mp4/doc.go +++ b/mp4/doc.go @@ -1,55 +1,207 @@ /* -Package mp4 - library for parsing and writing MP4/ISOBMFF files with a focus on fragmented files. +Package mp4 is a library for parsing and writing MP4/ISOBMFF files with a focus on fragmented files. Most boxes have their own file named after the box four-letter name in the ISO/IEC 14996-12 standard, but in some cases, there may be multiple boxes that have the same content, and the code is then having a generic name like visualsampleentry.go. -The Box interface is specified in box.go. It decodes box size and type in the box header and -dispatched decode for each individual box depending on its type. +# Structure and usage -# Implement a new box +The top level structure for both non-fragmented and fragmented mp4 files is [File]. -To implement a new box "fooo", the following is needed: +In a progressive (non-fragmented) [File], the top-level attributes "Ftyp", "Moov", and "Mdat" +point to the corresponding top level boxes. -Create a file fooo.go and with struct type FoooBox. +A fragmented [File] can be more or less complete, like a single init segment, +one or more media segments, or a combination of both, like a CMAF track which renders +into a playable one-track asset. It can also have multiple tracks. +For fragmented files, the following high-level attributes are used: -FoooBox should then implement the Box interface methods: + - Init is an [*mp4.InitSegment] and contains a ftyp and a moov box and provides the + general metadata for a fragmented file, track definitions including time scale and sample descriptors. + It corresponds to a CMAF header. It can also contain one or more `sidx` boxes. 
+ - Segments is a slice of [mp4.MediaSegment] which start with an optional [mp4.StypBox], + possibly one or more [mp4.SidxBox] and then one or more [mp4.Fragment]. + - [mp4.Fragment] is a mp4 fragment with exactly one [mp4.MoofBox] followed by a [mp4.MdatBox] where the latter + contains the media data. It should have one or more [mp4.TrunBox] containing the metadata + for the samples. The fragment can start with one or more [mp4.EmsgBox]. - Type() - Size() - Encode() - EncodeSW() - Info() +It should be noted that it is sometimes hard to decide what should belong to a Segment or Fragment. -but also its own decode method DecodeFooo and DecodeFoooSR, and register these -methods in the decoders map in box.go and decodersSR map in boxsr.go. -For a simple example, look at the `prft` box in `prft.go`. +All child boxes of container boxes such as [mp4.MoofBox] are listed in the Children attribute, but the +most prominent child boxes have direct links with names which makes it possible to write a path such +as -# Container Boxes + fragment.Moof.Traf.Trun -Container boxes like moof, have a list of all their children called Children, -but also direct pointers to the children with appropriate names, -like Mfhd and Traf. This makes it easy to chain box paths to reach an -element like a TfhdBox as +to access the (single or first) [mp4.TrunBox] in a fragment inside the (single or first) [mp4.TrafBox] +of a fragment. - file.Moof.Traf.Tfhd +There are corresponding structures with a plural form for accessing later boxes of the same type, e.g. -When there may be multiple children with the same name, there may be both a -pointer to a slice like Trafs with all boxes and Traf that points to the first. + fragment.Moof.Trafs[1].Trun[1] -# Media Sample Data Structures +to get the second [mp4.TrunBox] of the second [mp4.TrafBox] (provided that they exist). Care must be +taken to assert that none of the intermediate pointers are nil to avoid panic. 
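The nil checks mentioned above can be sketched as follows. This is a minimal illustration, not mp4ff code: the types `Fragment`, `Moof`, `Traf`, and `Trun`, the plural field `Truns`, and the helper `trunAt` are hypothetical stand-ins that only mirror the singular/plural field layout described in the text.

```go
package main

import "fmt"

// Hypothetical stand-ins for the box structs. They only mirror the
// singular/plural field layout described in the text and are NOT the
// actual mp4ff types.
type Trun struct{ SampleCount int }

type Traf struct {
	Trun  *Trun   // first trun, may be nil
	Truns []*Trun // all trun boxes
}

type Moof struct {
	Traf  *Traf   // first traf, may be nil
	Trafs []*Traf // all traf boxes
}

type Fragment struct{ Moof *Moof }

// trunAt walks fragment -> moof -> traf -> trun and returns an error
// instead of panicking when an intermediate pointer or index is missing.
func trunAt(f *Fragment, trafIdx, trunIdx int) (*Trun, error) {
	if f == nil || f.Moof == nil {
		return nil, fmt.Errorf("fragment has no moof box")
	}
	if trafIdx < 0 || trafIdx >= len(f.Moof.Trafs) || f.Moof.Trafs[trafIdx] == nil {
		return nil, fmt.Errorf("no traf box at index %d", trafIdx)
	}
	traf := f.Moof.Trafs[trafIdx]
	if trunIdx < 0 || trunIdx >= len(traf.Truns) || traf.Truns[trunIdx] == nil {
		return nil, fmt.Errorf("no trun box at index %d", trunIdx)
	}
	return traf.Truns[trunIdx], nil
}

func main() {
	frag := &Fragment{Moof: &Moof{Trafs: []*Traf{{Truns: []*Trun{{SampleCount: 3}}}}}}
	if trun, err := trunAt(frag, 0, 0); err == nil {
		fmt.Println("samples in first trun:", trun.SampleCount)
	}
	// The second traf does not exist, so an error is returned rather than a panic.
	if _, err := trunAt(frag, 1, 1); err != nil {
		fmt.Println("lookup failed:", err)
	}
}
```

Wrapping the path in one helper keeps the checks in a single place; with the real box types the same pattern would apply to whichever fields are actually exported.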
-To handle media sample data there are two structures: +# Creating new fragmented files -1. `Sample` stores the sample information used in trun +A typical use case is to generate a fragmented file consisting of an init segment +followed by a series of media segments. -2. `FullSample` also carries a slice with the samples binary data as well as decode time +The first step is to create the init segment. This is done in three steps as can be seen in +[examples/initcreator]: -# Fragmenting segments + init := mp4.CreateEmptyInit() + init.AddEmptyTrack(timescale, mediatype, language) + init.Moov.Trak.SetHEVCDescriptor("hvc1", vpsNALUs, spsNALUs, ppsNALUs) -A MediaSegment can be fragmented into multiple fragments by the method +Here the third step fills in codec-specific parameters into the sample descriptor of the single track. - func (s *MediaSegment) Fragmentify(timescale uint64, trex *TrexBox, duration uint32) ([]*Fragment, error) +The second step is to start producing media segments. They should use the timescale that +was set when creating the init segment. Generally, that timescale should be chosen so that the +sample durations have exact values without rounding errors, e.g. 48000 for 48kHz audio. + +A media segment contains one or more fragments. +If all samples are available before the segment is created, one can use a single +fragment in each segment. Example code for this can be found in [examples/segmenter]. +For low-latency MPEG-DASH generation, short-duration fragments are added to the segment as the +corresponding media samples become available. + +A simple, but not optimal, way of creating a media segment is to first create a slice of +[mp4.FullSample] with the data needed. + +The [mp4.Sample] part is what will be written into the [mp4.TrunBox]. 
+Once a number of such full samples are available, they can be added to a media segment like + + seg := mp4.NewMediaSegment() + frag := mp4.CreateFragment(uint32(segNr), mp4.DefaultTrakID) + seg.AddFragment(frag) + + for _, sample := range samples { + frag.AddFullSample(sample) + } + +This segment can finally be output to a [io.Writer] +as + + err := seg.Encode(w) + +or to a [bits.SliceWriter] as + + err := seg.EncodeSW(sw) + +For multi-track segments, the code is a bit more involved. Please have a look at [examples/segmenter] +to see how it is done. A more optimal way of handling media sample is +to handle them lazily, or using intervals, as explained next. + +# Lazy decoding and writing of mdat data + +For video and audio, the dominating part of a mp4 file is the media data which is stored +in one or more [mp4.MdatBox]. In some cases, for example when segmenting large progressive +files, it is much more memory efficient to just read the movie or fragment metadata +from the [mp4.MoovBox] or [mp4.MoofBox] and defer the reading of the media data from +the [mp4.MdatBox] to later. + +For decoding, this is supported by running +[DecodeFile] in lazy mode as + + parsedMp4, err = mp4.DecodeFile(ifd, mp4.WithDecodeMode(mp4.DecModeLazyMdat)) + +In this case, the media data of the [mp4.MdatBox] box will not be read, but only its size is being saved. +To read or copy the actual data corresponding to a sample, one must calculate the +corresponding byte range and either call + + func (m *MdatBox) ReadData(start, size int64, rs io.ReadSeeker) ([]byte, error) + +or + + func (m *MdatBox) CopyData(start, size int64, rs io.ReadSeeker, w io.Writer) (nrWritten int64, err error) + +Example code for this, including lazy writing of [mp4.MdatBox], can be found in [examples/segmenter] +with the lazy mode set. 
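The byte-range calculation mentioned above can be sketched with the standard library alone. This is an illustration under stated assumptions: the samples are assumed to lie back to back starting at a known file offset, and `byteRange`, `sampleRanges`, `readRange`, and `demo` are hypothetical names, not part of mp4ff. How the `start` argument of `ReadData` and `CopyData` is interpreted should be checked against the package documentation.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// byteRange says where the data of one sample lies in the input.
// Hypothetical helper type, not part of mp4ff.
type byteRange struct {
	Start int64
	Size  int64
}

// sampleRanges computes the byte range of each sample, ASSUMING the
// samples are stored back to back starting at offset dataStart.
func sampleRanges(dataStart int64, sizes []uint32) []byteRange {
	ranges := make([]byteRange, 0, len(sizes))
	pos := dataStart
	for _, s := range sizes {
		ranges = append(ranges, byteRange{Start: pos, Size: int64(s)})
		pos += int64(s)
	}
	return ranges
}

// readRange seeks to r.Start and reads exactly r.Size bytes.
func readRange(rs io.ReadSeeker, r byteRange) ([]byte, error) {
	if _, err := rs.Seek(r.Start, io.SeekStart); err != nil {
		return nil, fmt.Errorf("seek to %d: %w", r.Start, err)
	}
	buf := make([]byte, r.Size)
	if _, err := io.ReadFull(rs, buf); err != nil {
		return nil, fmt.Errorf("read %d bytes: %w", r.Size, err)
	}
	return buf, nil
}

// demo reads three fake samples from an in-memory "file" with a
// 6-byte header followed by payloads of 3, 2, and 5 bytes.
func demo() ([]string, error) {
	file := bytes.NewReader([]byte("HEADERaaabbccccc"))
	var out []string
	for _, r := range sampleRanges(6, []uint32{3, 2, 5}) {
		data, err := readRange(file, r)
		if err != nil {
			return nil, err
		}
		out = append(out, string(data))
	}
	return out, nil
}

func main() {
	samples, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(samples)
}
```

Only the metadata (offset and sample sizes) has to be held in memory; each payload is fetched on demand, which is the point of the lazy mode.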
+ +# More efficient I/O using SliceReader and SliceWriter + +The use of the interfaces [io.Reader] and [io.Writer] for reading and writing boxes gives a lot of +flexibility, but is not optimal when it comes to memory allocation. In particular, the +Read(p []byte) method needs a slice "p" of the proper size to read data, which leads to a +lot of allocations and copying of data. +In order to achieve better performance, it is advantageous to read the full top level boxes into +one, or a few, slices and decode these. This is the reason that [bits.SliceReader] and [bits.SliceWriter] +were introduced and that there are two sets of methods for decoding and encoding all boxes using +either of the interfaces. For benchmarks, see the [README.md of the mp4ff module]. + +For further reduction of memory allocation, use a buffered top-level reader, especially when +reading the [mp4.MdatBox] box of a progressive file. + +# More about mp4 boxes + +The mp4 package contains a lot of box implementations. + +The [Box] interface is specified in box.go. It decodes box size and type in the box header and +dispatches decode for each individual box depending on its type. + +There is also a [ContainerBox] interface which is used for boxes that contain other boxes. + +Most boxes have their own file named after the box, but in some cases, there may be multiple boxes +that have the same content, and the box structure and the source code file then have a generic name like +[mp4.VisualSampleEntryBox]. + +The interfaces define common Box methods including encode (writing), +but not the decode (parsing) methods which have distinct names for each box type and are +dispatched from the parsed box name. + +That dispatch based on box name is defined by the tables "mp4.decodersSR" and "mp4.decoders" +for the functions "mp4.DecodeBoxSR" and "mp4.DecodeBox", respectively. +The "SR" variant that uses [bits/SliceReader] should normally be used for better performance.
+If a box name is unknown, it will result in an [mp4.UnknownBox] being created. + +# How to implement a new box + +To implement a new box "fooo", the following is needed. + 1. Create a new file "fooo.go" and create a struct type "FoooBox". + 2. "FoooBox" must implement the [mp4.Box] interface methods. + 3. It also needs its own decode methods "DecodeFoooSR" and "DecodeFooo", + which must be added in the "decodersSR" map and "decoders" map, respectively. + For a simple example, look at the [mp4.PrftBox]. + 4. A test file "fooo_test.go" should also have a test using the method "boxDiffAfterEncodeAndDecode" + to check that the box information is equal after encoding and decoding. + +# Direct changes of attributes + +Many attributes are public and can therefore be changed freely. +The advantage of this is that it is possible to write code that can manipulate boxes +in many different ways, but one must be cautious to avoid breaking links to sub boxes or +creating inconsistent states in the boxes. + +As an example, container boxes such as [mp4.TrafBox] have a method "AddChild" which +adds a box to "Children", its slice of child boxes, but also sets a specific +member reference such as "Tfdt" to point to that box. If "Children" is manipulated +directly, that link may no longer be valid. + +# Encoding modes and optimizations + +For fragmented files, one can choose to either encode all boxes in a [mp4.File], or only encode +the ones which are included in the init and media segments. The attribute that controls that +is called [mp4.FragEncMode]. +Another attribute [mp4.EncOptimize] controls possible optimizations of the file encoding process. +Currently, there is only one possible optimization called [mp4.OptimizeTrun]. +It can reduce the size of the [mp4.TrunBox] by finding and writing default +values in the [mp4.TfhdBox] and omitting the corresponding values from the [mp4.TrunBox]. +Note that this may change the size of all ancestor boxes of the [mp4.TrunBox].
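The idea behind moving per-sample values into defaults can be illustrated with a small standalone sketch. It is not the mp4ff implementation of the trun optimization: the type "sample" and the function "commonDuration" are hypothetical names, and only the duration field is considered. If every sample in a run has the same duration, that value could be written once as a default and left out of each per-sample entry.

```go
package main

import "fmt"

// sample is a hypothetical, simplified stand-in for a per-sample trun entry.
type sample struct {
	Dur  uint32 // sample duration in track timescale units
	Size uint32 // sample size in bytes
}

// commonDuration reports whether all samples share the same duration. If so,
// that duration is a candidate for a default value stored once instead of
// once per sample.
func commonDuration(samples []sample) (dur uint32, ok bool) {
	if len(samples) == 0 {
		return 0, false
	}
	dur = samples[0].Dur
	for _, s := range samples[1:] {
		if s.Dur != dur {
			return 0, false
		}
	}
	return dur, true
}

func main() {
	samples := []sample{{1024, 300}, {1024, 310}, {1024, 295}}
	if dur, ok := commonDuration(samples); ok {
		fmt.Println("default sample duration:", dur)
	} else {
		fmt.Println("durations differ; keep per-sample values")
	}
}
```

The same check could be repeated for other per-sample fields such as size or flags. Since omitting fields shrinks the box, the sizes of the enclosing boxes change as well, which is the effect noted above.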
+ +# Sample Number Offset + +Following the ISOBMFF standard, sample numbers and other numbers start at 1 (one-based). +This applies to arguments of functions and methods. +The actual storage in slices is zero-based, so sample number 1 has index 0 in the corresponding slice. + +[examples/initcreator]: https://pkg.go.dev/Eyevinn/mp4ff/examples/initcreator +[examples/segmenter]: https://pkg.go.dev/Eyevinn/mp4ff/examples/segmenter +[README.md of the mp4ff module]: https://pkg.go.dev/github.com/Eyevinn/mp4ff#section-readme +[bits/SliceReader]: https://pkg.go.dev/Eyevinn/mp4ff/bits#SliceReader + +[bits/SliceWriter]: https://pkg.go.dev/Eyevinn/mp4ff/bits#SliceWriter */ package mp4 diff --git a/sei/doc.go b/sei/doc.go new file mode 100644 index 00000000..25c9cd24 --- /dev/null +++ b/sei/doc.go @@ -0,0 +1,7 @@ +/* +Package sei provides SEI (Supplemental Enhancement Information) parsing and encoding for both AVC and HEVC +video standards. The SEI RBSP syntax is defined in Section 7.3.2.3 of ISO/IEC 14496-10 (AVC) 2020 and earlier. +For AVC, the SEI messages and their syntax are defined in ISO/IEC 14496-10 2020 Annex D. +For HEVC, the SEI messages and their syntax are defined in ISO/IEC 23008-2 Annex D. +*/ +package sei diff --git a/sei/sei.go b/sei/sei.go index 208230bc..7e52baa5 100644 --- a/sei/sei.go +++ b/sei/sei.go @@ -1,7 +1,3 @@ -// Package SEI provides SEI (Supplementary Enhancement Information) parsing and encoding for both AVC and HEVC. -// The SEI RBSP syntax is defined in Section 7.3.2.3 of ISO/IEC 14496-10 (AVC) 2020 and earlier. -// For AVC, the SEI messages and their syntax is defined in ISO/IEC 14496-10 2020 Annex D. -// For HEVC, the SEI message and their syntax i defined in ISO/IEC 23008-2 Annex D. package sei import (