Audio Video Solution Pack Release Notes
- Introduction
- Ingest
- Browser support
- Server support
- Repository support
- Management
- Metadata extraction + storage
- Datastream storage and management
- Transcoding + thumbnail extraction
- Integration with delivery systems
- Access
- Clients
- Servers, platforms and protocols
This report walks through the life cycle of large media objects within a digital repository ecosystem, from initial ingest and its software limitations, through management and preservation concerns, to access and playback issues. While superficially similar to the issues around large scientific datasets, large media files bring additional challenges, including monolithic (and un-segmentable) datastreams, heterogeneous and proprietary encoding schemes, intellectual property rights (including third-party and appearance rights), and a nascent community without large-scale support. I will look at each section from the perspective of both server-side infrastructure and client-side user experience. While many of the tools and approaches are infrastructure agnostic, I will dive into some of the concerns specific to the Fedora Commons repository architecture and to the Islandora (Drupal) and Blacklight/Hydra (Ruby on Rails) stacks. Much of this information is drawn from my experience implementing media interfaces for WGBH's Open Vault website.
Dealing with large file sizes on the Internet is a challenge in ordinary web development, and it is only compounded by the many uses of an object within a digital repository context. In recent years there has been much progress in the user<->provider space, but much remains to be done to fully support large media files within the digital object lifecycle.
Web browser support for large, preservation-quality files is problematic: user-agent implementations restrict the maximum POST/PUT size, offer varying levels of user feedback, and sometimes fail silently. Newer browser implementations generally have better support. As of this writing, a 2GB size limit is normal, although new versions of both Opera and Google Chrome have no restriction (see http://www.motobit.com/help/scptutl/pa98.htm). The default file upload handler in these user-agents provides little feedback, although JavaScript or HTML5 approaches can provide this functionality.
Unlike large textual datasets, media files are not easily or efficiently compressed or chunked, which makes those two simple approaches less useful.
A variety of client-side work-arounds exist, including browser-plugin based upload clients for Flash (SWFUpload, Uploadify) or Java (JUpload, postlet).
Finally, a third option is to use asynchronous or off-line file transfer via existing infrastructure (FTP, network fileshares, etc). While requiring additional intervention or maintenance by repository staff, this approach is certainly the most consistent and reliable.
The Audio/Video Solution Pack release does not require, expect or provide support for any of these approaches, allowing individual implementations to make choices (taking into account user needs and user experience) and provide standard upload implementations across content models.
In addition to the client-side challenges, different web server modules and applications may impose further restrictions on file uploads.
Apache provides a number of tools for the server administrator to manage or restrict file upload capabilities within the HTTP server. The [[LimitRequestBody Directive|http://httpd.apache.org/docs/2.0/mod/core.html#limitrequestbody]], perhaps the most common approach, restricts the size of the request body. It is disabled by default, but third-party distributions may enable it.
After the file transfer, different modules may perform additional filtering or processing on the file. The PHP documentation has an entire section devoted to file upload problems. As application processing of larger files may lead to performance issues, there are third-party modules for Apache (and other web servers), such as mod_porter, that handle the uploaded file within the web server and pass only a reference to the application.
Within the Drupal community, there has been significant work to support the aforementioned client-side file upload work-arounds. For dealing with some content types, there are also bundled plugins like media_mover, which provides support for asynchronous workflows.
Finally, the operating system or filesystem configuration may impose additional restrictions on the maximum file size. This may be an issue for client, server, or transfer workflows, especially when dealing with third-party providers or off-line file transfer. During a DuraCloud pilot, we discovered that cloud storage providers often have unusual maximum file sizes (AWS S3 restricted files to 5GB), and that transferring large, preservation-quality files on external disks may require very specific disk cluster sizes.
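Because these limits stack up across browser, server and storage provider, it can help to check a file against them before choosing a transfer route. The following is only a minimal sketch: the two limits are the figures quoted in this report, and the names `LIMITS`, `blocked_routes` and `check_file` are hypothetical, not part of any repository software.

```python
import os

# Illustrative limits taken from the figures quoted in this report; real
# limits vary by provider, filesystem and software version.
GIB = 1024 ** 3
LIMITS = {
    "browser_upload": 2 * GIB,    # typical browser POST/PUT ceiling
    "s3_single_object": 5 * GIB,  # limit observed during the DuraCloud pilot
}


def blocked_routes(size_bytes, limits=LIMITS):
    """Return the names of transfer routes whose limit the size exceeds."""
    return sorted(name for name, limit in limits.items() if size_bytes > limit)


def check_file(path, limits=LIMITS):
    """Look up a file's size on disk and report the routes it cannot use."""
    return blocked_routes(os.path.getsize(path), limits)
```

A file for which every route is blocked is a candidate for the off-line transfer option described earlier.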
Large datastream challenges within Fedora are currently tracked in JIRA and on a roadmap theme. The high-level storage architecture, currently in a brainstorm phase, may address some or all of these issues.
Historically, until Fedora 3.2, uploaded file content was stored entirely on the heap, limiting the maximum file upload size to the amount of free memory. Until Fedora 3.4, managed datastreams larger than 2GB could not even be checksummed. There is an ongoing effort to globally refactor file handling for Fedora 3.5. For implementors using older versions of Fedora, externally managed and redirect datastreams were the only reliable way to handle large datastreams. This worked for predictable use cases, at the risk of making digital object management and preservation workflows more complex.
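To illustrate why holding a whole file in memory is the limiting factor, here is a minimal sketch of a checksum computed in fixed-size chunks, so that memory use does not grow with file size. It is not Fedora's implementation; the function names and the 1 MB chunk size are assumptions chosen for the example.

```python
import hashlib
import io


def stream_checksum(fileobj, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file-like object chunk by chunk so memory use stays constant."""
    digest = hashlib.new(algorithm)
    # iter() with a sentinel keeps calling read() until it returns b"".
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def file_checksum(path, algorithm="md5"):
    """Checksum a file on disk without loading it into memory."""
    with open(path, "rb") as handle:
        return stream_checksum(handle, algorithm)


# Small in-memory demonstration; a real datastream would be opened from disk.
print(stream_checksum(io.BytesIO(b"hello world"), chunk_size=4))
```

The same approach applies to externally managed datastreams, where the checksum has to be computed outside the repository.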
The management challenges of large media files are split into three sections: first, what needs to happen with submitted content before it is ready to be stored for the long term; second, processes and workflows around storage; finally, processes and workflows around making the content accessible and usable.
a) Newly ingested content may arrive with significant descriptive metadata, but fragmentary or inaccurate technical metadata (much like other datastreams, but with a larger set of important information including wrappers, codecs, track information, frame size, etc). Because media formats are a secondary concern in digital curation systems, robust support for them is frequently lacking in the traditional tools (e.g. JHOVE).
Probably the most comprehensive open source metadata extraction tool is exiftool, a Perl utility that can extract technical metadata from images, audio, video and more.
ExifTool Version Number : 7.49
File Name : test.mp4
Directory : .
File Size : 2.8 MB
File Modification Date/Time : 2010:01:01 16:54:04-05:00
File Type : MP4
MIME Type : video/mp4
Version : 0
Create Date : 2010:01:01 21:53:38
Modify Date : 2010:01:01 21:54:04
Time Scale : 90000
Duration : 01:25
Preferred Rate : 1
Preferred Volume : 100.00%
Preview Time : 0 s
Preview Duration : 0 s
Poster Time : 0 s
Selection Time : 0 s
Selection Duration : 0 s
Current Time : 0 s
Next Track ID : 3
Track Version : 0
Track Create Date : 2010:01:01 21:53:38
Track Modify Date : 2010:01:01 21:54:04
Track ID : 1
Track Duration : 01:25
Track Layer : 0
Track Volume : 0.00%
Graphics Mode : srcCopy
Op Color : 0 0 0
Compressor ID : avc1
Image Width : 320
Image Height : 240
X Resolution : 72
Y Resolution : 72
Compressor Name : JVT/AVC Coding
Bit Depth : 24
Video Frame Rate : 15.0
Media Header Version : 0
Media Create Date : 2010:01:01 21:53:38
Media Modify Date : 2010:01:01 21:54:04
Media Time Scale : 48000
Media Duration : 01:23
Media Language Code : eng
Balance : 0
Audio Format : mp4a
Audio Channels : 2
Audio Bits Per Sample : 16
Audio Sample Rate : 48000
Name : Stereo
Handler Type : Metadata
Encoder : HandBrake 0.9.4 2009112300
Image Size : 320x240
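The plain-text listing above is easy to turn into a lookup table. A minimal sketch, assuming the `Tag Name : value` layout shown here; the function name `parse_exiftool_text` is hypothetical, and when a tag name repeats only the first value is kept.

```python
def parse_exiftool_text(output):
    """Turn 'Tag Name : value' lines into a dict (first occurrence wins)."""
    tags = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as dates keep theirs.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a separator
        tags.setdefault(name.strip(), value.strip())
    return tags


SAMPLE = """\
File Type : MP4
Create Date : 2010:01:01 21:53:38
Duration : 01:25
Image Width : 320
Image Height : 240
Compressor ID : avc1
"""

tags = parse_exiftool_text(SAMPLE)
print(tags["Image Width"], tags["Image Height"], tags["Create Date"])
```

Splitting on the first colon relies on tag names never containing one, which holds for the listing above but is an assumption about exiftool output in general.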
Note that modern versions of exiftool can also output basic RDF/XML, which may be easier to work with within a repository context:
<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='test.mp4'
xmlns:et='http://ns.exiftool.ca/1.0/' et:toolkit='Image::ExifTool 7.49'
xmlns:ExifTool='http://ns.exiftool.ca/ExifTool/1.0/'
xmlns:File='http://ns.exiftool.ca/File/1.0/'
xmlns:QuickTime='http://ns.exiftool.ca/QuickTime/QuickTime/1.0/'
xmlns:Track1='http://ns.exiftool.ca/QuickTime/Track1/1.0/'
xmlns:Track2='http://ns.exiftool.ca/QuickTime/Track2/1.0/'
xmlns:Composite='http://ns.exiftool.ca/Composite/1.0/'>
<ExifTool:ExifToolVersion>7.49</ExifTool:ExifToolVersion>
<File:FileName>test.mp4</File:FileName>
<File:Directory>.</File:Directory>
<File:FileSize>2.8 MB</File:FileSize>
<File:FileModifyDate>2010:01:01 16:54:04-05:00</File:FileModifyDate>
<File:FileType>MP4</File:FileType>
<File:MIMEType>video/mp4</File:MIMEType>
<QuickTime:Version>0</QuickTime:Version>
<QuickTime:CreateDate>2010:01:01 21:53:38</QuickTime:CreateDate>
<QuickTime:ModifyDate>2010:01:01 21:54:04</QuickTime:ModifyDate>
<QuickTime:TimeScale>90000</QuickTime:TimeScale>
<QuickTime:Duration>01:25</QuickTime:Duration>
<QuickTime:PreferredRate>1</QuickTime:PreferredRate>
<QuickTime:PreferredVolume>100.00%</QuickTime:PreferredVolume>
<QuickTime:PreviewTime>0 s</QuickTime:PreviewTime>
<QuickTime:PreviewDuration>0 s</QuickTime:PreviewDuration>
<QuickTime:PosterTime>0 s</QuickTime:PosterTime>
<QuickTime:SelectionTime>0 s</QuickTime:SelectionTime>
<QuickTime:SelectionDuration>0 s</QuickTime:SelectionDuration>
<QuickTime:CurrentTime>0 s</QuickTime:CurrentTime>
<QuickTime:NextTrackID>3</QuickTime:NextTrackID>
<QuickTime:MediaHeaderVersion>0</QuickTime:MediaHeaderVersion>
<QuickTime:MediaCreateDate>2010:01:01 21:53:38</QuickTime:MediaCreateDate>
<QuickTime:MediaModifyDate>2010:01:01 21:54:04</QuickTime:MediaModifyDate>
<QuickTime:MediaTimeScale>90000</QuickTime:MediaTimeScale>
<QuickTime:MediaDuration>01:25</QuickTime:MediaDuration>
<QuickTime:MediaLanguageCode>und</QuickTime:MediaLanguageCode>
<QuickTime:HandlerType>Video Track</QuickTime:HandlerType>
<QuickTime:GraphicsMode>srcCopy</QuickTime:GraphicsMode>
<QuickTime:OpColor>0 0 0</QuickTime:OpColor>
<QuickTime:CompressorID>avc1</QuickTime:CompressorID>
<QuickTime:ImageWidth>320</QuickTime:ImageWidth>
<QuickTime:ImageHeight>240</QuickTime:ImageHeight>
<QuickTime:XResolution>72</QuickTime:XResolution>
<QuickTime:YResolution>72</QuickTime:YResolution>
<QuickTime:CompressorName>JVT/AVC Coding</QuickTime:CompressorName>
<QuickTime:BitDepth>24</QuickTime:BitDepth>
<QuickTime:VideoFrameRate>15.0</QuickTime:VideoFrameRate>
<QuickTime:MediaHeaderVersion>0</QuickTime:MediaHeaderVersion>
<QuickTime:MediaCreateDate>2010:01:01 21:53:38</QuickTime:MediaCreateDate>
<QuickTime:MediaModifyDate>2010:01:01 21:54:04</QuickTime:MediaModifyDate>
<QuickTime:MediaTimeScale>48000</QuickTime:MediaTimeScale>
<QuickTime:MediaDuration>01:23</QuickTime:MediaDuration>
<QuickTime:MediaLanguageCode>eng</QuickTime:MediaLanguageCode>
<QuickTime:HandlerType>Audio Track</QuickTime:HandlerType>
<QuickTime:Balance>0</QuickTime:Balance>
<QuickTime:AudioFormat>mp4a</QuickTime:AudioFormat>
<QuickTime:AudioChannels>2</QuickTime:AudioChannels>
<QuickTime:AudioBitsPerSample>16</QuickTime:AudioBitsPerSample>
<QuickTime:AudioSampleRate>48000</QuickTime:AudioSampleRate>
<QuickTime:Name>Stereo</QuickTime:Name>
<QuickTime:HandlerType>Metadata</QuickTime:HandlerType>
<QuickTime:Encoder>HandBrake 0.9.4 2009112300</QuickTime:Encoder>
<Track1:TrackVersion>0</Track1:TrackVersion>
<Track1:TrackCreateDate>2010:01:01 21:53:38</Track1:TrackCreateDate>
<Track1:TrackModifyDate>2010:01:01 21:54:04</Track1:TrackModifyDate>
<Track1:TrackID>1</Track1:TrackID>
<Track1:TrackDuration>01:25</Track1:TrackDuration>
<Track1:TrackLayer>0</Track1:TrackLayer>
<Track1:TrackVolume>0.00%</Track1:TrackVolume>
<Track1:ImageWidth>320</Track1:ImageWidth>
<Track1:ImageHeight>240</Track1:ImageHeight>
<Track2:TrackVersion>0</Track2:TrackVersion>
<Track2:TrackCreateDate>2010:01:01 21:53:38</Track2:TrackCreateDate>
<Track2:TrackModifyDate>2010:01:01 21:54:04</Track2:TrackModifyDate>
<Track2:TrackID>2</Track2:TrackID>
<Track2:TrackDuration>01:23</Track2:TrackDuration>
<Track2:TrackLayer>0</Track2:TrackLayer>
<Track2:TrackVolume>100.00%</Track2:TrackVolume>
<Composite:ImageSize>320x240</Composite:ImageSize>
</rdf:Description>
</rdf:RDF>
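One advantage of the RDF/XML form is that the tag groups (QuickTime, Track1, Track2) survive as namespaces, so per-track values can be addressed directly. A minimal sketch using Python's standard `xml.etree.ElementTree`; the namespace URIs are copied from the output above, the sample is a shortened copy of it, and `extract_tags` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "QuickTime": "http://ns.exiftool.ca/QuickTime/QuickTime/1.0/",
    "Track1": "http://ns.exiftool.ca/QuickTime/Track1/1.0/",
}

SAMPLE = """<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='test.mp4'
  xmlns:QuickTime='http://ns.exiftool.ca/QuickTime/QuickTime/1.0/'
  xmlns:Track1='http://ns.exiftool.ca/QuickTime/Track1/1.0/'>
 <QuickTime:Duration>01:25</QuickTime:Duration>
 <QuickTime:CompressorID>avc1</QuickTime:CompressorID>
 <Track1:ImageWidth>320</Track1:ImageWidth>
 <Track1:ImageHeight>240</Track1:ImageHeight>
</rdf:Description>
</rdf:RDF>"""


def extract_tags(rdf_xml, wanted):
    """Return {prefixed name: text} for each wanted tag; None when absent."""
    if isinstance(rdf_xml, str):
        # Parse as bytes so an XML declaration with an encoding is honoured.
        rdf_xml = rdf_xml.encode("utf-8")
    root = ET.fromstring(rdf_xml)
    description = root.find("rdf:Description", NS)
    result = {}
    for name in wanted:
        element = description.find(name, NS)
        result[name] = element.text if element is not None else None
    return result


print(extract_tags(SAMPLE, ["QuickTime:Duration", "Track1:ImageWidth"]))
```

Note that `find` returns only the first matching element, which matters for tags such as `QuickTime:MediaDuration` that appear once per track in the full output.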
For preservation workflows that may want to use JHOVE (or similar applications) for metadata, the FITS project combines output from multiple metadata extraction tools (including JHOVE and exiftool) into a single interface.
<?xml version="1.0" encoding="UTF-8"?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="0.3.1" timestamp="11/10/10 08:43">
<identification status="SINGLE_RESULT">
<identity format="ISO Media, MPEG v4 system, version 2" mimetype="video/mp4">
<tool toolname="Exiftool" toolversion="7.74" />
<version toolname="Exiftool" toolversion="7.74">0</version>
</identity>
</identification>
<fileinfo>
<lastmodified toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">2010:01:01 16:54:04-05:00</lastmodified>
<created toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">2010:01:01 21:53:38</created>
<filename toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">test.mp4</filename>
<size toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">2963484</size>
<md5checksum toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">df38217609323522909e25a37e444d26</md5checksum>
<fslastmodified toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">1262382844000</fslastmodified>
</fileinfo>
<filestatus />
<metadata>
<video>
<duration toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">01:25</duration>
<frameRate toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">15.0</frameRate>
<bitDepth toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">24</bitDepth>
<sampleRate toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">48000</sampleRate>
<channels toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">2</channels>
<imageWidth toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">320</imageWidth>
<imageHeight toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">240</imageHeight>
<xSamplingFrequency toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">72</xSamplingFrequency>
<ySamplingFrequency toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">72</ySamplingFrequency>
<creatingApplicationName toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">JVT/AVC Coding</creatingApplicationName>
</video>
</metadata>
</fits>
From this data, there are several key pieces that are needed for preservation and access workflows:
- Image Height/Image Width
- Duration
- Compressor ID/Compressor Name/Encoder
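The key pieces listed above can be pulled from the FITS output and normalised into values a workflow can compare. This is a minimal sketch, assuming the element names and namespace shown in the FITS sample and a colon-separated duration; `clock_to_seconds` and `key_video_fields` are hypothetical names.

```python
import xml.etree.ElementTree as ET

FITS_NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Shortened copy of the FITS output shown earlier.
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
<metadata><video>
<duration toolname="Exiftool" toolversion="7.74">01:25</duration>
<imageWidth toolname="Exiftool" toolversion="7.74">320</imageWidth>
<imageHeight toolname="Exiftool" toolversion="7.74">240</imageHeight>
<creatingApplicationName toolname="Exiftool" toolversion="7.74">JVT/AVC Coding</creatingApplicationName>
</video></metadata>
</fits>"""


def clock_to_seconds(text):
    """Convert 'mm:ss' or 'hh:mm:ss' into a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def key_video_fields(fits_xml):
    """Extract frame size, duration and compressor name from FITS XML."""
    root = ET.fromstring(fits_xml)
    video = root.find("fits:metadata/fits:video", FITS_NS)

    def text_of(name):
        return video.find("fits:" + name, FITS_NS).text

    return {
        "width": int(text_of("imageWidth")),
        "height": int(text_of("imageHeight")),
        "duration_seconds": clock_to_seconds(text_of("duration")),
        "compressor": text_of("creatingApplicationName"),
    }


print(key_video_fields(SAMPLE))
```

The sketch deliberately fails with an exception when an element is missing, since a file without a frame size or duration probably needs manual review anyway.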
Because media datastreams are arguably more heterogeneous than other, relatively standardized formats, a tool that merges so-called repository workflows with popular, well-supported community tools may make it easier to accept obscure or nascent formats as quickly as possible (e.g. the brand new WebM video format from Google).
While the tool-based metadata may be sufficient, converting the output into a standard format may be desirable. However, there is little agreement on a universal standard for storing it.
- [VideoMD](http://lcweb2.loc.gov/mets/Schemas/VMD.xsd) is part of a series of metadata specifications from the Library of Congress and does a good job of capturing a base level of information about media assets.
- [PBCore](http://pbcore.org) is a nascent standard (currently in 2.0 development) from the U.S. Corporation for Public Broadcasting, intended primarily for broadcast and media production workflows. While not limited to technical metadata, the PBCore Instantiation element has elements for encoding wrappers, tracks, and codecs.
- [EBUCore](http://tech.ebu.ch/lang/en/MetadataEbuCore), similar to PBCore, is a metadata standard created by the European Broadcasting Union, with a more developed structure and controlled vocabulary.
- [MPEG-7](http://mpeg.chiariglione.org/standards/mpeg-7/mpeg-7.htm)/[MPEG-21](http://mpeg.chiariglione.org/standards/mpeg-21/mpeg-21.htm) are two standards developed by the Moving Picture Experts Group. They are extremely technical, but have found acceptance within some communities. Perhaps due to the difficulty of authoring MPEG metadata, there is a growing set of [community-developed tools](http://www.multimedia-metadata.info/Software%20and%20Tools).
- SMPTE DMS-1
- [SMPTE Metadata Dictionary RP210](http://www.smpte-ra.org/mdd/RP210v2-1merged-020507b.xls) (XLS)
b) Asset management and storage
Storing large media files isn't much different from regular large datastream storage, but there are a couple of additional considerations. First, segmenting large media files may significantly increase access time.
c) In order to support access, preservation-quality media files may need to be transcoded. While it can be an entirely automated process (and often is), media quality can be improved with a little manual intervention.
- Quality
- Watermarking
- Hinting/m3u8/etc
The art of the thumbnail... If you're lucky, the exiftool-extracted metadata could provide poster frame information (Preview Time/Poster Time).
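As a rough illustration of using that metadata, the sketch below picks a poster offset from exiftool-style tags and builds a frame-grab command. Everything here is an assumption rather than part of the Solution Pack: ffmpeg is chosen only as a common example tool, a value of `0 s` is treated as "not set", and the function names are hypothetical. The command is built but not executed.

```python
def parse_seconds(value):
    """Parse an exiftool-style time such as '0 s' or '12.5 s' into seconds."""
    return float(value.split()[0])


def poster_offset(tags, fallback=0.0):
    """Prefer Poster Time, then Preview Time; fall back when both are unset."""
    for name in ("Poster Time", "Preview Time"):
        if name in tags:
            seconds = parse_seconds(tags[name])
            if seconds > 0:  # assumption: zero means the encoder set nothing
                return seconds
    return fallback


def thumbnail_command(source, target, offset):
    """Build an ffmpeg argument list that grabs one frame at the offset."""
    return ["ffmpeg", "-ss", str(offset), "-i", source,
            "-frames:v", "1", target]


tags = {"Poster Time": "0 s", "Preview Time": "12.5 s"}
print(thumbnail_command("test.mp4", "poster.jpg", poster_offset(tags)))
```

In the sample file above both times are `0 s`, so the fallback would be used; choosing a good fallback (for example, a point some way into the duration) is where the manual "art" comes in.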
Getting it into delivery systems (streaming servers, CDNs, and more..)
mod_rewrite hack.
Fractured client market
- significant focus on web-friendly playback, 3rd party providers..
- few good alternatives for arbitrary playback, e.g. QuickTime (badly neglected..)
Awkward delivery mechanisms