Today I’ve uploaded a new, slightly better-looking version of the demo and added a new feature: extended descriptions. Descriptions convey what is happening in the picture to those who cannot see it; they are the audio counterpart of captions. This brings up one of the big remaining issues in HTML5: if you want to associate another media resource with the main video, there is no really satisfactory way to do it. You might want to go and play with the feature a little, then come back to follow the discussion.
For the ASL video, absolute synchronisation is not that important, since the grammar of sign language is such that it is very difficult to get a close temporal correspondence between the two languages. Aligning the ASL was therefore ‘simply’ a matter of obtaining a video of the same length, in which a sign interpreter listened to the tape and signed an equivalent.
(Side note: this video was extremely hard for the interpreter to sign. The pace of the dialogue is so fast that she needed two or three takes to get it down, something she would not normally need in the course of ordinary dialogue. She was a real professional, and a trouper for sticking with it.)
Then in the JavaScript there is the following code:
if (!videoElement.paused && asl) {
    if (asl.paused) {
        asl.play();
    }
    asl.currentTime = videoElement.currentTime;
} else if (videoElement.paused && asl) {
    if (!asl.paused) {
        asl.pause();
    }
    asl.currentTime = videoElement.currentTime;
}
This is called from the timeupdate event handler for the main media and simply moves the ASL timeline to stay in sync. It is very rough and ready, but it seems to work in IE9 Beta and Safari, although maybe not so well in Firefox. What we really need in HTML5 is the ability to slave one video to another. In my markup I’ve added a ‘syncto’ attribute, which is intended to define this relationship:
<video id="theASL" syncto="theVideo" poster="RealPCPride_Thumb.jpg">
    <source src="RealPCPride.asl.mp4" type="video/mp4" />
    <source src="RealPCPride.asl.ogg" type="video/ogg" />
</video>
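Until something like ‘syncto’ exists natively, a script has to honour it by hand. The following is only a sketch of how that might look, not the demo’s actual code: syncMedia and its tolerance parameter are names and values I have made up for illustration. The one change from the handler shown earlier is that it seeks the slave only when the drift exceeds a threshold, since seeking on every timeupdate may cause stutter in some browsers.

```javascript
// Hypothetical helper: keep `slave` in step with `master`.
// Both arguments only need paused, currentTime, play() and pause(),
// so real <video> elements or simple stubs will do.
// `tolerance` (seconds) is an assumed threshold, not a value from the demo.
function syncMedia(master, slave, tolerance = 0.3) {
  if (!master || !slave) {
    return false;
  }
  // Mirror the play/pause state of the master.
  if (master.paused && !slave.paused) {
    slave.pause();
  } else if (!master.paused && slave.paused) {
    slave.play();
  }
  // Seek only when the two timelines have drifted apart noticeably.
  const drift = Math.abs(slave.currentTime - master.currentTime);
  if (drift > tolerance) {
    slave.currentTime = master.currentTime;
    return true; // a seek was needed
  }
  return false;
}

// Browser-only wiring that reads the experimental syncto attribute.
if (typeof document !== 'undefined') {
  document.querySelectorAll('video[syncto]').forEach((slave) => {
    const master = document.getElementById(slave.getAttribute('syncto'));
    if (master) {
      master.addEventListener('timeupdate', () => syncMedia(master, slave));
    }
  });
}
```

With the markup above, the wiring would find theASL, look up theVideo, and call syncMedia on each timeupdate of the main video. The right tolerance is a judgment call; for sign language, where close temporal correspondence is not expected anyway, a fairly generous value is probably fine.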
Playing audio descriptions, however, is a little more involved, since the timing needs to be tighter. For non-extended descriptions, the simplest option is to provide the video with two audio tracks, one with pre-mixed descriptions and one without. For extended descriptions, however, we want to trigger an audio clip at a given moment. I’ve added annotations to the TTML example with pointers to the audio file:
<p
ms:audio="http://www.cwmwenallt.com/ttml/audio/RealPCPride.en.001.mp3"
dur="0.8s" begin="00:00:00:18"
ttm:role="description" xml:id="description1">
Open on a man in sports jacket and tie in front of a plain white background
waving. An email address sean@windows.com is overlaid
</p>
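To decide when a <p> like this is active, its begin and dur values have to be turned into seconds that can be compared with the video’s currentTime. Here is a rough sketch of that conversion. The function names are hypothetical, and the frame rate of 30 is an assumption on my part: a begin of 00:00:00:18 is in hours:minutes:seconds:frames, so the result depends on the frame rate the document is authored for.

```javascript
// Hypothetical helper: convert a TTML time expression to seconds.
// Handles clock times (hh:mm:ss, hh:mm:ss.fraction, hh:mm:ss:frames)
// and offset times with the h, m, s and ms metrics.
// frameRate = 30 is an assumption; use the document's own frame rate.
function ttmlTimeToSeconds(value, frameRate = 30) {
  const offset = /^(\d+(?:\.\d+)?)(h|m|s|ms)$/.exec(value);
  if (offset) {
    const scale = { h: 3600, m: 60, s: 1, ms: 0.001 };
    return parseFloat(offset[1]) * scale[offset[2]];
  }
  const clock = /^(\d{2,}):(\d{2}):(\d{2}(?:\.\d+)?)(?::(\d{2,}))?$/.exec(value);
  if (clock) {
    const frames = clock[4] ? parseInt(clock[4], 10) / frameRate : 0;
    return Number(clock[1]) * 3600 + Number(clock[2]) * 60 +
      parseFloat(clock[3]) + frames;
  }
  return NaN; // unrecognised expression
}

// Hypothetical helper: is a cue with the given begin/dur active at time t?
function isCueActive(begin, dur, t, frameRate = 30) {
  const start = ttmlTimeToSeconds(begin, frameRate);
  const end = start + ttmlTimeToSeconds(dur, frameRate);
  return t >= start && t < end;
}
```

In a page, begin and dur would come from getAttribute on the parsed <p> element, and t from the main video’s currentTime inside the timeupdate handler.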
In the code, when this <p> becomes active, we pause the main video and trigger the audio by setting the src of a new <audio> element, added to the DOM, to the value of the ms:audio attribute. The <audio> element has a handler on the ended event that restarts the main video.
This works, apart from the network delay in loading the audio resource, but it’s far from ideal. What I think we need is a combining operator that describes how the additional material interacts with the primary media.
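In the meantime, one way to soften the network delay might be to create the <audio> elements ahead of time, so the clips have a chance to load before they are needed. The sketch below shows that idea together with the pause, play and resume steps. It is not the demo’s code: playDescription, preloadDescriptions and the createAudio callback are hypothetical names, and preload is only a hint that a browser is free to ignore.

```javascript
// Hypothetical helper: create one audio object per description URL up front.
// `createAudio` is an assumed factory, e.g. () => document.createElement('audio').
function preloadDescriptions(urls, createAudio) {
  const cache = new Map();
  for (const url of urls) {
    const audio = createAudio();
    audio.preload = 'auto'; // a hint only; the browser may still defer loading
    audio.src = url;
    cache.set(url, audio);
  }
  return cache;
}

// Hypothetical helper: pause `video`, play the description clip, and
// restart `video` once the clip has ended.
function playDescription(video, audio) {
  const resume = () => {
    // Remove the handler so a later replay of the clip starts clean.
    audio.removeEventListener('ended', resume);
    video.play();
  };
  audio.addEventListener('ended', resume);
  video.pause();
  audio.play();
}
```

When a description <p> becomes active, the handler would look up its ms:audio URL in the cache and pass the resulting element to playDescription along with the main video.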

























