Cwmwenallt Musings

December 6, 2010

More on Accessible Video

Filed under: Timed text, Work — Tags: , — Sean @ 5:05 pm

Today I’ve uploaded a new, slightly better-looking version of the demo and added a new feature: extended descriptions. Descriptions convey what is happening in the image to those who cannot see it; they do for the picture what captions do for the audio. This brings up one of the remaining big issues in HTML5: if one wants to associate another media resource with the main video, there is no really satisfactory way to do it. You might want to go and play with the feature a little, then come back to follow the discussion.

For the ASL video, absolute synchronisation is not that important, since the grammar of sign is such that it is very difficult to get a close temporal correspondence between the two languages. Aligning the ASL was therefore ‘simply’ a matter of obtaining a video of the same length, in which a sign interpreter listened to the tape and signed an equivalent.

(Side note: this video was extremely hard for the interpreter to sign. The pace of the dialogue is so fast that she needed two or three takes to get it down, something she would not normally need in the course of ordinary dialogue. She was a real professional, and a trouper for sticking with it.)

Then in the JavaScript there is the following code:

    // Keep the ASL video's play state and timeline in step with the main video.
    if (asl) {
        if (!videoElement.paused && asl.paused) {
            asl.play();
        } else if (videoElement.paused && !asl.paused) {
            asl.pause();
        }
        asl.currentTime = videoElement.currentTime;
    }

This is called from the timeupdate event handler for the main media and simply moves the ASL timeline to stay in sync. It is very rough and ready, but it seems to work in IE9 Beta and Safari, although maybe not so well in Firefox. What we really need in HTML5 is the ability to slave one video to another. In my markup I’ve added a ‘syncto’ attribute, which is intended to define this relationship:

        <video id="theASL" syncto="theVideo" poster="RealPCPride_Thumb.jpg">
            <source src="RealPCPride.asl.mp4" type="video/mp4" />
            <source src="RealPCPride.asl.ogg" type="video/ogg" />
        </video>
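
One way a script might honour an attribute like syncto is to correct the slaved video only when it has drifted past some tolerance, rather than seeking on every timeupdate. What follows is only a sketch under that assumption: the helper names syncFollower and wireSyncTo, and the idea of a tolerance in seconds, are mine for illustration and are not part of the demo or of HTML5.

```javascript
// Hypothetical helper: keep `follower` in step with `leader`.
// Both arguments are expected to behave like HTML media elements
// (paused, currentTime, play(), pause()).
function syncFollower(leader, follower, tolerance) {
    if (!follower) {
        return;
    }
    // Mirror the play state of the leader.
    if (leader.paused && !follower.paused) {
        follower.pause();
    } else if (!leader.paused && follower.paused) {
        follower.play();
    }
    // Seek only when the drift exceeds the tolerance, to avoid constant re-seeking.
    if (Math.abs(follower.currentTime - leader.currentTime) > tolerance) {
        follower.currentTime = leader.currentTime;
    }
}

// Hypothetical wiring: find every video carrying a syncto attribute
// and slave it to the element that the attribute names.
function wireSyncTo(doc, tolerance) {
    const followers = doc.querySelectorAll('video[syncto]');
    followers.forEach(function (follower) {
        const leader = doc.getElementById(follower.getAttribute('syncto'));
        if (leader) {
            leader.addEventListener('timeupdate', function () {
                syncFollower(leader, follower, tolerance);
            });
        }
    });
}
```

Calling wireSyncTo(document, 0.3) once the page has loaded would attach the handlers; the 0.3 second tolerance is a guess and would need tuning per browser.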

Playing audio descriptions, however, is a little more involved, since the timing needs to be a little tighter. For non-extended descriptions, the simplest option is to provide the video with two audio tracks, one with pre-mixed descriptions and one without. For extended descriptions, however, we want to trigger an audio clip at a given moment. I’ve added annotations to the TTML example with pointers to the audio file:

      <p
        ms:audio="http://www.cwmwenallt.com/ttml/audio/RealPCPride.en.001.mp3"
        dur="0.8s" begin="00:00:00:18"
        ttm:role="description" xml:id="description1">
       Open on a man in sports jacket and tie in front of a plain white background
       waving. An email address sean@windows.com is overlaid
      </p>

In the code, when this <p> becomes active, we pause the main video and trigger the audio by setting the src of a new <audio> element, added to the DOM, to the value of the ms:audio attribute. The <audio> element has a handler on the ended event that restarts the main video.
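
Roughly, that sequence could look like the sketch below. The function name playExtendedDescription is hypothetical, as is the choice to call play() explicitly and to remove the <audio> element once it has finished; the demo script may well do this differently. The host document is passed in as a parameter so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: pause the main video, play one description clip,
// and resume the video when the clip has finished.
function playExtendedDescription(doc, video, audioUrl) {
    video.pause();

    const audio = doc.createElement('audio');
    // Register the handler before setting src so the event cannot be missed.
    audio.addEventListener('ended', function () {
        // Remove the spent element, then restart the main video.
        if (audio.parentNode) {
            audio.parentNode.removeChild(audio);
        }
        video.play();
    });
    audio.src = audioUrl; // value of the ms:audio attribute
    doc.body.appendChild(audio);
    audio.play();
    return audio;
}
```

The main video stays paused for as long as the clip takes to load and play, which is where the network delay shows up.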

This works, apart from the network delay in loading the audio resource, but it’s not ideal. What I think we need is a combining operator that describes how the additional material interacts with the primary source media.

November 24, 2010

Mini-series: W3C Timed Text (TTML)

Filed under: Timed text, Work — Sean @ 1:06 pm

 

Having been involved with TTML now for the best part of a decade, and with it finally becoming a W3C Recommendation as of the 18th of November, it’s something of an end-of-era time for me. So, following on from my last post, I thought I’d develop a little series talking about TTML: what it is, why it is what it is, and how I think we should go about integrating it into HTML5. I’ve written up a draft specification for this, and this is essentially what I’ve implemented in my JavaScript demo. (I toyed with the idea of trying to do it inline here, but I’m not sure how WordPress would handle it.)

The aim of this script is to make it drop dead simple to add TTML-based closed captions, subtitles, chapter marks etc. to your <video> page; all that is required is to add this line to the <head> section of your HTML:

<script src="ttml.js" type="text/javascript"></script>

Point the src at the location of the script, then add <track> elements that point to your caption/subtitle/karaoke file(s). In the demo I use three track files:

        <video id="theVideo" controls="controls" poster="RealPCPride_Thumb.jpg">
            <source src="RealPCPride.mp4" type="video/mp4" />
            <source src="RealPCPride.ogg" type="video/ogg" />
            <track src="RealPCPride.wmv.en.captions.xml" kind="captions" srclang="en-US" label="English captions" />
            <track src="RealPCPride.wmv.en.descriptions.xml" kind="captions" srclang="en-US" label="English text descriptions" />
            <track src="RealPCPride.wmv.en.xml" kind="subtitles" srclang="en" label="English captions and text descriptions" />
            Short video from the “I’m A PC” ad campaign. (Needs an HTML 5 capable browser.)
        </video>

The script will add a selection element into the page, enabling the user to choose between them. That’s it!

The script is currently working in IE9 beta and Firefox 3+, and I’m working on Safari support. There is still stuff to do, in particular styling the select so that it looks a little more integrated with the native video controls, and handling multiple video elements. I’m also currently scoping out how chapter files should operate. But the basic caption and subtitle kinds are reasonably well supported. You should also notice that the demo adds a second video for sign translation, which is synchronised with the main video. HTML 5 doesn’t actually provide tools for this yet, so it is achieved using a second video element and some script. Eventually I’ll add support so that the sign translation can be inserted using the <track> mechanism.

Without getting into the details (see the specification for that), roughly what this does is analyse the timed text file for the content that should be active at the current playback time, and then insert equivalent elements into the host DOM.
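
To give a flavour of the first half of that, here is a minimal sketch of picking out the active content for a given playback time. It is not the code from the demo: the function names are made up, the cue objects with begin, end and text fields are an assumed shape, and the conversion assumes the hh:mm:ss:ff clock form with 30 frames per second unless told otherwise.

```javascript
// Convert a TTML clock time of the form hh:mm:ss:ff (ff = frames) to seconds.
// The frame rate is an assumption; 30 is used when none is supplied.
function clockTimeToSeconds(value, frameRate) {
    const fps = frameRate || 30;
    const parts = value.split(':').map(Number);
    if (parts.length !== 4 || parts.some(isNaN)) {
        throw new Error('Unsupported time expression: ' + value);
    }
    return parts[0] * 3600 + parts[1] * 60 + parts[2] + parts[3] / fps;
}

// Return the cues whose interval [begin, end) contains the playback time.
// Each cue is assumed to carry begin and end already resolved to seconds.
function activeCues(cues, currentTime) {
    return cues.filter(function (cue) {
        return cue.begin <= currentTime && currentTime < cue.end;
    });
}
```

On each timeupdate the script would call something like activeCues(cues, video.currentTime) and rebuild the caption markup from whatever comes back.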

One nice feature of this approach is that, since the timed text gets integrated into the live DOM of the page, you can override the built-in styles of the caption file. The script converts any role and xml:id attributes it finds in the timed text into class and id attributes, and maps the <region> element to a div with @class=“ttml_cue”. Currently I prefix these with “ttml_” to try to reduce clashes with the host page; I’m not sure if there is a cleaner way to do this without better namespace support.
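
The attribute mapping can be pictured with a small sketch like the one below. Again this is illustrative rather than lifted from the script: toHostAttributes is a hypothetical name, the plain-object input is an assumed shape, and treating the role as a space-separated list of tokens is my assumption.

```javascript
// Prefix used to keep generated names from clashing with the host page.
const TTML_PREFIX = 'ttml_';

// Map the role and xml:id of a timed text element to HTML class and id values.
// `attrs` is a plain object of attribute names to values (an assumed shape).
function toHostAttributes(attrs) {
    const result = {};
    if (attrs['ttm:role']) {
        // Prefix each space-separated role token.
        result.className = attrs['ttm:role']
            .split(/\s+/)
            .filter(Boolean)
            .map(function (role) { return TTML_PREFIX + role; })
            .join(' ');
    }
    if (attrs['xml:id']) {
        result.id = TTML_PREFIX + attrs['xml:id'];
    }
    return result;
}
```

An element with a role of caption and an xml:id of subtitle1a would come out with the class ttml_caption and the id ttml_subtitle1a.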

Anyway, now if I have a caption like the following:

<p xml:id='subtitle1a' ttm:role='caption' begin='00:00:00:27' end='00:00:02:22'
>Sean: Hello.  I’m a PC,</p>

Then in my page CSS I can add a rule like:

.ttml_caption {
	font-style: italic !important;
}

This will force all the captions to be italic. If I want to target one specific caption:

#ttml_subtitle1a {
	color: red !important;
}

Obviously if you want the TTML to take precedence, and only provide defaults for styles the TTML doesn’t specify, you can leave out the !important keyword.

OK, hopefully that’s given enough of an introduction. There’s lots more to say; next time we’ll get a little more into what Timed Text is all about.

Ciao.

Sean.

November 18, 2010

Windows Live Writer Test

Filed under: Accessibility, Timed text, Work — Tags: — Sean @ 2:02 pm

OK, so this blog hasn’t been updated much recently; I’ve been busy, OK? Rest assured there is another stringed thing project in the pipeline, based on an art box and a resonator kit, but that’s for the future.

One of the things I’ve been busy with recently at work is the HTML5 working group, and in particular the choice (or preferably non-choice) of caption format. Now that the <track> element seems to be stabilising, I’ve put together a demo of how captions, text descriptions and subtitles in W3C Timed Text could be integrated using just this feature now, rather than waiting for the browser implementations to catch up. To make sure it’s cross-browser, I decided to do it all in JavaScript. If this remotely interests you, the current demo is here. You’ll need IE9 or a recent build of Firefox to view it. (Sorry Safari users, it doesn’t work for you yet, but I’m working on it. I’m not planning on testing Opera though.)

If you don’t have an HTML 5 powered browser here’s a screenshot:

[Screenshot of the TTML demo in IE showing Bill Gates, a caption and a description]

Longer term I aim to get this cleaned up and finished and posted on my Codeplex site (which also hasn’t seen much love recently) with audio descriptions, chapter navigation and stuff, so don’t go running off with the code just yet.

Oh, and the title? Yes, this post was written using Windows Live Writer (soooo much nicer than hand crafting in Notepad and pasting into WordPress, which is one of the other reasons I haven’t blogged lately!)

ttfn.
