Debate 001 – Video and Audio still have a long way to go on the web.


“And the Battle rages”
(For disclosure’s sake, I use Linux and am biased toward anything in the Open Source realm in general. To sum it up, I love the stuff… why is simple… it grants me POWER to do as I wish with it. It is a “double-edged sword that cuts both ways”, but for the most part I am in favor of Open Source technologies. I will try to keep from preaching to the choir, and ask that you notice that most of what I have stated is in the realm of opinion and personal experience unless noted otherwise by links. AND I’m so not a lawyer, don’t use me as legal advice.)

You know, it is funny: two things that seem to be just as disruptive as the internet have little to do with it. I speak of audio (say, like… music) and video. It has been noted countless times that the potential of these two aspects of our understanding of the world is endless. And so there has always been this rather interesting idea to try and reflect this on the internet. Why… because it can be done, which tends to reflect the way the internet encapsulates human nature in general.

So a rather small thing happened that has since ballooned into this raging debate. The small thing: the ability to embed an audio file and a video file directly on a web page. Why is this important… it is something that we never really got with existing methods. Video and audio (especially audio) have always lived in this odd cage when you want the web to be interactive, mostly through plug-ins. Given that Web standards were slowly giving people the ability to have an interactive site, it was inevitable that audio and video would need to be embedded in a standardized manner, so that they can interact with other web-defined objects.
If you are lost, just think of it like this… people were slowly crying for the web to be like Flash Player, without the need for Adobe’s Flash Player to do it, and mostly without knowing that this was what they were asking for. The web would be this completely interactive thing that anyone with a web browser could use and, more importantly, it would belong to no-one and everyone to implement, share and use.

The problem came with something that needed to be addressed with the addition of these tags (named <audio> and <video> respectively): the audio and video codecs to be used with the tags as baseline codecs. The tags would address the embedding, and making the codecs a part of the standard would give vendors a means of interoperability (a word that means many things to many people, but basically it is the interchange of information between two differently written programs with little to no difference in said information… that is my definition of it anyway). At the time, it was suggested that OGG Vorbis (along with Wave PCM [a .wav file]) and OGG Theora would be a requirement to use the tags. On the <video> side of it, someone screamed sacrilege.
Well, in this case it was two companies that did: Apple Inc. and Nokia Corp. Nokia is interesting in that they did buy out an open-source company (namely Trolltech, which they renamed Qt Software) and seem to be adapting many of its assets to the company. Given the other Open Source assets that they control… the fact that they raised the objection and have done nothing more is a little surprising. Apple, on the other hand, clearly has some vested interest in the matter with the iPhone (and iPod/iPod Touch) and its wealth in iTunes. It is an objection that they have raised any time the subject seems to come up, and they have suggested another format (namely MPEG-4’s H.264 format; all of their devices happen to use the codec and iTunes happens to deliver in said codec) to become the standard which WHATWG (and W3C by extension) should support.
Since then, some lines have been drawn in the HTML5 sand… Mozilla Foundation and Opera Software ASA have thrown in OGG support (though only Mozilla has released such so far, with Mozilla Firefox 3.5; Opera reportedly has it working [for video, namely Theora] in betas of 10). Google has gone on record to say that they will support H.264 and the OGG formats (OGG has been done; the code is in the development release of Chromium, the WebKit-based project behind Chrome. H.264 is likely to be available ONLY to those that use the final release of Chrome 3.0, with those that use Chromium having just OGG playback due to patent issues). Apple has continued to work with H.264 exclusively, and Nokia (for now) has done and said little else since. Microsoft has not stepped into this ring, but there are three likely outcomes from my POV: Silverlight (or any of their own video/audio formats) is extended to handle video/audio playback, which would be the same scenario that we have currently with Flash but from a different vendor; they opt out of <audio> and <video> completely; or they toss in full OGG support like Opera and Mozilla have done. The OGG formats (Vorbis and Theora) have been written out of HTML5’s spec (which is still in the drafting stage) and, given the technical talk around WHATWG and W3C, it’s very likely that it will stay that way. But Apple’s push for H.264 never really worked, since for now… NO codec has been specified for <video> as a baseline. On the <audio> end of the debate, Vorbis has been written out and Wave PCM is the only thing suggested currently (which I don’t mind, nor do the vendors), BUT due to Wave’s notoriously large file size… divides over OGG Vorbis, MP4 (including the AAC formats), and MP3 will continue for something more capable… if only in the shadow of the <video> debates.
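To make the practical effect of those lines in the sand a little more concrete, here is a rough sketch of what a page author ends up doing when no baseline codec exists: list the same clip in more than one format and let each browser take the first one it can play. The support table only mirrors what I described above (it is a snapshot of my understanding, not a reference), “Safari” stands in for Apple’s H.264-only position, and the file names and helper functions are made up for illustration.

```python
from html import escape

# Codec support as described in this post; an assumption-laden snapshot,
# not an authoritative compatibility table.
BROWSER_SUPPORT = {
    "Firefox 3.5": {"theora"},
    "Opera 10 beta": {"theora"},
    "Chromium": {"theora"},
    "Chrome 3.0": {"theora", "h264"},
    "Safari": {"h264"},  # assumed stand-in for Apple's H.264-only stance
}

# Hypothetical file names. Each entry: (url, MIME type with codecs, codec key).
SOURCES = [
    ("clip.ogv", 'video/ogg; codecs="theora, vorbis"', "theora"),
    ("clip.mp4", 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"', "h264"),
]


def video_markup(sources=SOURCES):
    """Build a <video> element that lists every source in order."""
    lines = ["<video controls>"]
    for url, mime, _codec in sources:
        lines.append(f'  <source src="{escape(url)}" type="{escape(mime)}">')
    lines.append("  Your browser does not support the video tag.")
    lines.append("</video>")
    return "\n".join(lines)


def first_playable(browser, sources=SOURCES):
    """Return the first source URL the named browser could play, else None."""
    supported = BROWSER_SUPPORT.get(browser, set())
    for url, _mime, codec in sources:
        if codec in supported:
            return url
    return None


if __name__ == "__main__":
    print(video_markup())
    for name in BROWSER_SUPPORT:
        print(name, "->", first_playable(name))
```

The point of the sketch is the cost: without a baseline, every clip has to be encoded and hosted at least twice, and a browser missing from both camps gets nothing but the fallback text.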

Now, interestingly, the Cult of Mac fans are not to blame for how heated (and that is something of an understatement) the debate has become. It is the technical community in general. One of the two things against Theora is the quality of the video format, which I will say isn’t great at the moment… but considering that we do live with less in what we have currently (YouTube and Flash), it is something that can be argued and it is something that is, I find, improving. The other is the lack of hardware encoders, which is something that may or may not improve with time. It is really a matter of the manufacturers’ will to do so (and to invest the time and money). The things against Vorbis I see as mainly a lack of will… there are quite a few OGG players out there (though most don’t advertise this fact) and there is a spec for a hardware encoder.
The major problem that I have with the MPEG-4 formats in general is this… Legally, they are a landmine, waiting to go off and spoil it for everyone. It is something that has happened before with the patented nature of MPEG formats; the history of MPEG-1 Audio Layer 3 (known to most as MP3) is a nasty tale of the cocktail this can be and has been. MPEG-4 does a little worse than MPEG-1 Audio Layer 3 on this matter when you look at:
1) The controlling entity of the MPEG-4 patents (and MPEG-2, the tech around Blu-ray [sans the Java stack], VC-1, etc.) is a company known as MPEG-LA, a company that has no association with ISO or MPEG (Moving Picture Experts Group), and is a conglomerate of the other patent holders.
2) MPEG-LA doesn’t own all the patents in MPEG’s pool of technologies, therefore its members are not completely protected from litigation (Microsoft has a little story of what happens when you are sued by such people [called patent trolls when all they have is the patent in question]. In court, one such company was awarded 1 billion dollars from MS over MP3 playback in Windows XP for a violation of a number of patents. MS appealed and won… but it is still a very stark reminder to be careful of such people and companies with any technology). On top of this, if you fall outside of MPEG-LA’s membership and use any of their patents, they will defend them… meaning that you will be dragged into court if you use an encoder/decoder that MPEG-4
3) This document right here sums up the rest of my opinion on MPEG-4’s problems. Note the term “content providers”, now apply it to something like YouTube… you begin to see a major problem with AVC/H.264.
The rest of the issues that I have are massively technical, but the biggest is CPU power. Hardware decoders and multi-core CPUs have helped mitigate this, but the nasty truth is that H.264 on its own uses a massive amount of CPU cycles just for decoding a video stream. Add in the wild nature of the internet and bitrate levels, and you have a nasty cocktail for performance… which is a particular problem for small devices (cell phones, netbooks, etc.), which tend to use lower-powered CPUs.

My opinion of this debate is this: somewhere along the line, amid the debate about quality and legalese… the point of the internet was lost. The internet, first and foremost, exists to exchange data in an open manner… to communicate between points with data. That is it, there isn’t much mysticism about it… it’s like talking to a person, or reading a book; it’s a VERY basic and often overlooked requirement. The point of the internet, like it or not, is to communicate… to convey information, and the fact that information is now being stored in video and audio files doesn’t make it special. It means that, much like text, it’s just data to be displayed, which in turn is used and shared (and can be the stepping stone to do things like it and/or modify the format)… the first format that grants web designers this is first past the post, and generally dictates the path of the technology.
This is the problem: there is such a heightened value placed on video and audio that it has created this cloud that values the format more than the content inside the format, and in which those that control the format make the money. I was debating this around the web-sphere (interwebz to some of you) and my conclusion across the many posts is the same: video and audio need to be dumbed down to be like text, to make more money out of them and to make them more useful to the many people that continue to use the internet.
Google is a prime example of this, with AdSense, and how much the company has grown by monetizing text ads. Before Google came, almost all ad services were flashy images in the way of billboards in Las Vegas… a lot of noise, little context. Google has changed the way we see ads and, oddly, has made them more meaningful (since you only get so many words to convey a point).
This is the breaking point for video and audio: to make them a general object of the web or to keep things the way that they are. And this is where the baseline codecs come in; by stating one, all those that don’t conform to using the baseline can’t call themselves part of the standard. On the Web, any browser that bends or breaks standards is frowned upon badly (you can see a lot of this in the continuing campaign to stop others from using Internet Explorer 6). But to do this, Open Source must be the starting point for any codec to be used. It is something that even W3C is very aware of… so much so that anything that doesn’t conform to their policy is something of a non-starter; most Open Source licenses do conform to it almost by accident (the BSD license is used with the OGG formats save Dirac, which uses something else).

It makes one wonder why the debate started, since H.264 is a nasty replacement that reminds me of the chaos that ensued with MP3 during the Napster years, to the point where no-one in their right mind would touch the technology for fear of being sued out of existence. It has happened before, and with the amount of money that is being poured into H.264 it is very likely to happen again, though I fear the payouts will be MUCH bigger. To add to this, H.264 becoming a standard over the internet keeps others from trying to become something like YouTube, since as “content providers” they would be obligated to pay MPEG-LA (or its members) a very big royalty should they reach that size (if YouTube moves to H.264, they too will have this obligation to fulfill).
Really, if we are to accept that, something is very wrong at the core of the internet. As much as there is this big fear of submarine patents, that affects any and all technologies… OGG looks a lot better since that is the only legal baggage it has, and the rule about handling them in the US is much like “Fight Club” (though there is a reason for this: anyone that is found to have willfully violated a patent in the US can be penalized with triple the damages, even if they looked only for the sake of finding out whether they violated a patent so as to change it later… Indeed, in the US, you DON’T talk about “Fight Club”). To me the OGG formats, as bad as Theora is, are the only way forward since they give vendors the ability to treat the internet as it has been… by giving people the ability to implement, share and use without royalties to suck the potential dry.


Let me also pick a bone over something that I have seen. Mozilla more or less laid out their options in the open… H.264 is a non-starter since there are many others downstream of Gecko that would twitch at being open to this kind of risk. For that, they have been panned endlessly for adopting a “Theora or bust” position. This is the same Foundation/Corp that released Firefox 3.5 to the world, with one billion copies of Firefox downloaded (and counting)… Mozilla has some big clout that should not be overlooked for being number two. For those that pan them for taking the high road on this matter, all I can say is this… wait and see.

