[RTW] modularization, what appears 'on the web', and other vague thoughts

David Singer singer at apple.com
Sun Oct 10 02:45:15 CEST 2010


On Oct 10, 2010, at 5:04 , Christopher Blizzard wrote:
> 
> The device element is only a small part of the picture (at least so far.)  Most of the people in the room were protocol & codec folks so we spent most of our time talking about the underlying elements.


Speaking completely off the top of my head, I imagine something a little higher level than the device element.

At the workshop we discussed ever so briefly using the 'video' element for the display of the remote end.  You'd need to give it a suitable URL that identifies a protocol and address from which to get the a/v, of course.  I wondered (and still do) if we can split 'discovery' off.  This might be multi-level:

Address-book-like:  "I need to talk to Chris Blizzard"
--> Chris has these connection points: phoneto:14155551212, wondrous:snowblizzard at example.com, awesome:magic at excellentphone.org

Discovery: my UA knows about the wondrous phone system, and it says "find me snowblizzard at example.com"
--> he has the address sip:192.168.34.45

So now I know how to set up a SIP/RTP call using IP addresses; that sip: URL is what I pass to the video element.
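To make that concrete, here is a rough sketch (in TypeScript, purely illustrative: the resolveContact/discoverAddress functions are imagined, and no shipping UA accepts a sip: URL on a video element today) of the two-level lookup feeding the video element:

    // Hypothetical: an imagined discovery layer, declared but not implemented here.
    declare function resolveContact(name: string): Promise<string[]>;    // address-book step
    declare function discoverAddress(identity: string): Promise<string>; // discovery step

    async function showRemoteParty(contact: string): Promise<void> {
      // "I need to talk to Chris Blizzard" -> e.g. "wondrous:snowblizzard at example.com"
      const identities = await resolveContact(contact);
      // "find me snowblizzard at example.com" -> e.g. "sip:192.168.34.45"
      const address = await discoverAddress(identities[0]);

      // Hand the resulting URL to an ordinary video element for the remote end.
      const remote = document.createElement('video');
      remote.src = address;
      remote.autoplay = true;
      document.body.appendChild(remote);
    }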



Similarly, I wonder about a "capture" element which can capture audio and video and reflect them onto the browser display.  I guess it would have a lot of attributes/DOM interfaces to set things like the bitrate, the frame size to send, the size to show locally, and so on.  I pass the same sip:192.168.34.45 to the "capture" element, set the right attributes, and it provides the other direction.
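Again only as a sketch (no such "capture" element exists, and every attribute name here is invented just to illustrate the kind of knobs I mean):

    // Hypothetical "capture" element; today this would simply be an unknown element.
    const capture = document.createElement('capture');
    capture.setAttribute('dst', 'sip:192.168.34.45');   // same address the video element got
    capture.setAttribute('bitrate', '512000');          // bits per second to send
    capture.setAttribute('send-width', '640');          // frame size sent to the far end
    capture.setAttribute('send-height', '480');
    capture.setAttribute('preview-width', '160');       // local self-view size
    capture.setAttribute('preview-height', '120');
    document.body.appendChild(capture);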


I'd like to think we can make the system even more modular.  Certainly we might like to see all the non-real-time work mediated through scripts and so on.  We need to remember that there are several parties involved: the local UA at each end, the sites that served the 'integration' web page for each end, and the servers that provide the back-ends for the discovery protocols and possibly for the real-time communications themselves, though for that last piece we all seem to prefer that it be *possible* to talk end-to-end directly, as intermediate servers probably add delay.  But the details of how the protocols work are, of course, a matter for those protocols; at the UA/browser level, we're identifying protocols by URL type.
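One way to picture that "identifying protocols by URL type" point, again only a hypothetical sketch with invented handler names:

    // Hypothetical: the UA dispatches on the URL scheme to pick a real-time protocol handler.
    type RealTimeHandler = (url: URL) => void;

    const handlers: Record<string, RealTimeHandler> = {
      'sip:':      url => { /* set up SIP/RTP signalling and media to this address */ },
      'wondrous:': url => { /* hand off to the wondrous phone system's own protocol */ },
    };

    function openRealTimeUrl(raw: string): void {
      const url = new URL(raw);
      const handler = handlers[url.protocol];
      if (!handler) throw new Error('no real-time handler for scheme ' + url.protocol);
      handler(url);
    }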


Now, we might want to recommend/mandate certain protocol(s), and, within them, certain codecs, encryption, and so on, to provide a baseline of interoperability.  We don't have the luxury here that the video and audio elements have, where the clear common use case is using HTTP to load a file and play it.

David Singer
Multimedia and Software Standards, Apple Inc.
