<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:791021394;
mso-list-type:hybrid;
mso-list-template-ids:587904636 -297898198 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;
mso-fareast-font-family:Calibri;
mso-bidi-font-family:"Times New Roman";}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body text="#000000" bgcolor="#ffffff">
Sharing some info from other W3C groups that are working on handling
of audio in browsers.<br>
<br>
Harald<br>
<br>
-------- Original Message --------
<table class="moz-email-headers-table" border="0" cellpadding="0"
cellspacing="0">
<tbody>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject: </th>
<td>Feedback to the DAP group on the topic of audio/media
capture needed for HTML+Speech</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Resent-Date:
</th>
<td>Sat, 15 Jan 2011 05:47:45 +0000</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Resent-From:
</th>
<td><a class="moz-txt-link-abbreviated" href="mailto:public-device-apis@w3.org">public-device-apis@w3.org</a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>
<td>Sat, 15 Jan 2011 04:54:56 +0000</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>
<td>Michael Bodell <a class="moz-txt-link-rfc2396E" href="mailto:mbodell@microsoft.com">&lt;mbodell@microsoft.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
<td><a class="moz-txt-link-abbreviated" href="mailto:public-device-apis@w3.org">public-device-apis@w3.org</a>
<a class="moz-txt-link-rfc2396E" href="mailto:public-device-apis@w3.org">&lt;public-device-apis@w3.org&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">CC: </th>
<td><a class="moz-txt-link-abbreviated" href="mailto:public-xg-htmlspeech@w3.org">public-xg-htmlspeech@w3.org</a>
<a class="moz-txt-link-rfc2396E" href="mailto:public-xg-htmlspeech@w3.org">&lt;public-xg-htmlspeech@w3.org&gt;</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<div class="WordSection1">
<p class="MsoNormal">On today’s Hypertext Coordination Group
Teleconference the issue of “Audio on the Web” was discussed
(see minutes:
<a moz-do-not-send="true"
href="http://www.w3.org/2011/01/14-hcg-minutes.html">http://www.w3.org/2011/01/14-hcg-minutes.html</a>)
and I was given the action item of contacting the DAP group to
provide feedback about audio capture. We in the HTML Speech XG
(<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/">http://www.w3.org/2005/Incubator/htmlspeech/</a>)
have been discussing use cases, requirements, and some proposals
around speech-enabled HTML pages and the need for the audio to
be captured and recognized in real time (i.e., in a streaming
fashion, not in a file-upload fashion). We recognize that there
are interesting security and privacy concerns with supporting
this necessary functionality.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The HTML Speech XG has finished gathering
requirements and is in the process of prioritizing them. Our
requirements document is at
<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html</a>.
A large number of our requirements (almost half) may be of
particular relevance to audio capture. I’ve tried to pull out and
organize the requirements most relevant to the DAP audio capture
work:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoListParagraph" style="text-indent: -0.25in;"><!--[if !supportLists]--><span
style="font-family: Symbol;"><span style="">·<span
style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->Requirements about
where the audio is streamed:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR12. Speech services
that can be specified by web apps must include network speech
services [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr12">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr12</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR32. Speech services
that can be specified by web apps must include local speech
services. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr32">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr32</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="text-indent: -0.25in;"><!--[if !supportLists]--><span
style="font-family: Symbol;"><span style="">·<span
style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->Requirements about the
audio stream and the fact that it needs to be streamed:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR33. There should be at
least one mandatory-to-support codec that isn't encumbered with
IP issues and has sufficient fidelity &amp; low bandwidth
requirements. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr33">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr33</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR25. Implementations
should be allowed to start processing captured audio before the
capture completes. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr25">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr25</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR26. The API to do
recognition should not introduce unneeded latency. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr26">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr26</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR56. Web applications
must be able to request NL interpretation based only on text
input (no audio sent). [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr56">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr56</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="text-indent: -0.25in;"><!--[if !supportLists]--><span
style="font-family: Symbol;"><span style="">·<span
style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->Requirements about what
must be possible while streaming (i.e., getting midstream events
in a timely fashion without cutting off the stream; being able
to decide to cut off the stream mid-request; being able to reuse
the stream):<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR40. Web applications
must be able to use barge-in (interrupting audio and TTS output
when the user starts speaking). [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr40">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr40</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR21. The web app should
be notified that capture starts. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr21">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr21</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR22. The web app should
be notified that speech is considered to have started for the
purposes of recognition. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr22">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr22</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR23. The web app should
be notified that speech is considered to have ended for the
purposes of recognition. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr23">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr23</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR24. The web app should
be notified when recognition results are available. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr24">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr24</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR57. Web applications
must be able to request recognition based on previously sent
audio. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr57">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr57</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR59. While capture is
happening, there must be a way for the web application to abort
the capture and recognition process. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr59">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr59</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="text-indent: -0.25in;"><!--[if !supportLists]--><span
style="font-family: Symbol;"><span style="">·<span
style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->Requirements around the
UI/API/Usability of speech/audio capture:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR42. It should be
possible for user agents to allow hands-free speech input. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr42">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr42</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR54. Web apps should be
able to customize all aspects of the user interface for speech
recognition, except where such customizations conflict with
security and privacy requirements in this document, or where
they cause other security or privacy problems. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr54">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr54</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR13. It should be easy
to assign recognition results to a single input field. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr13">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr13</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR14. It should not be
required to fill an input field every time there is a
recognition result. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr14">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr14</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR15. It should be
possible to use recognition results to multiple input fields. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr15">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr15</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="text-indent: -0.25in;"><!--[if !supportLists]--><span
style="font-family: Symbol;"><span style="">·<span
style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->Requirements around
privacy and security concerns:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR16. User consent
should be informed consent. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr16">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr16</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR20. The spec should
not unnecessarily restrict the UA's choice in privacy policy. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr20">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr20</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR55. Web application
must be able to encrypt communications to remote speech service.
[<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr55">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr55</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR1. Web applications
must not capture audio without the user's consent. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr1">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr1</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR17. While capture is
happening, there must be an obvious way for the user to abort
the capture and recognition process. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr17">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr17</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR18. It must be
possible for the user to revoke consent. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr18">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr18</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR37. Web application
should be given captured audio access only after explicit
consent from the user. [<a moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr37">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr37</a>]<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left: 1in; text-indent:
-0.25in;">
<!--[if !supportLists]--><span style="font-family: 'Courier New';"><span style="">o<span style="font: 7pt 'Times New Roman';">
</span></span></span><!--[endif]-->FPR49. End users need a
clear indication whenever microphone is listening to the user. [<a
moz-do-not-send="true"
href="http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr49">http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html#fpr49</a>]<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We would be happy to discuss the details and
context behind any of these requirements, and we’d also
appreciate any feedback on our use cases and requirements. I’m
sure many of these are requirements the DAP group is already
considering, but the speech use cases may well add some
additional requirements that may not yet have been considered as
part of the capture work.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The HTML Speech XG is also in the process of
collecting proposals for our Speech API, which we are planning to
finish by the end of February. In our discussions to date, we
have reviewed and discussed some of the DAP capture API as well
as some of the work that has gone on around the &lt;device&gt;
tag proposals (we reviewed and discussed at least
<a moz-do-not-send="true"
href="http://www.w3.org/TR/html-media-capture/">http://www.w3.org/TR/html-media-capture/</a>
and
<a moz-do-not-send="true"
href="http://www.w3.org/TR/media-capture-api/">http://www.w3.org/TR/media-capture-api/</a>
and Robin provided the following links to more in-progress work
on the HCG call
<a moz-do-not-send="true"
href="http://dev.w3.org/2009/dap/camera/">http://dev.w3.org/2009/dap/camera/</a>
and
<a moz-do-not-send="true"
href="http://dev.w3.org/2009/dap/camera/Overview-API.html">http://dev.w3.org/2009/dap/camera/Overview-API.html</a>).
In general, I’d characterize our position as follows: we would be
extremely happy if we could reuse the DAP work, and would be
happy to work with you on proposals that meet this need.
In our review to date, the major issue has been streaming: the
capture API is nearly useless to us if it doesn’t support
streaming. Happily, from today’s HCG call it sounds like DAP is
actively working on streaming. We strongly support that
direction, think it is extremely important, and would be
interested to see any and all work on it.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I’m not sure what the most productive next
steps would be (email discussion back and forth, some HTML
Speech XG members joining a DAP audio capture conference call,
some DAP members joining a Speech XG teleconference, or
something else). In general, the HTML Speech XG tries to do
most of its work over the public email alias, and we also have a
schedule-as-needed 90-minute teleconference slot on Thursdays,
starting at noon New York time.
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks, and I look forward to working on this
important topic with you!<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Michael Bodell (Microsoft)<o:p></o:p></p>
<p class="MsoNormal">Co-chair HTML Speech XG<o:p></o:p></p>
</div>
</body>
</html>