However much the telecom operators may yearn for the golden age before OTT services appeared, that era is not coming back. WebRTC, though, can halt their decline and even let them claw back some ground; how much depends on how fast and how thoroughly they act. AT&T, T-Mobile, Deutsche Telekom and Orange are examples of major carriers that have recognised this and moved quickly to invest in the opportunities WebRTC offers. The question is: how long will it take the rest to catch up?
About the author:
Tsahi Levent-Levi has more than 15 years of experience in telecommunications, VoIP and 3G, as an engineer, manager, marketer and CTO. He is currently the Director of Business Solutions at Amdocs, responsible for finding innovative ways for telecom operators to deliver more value to their users.
Once connected, a client can see every other client connected to the same server. For example, with two clients (Temo and Administrator) connected to one server, the client Temo can see that Administrator is online; the 'List of currently connected peers:' panel shows the details of all clients connected to the same server.
WebRTC is a new front in the long war for an open and unencumbered web. — Brendan Eich, inventor of JavaScript
Imagine a world where your phone, TV and computer could all communicate on a common platform. Imagine it was easy to add video chat to your web application. That’s the vision of WebRTC.
WebRTC implements open standards for real-time, plugin-free video, audio and data communication. The need is real:
A lot of web services already use Real-time Communication (RTC), but need downloads, native apps or plugins. These include Skype, Facebook (which uses Skype) and Google Hangouts (which uses the Google Talk plugin).
For end users, plugin download, installation and update can be complex, error prone and annoying.
For developers, plugins can be difficult to deploy, debug, troubleshoot, test and maintain—and may require licensing and integration of complex, expensive technology. It can be hard to persuade people to install plugins in the first place!
The guiding principles of the WebRTC project are that its APIs should be open source, free, standardised, and more efficient than existing technologies.
Want to try it out? WebRTC is available now in Google Chrome.
A good place to start is the simple video chat application at apprtc.appspot.com. Open the page in Chrome, with PeerConnection enabled on the chrome://flags page, then open the URL again (with query string added) in a new window. There is a walkthrough of the code later in this article.
Quick start
Haven’t got time to read this article, or just want code?
Get an overview of WebRTC from Justin Uberti’s Google I/O video:
Get to grips with the PeerConnection API by reading through the demo at webrtc-demos.appspot.com, which implements WebRTC on a single web page.
Learn more about how WebRTC uses servers for signalling, NAT traversal and data communication, by reading through the code and the console logs from the video chat demo at apprtc.appspot.com.
A very short history of WebRTC
For many years, RTC components were expensive, complex and needed to be licensed—putting RTC out of the reach of individuals and smaller companies.
Gmail video chat became popular in 2008, and in 2011 Google introduced Hangouts, which use the Google Talk service (as does Gmail). Google bought GIPS, a company which had developed many of the components required for RTC, such as codecs and echo cancellation techniques. Google open sourced the technologies developed by GIPS and engaged with relevant standards bodies, the IETF and W3C, to ensure industry consensus. In May 2011, Ericsson built the first implementation of WebRTC.
Other JavaScript APIs used by WebRTC apps, such as getUserMedia and WebSocket, emerged at the same time. Future integration with APIs such as Web Audio will make WebRTC even more powerful—WebRTC has already shown huge promise when teamed up with technologies such as WebGL.
Where are we now?
WebRTC has been available in the stable build of Google Chrome since version 20. The getUserMedia API is ‘flagless’ in Chrome from version 21: you don’t have to enable MediaStream on the chrome://flags page.
Opera 12 shipped with getUserMedia; further WebRTC implementation is planned for Opera this year. Firefox has WebRTC efforts underway, and has demonstrated a prototype version of PeerConnection. Full getUserMedia support is planned for Firefox 17 on desktop and Android. WebRTC functionality is available in Internet Explorer via Chrome Frame, and Skype (acquired by Microsoft in 2011) is reputedly planning to use WebRTC. Native implementations with WebRTC include WebKitGTK+.
As well as browser vendors, WebRTC has strong support from Cisco, Ericsson and other companies such as Voxeo, who recently announced the Phono jQuery plugin for building WebRTC-enabled web apps with phone functionality and messaging.
A word of warning: be skeptical of reports that a platform ‘supports WebRTC’. Often this actually just means that getUserMedia is supported, but not any of the other RTC components.
My first WebRTC
WebRTC client applications need to do several things:
Get streaming audio, video or data.
Communicate streaming audio, video or data.
Exchange control messages to initiate or close sessions and report errors.
Exchange information about media such as resolution and format.
More specifically, WebRTC as implemented uses the following APIs.
MediaStream: get access to data streams, such as from the user’s camera and microphone.
PeerConnection: audio or video calling, with facilities for encryption and bandwidth management.
DataChannel: peer-to-peer communication of generic data.
Crossing the streams
The MediaStream API represents a source of streaming media. Each MediaStream has one or more MediaStreamTracks, each of which corresponds to a synchronised media source. For example, a stream taken from camera and microphone input has synchronised video and audio tracks. (Don’t confuse MediaStream tracks with the <track> element, which is something entirely different.)
The getUserMedia() function can be used to get a LocalMediaStream. This has a label identifying the source device (something like ‘FaceTime HD Camera (Built-in)’) as well as audioTracks and videoTracks properties, each of which is a MediaStreamTrackList. In Chrome, the webkitURL.createObjectURL() method converts a LocalMediaStream to a Blob URL which can be set as the src of a video element. (In Opera, the src of the video can be set from the stream itself.)
Currently no browser allows audio data from getUserMedia to be passed to an audio or video element, or to other APIs such as Web Audio. The WebRTC PeerConnection API handles audio as well as video, but audio from getUserMedia is not yet supported in other contexts.
You can try out getUserMedia with the code below, if you have a webcam. Paste the code into the console in Chrome and press return. Permission to use the camera and microphone will be requested in an infobar at the top of the browser window; press the Allow button to proceed. The video stream from the webcam will then be displayed in the video element created by the code, at the bottom of the page.
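What follows is a minimal sketch of such a snippet, assuming Chrome's prefixed APIs of the time (navigator.webkitGetUserMedia() and webkitURL.createObjectURL()):

navigator.webkitGetUserMedia({audio: true, video: true},
  function(stream) {
    // Create a video element at the bottom of the page and
    // point it at the camera stream via a Blob URL
    var video = document.createElement('video');
    video.autoplay = true;
    document.body.appendChild(video);
    video.src = webkitURL.createObjectURL(stream);
  },
  function(error) {
    console.log('getUserMedia error: ', error);
  });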
The intention is eventually to enable a MediaStream for any streaming data source, not just a camera or microphone. This could be extremely useful for gathering and communicating arbitrary real-time data, for example from sensors or other inputs.
Signalling
WebRTC uses PeerConnection to communicate streams of data, but also needs a mechanism to send control messages between peers, a process known as signalling. Signalling methods and protocols are not specified by WebRTC: signalling is not part of the PeerConnection API. Instead, WebRTC app developers can choose whatever messaging protocol they prefer, such as SIP or XMPP, and any appropriate duplex (two-way) communication channel such as WebSocket, or XMLHttpRequest (XHR) in tandem with the Google Channel API.
To start a session, WebRTC clients need the following:
Local configuration information.
Remote configuration information.
Remote transport candidates: how to connect to the remote client (IP addresses and ports).
Configuration information is described in the form of a SessionDescription, the structure of which conforms to the Session Description Protocol, SDP. Serialised, an SDP object looks like this:
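Here is an abbreviated, illustrative example (the origin, addresses and payload types are arbitrary):

v=0
o=- 20518 0 IN IP4 203.0.113.1
s=
t=0 0
m=audio 54400 RTP/SAVPF 0 96
c=IN IP4 203.0.113.1
a=rtpmap:0 PCMU/8000
a=rtpmap:96 opus/48000
m=video 55400 RTP/SAVPF 97
c=IN IP4 203.0.113.1
a=rtpmap:97 VP8/90000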
The SessionDescription sent by the caller is known as an offer, and the response from the callee is an answer. (Note that WebRTC currently only supports one-to-one communication.)
The offer SessionDescription is passed to the caller’s browser via the PeerConnection setLocalDescription() method, and via signalling to the remote peer, whose own PeerConnection object invokes setRemoteDescription() with the offer. This architecture is called JSEP, JavaScript Session Establishment Protocol. (There’s an excellent animation explaining the process of signalling and streaming in Ericsson’s demo video for its first WebRTC implementation.)
Once the signalling process has completed successfully, data can be streamed directly, peer to peer, between the caller and callee, or via an intermediary server (more about this below). Streaming is the job of PeerConnection.
PeerConnection
Below is a WebRTC architecture diagram. As you will notice, the green parts are complex!
From a JavaScript perspective, the main thing to understand from this diagram is that PeerConnection shields web developers from myriad complexities that lurk beneath. The codecs and protocols used by WebRTC do a huge amount of work to make real-time communication possible, even over unreliable networks:
packet loss concealment
echo cancellation
bandwidth adaptivity
dynamic jitter buffering
automatic gain control
noise reduction and suppression
image ‘cleaning’.
PeerConnection sans servers
WebRTC from the PeerConnection point of view is described in the example below. The code is taken from the ‘single page’ WebRTC demo at webrtc-demos.appspot.com, which has local and remote PeerConnection (and local and remote video) on one web page. This doesn’t constitute anything very useful—caller and callee are on the same page—but it does make the workings of the PeerConnection API a little clearer, since the PeerConnection objects on the page can exchange data and messages directly without having to use intermediary servers.
First, a quick explanation of the name webkitPeerConnection00. When PeerConnection using the JSEP architecture was implemented in Chrome (see above), the original pre-JSEP implementation was renamed webkitDeprecatedPeerConnection. This made it possible to keep old demos working with a simple rename. The new JSEP PeerConnection implementation was named webkitPeerConnection00, and as the JSEP draft standard evolves, it might become webkitPeerConnection01, webkitPeerConnection02—and so on—to avoid more breakage. When the dust finally settles, the API name will become PeerConnection.
So, without further ado, here is the process of setting up a call using PeerConnection…
Caller
Create a new PeerConnection and add a stream (for example, from a webcam):
pc1 = new webkitPeerConnection00(null, iceCallback1);
// ...
pc1.addStream(localstream);
Create a local SessionDescription, apply it and initiate a session:
var offer = pc1.createOffer(null);
pc1.setLocalDescription(pc1.SDP_OFFER, offer);
// ...
pc1.startIce(); // start connection process
(Wait for a response from the callee.)
Receive remote SessionDescription and use it:
pc1.setRemoteDescription(pc1.SDP_ANSWER, answer);
Callee
(Receive call from caller.)
Create a PeerConnection and set the remote session description. (In the single-page demo the caller and callee PeerConnections live on the same page, so the complete flow is shown together:)
// create the 'sending' PeerConnection
pc1 = new webkitPeerConnection00(null, iceCallback1);
// create the 'receiving' PeerConnection
pc2 = new webkitPeerConnection00(null, iceCallback2);
// set the callback for the receiving PeerConnection to display video
pc2.onaddstream = gotRemoteStream;
// add the local stream for the sending PeerConnection
pc1.addStream(localstream);
// create an offer, with the local stream
var offer = pc1.createOffer(null);
// set the offer for the sending and receiving PeerConnection
pc1.setLocalDescription(pc1.SDP_OFFER, offer);
pc2.setRemoteDescription(pc2.SDP_OFFER, offer);
// create an answer
var answer = pc2.createAnswer(offer.toSdp(), {has_audio:true, has_video:true});
// set it on the sending and receiving PeerConnection
pc2.setLocalDescription(pc2.SDP_ANSWER, answer);
pc1.setRemoteDescription(pc1.SDP_ANSWER, answer);
// start the connection process
pc1.startIce();
pc2.startIce();
PeerConnection plus servers
So… That’s WebRTC on one page in one browser. But what about a real application, with peers on different computers?
In the real world, WebRTC needs servers, however simple, so the following can happen:
Users discover each other.
Users send their details to each other.
Communication survives network glitches.
WebRTC client applications communicate data about media such as video format and resolution.
In a nutshell, WebRTC needs two types of server-side functionality:
User discovery, communication and signalling.
NAT traversal and streaming data communication.
NAT traversal, peer-to-peer networking, and the requirements for building a server app for user discovery and signalling, are beyond the scope of this article. Suffice to say that the STUN protocol and its extension TURN are used by the ICE framework to enable PeerConnection to cope with NAT traversal and other network vagaries.
ICE is a framework for connecting peers, such as two video chat clients. Initially, ICE tries to connect peers directly, with the lowest possible latency, via UDP. In this process, STUN servers have a single task: to enable a peer behind a NAT to find out its public address and port. (Google has a couple of STUN servers, one of which is used in the apprtc.appspot.com example.)
If UDP fails, ICE tries TCP: first HTTP, then HTTPS. If direct connection fails—in particular, because of enterprise NAT traversal and firewalls—ICE uses an intermediary (relay) TURN server. In other words, ICE will first use STUN with UDP to directly connect peers and, if that fails, will fall back to a TURN relay server. The expression ‘finding candidates’ refers to the process of finding network interfaces and ports.
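The rest of this article walks through the JavaScript of the apprtc.appspot.com video chat demo, which begins by initialising the page. The sketch below shows roughly what that initialisation looks like, assuming element IDs and helper names that match the rest of the walkthrough (localVideo, remoteVideo, resetStatus(), openChannel() and getUserMedia()):

var localVideo, remoteVideo;

function initialize() {
  localVideo = document.getElementById('localVideo');
  remoteVideo = document.getElementById('remoteVideo');
  resetStatus();
  openChannel();
  getUserMedia();
}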
This code initializes variables for the HTML video elements that will display video streams from the local camera (localVideo) and from the camera on the remote client (remoteVideo). resetStatus() simply sets a status message.
The openChannel() function sets up messaging between WebRTC clients:
function openChannel() {
console.log("Opening channel.");
var channel = new goog.appengine.Channel('AHRlWrqwxKQHdOiOaux3JkDQaxmTvdlYgz1wL69DE20mE3Xq0WaxE3zznRLD6_jwIGiRFlAR-En4lAlLHWRKk862_JTGHrdCHaoTuJTCw8l6Cf7ChMWiVjU');
var handler = {
'onopen': onChannelOpened,
'onmessage': onChannelMessage,
'onerror': onChannelError,
'onclose': onChannelClosed
};
socket = channel.open(handler);
}
For signalling, this demo uses the Google App Engine Channel API, which enables messaging between JavaScript clients without polling. (WebRTC signalling is covered in more detail above).
Establishing a channel with the Channel API works like this:
Client A generates a unique ID.
Client A requests a Channel token from the App Engine app, passing its ID.
App Engine app requests a channel and a token for the client’s ID from the Channel API.
App sends the token to Client A.
Client A opens a socket and listens on the channel set up on the server.
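In the demo, the token is hard-coded into openChannel() (the long string above); in a real app the client would request it, as in this sketch (the /token path, clientId and handler are assumed names):

var clientId = String(Math.floor(Math.random() * 1000000000)); // step 1: unique ID
var xhr = new XMLHttpRequest();
xhr.open('GET', '/token?id=' + clientId, true); // step 2: request a token for this ID
xhr.onload = function() {
  var token = xhr.responseText; // step 4: the app returns the token
  // step 5: open a socket on the channel, reusing the handler from openChannel()
  socket = new goog.appengine.Channel(token).open(handler);
};
xhr.send();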
Sending a message works like this:
Client B makes a POST request to the App Engine app with an update.
The App Engine app passes a request to the channel.
The channel carries a message to Client A.
Client A’s onmessage callback is called.
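On the client side, the onmessage callback registered in openChannel() just hands the payload to the signalling handler described below; a minimal sketch:

function onChannelMessage(message) {
  console.log('S->C: ' + message.data);
  processSignalingMessage(message.data);
}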
Just to reiterate: signalling messages are communicated via whatever mechanism the developer chooses: the signalling mechanism is not specified by WebRTC. The Channel API is used in this demo, but other methods (such as WebSocket) could be used instead.
After the call to openChannel(), the getUserMedia() function called by initialize() checks if the browser supports the getUserMedia API. (Find out more about getUserMedia on HTML5 Rocks.)
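A sketch of that function and its callbacks, assuming the era's prefixed API and success/error callbacks named onUserMediaSuccess and onUserMediaError (hypothetical names):

function getUserMedia() {
  if (navigator.webkitGetUserMedia) {
    navigator.webkitGetUserMedia({audio: true, video: true},
                                 onUserMediaSuccess, onUserMediaError);
    console.log('Requested access to local media.');
  } else {
    alert('getUserMedia is not supported in this browser.');
  }
}

function onUserMediaSuccess(stream) {
  // Display the local camera in the localVideo element via a Blob URL
  localVideo.src = webkitURL.createObjectURL(stream);
  localStream = stream; // kept so it can be sent to the remote peer later
}

function onUserMediaError(error) {
  console.log('getUserMedia error: ', error);
}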
This causes video from the local camera to be displayed in the localVideo element, by creating an object (Blob) URL for the camera’s data stream and then setting that URL as the src for the element. (createObjectURL is used here as a way to get a URI for an ‘in memory’ binary resource, i.e. the LocalMediaStream for the video.) The data stream is also set as the value of localStream, which is subsequently made available to the remote user.
At this point, initiator has been set to 1 (and it stays that way until the caller’s session has terminated), so maybeStart() is called, which in turn calls createPeerConnection():
console.log("Failed to create PeerConnection, exception: " + e.message);
alert("Cannot create PeerConnection object; Is the 'PeerConnection' flag enabled in about:flags?");
return;
}
pc.onconnecting = onSessionConnecting;
pc.onopen = onSessionOpened;
pc.onaddstream = onRemoteStreamAdded;
pc.onremovestream = onRemoteStreamRemoved;
}
The underlying purpose is to set up a connection, using a STUN server, with onIceCandidate() as the callback (see above for an explanation of ICE, STUN and ‘candidate’). Handlers are then set for each of the PeerConnection events: when a session is connecting or open, and when a remote stream is added or removed. In fact, in this example these handlers only log status messages—except for onRemoteStreamAdded(), which sets the source for the remoteVideo element.
Once createPeerConnection() has been invoked in maybeStart(), a call is initiated.
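A sketch of that step, assuming a helper named doCall() (a hypothetical name) and the message format used by sendMessage() below:

function doCall() {
  console.log('Sending offer to peer.');
  var offer = pc.createOffer({has_audio: true, has_video: true});
  pc.setLocalDescription(pc.SDP_OFFER, offer);
  sendMessage({type: 'offer', sdp: offer.toSdp()});
  pc.startIce(); // start gathering candidates and connecting
}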
The offer creation process here is similar to the no-signalling example above but, in addition, a message is sent to the remote peer, giving a serialised SessionDescription for the offer. pc.startIce() starts the connection process using the ICE framework (as described above).
Signalling with the Channel API
The onIceCandidate() callback, passed to the PeerConnection constructor in createPeerConnection(), sends information about each candidate as it is ‘gathered’.
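A sketch, assuming the candidate fields of the era's API (label and toSdp()) and the sendMessage() helper shown next:

function onIceCandidate(candidate, moreToFollow) {
  if (candidate) {
    // Signal each gathered candidate to the remote peer
    sendMessage({type: 'candidate',
                 label: candidate.label,
                 candidate: candidate.toSdp()});
  }
}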
Outbound messaging, from the client to the server, is done by sendMessage() with an XHR request.
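A minimal sketch of sendMessage(), assuming a hypothetical /message endpoint on the App Engine app:

function sendMessage(message) {
  var msgString = JSON.stringify(message);
  console.log('C->S: ' + msgString);
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/message', true); // '/message' is an assumed path
  xhr.send(msgString);
}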
XHR works fine for sending signalling messages from the client to the server, but some mechanism is needed for server–client messaging: this demo uses the Google App Engine Channel API. Messages from the API (i.e. from the App Engine server) are handled by processSignalingMessage().
If the message is an answer from a peer (a response to an offer), PeerConnection sets the remote SessionDescription and communication can begin. If the message is an offer (i.e. the callee receiving a message from the caller), PeerConnection sets the remote SessionDescription, sends an answer back to the caller, and starts the connection by invoking the PeerConnection startIce() method.
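A sketch of processSignalingMessage(), assuming the era's JSEP objects (SessionDescription, IceCandidate, processIceMessage()) and message fields matching the senders above:

function processSignalingMessage(message) {
  var msg = JSON.parse(message);
  if (msg.type === 'offer') {
    // Callee: set the remote description, answer the caller, start ICE
    pc.setRemoteDescription(pc.SDP_OFFER, new SessionDescription(msg.sdp));
    var answer = pc.createAnswer(msg.sdp, {has_audio: true, has_video: true});
    pc.setLocalDescription(pc.SDP_ANSWER, answer);
    sendMessage({type: 'answer', sdp: answer.toSdp()});
    pc.startIce();
  } else if (msg.type === 'answer') {
    // Caller: set the remote description; streaming can begin
    pc.setRemoteDescription(pc.SDP_ANSWER, new SessionDescription(msg.sdp));
  } else if (msg.type === 'candidate') {
    // Pass a remote ICE candidate to the connection
    pc.processIceMessage(new IceCandidate(msg.label, msg.candidate));
  }
}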
And that’s it! The caller and callee have discovered each other and exchanged information about their capabilities, a call session is initiated, and real-time data communication can begin.
DataChannel
As well as audio and video, WebRTC supports real-time communication for other types of data.
The DataChannel API will enable peer-to-peer exchange of arbitrary data, with low latency and high throughput.
There are many potential use cases for the API, including:
Gaming
Remote desktop applications
Real-time text chat
File transfer
Decentralized networks
The API has several features to make the most of PeerConnection and enable powerful and flexible peer-to-peer communication:
Leveraging of PeerConnection session setup.
Multiple simultaneous channels, with prioritization.
Reliable and unreliable delivery semantics.
Built-in security (DTLS) and congestion control.
Ability to use with or without audio or video.
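The API had not shipped at the time of writing; the sketch below shows the proposed shape, assuming a createDataChannel() method on an established PeerConnection (names as in the draft spec):

// Assumes pc is an established PeerConnection
var channel = pc.createDataChannel('chat', {reliable: false}); // unreliable = lower latency
channel.onopen = function() {
  channel.send('Hello, peer!');
};
channel.onmessage = function(event) {
  console.log('Received: ' + event.data);
};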
In conclusion
The APIs and standards of WebRTC can democratise and decentralise tools for content creation and communication—for telephony, gaming, video production, music making, news gathering and many other applications.
Technology doesn’t get much more disruptive than this.
We look forward to seeing what inventive developers make of WebRTC as it becomes widely implemented over the next few months. As blogger Phil Edholm put it, ‘Potentially, WebRTC and HTML5 could enable the same transformation for real-time communications that the original browser did for information.’
For more information about support for APIs such as getUserMedia, see caniuse.com.
----------
Carriers Use WebRTC to Advance IMS Video Services
Web Real-Time Communication, or WebRTC, is a software framework recently introduced by Google that lets web browsers hold real-time voice or video conversations. Unlike traditional multimedia communication based on native clients or browser plugins, WebRTC builds the core modules needed for multimedia communication (audio/video capture, encoding and enhancement; network transport; session control) into the browser itself, so that third-party application developers can obtain real-time audio and video communication capability through simple JavaScript API calls.
The following describes a network design for deploying end-to-end WebRTC audio/video communication on a SIP-based IMS architecture. To limit complexity, it covers only interworking between WebRTC clients of the same kind, not interworking with other SIP terminals or PSTN phones. As shown in the figure, the WebRTC client is a web application written in JavaScript that runs in a browser and connects to the Internet directly or through a private gateway. The service platform requires a WebRTC proxy server and STUN (Session Traversal Utilities for NAT) plus TURN (Traversal Using Relays around NAT) servers. The SIP servers keep the existing configuration of the IMS core network, with no changes. All the WebRTC clients in the figure sit behind NATs or firewalls. During a call, the signalling stream and the media stream are carried along two separate paths.
Google has reached an acquisition agreement with Global IP Solutions (GIPS), a Norwegian company headquartered in San Francisco whose core technologies are voice and video over IP.
The per-share acquisition price is 142.1% above GIPS's share price on January 11 this year, or 27.5% above its price on May 14.
Google engineering director Rian Liebenberg said in a statement that GIPS's technology gives Google high-quality voice and video communication, which Google intends to use to strengthen its web development platform.
Although Google has not said how it will deploy or integrate GIPS, the technology can be expected to show up in many Google products, such as Google Voice, Gtalk, Android OS and Chrome OS.
As for GIPS's existing customers, GIPS CEO Emerick Woods said the company will continue to serve them; these customers include AOL, Nortel, Oracle, Samsung, WebEx and Yahoo.
------------------
Global IP Solutions
Google acquired Global IP Solutions and its real-time audio and video products and technology. The GIPS suite of products, including Voice Engine, Video Engine, Voice Conferencing Engine, Video Conferencing Engine, Voice Mediation Engine and related components, is no longer available for sale to new customers.
GIPS technology is now available for licensing through the WebRTC project.
from http://www.gipscorp.com
-----------------
Google Open-Sources the WebRTC Real-Time Communication Framework
Google has announced that it is opening the source code of the WebRTC framework to developers. WebRTC, a technology for real-time video and audio communication inside the browser, came to Google through its acquisition of Global IP Solutions a year ago.
WebRTC is a free, open project that provides browsers and mobile applications with Real-Time Communications (RTC) capabilities via simple APIs. The WebRTC components have been optimized to best serve this purpose.
Our mission: To enable rich, high-quality RTC applications to be developed for the browser, mobile platforms, and IoT devices, and allow them all to communicate via a common set of protocols.
The WebRTC initiative is a project supported by Google, Mozilla and Opera, amongst others. This page is maintained by the Google Chrome team.
# Enable built-in TURN server
[turn]
# If not set, a random user is used
user = "filegogo:filegogo"
realm = "filegogo"
listen = "0.0.0.0:3478"
# Public IP (e.g. on AWS or Aliyun)
publicIP = "0.0.0.0"
relayMinPort = 49160
relayMaxPort = 49200
This was introduced on trial but turned out to perform badly for WebRTC purposes and was never used in production.
I don't think there are any plans for using BBR. One of the problems with BBR for real-time communications is that it alternates between probing the bandwidth and measuring the RTT. To measure the bottleneck bandwidth, we need to build network queues. To accurately measure the RTT, all network queues must be completely drained, which requires sending almost nothing for at least one RTT. These variations in target send rate do not play nice with encoders and jitter buffers.