Learning WebRTC in Practice: Best Tools and Demos


For whom​

In this article (or rather, digest), I will share the key tools, demo applications, and open source projects that are essential for a practical understanding of WebRTC. There will be no tutorials or detailed explanations of individual parts of WebRTC; instead, this is a digest of resources that will help you get a better feel for the topic. If you have been working with this technology for some time, you are unlikely to find anything new here.

WebRTC Basics: What You Need to Know​

If you are familiar with how video calls work in browsers, feel free to skip this section (and perhaps the entire material).

Most popular video calling platforms (except Zoom) use WebRTC technology to implement their products in browsers. It is the de facto standard for creating real-time communication services and is supported by all modern browsers.

WebRTC consists of several main components:
  • Media Capture API: It allows you to get audio and video streams from the device's camera and microphone.
  • Connection API: This includes protocols and methods for establishing direct connections between browsers to transfer media and real-time data. This includes exchanging network information, choosing the best route for data, and managing the connection.
  • Codecs and media processing: WebRTC supports various audio and video codecs for encoding and decoding media streams. Media processing tools are also available, including echo cancellation, noise reduction, and automatic gain control.
  • Security: WebRTC uses encryption for all transmitted data and media streams to ensure privacy and security on the network.

How WebRTC works:
  • Signaling: To start exchanging media data between peers, a signaling process is required, which is not directly included in the WebRTC specification, leaving developers free to choose the signaling method (via WebSocket, REST API, etc.). Signaling is used to exchange meta-information: media type, codec parameters, network addresses and ports required to establish a connection.
  • Connection establishment: After the initial information is exchanged, ICE (Interactive Connectivity Establishment) mechanisms are applied, using STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) servers to bypass NAT (Network Address Translation) and establish a direct connection between peers.
  • Data Transfer: Once a connection is established, streaming data begins to be transferred between peers over a secure connection using RTP (Real-time Transport Protocol) for media and SCTP (Stream Control Transmission Protocol) for data.
  • Adaptation and processing: During transmission, WebRTC can adapt to changing network conditions, changing the quality of the stream to optimize performance. Media processing also occurs, including noise filtering, echo cancellation, and other features to improve communication quality.
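The signaling step above can be modeled without any browser APIs. The sketch below is purely illustrative (the message shape and all names are mine, not part of WebRTC): two peers agree on a codec through an application-provided channel, which is exactly the role a WebSocket or REST signaling server plays.

```javascript
// A trivial in-memory signaling channel: delivers messages to the other peer.
// In a real application this would be a WebSocket/REST server, and the peers
// would be RTCPeerConnection objects exchanging SDP offers and answers.
function makeSignalingPair() {
  const a = { handlers: [] }, b = { handlers: [] };
  a.send = (msg) => b.handlers.forEach((h) => h(msg));
  b.send = (msg) => a.handlers.forEach((h) => h(msg));
  a.onMessage = (h) => a.handlers.push(h);
  b.onMessage = (h) => b.handlers.push(h);
  return [a, b];
}

const [alice, bob] = makeSignalingPair();
const log = [];

// Bob answers any offer he receives, agreeing to the proposed codec.
bob.onMessage((msg) => {
  if (msg.type === "offer") {
    log.push(`bob got offer (codec ${msg.codec})`);
    bob.send({ type: "answer", codec: msg.codec });
  }
});

// Alice starts the handshake by sending an offer.
alice.onMessage((msg) => {
  if (msg.type === "answer") log.push(`alice got answer (codec ${msg.codec})`);
});
alice.send({ type: "offer", codec: "AV1" });

console.log(log.join("; "));
// → "bob got offer (codec AV1); alice got answer (codec AV1)"
```

The point of the sketch is that WebRTC itself never sees this channel: the application is free to transport the offer/answer however it likes, as long as both sides end up with the same session description.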

Next, in three sections, I will cover the tools/services and demos that helped me study this technology in a practical way, and then I will share open source products that let you understand in detail, and in practice, how WebRTC works in large products.

Tools​

Chrome Tools​

The Chrome browser provides several powerful tools for analyzing WebRTC:
  • chrome://webrtc-internals
  • chrome://webrtc-logs

Webrtc-internals​

Let's start with chrome://webrtc-internals. This is probably the most basic tool that you get to know when starting out with WebRTC. It provides a ton of information about current connections, which helps with debugging and reverse engineering other products.

Here you can find out which codecs existing video calling services use. For example, by joining a call in Google Meet and opening chrome://webrtc-internals, you can see that Google Meet uses the AV1 video codec.

Outgoing video stream information when calling in Google Meet

Or, for example, you can see the number of frames sent per second and their resolution

Graphs displaying information about the video stream at each moment in time

The main limitation is that the data is only available during a call (or if you turn on recording in advance), which means that the tool cannot be used during unexpected problems.
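The graphs in chrome://webrtc-internals are built from the same statistics that the getStats() API exposes to applications. As a rough sketch, here is how frame rate and resolution could be pulled out of a stats report; the plain objects below only mimic real "outbound-rtp" entries (framesPerSecond, frameWidth, and frameHeight are standard fields of the WebRTC stats API, but the helper itself is illustrative).

```javascript
// Extract resolution/FPS summaries for outgoing video from a stats report.
// In a browser the entries would come from: [...(await pc.getStats()).values()]
function summarizeOutboundVideo(statsEntries) {
  return statsEntries
    .filter((s) => s.type === "outbound-rtp" && s.kind === "video")
    .map((s) => `${s.frameWidth}x${s.frameHeight}@${s.framesPerSecond}fps`);
}

// Hand-written stand-in for a real stats report, for demonstration only.
const fakeReport = [
  { type: "outbound-rtp", kind: "audio" },
  { type: "outbound-rtp", kind: "video",
    frameWidth: 1280, frameHeight: 720, framesPerSecond: 30 },
  { type: "candidate-pair", state: "succeeded" },
];

console.log(summarizeOutboundVideo(fakeReport)); // → [ '1280x720@30fps' ]
```

Polling getStats() on a timer is essentially how monitoring dashboards for WebRTC services are built, which sidesteps the "only during a call" limitation of webrtc-internals.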

More information can be found here: https://getstream.io/blog/debugging-webrtc-calls/

Webrtc-logs​

This section is used much less often, as it contains more detailed information needed for deeper analysis of complex problems. Unlike chrome://webrtc-internals, it stores logs of previous sessions in text form, so you can view the logs of an already completed call, which chrome://webrtc-internals cannot do.

Inside you can find information about the start and end of video calls, media stream parameters, codecs, video resolution, frame rate and other media stream quality metrics.

Some of the information in these logs can be used to build metrics similar to those in chrome://webrtc-internals. Unfortunately, Chrome does not provide this functionality, but there is an open source product that helps with this task.

This tool is ideal when a problem occurs on a real call and real-time analysis is not available. The log data is saved by default to the directory …/Google/Chrome/Default/WebRTC Logs/<log_name>.gz, which allows you to collect this data from non-technical users who are willing to help analyze the problem.

TestRTC​

Link

analyzeRTC from testRTC is still in something like a beta. The platform as a whole is paid, but you can analyze your own dump completely free. You won't find anything here that chrome://webrtc-internals doesn't show, since they share the same data source, but analyzeRTC presents that data much better.

For example, information about the video stream is presented in a particularly convenient way:

Video stream information in the same Google Meet call, but using logs and TestRTC

You can also see the overall call and audio quality rating here:

Score is an overall score of the scenario from 0 to 10, taking into account audio and video across all channels. It helps to evaluate the performance of the service and requires subjective interpretation based on experience with the application. MOS (Mean Opinion Score) evaluates only the audio channels of the scenario, representing the subjective quality of the sound on a scale of 1 to 5, where above 3 is considered good quality, 2 to 3 is fair, and below 2 is poor.
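The MOS thresholds above are easy to encode as a small helper; this is a toy illustration of the scale, not part of TestRTC.

```javascript
// Classify a MOS value using the thresholds from the text:
// above 3 is good, 2-3 is fair, below 2 is poor (scale runs 1-5).
function mosLabel(mos) {
  if (mos < 1 || mos > 5) throw new RangeError("MOS is defined on a 1-5 scale");
  if (mos > 3) return "good";
  if (mos >= 2) return "fair";
  return "poor";
}

console.log(mosLabel(4.1)); // → "good"
console.log(mosLabel(2.5)); // → "fair"
console.log(mosLabel(1.3)); // → "poor"
```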

A short video tutorial can be found here

Demo applications​

Official Demos​

Link

What follows are official examples of WebRTC usage. Unlike the tools described above, this is not a single tool but a set of working examples. They cover a wide range of functionality, from basic peer-to-peer connection establishment to more complex scenarios such as data exchange and media stream management.

Each link contains a working demo and a link to the source code on GitHub. Using only these examples, you can already understand how to develop a fairly complex web application with video calling functionality. Overall, they let you see each aspect of WebRTC in practice and consolidate the theory.

Official demo of initializing a connection between two participants of a video conference

SDP Explanation​

Link

Example of explaining each line of SDP using an interactive web application

Next, let's talk about SDP (Session Description Protocol), a format used to negotiate a session before a call starts. It is described in RFC 8866 and is designed to agree on media stream parameters before a call: codec formats, video resolution, and the media types (audio, video, data) needed to establish a connection and exchange media.

The format is designed to be human-readable on the one hand and as compact as possible on the other; as a result, it is quite difficult to understand at a glance what a specific message contains. This tool solves exactly that problem.

In interactive mode, you can read about each line of the SDP protocol, saving time on searching for explanations in RFCs.
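A toy version of such an explainer is easy to sketch: map the type letter of each SDP line to a short description. Only a handful of the line types defined in RFC 8866 are listed here, and the descriptions are paraphrased.

```javascript
// Short descriptions for some SDP line types (RFC 8866 defines the full set).
const SDP_LINE_TYPES = {
  v: "protocol version",
  o: "origin (session id, originator address)",
  s: "session name",
  c: "connection data (address)",
  t: "timing (start/stop)",
  m: "media description (type, port, codecs)",
  a: "attribute (codec params, direction, ICE/DTLS info, ...)",
};

// Annotate each line of an SDP blob with its type description.
function explainSdp(sdp) {
  return sdp
    .trim()
    .split(/\r?\n/)
    .map((line) => `${line}  <-- ${SDP_LINE_TYPES[line[0]] ?? "unknown type"}`);
}

const sample = "v=0\nm=audio 49170 RTP/AVP 0\na=sendrecv";
console.log(explainSdp(sample).join("\n"));
```

A real explainer also has to interpret the value after the `=`, which is where the interactive tool earns its keep, but even this prefix lookup makes raw SDP far less intimidating.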

Media test​

Link

Flexible configuration of video stream processing tests before sending it

In addition to establishing a connection and transmitting media streams, an important component of any video calling application is the processing of these media streams. This tool allows you to understand how you can process video.

Here you can experiment with various aspects of video streams, such as creating a stream from scratch or from a camera, choosing video resolution and frame rate. You can also apply transformations to frames, such as changing color, converting to black and white, encoding/decoding to H.264, and adding overlays to track display time in the <video> element.

After experimenting, you can see the results of the processing, including the processing time at different stages (e.g. converting to RGBX, adding background and overlays), frame display time, total time from start to finish of the processing, and the waiting time in the queue. Each metric includes the number of frames processed, average, median, minimum, and maximum processing times. This data helps you evaluate the performance and efficiency of video processing in your application and understand how much each operation “costs.”
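These per-stage metrics (frame count, average, median, minimum, and maximum times) are straightforward to reproduce for your own processing pipeline; a minimal sketch:

```javascript
// Compute count/avg/median/min/max over a list of per-frame timings (ms),
// the same summary the demo shows for each processing stage.
function frameTimingStats(timesMs) {
  const sorted = [...timesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return {
    frames: sorted.length,
    avg: timesMs.reduce((sum, t) => sum + t, 0) / timesMs.length,
    median:
      sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2,
    min: sorted[0],
    max: sorted[sorted.length - 1],
  };
}

console.log(frameTimingStats([4, 2, 8, 6, 5]));
// → { frames: 5, avg: 5, median: 5, min: 2, max: 8 }
```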

Slides from the authors.

Simulcast Playgrounds​

First
Link

Simulcast settings in simulcast playground

Another demo, this time unofficial but still open source. Here we focus on Simulcast. This is an approach that allows conference participants with different network bandwidths to receive video from you in different resolutions (the higher the bandwidth, the better the resolution).

To avoid spending server resources on transcoding, each participant sends several video streams in different resolutions, and the server forwards the appropriate one to each receiver. This is the algorithm simulated in this example. It lets you configure video transmission parameters, including resolution, frame rate, codecs, and Simulcast-specific settings such as adding and removing layers, changing priorities, and setting encoder parameters.

If you've never heard of Simulcast, it's best to spend some time on the theory before jumping into this demo.
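For orientation, here is what a typical three-layer simulcast configuration looks like in code: it is passed to addTransceiver() via sendEncodings, where "rid" names a layer and scaleResolutionDownBy divides the capture resolution. The helper function below is my own illustration, not part of the WebRTC API.

```javascript
// A common three-layer simulcast setup: high, medium, low.
const sendEncodings = [
  { rid: "h", scaleResolutionDownBy: 1 }, // full capture resolution
  { rid: "m", scaleResolutionDownBy: 2 }, // half width and height
  { rid: "l", scaleResolutionDownBy: 4 }, // quarter width and height
];

// Illustrative helper: the resolution each layer gets for a given capture size.
function layerResolutions(width, height, encodings) {
  return encodings.map((e) => ({
    rid: e.rid,
    width: Math.round(width / e.scaleResolutionDownBy),
    height: Math.round(height / e.scaleResolutionDownBy),
  }));
}

console.log(layerResolutions(1280, 720, sendEncodings));
// h: 1280x720, m: 640x360, l: 320x180

// In a browser, the config is applied like this (not runnable in Node):
// pc.addTransceiver(videoTrack, { direction: "sendonly", sendEncodings });
```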

Second
Link

Simulcast stream statistics

Another variation of the simulcast demo: far less flexible, but much more representative. You can see the standard configuration with three parallel streams and metrics from chrome://webrtc-internals. It looks more familiar and does not distract with extra settings. An excellent complement to the tool described above.

Pagination Demo App​

Link

A great tool for understanding how incoming video streams are displayed and managing the resources they use. Unfortunately, the full demo was not working at the time of writing, but it wouldn't have been much help anyway, since the main value of this product is its source code.

There you can figure out how to properly manage incoming video/audio streams from other call participants. This matters especially when there are many participants: they no longer all fit on the screen, and the grid has to be split into pages. There is no point in receiving media from participants on other pages, and this needs to be handled correctly.

The most interesting component is components/PaginatedGrid.js; it contains almost all the logic for rendering and stream management. A description of the source code and of the application as a whole can be found in the official documentation.
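The core idea can be sketched as a pure function (the names below are illustrative and not taken from the demo's source): only participants on the current page stay subscribed, everyone else's media is paused.

```javascript
// Decide which participants' streams to receive for a given page.
// Off-page participants are left unsubscribed to save bandwidth and CPU.
function paginate(participants, page, pageSize) {
  const start = page * pageSize;
  const visible = participants.slice(start, start + pageSize);
  return participants.map((p) => ({
    id: p.id,
    subscribed: visible.includes(p), // receive media only for visible tiles
  }));
}

const participants = Array.from({ length: 7 }, (_, i) => ({ id: `user-${i}` }));
const page1 = paginate(participants, 1, 4); // second page, 4 tiles per page

console.log(page1.filter((p) => p.subscribed).map((p) => p.id));
// → [ 'user-4', 'user-5', 'user-6' ]
```

In a real client, flipping `subscribed` would translate into pausing/resuming the corresponding incoming tracks (or telling the SFU to stop forwarding them), which is exactly the logic PaginatedGrid.js implements.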

Open source products​

Jitsi Meet​

GitHub repository

One of the most popular open source products for browser-based video calling. In fact, Jitsi contains not only the web client but also all the other parts needed to build the entire product. It is a great resource for diving very deep into the topic of RTC, but it will take a lot of time: Jitsi is very flexible and a fairly old product with a huge code base.

Overall, this tool has a pretty high barrier to entry due to its size and age. The documentation is present and fairly detailed, but, again, very large.

Janus​

GitHub repository

Janus Gateway is an open-source WebRTC server that solves a similar problem. Unlike Jitsi Meet, however, Janus focuses on providing a low-level API for managing media streams, making it more flexible but potentially more difficult for newcomers to use. It's also more suited to those interested in the server side of things rather than the client side, which Jitsi Meet is better suited for.

Janus has very detailed documentation and a whole bunch of separate demos (which, by the way, were not included in the previous section, but I would still recommend checking them out).

Mediasoup​

GitHub repository

MediaSoup is another WebRTC server. It solves the same tasks as Janus Gateway, with the difference that mediasoup provides both a server part and a client-side library.

Other products​

In fact, there are plenty of open source products for building video calls; I have listed those I consider the most useful and convenient for studying the technologies and approaches used in WebRTC.
You can learn about other products at the link.

Instead of conclusions​

This article contained no arguments or innovations, so there are no conclusions to draw. I have simply described the tools I used, and I am sure this is far from a complete list. I would be grateful for recommendations in the comments of tools and services I missed!

(c) Source