Video Service

Note

The Video Service is considered to be in Beta phase on Linux systems.

The Video Service allows managing the local participant’s video stream as well as attaching a Video Sink to remote video streams.

Video interface

#include <dolbyio/comms/video.h>

class video

The video service.

Public Functions

virtual local_video &local() = 0

Gets the local video service instance.

Returns:

the local video service.

virtual remote_video &remote() = 0

Gets the remote video service instance.

Returns:

the remote video service.

Local video

class local_video

The local video service.

This service is used to control the local participant’s video capture and its sending into the conference.

Attention

The local video interface contains methods that return async_result. Each function that returns async_result is asynchronous and the operation is executed during the SDK event loop. The caller can block the calling thread until the operation completes using the wait helper. The caller can also chain consecutive operations, which are dependent on the completion of this method, using the async_result::then calls. Each async_result chain needs to be terminated with an async_result::on_error.

Public Functions

virtual async_result<void> start(const camera_device &device = {}, video_frame_handler *handler = nullptr) = 0

Starts local video capture.

This method may be called at any time, regardless of the conference state. If this method is invoked when there’s no active conference, it still selects the camera device and sets the video frame handler. If the video frame handler returns a non-null video sink, the camera will start delivering frames to the sink.

This method can also be used to switch cameras at any point. If you have passed a video_frame_handler to the previous start call and would like to continue using this handler, you must pass the same handler into the subsequent call used to switch cameras. This has the effect of just switching cameras, keeping the rest of the pipeline intact.

The ownership of the frame handler remains with the application. The application must not delete the handler, or its sink and source, until it has invoked the stop() method and the stop() method’s execution has finished.

If the application uses a default-constructed camera_device, the first camera found in the system is used.

If this method returns an error, the provided frame handler can be safely deleted by the application.

If the application starts the video while not in a conference and later joins the conference, the conference’s local video state is determined by the media_constraints passed to the conference::join() method. It is possible to start local camera preview but join the conference without video; in order to enable video later in the conference, the start() method should be used again. Once the conference video has started, it is not possible to disable sending video into the conference while keeping the local camera preview.

Parameters:
  • device – Camera device to start capturing from.

  • handler – the camera stream’s video frame handler.

Returns:

async operation result.

virtual async_result<void> stop() = 0

Stops local video capture.

Returns:

The result object producing the operation status asynchronously.

Remote video

The remote video API allows attaching a Video Sink to receive Raw Video Frames. These video frames can then be handled as the application desires. For example, for rendering on screen or dumping to a file.

class remote_video

The remote video service.

Attention

The remote video interface contains methods that return async_result. Each function that returns async_result is asynchronous and the operation is executed during the SDK event loop. The caller can block the calling thread until the operation completes using the wait helper. The caller can also chain consecutive operations, which are dependent on the completion of this method, using the async_result::then calls. Each async_result chain needs to be terminated with an async_result::on_error.

Public Functions

virtual async_result<void> set_video_sink(video_sink *sink) = 0

Sets the video sink to be used by all conferences.

The video sink passed to this method will be used for passing the decoded video frames to the application. The ownership of the sink remains with the application, and the SDK will not delete it. The application should set a null sink and ensure that the set_video_sink() call has returned before deleting the previously set sink object.

Parameters:

sink – the video sink or nullptr.

Returns:

the result object producing the operation status asynchronously.

Video frame handling

#include <dolbyio/comms/media_engine/media_engine.h>

The application can use the video frame handling capabilities of the SDK to process captured VideoFrames. The frame handler is an interface providing a Video Sink and a Video Source, so inserting the frame handler into the video capture pipeline allows the application to receive, process, and then inject VideoFrames into the SDK. To only provide frames to the SDK, the application needs to implement only the Video Source portion of the handler.

The Video processor section shows a basic example of implementing a Video Processor that receives camera frames, alters them, and injects them back into the SDK. Note that this is just an example to give an idea of how to create such a module. After creating a custom video processor module like the example, call start video and provide the processor to that call. At this point the processor becomes part of the video capture pipeline, and all camera frames pass through it.

class video_frame_handler

The video frame handler for local video streams.

The application can set the video frame handler when starting a local camera stream. The frame handler can be used to capture the camera frames for local camera preview, and for delivering modified video frames back into the video pipeline for encoding.

There are four use-cases supported by the video_frame_handler:

  1. No-op frame handler: the camera frames are not delivered to the application; they are encoded by the video pipeline and sent into the conference. The frame handler may return a null sink and source, or the frame handler pointer passed to the media pipeline may itself be null.

  2. The local preview: the frame handler returns a non-null sink but a null source. The video frames captured from the camera are passed both to the conference’s video track and to the frame handler sink.

  3. Video processing: the frame handler returns a non-null sink and source. The camera frames are passed to the frame handler sink only. When the conference’s video track starts sending data, it connects the frame handler source to the internal sink. The application is expected to deliver video frames, but delivery does not need to be synchronous with the frames arriving at the frame handler sink; the application can deliver frames on any thread.

  4. Video injection: the frame handler returns a null sink and a non-null source. In this scenario, the real camera is not used at all. The application should deliver externally produced frames through the frame handler source interface.

In the local preview and video processing scenarios, the camera is open all the time, regardless of the video track state in the conference. The local preview can be displayed even before joining the conference, and will remain open after the conference is left. In the video injection scenario, the camera is not open at all. When a no-op frame handler is used, the conference’s video track presence enables the camera.

Subclassed by dolbyio::comms::plugin::injector

Public Functions

virtual video_sink *sink() = 0

Get the frame handler’s video sink.

If the frame handler wishes to get raw video frames in the stream it’s attached to, this method should return the proper video sink pointer.

Returns:

a video sink pointer, or nullptr.

virtual video_source *source() = 0

Get the frame handler’s video source.

If the frame handler wishes to forward the processed frames down the pipeline, it should return a non-null source.

Returns:

a video source pointer, or nullptr.

class video_sink

The interface for receiving the raw video frames (YUV bitmaps, or platform-specific format).

Subclassed by dolbyio::comms::plugin::recorder

Public Functions

virtual void handle_frame(const std::string &stream_id, const std::string &track_id, std::unique_ptr<video_frame> frame) = 0

The callback that is invoked when a video frame is decoded and ready to be processed.

Parameters:
  • stream_id – The ID of the media stream to which the video track belongs. In the event of a local camera stream, this string may be empty.

  • track_id – The ID of the video track to which the frame belongs. In the event of a local camera stream, this string may be empty.

  • frame – The pointer to the video frame.

class video_source

The interface for providing video frames.

This interface must be implemented by the injector; it serves as the source of video frames passed to the rtc_video_source.

Subclassed by dolbyio::comms::plugin::injector

Public Functions

virtual void set_sink(video_sink *sink, const config &config) = 0

Sets the video sink on the video source.

This method is invoked when the video pipeline is ready to accept video frames from the source. After this method is invoked with a non-null sink, the source can start delivering frames on any thread. This method may be invoked multiple times with the same or a changing sink. If the new sink pointer differs from the previous one, the source implementation should ensure that, after this method returns, the previously used sink receives no more frames (this requires synchronizing with the thread that delivers frames). When this method is called with a null sink, the source should stop producing video frames altogether.

Parameters:
  • sink – The sink which will receive the injected video frames.

  • config – the suggested config for the video properties.

struct config

The video configuration wanted by the WebRTC track.

The video_source is free to ignore parts of the configuration, or the whole configuration. However, video coding is most efficient when the configuration is respected.

Public Members

bool rotation_applied = false

Experimental configuration.

bool black_frames = false

True if the frames should be black.

int max_pixel_count = std::numeric_limits<int>::max()

The maximum number of pixels in each frame.

int target_pixel_count = -1

The desired number of pixels in each frame. -1 means no preference, but the source should attempt to fit below max_pixel_count.

int max_framerate_fps = std::numeric_limits<int>::max()

The maximum framerate.

Video frame API

#include <dolbyio/comms/media_engine/media_engine.h>

Video frame interface

class video_frame

The interface that wraps decoded video frames received from and to be injected into WebRTC.

Public Functions

virtual ~video_frame() = default
virtual int width() const = 0

Gets the width of the frame.

Returns:

The width of the frame.

virtual int height() const = 0

Gets the height of the video frame.

Returns:

The height of the frame.

virtual int64_t timestamp_us() const = 0

Gets the timestamp of the video frame if it was set.

Attention

On frames passed from the SDK this will be set to the time when the frame was captured. This will be in sync with the timestamp of the captured audio frame corresponding to this video frame. If the application plans to process the frame and then inject the processed frame back to the SDK, it should reuse the timestamp it receives from the SDK to ensure proper AV synchronization on the receiving end.

Returns:

The system monotonic clock timestamp of the video frame in microseconds.

virtual video_frame_i420 *get_i420_frame() = 0

Gets the I420 (YUV) data of the frame.

Returns:

the instance of the YUV interface to the data frame, or nullptr if the video frame is not in YUV format.

virtual video_frame_macos *get_native_frame() = 0

Gets the Texture data of the frame.

Attention

This is currently only available on MacOS.

Returns:

the instance of the MacOS video frame interface to the data frame, or nullptr if the video frame is not a texture.

YUV420 video frame

class video_frame_i420

The interface for obtaining I420 (YUV) memory pointers and info for I420 frames.

Public Functions

virtual const uint8_t *get_y() const = 0

Gets the Y component.

Returns:

The pointer to the Y data buffer.

virtual const uint8_t *get_u() const = 0

Gets the U component.

Returns:

The pointer to the U data buffer.

virtual const uint8_t *get_v() const = 0

Gets the V component.

Returns:

The pointer to the V data buffer.

virtual int stride_y() const = 0

Returns the Y component stride.

Returns:

An integer representing the Y component stride.

virtual int stride_u() const = 0

Returns the U component stride.

Returns:

An integer representing the U component stride.

virtual int stride_v() const = 0

Returns the V component stride.

Returns:

An integer representing the V component stride.

MacOS video frame

#include <dolbyio/comms/media_engine/video_frame_macos.h>

class video_frame_macos

MacOS Video Frames containing Texture data.

Public Functions

virtual CVPixelBufferRef get_buffer() = 0

Gets the underlying CVPixelBufferRef.

Returns:

Reference to the underlying CVPixelBuffer.

Video utilities

#include <dolbyio/comms/media_engine/video_utils.h>

class format_converter

Helper class for converting between various frame formats. It currently supports conversions to/from NV12, I420, and RGB, as well as helpers for merging/splitting UV planes.

Public Static Functions

static int nv12_to_i420(const uint8_t *src_y, int src_stride_y, const uint8_t *src_vu, int src_stride_vu, uint8_t *dst_y, int dst_stride_y, uint8_t *dst_u, int dst_stride_u, uint8_t *dst_v, int dst_stride_v, int width, int height)

Convert from NV12 to i420 video format.

Parameters:
  • src_y – Source Y Plane Buffer.

  • src_stride_y – Source Y Plane Stride.

  • src_vu – Source UV Plane Buffer.

  • src_stride_vu – Source UV Plane Stride.

  • dst_y – Destination Y plane buffer.

  • dst_stride_y – Destination Y plane stride.

  • dst_u – Destination U plane buffer.

  • dst_stride_u – Destination U plane stride.

  • dst_v – Destination V plane buffer.

  • dst_stride_v – Destination V plane stride.

  • width – Frame width.

  • height – Frame height.

Return values:
  • 0 – On success.

  • 1 – On failure.

static void split_uv_planes(const uint8_t *src_vu, int src_stride_vu, uint8_t *dst_u, int dst_stride_u, uint8_t *dst_v, int dst_stride_v, int width, int height)

Split the UV plane buffers into respective U and V buffers. This can be used to go from NV12 to i420 if you want to keep the same Y buffer and just get separated U and V buffers.

Warning

Caller must ensure that height * width does not exceed buffer size.

Parameters:
  • src_vu – Source UV Plane Buffer.

  • src_stride_vu – Source UV Plane Stride.

  • dst_u – Destination U plane buffer.

  • dst_stride_u – Destination U plane stride.

  • dst_v – Destination V plane buffer.

  • dst_stride_v – Destination V plane stride.

  • width – Width of the plane buffer (not the frame width).

  • height – Height of the plane buffer (not the frame height).

static int i420_to_nv12(const uint8_t *src_y, int src_stride_y, const uint8_t *src_u, int src_stride_u, const uint8_t *src_v, int src_stride_v, uint8_t *dst_y, int dst_stride_y, uint8_t *dst_uv, int dst_stride_uv, int width, int height)

Convert i420 to NV12 video format.

Parameters:
  • src_y – Source Y Plane Buffer.

  • src_stride_y – Source Y Plane Stride.

  • src_u – Source U Plane Buffer.

  • src_stride_u – Source U Plane Stride.

  • src_v – Source V Plane Buffer.

  • src_stride_v – Source V Plane Stride.

  • dst_y – Destination Y plane buffer.

  • dst_stride_y – Destination Y plane stride.

  • dst_uv – Destination UV plane buffer.

  • dst_stride_uv – Destination UV plane stride.

  • width – Frame width.

  • height – Frame height.

Return values:
  • 0 – On success.

  • 1 – On failure.

static void merge_uv_plane(const uint8_t *src_u, int src_stride_u, const uint8_t *src_v, int src_stride_v, uint8_t *dst_uv, int dst_stride_uv, int width, int height)

Merge the U and V plane buffers into a single UV plane buffer. This can be used to convert from i420 to NV12 if you want to keep using the same Y buffer and just get a merged UV buffer.

Warning

Caller must ensure that height * width does not exceed buffer size.

Parameters:
  • src_u – Source U Plane Buffer.

  • src_stride_u – Source U Plane Stride.

  • src_v – Source V Plane Buffer.

  • src_stride_v – Source V Plane Stride.

  • dst_uv – Destination UV plane buffer.

  • dst_stride_uv – Destination UV plane stride.

  • width – Width of the plane buffer (not the frame width).

  • height – Height of the plane buffer (not the frame height).

static int i420_to_argb(const uint8_t *src_y, int src_stride_y, const uint8_t *src_u, int src_stride_u, const uint8_t *src_v, int src_stride_v, uint8_t *dst_argb, int dst_stride_argb, int width, int height)

Convert i420 to ARGB video format.

Parameters:
  • src_y – Source Y Plane Buffer.

  • src_stride_y – Source Y Plane Stride.

  • src_u – Source U Plane Buffer.

  • src_stride_u – Source U Plane Stride.

  • src_v – Source V Plane Buffer.

  • src_stride_v – Source V Plane Stride.

  • dst_argb – Destination ARGB buffer.

  • dst_stride_argb – Destination ARGB stride.

  • width – Frame width.

  • height – Frame height.

Return values:
  • 0 – On success.

  • 1 – On failure.

static int argb_to_i420(const uint8_t *src_argb, int src_stride_argb, uint8_t *dst_y, int dst_stride_y, uint8_t *dst_u, int dst_stride_u, uint8_t *dst_v, int dst_stride_v, int width, int height)

Convert ARGB to i420 video format.

Parameters:
  • src_argb – Source ARGB buffer.

  • src_stride_argb – Source ARGB stride.

  • dst_y – Destination Y Plane Buffer.

  • dst_stride_y – Destination Y Plane Stride.

  • dst_u – Destination U Plane Buffer.

  • dst_stride_u – Destination U Plane Stride.

  • dst_v – Destination V Plane Buffer.

  • dst_stride_v – Destination V Plane Stride.

  • width – Frame width.

  • height – Frame height.

Return values:
  • 0 – On success.

  • 1 – On failure.

static int argb_to_nv12(const uint8_t *src_argb, int src_stride_argb, uint8_t *dst_y, int dst_stride_y, uint8_t *dst_uv, int dst_stride_uv, int width, int height)

Convert ARGB to NV12 video format.

Parameters:
  • src_argb – Source ARGB buffer.

  • src_stride_argb – Source ARGB stride.

  • dst_y – Destination Y plane buffer.

  • dst_stride_y – Destination Y plane stride.

  • dst_uv – Destination UV plane buffer.

  • dst_stride_uv – Destination UV plane stride.

  • width – Frame width.

  • height – Frame height.

Return values:
  • 0 – On success.

  • 1 – On failure.

static int nv12_to_argb(const uint8_t *src_y, int src_stride_y, const uint8_t *src_vu, int src_stride_vu, uint8_t *dst_argb, int dst_stride_argb, int width, int height)

Convert NV12 to ARGB video format.

Parameters:
  • src_y – Source Y plane buffer.

  • src_stride_y – Source Y plane stride.

  • src_vu – Source UV plane buffer.

  • src_stride_vu – Source UV plane stride.

  • dst_argb – Destination ARGB buffer.

  • dst_stride_argb – Destination ARGB stride.

  • width – Frame width.

  • height – Frame height.

Return values:
  • 0 – On success.

  • 1 – On failure.

static void set_plane_buffer_value(uint8_t *dst, int dst_stride, int width, int height, uint32_t value)

Sets a plane buffer to a specified 32-bit value. This can be used, for instance, to zero out a plane.

Warning

Caller must ensure that height * width does not exceed buffer size.

Parameters:
  • dst – Destination plane buffer.

  • dst_stride – Destination plane stride.

  • width – Width of the plane buffer in bytes.

  • height – Height of the plane buffer in bytes.

  • value – Value to be set.