
Audio Engineering Society

Convention e-Brief 398

Presented at the 143rd Convention
2017 October 18–21, New York, NY, USA

This Engineering Brief was selected on the basis of a submitted synopsis. The author is solely responsible for its
presentation, and the AES takes no responsibility for its contents. All rights reserved. Reproduction of this paper, or any
portion thereof, is not permitted without direct permission from the Audio Engineering Society.

How Streaming Object Based Audio Might Work


Adrian Wisbey¹
¹British Broadcasting Corporation, Broadcasting House, London, W1A 1AA, United Kingdom
Correspondence should be addressed to the author ([email protected])

ABSTRACT
Object based media is being considered as the future platform model by a number of broadcasting and
production organizations. This paper is a personal imagining of how object based broadcasting might be
implemented with IP media as the primary distribution whilst still supporting traditional distributions such as
FM, DAB and DVB. The examples assume a broadcaster supporting a number of linearly scheduled services
providing both live (simulcast) and on-demand (catch-up) content. An understanding of the basics of object
based audio production and broadcasting by the reader is assumed. Whilst this paper specifically discusses audio
or radio broadcasting many of the components and requirements are equally valid in a video environment.

1 Introduction

Historically, IP media production and distribution has come at the end of the traditional linear broadcast chain. If broadcasters are to really become 'internet fit' then they may need to embrace a reversal of these roles, with IP media at the heart of production and distribution and linear broadcasting tacked onto the end.

Current broadcasting is channel based: continuous linear content on a fixed number of channels, with the representation pre-defined by the original mix.

For the purposes of this paper I have assumed the production environment will comprise three data stacks, each with a high speed, high availability data store and associated API (Fig. 1).

• The Media Stack hosts the objects and their associated metadata.
• The Production Stack hosts the scheduled content compositions.
• The Schedules Stack hosts the station running orders and programme metadata.
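To make the separation of the three stacks concrete, the following minimal sketch models each stack as a simple keyed store with a tiny API. Every name and field here (`obj-001`, `comp-news-0800`, `running_order` and so on) is an illustrative assumption for this paper's architecture, not a real broadcaster API.

```python
# Hypothetical sketch of the three data stacks described above.
# All identifiers and fields are illustrative assumptions.

class Stack:
    """A minimal keyed store standing in for a data store plus its API."""
    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items[key]

# The Media Stack hosts the objects and their associated metadata.
media = Stack()
media.put("obj-001", {"codec": "aac", "bitrates_kbps": [48, 128, 320],
                      "duration_s": 12.5})

# The Production Stack hosts the scheduled content compositions.
production = Stack()
production.put("comp-news-0800", {"events": [
    {"object": "obj-001", "start_s": 0.0, "gain_db": 0.0}]})

# The Schedules Stack hosts station running orders and programme metadata.
schedules = Stack()
schedules.put("station-1", {"running_order": ["comp-news-0800"]})

# A client can resolve a station through all three stacks to its media.
comp_id = schedules.get("station-1")["running_order"][0]
first_object = production.get(comp_id)["events"][0]["object"]
print(first_object)  # obj-001
```

The point of the sketch is the resolution chain: schedules reference compositions, compositions reference objects, and only the Media Stack ever touches audio.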

Object based broadcasting does not have any channels or mixes at the point of distribution. It
comprises a pool of objects and a recipe of how to
create a representation for the listener.

That recipe or composition contains all the data required to allow the playback engine or renderer to
construct a mix suitable for the listeners’ equipment
capabilities and environmental situation. The same
objects and composition could produce stereo,
surround sound or binaural representations and those
representations could be processed for normal, night mode or loud environment listening.

Figure 1. Architecture Overview.
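A composition of this kind might be sketched as plain data plus a renderer that maps the same objects to more than one representation. The field names (`renderings`, `level_db`, the `night` layout) are assumptions invented for this example, not part of any defined composition format.

```python
# Illustrative sketch: one set of objects, several representations.
# All field names are assumptions for this example.

composition = {
    "objects": [
        {"id": "dialogue", "level_db": 0.0},
        {"id": "ambience", "level_db": -6.0},
    ],
    # The same objects rendered differently per listening situation:
    # per-object [left, right] channel gains for each representation.
    "renderings": {
        "stereo": {"dialogue": [1.0, 1.0], "ambience": [0.7, 0.7]},
        "night":  {"dialogue": [1.0, 1.0], "ambience": [0.2, 0.2]},
    },
}

def render(comp, layout):
    """Return per-object channel gains for the requested representation."""
    rules = comp["renderings"][layout]
    return {obj["id"]: rules[obj["id"]] for obj in comp["objects"]}

# Night mode keeps dialogue at full level but pulls ambience well down.
print(render(composition, "night")["ambience"])  # [0.2, 0.2]
```

A surround or binaural rendering would simply be a further entry in the same table, which is the sense in which one composition can serve every listener's equipment and environment.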
Wisbey Streaming Object Based Audio

2 Encoding
There are already relatively new codecs designed to handle object based audio. However, for this proposal to work no specialized encoder is required. Every object exists as an entity in its own right and can be encoded using the current batch of audio codecs. Objects can be encoded at multiple bitrates to enable adaptive bitrate streaming.
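Because each object is encoded independently, in fixed-size chunks and at several bitrates, ordinary codecs plus predictable chunk names are enough to support per-object adaptive bitrate streaming. The ladder values, chunk duration and naming scheme below are illustrative assumptions, loosely modelled on HTTP chunked delivery:

```python
# Sketch: one object, a conventional bitrate ladder, fixed-size chunks.
# Values and the URI naming scheme are illustrative assumptions.

CHUNK_S = 4                   # assumed chunk duration, seconds
LADDER_KBPS = [48, 128, 320]  # assumed example bitrate ladder

def chunk_plan(object_id, duration_s):
    """List the chunk URIs an encoder would produce for one object."""
    n_chunks = -(-int(duration_s) // CHUNK_S)  # ceiling division
    return [f"{object_id}/{kbps}k/chunk-{n:05d}.m4s"
            for kbps in LADDER_KBPS
            for n in range(n_chunks)]

uris = chunk_plan("obj-001", duration_s=10)
print(len(uris))  # 3 bitrates x 3 chunks = 9
print(uris[0])    # obj-001/48k/chunk-00000.m4s
```

Because the names are computable, a client that knows the ladder and chunk duration can predict the URI of any chunk without consulting a manifest each time, which is the property the templated manifests in the next section rely on.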

Delivery of live streams can be accomplished by utilising a 'chunk at source' workflow. Usually, for HTTP chunked delivery, the content is chunked as part of the encoding process. Chunking at source breaks the objects into a pre-defined chunk size, so each chunk can be encoded in isolation. As each chunk is encoded the object metadata is updated to represent the latest object statistics and availability.

3 On-line Delivery
Consider the functional model of current HTTP chunked delivery formats (HLS, HDS and MPEG-DASH) (Fig. 2).

The media availability may be provided from 'in-house' resources, third party 'aggregators' or both. Provided the correct availability and endpoint details are supplied, both solutions work.

Not shown, for clarity, are any caching layers and the probable content delivery network (CDN) partners required to deliver the services at scale and with resilience.

Figure 2. HTTP Chunked Delivery

Compare this with a functional model of object based delivery (Fig. 3).

Figure 3. Object Based Delivery

The basic proposition is the same, the significant difference being the separation of the media (objects) and composition components. It is assumed that media objects will still be delivered over an HTTP chunked protocol using a templated manifest. The composition would define the starting and ending media chunks required.

Templated manifests work by having a predictable URL for the next required media chunk, thus reducing the number of manifest calls required to just one at the start of the media request.

Both these models work for live (simulcast) and on-demand (catch-up) delivery. The only difference is that for live delivery the manifest or composition will be continuously updated in near real time as new media chunks and composition events are created in the production 'mix'. Thus there is a requirement to refresh the manifest or composition at regular intervals.

4 The Client
The basic client comprises a User Interface (UI) and the playback engine or renderer.

The UI is responsible for the collection, curation and presentation of the available media. It could also allow for personal preferences for listening format and conditions to be managed.

The renderer is the heart of the client. It is responsible for reading the composition, fetching the required media and mixing (rendering) the final audio image. It would also be responsible for

AES 143rd Convention, New York, NY, USA, 2017 October 18–21

determining network connection type and managing the adaptive bitrate function.

A specialized, stripped down client would be implemented to provide the fixed format output needed to sustain the traditional linear broadcast platforms. This client could run at the broadcast centre or at a remote transmitter or relay site.

5 Practical Examples
Most of these use cases will be familiar to most broadcasters and content producers. These are just a few of the possible scenarios where object based audio can be leveraged to solve a technical and/or production requirement.

One of the main reasons they all work is that objects can have any duration, and any point in the programme timeline can be chosen for an event.

Other use cases not illustrated here include:

• Choosing a personalized equalization and/or dynamic range setting to suit the listener's preferred tastes and listening environment.
• Synthesizing surround or immersive sound from mono or stereo material.
• User control of the level and position of particular elements such as commentary, home team support or away team support.
• Multiple language support.

5.1 Variable Duration
A magazine program is produced with alternative duration editions. The listener has the choice of versions to consume.

Figure 4 - Variable Duration Content

5.2 Choice of Image
A live classical music event is being mixed in multi-channel format. The listener has the choice of audio image formats to consume.

Figure 5 - Audio Image Selection

5.3 Station Splits
A local station wants to carry the home team match commentary. The listener can choose to stay with scheduled programming or switch to the game.

Figure 6 - Station Split

5.4 Ad Insertion
A station wants to deliver a targeted advertisement to a listener. Predefined or on-the-fly markers are inserted into the composition that trigger an auction for the ad spot.

Figure 7 - Advertising
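The ad insertion flow of 5.4 might be sketched as a pass over the composition timeline that swaps each marker for the event won at auction. The event shapes, the `ad_marker` type and the stubbed auction are all assumptions for illustration, not a defined ad-decision protocol.

```python
# Sketch of targeted ad insertion: markers in the composition timeline
# trigger a (stubbed) auction whose winning object fills the ad slot.
# Marker and event shapes are illustrative assumptions.

composition = {"events": [
    {"type": "media", "object": "show-part-1", "start_s": 0.0},
    {"type": "ad_marker", "start_s": 600.0, "duration_s": 30.0},
    {"type": "media", "object": "show-part-2", "start_s": 630.0},
]}

def run_auction(marker, listener_profile):
    """Stub auction: choose an ad object for this listener and slot."""
    return {"type": "media",
            "object": f"ad-{listener_profile['segment']}",
            "start_s": marker["start_s"]}

def resolve_ads(comp, listener_profile):
    """Replace each ad marker with the auction winner's media event."""
    return [run_auction(e, listener_profile) if e["type"] == "ad_marker"
            else e
            for e in comp["events"]]

resolved = resolve_ads(composition, {"segment": "sports"})
print(resolved[1]["object"])  # ad-sports
```

Because objects can have any duration and markers can sit at any point on the timeline, the same mechanism serves pre-defined break patterns and on-the-fly markers created during a live mix.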


6 Reality Check: Are we there yet?
If we assume an orderly standards based world then the short answer is no. Whilst there may be practical working examples, they rely to a greater or lesser extent on proprietary technology. Having said that, there are many areas in which we do have standards, or at least standards in development, with the AES taking a leading role in many of them.

AES67[1] provides comprehensive interoperability recommendations in the areas of synchronization, media clock identification, network transport, encoding and streaming, session description and connection management.

AES70[2] defines a scalable open control-protocol architecture for professional media networks, addressing device control and monitoring only.

AES-X238 aims to define network discovery, registration and authentication. Currently this is probably the largest 'missing piece'.

Initial work by BBC R&D on a Universal Media Composition Protocol[3] has now largely been replaced by efforts to update and extend the Audio Definition Model[4], including real time audio.

So far renderers have only been described within the limitations of specific codec implementations: Dolby AC-4 and MPEG-H. EBU R 147 calls for a generic Next Generation Audio (NGA) renderer.

The platform will need to be hosted on a high performance media transit network also capable of supporting the required monitoring, control and media-clock (timing) protocols in a distributed and/or cloud based solution. The Joint Task Force on Networked Media (JT-NM) has produced reference architecture and gap analysis documents.

ORPHEUS (https://orpheus-audio.eu/) is a European research project dedicated to improving the management of audio content to create new user experiences. The ten consortium partners aim to develop, implement and validate a new end-to-end object-based media chain for audio content.

BBC IP Studio (http://www.bbc.co.uk/rd/projects/ip-studio) is building a model for end-to-end broadcasting that will allow a live studio to run entirely on IP networks.

The Advanced Media Workflow Association (AMWA) has produced a series of Networked Media Open Specifications (NMOS, https://github.com/AMWA-TV/nmos).

The Society of Motion Picture and Television Engineers (SMPTE), the European Broadcasting Union (EBU), the International Telecommunication Union (ITU), the Advanced Television Systems Committee (ATSC), Digital Video Broadcasting (DVB), the European Telecommunications Standards Institute (ETSI) and of course the Audio Engineering Society (AES) are all working on standards that enable or contribute to the goal of OBB over IP.

7 Conclusion
Object Based Audio holds much potential for online and traditional broadcasting, although the required standards and the maturity of the technology are not quite ready. However, the demands of Next Generation Audio and immersive 3D gaming are likely to drive the development and implementation of OBB. This could prove to be the turning point for online media to become the primary focus of broadcasters and producers on their technology roadmaps.

8 References
[1] AES67-2015, AES standard for audio applications of networks - High-performance streaming audio-over-IP interoperability.
[2] AES70-2015, AES standard for audio applications of networks - Open Control Architecture.
[3] http://www.bbc.co.uk/rd/blog/2016-09-object-based-composition
[4] ITU-R BS.2076-1, Audio Definition Model; EBU Tech 3364, Audio Definition Model Metadata Specification; AES Convention Paper 9629.

