Adaptive Scalable Video Streaming in Wireless Networks
Siyuan Xiang
University of Victoria
Lin Cai
University of Victoria
Jianping Pan
University of Victoria
ABSTRACT
In this paper, we investigate the optimal streaming strategy for
dynamic adaptive streaming over HTTP (DASH). Specifically, we focus
on the rate adaptation algorithm for streaming scalable video
(H.264/SVC) in wireless networks. We model the rate adaptation
problem as a Markov Decision Process (MDP), aiming to find an optimal
streaming strategy in terms of user-perceived quality of experience
(QoE) such as playback interruption, average playback quality and
playback smoothness. We then obtain the optimal MDP solution using
dynamic programming. We further define a reward parameter in our
proposed streaming strategy, which can be adjusted to make a good
trade-off between the average playback quality and playback
smoothness. We also use a simple testbed to validate our solution.
Experiment results show the feasibility of the proposed solution and
its advantage over the existing work.
Categories and Subject Descriptors
C.2.5 [Local and Wide-Area Networks]: Internet; H.5.1
[Multimedia Information Systems]: Video
General Terms
Design, Performance, Experimentation
Keywords
Adaptive Video Streaming, Scalable Video Coding
1. INTRODUCTION
Progressive download is currently one of the most popular video
delivery techniques on the Internet. It has several advantages over
the traditional streaming techniques using RTSP/UDP. First, it is
simple to deploy. At the server side, any web server can host videos
and serve as a streaming server; at the client side, the user only
needs a Flash player or a web browser supporting HTML5 for video
playback. Second, the HTTP/TCP protocols used in progressive download
are more firewall and NAT friendly, and the congestion control
mechanism in TCP simplifies the design of the application layer.
Third, for progressive download, a server can store several versions
of a video to meet the requirements of heterogeneous users. Ideally,
a user can select the right version of the video according to the
device decoding capability, display size and available network
bandwidth.

Permission to make digital or hard copies of all or part of this work
for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage
and that copies bear this notice and the full citation on the first
page. To copy otherwise, to republish, to post on servers or to
redistribute to lists, requires prior specific permission and/or a fee.
MMSys'12, February 22-24, 2012, Chapel Hill, North Carolina, USA.
Copyright 2012 ACM 978-1-4503-1131-1/12/02 ...$10.00.
However, selecting the appropriate version of a video according to
the available bandwidth may not be easy for users, and their decisions
might be error-prone. In addition, with progressive download, the
client always downloads as much video data as possible. It is likely
that when a user turns off the video player or switches to another
video, a large amount of unwatched video has been buffered
unnecessarily, which wastes the resources of both the network and the
end-systems.
Dynamic adaptive streaming over HTTP (DASH) [13] is a promising
technique to overcome the aforementioned disadvantages of progressive
download. Videos encoded in different versions are chopped into small
segments. After the client receives one segment, it has a chance to
decide which version of the video to request for the next segment,
based on the current network condition. Thus, rate adaptation can be
performed at the client side naturally and flexibly. Also, the client
has a chance to control the client-side queue length to avoid
streaming buffer overflow, e.g., when the download rate is much higher
than the playback rate.
Currently, commercial adaptive streaming products such as Microsoft
Smooth Streaming and Apple Live Streaming use single-layer H.264/AVC
encoded videos. Multiple versions of a video with different
resolution, frame rate and quality are obtained by encoding the source
video multiple times with different configurations, and the different
versions of the video are completely independent of each other. Thus,
not only is more server storage space needed, but the web caching
hit-ratio is also reduced.
Recently, scalable video coding (H.264/SVC) has been introduced to
the DASH framework to improve the system performance [11]. With SVC, a
video is encoded only once but can be decoded many times with
different resolution, frame rate and quality. However, how to improve
the rate adaptation algorithm to provide users with a satisfactory
quality of experience (QoE) is still a challenging, open question. The
problem is even more challenging when a user uses a handheld device
and a wireless access link for video streaming, as handheld devices
typically have limited energy supply and computation capacity, and
wireless links are highly dynamic due to time-varying fading,
shadowing, interference and hand-off, all of which motivate this work.
In this paper, we investigate the optimal strategy for streaming
scalable video over HTTP in wireless networks. The main contributions
of this paper are twofold. First, we formulate the rate adaptation
problem as a finite Markov Decision Process (MDP), aiming to find an
optimal streaming strategy in terms of user-perceived QoE such as
playback interruption, average playback quality and playback
smoothness. We obtain the optimal streaming strategy by dynamic
programming under the reinforcement learning framework [14]. We
further define a reward parameter in our proposed strategy, which can
be adjusted to make a good trade-off between average playback quality
and playback smoothness. Second, we evaluate the proposed streaming
strategy and compare it with the existing work using a testbed with a
real sample video encoded by the SVC reference codec, JSVM [10]. The
experiment results show the advantage of our proposed solution.
The rest of the paper is organized as follows. Section 2
summarizes the related work. Section 3 formulates the op-
timal streaming problem as an MDP and describes the pro-
posed solution based on the reinforcement learning frame-
work. The testbed implementation and the experiment re-
sults are described and given in Section 4, followed by con-
cluding remarks and further research issues in Section 5.
2. BACKGROUND AND RELATED WORK
Different from the application layer multicasting [7], in a DASH
system, rate adaptation is conducted at the client side, which is also
called pull-based rate adaptation [3]. At the server side, a source
video is encoded into different versions with different resolution,
frame rate and quality. For each version, the video is divided into
small segments. A web server can host these segments and send them to
the clients upon HTTP requests. At the client side, after a user
clicks the play button, the streaming starts. The video player first
obtains the general information of the video, such as the number of
versions and the corresponding resolution, frame rate and quality of
each version. Then, the video player will decide the right version
according to its own display size, decoding capability and network
condition. Usually, the playback does not start until a sufficient
number of segments are received. After the client receives a segment
completely, the rate adaptation algorithm will decide which version to
request for the next segment based on the current network condition
and the client-side state such as the number of buffered segments. In
this way, the workload of the server is reduced dramatically. Fig. 1
shows the general workflow of the video player.
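The pull-based loop described above can be sketched as follows. This is only an illustration: `stream`, `fetch` and `decide` are hypothetical names that do not come from the paper, `fetch` stands in for the HTTP request/response and throughput measurement, and buffer draining by playback is deliberately left out.

```python
from typing import Callable, List


def stream(num_segments: int,
           fetch: Callable[[int, int], float],
           decide: Callable[[int, float, int], int],
           start_version: int = 1) -> List[int]:
    """Request segments one by one and pick the next version after each one.

    fetch(segment, version) is a hypothetical callback that downloads one
    segment and returns the measured throughput in Kbps.
    decide(version, throughput_kbps, buffered_segments) plays the role of the
    rate adaptation algorithm and returns the version for the next request.
    """
    version = start_version
    buffered = 0            # playback is not modeled, so this only grows
    requested = []
    for segment in range(num_segments):
        throughput = fetch(segment, version)   # send request, wait for response
        requested.append(version)
        buffered += 1                          # save content to buffer
        version = decide(version, throughput, buffered)
    return requested
```

For example, with made-up bit-rates `{1: 100, 2: 200, 3: 400}` Kbps and a rule that picks the highest version not exceeding the last measured throughput, the throughput sequence 150, 450, 90, 250 Kbps leads to the requests 1, 1, 3, 1.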
There are extensive research efforts on adaptive video streaming
over HTTP [13, 8, 2]. [13] introduced the 3GPP specification of
dynamic adaptive streaming over HTTP, which describes the framework of
the adaptive streaming system. In [2], commercial adaptive streaming
products including Microsoft Smooth Streaming, Netflix player and open
source media framework (OSMF) player were evaluated and compared. The
results show that the performance of these products still needs to be
improved substantially.
Liu et al. proposed a rate adaptation algorithm for adaptive video
streaming [8]. The decision of switching to a video version of a
higher or lower bit-rate is made based on the measured segment fetch
time, which can be converted to the average segment throughput and
buffer state. The algorithm is evaluated using constant bit-rate
(CBR), single-layer video traffic only, and the queue length may
sometimes exceed the maximum buffer size. In [6], a quality adaptation
controller based on the feedback control theory was proposed. The
controller tries to maintain the buffer level as stable as possible to
match the video bit-rate with the available bandwidth. As the server
needs to maintain the information for each user to perform rate
adaptation, the complexity of the server is increased.

[Figure 1: Video Player State Diagram. Nodes: begin; initiate client;
get video information; request the first segment; send HTTP request;
wait HTTP response; receive HTTP response; save content to buffer
(fetched by the decoder); measure avg. throughput; estimate bandwidth;
rate adaptation algorithm (inputs: video information; output: request
decision); termination.]
Recently, SVC has been introduced to adaptive video streaming. With
SVC, we can encode video once and decode the bitstream multiple times
with different resolution, frame rate and video quality [5], so the
server storage space and encoding time can be saved. In addition,
thanks to the layered structure of SVC, we may even upgrade an already
received segment to a higher quality [12]. [11] showed the advantage
of using SVC in adaptive HTTP streaming over the single-layer advanced
video coding (AVC) in terms of caching efficiency. In [12], the
authors proposed a priority-based media delivery strategy using SVC
with RTP and HTTP streaming. In the pre-buffering phase, the most
important base layer is transmitted first, so there are more
base-layer frames than enhancement-layer frames in the buffer. This
scheme was designed assuming that a temporary bandwidth reduction is
the only possible bandwidth variation, and that the bandwidth will
return to a normal level after the temporary reduction. Thus, it
cannot fully handle the random variation of network bandwidth.
Different from these existing approaches, in this paper, we focus on
the rate adaptation algorithm for streaming SVC video in wireless
networks, considering the random and less predictable variation of the
available bandwidth. We also consider the more general case where the
layered video is encoded in variable bit-rate (VBR).
3. PROBLEM FORMULATION
Considering the limited computation capacity of handheld devices and
the high variation of wireless access links, we formulate the optimal
rate adaptation problem as a finite Markov Decision Process, which can
deal with the random network condition with a relatively simple
approach that is feasible for handheld devices. For each video
segment, the client uses the MDP to decide which action to take given
the current client state. There are four components of an MDP, i.e.,
action, state, transition probability and reward. In the following, we
define them one by one.
As shown in Figure 1, after a segment is obtained completely, the
rate adaptation algorithm has a chance to decide the video version of
the next segment to be requested and whether the client should be idle
for a while to avoid buffer overflow. We define the sequential actions
as $\{a_t\}$ $(t = 0, 1, \ldots)$. $a_t$ is the decision made at step
$t$, where the step duration equals the time to retrieve one segment.
Note that the step duration is not a constant, since the segment
download time varies according to the segment size and the available
bandwidth. $L$ is the number of versions. The action set for a given
state is $A(s) = \{A_i, A_u, A_w\}$, where $A_i$
$(i = -L+1, \ldots, L-1)$ means to request the next segment $i$ layers
higher ($i \geq 0$) or lower ($i < 0$) than the current one, $A_u$
means to upgrade the last received segment to a higher version, and
$A_w$ means to wait for a time duration of $T_s$ ($T_s$ is the
constant playback time of a segment).
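As a small illustration, the action set can be enumerated for a given number of versions. The function names and the string labels are hypothetical, and the filtering in `feasible_switches` (keeping the requested version between 1 and L) is an assumption that the text does not spell out.

```python
from typing import List


def action_set(num_versions: int) -> List[str]:
    """Labels for A_i with i in -L+1..L-1, followed by A_u and A_w."""
    switches = [f"A{i:+d}" if i else "A0"
                for i in range(-num_versions + 1, num_versions)]
    return switches + ["Au", "Aw"]


def feasible_switches(current_version: int, num_versions: int) -> List[int]:
    """Offsets i for which the requested version v + i stays within 1..L.

    Assumption: versions are numbered 1..L and out-of-range requests are
    simply excluded.
    """
    return [i for i in range(-num_versions + 1, num_versions)
            if 1 <= current_version + i <= num_versions]
```

With three versions, `action_set(3)` yields seven actions, and a client currently at version 1 can only switch by 0, +1 or +2.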
We define a state at step $t$ as
$s_t = (q_t, \Delta q_t, v_t, \Delta v_t, bw_t, d_t)$. Here, $q_t$ is
the queue length in terms of the number of buffered frames. Obviously,
$q_t$ is in the range of $(0, F)$, where $F = B_T \times N_s$, $B_T$
is the target buffer size in terms of the number of segments, and
$N_s$ is the number of frames per segment. $\Delta q_t$ is the queue
length variation after a new segment has been retrieved, i.e.,
$\Delta q_t = q_t - q_{t-1}$, which indicates whether the requested
video's bit-rate matches the available bandwidth. $\Delta q_t$ is in
the range of $[-F, N_s]$. $v_t$ is the version index of the last
received segment. $\Delta v_t$ indicates the difference of video
versions requested in consecutive steps. $bw_t$ is the available
bandwidth at step $t$. $d_t$ is the number of received segments, which
is in the range of $[0, N_T]$, where $N_T$ is the total number of
segments the client needs to request. From the definition of the
states, we can observe that the Markov property exists, since all of
these states depend on their immediately previous state only, i.e.,

$$\Pr\{s_{t+1} \mid s_t, a_t, s_{t-1}, a_{t-1}, \ldots, s_0, a_0\} = \Pr\{s_{t+1} \mid s_t, a_t\}.$$
To obtain the state transition probability, the most challenging
issue is to obtain the model for $bw_t$. For most wireless streaming
scenarios, the bottleneck is often in the wireless access link, and
the finite-state Markov chain has been widely used to model the
variation of wireless channels [15, 17]. Thus, we use a discrete-time
finite-state Markov model to capture the variation of the bandwidth,
with the state transition probabilities obtained from the measurement
or derived from the wireless channel model [15]. Given the available
bandwidth for downloading the current segment, we can estimate the
probability distribution of the bandwidth for the next segment using
the state transition probability matrix of the Markov model.
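One simple way to obtain such a matrix from measurements is to quantize a bandwidth trace to a few levels and count transitions between consecutive samples. The sketch below assumes exactly that. The paper does not describe its estimation procedure, so the nearest-level quantization, the uniform fallback row and the function names are all illustrative choices.

```python
from typing import List, Sequence


def quantize(sample_kbps: float, levels: Sequence[float]) -> int:
    """Index of the bandwidth level closest to the sample."""
    return min(range(len(levels)), key=lambda k: abs(levels[k] - sample_kbps))


def estimate_transition_matrix(trace_kbps: Sequence[float],
                               levels: Sequence[float]) -> List[List[float]]:
    """Row-normalized transition counts between consecutive quantized samples."""
    n = len(levels)
    counts = [[0] * n for _ in range(n)]
    states = [quantize(x, levels) for x in trace_kbps]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A level that never appears as a source gets a uniform row so that
        # every row remains a probability distribution.
        matrix.append([c / total for c in row] if total else [1.0 / n] * n)
    return matrix
```

Row `k` of the result is then the estimated distribution of the bandwidth level for the next step, given that the current level is `k`.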
For the problem of our interest, we can derive the state transition
probability for the MDP by

$$P^{a}_{ss'} = \Pr\{s_{t+1} = s' \mid s_t = s, a_t = a\}. \qquad (1)$$

The state at step $t$ is $s = (q, \Delta q, v, \Delta v, bw, d)$. If
action $a_t = A_i$ is selected, then with probability
$P^{a}_{ss'} = \Pr\{bw' \mid bw\}$, the new state will be
$s' = (q', \Delta q', v', \Delta v', bw', d')$, i.e.,

$$\begin{aligned}
v' &= v + i, \quad \Delta v' = i, \\
q' &= q - (m^{v'}_{d+1} \times f)/bw' + N_s, \\
\Delta q' &= q' - q, \quad d' = d + 1,
\end{aligned} \qquad (2)$$

where $m^{v'}_{d+1}$ is the size of version $v'$ of segment $d+1$ and
$f$ is the playback frame rate (since we are dealing with stored video
streaming, the client can know the size of every segment). If
$a_t = A_u$, then the new state is

$$\begin{aligned}
v' &= v + 1, \quad \Delta v' = \Delta v + 1, \\
q' &= q - [(m^{v'}_{d} - m^{v}_{d}) \times f]/bw', \\
\Delta q' &= q' - q, \quad d' = d.
\end{aligned} \qquad (3)$$

Similarly, we can derive other state transition probabilities.

Table 1: Rewards Associated with States
$s_t = s$                               $R(s)$
$(*, *, *, *, *, N_T)$                  $0$
$(0, *, *, *, *, *)$                    $-F + \Delta q$
$(F^{+}, *, *, *, *, *)$                $-F - \Delta q$
$(*, \Delta q, *, \Delta v, *, *)$      $\min(-\alpha|\Delta v|, -|\Delta q|)$
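The deterministic part of these transitions can be sketched in code. The sketch reads Eq. (2) as v' = v + i, Δv' = i, q' = q - (m × f)/bw' + N_s and Eq. (3) as v' = v + 1, Δv' = Δv + 1, q' = q - ((m' - m) × f)/bw'. The `State` tuple, the `sizes` table layout and the units (sizes in Kbit, bandwidth in Kbps) are assumptions made for illustration, and the queue length is not clamped.

```python
from typing import NamedTuple, Sequence


class State(NamedTuple):
    q: float    # buffered frames
    dq: float   # change of q over the last step
    v: int      # version index of the last received segment (1-based)
    dv: int     # version difference between consecutive requests
    bw: float   # available bandwidth (Kbps)
    d: int      # number of received segments


def step_switch(s: State, i: int, bw_next: float,
                sizes: Sequence[Sequence[float]], f: float, n_s: int) -> State:
    """Request segment d+1 at version v+i (Eq. (2)).

    sizes[k][v-1] is the size in Kbit of version v of segment k+1.
    """
    v_next = s.v + i
    download_time = sizes[s.d][v_next - 1] / bw_next      # seconds
    q_next = s.q - download_time * f + n_s                # frames played vs. added
    return State(q_next, q_next - s.q, v_next, i, bw_next, s.d + 1)


def step_upgrade(s: State, bw_next: float,
                 sizes: Sequence[Sequence[float]], f: float) -> State:
    """Upgrade the last received segment d by one version (Eq. (3))."""
    v_next = s.v + 1
    extra = sizes[s.d - 1][v_next - 1] - sizes[s.d - 1][s.v - 1]
    q_next = s.q - extra / bw_next * f                    # no new frames arrive
    return State(q_next, q_next - s.q, v_next, s.dv + 1, bw_next, s.d)
```

The probability attached to each resulting state is the bandwidth transition probability from `s.bw` to `bw_next`.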
The reward in an MDP is the payoff obtained when a particular action
is taken at a state,

$$r_{t+1} = R(s_t = s), \qquad (4)$$

where $R$ maps the state to a reward. Table 1 lists the rewards
defined for different states. $*$ means any value for the state, and
$F^{+}$ means the number of buffered frames is larger than
$B_T \times N_s$. The reward of a state is looked up in the table from
top to bottom, using the reward of the first entry that matches the
current state. The values of the rewards need to be carefully
designed, since they are closely related to the control objective. The
stored video has a finite length, and when the state reaches
$d = N_T$, i.e., all the segments have been downloaded, the streaming
task completes, which is called an episodic task. Therefore, we give
state $(*, *, *, *, *, N_T)$ a reward of 0. Besides, any action taken
in this state will not change the state, i.e., the terminal state will
not affect the decision process. By giving the minimum reward when the
buffer is empty, we can minimize playback interruption; by giving a
negative reward to the state when the number of buffered frames is
larger than the maximum value, we can avoid buffer overflow. When both
$\Delta q$ and $\Delta v$ are 0, the maximum reward (0) is given,
since in these states, the playback should be smooth and the selected
video version matches the available bandwidth well.
In addition, we can associate a weight parameter $\alpha$ with the
reward to make a trade-off between average playback quality and
playback smoothness. When $\alpha$ is smaller, the video streaming can
be more adaptive to the available bandwidth to achieve higher average
playback quality; when $\alpha$ is larger, a higher priority is given
to the playback smoothness. Note that the reward is independent of the
bandwidth, since we are unable to control the varying bandwidth.
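The top-to-bottom lookup of Table 1 can be written as a short function. The signs and the placement of the weight follow one reading of the table: -F + Δq for an empty buffer, -F - Δq for overflow, and min(-α|Δv|, -|Δq|) otherwise. That reading, the symbol α and all argument names are assumptions of this sketch.

```python
def reward(q: float, dq: float, dv: int, d: int,
           n_total: int, f_max: float, alpha: float = 1.0) -> float:
    """Reward of a state, matched against the table rows from top to bottom.

    q, dq: buffered frames and their change; dv: version change;
    d: received segments; n_total: N_T; f_max: F = B_T * N_s.
    """
    if d == n_total:            # terminal state: all segments downloaded
        return 0.0
    if q <= 0:                  # empty buffer: playback interruption
        return -f_max + dq
    if q > f_max:               # F+: more frames buffered than the target
        return -f_max - dq
    # Regular state: penalize version switches (weighted) and queue drift.
    return float(min(-alpha * abs(dv), -abs(dq)))
```

For instance, with `f_max = 340`, a regular state with `dq = -5` and `dv = 1` receives -5 when `alpha = 1` and -10 when `alpha = 10`, so a larger weight makes version switches the dominant penalty.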
Finally, we can formulate the rate adaptation problem as an
optimization problem. The objective is to find a strategy $\pi(s)$ for
the action taken at a state $s$ to maximize the reward received in the
long run. The state-value function given a deterministic strategy
$\pi$ is thus

$$V^{\pi}(s) = \sum_{s'} P^{a}_{ss'} \left[ R(s) + \gamma V^{\pi}(s') \right], \qquad (5)$$

where $a = \pi(s)$ and $\gamma$ is the discounting rate,
$0 \leq \gamma \leq 1$. Note that in our case, we can set $\gamma$ to
1, since we are dealing with an episodic streaming task. An optimal
strategy $\pi^{*}(s)$ should maximize the state-value function in the
long run, i.e.,

$$\pi^{*}(s) = \arg\max_{a} \sum_{s'} P^{a}_{ss'} \left[ R(s) + \gamma V^{*}(s') \right], \qquad (6)$$

where $V^{*}(s)$ is the optimal value function. Then, we can obtain
the optimal streaming strategy using dynamic programming [14]. The
solution is a table that maps each state to an optimal action. During
the online decision making process, a table look-up can quickly
identify the action to take, which is simple and feasible for handheld
devices.
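A generic value-iteration routine illustrates how such a state-to-action table can be computed. This is a textbook sketch on an arbitrary finite MDP, not the authors' implementation: the dictionary layout, the stopping rule and the names are assumptions, and the reward is attached to the current state as in the value function above.

```python
from typing import Dict, Hashable, List, Tuple

# P[s][a] is a list of (probability, next_state); states with no actions are terminal.
Transitions = Dict[Hashable, Dict[Hashable, List[Tuple[float, Hashable]]]]


def value_iteration(P: Transitions, R: Dict[Hashable, float],
                    gamma: float = 1.0, tol: float = 1e-9,
                    max_sweeps: int = 10_000):
    """Return (V, policy) for a finite MDP with state rewards R[s]."""
    V = {s: 0.0 for s in P}

    def backup(s, outcomes):
        # Expected return of one action: sum over s' of p * (R(s) + gamma * V(s')).
        return sum(p * (R[s] + gamma * V[s2]) for p, s2 in outcomes)

    for _ in range(max_sweeps):
        delta = 0.0
        for s, actions in P.items():
            if not actions:
                continue
            best = max(backup(s, outcomes) for outcomes in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: backup(s, actions[a]))
              for s, actions in P.items() if actions}
    return V, policy
```

The returned `policy` dictionary plays the role of the look-up table: at run time the client only evaluates `policy[state]`.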
Furthermore, to reduce the number of states for the MDP and the input
size for dynamic programming, we divide the buffer size (in frames)
into small bins and index them as $q_b$, starting from 0 to
$B_T \times N_s / B_S$, where $B_S$ is the number of frames in each
bin. Then we use $q_b \times B_S$ to represent the number of buffered
frames for each bin.
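A minimal sketch of this binning follows. Rounding down to the bin boundary and capping at the top bin are assumptions, since the text does not say how a queue length inside a bin is rounded. The function names are hypothetical.

```python
def to_bin(q_frames: int, bin_size: int,
           target_segments: int, frames_per_segment: int) -> int:
    """Bin index q_b of a queue length, capped at B_T * N_s / B_S."""
    top = (target_segments * frames_per_segment) // bin_size
    return min(max(q_frames, 0) // bin_size, top)


def from_bin(q_b: int, bin_size: int) -> int:
    """Representative number of buffered frames for bin q_b."""
    return q_b * bin_size
```

With a bin size of 17 frames, a target buffer of 20 segments and 17 frames per segment, a queue of 40 frames falls into bin 2 and is represented by 34 frames.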
4. PERFORMANCE EVALUATION
In this section, we first define the QoE metrics in terms of playback
interruption, average playback quality and playback smoothness. Then
we evaluate the proposed solution and compare it with the existing
state-of-the-art rate adaptation algorithm [8] by experiments.
4.1 QoE Metrics
Interruption ratio: Every $1/f$ second ($f$ is the video frame rate),
the video player displays one frame, which is defined as one display
event. If there is no decoded frame available to display, playback
interruption occurs. Let $n_0$ be the number of display events for
which the frame to be displayed is not available, and let $n_t$ be the
total number of display events. The interruption ratio (IR) is defined
as $IR = n_0/n_t$.
Average playback quality: We define a continuous playback of Layer
$i$ video as one run and its length in terms of the number of display
events as $n_r$ for the $r$-th run. There are $N$ runs in total. The
layer index 0 denotes that a playback interruption happens. The
weighted sum of the layer index is used to measure the average
playback quality (APQ), which is defined as
$APQ = \sum_{r=1}^{N} (n_r \times i) / \sum_{r=1}^{N} n_r$, where $i$
is the layer index of the $r$-th run.
Playback smoothness [9]: Intuitively, a longer expected run length
leads to a smoother watching experience. Compared with the arithmetic
average, it also gives a fair evaluation when the length of one run is
much larger than the others. Thus the expected run length is used to
measure the playback smoothness (PS), and we have
$PS = \sum_{r=1}^{N} (n_r)^2 / \sum_{r=1}^{N} n_r$, i.e., each run is
weighted by its share of the display events.
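The three metrics can be computed from a per-display-event record of the layer being shown. The sketch below assumes such a record is available and takes the expected run length as the sum of squared run lengths divided by the total number of display events, which is an assumed normalization. The function name is hypothetical.

```python
from itertools import groupby
from typing import Sequence, Tuple


def qoe_metrics(layers: Sequence[int]) -> Tuple[float, float, float]:
    """Return (IR, APQ, PS) for layers[k] = layer shown at display event k.

    Layer index 0 marks an interruption, so interruptions form runs too.
    """
    n_t = len(layers)
    # Collapse the sequence into runs of (layer index, run length).
    runs = [(layer, len(list(group))) for layer, group in groupby(layers)]
    ir = sum(1 for layer in layers if layer == 0) / n_t
    apq = sum(layer * n for layer, n in runs) / n_t
    ps = sum(n * n for _, n in runs) / n_t
    return ir, apq, ps
```

For example, four events at Layer 1, two interrupted events and four events at Layer 2 give IR = 0.2, APQ = 1.2 and PS = 3.6, while a single uninterrupted run of 3400 events at Layer 1 gives APQ = 1 and PS = 3400.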
4.2 Experimental Settings
We have prototyped a scalable video streaming testbed [16] and used
it to evaluate the proposed streaming strategy and compare it with the
existing state-of-the-art solution [8]. The testbed used Lighttpd as
the streaming server, and the video player was implemented using an
open-source SVC decoder [4]. The web server and the video player
communicate through HTTP/TCP protocols, and a channel emulator was
used to simulate the varying bandwidth in wireless networks.

We used the open-source SVC codec JSVM [10] to encode the sample
video (Big Buck Bunny [1]) into three layers, and their configurations
are listed in Table 2. Note that the Y-PSNR of Layer 3 is lower than
that of Layer 2, but we still prefer Layer 3 video which has a higher
resolution, and it leads to a better watching experience when
displayed on a larger screen due to a higher dots per inch (DPI).
Obviously, PSNR is not an appropriate QoE performance index for SVC
encoded videos, so it is not considered in this paper. Each layer is
chopped into small segments of 17 frames. The total number of segments
$N_T$ is 200, and the frame rate is 24 frames per second. From
experiments, we found that the segment size of 17 frames is small
enough to react to the varying bandwidth. The playback starts when 4
segments are received.

Table 2: Layer Configuration
Layer index   Resolution   Avg. bit-rate (Kbps)   Std. dev. of bit-rate   Y-PSNR
1             320x180      112.84                 39.01                   35.47
2             320x180      238.94                 88.84                   39.44
3             640x360      363.82                 140.33                  35.90

Table 3: State Prob. and Available Bandwidth
State                     1       2        3        4
Bandwidth (Kbps)          50.32   180.63   260.38   550.75
Steady state prob. (P1)   0.026   0.102    0.407    0.465
Steady state prob. (P2)   0.103   0.256    0.385    0.256
To maximize the spectrum efficiency and limit the packet error rate,
broadband wireless systems (such as 3G and WLAN) can adjust the
transmission data rate according to the wireless channel quality using
adaptive modulation and coding techniques. When the channel quality is
good, a higher data rate is used, and vice versa. As the wireless
channel condition may change randomly, the finite-state Markov model
has been widely used to describe the variation of wireless channel
conditions [15, 17], and thus it can also be used to describe the
wireless link data rate variation for broadband wireless systems.
Since the wireless access link is presumably the bottleneck, we used a
discrete-time four-state Markov model to capture the variation of the
available bandwidth, and the duration of the time step for the Markov
model is constant (700 ms in our experiment, which is close to the
segment playback duration). We used two sets of probability transition
matrices for two different wireless link profiles. The two matrices,
$P_1$ and $P_2$, are, respectively,

$$P_1 = \begin{bmatrix}
0.5 & 0.05 & 0.05 & 0.4 \\
0.2 & 0.25 & 0.2 & 0.35 \\
0.2 & 0.1 & 0.2 & 0.5 \\
0.1 & 0.1 & 0.1 & 0.7
\end{bmatrix}, \qquad
P_2 = \begin{bmatrix}
0.25 & 0.75 & 0 & 0 \\
0.3 & 0.4 & 0.3 & 0 \\
0 & 0.2 & 0.6 & 0.2 \\
0 & 0 & 0.375 & 0.625
\end{bmatrix}.$$
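To make the channel model concrete, the sketch below drives a four-state chain with the second transition matrix and the four bandwidth levels of Table 3. It is only a stand-in for the channel emulator, whose implementation is not described in the paper. The function names and the fixed seed are illustrative.

```python
import random
from typing import List, Sequence

BANDWIDTH_KBPS = [50.32, 180.63, 260.38, 550.75]   # states 1..4 (Table 3)
P2 = [[0.25, 0.75, 0.0, 0.0],
      [0.3, 0.4, 0.3, 0.0],
      [0.0, 0.2, 0.6, 0.2],
      [0.0, 0.0, 0.375, 0.625]]


def expected_next_bandwidth(P: Sequence[Sequence[float]], state: int) -> float:
    """Mean bandwidth (Kbps) one step ahead, given the current state index."""
    return sum(p * bw for p, bw in zip(P[state], BANDWIDTH_KBPS))


def simulate(P: Sequence[Sequence[float]], steps: int,
             start: int = 0, seed: int = 0) -> List[float]:
    """Bandwidth trace with one sample per time step (700 ms in the paper)."""
    rng = random.Random(seed)
    state, trace = start, []
    for _ in range(steps):
        trace.append(BANDWIDTH_KBPS[state])
        # Draw the next state from the row of the current state.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return trace
```

Because the matrix is tridiagonal, a simulated trace only ever moves between neighboring bandwidth levels from one step to the next.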
The average bandwidth and steady state probabilities of the wireless
link under different profiles are listed in Table 3. The first
difference is that the average bandwidths of the two profiles are
377.6 Kbps and 292.84 Kbps, respectively. The other difference is that
a link with $P_1$ has a smaller average fading duration than that with
$P_2$, i.e., given the link is in a worse channel condition, the link
with $P_2$ may stay in the worse condition for a longer duration on
average.
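The difference in fading duration can be checked numerically. The sketch treats the two lowest-bandwidth states as the "worse channel condition", which is an assumption, and computes the expected number of steps the chain spends in that set before leaving it, by iterating t = 1 + Q t with Q the transition matrix restricted to the set.

```python
from typing import Dict, Sequence

P1 = [[0.5, 0.05, 0.05, 0.4],
      [0.2, 0.25, 0.2, 0.35],
      [0.2, 0.1, 0.2, 0.5],
      [0.1, 0.1, 0.1, 0.7]]
P2 = [[0.25, 0.75, 0.0, 0.0],
      [0.3, 0.4, 0.3, 0.0],
      [0.0, 0.2, 0.6, 0.2],
      [0.0, 0.0, 0.375, 0.625]]


def mean_time_in_set(P: Sequence[Sequence[float]], bad: Sequence[int],
                     sweeps: int = 10_000, tol: float = 1e-12) -> Dict[int, float]:
    """Expected steps spent in `bad` before first leaving it, per start state."""
    t = {s: 0.0 for s in bad}
    for _ in range(sweeps):
        # One step is always spent in the current state, then the chain either
        # stays inside the set (and keeps accumulating) or leaves it.
        new = {s: 1.0 + sum(P[s][u] * t[u] for u in bad) for s in bad}
        if max(abs(new[s] - t[s]) for s in bad) < tol:
            return new
        t = new
    return t
```

Starting from the lowest state, the chain stays in the two lowest states for about 2.2 steps on average under the first matrix and 6 steps under the second, which agrees with the longer fading duration described for the second profile.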
One challenge for the MDP model used in Section 3 is that the MDP
model requires the state transition probability after one segment is
downloaded, but the segment download duration may not be a constant.
In this paper, the segment download time is approximated by a constant
to obtain the state transition matrix for the MDP. According to the
experiment results, such approximation is acceptable. One reason is
that the proposed rate adaptation strategy prefers to select the video
version to match the available bandwidth, so the time to download a
segment does not vary severely. Also, the proposed rate adaptation
algorithm can tolerate the inaccuracy in the Markov model used, which
will be discussed further near the end of Section 4.3.

Table 4: Playback Performance
P    BT   ALG     IR     APQ    PS        Max queue
P1   20   RA      0      2.03   117.30    23
P1   20   OS      0      2.22   189.72    20.5
P1   30   RA      0      1.91   124.45    33
P1   30   OS      0      2.19   237.37    30.2
P1   -    FL(3)   0.07   2.78   314.7     13.6
P2   20   RA      0      1.68   155.27    23
P2   20   OS      0      1.88   246.54    20.4
P2   30   RA      0      1.60   200.5     33
P2   30   OS      0      1.87   268.32    30.1
P2   -    FL(2)   0.03   1.93   1176.74   24.7
Since the video has three layers (versions), the set of actions of
state $s$ is $A(s) = \{A_{-2}, A_{-1}, A_0, A_1, A_2, A_u, A_w\}$. For
$A_w$, the client will wait for 700 ms (one time step) before sending
the HTTP request of the next segment. The bin size is set to 17 to
reduce the number of states. By dynamic programming, the action for
every possible state can be obtained offline. During the video
download process, the client only needs to look up the table to make
the decision. Although the number of states is not small, since each
state is unique, we can index and store each state and action at a
unique location. Then looking up the action for a particular state
takes only O(1) time.
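One way to give every state a unique location is a mixed-radix index over the discretized state components. The paper does not say which indexing scheme the testbed uses, so the following is only a sketch with hypothetical names. It assumes each component has already been shifted and binned into a range starting at 0.

```python
from typing import Sequence


def flat_index(digits: Sequence[int], sizes: Sequence[int]) -> int:
    """Mixed-radix index of a state; digits[k] must lie in range(sizes[k])."""
    index = 0
    for digit, size in zip(digits, sizes):
        if not 0 <= digit < size:
            raise ValueError("state component out of range")
        index = index * size + digit
    return index
```

The policy can then be stored as a flat list whose length is the product of `sizes`, and `policy[flat_index(state, sizes)]` retrieves the action in constant time.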
4.3 Performance Comparison
We compare our proposed optimal streaming (OS) strategy with the rate
adaptation (RA) algorithm in [8]. With RA, a client may be idle for
some time to avoid buffer overflow, but it does not explicitly set the
target buffer size. To make the comparison fair, we set the minimum
buffered media time in RA the same as the target buffer size ($B_T$)
in our solution. We repeated the experiments for each algorithm 10
times under each configuration, and the results presented here are the
average over 10 runs.
Table 4 compares the two algorithms with different target buffer
sizes and wireless conditions. We also include another algorithm FL(3)
for $P_1$, which fixes the layer index to 3, as the average network
bandwidth is larger than the average bit-rate of Layer 3. Due to the
variation of segment size and network bandwidth, FL(3) suffers from
stuttering, which is very unpleasant for the watching experience.
FL(2) of $P_2$ shows similar results. From the table, both the
proposed OS and the existing RA algorithms can make the playback free
of interruption. OS has the advantage over RA in terms of both APQ and
PS in all cases (for $P_1$, OS uses $\alpha = 1$; for $P_2$, OS uses
$\alpha = 2$). RA assumes that the size of a segment is large enough
such that the average download throughput of a segment can be used to
represent the average available bandwidth for the following segment.
For a highly varying wireless link, even with a very large segment,
the throughput of the next segment can be quite different. In
addition, OS controls the queue length better than RA, because RA
conservatively estimates the throughput of the next segment as the
bit-rate of the lowest version of the video.
Figs. 2(a) and (b) show the playback traces during the experiments
and the corresponding buffer occupancy states with OS and RA,
respectively, where the transition matrix is $P_1$ and $B_T = 20$.
From these figures, RA encounters more frequent layer switching with a
lower smoothness than OS. The maximum queue length of RA is also
larger.

[Figure 2: Performance Comparison. Panels: (a) OS ($\alpha = 1$), (b)
RA. Each panel contains two plots against time (ms): (A) Playback
index and requested segments (playback layer index) and (B) Buffer
state (segment buffer queue length, in segments).]

[Figure 3: A Zoom-in of Playback Trace. Playback layer index against
time (ms). Annotations: action $A_u$, requesting a layer segment to
"upgrade" the last segment; action $A_w$, idle for about 700 ms.]
We further zoom in on the playback trace for OS. In Figure 3, each
black rectangle represents a segment, and its width denotes the
download time duration (from the time instant of sending the HTTP
request to the instant the segment is completely received). The
horizontal gap between edges of rectangles is due to the waiting
action to avoid buffer overflow. Figure 3 shows the advantage of the
proposed video streaming framework using SVC: the rectangles rising
from a non-zero layer index (circled and annotated by arrows) are the
layer segments that upgrade the already buffered segments to improve
both APQ and PS, which is not possible when using the traditional AVC
streaming techniques.
With OS, another flexibility is that we can make a trade-off between
APQ and PS by adjusting the reward parameter $\alpha$. When we
increase the value of $\alpha$ to 2, as shown in Table 5, the APQ is
reduced slightly from 2.22 to 2.14 and the expected run length is
increased from 189.72 to 333.56. When $\alpha = 3$, a similar trend is
observed. When we set $\alpha$ to 10, the APQ is 1 and the run length
is 3400, which is the extreme case in which any action involving layer
switching is avoided.
Table 5: Trade-off between APQ and PS (P = P1)
$\alpha$   IR   APQ    PS       Max queue
1          0    2.22   189.72   20.5
2          0    2.14   333.56   20.5
3          0    2.11   410.74   20.1
10         0    1      3400     20

Last but not least, one practical issue is that the Markov model used
for the varying bandwidth may not be accurate. It is important to test
how sensitive the performance of OS is to the model accuracy. In the
test, we used $P_2$ as the Markov model to drive the channel emulator
in the experiment, and used both $P_1$ and $P_2$ in the dynamic
programming to obtain the action decisions. The results are shown in
Table 6. OS($P_i$) means that the streaming strategy is obtained using
transition matrix $P_i$ for $i = 1, 2$. From the table, when the
matrix in the decision process does not match the real situation, the
performance degrades slightly, but remains in a tolerable range. Also,
comparing the results in Tables 4 and 6, even with a mismatched model
for the available bandwidth, the proposed OS still substantially
outperforms RA in terms of both APQ and PS.

Table 6: Model Sensitivity Test
Env   BT   ALG      IR   APQ    PS       Max queue
P2    20   OS(P1)   0    1.92   183.79   20
P2    20   OS(P2)   0    1.88   246.54   20.4
P2    30   OS(P1)   0    1.88   229.58   29.9
P2    30   OS(P2)   0    1.87   268.32   30.1
5. CONCLUSIONS
In this paper, for DASH-based adaptive video streaming in wireless
networks, we have formulated the rate adaptation problem as an MDP and
used dynamic programming to solve the problem. The trade-off between
the average video quality and playback smoothness can be made by
adjusting the parameter in the reward function. Experiment results
have shown that the proposed solution is feasible and substantially
outperforms the existing one [8]. There are several issues worth
further investigation. To fully utilize the layered feature of SVC, we
may consider other possible actions, such as upgrading multiple
previously received segments when possible. However, more actions
increase the size of the action set and require more information to
describe the system state, which increases the number of states and
the system complexity. The proposed streaming policy is an off-line
solution requiring the bandwidth transition probability matrix. How to
design an on-line algorithm to estimate the bandwidth transition
matrix is left for future investigation. Nevertheless, the proposed
solution is robust against bandwidth estimation errors, so it is
promising even without accurate knowledge of the channel profile.
Also, the number of states for the MDP in this paper is considerably
large, which requires a large memory space. It is desirable to reduce
the number of states without substantially sacrificing the
performance. Besides rate adaptation, other issues such as how to
organize the layer segments efficiently and optimize the segment size
require further research.
6. REFERENCES
[1] Big Buck Bunny. http://www.bigbuckbunny.org.
[2] S. Akhshabi, A. C. Begen, and C. Dovrolis. An experimental
evaluation of rate-adaptation algorithms in adaptive streaming over
HTTP. In ACM MMSys'11, pages 157-168, New York, NY, USA, 2011.
[3] A. Begen, T. Akgul, and M. Baugher. Watching video over the web:
Part 1: Streaming protocols. IEEE Internet Computing, 15(2):54-63,
March-April 2011.
[4] M. Blestel and M. Raulet. Open SVC decoder: a flexible SVC
library. In ACM MM'10, pages 1463-1466, New York, NY, USA, 2010.
[5] L. Cai, S. Xiang, Y. Luo, and J. Pan. Scalable modulation for
video transmission in wireless networks. IEEE Trans. on Veh. Tech.,
2011.
[6] L. De Cicco, S. Mascolo, and V. Palmisano. Feedback control for
adaptive live video streaming. In ACM MMSys'11, pages 145-156, New
York, NY, USA, 2011.
[7] M. Kobayashi, H. Nakayama, N. Ansari, and N. Kato. Robust and
efficient stream delivery for application layer multicasting in
heterogeneous networks. IEEE Transactions on Multimedia,
11(1):166-176, Jan. 2009.
[8] C. Liu, I. Bouazizi, and M. Gabbouj. Rate adaptation for adaptive
HTTP streaming. In ACM MMSys'11, pages 169-174, New York, NY, USA,
2011.
[9] S. Nelakuditi, R. Harinath, E. Kusmierek, and Z. Zhang. Providing
smoother quality layered video stream. In ACM NOSSDAV'00, June 2000.
[10] J. Reichel, H. Schwarz, and M. Wien. Joint scalable video model
11 (JSVM 11). Joint Video Team, Doc. JVT-X, 2007.
[11] Y. Sanchez, T. Schierl, C. Hellge, T. Wiegand, D. Hong, D. De
Vleeschauwer, W. Van Leekwijck, and Y. Lelouedec. iDASH: improved
dynamic adaptive streaming over HTTP using scalable video coding. In
ACM MMSys'11, pages 257-264, New York, NY, USA, 2011.
[12] T. Schierl, Y. Sanchez de la Fuente, R. Globisch, C. Hellge, and
T. Wiegand. Priority-based media delivery using SVC with RTP and HTTP
streaming. Multimedia Tools and Applications, 55:227-246, 2011.
[13] T. Stockhammer. Dynamic adaptive streaming over HTTP: standards
and design principles. In ACM MMSys'11, pages 133-144, New York, NY,
USA, 2011.
[14] R. Sutton and A. Barto. Reinforcement learning: An introduction.
MIT Press, Cambridge, MA, 1998.
[15] H. S. Wang and N. Moayeri. Finite-state Markov channel: a useful
model for radio communication channels. IEEE Trans. on Veh. Tech.,
44(1):163-171, 1995.
[16] S. Xiang. Scalable streaming.
https://sites.google.com/site/svchttpstreaming/.
[17] R. Zhang and L. Cai. A packet-level model for UWB channel with
people shadowing process based on angular spectrum analysis. IEEE
Trans. on Wireless Comm., 8(8):4048-4055, Aug. 2009.