Abstract—Intrinsic image decomposition is an important problem that targets the recovery of shading and reflectance components from a single image. The problem is ill-posed on its own; we propose a novel approach for intrinsic image decomposition based on reflectance sparsity priors that we have developed. Our sparse representation of reflectance rests on a simple observation: neighboring pixels with similar chromaticities usually have the same reflectance. We formalize and apply this sparsity constraint on local reflectance to construct a data-driven second-generation wavelet representation, and we show that the reflectance component of natural images is sparse in this representation. We further formulate a global sparse constraint on reflectance colors using the assumption that each natural image contains only a small set of material colors. Using this sparse reflectance representation and the global constraint on a sparse set of reflectance colors, we formulate a constrained ℓ1-norm minimization problem for intrinsic image decomposition that can be solved efficiently. Our algorithm can successfully extract intrinsic images from a single image, without using color models or any user interaction. Experimental results on a variety of images demonstrate the effectiveness of the proposed technique.
the earlier constrained ℓ1-norm minimization problem. We show that using this prior can improve the recovery of the global structures of shading and reflectance, which in turn leads to further improvements in our intrinsic image decomposition.

The rest of the paper is organized as follows. Section 2 discusses related work. The new sparse priors are introduced in Section 3, while the optimization framework for performing intrinsic image decomposition using the proposed priors is described in Section 4. Section 5 presents experimental results on various test images. Finally, concluding remarks are presented in Section 6.

2 RELATED WORK

The problem of intrinsic image decomposition into reflectance and shading components was first introduced by Barrow and Tenenbaum [5]. The reflectance component describes the intrinsic albedo of a surface, which is illumination-invariant. The shading component corresponds to the amount of light reflected from the surface, which depends on the surface geometry, the reflection function and the illumination condition.

Some previously proposed methods use additional information from multiple images to resolve the inherent ambiguities. For example, registered images captured under different illumination conditions can be used [6], [7], [8]. The approach by Troccoli and Allen [9] used a laser scan of the scene and multiple lighting and viewing conditions to perform relighting and to estimate reflectance.

To overcome the severely ill-posed nature of the problem, previous methods for intrinsic image decomposition from a single image used either strong priors or assumptions. Using the Retinex strategy, local derivatives can be analyzed in order to distinguish between shading-induced and reflectance-induced image variations [1], [2], [10], [11], [12]. Training-based approaches have also been proposed to classify image derivatives into reflectance changes or shading changes [13], [14], [15]. With trained classifiers, Tappen et al. obtained good decomposition results from a single image by solving a global optimization problem with belief propagation [14]. A major drawback of these previous methods is that the decomposition is analyzed locally within a small window. One exception is the work of Shen et al. [16], which proposed a global optimization algorithm incorporating both the Retinex constraint and a non-local texture constraint to obtain global consistency of image structures.

More recently, a user-assisted method has been proposed by Bousseau et al. [17]. Focusing on diffuse objects, they used the assumption that local reflectance colors lie on a plane and derived a closed-form least-squares system which can be solved together with additional user-supplied constraints. Their method obtained impressive results on the presented test images. However, the method requires precise user strokes, and their “color plane” assumption on local reflectance values is incompatible with many practical cases such as multi-color surfaces and gray-scale input images.

In contrast, our priors are independent of color models on local surfaces. Furthermore, by using the two new global sparse priors on reflectance, the proposed method can automatically recover the intrinsic images from a single image without additional information.

Our method is partially inspired by the work of Fattal [18] on the construction of data-dependent second-generation wavelets for edge-preserving image processing. Different from first-generation wavelets, which consist of translates and dilates of a single pair of scaling and wavelet functions, second-generation wavelets allow these functions to adapt to the spatial particularities of the data. The lifting scheme, first introduced by Sweldens [3], is an efficient implementation of the fast wavelet transform that constructs bi-orthogonal wavelets entirely in the spatial domain. Fattal [18] proposed edge-avoiding wavelets (EAW), constructed using a data-prediction lifting scheme based on the edge content of the input image. In this paper, we utilize the lifting scheme [3] to construct a new data-dependent multi-resolution analysis (MRA) based on the local reflectance sparseness, using chromaticity information.

3 SPARSE PRIORS ON REFLECTANCE

In this section, we show how to derive the proposed sparse representation of the reflectance component of natural images from a simple local constraint on reflectance, and we formulate the global sparsity constraint on reflectance based on this representation. We also present the sparse prior on reflectance spectra and show how to use that prior by introducing a total-variation-like cost term.

3.1 Sparse Reflectance Representation

3.1.1 Local Sparseness of Reflectance

Our method is based on an observation of local sparseness of reflectance: neighboring pixels with similar chromaticity have similar reflectance. We can exploit this observation to build a local sparse representation of reflectance by minimizing the following cost function:

J(R) = \sum_i \Big\| R(i) - \sum_{j \in N_i} \bar{w}_{ij} R(j) \Big\|_1 ,   (1)

where R(i) is the RGB vector that represents the reflectance of pixel i, and N_i is the set of neighboring pixels of i. The \bar{w}_{ij} are normalized non-negative weights which sum to one. Each weight should be large when the two neighboring chromaticities are similar, and small when they are different.

The normalized weight \bar{w}_{ij} is derived from a weighting function w_{ij}. We define w_{ij} based on the difference between the chromaticities of pixels i and j.
Fig. 3. (a) A synthetic image with 4 homogeneous regions. (b) RBW transform with equal weighting. (c) Proposed
WRBW transform; note that all the coefficients are 0 except for the top left corner as shown in the zoomed-in box. (d)
A 3D plot of the WRBW coefficients (across RGB).
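To make the local sparseness prior concrete, the short Python sketch below evaluates the cost of Equation (1) on a 4-connected neighborhood. The exact weighting function of Equation (2) lies outside this excerpt, so a Gaussian of the chromaticity difference with an assumed bandwidth sigma is used as a placeholder, and the periodic boundary handling via np.roll is a simplification.

import numpy as np

def chromaticity(img):
    """Per-pixel chromaticity: each RGB vector normalized by its channel sum."""
    s = img.sum(axis=2, keepdims=True)
    return img / np.maximum(s, 1e-8)

# 4-connected neighborhood offsets (up, down, left, right).
NEIGHBOR_SHIFTS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def normalized_weights(chroma, sigma=0.05):
    """Normalized neighbor weights w_bar_ij, assumed here to be a Gaussian of the
    chromaticity difference (a placeholder for Equation (2)), normalized so the
    weights at each pixel sum to 1."""
    H, W, _ = chroma.shape
    w = np.zeros((len(NEIGHBOR_SHIFTS), H, W))
    for k, (dy, dx) in enumerate(NEIGHBOR_SHIFTS):
        cj = np.roll(chroma, (dy, dx), axis=(0, 1))   # periodic boundary (simplification)
        w[k] = np.exp(-((chroma - cj) ** 2).sum(axis=2) / (2.0 * sigma ** 2))
    total = w.sum(axis=0, keepdims=True)
    return np.where(total > 1e-8, w / np.maximum(total, 1e-8), 0.0)

def local_sparseness_cost(R, w_bar):
    """J(R) from Equation (1): sum over pixels i of ||R(i) - sum_j w_bar_ij R(j)||_1."""
    prediction = np.zeros_like(R)
    for k, (dy, dx) in enumerate(NEIGHBOR_SHIFTS):
        prediction += w_bar[k][..., None] * np.roll(R, (dy, dx), axis=(0, 1))
    return float(np.abs(R - prediction).sum())

For a candidate reflectance R, J(R) is small exactly when each pixel's reflectance is well explained by a chromaticity-weighted average of its neighbors, which is the local sparseness observation above; the convention of zeroing the weights where all neighboring chromaticities differ strongly belongs to Equation (2) and the multi-scale weights discussed below.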
updated using the computed detail coefficients stored at the black pixel locations. The updated red pixels are decomposed further into the blue and yellow subsets as shown in Fig. 1. The yellow pixels are predicted using their four diagonally-located neighbors at the blue pixel locations, and the computed detail coefficients, d^{k+1}, are stored at the yellow pixels. Finally, the blue pixels are updated using the computed detail coefficients stored at the yellow pixel locations, and the computed approximation coefficients, a^{k+1}, are stored at the blue pixels. Fig. 2 shows the lifting scheme of the forward transform of the weighted red-black wavelets (WRBW). By inverting each of these lifting steps, an image can be reconstructed from the wavelet coefficients. We perform K = \log_2(\min(w, h)) levels of decomposition, where w and h are respectively the width and height of the image.

The predict and update steps of the Red-Black stage are defined by Equation (3) and Equation (4) respectively; the predict and update steps of the Blue-Yellow stage are similar and only differ in the neighborhood used. The multi-scale weights, \bar{w}^k_{ij}, are normalized from the weights computed using Equation (2) with the chromaticity information at every scale. At coarser scales, the neighboring chromaticities around a pixel might be significantly different; for such pixels, the normalized weights \bar{w}^k_{ij} are set to zero. It is interesting to note that the proposed set of wavelet weights actually contains information about the chromaticity configuration of the image at every scale.

By using the weighted scheme, the proposed wavelets are designed with a support that is biased towards neighboring pixels with similar chromaticity values. At the predict step, the prediction operation is the same as the term being summed in Equation (1). With the weights used, the prediction of each reflectance value is weighted more towards neighboring reflectance with similar chromaticity values, leading to most of the detail coefficients, \{d^k\}_{k=1}^{K}, being zero or close to zero, except at pixels where \sum_{j \in N_i} w^k_{ij} = 0. At the update step, instead of merely preserving the approximation average, the update of each reflectance value is again weighted more towards neighboring reflectance with similar chromaticity values. This can thus be regarded as a chromaticity-distribution-preserving down-sampling that attempts to keep local reflectance values as close to each other as possible at each scale. Overall, the WRBW transform is expected to lead to sparse reflectance components due to the combination of chromaticity-distribution-preserving down-sampling and chromaticity-based weighted prediction.

Fig. 3 illustrates the sparse nature of the proposed WRBW representation for an image satisfying the local sparseness constraint, where we use a synthetic image with 4 homogeneous color regions of different chromaticities. The detail coefficients are zero where the input image is flat, resembling the transform results of the original RBW. Near the edges, since the proposed wavelets are designed with a support that avoids containing both the edge and the pixels with different chromaticities, the wavelets' response to such edges diminishes. For this synthetic image, the coefficients obtained by the proposed WRBW transform are all zero except the four approximation coefficients at the coarsest level a_K. Compared to the RBW coefficients, our WRBW coefficients show stronger sparsity.

3.2 Sparse prior on reflectance component

We formulate a global sparse constraint on the reflectance component of natural images by using the multi-scale representation described in Section 3.1. We denote the WRBW forward transform operator by B_w, and the backward transform operator by B_w^{-1}. Then, the reflectance component of a natural image can be represented in the wavelet domain as

\tilde{R} = B_w R,

where \tilde{R} denotes the wavelet coefficients of the reflectance.

Recall from Equation (2) that when the chromaticities of the neighboring pixels around a pixel are significantly different, w_{ij} = 0 for all neighbors; therefore, from Equation (3), \tilde{R}(i) stores the actual reflectance value at that location and scale. When carrying out the initial wavelet decomposition, we keep track of this set of locations, \Gamma, where the chromaticities of neighboring pixels are significantly different.
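To complement the description of the lifting steps above, the sketch below implements a single weighted Red-Black predict/update stage in Python, taking per-pixel normalized neighbor weights such as those computed in the weight sketch of Section 3.1.1. The simplifications are stated in the docstring; a full WRBW would iterate this stage together with the Blue-Yellow stage over K = log2(min(w, h)) levels, recomputing the weights from the chromaticities at each scale.

import numpy as np

NEIGHBOR_SHIFTS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # 4 axial neighbors

def weighted_neighbor_avg(X, w_bar):
    """Weighted average of the 4 axial neighbors; w_bar[k] holds the per-pixel
    normalized weight of neighbor k (zero where the weights were zeroed out)."""
    out = np.zeros_like(X)
    for k, (dy, dx) in enumerate(NEIGHBOR_SHIFTS):
        out += w_bar[k][..., None] * np.roll(X, (dy, dx), axis=(0, 1))
    return out

def weighted_red_black_stage(X, w_bar):
    """One weighted Red-Black predict/update stage on an H x W x 3 array.

    As described above, detail coefficients are stored at the black pixels and
    the updated red pixels carry the approximation (to be decomposed further by
    the Blue-Yellow stage, not shown). The 1/2 update factor is borrowed from
    the unweighted red-black wavelets, since Equation (4) is not reproduced in
    this excerpt; which checkerboard parity is called red or black is a
    convention, and periodic boundaries via np.roll are a simplification.
    """
    H, W, _ = X.shape
    yy, xx = np.mgrid[0:H, 0:W]
    black = (((yy + xx) % 2) == 1)[..., None]
    # Predict: detail = value - weighted average of the 4 axial (red) neighbors.
    detail = np.where(black, X - weighted_neighbor_avg(X, w_bar), 0.0)
    # Update: red pixels absorb a weighted average of the neighboring details.
    approx = np.where(black, 0.0, X + 0.5 * weighted_neighbor_avg(detail, w_bar))
    return approx, detail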
4.1 Optimization

Assuming that the illumination component changes smoothly over the scene, we can apply a smoothness constraint on L by adding a Laplacian-based cost at all locations:

E_{smooth} = \sum_i \| \Delta L(i) \|^2 ,

where \Delta denotes the Laplacian operator. This smoothness regularization of L both ensures that every pixel has an equation that constrains it and controls the smoothness of the illumination component.

We substitute L by I - R and express this smoothness constraint on the illumination with the following cost term:

E_{smooth} = \| \Delta L \|_2^2 = \| \Delta I - \Delta B_w^{-1} \tilde{R} \|_2^2 .

The smoothness constraint can be considered to be a set of measurements on the reflectance coefficients, i.e.,

y \approx A \tilde{R},

where

A = \Delta B_w^{-1} \quad \text{and} \quad y = \Delta I.   (7)

If surfaces in the scene are diffuse or near-diffuse, we can assume that the input image chromaticity is the same as the reflectance chromaticity. The weights of the WRBW transform are thus computed according to Equation (2) using the chromaticity of the input image.

To recover R, we solve the following constrained ℓ1-norm minimization problem by using the sparse reflectance representation prior from Equation (5) together with the smoothness constraint on illumination:

\min_{\tilde{R}} \| \Lambda \tilde{R} \|_1 \quad \text{s.t.} \quad A \tilde{R} = y .

This optimization problem can be solved using an ℓ1-regularized least-squares solver, e.g., [25], [26], by rewriting the optimization problem as:

\min_{\tilde{R}} \| A \tilde{R} - y \|_2^2 + \lambda \| \Lambda \tilde{R} \|_1 ,   (8)

where \lambda is a regularization parameter.

Further including the sparse prior on reflectance spectra from Equation (6), we obtain the following optimization:

\min_{\tilde{R}} \| A \tilde{R} - y \|_2^2 + \lambda \| \Lambda \tilde{R} \|_1 + \mu \| T \tilde{R} \|_1 ,   (9)

where \mu is an additional regularization parameter.

We note here that we can take advantage of the fact that the A and A^T operators can be implemented efficiently without the need to perform full matrix multiplication. The inverse WRBW transform, B_w^{-1}, can be computed using wavelet lifting, while the inverse dual WRBW transform, B_w^{-T}, can also be computed using wavelet lifting by switching the order of the predict and update steps and manipulating the weights used. Moreover, the Laplacian operator, \Delta, can be implemented as an image filter.

Fig. 5. Separation results illustrating soft matting refinement for the “paper2” image. (a) Original image. (b) Before refinement (zoom-in of yellow box). (c) After refinement (zoom-in of yellow box).

4.2 Soft matting

Small changes in the reflectance component that are co-located with those in the chromaticity component, which could be caused by phenomena such as color bleeding, could be wrongly assigned to the shading component, since the local color sparsity constraint described in Section 3.1.1 is no longer valid. Here, we apply a refinement process to solve this problem.

We first express each intrinsic component as the product between a scalar intensity, r = |R| or l = |L|, and a chromaticity, R_c = R/r or L_c = L/l:

I = r R_c + l L_c .

Denoting \alpha = r/(r + l), we express the intensity value at each pixel as a mixture of two values weighted by \alpha:

I = \alpha \hat{R}_c + (1 - \alpha) \hat{L}_c ,

where \hat{R}_c = (r + l) R_c and \hat{L}_c = (r + l) L_c. Therefore, we can apply the closed-form matting framework of [27] to refine the separation. To do so, we first perform an initial decomposition by solving one of the two optimization problems presented earlier in Eqns. (8) and (9). We then compute an initial value of \alpha, denoted by \bar{\alpha}, at pixels away from edges in the decomposed image, and propagate \alpha across those edges using the matting Laplacian algorithm of Levin et al. [27]. Rewriting \alpha(x) and \bar{\alpha}(x) in their vector forms, we minimize the following cost function:

J(\alpha) = \alpha^T \Sigma \alpha + (\alpha - \bar{\alpha})^T G (\alpha - \bar{\alpha}),
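Setting the gradient of J(α) to zero gives the sparse linear system (Σ + G)α = Gᾱ, which is stated in the continuation of this subsection below (after Table 1). A minimal Python sketch of that solve follows; it assumes the matting Laplacian Σ has already been assembled following Levin et al. [27], that alpha_bar is the flattened initial estimate, and that edge_mask and the weight 100 follow the conventions described in the text.

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def refine_alpha(Sigma, alpha_bar, edge_mask, g_weight=100.0):
    """Minimize J(alpha) = alpha^T Sigma alpha + (alpha - alpha_bar)^T G (alpha - alpha_bar).

    Setting the gradient to zero gives the sparse system (Sigma + G) alpha = G alpha_bar.
    Sigma     : (N, N) sparse matting Laplacian, assumed prebuilt per Levin et al. [27].
    alpha_bar : (N,)   flattened initial alpha from the SR/SRC decomposition.
    edge_mask : (N,)   boolean, True at edge pixels, where G_ii = 0 (g_weight elsewhere).
    """
    g = np.where(edge_mask, 0.0, g_weight)        # diagonal of G
    G = sp.diags(g)
    alpha = spsolve((Sigma + G).tocsc(), G @ alpha_bar)
    return np.clip(alpha, 0.0, 1.0)               # safeguard: alpha = r/(r+l) is a mixing weight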
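Returning to the optimization of Section 4.1: Equation (8) is an ℓ1-regularized least-squares problem, which the paper solves with dedicated solvers [25], [26]. As a rough, non-authoritative alternative, the basic ISTA iteration below works purely with operator handles for A = Δ B_w^{-1} and A^T, in line with the remark that these operators can be applied without forming full matrices; it assumes Λ acts as a per-coefficient (diagonal) non-negative weighting.

import numpy as np

def ista_eq8(apply_A, apply_At, y, lam, weights, step, n_iters=200):
    """Basic ISTA iteration for min_x ||A x - y||_2^2 + lam * ||diag(weights) x||_1,
    the form of Eq. (8), with A and A^T supplied as operator handles.

    weights : per-coefficient non-negative entries of Lambda (assumed diagonal).
    step    : gradient step size; step <= 1 / (2 * ||A||^2) ensures convergence.
    """
    x = np.zeros_like(weights)
    for _ in range(n_iters):
        grad = 2.0 * apply_At(apply_A(x) - y)       # gradient of the quadratic term
        z = x - step * grad
        thresh = step * lam * weights               # per-coefficient soft threshold
        x = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)
    return x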
TABLE 1
LMSE for CR, SR and SRC over the MIT Intrinsic Images dataset

Example     CR      SR       SRC
box         0.013   0.0036   0.0018
cup1        0.007   0.0043   0.0030
cup2        0.011   0.0052   0.0045
deer        0.041   0.0413   0.0419
dinosaur    0.035   0.0317   0.0216
frog1       0.066   0.0558   0.0483
frog2       0.071   0.0587   0.0472
panther     0.011   0.0075   0.0078
paper1      0.004   0.0019   0.0014
paper2      0.004   0.0027   0.0021
raccoon     0.015   0.0052   0.0048
sun         0.003   0.0024   0.0023
squirrel    0.072   0.0856   0.0794
teabag1     0.032   0.0280   0.0280
teabag2     0.023   0.0151   0.0141
turtle      0.069   0.0349   0.0174
Average     0.030   0.0240   0.0204

Fig. 6. Separation results illustrating soft matting refinement for a flower image. (a) Input image. (b) Separation results before matting refinement. (c) Zoomed-in reflectance results within the yellow box (Left: before refinement; Right: after our matting refinement). (d) Separation results after matting refinement.
where G is a diagonal matrix of weights. We set G_ii = 0 when pixel i is at an edge, and G_ii = 100 otherwise. The matrix \Sigma is the matting Laplacian matrix [27]. The optimal \alpha can be obtained by solving the following sparse linear system: (\Sigma + G)\alpha = G\bar{\alpha}.

The derivation of the matting Laplacian matrix in [27] is based on a color line assumption, i.e., within a small window, foreground (background) colors lie on a straight line in color space. Since this assumption still holds for natural shading/reflectance images, it is also valid to use the matting Laplacian matrix for intrinsic images. Fig. 5 shows the results of matting refinement for the “paper2” example. We can see that the “ghost” markings in the shading component are reduced. Fig. 6 shows the refined results for a flower image. As we can see, the “ghost” markings in the shading component are reduced, such as that within the red rectangle. Fig. 6(c) shows the zoomed-in reflectance component within the yellow rectangle, where the block artifacts in the reflectance are suppressed after applying the matting refinement.

5 EXPERIMENTAL RESULTS

In this section, we provide various experimental validation of the proposed method. We first evaluate the performance of our method on a benchmark dataset with known ground truth [24]. Then, we compare our method with the user-assisted approach of [17].

In the experiments¹, we test two variations of the proposed intrinsic image decomposition algorithm. First, we only use the sparsity constraint on the reflectance representation in the algorithm, which solves (8); this will be referred to as SR. Then, we use both the global sparsity constraints on the reflectance representation and reflectance colors, which solves (9); this will be referred to as SRC. In our implementation, we use the fast Nesta method [26] for both SR and SRC to solve the constrained ℓ1-norm minimization problem.

1. In the paper, all the separation results are shown as AI^γ with gamma correction = 1, where A is a scale factor.

5.1 Benchmarking Results on MIT Intrinsic Images Dataset

A benchmark dataset with ground truth (GT) was presented in [24] for performance evaluation of intrinsic image algorithms. We test our methods, SR and SRC, on this dataset. Following [24], we use the local mean squared error (LMSE) from the ground truth to measure decomposition quality. We compare with the conventional color Retinex algorithm (CR) [12], which performed best among single-image-based methods in the study of [24]. All the separation results of our methods here are before the refinement process. The LMSE values² used for comparisons are computed using the color Retinex algorithm made available by the MIT Intrinsic Images dataset [24].

2. Some of the computed values could be slightly different from those presented in [24] because of the convergence of the algorithm.

This dataset contains three categories: artificially painted surfaces, printed objects, and toy animals. We display one example from each category in Table 2.

With conventional Retinex constraints, pixels that contain significant reflectance derivatives should be smooth in shading. Using the local constraint, the CR method correctly identifies most of the markings as reflectance changes. However, it leaves some “ghost” markings in the shading and some residues of the cast shadows in the reflectance images, because the sharp edges contain a mixture of large and small image radiances. In contrast, SR eliminates many of these ghosts because, by using multi-resolution analysis, our method enforces the sparse constraint on neighboring reflectance at every scale.
TABLE 2
Decomposition results by Color Retinex and our proposed methods on three images from the MIT intrinsic dataset
Fig. 7. Reflectance recovered using CR, SR and SRC on the “turtle” image. (a) CR. (b) SR. (c) SRC. (d) Zoom-in of
yellow patch for CR, SR and SRC (left to right). (e) Zoom-in of red patches for CR, SR and SRC (left to right).
Fig. 8. Intrinsic image decomposition results for the “box” image. (a) Input image. (b-c) Separation results using SR
(LMSE = 0.003606). (d-e) Separation results using SRC (LMSE = 0.001835)
Fig. 9. Intrinsic image decomposition results for the “paper1” image. (a) Input image. (b-c) Separation results using
SR (LMSE = 0.001871). (d-e) Separation results using SRC (LMSE = 0.001395)
Fig. 10. Example of failure of the local sparseness of reflectance assumption in the “cup2” image. Note that in this
case, there exists intensity change with constant hue which corresponds to a change in reflectance and not shading.
(a) Input image. (b-c) Separation results of CR method (LMSE = 0.011). (d-e) Separation results of our SRC method
(LMSE = 0.0045)
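Several of the captions above and Table 1 report LMSE scores. For orientation only, here is a minimal sketch of a windowed, scale-invariant squared error in the spirit of the LMSE of [24]; the official protocol (window size, stride, normalization, and how shading and reflectance errors are combined) is defined by the code accompanying [24], so the constants below are assumptions.

import numpy as np

def ssq_error(est, gt):
    """Sum-of-squares error after fitting the best scalar a to the estimate,
    min_a ||a * est - gt||^2, since each component is recovered only up to scale."""
    denom = float((est * est).sum())
    a = float((est * gt).sum()) / denom if denom > 1e-12 else 0.0
    return float(((a * est - gt) ** 2).sum())

def local_mse(est, gt, window=20, step=10):
    """Windowed, scale-invariant squared error, normalized by the error of the
    all-zero estimate; the window and step values here are assumptions."""
    H, W = gt.shape[:2]
    num = den = 0.0
    for y in range(0, H - window + 1, step):
        for x in range(0, W - window + 1, step):
            e = est[y:y + window, x:x + window]
            g = gt[y:y + window, x:x + window]
            num += ssq_error(e, g)
            den += float((g * g).sum())
    return num / max(den, 1e-12)

In [24] the reported score combines the shading and reflectance errors of each image, so the benchmark code should be used for any number meant to be compared with Table 1.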
The constraint on the multi-resolution representation broadens the influence of local cues to help resolve the ambiguous local inferences.

The “turtle” image in Table 2 is challenging for the CR method. The shell of the turtle exhibits large variations in shading and shadows that arise from the 3D weave pattern. With only local cues, CR misses much of the global and local shading structure in the recovered shading image, because the algorithm misinterprets many image gradients as purely reflectance changes due to the large color differences. In contrast, SR can better handle the gradual shading change across the image as well as the local shading variations, and accurately recovers the shape of the shell surface. This difference can be seen more clearly in the closeup of a small shell region in Fig. 7(d).

To illustrate the benefit of the global sparsity constraint on reflectance color, we also compare the results obtained with the SRC method in Fig. 7. As shown in Fig. 7, the forequarter and hindquarter of the turtle are two distinct regions. In the decomposition with SR and CR, shading and reflectance in each of these regions are computed separately, which results in recovered reflectances that are inconsistent, as seen in Fig. 7(a) and (b). With the non-local sparse constraint on reflectance colors, the recovered reflectance with SRC has a smaller set of reflectance values, which leads to a more consistent decomposition, as shown in Fig. 7(c). This can be seen more clearly in the closeup of the small regions on the two feet in Fig. 7(e). With this global constraint on reflectance colors, the SRC method can correctly recover the global shading and reflectance structure that cannot easily be inferred using local cues alone.
Fig. 8 and 9 show two other examples where SRC effectively eliminates cast shadows on the surfaces from the reflectance. Chromaticity values might change in dark regions caused by cast shadows. Since the proposed WRBW is designed with a support that avoids pixels with different chromaticities, the wavelets' response to such edges would be diminished. In the decomposition with SR, shading and reflectance in these shadow regions are thus computed separately from the neighboring regions, which results in the inconsistent reflectances shown in Fig. 8(b) and Fig. 9(b). With the sparse constraint on reflectance colors, the reflectance values recovered by SRC in these regions are more similar to the ones that are not in shadow. The proposed SRC method can better deal with this problem, and more accurately removes the cast shadow found inside the box in Fig. 8(d) and the shadow at the upper right in Fig. 9(d).

Quantitative comparisons on all the dataset images are provided in Table 1, where we compare the LMSE of CR and our proposed methods, SR and SRC. The SR method outperforms CR for most of the objects, and the SRC method generally has the best performance. However, CR outperforms our proposed methods on a few examples, “deer” and “squirrel”. This is a result of our assumption on the local sparseness of reflectance being invalid. Fig. 10 exemplifies the problem with the proposed methods on the “cup2” image. There are some places on the cup surface where neighboring pixels with similar chromaticities have different reflectance, and that is where our methods fail to properly separate reflectance and shading. For the “cup2” example, even though the local sparsity prior is invalid in these places, the separation results of our method are still better than those of the color Retinex method.

5.2 Comparison with user-assisted approaches

Here, we compare our method with that of Bousseau et al. [17], which uses the following global constraints provided by a user: sets of pixels with similar reflectance, sets of pixels with similar illumination, and locations and shading values of pixels with known illumination.

Accurate decomposition results can be achieved by using the global constraints on shading and reflectance provided in the form of user scribbles. However, users may not always provide useful scribbles. Fig. 11(a) shows the decomposition results with ground-truth scribbles, which has an LMSE of 0.00055. To simulate the effect of having inaccurate scribbles, we used scribbles with the same fixed illumination values as before but with positions that are randomly perturbed by up to 15 pixels; the result is shown in Fig. 11(b), with an LMSE of 0.0011. The result of our proposed method without any user interaction is shown in Fig. 11(c), which has an LMSE of 0.0015.

Fig. 12 shows further comparisons with the method proposed by Bousseau et al. In Fig. 13, we show the zoomed-in separation results for the cloth example. We can see that Bousseau's method leaves some “ghost” markings in the shading (c) and some residues of the cast shadows in the reflectance component (d). Fig. 14 shows the comparison with Tappen et al.'s work [15] and Bousseau et al.'s. We also compare our method with Tappen et al.'s method [14] for a gray-scale image example in Fig. 15. For gray-scale images, we compute the WRBW using pixel intensity. It is evident that our technique can generate visually comparable results from a single image without any additional information.

6 CONCLUSION

In this paper, to address the problem of intrinsic image decomposition, we have proposed two new sparse priors on reflectance: a data-driven sparse representation of reflectance and a global sparse constraint on reflectance colors. Combining the two sparse priors, we can effectively decompose a single image into its intrinsic components.

A sparse representation is made possible by using data-dependent weighted wavelets constructed based on the local sparsity constraint on reflectance. At the same time, the constructed weighted wavelets also preserve the chromaticity distribution even at coarse scales. By using a multi-resolution representation of reflectance and applying reflectance weighting to enforce the sparsity constraint at multiple scales, we can convert what appears to be a local constraint into a global constraint. We also apply a global assumption that the number of different reflectance colors in the image is small, through the use of a total-variation-like cost term. The decomposition problem is formulated as a constrained ℓ1-norm minimization problem, and the proposed approach seeks to recover the sparse reflectance signal given smoothness constraints on the illumination component. We also discussed the color bleeding problem in the decomposition with the proposed method. Small changes in the reflectance component could be wrongly assigned to the shading component. We solve this problem by using a soft matting method based on the color line assumption, which holds for natural shading and reflectance images.

The optimization formulation effectively broadens the influence of local information to help resolve ambiguous local inference, and our experimental results show that the decomposition significantly benefits from the global constraints.

REFERENCES

[1] R. Kimmel, M. Elad, D. Shaked, R. Keshet, and I. Sobel, “A variational framework for retinex,” International Journal of Computer Vision, vol. 52, pp. 7–23, 2003.
[2] B. V. Funt, M. S. Drew, and M. Brockington, “Recovering shading from color images,” in European Conf. on Computer Vision (ECCV), 1992, pp. 124–132.
[3] W. Sweldens, “The lifting scheme: A construction of second generation wavelets,” SIAM J. Math. Anal., vol. 29, pp. 511–546, 1998.
[4] I. Omer and M. Werman, “Color lines: Image specific color representation,” in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2004, pp. 946–953.
Fig. 11. Comparison using ground truth data from a synthetic image. Bousseau et al.’s approach [17] requires fairly
accurate user strokes. (a) [17]’s results when user strokes are set to ground truth values (LMSE = 0.00055). (b) [17]’s
results when the positions of user strokes are randomly perturbed by up to 15 pixels (LMSE = 0.0011). (c) Proposed
SRC’s results without user interaction (LMSE = 0.0015).
[5] H. Barrow and J. Tenenbaum, “Recovering intrinsic scene characteristics from images,” Computer Vision Systems, pp. 3–26, 1978.
[6] Y. Weiss, “Deriving intrinsic images from image sequences,” in IEEE Int'l Conf. on Computer Vision (ICCV), vol. 2, 2001, pp. 68–75.
[7] Y. Matsushita, S. Lin, S. B. Kang, and H.-Y. Shum, “Estimating intrinsic images from image sequences with biased illumination,” in European Conf. on Computer Vision (ECCV), vol. 2, 2004, pp. 274–286.
[8] K. Sunkavalli, W. Matusik, H. Pfister, and S. Rusinkiewicz, “Factored time-lapse video,” ACM Transactions on Graphics, vol. 26, no. 3, p. 101, 2007.
[9] A. Troccoli and P. Allen, “Building illumination coherent 3d models of large-scale outdoor scenes,” International Journal of Computer Vision, vol. 78, no. 2-3, pp. 261–280, 2008.
[10] E. Land and J. McCann, “Lightness and retinex theory,” Journal of the Optical Society of America A, vol. 3, pp. 1684–1692, 1971.
[11] B. K. P. Horn, Robot Vision. MIT Press, 1986.
[12] G. D. Finlayson, S. D. Hordley, and M. Drew, “Removing shadows from images using retinex,” in Proceedings of IS&T/SID Tenth Color Imaging Conference: Color Science, Systems and Applications, 2002, pp. 73–79.
[13] M. Bell and W. T. Freeman, “Learning local evidence for shading and reflectance,” in IEEE Int'l Conf. on Computer Vision (ICCV), vol. 1, 2001, pp. 670–677.
[14] M. F. Tappen, W. T. Freeman, and E. H. Adelson, “Recovering intrinsic images from a single image,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, pp. 1459–1472, 2005.
[15] M. Tappen, E. Adelson, and W. Freeman, “Estimating intrinsic component images using non-linear regression,” in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2006, pp. II: 1992–1999.
[16] L. Shen, P. Tan, and S. Lin, “Intrinsic image decomposition with non-local texture cues,” in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1–7.
[17] A. Bousseau, S. Paris, and F. Durand, “User-assisted intrinsic images,” in SIGGRAPH Asia '09: ACM SIGGRAPH Asia 2009 Papers. ACM, 2009, pp. 1–10.
[18] R. Fattal, “Edge-avoiding wavelets and their applications,” ACM Trans. on Graphics, vol. 28, no. 3, pp. 1–10, Aug 2009.
[19] G. Uytterhoeven and A. Bultheel, “The Red-Black wavelet transform,” in Signal Processing Symposium (IEEE Benelux), M. Moonen, Ed. IEEE Benelux Signal Processing Chapter, 1998, pp. 191–194.
[20] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
[21] L. Grady, “Random walks for image segmentation,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 28, no. 11, pp. 1768–1783, 2006.
Fig. 12. Comparison with the user-assisted approach of Bousseau et al. [17].
Fig. 13. We zoom into the separation results for details of the yellow and red patches of the cloth example.
Fig. 14. Comparison with the user-assisted approach of Bousseau et al. [17], and the automatic approach of Tappen
et al. [14].
Fig. 15. Gray-scale image example. Comparison with Tappen et al.’s work [14].
[22] A. Levin, D. Lischinski, and Y. Weiss, “Colorization using optimization,” ACM Trans. Graphics, vol. 23, no. 3, pp. 689–694, 2004.
[23] X. Liu, L. Wan, Y. Qu, T.-T. Wong, S. Lin, C.-S. Leung, and P.-A. Heng, “Intrinsic colorization,” ACM Transactions on Graphics (SIGGRAPH Asia 2008 issue), vol. 27, no. 5, pp. 152:1–152:9, December 2008.
[24] R. Grosse, M. K. Johnson, E. H. Adelson, and W. T. Freeman, “Ground-truth dataset and baseline evaluations for intrinsic image algorithms,” in International Conference on Computer Vision, 2009, pp. 2335–2342.
[25] S.-J. Kim, K. Koh, M. Lustig, S. Boyd, and D. Gorinevsky, “An interior-point method for large-scale l1-regularized least squares,” IEEE Journal on Selected Topics in Signal Processing, vol. 1, no. 4, pp. 606–617, 2007.
[26] T. Goldstein and S. Osher, “The split Bregman method for l1 regularized problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 2, pp. 323–343, 2009.
[27] A. Levin, D. Lischinski, and Y. Weiss, “A closed-form solution to natural image matting,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 228–242, 2008.

Li Shen received the M.Eng. degree (Panasonic Scholarship) in software science from Osaka University in 2002, and the Ph.D. degree (MONBUSHO Scholarship) in information systems engineering from Osaka University, Japan, in 2006. From 2006 to 2008, she was a visiting researcher at Microsoft Research Asia, Beijing. Since 2009, she has been a scientist with the Computer Graphics and Interface Department at the Institute for Infocomm Research, Singapore. Her main research interests are in computer graphics and computer vision, especially in low-level vision, computational photography, and image-based rendering/modeling.