Lecture 02
Digital photography
Course announcements
Complete the start-of-semester survey:
https://docs.google.com/forms/d/e/1FAIpQLScY5gtWcuSZ4X1n7MUtT5DPto921t5A80jFlB1mmq43oetVrA/viewform
https://docs.google.com/spreadsheets/d/1aWtAWuxstwshd5eq7BIW-idBc4RJLVh1OMoL_8FbucY/
Overview of today's lecture:
• Imaging sensors.
• Color primer.
• The in-camera image processing pipeline.
Slide credits
A lot of inspiration and quite a few examples for these slides were taken directly from:
• Michael Brown (CVPR 2016 Tutorial on understanding the image processing pipeline).
[Diagram: the photography pipeline; post-capture processing is covered in lectures 5-10.]
Imaging sensors
• Lecture 23 will cover sensors and noise issues in more detail.
[Image: Canon 6D sensor (20.20 MP, full-frame).]
Shutter speed
[Figure: photons arriving at the sensor as exposure begins.]
Who is this? Albert Einstein, who explained the photoelectric effect.
Photoelectric effect
[Figure: incident photons produce emitted electrons.]
[Image: Canon 6D sensor (20.20 MP, full-frame).]
[Figure: photosite potential wells; Canon 6D sensor (20.20 MP, full-frame).]
Photosite response
The photosite response is mostly linear: the number of electrons grows linearly with the number of photons. What does the slope equal? The quantum efficiency (QE).
The response is only mostly linear, however: under over-exposure it flattens out, a non-linearity due to sensor saturation. Saturation means that the potential well is full before exposure ends.
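As a minimal sketch (my own, not from the lecture; the QE and full-well values are invented), this response can be simulated as a linear term clipped at the well's capacity:

```python
import numpy as np

def photosite_response(photons, qe=0.5, full_well=80_000):
    """Photosite response: linear with slope QE, clipped once the
    potential well saturates (over-exposure)."""
    electrons = qe * np.asarray(photons, dtype=float)  # linear regime, slope = QE
    return np.minimum(electrons, full_well)            # saturation plateau

# Doubling the photon count doubles the electron count, until the well fills up.
print(photosite_response([1e4, 1e5, 1e6]))  # [ 5000. 50000. 80000.]
```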
Microlenses
Sensor size
CCD vs CMOS
• Most modern commercial and industrial cameras use CMOS sensors.
Analog front-end
Vignetting
Fancy word for: pixels far off the center receive less light.
Four types of vignetting: mechanical, lens, natural, and pixel vignetting.
Remember these?
microlens (also called lenslet): helps the photosite collect more light.
• Lenslets also filter the image to avoid resolution artifacts.
• Lenslets are problematic when working with coherent light.
• Many modern cameras do not have lenslet arrays.
[Figure: microlens and color filter stacked on each photosite.]
We will see what the color filters are for later in this lecture.
Color primer
Color
Color is complicated.
[Figure: the electromagnetic spectrum.]
• When measuring light with some SPD $\Phi(\lambda)$, the sensor produces a scalar response:
$R = \int \Phi(\lambda)\, S(\lambda)\, \mathrm{d}\lambda$
where $S(\lambda)$ is the sensor's spectral sensitivity function (SSF).
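As a quick numerical sketch of this integral (the Gaussian curves below are made-up stand-ins for a real SPD and SSF):

```python
import numpy as np

lam = np.arange(400.0, 701.0)  # visible wavelengths, in nm

# Made-up spectra for illustration only.
spd = np.exp(-((lam - 550) / 80.0) ** 2)  # Phi(lambda): light's spectral power
ssf = np.exp(-((lam - 530) / 40.0) ** 2)  # S(lambda): sensor's spectral sensitivity

# Scalar sensor response: Riemann-sum approximation of the integral of Phi * S.
dlam = lam[1] - lam[0]
R = np.sum(spd * ssf) * dlam
print(R)
```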
• There are three types of cone cells, with different spectral sensitivity functions: “short” (S), “medium” (M), and “long” (L).
[Figure: cone spectral sensitivities, and the cone distribution for normal vision (64% L, 32% M).]
• “Cones” correspond to pixels that are covered by different color filters, each with its own spectral sensitivity function.
Alternative color filter mosaics: CYGM (Canon IXUS, PowerShot) and RGBE (Sony Cyber-shot).
[Images of the same scene captured using 3 different cameras with identical settings.]
[Image: modern camera with Lippmann plate. Credit: Hans Bjelkhagen.]
Lippmann received the Nobel Prize in Physics in 1908 “for his method, based on the phenomenon of interference, which permits the reproduction of colours by photography.”
The sensor:
• at every photosite, converts incident photons into electrons using the mosaic's SSF,
• stores electrons in the photosite's potential well while it is not full,
• reads out the photosites' wells, row by row, and converts them to analog signals,
• corrects non-linearities.
The result has lots of noise and mosaicking artifacts.
• Kind of disappointing.
• We call this the RAW image.
The image processing pipeline (post-capture processing is covered in lectures 5-10):
analog front-end → RAW image (mosaiced, linear, 12-bit) → CFA demosaicing → white balance → denoising → color transforms → tone reproduction → compression → final RGB image (non-linear, 8-bit)
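To make the stages concrete, here is a toy end-to-end sketch; every stage below is a deliberately simplistic stand-in (identity stubs, a gray-world gain, a 1/2.2 power curve), not how any real camera implements them:

```python
import numpy as np

# Toy stand-ins for the camera-dependent stages.
def demosaic(raw):          return np.repeat(raw[..., None], 3, axis=-1)  # toy: gray -> RGB
def white_balance(img):     return img / img.mean(axis=(0, 1))            # toy: gray world
def denoise(img):           return img                                    # identity stub
def color_transform(img):   return img                                    # identity stub
def tone_reproduction(img): return np.clip(img, 0, 1) ** (1 / 2.2)        # gamma encoding
def compress(img):          return (255 * img).astype(np.uint8)           # toy: 8-bit quantize

def pipeline(raw):
    """Idealized pipeline: RAW (mosaiced, linear) -> final RGB (non-linear, 8-bit)."""
    img = demosaic(raw)
    for stage in (white_balance, denoise, color_transform, tone_reproduction, compress):
        img = stage(img)
    return img

out = pipeline(np.random.rand(4, 4))
print(out.dtype, out.shape)  # uint8 (4, 4, 3)
```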
• Sometimes the term image signal processor (ISP) is used to refer to the image processing pipeline itself.
• The inverse process, going from a “conventional” image back to RAW, is called derendering.
White balancing
The human visual system has chromatic adaptation:
• We can perceive white (and other colors) correctly under different light sources (retinal vs. perceived color).
• Cameras cannot do that (there is no “camera perception”).
White balancing: the process of removing color casts, so that colors we would perceive as white are rendered as white in the final image.
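The slides do not commit to a particular algorithm; as one hedged illustration, the classic gray-world heuristic estimates the cast from the channel means of a linear RGB image:

```python
import numpy as np

def gray_world_white_balance(img):
    """White-balance a linear RGB image under the gray-world assumption:
    the scene's average color is gray, so scale each channel to equalize
    the channel means (normalized to green)."""
    means = img.reshape(-1, 3).mean(axis=0)
    gains = means[1] / means
    return np.clip(img * gains, 0.0, 1.0)

# A reddish cast gets pulled back toward neutral.
warm = np.clip(np.random.rand(8, 8, 3) * [1.4, 1.0, 0.7], 0, 1)
balanced = gray_world_white_balance(warm)
```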
CFA demosaicing
Produce a full RGB image from the mosaiced sensor output.
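The lecture leaves the algorithm for later; as a hedged baseline, bilinear interpolation fills in each channel's missing pixels from its known neighbors. This sketch assumes an RGGB Bayer pattern:

```python
import numpy as np
from scipy.ndimage import convolve

def bilinear_demosaic(raw):
    """Bilinear demosaicing of an RGGB Bayer mosaic (H and W even)."""
    r = np.zeros_like(raw); g = np.zeros_like(raw); b = np.zeros_like(raw)
    r[0::2, 0::2] = raw[0::2, 0::2]   # red samples
    g[0::2, 1::2] = raw[0::2, 1::2]   # green samples (two per 2x2 block)
    g[1::2, 0::2] = raw[1::2, 0::2]
    b[1::2, 1::2] = raw[1::2, 1::2]   # blue samples

    # Interpolation kernels: each output pixel becomes the average of the
    # known samples of that channel within the kernel support.
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    return np.dstack([convolve(r, k_rb), convolve(g, k_g), convolve(b, k_rb)])

rgb = bilinear_demosaic(np.random.rand(8, 8))
print(rgb.shape)  # (8, 8, 3)
```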
Noise in images
Can be very pronounced in low-light images.
1) (Photon) shot noise:
• Photon arrival is random: the brighter the scene, the larger the variance of the distribution.
2) Dark-shot noise:
• Electrons emitted due to thermal activity (becomes worse as the sensor gets hotter).
3) Read noise:
• Noise introduced by the readout electronics.
Bright scene and large pixels: photon shot noise is the main noise source.
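These sources are commonly modeled as Poisson noise on the collected electrons plus additive Gaussian read noise; a hedged sketch with invented parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_measurement(mean_photons, dark_electrons=5.0, read_sigma=3.0, qe=0.5):
    """Simulate one photosite measurement with the three noise sources."""
    shot = rng.poisson(qe * mean_photons)  # photon shot noise (Poisson)
    dark = rng.poisson(dark_electrons)     # dark-shot noise (thermal electrons)
    read = rng.normal(0.0, read_sigma)     # read noise (from readout electronics)
    return shot + dark + read

# In a bright scene the Poisson term dominates: its variance grows with the signal.
print(noisy_measurement(1e5), noisy_measurement(10))
```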
How to denoise?
Look at the neighborhood around you:

I1 I2 I3
I4 I5 I6
I7 I8 I9

• Mean filtering (take average):
I'5 = (I1 + I2 + I3 + I4 + I5 + I6 + I7 + I8 + I9) / 9
• Median filtering (take median):
I'5 = median(I1, I2, I3, I4, I5, I6, I7, I8, I9)

Large area of research. We will see some more about filtering in a later lecture.
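Both filters are one-liners with scipy.ndimage (my choice of library, not the lecture's):

```python
import numpy as np
from scipy.ndimage import uniform_filter, median_filter

noisy = np.random.rand(16, 16)

mean_denoised   = uniform_filter(noisy, size=3)  # I'5 = mean of the 3x3 neighborhood
median_denoised = median_filter(noisy, size=3)   # I'5 = median of the 3x3 neighborhood
```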
Gamma encoding
After this stage, we perform compression, which includes changing from 12 to 8 bits.
• Apply a non-linear curve to use the available bits to better encode the information that human vision is more sensitive to.
Demonstration
original (8 bits, 256 tones)
Can you predict what will happen if we linearly encode this tone range with only 5 bits?
Can you predict what will happen if we gamma encode this tone range with only 5 bits?
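These questions are easy to try numerically; a hedged sketch using a generic 1/2.2 power curve for the gamma (real cameras use their own tone curves):

```python
import numpy as np

def quantize(x, bits):
    """Quantize values in [0, 1] to 2**bits levels and back."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

tones = np.linspace(0.0, 1.0, 256)                    # original tone ramp (8 bits)

linear_5bit = quantize(tones, 5)                      # visible banding in the shadows
gamma_5bit = quantize(tones ** (1 / 2.2), 5) ** 2.2   # encode, quantize, decode

# Gamma encoding spends more of the 32 levels on dark tones,
# where human vision is more sensitive.
```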
[Figure: human visual system's concave gamma curve, and the image a human would see at different stages of the pipeline.]
RAW pipeline
Gamma encoding is skipped, but the display still applies gamma correction!
[Figure: human visual system's concave gamma curve, and the image a human would see at different stages of the pipeline.]
The RAW image appears very dark! (Unless you are using a RAW viewer.)
Historical note
• CRT displays used to have a response curve that was (almost) exactly the inverse of the human sensitivity curve. Therefore, displays could skip gamma correction and directly display the gamma-encoded images.
• It is sometimes mentioned that gamma encoding is done to undo the response curve of a display. This used to be correct, but it is not true nowadays: gamma encoding is performed to ensure a more perceptually uniform use of the final image's 8 bits.
Should you use RAW? Emphatic yes!
• Every time you use a physics-based computer vision algorithm, you need linear measurements of radiance.
• Applying such algorithms to non-linear (i.e., not RAW) images will produce completely invalid results.
• If you like retouching your photos (e.g., in Photoshop), RAW makes your life much easier and your edits much more flexible.
• Your camera will buffer more often when shooting in burst mode.
• Sometimes, it may not be “fully” RAW: the Lightroom app provides images after demosaicing but before tone reproduction.
I forgot to set my camera to RAW, can I still get the RAW file?
• The image processing pipeline is lossy: after all the steps, information about the original image is lost.
• Derendering (defined earlier) attempts to invert the pipeline, but can only do so approximately.
What I described today is an “idealized” version of what we think commercial cameras do.
• Almost all of the steps in both the sensor and the image processing pipeline described earlier are camera-dependent.
• Even if we know the basic steps, the implementation details are proprietary information that companies actively try to keep secret.
analog front-end? → RAW image (mosaiced, linear, 12-bit) → CFA demosaicing? → white balance? → denoising? → color transforms? → tone reproduction? → compression? → final RGB image (non-linear, 8-bit)
Various curves
All of these sensitivity curves are different from camera to camera and kept secret.
• Very difficult to get access to ground-truth data at intermediate stages of the pipeline.
• Very difficult to evaluate the effect of new algorithms for specific pipeline stages.
How do you open a RAW file?
You can't (not easily, at least). You need to use one of the following:
• dcraw: a tool for parsing camera-dependent RAW files (the specifications of the file formats are also kept secret).
• Adobe DNG: a recently(-ish) introduced file format that attempts to standardize RAW file handling.
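In Python, one hedged option is the rawpy package (bindings to LibRaw, which descends from dcraw's lineage); rawpy is my suggestion, not the lecture's, and the file name below is a placeholder:

```python
import rawpy

# "photo.CR2" is a placeholder; any LibRaw-supported RAW format works.
with rawpy.imread("photo.CR2") as raw:
    mosaic = raw.raw_image.copy()  # mosaiced, linear sensor data, before the pipeline
    # Run LibRaw's pipeline but keep the output linear:
    # no gamma curve, no auto-brightening, 16-bit output.
    linear_rgb = raw.postprocess(gamma=(1, 1), no_auto_bright=True, output_bps=16)
```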
Take-home messages
• If you want to do physics-based vision, the best image processing pipeline is no pipeline at all (use RAW).
• What if you want to use images for, e.g., object recognition? Tracking? Robotics SLAM? Face identification? Forensics?
References
Basic reading:
• Szeliski textbook, Section 2.3.
• Michael Brown, “Understanding the In-Camera Image Processing Pipeline for Computer Vision,” CVPR 2016,
slides available at: http://www.comp.nus.edu.sg/~brown/CVPR2016_Brown.html
Additional reading:
• Adams et al., “The Frankencamera: An Experimental Platform for Computational Photography,” SIGGRAPH 2010.
The first open architecture for the image processing pipeline, and precursor to the Android Camera API.
• Heide et al., “FlexISP: A Flexible Camera Image Processing Framework,” SIGGRAPH Asia 2014.
Discusses how to implement a single-stage image processing pipeline.
• Buckler et al., “Reconfiguring the Imaging Pipeline for Computer Vision,” ICCV 2017.
• Diamond et al., “Dirty Pixels: Optimizing Image Classification Architectures for Raw Sensor Data,” arXiv 2017.
Both papers discuss how to adaptively change the conventional image processing pipeline so that it is better suited to various
computer vision problems.
• Chakrabarti et al., “Rethinking Color Cameras,” ICCP 2014.
Discusses different CFAs, including ones that have white filters, and how to do demosaicing for them.
• Gunturk et al., “Demosaicking: Color Filter Array Interpolation,” IEEE Signal Processing Magazine 2005.
A nice review of demosaicing algorithms.
• Kim et al., “A New In-Camera Imaging Model for Color Computer Vision and Its Application,” PAMI 2012.
• Chakrabarti et al., “Probabilistic Derendering of Camera Tone-mapped Images,” PAMI 2014.
Two papers that discuss in detail how to model and calibrate the image processing pipeline, how to (attempt to) derender an image
that has already gone through the pipeline, and how to rerender an image under a different camera’s pipeline.
• Baechler et al., “Shedding light on 19th century spectra by analyzing Lippmann photography,” PNAS 2021.
A recent paper analyzing Lippmann color photography.