How WebKit Works
Adam Barth (abarth)
October 30, 2012
What is WebKit?
WebKit is a rendering engine for web content
[Diagram: HTML, JavaScript, and CSS go into WebKit, which produces the rendering of a web page]
WebKit is not a browser, a science project, or the solution to every problem
Major Components
WebKit and WebKit2 (Embedding API)
Bindings (JavaScript API, Objective-C API)
WebCore (HTML, CSS, DOM, etc.) [this talk]
Platform (Network, Storage, Graphics)
JavaScriptCore (JavaScript Virtual Machine)
WTF (Data structures, Threading primitives)
Life of a Web Page
[Diagram: Network -> Loader -> HTML Parser -> DOM -> Render Tree -> Graphics Context, with CSS and Script as further boxes attached to the pipeline]
Pages, Frames, and Documents
[Diagram: a Page owns a Main Frame; the Main Frame and each of four further Frames has its own Document]
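The ownership hierarchy on this slide can be sketched as a small tree. This is a minimal illustration, not WebCore's classes: the field names, the `add_child` and `walk` helpers, and the example URLs are all hypothetical, and the nesting of the child frames is an assumption because the slide text does not record it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Document:
    url: str


@dataclass
class Frame:
    document: Document
    # parent is excluded from repr/compare to avoid recursing through the tree
    parent: Optional["Frame"] = field(default=None, repr=False, compare=False)
    children: List["Frame"] = field(default_factory=list)

    def add_child(self, document: Document) -> "Frame":
        """Create a subframe (for example an iframe) holding its own Document."""
        child = Frame(document=document, parent=self)
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Frame"]:
        """Yield this frame and all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


@dataclass
class Page:
    main_frame: Frame

    def documents(self) -> List[Document]:
        """Every frame in the page contributes exactly one Document."""
        return [frame.document for frame in self.main_frame.walk()]


# Hypothetical page: a main frame with two subframes, one of them nested.
page = Page(main_frame=Frame(Document("https://example.test/")))
ad_frame = page.main_frame.add_child(Document("https://example.test/ad"))
ad_frame.add_child(Document("https://example.test/ad/inner"))
page.main_frame.add_child(Document("https://example.test/widget"))
document_urls = [doc.url for doc in page.documents()]
```

The point of the sketch is the one-to-one pairing: a Page has one main Frame, and every Frame holds one Document at a time.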
Lifecycle of a Frame
[Diagram states: Uninitialized, Initial Document, Provisional, Checking Policy, Ready to Commit, Committed]
Committed is the quiescent state
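One way to picture the frame lifecycle is as a small state machine. The state names come from the slide, but the transition table is an assumption: it simply follows the order in which the slide lists the states and adds a step from Committed back to Provisional for a new navigation. It is not taken from WebKit's FrameLoader, and `FrameLifecycle` is a hypothetical name.

```python
from enum import Enum


class FrameState(Enum):
    UNINITIALIZED = "Uninitialized"
    INITIAL_DOCUMENT = "Initial Document"
    PROVISIONAL = "Provisional"
    CHECKING_POLICY = "Checking Policy"
    READY_TO_COMMIT = "Ready to Commit"
    COMMITTED = "Committed"


# ASSUMPTION: transitions follow the slide's listing order; the real engine
# may order or branch these states differently.
ALLOWED = {
    FrameState.UNINITIALIZED: {FrameState.INITIAL_DOCUMENT},
    FrameState.INITIAL_DOCUMENT: {FrameState.PROVISIONAL},
    FrameState.PROVISIONAL: {FrameState.CHECKING_POLICY},
    FrameState.CHECKING_POLICY: {FrameState.READY_TO_COMMIT},
    FrameState.READY_TO_COMMIT: {FrameState.COMMITTED},
    FrameState.COMMITTED: {FrameState.PROVISIONAL},  # a new navigation starts
}


class FrameLifecycle:
    def __init__(self) -> None:
        self.state = FrameState.UNINITIALIZED

    def transition(self, new_state: FrameState) -> None:
        """Move to new_state, rejecting anything outside the assumed table."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state

    @property
    def is_quiescent(self) -> bool:
        """Per the slide, Committed is the state a frame rests in."""
        return self.state is FrameState.COMMITTED


lifecycle = FrameLifecycle()
for step in (
    FrameState.INITIAL_DOCUMENT,
    FrameState.PROVISIONAL,
    FrameState.CHECKING_POLICY,
    FrameState.READY_TO_COMMIT,
    FrameState.COMMITTED,
):
    lifecycle.transition(step)
```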
How the Loader Works (Idealized)
[Diagram: CachedResourceRequest, CachedResourceLoader, MemoryCache, and CachedResource sit above ResourceRequest, ResourceLoader, ResourceHandle, and the platform-specific code]
The Loader is actually very messy and complicated, but we have a long-term project to clean up its nuttiness
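The idealized picture is a cache sitting in front of the network path. The sketch below borrows the slide's class names as Python stand-ins, but the method names (`get`, `add`, `request_resource`) and the `fetch` callable that plays the role of ResourceLoader, ResourceHandle, and the platform code are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Optional


class CachedResource:
    def __init__(self, url: str, data: bytes) -> None:
        self.url = url
        self.data = data


class MemoryCache:
    """In-memory map from URL to an already loaded resource."""

    def __init__(self) -> None:
        self._resources: Dict[str, CachedResource] = {}

    def get(self, url: str) -> Optional[CachedResource]:
        return self._resources.get(url)

    def add(self, resource: CachedResource) -> None:
        self._resources[resource.url] = resource


class CachedResourceLoader:
    """Answers requests from the cache and only falls back to fetch on a miss."""

    def __init__(self, cache: MemoryCache, fetch: Callable[[str], bytes]) -> None:
        self._cache = cache
        self._fetch = fetch  # hypothetical stand-in for the network layers

    def request_resource(self, url: str) -> CachedResource:
        cached = self._cache.get(url)
        if cached is not None:
            return cached
        resource = CachedResource(url, self._fetch(url))
        self._cache.add(resource)
        return resource


network_calls: List[str] = []


def fake_fetch(url: str) -> bytes:
    """Pretend network: records each call and returns dummy bytes."""
    network_calls.append(url)
    return b"body of " + url.encode("ascii")


loader = CachedResourceLoader(MemoryCache(), fake_fetch)
first = loader.request_resource("https://example.test/style.css")
second = loader.request_resource("https://example.test/style.css")
```

The second request is served from the MemoryCache, so the pretend network is only contacted once.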
How the HTML Parser Works
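[Diagram: Bytes -> Characters -> Tokens -> Nodes -> DOM; the Tokenizer turns characters into tokens and the TreeBuilder turns tokens into nodes]
Bytes: 3C 62 6F 64 79 3E 48 65 6C 6C 6F 2C 20 3C 73 70 61 6E 3E 77 6F 72 6C 64 21 3C 2F 73 70 61 6E 3E 3C 2F 62 6F 64 79 3E
Characters: <body>Hello, <span>world!</span></body>
Tokens: StartTag: body, "Hello, ", StartTag: span, "world!", EndTag: span
Nodes: body, "Hello, ", span, "world!"
DOM: body contains the text "Hello, " and a span; the span contains the text "world!"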
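The four stages can be walked through with the slide's own bytes. This is a toy sketch under stated assumptions: the input is decoded as UTF-8 (a real parser has to detect the encoding), the regular-expression tokenizer only understands attribute-free tags and text, and `Node`, `tokenize`, `build_tree`, and `outline` are hypothetical names rather than WebKit's Tokenizer and TreeBuilder classes.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple

# Stage 1: the bytes from the slide.
RAW_BYTES = bytes.fromhex(
    "3C 62 6F 64 79 3E 48 65 6C 6C 6F 2C 20 3C 73 70 61 6E 3E 77 6F 72 6C "
    "64 21 3C 2F 73 70 61 6E 3E 3C 2F 62 6F 64 79 3E"
)

# Stage 2: bytes -> characters (assumes UTF-8).
characters = RAW_BYTES.decode("utf-8")

TOKEN_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Stage 3: characters -> (kind, value) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        closing, tag, chars = match.groups()
        if tag is None:
            tokens.append(("Characters", chars))
        elif closing:
            tokens.append(("EndTag", tag))
        else:
            tokens.append(("StartTag", tag))
    return tokens


@dataclass
class Node:
    name: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def build_tree(tokens: List[Tuple[str, str]]) -> Node:
    """Stage 4: tokens -> nodes, using a stack of open elements."""
    root = Node("#document")
    open_elements = [root]
    for kind, value in tokens:
        if kind == "StartTag":
            node = Node(value)
            open_elements[-1].children.append(node)
            open_elements.append(node)
        elif kind == "EndTag":
            # Pop until the matching element is closed; never pop the root.
            while len(open_elements) > 1:
                if open_elements.pop().name == value:
                    break
        else:
            open_elements[-1].children.append(Node("#text", text=value))
    return root


def outline(node: Node, depth: int = 0) -> List[str]:
    """Indented listing of the tree, one line per node."""
    label = repr(node.text) if node.name == "#text" else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


tokens = tokenize(characters)
dom_outline = outline(build_tree(tokens))
```

Unlike a real HTML tree builder, this sketch does not synthesize missing html or head elements or recover from misnested tags.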
Preload Scanning for Fun and Profit
document.write("<textarea>");
Mary had a little lamb
[Diagram: this input passes through the Tokenizer and then the TreeBuilder]
Script execution can change the input stream
Preload scanner tokenizes ahead
  When parser is blocked on external scripts
  Starts resource loads earlier
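The idea of scanning ahead can be sketched as a cheap pass over the not-yet-parsed input that only collects URLs worth fetching. The regular expression and the `scan_for_preloads` function are illustrative assumptions, not WebKit's preload scanner, which runs a real tokenizer. The results are only hints: as the `document.write("<textarea>")` example shows, a script may still turn the text that follows into plain character data.

```python
import re
from typing import List

# Matches src on <script>/<img> and href on <link>; quoted values only.
PRELOAD_RE = re.compile(
    r"<(script|img)\b[^>]*?\bsrc\s*=\s*[\"']([^\"']+)[\"']"
    r"|<link\b[^>]*?\bhref\s*=\s*[\"']([^\"']+)[\"']",
    re.IGNORECASE,
)


def scan_for_preloads(unparsed_html: str) -> List[str]:
    """Return candidate resource URLs in document order, without duplicates."""
    urls: List[str] = []
    for match in PRELOAD_RE.finditer(unparsed_html):
        url = match.group(2) or match.group(3)
        if url not in urls:
            urls.append(url)
    return urls


# Input the main parser has not reached yet because it is blocked on a script.
remaining_input = (
    '<script src="a.js"></script>'
    "<p>text</p>"
    '<img src="cat.png">'
    '<link rel="stylesheet" href="style.css">'
    "<script>inline()</script>"
)
preload_urls = scan_for_preloads(remaining_input)
```

A caller would hand these URLs to the loader so the requests are already in flight by the time the real parser resumes.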
XSSAuditor
[Diagram: HTTP Request and HTTP Response feed the XSSAuditor, which sits between the Tokenizer and the TreeBuilder]
XSSAuditor examines token stream
  Looks for scripts that were also in the request
  Assumes those scripts were reflected XSS
  Blocks them
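The heuristic in these bullets can be sketched as a containment check between the request and the inline scripts found in the response. This is a deliberately simplified illustration: `audit_scripts` is a hypothetical function, it only URL-decodes the request once, and it ignores the many encodings and edge cases a real auditor has to handle.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def audit_scripts(
    request_url: str, script_sources: List[str]
) -> Tuple[List[str], List[str]]:
    """Split inline scripts into (allowed, blocked).

    A script is blocked when its source text also appears in the decoded
    request, on the assumption that it was reflected from the request.
    """
    decoded_request = unquote_plus(request_url)
    allowed: List[str] = []
    blocked: List[str] = []
    for source in script_sources:
        snippet = source.strip()
        if snippet and snippet in decoded_request:
            blocked.append(source)
        else:
            allowed.append(source)
    return allowed, blocked


# Hypothetical request whose query string carries a script payload.
request = "https://example.test/search?q=%3Cscript%3Ealert(1)%3C/script%3E"
# Inline script bodies seen in the response's token stream.
scripts_in_response = ["alert(1)", "initPage()"]
allowed_scripts, blocked_scripts = audit_scripts(request, scripts_in_response)
```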
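DOM + CSS -> Render Tree
#footer { position: fixed; bottom: 0; left: 0 }
body > span { font-weight: bold; }
[Diagram, DOM: html contains head (with title "Greeting") and body; body contains the text "Hello, ", a span with the text "world!", and an img]
[Diagram, Render Tree: RenderBlock, RenderBlock, RenderText ("Hello, "), RenderInline (span, bold), RenderText ("world!"), RenderImage (img, fixed); the diagram also carries the label Layout]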
Anonymous RenderObjects
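[Diagram, DOM: a div contains the text "Hello, " and a nested div with the text "world!"]
[Diagram, Render Tree: RenderBlock (div) contains an anonymous RenderBlock holding RenderText ("Hello, ") and a RenderBlock (div) holding RenderText ("world!")]
Not every RenderObject has a DOM Node
Every RenderBlock either:
  Has all inline children
  Has no inline children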
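The all-inline-or-no-inline rule can be enforced by wrapping stray inline children in anonymous blocks. The sketch below is one possible way to do that, written with hypothetical names (`RenderObject`, `normalize_block`); it models an anonymous block as a RenderBlock whose `node` is None and is not WebCore's actual algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RenderObject:
    kind: str                      # "RenderBlock", "RenderInline" or "RenderText"
    node: Optional[str]            # label of the DOM node, None if anonymous
    children: List["RenderObject"] = field(default_factory=list)

    @property
    def is_inline(self) -> bool:
        return self.kind in ("RenderText", "RenderInline")


def normalize_block(block: RenderObject) -> RenderObject:
    """Make block (and nested blocks) hold only inline or only block children."""
    has_inline = any(child.is_inline for child in block.children)
    has_block = any(not child.is_inline for child in block.children)
    if has_inline and has_block:
        wrapped: List[RenderObject] = []
        run: List[RenderObject] = []
        for child in block.children:
            if child.is_inline:
                run.append(child)
                continue
            if run:
                # Close the current inline run inside an anonymous block.
                wrapped.append(RenderObject("RenderBlock", None, run))
                run = []
            wrapped.append(child)
        if run:
            wrapped.append(RenderObject("RenderBlock", None, run))
        block.children = wrapped
    for child in block.children:
        if child.kind == "RenderBlock":
            normalize_block(child)
    return block


# The slide's example: <div>Hello, <div>world!</div></div>
outer = RenderObject(
    "RenderBlock",
    "div",
    [
        RenderObject("RenderText", "Hello, "),
        RenderObject("RenderBlock", "div", [RenderObject("RenderText", "world!")]),
    ],
)
normalize_block(outer)
child_summary = [(child.kind, child.node) for child in outer.children]
```

After normalization the outer block has two block children, and the first of them has no DOM node, which is the "Anonymous" box in the diagram.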
LayerTree
[Diagram: the render tree from the earlier example (two RenderBlocks, RenderText, bold RenderInline, RenderText, fixed RenderImage) with two RenderLayers attached]
Sparse representation of RenderTree
Enables accelerated compositing, scrolling
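"Sparse" here means that only some render objects get a layer of their own, and everything else is painted by the nearest enclosing layer. The sketch below assumes just two triggers for a new layer, the root of the tree and positioned objects such as the fixed image; the real engine has more triggers, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RenderObject:
    name: str
    positioned: bool = False       # e.g. position: fixed
    children: List["RenderObject"] = field(default_factory=list)


@dataclass
class RenderLayer:
    owner: str                                         # render object that created the layer
    paints: List[str] = field(default_factory=list)    # objects painted by this layer
    children: List["RenderLayer"] = field(default_factory=list)


def build_layer_tree(root: RenderObject) -> RenderLayer:
    """Walk the render tree, opening a new layer only where one is needed."""
    root_layer = RenderLayer(owner=root.name, paints=[root.name])

    def visit(obj: RenderObject, layer: RenderLayer) -> None:
        for child in obj.children:
            if child.positioned:
                # ASSUMPTION: positioned objects are the only non-root trigger.
                child_layer = RenderLayer(owner=child.name, paints=[child.name])
                layer.children.append(child_layer)
                visit(child, child_layer)
            else:
                layer.paints.append(child.name)
                visit(child, layer)

    visit(root, root_layer)
    return root_layer


# Render tree shaped like the talk's running example.
render_root = RenderObject("RenderBlock html", children=[
    RenderObject("RenderBlock body", children=[
        RenderObject("RenderText 'Hello, '"),
        RenderObject("RenderInline span", children=[
            RenderObject("RenderText 'world!'"),
        ]),
        RenderObject("RenderImage img", positioned=True),
    ]),
])
layer_tree = build_layer_tree(render_root)
```

Six render objects collapse into two layers, which is the kind of sparse structure that compositing and scrolling can then operate on.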
Yet Another Tree: LineBoxTree
<div>An old silent pond...
A frog jumps into the pond,
splash! <b>Silence again.</b></div>
[Diagram: a RenderBlock contains a RenderText and a bold RenderInline with its own RenderText; three RootInlineBoxes hold four InlineTextBoxes in total]
One RootInlineBox per line of text
List of inline flow and inline text boxes
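The relationship between text runs and line boxes can be sketched for the haiku example. The sketch assumes the lines break exactly where the markup's newlines are, as the slide draws them, whereas a real layout breaks lines based on available width. It also leaves out the inline flow box for the `<b>` element. `build_line_boxes` and the dataclasses are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class InlineTextBox:
    text: str
    bold: bool


@dataclass
class RootInlineBox:
    boxes: List[InlineTextBox] = field(default_factory=list)


def build_line_boxes(runs: List[Tuple[str, bool]]) -> List[RootInlineBox]:
    """Create one RootInlineBox per line and one InlineTextBox per run piece."""
    lines = [RootInlineBox()]
    for text, bold in runs:
        for index, piece in enumerate(text.split("\n")):
            if index > 0:
                lines.append(RootInlineBox())   # a newline starts the next line
            if piece:
                lines[-1].boxes.append(InlineTextBox(piece, bold))
    return [line for line in lines if line.boxes]


# Two text runs: the plain RenderText and the text inside <b>.
haiku_runs = [
    ("An old silent pond...\nA frog jumps into the pond,\nsplash! ", False),
    ("Silence again.", True),
]
line_boxes = build_line_boxes(haiku_runs)
boxes_per_line = [len(line.boxes) for line in line_boxes]
```

The first RenderText spans all three lines, so it ends up with three InlineTextBoxes, while the last line holds boxes from two different render objects.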
Conclusion
WebCore's main processing pipeline:
Loader and Parser
CSS, DOM, and Script
RenderTree, LayerTree, and InlineBoxes
Other major subsystems
Accessibility, Editing, Events, CSS, Web Inspector
Plugins, SVG, MathML, XSLT...
Other components
WebKit, Bindings, Platform, JavaScriptCore, WTF
... 1.5 MLOC of C++
Learn more:
http://www.webkit.org/coding/technical-articles.html
Thanks!