Best Paper 1
ABSTRACT
Extensions complement web browsers with additional functionalities and also open new avenues for vulnerabilities, allowing adversarial web pages to escalate their privileges and use extension APIs. Prior works on extension vulnerability detection adopt classic static analysis, which is unable to handle dynamic JavaScript features such as function calls made through array lookups. At the same time, prior abstract interpretation focuses on lightweight server-side JavaScript and often cannot scale to client-side extension code due to object explosions in the abstract domain.
In this paper, we design, implement and evaluate a novel, coverage-driven, concurrent abstract interpretation framework, called CoCo, to efficiently detect vulnerabilities in browser extensions. On one hand, CoCo parallelizes abstract interpretation with concurrent taint propagation for each branching statement, message passing and content/background scripts to detect vulnerabilities with improved scalability. On the other hand, CoCo prioritizes analysis that increases code coverage, thus detecting more vulnerabilities. Our evaluation shows that CoCo detects at least 43 zero-day, exploitable, manually-verified extension vulnerabilities that cannot be detected by state-of-the-art works. We responsibly disclosed all the zero-day vulnerabilities to extension developers.

CCS CONCEPTS
• Security and privacy → Browser security.

KEYWORDS
JavaScript, Browser Extension, Security

ACM Reference Format:
Jianjia Yu, Song Li†, Junmin Zhu†, and Yinzhi Cao. 2023. CoCo: Efficient Browser Extension Vulnerability Detection via Coverage-guided, Concurrent Abstract Interpretation. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS '23), November 26–30, 2023, Copenhagen, Denmark. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3576915.3616584

† The two authors contributed to the paper while they were studying or interning at Johns Hopkins University.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.
CCS '23, November 26–30, 2023, Copenhagen, Denmark
© 2023 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0050-7/23/11.

1 INTRODUCTION
Browser extensions are personalized add-ons, written in JavaScript, that extend the functionality of native browsers. Popular browser extensions are often installed by millions of users for daily usage. For example, Grammarly [6] provides grammatical suggestions for its users, and Google Scholar [5] provides quick searches and citations of academic papers. Because browser extensions need to provide additional functionality close to the native browser, they often have a higher privilege than the normal website visited by web users. For example, browser extensions may send cross-origin requests beyond the restriction enforced by the same-origin policy and access browser-only storage such as bookmarks and browsing history. Such privileged APIs need to be protected from normal web pages to prevent privilege escalation.
To enforce such protection, modern browser extension architecture, particularly in Google Chrome, adopts a so-called isolated world [1] to separate scripts that have access to privileged APIs (called background scripts, more recently replaced by service workers in manifest V3) from those that are close to the potential adversary, i.e., the web page (called content scripts). Scripts from these two worlds communicate with each other via a message-passing channel. While such an architecture greatly reduces potential vulnerabilities, the communication between the two types of scripts can still lead to access to privileged APIs and data, as shown by many real-world vulnerabilities [4, 11].
Researchers have been working on the detection of privilege escalations. On one hand, prior works [14, 35, 55] that detect browser extension vulnerabilities propose using static analysis that tracks dataflow between adversary-controlled inputs (e.g., a message sent to the background script) and a privileged API. Specifically, EmPoWeb [55], the first work of its kind in modern extension vulnerability detection, combines call graph analysis and manual inspection to find hundreds of vulnerable extensions. A follow-up work called DoubleX [35] improves the static analysis performed by EmPoWeb with an Extension Dependence Graph (EDG) for more
accurate yet automated detection. However, neither work is able to handle dynamic JavaScript features, such as function calls that are resolved dynamically (e.g., funcs[name]() where name is another variable).
On the other hand, recent advances in server-side vulnerability detection [39, 42, 43, 46] use abstract interpretation to resolve the aforementioned dynamic JavaScript features. For example, ODGen [43] interprets Node.js packages in an abstract domain, called Object Dependence Graph, and queries the graph for dynamic object resolution. However, existing server-side JavaScript abstract interpretation is single-threaded and therefore cannot detect vulnerabilities related to concurrency features of client-side extensions, such as isolated worlds and message passing. Furthermore, while abstract interpretation itself is promising, it often cannot scale, especially given the size and complexity of client-side browser extension code. One common issue is that the number of objects explodes in the abstract domain, which prevents abstract interpretation from even reaching vulnerable code, thus leading to many false negatives.
In this paper, we design a novel, coverage-guided, concurrent abstract interpretation framework, called CoCo, on a graph-based abstract domain to efficiently detect browser extension vulnerabilities using static analysis. The key insights of CoCo are two-fold, as suggested by the two "Co"s in its name. On one hand, CoCo parallelizes abstract interpretation with concurrent taint propagation by converting each branching statement, asynchronous callback function, and content/background script with message passing into multiple threads, to better detect vulnerabilities with improved scalability. For example, the concurrent taint analysis propagates taint information among different threads across message passing to detect browser extension vulnerabilities. Specifically, CoCo models and maintains event loops to iterate through callback functions for content and background scripts while keeping track of taints. Then, CoCo also simulates message passing between content and background scripts and propagates taint information across scripts during concurrent abstract interpretation.
On the other hand, CoCo prioritizes abstract interpretation for unseen code to maximize code coverage, thus further increasing the number of detected vulnerabilities. Specifically, CoCo allocates analysis time to each thread based on the branching level and past performance in increasing code coverage. That is, CoCo allocates more time to a thread if it either analyzes more code or is located on a top-level branch. Then, CoCo preempts a thread after it uses up its allocated time. Furthermore, because the number of threads and abstract domain states could be exponential given many branching statements in a target extension, CoCo merges the abstract domain, particularly the graph representation, of these different threads after the conditional statement to reduce the number of states and threads and avoid state explosion. That is, CoCo reduces two threads into one by keeping newly-added nodes or edges if they exist in either thread and removing nodes or edges if they are deleted from both threads.
We implemented a prototype of CoCo, which is available at this anonymous repository [9]. Then, we crawled 145K+ extensions from the Chrome extension store for evaluation. Our evaluation shows that CoCo finds at least 43 zero-day, exploitable vulnerabilities that cannot be detected by prior works (e.g., DoubleX [35] and ODGen-ext, a version of ODGen [43] modified to detect extension vulnerabilities); all of them were verified manually. We responsibly disclosed all our findings to extension developers and so far have received one confirmation. We compared CoCo with DoubleX, the state-of-the-art extension vulnerability detection tool, as well as ODGen-ext. Our evaluation shows that CoCo outperforms both DoubleX and ODGen-ext in terms of the number of detected vulnerabilities and false positives/negatives.

2 BACKGROUND
In this section, we present some background knowledge on browser extension architecture and abstract interpretation.

Browser extension architecture. Modern browser extensions often adopt an isolation mechanism, e.g., isolated world [1] in Chrome, to prevent scripts with a low privilege from accessing higher-privileged APIs. For example, such isolation divides scripts in Google Chrome (including extensions and web pages) into three types: web page scripts, content scripts, and background scripts (service workers in manifest V3).
Since different scripts are isolated from each other, a communication mechanism is necessary to enable the exchange of information. Specifically, such communication is facilitated by message passing in the browser architecture, and we list the following four types of message passing mechanisms.
• Web page ↔ Content script. Web page scripts communicate with content scripts via a regular postMessage channel with addEventListener and onmessage APIs.
• Content ↔ Background script. There are two types of communications between content and background scripts (or service worker in V3). First, they can communicate via one-time request APIs, i.e., sendMessage and onMessage under either runtime (content) or tabs (background). Such communication exchanges one message at a time. Second, they can communicate via long-lived APIs (e.g., connect and onConnect) to exchange multiple messages.
• Web page ↔ Background script. A web page can communicate with a background script if permissions are declared, i.e., the externally_connectable field of the manifest file contains the web page's URL. The communication is similar to content ↔ background script in the two types, with the exception that onMessageExternal and onConnectExternal are used.
• Extension ↔ Another extension. Such a communication is similar to web page ↔ background script and enabled by default. A whitelist can be declared in the manifest file using allowed extensions' IDs.

Abstract interpretation. Abstract interpretation is a technique to approximate the execution of a given computer program upon an abstract domain without concrete inputs. There are two types of abstract interpretation in the literature based on the abstract domain types, which are lattice- and graph-based. First, classic abstract interpretation [29] adopts a lattice structure as the abstract domain. One challenge is the over-approximation of abstract values, and thus many prior works propose optimizations, such as trace partitioning [44, 53], to improve traditional interval analysis by moving some statements from outside a branching statement to inside it.
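The branch-merging rule stated in the introduction (keep graph elements added by either thread, drop only those deleted by both) can be illustrated with a small sketch. It assumes, purely for illustration, that nodes and edges are encoded as strings in a `Set`; the function name `mergeBranchGraphs` and the element names are hypothetical and are not taken from CoCo's code base.

```javascript
// Illustrative sketch only: graph elements (nodes or edges) are plain strings.
// `base` is the graph before the conditional; `threadA` and `threadB` are the
// graphs produced by the two branch threads.
function mergeBranchGraphs(base, threadA, threadB) {
  const merged = new Set();
  const candidates = new Set([...base, ...threadA, ...threadB]);
  for (const element of candidates) {
    if (!base.has(element)) {
      // Newly added in at least one thread: keep it.
      merged.add(element);
    } else if (threadA.has(element) || threadB.has(element)) {
      // Pre-existing and deleted by at most one thread: keep it.
      merged.add(element);
    }
    // Pre-existing and deleted by both threads: drop it.
  }
  return merged;
}

const base = new Set(["req", "req.action", "tmp1", "tmp2"]);
const threadA = new Set(["req", "req.action", "tmp2", "index"]); // deleted tmp1, added index
const threadB = new Set(["req", "req.action", "error"]); // deleted tmp1 and tmp2, added error
const mergedGraph = mergeBranchGraphs(base, threadA, threadB);
console.log([...mergedGraph].sort().join(", "));
// prints: error, index, req, req.action, tmp2
```

Under this encoding the rule reduces to a union of the two thread graphs; the explicit `base` check is kept only to mirror the wording of the rule.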
Second, recent works [42, 43] propose graph-based abstract interpretation for vulnerability detection given efficient graph operations on the abstract domain. One challenge is scalability given the exponential number of objects in the graph. CoCo works on graph-based abstract interpretation to improve its scalability, which is different from existing optimizations. That is, existing optimization methods, which operate on a lattice-based abstract domain, are not applicable to graph-based abstract interpretation. Take trace partitioning for example, which merges intervals of an abstract value. Instead, CoCo merges graphs from different threads, which is a completely different concept. Similarly, it remains unclear how to apply other static analysis optimizations, like loop unrolling and object packing, to graph-based abstract domains.

3 OVERVIEW
In this section, we present an overview of CoCo with a motivating example, a solution overview and the threat model.

3.1 A Motivating Example
We use a motivating example, called AliExpress to Shopify Importer, to describe the challenges of browser extension vulnerability detection. This extension is designed to help users automatically import AliExpress products' information into another website, called Shopify. It has a privilege escalation vulnerability, which allows a website to escalate its privilege to that of the browser extension, thus sending arbitrary third-party requests regardless of the cross-origin header.
Listing 1 shows the vulnerable code: Line 28 in the getData function is where the vulnerability is located. The AJAX call in the browser extension is privileged, but the destination url at Line 29 is controllable by an adversary from a website via a message. Lines 1–6 show the exploit code: a webpage adversary sends a message with a cross-origin product_url and obtains the responses. The challenges of detecting this vulnerability are two-fold:
• Dynamic Function Call. The invocation of getData is via a dynamic object lookup at Lines 17–18. Specifically, say the adversary provides a string "getData" in req.action like the exploit code does at Line 3. Then, the vulnerable code at Line 17 finds the index of "getData" in the actionList array at Line 9 to avoid unauthorized function calls (i.e., actionList[index] is "getData"). Next, Line 18 looks up the function dynamically via window["getData"] and invokes it with req as the parameter. State-of-the-art extension vulnerability detection, namely DoubleX [35], cannot find the dynamic call edge between Line 17 and Line 24, thus skipping the data-flow edge between req.product_url at Line 29 and the value at Line 3. The vulnerable extension is included in the EmPoWeb dataset [55] but was detected mostly with manual work by the author: the dynamic call edges are also missing in their analysis, but a human expert can identify them.
• Reachability. The invocation of getData is located in an else branch of an if statement at Line 13, leading to a reachability issue for static abstract interpretation. Specifically, state-of-the-art abstract interpretation, such as ODGen [43] and ObjLupAnsys [42], gets stuck in the if branch (Lines 13–15) for this specific extension because of state explosion. Therefore, they cannot even reach Line 17 to resolve the aforementioned dynamic call edge.

     1  // exploit code located in a web page
     2  var editorExtensionId = "cfbodcmobhpfbjhbennacnanbmpbcfkd"
     3  chrome.runtime.sendMessage(editorExtensionId, {action: "getData", product_url: "https://appfreaker.com/"},
     4    function(response) {
     5      console.log(response)
     6  });
     7
     8  // vulnerable extension code
     9  var actionList = ["addProduct", "getData"];
    10  chrome.runtime.onMessageExternal.addListener(
    11    function(req, caller, res) {
    12      if (req.hasOwnProperty("action")) {
    13        if (actionList.indexOf(req.action) == -1) {
    14          // dealing with invalid actions
    15        } else {
    16          // call the callback with request and caller data
    17          var index = actionList.indexOf(req.action);
    18          res({"result": window[actionList[index]](req)});
    19        }
    20      } else {
    21        // dealing with no actions
    22      }
    23  })
    24  function getData(req) {
    25    var result = {};
    26    var product_url = req.product_url;
    27    result["product_url"] = req.product_url;
    28    $.ajax({ // privileged escalation for sending AJAX request bypassing SOP
    29      url: req.product_url,
    30      type: "get",
    31      async: false,
    32      success: function(resdata) {
    33        result["data"] = resdata;
    34        result["status"] = 'success';
    35      },
    36      error: function(resdata) {
    37        result["data"] = resdata;
    38        result["status"] = "error";
    39      }
    40    });
    41    result["message"] = "The application is running";
    42    return result;
    43  }

Listing 1: A Motivating example: AliExpress to Shopify Importer (Sink function is at Line 28; an adversary can send privileged AJAX requests bypassing same-origin policy)

3.2 Solution Overview
We describe an overview of CoCo in detecting the vulnerability of our motivating example in Listing 1. From a high-level perspective, CoCo finds a data flow from a user input (i.e., the req object at Line 11) to a sensitive function (i.e., the url parameter of the $.ajax function at Line 28) and finally to the user again (i.e., the parameter of res at Line 18, which is provided by the user at Line 11).
Now let us describe how CoCo solves the aforementioned challenges. First, CoCo resolves the dynamic call edge at Line 18 by looking up objects in the abstract domain. That is, CoCo first resolves index as 1, then fetches actionList[index] as "getData", and finally looks up the getData function via window["getData"]. All the information is stored in the abstract domain as nodes and edges.
Second, CoCo solves the reachability issue by scheduling abstract interpretations of different branches in parallel and allocating analysis time by priority values associated with code coverage. Let us use Listing 1 for the explanation. CoCo analyzes the two branches of the if statement at Line 13 in two threads in parallel. That is, CoCo switches between the analysis of Line 14 and Lines 17–18
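The three resolution steps described for Line 18 (resolve `index`, fetch `actionList[index]`, look up the function on `window`) can be sketched as follows. This is an illustrative model rather than CoCo's implementation: the global scope is represented by a hypothetical `Map` from property names to abstract function nodes, and `resolveDynamicCallee` is a name invented for this sketch.

```javascript
// Hypothetical abstract global scope: property name -> abstract function node.
const globalScope = new Map([
  ["getData", { kind: "function", name: "getData", declaredAtLine: 24 }],
]);
const actionList = ["addProduct", "getData"]; // Line 9 of Listing 1

// Mirrors Lines 13-18: returns the callee node, or null when the
// "invalid action" branch is taken or the name is not defined in scope.
function resolveDynamicCallee(list, actionValue, scope) {
  const index = list.indexOf(actionValue); // Line 17
  if (index === -1) return null; // Line 13, invalid action
  const propertyName = list[index]; // actionList[index]
  return scope.get(propertyName) || null; // window[propertyName]
}

const callee = resolveDynamicCallee(actionList, "getData", globalScope);
console.log(callee.name, callee.declaredAtLine); // prints: getData 24
```

With the call edge from Line 18 to Line 24 recovered this way, a taint on `req` can be followed into `getData` and on to the `url` field at Line 29.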
ered sensitive, because their content scripts are subject to the same origin policy according to a new change [2].)

Consequences      Detailed APIs
Code execution    eval, tabs.executeScript, setTimeout, setInterval
AJAX requests     ajax (not content scripts), fetch, get, post, XMLHttpRequest().open
File downloads    downloads.download
Storage access

[Figure residue: architecture diagram. Recoverable labels: Thread scheduler, Thread executor, Ready queue, Running queue, Waiting queue, Thread activator, Thread creator, Priority calculator, Taint propagator, Vulnerability detector, Modeled client-side APIs, Pop, Timeout, Terminates, Conditional stmts/Events/]
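The queue and priority labels above, together with the time-allocation policy from the introduction (more time for threads that cover more code or sit on a top-level branch, preemption once the allotted time is used up), suggest a scheduler along the following lines. This is a minimal sketch under stated assumptions: abstract threads are modeled as generators whose yielded values are the number of newly covered statements, and the `timeSlice` formula is invented for illustration, since the excerpt does not give CoCo's actual priority function.

```javascript
// Each abstract "thread" is a generator; every yield reports how many new
// statements that analysis step covered.
function* analyzeBranch(gains) {
  for (const gain of gains) yield gain;
}

// Hypothetical priority: more steps for threads that recently covered more
// code or that sit on a shallower (closer to top-level) branch.
function timeSlice(thread) {
  return 1 + thread.lastGain + Math.max(0, 2 - thread.branchLevel);
}

function schedule(threads) {
  const ready = threads.map((t) => ({ ...t, lastGain: 0, covered: 0 }));
  const trace = [];
  const covered = {};
  while (ready.length > 0) {
    // Pop the thread with the largest time slice from the ready queue.
    ready.sort((a, b) => timeSlice(b) - timeSlice(a));
    const thread = ready.shift();
    const budget = timeSlice(thread);
    let gain = 0;
    let finished = false;
    for (let step = 0; step < budget; step++) {
      const { value, done } = thread.run.next();
      if (done) {
        finished = true;
        break;
      }
      gain += value;
    }
    thread.lastGain = gain;
    thread.covered += gain;
    covered[thread.name] = thread.covered;
    trace.push(thread.name);
    if (!finished) ready.push(thread); // preempted after using up its slice
  }
  return { trace, covered };
}

const result = schedule([
  // A branch that keeps running without covering anything new.
  { name: "if-branch", branchLevel: 2, run: analyzeBranch([0, 0, 0, 0, 0, 0]) },
  // A shallower branch that keeps reaching unseen code.
  { name: "else-branch", branchLevel: 1, run: analyzeBranch([2, 1, 3]) },
]);
console.log(result.trace[0], result.covered["else-branch"]); // prints: else-branch 6
```

In this toy run the productive branch is scheduled first and finishes before the unproductive one receives any time, which is the behavior the coverage-guided policy is meant to encourage.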
Table 2: Annotations of procedures, sets, constants, and contexts used in the operational semantics in Figure 2

Name             Description
Context          Context variables
$\Delta$         Mapping an object to its taint set
$\Sigma$         Mapping an event to its callback
$\Theta$         Mapping a message port to its callback on receiving a message
Procedures (P)   Graph operations
$\Phi(x)$        Locating the object for a variable $x$
$\Lambda(f)$     Locating the corresponding event of a function $f$
Copy(x)          Copying the $x$ variable
New(e)           Creating a new variable for the expression $e$
Port(x)          Creating a long-lived connection port with variable $x$
LkupPort(f)      Finding the corresponding port for function $f$
Sets (S)         Sets defined and used by CoCo
$F$              Sanitization functions
Set Operation    Set operation defined and used by CoCo
$\oplus$         $\Delta[x \oplus f]$ appends a sanitization function $f$ to each list $[t_{s_k}, f_{k_1}, ..., f_{k_n}]$ in the set $\Delta[x]$ and returns the updated mapping $\Delta$
Variables (V)    Constants/built-in functions/variables of extensions
$S$              Sender variable, constant
$R$              Built-in response function for simple one-time requests
$*$              Wildcard variable

in an abstract domain. CoCo performs three main tasks in the executor: (i) taint propagation, (ii) vulnerability detection, and (iii) interaction with modeled client-side APIs.

4.2.1 Taint Propagation. CoCo defines a taint as a list with the first item as the taint source followed by a list of sanitization functions between the source and the current object (if there are any). Specifically, $x$ is an object, and $\Delta[x]$ maps the object $x$ to its taint set. Each taint in the taint set follows the format $[t_{s_k}, f_{k_1}, ..., f_{k_n}]$, where $t_{s_k}$ is the $k$th taint source and $f_{k_1}, ..., f_{k_n}$ are the sanitization functions. The taint is stored together with all the abstract objects in the abstract domain. Note that we use $t_{s_k}$ because there might exist several taints for one given object.
First, we present the annotations of the operational semantics of the taint analysis in Table 2. A state $\sigma$ in CoCo's taint analysis is represented by a tuple $(\Delta, \Sigma, \Theta)$, where $\Delta$ is a mapping between objects and taint sets, $\Sigma$ a mapping between events and callbacks, and $\Theta$ a mapping between message ports and callbacks. Furthermore, we denote the update of an object $x$ with a taint $t$ as $x \leftarrow t$. That is, $\Delta[x \leftarrow t]$ means that CoCo taints the object $x$ with $t$ and updates the state. Similarly, we denote appending a sanitization function $f$ to a taint $\Delta[x]$ as $\Delta[x \oplus f]$. For reasons of space, all other annotations are listed in Table 2.
Next, we describe the taint propagation process. Figure 2 shows the operational semantics, which can be generally classified into three categories: basic expressions, simple one-time requests, and long-lived connections. The latter two are two types of message-passing mechanisms. The inference rules in the operational semantics define the valid transitions of a composite piece of syntax in terms of the transitions of its components [10]. Each inference rule can be seen as an if-then pair where the upper part corresponds to the if condition and the lower part corresponds to the then action.
First, let us describe the "basic expressions" in Figure 2 below:
• Taint init. This rule initializes taint variables.
  Condition: The initial state of CoCo is $\sigma$, which can be represented as a tuple $(\Delta, \Sigma, \Theta)$; the taint of source variable $src$ is $t$ ($\Phi(src)$ corresponds to the $t_{s_k}$ in $[t_{s_k}, f_{k_1}, ..., f_{k_n}]$).
  Action: The expression get_taint($src$) in state $\sigma$ is reduced to a new state, i.e., $(\Delta[\Phi(src) \leftarrow t], \Sigma, \Theta)$, where $\Delta[\Phi(src) \leftarrow t]$ denotes updating the taint of variable $src$ to $t$.
• Property access. This rule propagates taints for property access.
  Condition: The state $\sigma$ is represented as a tuple $(\Delta, \Sigma, \Theta)$. A property of a tainted object is accessed through $x[p]$ or $x.const$. CoCo tries to resolve the property and represents the property object as $p$ if the property can be resolved and as $p_{new}$ if it cannot.
  Action: There are two forms of property access, which are $x[p]$ and $x.const$. If the property can be resolved, CoCo does not propagate the taint. Otherwise, if a property of a tainted object is accessed and cannot be resolved, CoCo propagates the taint from the object to the property object. The reason is that if a property cannot be resolved, it is likely coming from an adversary, e.g., a JSON object with an inner structure.
• Binary operators. This rule deals with the binary operators.
  Condition: The two operands under state $\sigma$ can be reduced to two different new states. The new variable generated by the operation is denoted as $x_{new}$. CoCo updates the taint of $x_{new}$ with the union of the taint sets of its two operands, namely, $\Delta_1[\Phi(x_1)]$ and $\Delta_2[\Phi(x_2)]$.
  Action: The binary operator under state $\sigma$ is reduced to a new state with the new taint, the union of the event mappings and the union of the message port mappings.
• Function calls. This rule deals with the function calls.
  Condition: The state $\sigma$ is represented as the tuple as before. The returned new variable (if any) is denoted as $x_{ret}$. A possible new mapping is denoted as $\Delta'$. The update of $\Delta'$ is to update the taint of $x_{ret}$ with the union of the taints of all the parameters appended with the called function $f$.
  Action: There are three cases. (i) If the function definition can be resolved, CoCo goes into the function body so that the function call is reduced to the function body statements, e.g., binary operators and property access. No update of taint is needed by the function call syntax. (ii) If the function is a sanitization call and cannot be resolved, the taint mapping $\Delta$ is updated to $\Delta'$, propagating the taint from the parameters to the returned object and appending the sanitization function. (iii) If the function is not a sanitization function and cannot be resolved, CoCo propagates the taint without appending anything. The taint mapping is updated to $\Delta''$.
Second, we describe taint propagation in "simple one-time requests" in Figure 2, which send a one-time JSON-serializable message between the content script and the background script (or service worker in V3). When analyzing sendMessage, CoCo copies the message and propagates taints from the sender to the receiver. If an optional callback [8] is enabled, CoCo also propagates the taint of the message variable to the parameter of the callback function $\Sigma[\Lambda(f)]$. The taint propagation for sendResponse and its callback onResponse is similar to that of sendMessage. Now we explain the inference rules for "simple one-time requests" one by one.
• sendMessage. This rule applies to the sendMessage call.
  Condition: The sendMessage function is called on the sender end. The two parameters are the message $x$ and the callback function
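[Figure 2: operational semantics of CoCo's taint propagation; rules recovered from the extracted text]

Basic Expressions

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad t = ([\Phi(src)])}{(\mathrm{get\_taint}(src),\ \sigma) \Rightarrow (\Delta[\Phi(src) \leftarrow t],\ \Sigma,\ \Theta)}\quad \text{Taint Init}$$

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad p_{resolved} = \Phi(x[p]/x.const),\quad p_{new} = New(x[p]/x.const)}{(x[p]/x.const,\ \sigma) \Rightarrow \text{if } p_{resolved} \neq null \text{ then } (\Delta, \Sigma, \Theta) \text{ else } (\Delta[\Phi(p_{new}) \leftarrow \Delta[\Phi(x)]],\ \Sigma,\ \Theta)}\quad \text{Property Access}$$

$$\frac{(x_1, \sigma) \Rightarrow (\Delta_1, \Sigma_1, \Theta_1),\quad (x_2, \sigma) \Rightarrow (\Delta_2, \Sigma_2, \Theta_2),\quad x_{new} = New(x_1 \text{ op } x_2),\quad \Delta' = \Delta[\Phi(x_{new}) \leftarrow \Delta_1[\Phi(x_1)] \cup \Delta_2[\Phi(x_2)]]}{(x_1 \text{ op } x_2,\ \sigma) \Rightarrow (\Delta',\ \Sigma_1 \cup \Sigma_2,\ \Theta_1 \cup \Theta_2)}\quad \text{Binary Op}$$

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad x_{ret} = New(call\ f(x_1, ..., x_n)),\quad \Delta' = \Delta[\Phi(x_{ret}) \leftarrow (\bigcup_{i=1}^{n} \Delta[\Phi(x_i) \oplus f])],\quad \Delta'' = \Delta[\Phi(x_{ret}) \leftarrow (\bigcup_{i=1}^{n} \Delta[\Phi(x_i)])]}{(call\ f(x_1, ..., x_n),\ \sigma) \Rightarrow \text{if } f \text{ is resolved then } (\Delta, \Sigma, \Theta) \text{ else if } f \in F \text{ then } (\Delta', \Sigma, \Theta) \text{ else } (\Delta'', \Sigma, \Theta)}\quad \text{Func Call}$$

Simple One-Time Requests

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad x' = Copy(x),\quad g = \Sigma[\Lambda(f)],\quad \Delta' = \Delta[\Phi(x') \leftarrow \Delta[\Phi(x)]],\quad \Sigma' = \Sigma[\Lambda(r) \leftarrow r]}{(f(x, r),\ \sigma) \Rightarrow (g(x', S, R),\ (\Delta', \Sigma', \Theta))}\quad \text{sendMessage}$$

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad \Sigma' = \Sigma[\Lambda(f) \leftarrow x]}{(f(x),\ \sigma) \Rightarrow (\Delta, \Sigma', \Theta)}\quad \text{onMessage-addListener}$$

$$\frac{\sigma,\quad m = New(*)}{(f(x),\ \sigma) \Rightarrow (\mathrm{get\_taint}(m),\ x(m, S, R),\ \sigma)}\quad \text{onMessageExternal-addListener}$$

$$\frac{\sigma}{(f(x),\ \sigma) \Rightarrow (\mathrm{set\_sink}(x),\ \sigma)}\quad \text{onMessageExternal-sendResponse}$$

Long-Lived Connections

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad x' = Copy(x),\quad \Delta' = \Delta[\Phi(x') \leftarrow \Delta[\Phi(x)]],\quad p = LkupPort(f),\quad g = \Theta[p]}{(f(x),\ \sigma) \Rightarrow (g(x'),\ (\Delta', \Sigma, \Theta))}\quad \text{postMessage}$$

$$\frac{\sigma \Rightarrow (\Delta, \Sigma, \Theta),\quad p = LkupPort(f),\quad \Theta' = \Theta[p \leftarrow x]}{(f(x),\ \sigma) \Rightarrow (\Delta, \Sigma, \Theta')}\quad \text{onMessage-addListener}$$

$$\frac{\sigma,\quad p = Port(*)}{(f(x),\ \sigma) \Rightarrow (x(p),\ \sigma)}\quad \text{onConnectExternal-addListener}$$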
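The Binary Op, Func Call, and sendMessage/postMessage rules above can be read as updates to the taint mapping Δ. The sketch below is a simplified reading of those rules, not CoCo's implementation: it assumes abstract objects are identified by strings, models Δ as a `Map`, and fills the sanitizer set F with a single hypothetical entry. All function names are invented for this example.

```javascript
// A taint is [source, ...sanitizers]; a taint set is an array of such lists.
// delta (Δ) maps an abstract object id (a string here) to its taint set.
const SANITIZERS = new Set(["encodeURIComponent"]); // hypothetical contents of F

function taintsOf(delta, obj) {
  return delta.get(obj) || [];
}

// Union of taint sets, with structural de-duplication.
function unionTaints(...sets) {
  const seen = new Map();
  for (const taint of sets.flat()) seen.set(JSON.stringify(taint), taint);
  return [...seen.values()];
}

// Binary Op: x_new receives the union of both operands' taint sets.
function binaryOp(delta, left, right, result) {
  const next = new Map(delta);
  next.set(result, unionTaints(taintsOf(delta, left), taintsOf(delta, right)));
  return next;
}

// Func Call with an unresolved callee f: x_ret receives the union of the
// argument taints; if f is in F, f is appended to every taint (the ⊕ step).
function unresolvedCall(delta, f, args, result) {
  const next = new Map(delta);
  const merged = unionTaints(...args.map((a) => taintsOf(delta, a)));
  next.set(result, SANITIZERS.has(f) ? merged.map((t) => [...t, f]) : merged);
  return next;
}

// sendMessage / postMessage: the copy x' inherits the taint set of x.
function copyMessage(delta, message, copy) {
  const next = new Map(delta);
  next.set(copy, taintsOf(delta, message));
  return next;
}

let delta = new Map([["req.product_url", [["onMessageExternal"]]]]);
delta = binaryOp(delta, "req.product_url", "literal", "url");
delta = unresolvedCall(delta, "encodeURIComponent", ["url"], "encoded");
delta = copyMessage(delta, "encoded", "encoded_copy");
console.log(JSON.stringify(delta.get("encoded_copy")));
// prints: [["onMessageExternal","encodeURIComponent"]]
```

Each helper returns a fresh `Map` rather than mutating its input, which keeps every step a pure state transition in the spirit of the inference rules.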
$r$. After sending the message, CoCo copies the message object $x$ as $x'$ and propagates the taint of $x$ to $x'$ by $\Delta[\Phi(x') \leftarrow \Delta[\Phi(x)]]$. The function invoked upon receiving the message is fetched by $g = \Sigma[\Lambda(f)]$. The three parameters for the function $g$ are: $x'$, the copied message; $S$, the Sender variable; and $R$, the built-in response function for simple one-time requests. CoCo also updates the mapping of the "one-time request response" event to the callback function $r$.
  Action: The sendMessage function call $f(x, r)$ under state $\sigma$ is reduced to the call of the function $g$ under the new state.
• onMessage-addListener. This rule applies to the message receiver.
  Condition: The onMessage function is registered by addListener and called on the receiver end upon receiving a message. Its function parameter is the function $g$ in the sendMessage rule. CoCo updates the callback function mapping of the "one-time request onMessage" event, which is mapped by $\Lambda(f)$.
  Action: The function call $f(x)$ under state $\sigma$ can be reduced to the new state with a new mapping of event and callback.
• onMessage-sendResponse. This rule applies to the response sent from the receiver.
  Condition: The sendResponse function is called from the receiver end inside the onMessage callback. The rule is similar to the sendMessage rule: CoCo copies the message object $x$ as $x'$ and propagates the taint of $x$ to $x'$ by $\Delta[\Phi(x') \leftarrow \Delta[\Phi(x)]]$. The function invoked upon receiving the response is fetched by $g = \Sigma[\Lambda(f)]$. Note that since this is a one-time request, the event-callback mapping is set to null after the messaging finishes by $\Sigma' = \Sigma[\Lambda(f) \leftarrow null]$.
  Action: The function call $f(x)$ under state $\sigma$ is reduced to the call of the function $g$ under the new state.
• onMessageExternal-addListener. This rule applies to the receipt of external messages.
  Condition: CoCo marks the message $m$ from an external source as tainted.
  Action: The function call under state $\sigma$ is reduced to getting the taint from the wildcard variable and then calling the callback function under state $\sigma$. Note that CoCo calls the callback function directly to mimic the external environment.
• onMessageExternal-sendResponse. This rule applies to the response for external messages.
  Condition: CoCo treats the message sent to an external party as a sink.
  Action: The function call $f(x)$ under state $\sigma$ is reduced to setting the parameter $x$ as a sink under the new state.
Third, we describe taint propagation for "long-lived message connections". When a connection is established, CoCo updates $\Theta$ to map the port to the corresponding callback function invoked on receiving a message. Then, similar to one-time requests, CoCo copies messages and propagates taints from the sender to the receiver for port.postMessage. Now we explain the inference rules for "long-lived connections" one by one.
• connect. This rule applies to long-lived message connections.
  Condition: This function is called by the party initiating the connection. The parameter $x$ is the connection information. The returned object of the connect function is a port, which is denoted as $p = Port(x)$ in the rule. $g$ is the function from the party that accepts the connection, fetched by $g = \Sigma[\Lambda(f)]$.
  Action: The connect function call under state $\sigma$ is reduced to the $g$ function call under the same state.
• onConnect-addListener. This rule applies to the callback function for a long-lived message connection.
  Condition: This function is called from the party that accepts the connection. CoCo registers the callback function that is invoked when a connection event is fired.
  Action: The function call under state $\sigma$ is reduced to updating the event-callback mapping of state $\sigma$.
• postMessage. This rule applies to message posting after a long-lived connection is established.
  Condition: CoCo copies the message object and propagates the taint from the original message object to the copied one. Furthermore, CoCo fetches the message port for the postMessage function by $p = LkupPort(f)$ and fetches the corresponding callback of the message port by $g = \Theta[p]$.
  Action: The call of the postMessage function under state $\sigma$ is reduced to the call of function $g$ under the state with the updated taint mapping.
• onMessage-addListener. This rule applies to the receipt of a message during a long-lived connection.
  Condition: This function is called by the receiver. CoCo first looks up the port for the function $f$, then registers the callback function invoked on receiving the message by updating the mapping of port $p$ and callback function $x$ by $\Theta' = \Theta[p \leftarrow x]$.
  Action: The function under state $\sigma$ is reduced to a new state with the updated mapping of message port and callback.
• onConnectExternal-addListener. This rule applies to the callback function for an external, long-lived message connection.
  Condition: CoCo creates a wildcard port by Port(*).
  Action: The function under state $\sigma$ is reduced to the call of the callback under the same state.
• postMessageExternal. This rule applies to external message posting during a long-lived connection.

• Control-flow patterns. CoCo detects whether there is a control-flow path from an adversary-controlled function to another function with high privileges. For example, suppose there exists a control-flow path between a message listener and chrome.storage.sync.clear(). CoCo considers this a privilege escalation to access browser storage because an adversary can clear the extension's storage.
• Control-flow and taint patterns. CoCo detects whether there exists a taint flow to a sink's parameter in addition to the aforementioned control-flow path. Take chrome.tabs.executeScript, a function call used for executing scripts in the extension context, for example. CoCo detects a vulnerability if the function call is reachable via control flow and its parameter is tainted, so that an adversary can execute a script.

Second, CoCo checks the feasibility of source-sink pairs, e.g., ensuring that one is from the adversary and the other is from the extension. Specifically, there are two scenarios. (i) If the source contains sensitive data from browser extensions, e.g., a cookie, CoCo checks whether the sink will be accessible to an adversary, e.g., an HTTP request parameter or a message callback. (ii) By contrast, if the source is from an adversary, CoCo checks whether the sink is a sensitive API in the browser extension. After checking the source-sink feasibility, CoCo also checks whether there is source-sink-specific sanitization along the data flow. For example, CoCo checks whether the user is informed of cookie access by inspecting control-flow dependencies like an if statement with consent-related APIs such as window.confirm: if so, the user consent is considered a sanitization.
Lastly, CoCo checks the extension's manifest.json to see whether all the involved APIs along the control- and data-flows have the corresponding permission so that the adversary can launch an attack.

4.2.3 Client-side API Modeling. CoCo interacts with a list of modeled, client-side APIs during abstract interpretation, taint propagation, and vulnerability detection. Specifically, we categorize such modeled APIs into two types: event-related and taint-related. Such modeling is done semi-automatically. That is, CoCo has standard function calls for each type, but the determination of API type is
Condition: CoCo treats the external message posting as a sink. decided manually.
Then: The postMessageExternal function under state 𝜎 is re- First, CoCo maintains an event queue and a dictionary of events
duced to setting the message 𝑥 as the sink under the same state. and listeners for event-related APIs. When CoCo analyzes an event
• onMessageExternal-addListener. This rule applies to the re- and its callback function has been registered, CoCo directly creates
ceipt of an external message during a long-lived connection. a thread to analyze the event’s callback function. By contrast, if the
Condition: CoCo treats the message from the external port as callback function has not been registered (e.g., for a sendMessage
tainted and represents it as a wildcard variable New(*). event), CoCo adds the event to the event queue. At the same time,
Action: The function under state 𝜎 is reduced to getting taint CoCo has an event loop, i.e., a special thread, which goes through
from the wildcard message and calling of callback under the all the events in the queue: After the event callback is registered,
same state. CoCo, particularly the event loop, will analyze the callback via
creating a new thread.
4.2.2 Vulnerability Detection. CoCo detects extension vulnerabili- Second, CoCo models all the taint-related functions including
ties via three detailed steps: (i) detection of control- and data-flow sources, sinks, and sanitizations. Specifically, CoCo marks corre-
patterns, (ii) check on source-sink pair feasibility, and (iii) permis- sponding parameter(s) in the source function as tainted, removes
sion check of “manifest.json”. such taints from the function call return value for a sanitization
First, let us start with control- and data-flow patterns. CoCo function, and reports a vulnerability if the corresponding parameter
looks for the following two types of patterns: of a sink function is tainted.
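The two kinds of modeled APIs in Section 4.2.3 can be illustrated with a minimal sketch. It is not CoCo's implementation: the names `EventModel`, `fire`, `eventLoopTick`, `source`, `sanitize`, and `sink` are hypothetical, analysis "threads" are approximated by entries in an array, and taint is a boolean flag on a plain object rather than a property of nodes in the abstract domain.

```javascript
// Hypothetical sketch of the modeled APIs; no name here comes from CoCo's code base.

// --- Event-related modeling: a listener dictionary plus an event queue ---
class EventModel {
  constructor() {
    this.listeners = new Map(); // event name -> registered callback
    this.queue = [];            // events fired before their listener existed
    this.analyzed = [];         // stands in for threads created to analyze callbacks
  }

  // Models a call such as onMessage.addListener(callback).
  register(name, callback) {
    this.listeners.set(name, callback);
  }

  // Models a call such as sendMessage(arg).
  fire(name, arg) {
    if (this.listeners.has(name)) {
      this.analyze(name, arg);        // listener known: analyze immediately
    } else {
      this.queue.push({ name, arg }); // otherwise defer to the event loop
    }
  }

  // One iteration of the event loop over the pending events.
  eventLoopTick() {
    const stillPending = [];
    for (const ev of this.queue) {
      if (this.listeners.has(ev.name)) this.analyze(ev.name, ev.arg);
      else stillPending.push(ev);
    }
    this.queue = stillPending;
  }

  analyze(name, arg) {
    this.analyzed.push(this.listeners.get(name)(arg));
  }
}

// --- Taint-related modeling: sources, sanitizers, and sinks ---
const reports = [];
// A source marks the value it introduces as tainted.
const source = (value) => ({ value, tainted: true });
// A sanitizer returns a value whose taint has been removed.
const sanitize = (obj) => ({ value: obj.value, tainted: false });
// A sink reports when its argument is still tainted.
const sink = (sinkName, obj) => {
  if (obj.tainted) reports.push(sinkName);
  return obj;
};

const events = new EventModel();
events.fire("onMessage", source("alert(1)")); // queued: no listener yet
events.register("onMessage", (msg) => sink("chrome.tabs.executeScript", msg));
events.eventLoopTick();                       // listener now exists: callback analyzed
sink("chrome.tabs.executeScript", sanitize(source("alert(1)"))); // sanitized: no report
```

In this sketch the only report comes from the message that was queued before its listener was registered, which is the ordering the event queue is meant to handle. The sanitized value reaches the same sink without a report.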
CCS ’23, November 26–30, 2023, Copenhagen, Denmark Jianjia Yu, Song Li, Junmin Zhu, & Yinzhi Cao
Figure 3: Three queue structures (ready, running, and waiting) used by CoCo and their relations (i.e., how a thread is transferred from one queue to another)
4.3 Thread scheduler
CoCo’s thread scheduler is responsible for three tasks: (i) scheduling thread execution, (ii) creating threads to analyze code, and (iii) handling inactive (or debris) threads (i.e., those that have finished execution). Specifically, CoCo maintains three queue structures, called “Ready”, “Running”, and “Waiting”, as shown in Figure 3, to achieve the goal. The ready queue is a priority queue maintaining all the threads that are ready to be executed and will be fetched onto the running queue. The running queue maintains a list of threads that are currently executed by the thread executor. The waiting queue contains threads that are waiting for other threads’ results or that have finished execution and are ready to be merged with other threads.
These three queues are connected, and threads are moved among the three queues based on the aforementioned three tasks. Threads in the ready queue are moved to the running queue by scheduling and back to the ready queue by preemption. Then, threads in the running queue may also be moved to the waiting queue if the execution finishes or encounters a branching statement. Lastly, threads in the waiting queue may be moved back to the ready queue if they are activated. We now describe the three tasks of CoCo in detail.
4.3.1 Scheduling. CoCo schedules threads by moving them from the ready queue to the running queue based on a priority value. Then, CoCo also preempts threads in the running queue and moves them to the ready queue with a new priority value after the allocated time is used up. Below we describe the general scheduling criteria of CoCo and then present a detailed, specific priority calculation method used by CoCo.
Scheduling Criteria. CoCo follows the three criteria below to schedule threads.
• Scheduling Criterion 1 [new code]: The analysis of unseen code has a higher priority than seen code. The first criterion says that CoCo prefers analyzing new code over old code that has already been analyzed. Consider an if statement with two branches. The first branch calls a function that has been analyzed before, and the second branch calls another, unseen function. CoCo prioritizes the analysis of the second branch because this branch has a higher probability of containing a vulnerability.
• Scheduling Criterion 2 [nested branches]: Given a conditional statement and a branch of the statement, the analysis of a concurrent branch has a higher priority than another embedded conditional statement under this branch. The second criterion says that, for nested conditional statements, CoCo prefers analyzing code on the top level over code on the low level. The reason is that the top level often contains more high-level semantics compared with the low level, where code is often fragmented with details. Say there exists a complex filter in a target program that matches inputs with multiple nested conditional statements. CoCo can quickly skip the analysis of the filter to analyze the rest of the program.
• Scheduling Criterion 3 [fairness]: Threads that have not been executed for a while have a higher priority than those that were just executed. The third criterion says that CoCo strikes a balance among all the threads when all other conditions are the same. The purpose of this criterion is to prevent starvation and give every thread at least some time to execute.
Priority Calculation. We describe how CoCo calculates the priority in Equation 1 following the three scheduling criteria:

$$P_{\text{child}} = P_{\text{parent}} + \alpha \cdot \text{CovInc}_{\text{last}} - \beta \cdot \text{brDepth} - \gamma \cdot T_{\text{last}} \qquad (1)$$

where $\alpha$, $\beta$, and $\gamma$ are coefficients, $P_{\text{parent}}$ is the priority of the thread’s parent, $\text{CovInc}_{\text{last}}$ is the percentage of increased code coverage (Criterion 1) in the last allocated time slot, $\text{brDepth}$ is the branch depth (Criterion 2), and $T_{\text{last}}$ is the last time (Criterion 3) that the thread was scheduled.
4.3.2 Creation. To start, CoCo creates threads for each component of a browser extension, i.e., content and background scripts. Then, during analysis, CoCo gradually creates more threads following three creation criteria.
• Creation Criterion 1 [conditional statement]: CoCo creates a thread for each branch of a conditional statement. CoCo creates new threads for each branch of a conditional statement and also puts the original thread in the waiting queue, where it becomes inactive and waits for the branches to finish. The reason is that CoCo can then analyze later branches immediately without letting them wait for the earlier branches to finish. Such creation helps CoCo reach more code as soon as possible.
• Creation Criterion 2 [event callbacks]: CoCo creates a thread for each event callback function. CoCo creates new threads for event callbacks when they are registered. For example, when CoCo encounters setTimeout, CoCo creates a new thread for the callback function in its parameter. The procedure is different for message-related events because there are two parties involved. CoCo maintains an event queue to store all the message events sent by sendMessage. When the onMessage listener is registered, CoCo allocates a special thread that constantly loops through all the messages for handling. Details are described in Section 4.2.3. The reason for such handling of messages is similar: CoCo can quickly reach new code that is embedded as part of event callbacks.
• Creation Criterion 3 [sequential statements]: CoCo may create threads for sequential statements if they are dataflow independent and the preceding statement is complex to analyze (e.g., the analysis time exceeds a threshold). This is a special, rarely occurring case. Say there are two statements separated by a comma. The first statement takes very long to analyze, e.g., it is an Immediately Invoked Function Expression (IIFE) [3]. CoCo will create a thread for the second statement assuming that there are no data dependencies. Later on, if the first statement modifies an object read by the second statement, CoCo will abort the
CoCo: Efficient Browser Extension Vulnerability Detection via Coverage-guided, Concurrent Abstract Interpretation CCS ’23, November 26–30, 2023, Copenhagen, Denmark
execution of the second statement. The reason for Criterion 3 is also to skip complex analysis and reach vulnerability locations.
4.3.3 Thread Activation and Merging. The purpose of activation is to bring threads in the waiting queue back to the ready queue after branching. There are three different policies for such activation to merge different threads.
• [Policy 1] CoCo moves the parent thread from the waiting state to the ready state once one of its created threads terminates.
• [Policy 2] CoCo moves the parent thread from the waiting state to the ready state when all of its created threads terminate.
• [Policy 3] CoCo creates a copy of the parent thread, and moves the original parent thread from the waiting state to the ready state once one of its created threads terminates. Then, when the other branch threads terminate, CoCo moves the copied parent thread to the ready state.
When CoCo moves a parent thread from waiting to ready, CoCo merges the states in the graph-based abstract domains of the debris child thread with the parent. The merge operates on the graph level and generates a new graph based on updates in the graphs of child threads. More specifically, the merge starts from the graph of the parent thread and then gradually adds or deletes nodes and edges. On one hand, if an edge or a node is added by any child thread, CoCo will add the same edge or node with the corresponding tag marking the branch. On the other hand, if an edge or a node is deleted by all the threads, CoCo will delete the edge or the node. Otherwise, if one branch still keeps the edge or node, CoCo will mark the edge or the node with the correct branch tag.
5 IMPLEMENTATION AND SETUPS
Implementation. Our implementation is open-source with 4,020 lines of code, which is available at this anonymous repo [9]. Our implementation has a pre-processing module, which analyzes manifest.json and extracts all the JavaScript for CoCo to analyze. CoCo will also include modeled client-side APIs (which are shown in Table 1) with each analyzed file. Table 1 shows the sensitive APIs, which are stored as easily editable text files in CoCo. Our Abstract Syntax Tree (AST) parser is based on an open-source tool, Esprima (https://esprima.org/). Our implementation of the thread executor, specifically the abstract interpretation, is based on an open-source project, ODGen (https://github.com/Song-Li/ODGen), and its representation of the Object Dependence Graph as the abstract domain. All the open-source code is excluded from the aforementioned Lines of Code.
Experimental Setups. We describe our datasets and the variations of CoCo and the state of the art. We prepare two datasets for the evaluation of CoCo: one is unlabelled for zero-day vulnerability detection and the other is labeled mostly by prior works (DoubleX [35] and EmPoWeb [55]). (1) Large-scale extension dataset. Specifically, we crawled 185,076 Chrome extensions from the Chrome Web Store in September 2021. We then exclude 20,622 empty extensions, 35 malformed extensions (one that cannot be unzipped, 33 with incorrect manifest.json, and one without manifest.json), and 19,289 themes. (2) Vulnerable extension dataset. This dataset contains 207 vulnerable extensions provided by prior works (DoubleX [35] and EmPoWeb [55]). Specifically, the DoubleX dataset has 184 vulnerable extensions and EmPoWeb has 73 vulnerable extensions. There are 44 overlaps between the two datasets, and the combination of the two datasets has 213 extensions containing 256 vulnerabilities. This is because some of these extensions contain more than one vulnerability. We manually verify the exploitability of this combined dataset and find that six vulnerabilities are not exploitable due to a lack of data flows, control flows, or solvable constraints. We thus exclude these six from the dataset.
We now describe the different baselines and CoCo variants used in our evaluation. (1) DoubleX [35] is the original implementation from the authors with the newly updated sensitive APIs in Table 1. (2) ODGen-ext [43] is the original, sequential implementation plus the new client-side extension model (e.g., message passing) from CoCo. We use ODGen-ext as another baseline because the original ODGen only supports the analysis of Node.js modules but not browser extensions. (3) CoCo-unguided is a variant of CoCo without coverage guidance. We use CoCo-unguided for the ablation study to understand the importance of coverage guidance.
6 EVALUATION
In this section, we evaluate CoCo and answer five Research Questions (RQs). Our evaluation is on a virtual machine with 128G memory, 20 Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz cores, running Ubuntu 20.04.
6.1 RQ1: Zero-day vulnerabilities
In this research question, we show the capability of CoCo in discovering zero-day vulnerabilities using the large-scale extension dataset with 185,076 extensions. Specifically, our definition of zero-day under the paper’s context, following Wikipedia [12], is a browser extension vulnerability that is previously unknown to those who are interested in its mitigation. Therefore, we consider a vulnerability found by CoCo as zero-day if it satisfies the following two criteria: (i) The vulnerability cannot be detected by other tools, particularly neither DoubleX nor ODGen-ext. (ii) The vulnerability is not revealed online (e.g., as a bug report, with a CVE identifier, or in another vulnerability dataset) to the best of our knowledge based on an extensive Google search.
In total, CoCo uniquely outputs 301 reports from our large-scale extension dataset, which are not reported by either DoubleX or ODGen-ext. Because of the large number, we only manually inspect 50 reports whose corresponding extensions have more than 1,000 users and find 43 zero-day vulnerabilities. A selective list (ranked by # of users) is shown in Table 3. There are five columns in Table 3: extension name, # of users, vulnerability type, status (i.e., whether the vulnerability is reported or confirmed by developers), and the exploit code. Note that CoCo detected vulnerabilities in popular extensions (e.g., Nuance PowerMic Web Extension with more than 200K users). Take Ceibal Library Reader, an extension with 100K+ users, which allows saving the URL of the books that have been downloaded, for example. CoCo finds a vulnerability that allows an adversary to create a new bookmark by sending a message to the extension.
We also show all the zero-day vulnerabilities in Table 4 with a breakdown. Privileged storage access is the most popular one and we further break it down into cookie, bookmark/history and other extension storage. One reason is the large number of APIs involved
in Table 1. The other is that many extensions may access storage like localStorage. We also break down all the zero-day vulnerabilities by their exploitable scope, i.e., whether they are exploitable by any websites or those that are defined in the allowlist in the manifest file. Overall, about 41.9% of the vulnerabilities (18 out of 43) can be exploited by any website, and the rest require the adversary to be a specific site in the allowlist.

Table 4: A breakdown by vulnerability type (the list of related sensitive APIs for each type can be found in Table 1) and exploitable scope (i.e., any websites or those that are defined in the allowlist in the manifest file)
Vulnerability Type           # vulnerabilities = # exploitable by any sites + # exploitable by allowlisted sites
Code Execution                4 =  3 +  1
Privileged AJAX requests      5 =  3 +  2
Arbitrary File Downloads      1 =  0 +  1
Privileged Storage Access    33 = 12 + 21
  - Cookie                    4 =  1 +  3
  - Bookmark/history          3 =  2 +  1
  - Other storage            26 =  9 + 17
Total                        43 = 18 + 25

There are two major reasons that CoCo detects zero-day vulnerabilities that are not found by prior works. First, there are many dynamic language features, e.g., bracket syntax and promise, which are not handled by prior works. Here we list an example of one zero-day vulnerability from “Abcd PDF - Chrome New Tab Page” found by CoCo in Listing 2. The vulnerability allows an adversary to obtain the browsing history and bookmarks of the user. Specifically, the exploit code is shown in Lines 2–3. The adversary sends a query at Line 3, which is processed by the content script at Line 5. The content script then sends a message to the background at Line 7, which is received at Line 18. Then, Line 21 calls the searchHistory function at Line 27, which calls a sensitive function at Line 29 and then sends the results back at Line 22. Then, the message listener registered by the adversary at Line 2 receives the results. DoubleX cannot detect this vulnerability because of the heavy involvement of Promise and the then function, leading to missing call edges in DoubleX’s results. Such dynamic features are often the reason for missed detections of vulnerabilities by prior works such as DoubleX and EmPoWeb. Second, there are vulnerabilities that are triggered by complex inputs and not considered by prior works. “Nuance PowerMic Web Extension” in Table 3 is such an example. The exploit code of Table 3 shows that the vulnerability is triggered by a custom event listened for by the extension as “_nuca_link_request”. Such an event is not simulated by DoubleX.

 1  // exploit code (which retrieves browsing history)
 2  window.addEventListener("message", function (msg) { console.log(msg) })
 3  window.postMessage({ from: "__newtab", event: "search_history", query: "" })
 4  // content_script.js
 5  window.addEventListener('message', function (e) {
 6    ...
 7    chrome.runtime.sendMessage(e.data, (res) => {
 8      const sender = { event: e.data.event, res: res };
 9      sender.from = 'ext';
10      window.postMessage(sender, '*');
11    });
12  });
13  // background.js
14  chrome.runtime.onMessage.addListener((request, sender, sendResponse) => {
15    listener(request, sender, sendResponse);
16    return true;
17  });
18  function listener(request, sender, sendResponse) {
19    switch (request.event) {
20      case "search_history":
21        searchHistory(request.query).then(res => {
22          sendResponse(res);
23        });
24        break;
25    }
26  }
27  function searchHistory(query) {
28    return new Promise((resolve, reject) => {
29      chrome.history.search({ text: query, maxResults: 10 }, (res) => {
30        resolve(res);
31      });
32    });
33  }
Listing 2: A Zero-day Vulnerability Example: Abcd PDF - Chrome New Tab Page

6.2 RQ2: FP and FN
In this research question, we evaluate the false positives and negatives of CoCo and compare with prior works. Figure 4 shows the Venn diagrams of the reported results (including false positives) of all three approaches upon our large-scale extension dataset. First, CoCo detects all the extensions reported by ODGen-ext: This is expected because the purpose of our concurrent, coverage-driven abstract interpretation is to increase code coverage and detect more vulnerable extensions. Second, the detection results of CoCo and DoubleX overlap, but each approach has its unique results.
objects. Examples of such functions are regular expressions and $.Deferred.
6.3 RQ3: Ablation Study
In this research question, we perform an ablation study to understand how each “Co” contributes to the analysis results of CoCo.
6.3.1 Coverage-guided Analysis. In this part, we show an ablation study of the impacts of coverage-guided analysis by removing coverage guidance from CoCo. Table 7 shows the comparison among CoCo, CoCo-unguided, and ODGen-ext. CoCo-unguided detects fewer vulnerabilities compared with CoCo, but more than ODGen-ext. The reason is that CoCo-unguided often analyzes deeply embedded branches with repeated function calls while ignoring some vulnerabilities that exist in the high-level branches.

Table 7: Ablation Study on Concurrency and Coverage-guided Analysis.
Approach         Large-scale dataset    Vulnerable extension dataset
CoCo             1,374                  248
CoCo-unguided    1,077                  241
ODGen-ext          832                  228

6.3.2 Concurrency. In this part, we show the advantage of concurrency brought by CoCo in the analysis using an ablation study, removing concurrency from CoCo, which essentially becomes ODGen-ext. Specifically, we randomly pick 100 extensions, analyze them and find that 24 of them will time out after ten minutes using ODGen-ext. Then, we compare the code coverage (statement coverage) of these 24 extensions with and without concurrency brought by CoCo. We run each extension extensively with and without concurrency until the code coverage is stable (i.e., staying the same) after ten minutes. Figure 5 shows the bar graph of the code coverage improvement comparing CoCo and ODGen-ext. CoCo improves the code coverage by 0% to 14% compared with ODGen-ext with a median value of 4%. This advantage is brought by the concurrent execution of CoCo.
We also show two examples of code coverage of ODGen-ext and CoCo over time in Figures 6 and 7 respectively. First, the code coverage of ODGen-ext will increase in the beginning but then stay stable after being stuck in analyzing old code. By contrast, the code coverage of a concurrent analysis will increase even if one thread is stuck with analyzing old code. Second, the code coverage of a concurrent analysis is often below 100% even if we give CoCo enough time. One reason is that some extensions may include dead code: For example, if the extension developer includes a JavaScript library, many functions in the library may not be called. This is an advantage: even if there exist vulnerabilities in the dead code, they cannot be exploited by an adversary. Lastly, the code coverage of sequential analysis may be larger than that of concurrent analysis in the beginning, as shown in Figure 7. Then, the coverage of concurrent analysis exceeds that of sequential analysis in the end. The reason is that concurrent analysis has many threads with context switching, which may fall behind sequential analysis in the beginning.
Lastly, we also study how Creation Criteria 2 and 3 contribute to vulnerability detection. Our experiment methodology is as follows. We run CoCo without Creation Criteria 2 or 3 upon our vulnerable extension dataset and compare the number of detected vulnerabilities. Our result shows that Creation Criterion 2 helps the detection of 19 out of 250 vulnerabilities and Creation Criterion 3 the detection of 14 out of 250 vulnerabilities.
6.4 RQ4: Performance
In this research question, we show the performance of CoCo using three metrics: total analysis time, # of threads, and total memory over time. Our methodology is as follows. First, we randomly select 500 extensions from our large-scale dataset and observe the analysis time of three approaches: CoCo, DoubleX, and ODGen-ext. Figure 8 shows the Cumulative Distribution Function (CDF) of the total analysis time of the three approaches. CoCo is slightly slower than ODGen-ext due to the setup of multiple threads and the time used for switching between threads. Eventually, CoCo manages to finish analyzing more extensions compared with ODGen-ext due to its high code coverage. In the end, the number of finished extensions of CoCo is the same as DoubleX, i.e., there are 32 extensions (6.4%) that time out after ten minutes for both CoCo and DoubleX.
Second, we select five extensions from the aforementioned 500: two with the longest analysis time and the other three randomly from the 500 above. We then show the # of threads and memory overhead over the analysis time. Figure 9 shows the number of threads over analysis time. The number of threads starts to increase when CoCo encounters branching statements, reaches more than 100 for one extension, and then decreases after analyzing branching statements for merging. In another case, the number of threads keeps increasing due to a large number of branching statements in the extension until we kill CoCo after the ten-minute time-out.
Lastly, Figure 10 shows the CDF of the maximum consumed memory in the unit of mebibyte of three approaches, DoubleX, ODGen-ext, and CoCo. The used memory of CoCo is similar to the one used by ODGen-ext, because the main memory consumption is the storage of the graph structure. Both ODGen-ext and CoCo used less memory compared with DoubleX for more than 95% of extensions because of different representations of the program dependency graph and object dependence graph. ODGen-ext and CoCo used more memory for the remaining 5% of extensions because abstract interpretation will generate more nodes for some particular program structures like recursion.
7 DISCUSSION
Responsible Disclosure. We follow standard responsible disclosure procedures to inform vulnerable extension developers and give them 45 days to fix the vulnerability before public release. That said, we wrote emails to all the developers of the zero-day vulnerabilities belonging to 84 extensions when we manually verified them as exploitable regardless of whether the vulnerability is detected by CoCo, DoubleX, or ODGen-ext. More specifically, we only find the developers’ contact information of 39 out of 43 vulnerabilities that are uniquely detected by CoCo as vulnerable, and make corresponding reports to the developers of the 37 affected extensions. In addition, we also find contact information of zero-day vulnerabilities that are detected by either DoubleX or ODGen-ext as well and confirmed as exploitable. Thus, we make corresponding reports to those developers of affected extensions as well. Note that many of these vulnerabilities reported by either DoubleX or ODGen-ext are
also detected by CoCo; that is, only four vulnerabilities belonging to four extensions are only detected by DoubleX but not CoCo. In the email, we describe where the vulnerability is located and how we suggest patching the vulnerability. So far we have only received one confirmation and, unfortunately, none of the vulnerable extensions have been fixed yet. We are still working with developers as well as extension marketplace operators for fixes.
Manifest V3 Extensions. Recently, Google Chrome released Manifest V3, which makes two major changes beyond the manifest file: (i) migrating background scripts to service workers, (ii) adding additional constraints for cross-origin requests in the content scripts. CoCo is compatible with Manifest V3 extensions with three major changes. First, CoCo supports a special parsing component for all V3 manifest files. Second, CoCo creates a special thread for service workers just like background scripts. Here are the analysis results. So far, there are 3,798 extensions in our large-scale extension dataset using Manifest V3. CoCo generates 21 reports as potentially vulnerable and our further manual verification shows that 19 out of 21 are true positives.
Constraint Solving. CoCo statically finds a data path between an adversary-controlled input and a sink function. The found path may not be feasible due to certain control- or data-flow constraints. This is the same as the state-of-the-art approach, particularly DoubleX. Furthermore, the number of false positives caused by a lack of constraint solving is relatively small based on our manual checking. We leave it as our future work to include constraint solvers in CoCo.
Soundness. CoCo is the same as all previous static analysis of JavaScript, which is not sound. Specifically, we describe soundness conditions below. First, dynamic object creation in CoCo is sound only if all the related variables can be precisely resolved. That is, if one variable is related to user inputs, CoCo makes an overapproximation. Second, dynamic generation of code in CoCo is sound only if the generated code value is resolvable. That is, if the code generation parameter is related to user inputs, CoCo will skip the analysis. Note that since dynamic code generation is also a sink for code execution, the implementation choice does not affect vulnerability analysis.

Figure 5: Code Coverage Comparison of ODGen-ext and CoCo over 24 Timed-out Extensions
Figure 6: Code Coverage Increase of CoCo vs. ODGen-ext over Time of Zuora RBM Connect Plugin
Figure 7: Code Coverage Increase over Time of CoCo vs. ODGen-ext for a specific extension [7]
[Figures 5–7: y-axis: Percentage of covered code (%); x-axis label as extracted: Time (in second)]
Figure 8: CDF of Total Analysis Time for 500 Random Extensions
Figure 9: # of Threads vs. Analysis Time of CoCo
Figure 10: CDF of Maximum Memory for 500 Random Extensions
[Figures 8–10: axis labels as extracted: Percentage of finished extensions (%); Time (in second); Maximum Memory used (in MiB). Legend: ODGen-ext, CoCo, DoubleX]

8 RELATED WORK
Browser Extension Security. In 2007, Louw et al. [58] studied the security issues of browser extensions and introduced an integrity-checking mechanism to control the extensions’ installation and loading process. In 2010, VEX [14] applied static information-flow analysis to browser extensions and used static flow patterns and unsafe programming practices to highlight potential security vulnerabilities. Later, IBEX [36] adopted Datalog and data flow policies to limit the API usage of browser extensions. Then, Carlini et al. [26] evaluated the effectiveness of the Chrome extension security architecture by a security review of 100 Chrome extensions, and Wang et al. [59] did a measurement study to analyze extension behaviors. In 2013, Sentinel [47] introduced a user-controllable policy enforcer for the Firefox browser that gives fine-grained control to the JavaScript Firefox extensions. In 2014 and 2015, multiple researchers focused on the detection of malicious extensions and vulnerabilities. Hulk [40] leveraged extension-related dynamic pages and employed a fuzzer to detect malicious extensions. Calzavara et al. [16] proposed a formal security analysis of browser extensions to show that message-passing APIs may lead to privilege escalation attacks. WebEval [37] adopted dynamic analysis, static analysis, and reputation tracking to detect malicious extensions. Onarlioglu
CCS ’23, November 26–30, 2023, Copenhagen, Denmark Jianjia Yu, Song Li, Junmin Zhu, & Yinzhi Cao
et al. [48] investigated the security issues of the Firefox XPCOM extension APIs and introduced methods to detect related vulnerabilities. In 2016, researchers used static and dynamic analysis to detect vulnerabilities and sensitive information leakage. Saini et al. [54] extended colluding attacks to the extension domain and showed that the collusion between two extensions may lead to private information leakage. CrossFire [15] adopted a multi-stage lightweight static analyzer to detect instances of extension-reuse vulnerabilities on top of Firefox. ExtensionGuard [27] used a customizable dynamic taint tracker to mark sensitive information and then detect information leakage at runtime.
In 2017 and 2018, several works studied privacy issues of browser extensions. Starov et al. [56] reported a large-scale study of privacy leakage enabled by extensions, and Ex-Ray [60] presented a dynamic technique based on network traffic patterns for identifying privacy-violating extensions. Aggarwal et al. [13] detected and defended against spying extensions by using an RNN with the sequence of browser API calls. Mystique [28] used static analysis to obtain the data-flow and control-flow graphs and modified Chromium to detect the leakage of private information. In 2019 and 2020, EmPoWeb [55] used call graph analysis to investigate how communication-related APIs can influence the security of browser extensions. Pantelaios et al. [51] detected malicious browser extensions through their update deltas and conducted a large-scale to-date measurement study. In 2021, DoubleX [35] analyzed taint flows to detect browser extension vulnerabilities without the support of dynamic features. In 2022, Eriksson et al. [31] presented a systematic study of attack entry points in the browser extension ecosystem.
Note that prior works on browser extension vulnerability detection either adopted dynamic analysis [16, 37], used policy enforcers [47, 48], or only had limited support for dynamic features [35].
General Web Security. General web security [17–20, 22–25, 49, 50, 61, 62] has been studied for many years. We start with static analysis. Jensen et al. [38] use static analysis to detect type-related and dataflow-related programming errors of client-side JavaScript applications that interact with the HTML, DOM, and browser APIs. HideNoSeek [32], JShield [21], JaSt [34], and JStap [33] adopt static analysis to detect malicious client-side JavaScript applications. JSIsolate [64] provides an isolated and reliable JavaScript execution environment based on the dependency relationship of different JavaScript program components. JAW [41] detects client-side CSRF vulnerabilities by modeling browser objects in Hybrid Property Graphs. As for dynamic analysis, Deemon [52] combines dynamic analysis and property graphs to detect CSRF vulnerabilities. Melicher et al. [45] and Steffens et al. [57] adopt dynamic analysis to detect DOM-based XSS vulnerabilities. JSObserver [63] focuses on the code integrity problem of client-side JavaScript that is caused by global identifier conflicts. Black Widow [30], a black-box data-driven approach to web crawling and scanning, finds more cross-site scripting vulnerabilities with no false positives.
9 CONCLUSION
In this paper, we design and implement a new framework, called CoCo, to parallelize abstract interpretation for analyzing browser extensions with concurrent taint analysis. Specifically, CoCo creates concurrent analysis for new branches and events and propagates taints across different threads. Then, CoCo schedules analysis to prioritize code coverage so that it can always try to reach new code and find vulnerabilities. We evaluate CoCo using both ground truth and real-world extension datasets and compare CoCo with the state-of-the-art approach, DoubleX, as well as a modified version of ODGen. Our evaluation shows that CoCo detects zero-day vulnerabilities that cannot be detected by state-of-the-art approaches.
ACKNOWLEDGEMENT
We would like to thank the anonymous shepherd and reviewers for their helpful comments and feedback. This work was supported in part by the National Science Foundation (NSF) under grants CNS-21-54404 and CNS-20-46361 and the Defense Advanced Research Projects Agency (DARPA) under AFRL Definitive Contract FA875019C0006 and a DARPA Young Faculty Award (YFA) under Grant Agreement D22AP00137-00 as well as an Amazon Research Award (ARA) 2021 and a Visa Research Award. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF, DARPA, Amazon, or Visa.
REFERENCES
[1] [n.d.]. Architecture overview. https://developer.chrome.com/docs/extensions/mv3/architecture-overview/.
[2] [n.d.]. Changes to Cross-Origin Requests in Chrome Extension Content Scripts. https://www.chromium.org/Home/chromium-security/extension-content-script-fetches/.
[3] [n.d.]. Definition of IIFE. https://developer.mozilla.org/en-US/docs/Glossary/IIFE.
[4] [n.d.]. FromDocToPdf: exposes browsing history to all websites. https://bugs.chromium.org/p/project-zero/issues/detail?id=1557.
[5] [n.d.]. Google Scholar Button. https://chrome.google.com/webstore/detail/google-scholar-button/ldipcbpaocekfooobnbcddclnhejkcpn. [Online; Accessed on 07-June-2022].
[6] [n.d.]. Grammarly: Grammar Checker and Writing App. https://chrome.google.com/webstore/detail/grammarly-grammar-checker/kbfnbcaeplbcioakkpcpgfkobkghlhen. [Online; Accessed on 07-June-2022].
[7] [n.d.]. Kino No Tabi Backgrounds HD S Journey New Tab. https://chrome-stats.com/d/agpijpbfbjfdahhjjigjbhfeogijlajm.
[8] [n.d.]. Message passing. https://developer.chrome.com/docs/extensions/mv3/messaging/.
[9] [n.d.]. Open-source Repository. https://github.com/CoCoAbstractInterpretation/CoCo.git.
[10] [n.d.]. Operational semantics (2022) Wikipedia. https://en.wikipedia.org/wiki/Operational_semantics. [Online; Accessed on March 21, 2023].
[11] [n.d.]. Video Downloader Extension: Universal XSS. https://bugs.chromium.org/p/project-zero/issues/detail?id=1555.
[12] [n.d.]. [Wikipedia] Zero-day (computing). https://en.wikipedia.org/wiki/Zero-day_(computing).
[13] Anupama Aggarwal, Bimal Viswanath, Liang Zhang, Saravana Kumar, Ayush Shah, and Ponnurangam Kumaraguru. 2018. I Spy with My Little Eye: Analysis and Detection of Spying Browser Extensions. In 2018 IEEE European Symposium on Security and Privacy (EuroS&P). 47–61. https://doi.org/10.1109/EuroSP.2018.00012
[14] Sruthi Bandhakavi, Samuel T. King, P. Madhusudan, and Marianne Winslett. 2010. VEX: Vetting Browser Extensions for Security Vulnerabilities. In 19th USENIX Security Symposium (USENIX Security 10). Washington, DC.
[15] Ahmet Salih Buyukkayhan, Kaan Onarlioglu, William K. Robertson, and Engin Kirda. 2016. CrossFire: An Analysis of Firefox Extension-Reuse Vulnerabilities. In NDSS 2016.
[16] Stefano Calzavara, Michele Bugliesi, Silvia Crafa, and Enrico Steffinlongo. 2015. Fine-Grained Detection of Privilege Escalation Attacks on Browser Extensions. In Programming Languages and Systems, Jan Vitek (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 510–534.
[17] Yinzhi Cao, Zhanhao Chen, Song Li, and Shujiang Wu. 2017. Deterministic Browser. In CCS (Dallas, Texas, USA) (CCS ’17). Association for Computing Machinery, New York, NY, USA, 163–178. https://doi.org/10.1145/3133956.3133996
CoCo: Efficient Browser Extension Vulnerability Detection via Coverage-guided, Concurrent Abstract Interpretation CCS ’23, November 26–30, 2023, Copenhagen, Denmark
[18] Yinzhi Cao, Song Li, Erik Wijmans, et al. 2017. (Cross-) Browser Fingerprinting via OS and Hardware Level Features. In NDSS.
[19] Yinzhi Cao, Zhichun Li, Vaibhav Rastogi, and Yan Chen. 2010. Virtual Browser: A Web-Level Sandbox to Secure Third-Party JavaScript without Sacrificing Functionality. In CCS (CCS ’10). Association for Computing Machinery.
[20] Yinzhi Cao, Xiang Pan, Yan Chen, and Jianwei Zhuge. 2014. JShield: Towards Real-Time and Vulnerability-Based Detection of Polluted Drive-by Download Attacks. In Proceedings of the 30th Annual Computer Security Applications Conference (ACSAC ’14). Association for Computing Machinery, New York, NY, USA.
[21] Yinzhi Cao, Xiang Pan, Yan Chen, and Jianwei Zhuge. 2014. JShield: towards real-time and vulnerability-based detection of polluted drive-by download attacks. Proceedings of the 30th Annual Computer Security Applications Conference (2014).
[22] Yinzhi Cao, Xiang Pan, Yan Chen, Jianwei Zhuge, Xiaobin Qian, and Jian Fu. 2015. Malicious code detection technologies. US Patent 9,213,839.
[23] Yinzhi Cao, Vaibhav Rastogi, Zhichun Li, Yan Chen, and Alexander Moshchuk. 2013. Redefining web browser principals with a Configurable Origin Policy. In 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 1–12. https://doi.org/10.1109/DSN.2013.6575317
[24] Yinzhi Cao, Yan Shoshitaishvili, Kevin Borgolte, Christopher Kruegel, Giovanni Vigna, and Yan Chen. 2014. Protecting Web-Based Single Sign-on Protocols against Relying Party Impersonation Attacks through a Dedicated Bi-directional Authenticated Secure Channel. In Research in Attacks, Intrusions and Defenses.
[25] Yinzhi Cao, Vinod Yegneswaran, and Yan Chen. 2012. PathCutter: Severing the Self-Propagation Path of XSS JavaScript Worms in Social Web Networks. In NDSS.
[26] Nicholas Carlini, Adrienne Porter Felt, and David Wagner. 2012. An Evaluation of the Google Chrome Extension Security Architecture. In 21st USENIX Security Symposium (USENIX Security 12). Bellevue, WA, 97–111.
[27] Wentao Chang and Songqing Chen. 2016. ExtensionGuard: Towards runtime browser extension information leakage detection. In 2016 IEEE Conference on Communications and Network Security (CNS). 154–162.
[28] Quan Chen and Alexandros Kapravelos. 2018. Mystique: Uncovering Information Leakage from Browser Extensions. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security.
[29] Patrick Cousot. 1996. Abstract interpretation. ACM Computing Surveys (CSUR) 28, 2 (1996), 324–328.
[30] Benjamin Eriksson, Giancarlo Pellegrino, and Andrei Sabelfeld. 2021. Black Widow: Blackbox Data-driven Web Scanning. In 2021 IEEE Symposium on Security and Privacy (SP).
[31] Benjamin Eriksson, Pablo Picazo-Sanchez, and Andrei Sabelfeld. 2022. Hardening the Security Analysis of Browser Extensions. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing (SAC ’22).
[32] Aurore Fass, Michael Backes, and Ben Stock. 2019. HideNoSeek: Camouflaging Malicious JavaScript in Benign ASTs. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery.
[33] Aurore Fass, Michael Backes, and Ben Stock. 2019. JStap: A Static Pre-Filter for Malicious JavaScript Detection. In Proceedings of the 35th Annual Computer Security Applications Conference (San Juan, Puerto Rico, USA) (ACSAC ’19). Association for Computing Machinery, 257–269.
[34] Aurore Fass, Robert P. Krawczyk, Michael Backes, and Ben Stock. 2018. JaSt: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript. In Detection of Intrusions and Malware, and Vulnerability Assessment.
[35] Aurore Fass, Dolière Francis Somé, Michael Backes, and Ben Stock. 2021. DoubleX: Statically Detecting Vulnerable Data Flows in Browser Extensions at Scale. In ACM CCS 2021.
[36] Arjun Guha, Matthew Fredrikson, Benjamin Livshits, and Nikhil Swamy. 2011. Verified Security for Browser Extensions. In 2011 IEEE Symposium on Security and Privacy. 115–130. https://doi.org/10.1109/SP.2011.36
[37] Nav Jagpal, Eric Dingle, Jean-Philippe Gravel, Panayiotis Mavrommatis, Niels Provos, Moheeb Abu Rajab, and Kurt Thomas. 2015. Trends and Lessons from Three Years Fighting Malicious Extensions. In 24th USENIX Security Symposium.
[38] Simon Holm Jensen, Magnus Madsen, and Anders Møller. 2011. Modeling the HTML DOM and Browser API in Static Analysis of JavaScript Web Applications. In Proc. 8th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE).
[39] Simon Holm Jensen, Anders Møller, and Peter Thiemann. 2009. Type Analysis for JavaScript. In Proc. 16th International Static Analysis Symposium (SAS) (LNCS, Vol. 5673). Springer-Verlag.
[40] Alexandros Kapravelos, Chris Grier, Neha Chachra, Christopher Kruegel, Giovanni Vigna, and Vern Paxson. 2014. Hulk: Eliciting Malicious Behavior in Browser Extensions. In 23rd USENIX Security Symposium.
[41] Soheil Khodayari and Giancarlo Pellegrino. 2021. JAW: Studying Client-side CSRF with Hybrid Property Graphs and Declarative Traversals. In 30th USENIX Security Symposium (USENIX Security 21). 2525–2542.
[42] Song Li, Mingqing Kang, Jianwei Hou, and Yinzhi Cao. 2021. Detecting Node.js Prototype Pollution Vulnerabilities via Object Lookup Analysis. In ESEC/FSE ’21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering.
[43] Song Li, Mingqing Kang, Jianwei Hou, and Yinzhi Cao. 2022. Mining Node.js Vulnerabilities via Object Dependence Graph and Query. In 31st USENIX Security Symposium (USENIX Security 22). USENIX Association, Boston, MA.
[44] Laurent Mauborgne and Xavier Rival. 2005. Trace partitioning in abstract interpretation based static analyzers. In European Symposium on Programming. Springer, 5–20.
[45] William Melicher, Anupam Das, Mahmood Sharif, Lujo Bauer, and Limin Jia. 2018. Riding out DOMsday: Towards Detecting and Preventing DOM Cross-Site Scripting. In Network and Distributed System Security Symposium (NDSS).
[46] Benjamin Barslev Nielsen, Behnaz Hassanshahi, and François Gauthier. 2019. Nodest: Feedback-Driven Static Analysis of Node.Js Applications. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 455–465.
[47] Kaan Onarlioglu, Mustafa Battal, William Robertson, and Engin Kirda. 2013. Securing Legacy Firefox Extensions with SENTINEL. In Detection of Intrusions and Malware, and Vulnerability Assessment, Konrad Rieck, Patrick Stewin, and Jean-Pierre Seifert (Eds.).
[48] Kaan Onarlioglu, Ahmet Salih Buyukkayhan, William Robertson, and Engin Kirda. 2015. SENTINEL: Securing Legacy Firefox Extensions. Computers & Security 49 (2015), 147–161. https://doi.org/10.1016/j.cose.2014.12.002
[49] Xiang Pan, Yinzhi Cao, and Yan Chen. 2015. I do not know what you visited last summer: Protecting users from third-party web tracking with trackingfree browser. In NDSS.
[50] Xiang Pan, Yinzhi Cao, Shuangping Liu, Yu Zhou, Yan Chen, and Tingzhe Zhou. 2016. CSPAutoGen: Black-Box Enforcement of Content Security Policy upon Real-World Websites. In CCS 2016 (CCS ’16).
[51] Nikolaos Pantelaios, Nick Nikiforakis, and Alexandros Kapravelos. 2020. You’ve Changed: Detecting Malicious Browser Extensions through Their Update Deltas. Association for Computing Machinery, New York, NY, USA, 477–491.
[52] Giancarlo Pellegrino, Martin Johns, Simon Koch, Michael Backes, and Christian Rossow. 2017. Deemon: Detecting CSRF with Dynamic Analysis and Property Graphs. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (Dallas, Texas, USA) (CCS ’17).
[53] Xavier Rival and Laurent Mauborgne. 2007. The trace partitioning abstract domain. ACM Transactions on Programming Languages and Systems (TOPLAS) 29, 5 (2007), 26–es.
[54] Anil Saini, Manoj Singh Gaur, Vijay Laxmi, and Mauro Conti. 2016. Colluding browser extension attack on user privacy and its implication for web browsers. Computers & Security 63 (2016), 14–28.
[55] Dolière Francis Somé. 2019. EmPoWeb: Empowering Web Applications with Browser Extensions. In IEEE Security and Privacy Symposium.
[56] Oleksii Starov and Nick Nikiforakis. 2017. Extended Tracking Powers: Measuring the Privacy Diffusion Enabled by Browser Extensions. In Proceedings of the 26th International Conference on World Wide Web (WWW ’17).
[57] Marius Steffens, Christian Rossow, Martin Johns, and Ben Stock. 2019. Don’t Trust The Locals: Investigating the Prevalence of Persistent Client-Side Cross-Site Scripting in the Wild. In NDSS.
[58] Mike Ter Louw, Jin Soon Lim, and V. N. Venkatakrishnan. 2007. Extensible Web Browser Security. In Proceedings of the 4th DIMVA (2007).
[59] Jiangang Wang, Xiaohong Li, Xuhui Liu, Xinshu Dong, Junjie Wang, Zhenkai Liang, and Zhiyong Feng. 2012. An Empirical Study of Dangerous Behaviors in Firefox Extensions. In Information Security.
[60] Michael Weissbacher, Enrico Mariconti, Guillermo Suarez-Tangil, Gianluca Stringhini, William Robertson, and Engin Kirda. 2017. Ex-Ray: Detection of History-Leaking Browser Extensions. In ACSAC 2017.
[61] Shujiang Wu, Song Li, Yinzhi Cao, and Ningfei Wang. 2019. Rendered Private: Making GLSL Execution Uniform to Prevent WebGL-based Browser Fingerprinting. In USENIX Security.
[62] Shujiang Wu, Pengfei Sun, Yao Zhao, and Yinzhi Cao. 2023. Him of Many Faces: Characterizing Billion-scale Adversarial and Benign Browser Fingerprints on Commercial Websites. In 30th Annual Network and Distributed System Security Symposium, NDSS 2023, San Diego, California, USA, February 27 - March 3, 2023. The Internet Society.
[63] Mingxue Zhang and Wei Meng. 2020. Detecting and Understanding JavaScript Global Identifier Conflicts on the Web. In Proceedings of ESEC/FSE 2020 (Virtual Event, USA). 38–49.
[64] Mingxue Zhang and Wei Meng. 2021. JSISOLATE: Lightweight in-Browser JavaScript Isolation. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering.