
xFuzz: Machine Learning Guided


Cross-Contract Fuzzing
Yinxing Xue, Jiaming Ye, Wei Zhang, Jun Sun, Lei Ma, Haijun Wang, and Jianjun Zhao

Abstract—Smart contract transactions are increasingly interleaved by cross-contract calls. While many tools have been developed to identify a common set of vulnerabilities, cross-contract vulnerabilities are overlooked by existing tools. Cross-contract vulnerabilities are exploitable bugs that manifest in the presence of more than two interacting contracts, whereas existing methods are limited to analyzing at most two contracts at the same time. Detecting cross-contract vulnerabilities is highly non-trivial: with multiple interacting contracts, the search space is much larger than that of a single contract. To address this problem, we present xFuzz, a machine learning guided smart contract fuzzing framework. The machine learning models are trained with novel features (e.g., word vectors and instructions) and are used to filter likely benign program paths. Compared with existing static tools, the machine learning models prove more robust, as they avoid directly adopting the manually defined rules of any specific tool. We compare xFuzz with three state-of-the-art tools on 7,391 contracts. xFuzz detects 18 exploitable cross-contract vulnerabilities, of which 15 are exposed for the first time. Furthermore, our approach is shown to be efficient in detecting non-cross-contract vulnerabilities as well: using less than 20% of the time of other fuzzing tools, xFuzz detects twice as many vulnerabilities.

Index Terms—Smart Contract, Fuzzing, Cross-contract Vulnerability, Machine Learning

arXiv:2111.12423v2 [cs.CR] 30 Jun 2022

This paper is accepted by IEEE Transactions on Dependable and Secure Computing.

• Yinxing Xue and Wei Zhang are with the University of Science and Technology of China. E-mail: [email protected], [email protected].
• Jiaming Ye and Jianjun Zhao are with Kyushu University. E-mail: [email protected], [email protected].
• Jun Sun is with Singapore Management University. E-mail: [email protected].
• Lei Ma is with the University of Alberta. E-mail: [email protected].
• Haijun Wang is with Nanyang Technological University. E-mail: [email protected].

Manuscript received December 22, 2021; revised April 14, 2022; accepted June 2, 2022. Date of publication July 2, 2022; date of current version June 5, 2022. This work was supported in part by the National Nature Science Foundation of China under Grant 61972373, in part by the Basic Research Program of Jiangsu Province under Grant BK20201192, and in part by the National Research Foundation Singapore under its NSoE Programme (Award Number: NSOE-TSS2019-03). The research of Dr. Xue is also supported by the CAS Pioneer Hundred Talents Program of China. (Yinxing Xue and Jiaming Ye are co-first authors. Yinxing Xue is the corresponding author.)

1 INTRODUCTION

Ethereum has been at the forefront of most rankings of blockchain platforms in recent years [1]. It enables the execution of programs, called smart contracts, written in Turing-complete languages such as Solidity. Smart contracts are receiving increasing attention, e.g., with over 1 million transactions per day since 2018 [2].

At the same time, smart-contract-related security attacks are on the rise as well. According to [3], [4], [5], vulnerabilities in smart contracts have already led to devastating financial losses over the past few years. In 2016, the notorious DAO attack resulted in the loss of 150 million dollars [6]. Additionally, as pointed out by Zou et al. [7], over 75% of developers agree that smart contract software has much higher security requirements than traditional software. Considering the close connection between smart contracts and financial activities, the security of smart contracts largely affects the stability of society.

Many methods and tools have since been developed to analyze smart contracts. Existing tools can roughly be categorized into two groups: static analyzers and dynamic analyzers. Static analyzers (e.g., [8], [9], [10], [11], [12], [13]) often leverage static program analysis techniques (e.g., symbolic execution and abstract interpretation) to identify suspicious program traces. Due to the well-known limitations of static analysis, there are often many false alarms. On the other side, dynamic analyzers (including fuzzing engines such as [14], [15], [16], [17], [18]) avoid false alarms by dynamically executing the traces. Their limitation is that there can often be a huge number of program traces to execute, and thus smart strategies must be developed to selectively test the program traces in order to identify as many vulnerabilities as possible. Besides, static and dynamic tools also have a common drawback: the detection rules are usually built in and predefined by developers, and the rules of different tools can even be contradictory (e.g., the reentrancy detection rules in Slither and Oyente [19]).

While existing efforts have identified an impressive list of vulnerabilities, one important category, i.e., cross-contract vulnerabilities, has been largely overlooked so far. Cross-contract vulnerabilities are exploitable bugs that manifest only in the presence of more than two interacting contracts. For instance, the reentrancy vulnerability shown in Figure 4 occurs only if three contracts interact in a particular order. In our preliminary experiment, the two well-known fuzzing engines for smart contracts, i.e., ContractFuzzer [15] (version 1.0) and sFuzz [14] (version 1.0), both missed this vulnerability because they are limited to analyzing two contracts at the same time.

Given the large number of cross-contract transactions in practice [20], there is an urgent need for systematic approaches to identify cross-contract vulnerabilities. Detecting cross-contract vulnerabilities is however non-trivial. With multiple contracts involved, the search space is much larger than that of a single contract, i.e., we must consider all sequences and interleavings of function calls from multiple contracts.

As fuzzing techniques actually run programs and barely produce false positive reports [15], [21], adopting fuzzing for cross-contract vulnerability detection is preferable. However, due to efficiency concerns, we need additional techniques to guide fuzzers toward cross-contract vulnerabilities. Previous works (e.g., [22], [23]) have evidenced the advantages of applying machine learning methods to improve the efficiency of vulnerability fuzzing in C/C++ programs. Compared with static rule-based methods, ML-based methods require no prior domain knowledge about known vulnerabilities and can effectively reduce the large search space for covering more vulnerable functions. For smart contracts, existing works (e.g., ILF [24]) focus on exploring the state space within the intra-contract scope; they are unable to address cross-contract vulnerabilities. With a large search space of combinations of numerous function calls, it is desirable to guide the fuzzing process with the aid of machine learning models.

In this work, we propose xFuzz, a machine learning (ML) guided fuzzing engine designed for detecting cross-contract vulnerabilities. Ideally, according to the Pareto principle in testing [25] (i.e., roughly 80% of errors come from 20% of the code), we want to rapidly identify the error-prone code before applying the fuzzing technique. As reported by previous works [26], [27], existing analysis tools suffer from high false positive rates (e.g., Slither [10] and SmartCheck [13] have false positive rates of more than 70%). Therefore, adopting only one static tool in our approach may produce biased results. To alleviate this, we use three tools to vote on the reported vulnerabilities in contracts, and we further train an ML model to learn common patterns from the voting results. It is known that ML models can automatically learn patterns from inputs with less bias [28]. Based on this, the overall bias due to using one particular tool to identify potentially vulnerable functions can be reduced.

Specifically, xFuzz provides multiple ways of reducing the enormous search space. First, xFuzz is designed to leverage an ML model for identifying the most probably vulnerable functions. That is, an ML model is trained to filter out most of the benign functions whilst preserving most of the vulnerable functions. During the training phase, the ML models are trained on a dataset of program code labeled using three well-known static analysis tools (i.e., the labels are their majority voting result). Furthermore, the program code is vectorized based on word2vec [29]. In addition, manually designed features, such as can_send_eth, has_call and callee_external, are supplied to improve training effectiveness as well. In the guided fuzzing phase, the model is used to predict whether a function is potentially vulnerable or not. In our evaluation, the models allow us to filter out 80.1% of non-vulnerable contracts. Second, to further reduce the effort required to expose cross-contract vulnerabilities, the filtered contracts and functions are further prioritized based on a suspiciousness score, which is defined based on an efficient measurement of the likelihood of covering the program paths.

To validate the usefulness of xFuzz, we performed comprehensive experiments, comparing with a static cross-contract detector Clairvoyance [19] and two state-of-the-art dynamic analyzers, i.e., ContractFuzzer [15] and sFuzz, on widely used open datasets ([30], [31]) and 7,391 additional contracts. The results confirm the effectiveness of xFuzz in detecting cross-contract vulnerabilities, i.e., 18 cross-contract vulnerabilities have been identified, 15 of which are missed by all the tested state-of-the-art tools. We also show that our search space reduction and prioritization techniques achieve high precision and recall. Furthermore, our techniques can be applied to improve the efficiency of detecting intra-contract vulnerabilities, e.g., xFuzz detects twice as many vulnerabilities as sFuzz while using less than 20% of the time.

The contributions of this work are summarized as follows.
• To the best of our knowledge, we make the first attempt to formulate and detect three common cross-contract vulnerabilities, i.e., reentrancy, delegatecall and tx-origin.
• We propose a novel ML-based approach to significantly reduce the search space for exploitable paths, achieving well-trained ML models with a recall of 95% on a testing dataset of 100K contracts. We also find that the trained model can cover a majority of the reports of other tools.
• We perform a large-scale evaluation and comparative studies with state-of-the-art tools. Leveraging the ML models, xFuzz outperforms the state-of-the-art tools by at least 42.8% in terms of recall while keeping a satisfactory precision of 96.1%.
• xFuzz also finds 18 cross-contract vulnerabilities. All of them are verified by security experts from our industry partner. We have published the exploit code for these vulnerabilities on our anonymous website [32] for public access.

2 MOTIVATION

In this section, we first introduce three common types of cross-contract vulnerabilities. Then, we discuss the challenges in detecting these vulnerabilities with state-of-the-art fuzzing engines to motivate our work.

2.1 Problem Formulation and Definition

In general, smart contracts are compiled into opcodes [33] so that they can run on the EVM. We say that a smart contract is vulnerable if there exists a program trace that allows an attacker to gain certain (typically financial) benefits illegitimately. Formally, a vulnerability occurs when there exist dependencies from certain critical instructions (e.g., TXORIGIN and DELEGATECALL) to a set of specific instructions (e.g., ADD, SUB and SSTORE). Therefore, to formulate the problem, we adopt definitions of vulnerabilities from [9], [34], based on which we define (control and data) dependency and then define the cross-contract vulnerabilities.

Definition 1 (Control Dependency). An opcode op_j is said to be control-dependent on op_i if there exists an execution from op_i to op_j such that op_j post-dominates all op_k in the path from op_i to op_j (excluding op_i) but does not post-dominate op_i. An opcode op_j is said to post-dominate an opcode op_i if all traces starting from op_i must go through op_j.

Definition 2 (Data Dependency). An opcode op_j is said to be data-dependent on op_i if there exists a trace that executes op_i and subsequently op_j such that W(op_i) ∩ R(op_j) ≠ ∅, where R(op_j) is the set of locations read by op_j and W(op_i) is the set of locations written by op_i.

An opcode op_j is dependent on op_i if op_j is control- or data-dependent on op_i, or if op_j is dependent on some op_k that is in turn dependent on op_i.

In this work, we define the three typical categories of cross-contract vulnerabilities that we focus on, i.e., reentrancy, delegatecall and tx-origin. Although our method can be generalized to support more types of vulnerabilities, in this paper we focus on these three since they are among the most dangerous ones with urgent testing demands. Specifically, the reentrancy and delegatecall vulnerabilities are highlighted as top risky vulnerabilities in previous works [9], [10]. The tx-origin vulnerability is broadly warned about in previous research [35], [10].

We define C as the set of critical opcodes, which contains CALL, CALLCODE and DELEGATECALL, i.e., the set of all opcodes associated with external calls. These opcodes could be the causes of vulnerabilities, since through them the code comes under the control of external attackers.

1 function withdrawBalance() public {
2   uint amountToWithdraw = userBalances[msg.sender];
3   msg.sender.call.value(amountToWithdraw)("");
4   userBalances[msg.sender] = 0;
5 }

Fig. 1: An example of reentrancy vulnerability.

Definition 3 (Reentrancy Vulnerability). A trace suffers from reentrancy vulnerability if it executes an opcode op_c ∈ C and subsequently executes an opcode op_s in the same function such that op_s is SSTORE, and op_c depends on op_s.

A smart contract suffers from reentrancy vulnerability if and only if at least one of its traces suffers from reentrancy vulnerability. This vulnerability results from the incorrect use of external calls, which are exploited to construct a call chain. When an attacker A calls a user U to withdraw money, the fallback function in contract A is invoked. Then, the malicious fallback function calls back into U to recursively steal money. In Figure 1, the attacker can construct an end-to-end call chain by calling withdrawBalance in the fallback function of the attacker's contract and thereby steal money.

1 contract Delegate {
2   address public owner;
3   function pwn() {
4     owner = msg.sender;
5 } }
6 contract Delegation {
7   address public owner;
8   Delegate delegate;
9   function() {
10    if(delegate.delegatecall(msg.data)) {
11      this;
12 } } }

Fig. 2: An example of delegatecall vulnerability.

Definition 4 (Dangerous Delegatecall Vulnerability). A trace suffers from dangerous delegatecall vulnerability if it executes an opcode op_c ∈ C that depends on an opcode DELEGATECALL.

A smart contract suffers from delegatecall vulnerability if and only if at least one of its traces suffers from delegatecall vulnerability. This vulnerability is due to the abuse of the dangerous opcode DELEGATECALL. When a malicious attacker B calls contract A using delegatecall, contract A's function is executed in the context of the attacker, and thus causes damage. In Figure 2, the malicious attacker B sends ethers to contract Delegation to invoke the fallback function at line 10. The fallback function calls contract Delegate and executes the malicious call data msg.data. Since the call data is executed in the storage context of Delegation, the attacker can change the owner to an arbitrary user by executing pwn at line 3.

1 function withdrawAll(address _recipient) public {
2   require(tx.origin == owner);
3   _recipient.transfer(this.balance);
4 }

Fig. 3: An example of tx-origin vulnerability.

Definition 5 (Tx-origin Misuse Vulnerability). A trace suffers from tx-origin misuse vulnerability if it executes an opcode op_c ∈ C that depends on an opcode ORIGIN.

A smart contract suffers from tx-origin vulnerability if and only if at least one of its traces suffers from tx-origin vulnerability. This vulnerability is due to the misuse of tx.origin to verify access. An example of such a vulnerability is shown in Figure 3. A user U calls a malicious contract A, which forwards the call to contract B. Contract B relies on a vulnerable identity check (i.e., require(tx.origin == owner) at line 2) to filter malicious access. Since tx.origin returns the address of U (i.e., the address of owner), the malicious contract A successfully poses as U.

Definition 6 (Cross-contract Vulnerability). A group of contracts suffers from cross-contract vulnerability if there is a vulnerable trace (suffering from reentrancy, delegatecall or tx-origin) due to opcodes from more than two contracts.

A smart contract suffers from cross-contract vulnerability if and only if at least one of its traces suffers from cross-contract vulnerability. For example, a cross-contract reentrancy vulnerability is shown in Figure 4. The attack requires the participation of three contracts: the malicious contract Logging deployed at addr_m, the logic contract Logic deployed at addr_l, and the wallet contract Wallet deployed at addr_w. First, the attack function log calls function logging of the Logic contract, which then sends ethers to the attacker contract by calling function withdraw of contract Wallet. Next, the wallet contract sends ethers to the attacker contract and calls function log. An end-to-end call chain 1 → 2 → 3 → 4 → 1 … is formed, and the attacker can recursively steal money without any limitations.
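The trace-level checks behind Definitions 2 and 3 can be illustrated with a small sketch. The encoding of a trace as (opcode, read-set, write-set) steps over named storage slots is an assumption made for illustration only; it is not the instrumentation xFuzz actually uses.

```python
# Toy dependency check over a simplified opcode trace.
CRITICAL = {"CALL", "CALLCODE", "DELEGATECALL"}  # the set C of critical opcodes

def data_dependent(writer, reader):
    """Definition 2 (simplified): `reader` is data-dependent on `writer`
    if W(writer) ∩ R(reader) != ∅ over sets of storage locations."""
    return bool(writer["writes"] & reader["reads"])

def has_reentrancy_pattern(trace):
    """Definition 3 (simplified): some op_c in C is followed by an SSTORE op_s
    such that op_c reads a location that op_s writes."""
    for i, op_c in enumerate(trace):
        if op_c["op"] not in CRITICAL:
            continue
        for op_s in trace[i + 1:]:
            if op_s["op"] == "SSTORE" and data_dependent(op_s, op_c):
                return True
    return False

def step(op, reads=(), writes=()):
    return {"op": op, "reads": set(reads), "writes": set(writes)}

# Figure 1's withdrawBalance: the external CALL reads the balance slot, which
# is zeroed by SSTORE only after the call — the vulnerable ordering.
vulnerable = [step("SLOAD", reads={"bal"}), step("CALL", reads={"bal"}),
              step("SSTORE", writes={"bal"})]
safe = [step("SSTORE", writes={"bal"}), step("CALL", reads={"bal"})]
print(has_reentrancy_pattern(vulnerable), has_reentrancy_pattern(safe))  # True False
```

Swapping the SSTORE before the external call (the checks-effects-interactions ordering) breaks the dependency from the call to the later store, so the pattern no longer matches.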

Fig. 4: An example of cross-contract reentrancy vulnerability which is missed by the state-of-the-art fuzzer sFuzz.

Note: The solid boxes represent functions and the dashed containers denote contracts. Specifically, a function call is denoted by a solid line. The cross-contract calls are highlighted by red arrows. The blue arrow represents the cross-contract call missed by sFuzz and ContractFuzzer.
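Figure 4 itself is an image that is not reproduced in this text. As a rough illustration, the 1 → 2 → 3 → 4 → 1 chain can be mimicked in a toy Python simulation: the contract and function names (Logging, Logic, Wallet, log, logging, withdraw) come from the description above, but every method body and the credit bookkeeping below are assumptions made purely for illustration.

```python
# Toy re-creation of the Figure 4 call chain: Logging → Logic → Wallet → Logging …
class Wallet:
    def __init__(self, funds):
        self.funds = funds
        self.credit = 2          # attacker's recorded balance (assumed value)

    def withdraw(self, caller):
        if self.credit > 0:
            self.funds -= self.credit     # ether leaves the wallet first ...
            caller.receive_eth(self)      # ... then control passes to the attacker
            self.credit = 0               # state is updated too late

class Logic:
    def __init__(self, wallet):
        self.wallet = wallet

    def logging(self, caller):
        self.wallet.withdraw(caller)      # cross-contract call to Wallet

class Logging:                            # the malicious contract
    def __init__(self, logic):
        self.logic = logic
        self.stolen = 0
        self.rounds = 0

    def log(self):
        self.logic.logging(self)

    def receive_eth(self, wallet):        # fallback: re-enter while credit > 0
        self.stolen += wallet.credit
        self.rounds += 1
        if self.rounds < 3:               # bounded here; on-chain, only gas bounds it
            self.log()

wallet = Wallet(funds=10)
attacker = Logging(Logic(wallet))
attacker.log()
print(attacker.stolen, wallet.funds)      # → 6 4: three withdrawals of a 2-ether credit
```

Because the credit is only zeroed after the external call returns, each re-entry withdraws the same credit again; a legitimate caller would have received 2, while the attacker drains 6.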

2.2 State-of-the-arts and Their Limitations

First, we investigate the vulnerability detection capability of the state-of-the-art methods, including [10], [8], [9], [19], [14], [15]. In general, cross-contract testing and analysis are not supported by most of these tools, except Clairvoyance. The reason is that existing approaches merely focus on one or two contracts, and thus the sequences and interleavings of function calls from multiple contracts are often ignored. For example, the vulnerability in Figure 4 is a false negative case for the static analyzers Slither, Oyente and Securify. Note that although this vulnerability is found by Clairvoyance, this tool generates many false alarms, making confirmation rather difficult. This is a common problem for many static analyzers. Although high false positive rates can be well addressed by fuzzing tools, which run contracts with generated inputs, existing techniques are limited to at most two contracts (i.e., input contract and tested contract). In our investigation of two representative fuzzing tools, sFuzz and ContractFuzzer, cross-contract calls are largely overlooked, which leads to missed vulnerabilities. To sum up, most existing methods and tools are limited to handling non-cross-contract vulnerabilities, which motivates this work to bridge the gap.

3 OVERVIEW

Detecting cross-contract vulnerabilities often requires examining a large number of transaction sequences and thus can be quite computationally expensive, sometimes even infeasible. In this section, we give a high-level description of our method, which focuses on fuzzing suspicious transactions under the guidance of a machine learning (ML) model. Technically, there are three challenges in leveraging ML to guide effective cross-contract fuzzing for vulnerability detection:

C1 How to train the machine learning model and achieve satisfactory precision and recall.
C2 How to combine the trained model with a fuzzer to reduce the search space towards efficient fuzzing.
C3 How to empower the guided fuzzer with support for effective cross-contract vulnerability detection.

In the rest of this section, we provide an overview of xFuzz, which aims at addressing the above challenges, as shown in Figure 5. Generally, the framework can be separated into two phases: the machine learning model training phase and the guided fuzzing phase.

Fig. 5: The overview of the xFuzz framework.

3.1 Machine Learning Model Training Phase

In previous works [36], [37], fuzzers are limited by prior knowledge of vulnerabilities and do not generalize well to vulnerable variants. In this work, we propose to leverage ML predictions to guide fuzzers. The benefit of using ML instead of a particular static tool is that an ML model can reduce the bias introduced by manually defined detection rules.

In this phase, we collect training data, engineer features, and evaluate models. First, we employ the state-of-the-art tools Slither, Securify and Solhint to detect vulnerabilities in the dataset. Next, we collect their reports to label contracts: a contract that gains at least two votes is labeled as vulnerable. After that, we engineer features. The input contracts are compiled into bytecode and then vectorized by Word2Vec [29]. To address C1, the vectors are enriched by combining them with static features (e.g., can_send_eth, has_call and callee_external), which are extracted from ASTs and CFGs. Eventually, the features are used as inputs to train the ML models. In particular, the precision and recall of the models are evaluated to choose three candidate models (e.g., XGBoost [38], EasyEnsembleClassifier [39] and Decision Tree), among which we select the best one.

3.2 Guided Testing Phase

In the guided testing phase, contracts are input to the pretrained models to obtain predictions, and the vulnerable contracts are analyzed and pinpointed. To address challenge C2, we take the functions that are predicted as suspicious and use call-graph and control-flow-graph analysis to construct cross-contract call paths. After we collect all available paths, we rank them with the path prioritization algorithm; the ranking becomes the guidance of the fuzzer. This guidance from the model predictions significantly reduces the search space, because benign functions wait until the vulnerable ones finish: the fuzzer can focus on vulnerable functions and report more vulnerabilities.

To address C3, we extract static information (e.g., function parameters, conditional paths) from contracts to enrich the model predictions. The predictions and the static information are combined to compute path priority scores. Based on these, the most exploitable paths, where vulnerabilities are more likely to be found, are prioritized. Here, the search space of exploitable paths is further reduced, and cross-contract fuzzing therefore becomes feasible by invoking vulnerabilities through the available paths.

4 MACHINE LEARNING GUIDANCE PREPARATION

In this section, we elaborate on the training of our ML model for fuzzing guidance. We discuss data collection in Section 4.1 and introduce feature engineering in Section 4.2, followed by candidate model evaluation in Section 4.3.

4.1 Data Collection

SmartBugs [31] and SWC Registry [40] are two representatives of existing smart contract vulnerability benchmarks. However, their labeled data is scarce, and the amount currently available is insufficient to train a good model. Therefore, we choose to download and collect contracts from Etherscan (https://etherscan.io/), a prominent Ethereum service platform. Overall, to be representative, we collect a large set of 100,139 contracts in total for further processing.

The collected dataset is then labeled based on the voting results of three well-rated static analyzers (i.e., Solhint [11] v2.3.1, Slither [10] v0.6.9 and Securify [9] v1.0). The three tools are chosen because they are (1) state-of-the-art static analyzers and (2) well maintained and frequently updated. The detection capability varies among these tools (as shown in Table 1). We then vote to label the dataset, aiming at eliminating the bias of each tool.

TABLE 1: Vulnerability detection capability of the voting static tools.

               Slither  Solhint  Securify
Reentrancy        ✓        ✓        ✓
Tx-origin         ✓        ✓
Delegatecall      ✓

Note that two of the vulnerabilities (i.e., delegatecall and tx-origin) are hardly supported by existing tools. Therefore, we only vote on vulnerabilities supported by at least two tools. That is, for reentrancy, a function is deemed vulnerable if it gains at least two votes; for tx-origin, a function is deemed vulnerable if it gains at least one vote. As for the delegatecall vulnerability, we label all reported functions as vulnerable.

As a result, we collect 788 reentrancy, 40 delegatecall and 334 tx-origin vulnerabilities, respectively. All of the above vulnerabilities are manually confirmed, to remove false alarms, by two authors of this paper, both of whom have more than 3 years of development experience with smart contracts.

4.2 Feature Engineering

First, both vulnerable and benign functions are preprocessed by Slither to extract their runtime bytecode. After that, Word2Vec [29] is leveraged to transform the bytecode into a 20-dimensional vector. However, as reported in [41], the vectors alone are insufficient for training a high-performance model. To address this, we enrich the vectors with 7 additional static features extracted from CFGs. In short, the features have 27 dimensions in total, of which 20 are yielded by Word2Vec and the other 7 are summarized in Table 2.

Among the 7 static features, has_modifier, has_call, has_balance, callee_external and can_send_eth are collected by utilizing static analysis techniques. The feature has_modifier is designed to identify existing program guards. In smart contract programs, a function modifier is often used to guard a function from arbitrary access; that is, a function with a modifier is less likely to be vulnerable.
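The 20 + 7 = 27-dimensional input layout described above can be mimicked with a dependency-free sketch. xFuzz itself trains Word2Vec embeddings over the bytecode; the hashing-trick embedding below is only an illustrative stand-in that mirrors the shapes, not the paper's actual vectorization.

```python
# Map an opcode sequence to a fixed 20-dimensional vector via the hashing
# trick, then append the 7 boolean static features -> a 27-dimensional input.
import hashlib

DIM = 20

def embed_opcodes(opcodes, dim=DIM):
    vec = [0.0] * dim
    for op in opcodes:
        h = int(hashlib.md5(op.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]          # normalised bag of hashed opcodes

def feature_vector(opcodes, static_flags):
    """static_flags: the 7 booleans in the Table 2 order."""
    assert len(static_flags) == 7
    return embed_opcodes(opcodes) + [float(f) for f in static_flags]

x = feature_vector(["PUSH1", "SLOAD", "CALL", "SSTORE"],
                   [False, True, False, False, True, True, True])
print(len(x))  # 27
```

A learned embedding such as Word2Vec places similar opcodes near each other in the vector space, which a hash bucket cannot do; the sketch only shows how the two feature groups are concatenated into one model input.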

TABLE 2: The seven static features adopted in model training.

Feature Name      Type   Description
has_modifier      bool   whether the function has a modifier
has_call          bool   whether it contains a call operation
has_delegate      bool   whether it contains a delegatecall
has_tx_origin     bool   whether it contains a tx.origin operation
has_balance       bool   whether it contains a balance check operation
can_send_eth      bool   whether it supports sending ethers
callee_external   bool   whether it contains external callees
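A toy extractor for the Table 2 features is sketched below. xFuzz derives them from ASTs and CFGs via static analysis; here, five of the flags are approximated with simple pattern checks over Solidity source, and two (has_modifier, callee_external) are taken as inputs, since they genuinely need AST and call-graph information. The patterns are illustrative assumptions, not the paper's analysis.

```python
# Approximate the seven boolean static features for one function.
SEND_PATTERNS = (".transfer(", ".send(", ".call.value(")

def extract_static_features(src, has_modifier=False, callee_external=False):
    can_send = any(p in src for p in SEND_PATTERNS)
    return {
        "has_modifier":    has_modifier,           # needs AST information
        "has_call":        ".call" in src or can_send,
        "has_delegate":    "delegatecall" in src,
        "has_tx_origin":   "tx.origin" in src,
        "has_balance":     ".balance" in src,
        "can_send_eth":    can_send,
        "callee_external": callee_external,        # needs call-graph information
    }

# The withdrawAll function of Figure 3.
withdraw_all = """function withdrawAll(address _recipient) public {
    require(tx.origin == owner);
    _recipient.transfer(this.balance);
}"""
feats = extract_static_features(withdraw_all)
print(sorted(k for k, v in feats.items() if v))
```

On the Figure 3 example this flags has_call, has_balance, can_send_eth and has_tx_origin, matching the intuition that a guarded-by-tx.origin transfer is exactly the risky combination the features are meant to surface.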

Therefore, we use the modifier as a counter-feature to avoid false alarms. The features has_call and has_balance are designed to identify external calls and balance check operations. These two features are closely connected with transfer operations; we prepare them to better locate transfer behavior and narrow the search space. The feature callee_external provides important information on whether the function has external callees. This feature is used to capture risky calls: in smart contracts, cross-contract calls are prone to be exploited by attackers. The feature can_send_eth extracts static information (e.g., whether the function has a transfer operation) to figure out whether the function is able to send ethers to others. Considering that vulnerable functions often have risky transfer operations, this feature can help filter out benign functions and reduce false positive reports.

The remaining two features, has_delegate and has_tx_origin, correspond to particular key opcodes used in vulnerabilities. Specifically, feature has_delegate corresponds to the opcode DELEGATECALL in delegatecall vulnerabilities, and feature has_tx_origin corresponds to the opcode ORIGIN in tx-origin vulnerabilities. These two features are specifically designed for the two vulnerabilities, as their names suggest. Note that the features can easily be updated to support detection of new vulnerabilities. If a new vulnerability shares a similar mechanism with the above three vulnerabilities or is closely related to them, the existing features can be adopted directly; otherwise, one or two new specific features highly correlated with the new type of vulnerability should be added. The 7 static features are combined with the word vectors, which together form the input to our ML models for further training.

4.3 Model Selection

In this section, we train and evaluate diverse candidate models, based on which we select the best one to guide fuzzers. To achieve this, one challenge we have to address first is dataset imbalance. In particular, there are 1,162 vulnerabilities and 98,977 benign contracts. This is not rare in ML-based vulnerability detection tasks [42], [43]. In fact, our dataset exhibits imbalance ratios of 1:126 for reentrancy, 1:2,502 for delegatecall and 1:298 for tx-origin. Such an imbalanced dataset can hardly be used for training.

To address the challenge, we first eliminate duplicated data. In fact, we found that 73,666 word vectors are exactly the same as others. These samples differ in source code, but after being compiled, extracted and transformed into vectors, they share the same values, because most of them are syntactically identical clones [44] at the source code level. After deduplication, the ratios improve (e.g., to 1:189 for delegatecall and 1:141 for tx-origin), but the dataset is still highly imbalanced.

As studied in [45], the imbalance can be alleviated by data sampling strategies. However, we find that sampling strategies like oversampling [46] can hardly improve the precision and recall of our models, because they introduce too much polluted data instead of real vulnerabilities.

We then attempt to evaluate models to select one that fits the imbalanced data well. Note that to counteract the impact of different ML models, we try to cover as many candidate ML methods as possible, among which we select the best one. The models we evaluated include the tree-based models XGBoost (XGBT) [38], EasyEnsembleClassifier (EEC) [39] and Decision Tree (DT), and other representative ML models like Logistic Regression, Bayes models, SVMs and LSTM [47]. The performance of the models can be found in Table 3. We find that the tree-based models achieve better precision and recall than the others. The non-tree-based models are biased towards the majority class and hence show very poor classification rates on the minority classes. Therefore, we select XGBT, EEC and DT as the candidate models.

TABLE 3: The performance of the evaluated ML models.

Model Name              Precision  Recall
EasyEnsembleClassifier     26%       95%
XGBoost                    66%       48%
DecisionTree               70%       43%
SupportVectorMachine       60%       14%
KNeighbors                 50%       43%
NaiveBayes                 50%       59%
LogisticRegression         53%       38%

Fig. 6: The P-R curves of the models. The dashed lines represent performance on the training set, while the solid lines represent performance on the validation set.

The precision-recall curves of the three models on positive cases are shown in Figure 6. In this figure, the dashed lines denote fitting on the validation set and the solid lines denote fitting on the testing set. Intuitively, models XGBT and EEC achieve better performance, with similar P-R curves. However, EEC performs much better than XGBT in recall. In fact, model XGBT attains a precision of 66% and a recall of 48%; comparatively, model EEC achieves a precision of 26% and a recall of 95%. We remark that our goal is not to train a model that is very accurate,
our remedy, data imbalance comes to 1:31 for reentrancy, but rather a model that allows us to filter as many benign
7
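The trade-off that motivates choosing EEC can be reproduced on toy data. The sketch below is not the paper's pipeline (which trains on word vectors plus the 7 static features); it is a self-contained, pure-Python illustration of the EasyEnsemble idea [39]: each base learner trains on a balanced undersample of the majority class and the learners vote, which buys minority-class recall at the cost of precision. The dataset and the threshold-stump learner are invented for illustration.

```python
import random

random.seed(0)

# Synthetic stand-in for the imbalanced data (roughly 1:100):
# label 1 = "vulnerable", label 0 = "benign"; a single numeric feature.
data = [(random.gauss(0.0, 1.0), 0) for _ in range(5000)]
data += [(random.gauss(2.0, 1.0), 1) for _ in range(50)]

def train_stump(samples):
    """Pick the threshold maximizing balanced accuracy on `samples`."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    best_t, best_acc = 0.0, -1.0
    for i in range(-30, 50):
        t = i / 10
        acc = (sum(x >= t for x in pos) / len(pos)
               + sum(x < t for x in neg) / len(neg)) / 2
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

minority = [s for s in data if s[1] == 1]
majority = [s for s in data if s[1] == 0]

# EasyEnsemble idea: every base learner sees all minority samples plus an
# equally sized random draw from the majority class, then the learners vote.
stumps = [train_stump(minority + random.sample(majority, len(minority)))
          for _ in range(11)]

def predict(x):
    return 1 if sum(x >= t for t in stumps) > len(stumps) // 2 else 0

tp = sum(1 for x, y in data if y == 1 and predict(x) == 1)
fp = sum(1 for x, y in data if y == 0 and predict(x) == 1)
fn = sum(1 for x, y in data if y == 1 and predict(x) == 0)
precision, recall = tp / (tp + fp), tp / (tp + fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

As with the real EEC model in Table 3, the balanced ensemble keeps recall high while precision stays low, which is exactly the profile wanted for a benign-contract filter.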

4.4 Model Robustness Evaluation

To further evaluate the robustness of our selected
model, and to assess to what extent our
model can represent existing analyzers, we compare
the vulnerability detection results of our model with those of
state-of-the-art static analyzers on an unseen dataset.
The evaluation dataset is downloaded
from a prominent third-party blockchain security
team (https://github.com/tintinweb/smart-contract-sanctuary).
We select smart contracts released for Solidity versions
0.4.24 and 0.4.25 (i.e., the majority versions among existing smart
contract applications [48]) and remove the contracts which
have been used in our earlier model training and model
selection. In the end, we obtain 78,499 contracts in total for
evaluation.

Definition 7 (Coverage Rate of ML Model on Another Tool).
Given the true positive reports Rm of the ML model and the true
positive reports Rt of another tool, the coverage rate CR(t) of the ML
model on the tool is calculated as:

    CR(t) = |Rm ∩ Rt| / |Rt|    (1)

TABLE 4: The coverage rate (CR) of the ML model on other tools.

              CR(Slither)  CR(Securify)  CR(Solhint)
Reentrancy    83.6%        81.1%         86.3%
Tx-origin     91.9%        N.A.          75.1%
Delegatecall  90.6%        N.A.          N.A.

The results are listed in Table 4. Here, we use the coverage
rate (CR) to evaluate the representativeness of our model
regarding the three vulnerabilities. Specifically, the coverage
rate measures how many reports of the ML model intersect
with those of the static analyzers; CR is calculated
as listed in Definition 7. The N.A. entries in the table denote that
detection of the vulnerability is not supported by the analyzer.

Our evaluation results show that the reports of our tool
cover a majority of the reports of the other tools. Specifically, the
trained ML model well approximates the capability of
each static tool used in vulnerability labeling and model
training. For example, 81.1% of the true positive reports of
Securify on reentrancy are also contained in our ML
model's reports. Besides, 75.1% of the true positive reports of
Solhint on tx-origin and 90.6% of the true positive reports of
Slither on delegatecall are also covered.

5 GUIDED CROSS-CONTRACT FUZZING

5.1 Guidance Algorithm

The pretrained models are applied to guide fuzzers in two
ways: the predictions are utilized to (1) locate suspicious
functions and (2) combine with static information for path
prioritization.

Our guidance is based on both model predictions and
the priority scores computed from static features. The reason
is that, even with the machine learning model filtering, the
search space is still rather large, which is evidenced by the
large number of paths explored by sFuzz (e.g., the 2,596
suspicious functions have 873 possibly vulnerable paths),
and thus we propose to first prioritize the paths.

Algorithm 1: Machine learning guided fuzzing
input : IS, all the input smart contract source code
input : M, the suspicious function detection ML model
input : TRs ← ∅, the set of potentially vulnerable function execution paths
output: V ← ∅, the set of vulnerable paths
1  Fs ← IS.getFunctionList()
2  // get the functions in a contract
3  foreach function f ∈ Fs do
4    if IsSuspiciousFunction(f, M) is True then
5      // employ ML models to predict whether the function is suspicious
6      Sfunc ← getFuncPriorityScore(f)
7      Scaller ← getCallerPriorityScore(f)
8      TRs ← TRs ∪ {f, Sfunc, Scaller}
9      // get scores for each function
10 PTR ← PrioritizationAlgorithm(TRs)
11 // prioritized paths
12 V ← ∅
13 // the output vulnerability list
14 while not timeout do
15   T ← PTR.pop()
16   // pop the trace with the highest priority
17   FuzzingResult ← Fuzzing(T)
18   if FuzzingResult is Vulnerable then
19     V ← V ∪ {T}
20   else
21     continue
22 return V

The overall process of our guided fuzzing is shown
in Algorithm 1. In this algorithm, we first retrieve the function
list of an input source at line 1. Next, from line 3 to line 8, we
calculate the path priority based on two scores (i.e., the function
priority score and the caller priority score) for each path. Both
scores are designed for prioritizing suspicious functions.
After the calculation, the results are saved together with
the function itself. In line 10, we prioritize the suspicious
function paths. The prioritization algorithm is given
in Algorithm 2. The trace with the highest priority will be
tested by the fuzzers first. Finally, from line 14 to line 21, we pop
a candidate trace from the prioritized list and employ fuzzers to
conduct focused fuzzing.

1 contract Wallet{
2   function withdraw(address addr, uint value){
3     addr.transfer(value);
4   }
5   function changeOwner(address[] addrArray, uint idx) public{
6     require(msg.sender == owner);
7     owner = addrArray[idx];
8     withdraw(owner, this.balance);
9   } }
10 contract Logic{
11   function logTrans(address addr_w, address _exec, uint _value, bytes infor) public{
12     Wallet(addr_w).withdraw(_exec, _value);
13   } }

Fig. 7: An example of prioritizing paths.
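The guided loop of Algorithm 1 can be sketched compactly. The helper names below (`is_suspicious`, `fuzz_path`, the dictionary fields) are our stand-ins for the ML model and the fuzzing engine, not xFuzz's actual API; lower scores mean higher priority.

```python
import heapq

# Hypothetical stand-ins: `is_suspicious` plays the role of the ML model M,
# and `fuzz_path` stands in for the fuzzing engine executing one trace.
def is_suspicious(fn):
    return fn["suspicious"]

def fuzz_path(path):
    return path["vulnerable"]  # stub: a real fuzzer would execute the path

def guided_fuzzing(functions, budget=10):
    # Lines 1-9 of Algorithm 1: keep only suspicious functions and
    # attach their (function score, caller score) pair; id() breaks ties.
    queue = []
    for fn in functions:
        if is_suspicious(fn):
            for path in fn["paths"]:
                heapq.heappush(queue, (fn["s_func"], path["s_caller"], id(path), path))
    # Lines 14-21: repeatedly pop the highest-priority trace and fuzz it.
    found = []
    while queue and budget > 0:
        _, _, _, path = heapq.heappop(queue)
        budget -= 1
        if fuzz_path(path):
            found.append(path["name"])
    return found

# The Figure 7 example: withdraw is suspicious and has two caller paths.
functions = [
    {"suspicious": True, "s_func": 6,
     "paths": [{"name": "changeOwner->withdraw", "s_caller": 4, "vulnerable": True},
               {"name": "logTrans->withdraw", "s_caller": 1, "vulnerable": True}]},
    {"suspicious": False, "s_func": 8,
     "paths": [{"name": "logTrans", "s_caller": 1, "vulnerable": False}]},
]
print(guided_fuzzing(functions))
```

Note how the benign-predicted function is never fuzzed, and the cheaper caller path (logTrans → withdraw) is popped first, matching the prioritization described next in Section 5.2.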

The fuzzing process does not end until it reaches the timeout
limit. The found vulnerabilities are returned as the final result.

Algorithm 2: Prioritization Algorithm
input : M, the trained machine learning model
input : TRs, functions and their priority scores
output: PTR, the set of prioritized vulnerable paths
1 while isNotEmpty(TRs) do
2   TRs ← sortByFunctionPriority(TRs)
3   function f ← TRs.pop()
4   paths Ps ← getAllPaths(f)
5   while isNotEmpty(Ps) do
6     Ps ← sortByCallerPriority(Ps)
7     P ← Ps.pop()
8     PTR ← PTR ∪ P
9 return PTR

The details of our prioritization algorithm are shown in
Algorithm 2. The input of the algorithm is the functions and
their corresponding priority scores, which are calculated
in Algorithm 1. The output of the algorithm is the prioritized
vulnerable paths. Specifically, the first step of the algorithm is to
obtain the prioritized function based on the function priority
score, as shown in lines 2 and 3. Functions with lower
function priority scores are prioritized. Next, we sort all
call paths (cross-contract or not)
that are correlated to the function, as shown from line 4 to
line 6. We pop the call path with the highest priority
and add it to the prioritized path set. The prioritized path
set guides the fuzzer to test call paths in a certain order.

To summarize, the goal of our guidance algorithm is
to prioritize cross-contract paths, which are penetrable but
usually overlooked by previous practice [15], [14], and to
further improve fuzzing efficiency on cross-contract
vulnerabilities.

5.2 Priority Score

Generally, the path priority consists of two parts: function priority
and caller priority. The function priority evaluates
the complexity of a function, and the caller priority
measures the cost of traversing a path.

Function Priority. We collect static features of functions
to compute the function priority, from which a priority score
is obtained. A lower score denotes a higher priority.

We first mark suspicious functions by model predictions.
A suspicious function is likely to contain vulnerabilities,
so it is given a higher priority. We implement this as a
factor fs, which equals 0.5 for a suspicious function and
1 for a benign function. For example, in Figure 7, the function
withdraw is predicted as suspicious, so its factor fs
equals 0.5.

Next, we compute the caller dimensionality SC. This
dimensionality is the number of callers of a function. In cross-contract
fuzzing, a function with multiple callers requires
more testing time to traverse all paths. For example, in Figure
7, function withdraw in contract Wallet has an internal
caller changeOwner and an external caller logTrans, so
the dimensionality of this function is 2.

The parameter dimensionality SP measures
the complexity of parameters. Functions with complex
parameters (i.e., array, bytes and address parameters) are
assigned a lower priority, because these parameters often
increase the difficulty of penetrating a function. Specifically,
each parameter has a dimensionality of 1, except for the complex
parameters, which have a dimensionality of 2. The parameter
dimensionality of a function is the sum of its parameters'
dimensionalities. For example, in Figure 7, functions withdraw
and changeOwner both have an address-typed and an integer
parameter, so their dimensionality is 3. Function logTrans
has two addresses, a bytes and an integer parameter, so its
dimensionality is 7.

Definition 8 (Function Priority Score). Given the suspicious
factor fs, the caller dimensionality score SC and the
parameter dimensionality score SP, a function priority
score Sfunc is calculated as:

    Sfunc = fs × (SC + 1) × (SP + 1)    (2)

In this formula, we add 1 to the caller dimensionality
and the parameter dimensionality to prevent the overall score
from being 0. The priority scores in Figure 7 are: function withdraw
= 6, function changeOwner = 4, function logTrans = 8.
The results show that function changeOwner has the highest
priority, because function withdraw has two callers to
traverse, while function logTrans is more difficult to
penetrate than changeOwner.

Caller Priority. We traverse every caller of a function
and collect its static features, based on which we compute
a priority score to decide which caller to test first. First,
the number of branch statements (e.g., if, for and while)
and assertions (e.g., require and assert) is counted
to measure the condition complexity Comp, which describes the
difficulty of bypassing the conditions. A path with more
conditions has a lower priority. For example, in Figure 7,
function withdraw has two callers. One caller, changeOwner,
has an assertion at line 6, so its complexity is 1. The other
caller, logTrans, contains no conditions, so its complexity
is 0.

Next, we count the condition distance. sFuzz selects
seeds according to branch distance only, which is not ideal
for identifying the three particular kinds of cross-contract
vulnerabilities that we focus on in this work. Thus, we
propose to consider not only the branch distance but also this
condition distance CondDis. This distance is intuitively the
number of statements from the entry to the condition. In case a
function has more than one condition, the distance is the
number of statements between the entry and the first condition. For
example, in Figure 7, the condition distance of changeOwner
is 1 and the condition distance of logTrans is 0.

Definition 9 (Caller Priority Score). Given the condition
distance CondDis and the path condition complexity
Comp, a path priority score Scaller is calculated as:

    Scaller = (CondDis + 1) × (Comp + 1)    (3)

Finally, the caller priority score is computed from the
condition complexity and the condition distance, as shown in
Definition 9. One is added to the complexity and the distance so that the
overall score is not 0. The caller priority scores in Figure 7 are:
logTrans → withdraw = 1, changeOwner → withdraw =
4. Function changeOwner has an identity check at line 6, which
increases the difficulty of penetration. Thus, the other path, from
logTrans to withdraw, is prioritized.
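Definitions 8 and 9 are simple enough to check by hand; the sketch below recomputes the Figure 7 scores from the stated rules. The helper functions are ours, for illustration only, and are not part of xFuzz.

```python
COMPLEX_TYPES = {"address", "address[]", "bytes"}  # dimensionality 2; others count 1

def param_dimensionality(param_types):
    return sum(2 if t in COMPLEX_TYPES else 1 for t in param_types)

def func_priority(suspicious, n_callers, param_types):
    # Definition 8: Sfunc = fs * (SC + 1) * (SP + 1); lower score = higher priority
    fs = 0.5 if suspicious else 1.0
    return fs * (n_callers + 1) * (param_dimensionality(param_types) + 1)

def caller_priority(cond_distance, cond_complexity):
    # Definition 9: Scaller = (CondDis + 1) * (Comp + 1)
    return (cond_distance + 1) * (cond_complexity + 1)

# The three functions of Figure 7.
withdraw = func_priority(True, 2, ["address", "uint"])
changeOwner = func_priority(False, 0, ["address[]", "uint"])
logTrans = func_priority(False, 0, ["address", "address", "uint", "bytes"])
print(withdraw, changeOwner, logTrans)  # 6.0 4.0 8.0, matching the text

# Caller scores for the two paths into withdraw.
print(caller_priority(0, 0))  # logTrans -> withdraw: 1
print(caller_priority(1, 1))  # changeOwner -> withdraw: 4
```

The recomputed values match the worked example: changeOwner (score 4) is tested first among functions, and the logTrans → withdraw path (score 1) is tested before the guarded changeOwner → withdraw path (score 4).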

5.3 Cross-contract Fuzzing

Given the prioritized paths, we utilize cross-contract
fuzzing to improve fuzzing efficiency. We implement
this fuzzing technique in the following steps: 1) The contracts
under test are deployed on the EVM. As shown in Figure
8, the fuzzer first deploys all contracts on a local private
chain to facilitate cross-contract calls among contracts. 2)
The path-unrelated functions are called. Here, path-unrelated
functions denote functions that do not appear in
the input prioritized paths. We run them first to initialize the
state variables of a contract. 3) We store the function selectors
appearing in all contracts. The function selector is the unique
identity of a function; it is usually encoded as a
4-byte hex code [49]. 4) The fuzzer checks whether there is a
cross-contract call. If not, the following steps 5 and 6 are
skipped. 5) The fuzzer automatically searches local states
to find the correct function selectors, and then directly triggers
a cross-contract call to the target function in step 6. 7) The
fuzzer compares the execution results against the detection
rules and outputs reports.

Fig. 8: The cross-contract fuzzing process.

6 EVALUATION

xFuzz is implemented in Python and C with 3,298 lines
of code. All experiments are run on a computer running
Ubuntu 18.04 LTS and equipped with an Intel Xeon
E5-2620v4, 32GB of memory and a 2TB HDD.

For the baseline comparison, xFuzz is compared with the
state-of-the-art fuzzer sFuzz [14], a previously published testing
engine ContractFuzzer [15] and a static cross-contract
analysis tool Clairvoyance [19]. The recently published
tool Echidna [16] relies on manually written testing oracles,
which may lead to different testing results depending on the
developer's expertise; thus, it is not compared. Other tools (like
Harvey [21]) are not publicly available for evaluation, and
thus are not included in our evaluations. We systematically
run all four tools on the contract datasets. Notably, to verify
the authenticity of the vulnerability reports, we invite senior
technical experts from the security department of our industry
partner to check the vulnerable code. Our evaluation aims at
investigating the following research questions (RQs).

RQ1. How effective is xFuzz in detecting cross-contract
     vulnerabilities?
RQ2. To what extent do the machine learning models and the
     path prioritization contribute to reducing the search
     space?
RQ3. What is the overhead of xFuzz, compared to the
     vanilla sFuzz?
RQ4. Can xFuzz discover real-world unknown cross-contract
     vulnerabilities, and what are the reasons for
     false negatives?

6.1 Dataset Preparation

Our evaluation dataset includes smart contracts from three
sources: 1) datasets from previously published works (e.g.,
[30] and [31]); 2) reputable smart contract vulnerability
websites (e.g., [40]); and 3) smart contracts downloaded
from Etherscan. The dataset is carefully checked to remove
contracts duplicating those used in our machine learning
training. Specifically, Dataset1 includes contracts from
previous works and well-known websites. After we remove
duplicate contracts and toy contracts (i.e., those which are
not deployed on real-world chains), we collect 18 labeled
reentrancy vulnerabilities. To enrich the evaluation dataset,
our Dataset2 includes contracts downloaded from Etherscan.
We remove contracts without external calls (they are irrelevant
to cross-contract vulnerabilities) and contracts that
are not developed using Solidity 0.4.24 or 0.4.25 (i.e.,
the two most popular versions of Solidity [48]). In the end,
7,391 contracts are collected in Dataset2. The source code of
the above datasets is publicly available on our website [32]
so that the evaluations are reproducible, benefiting further
research.

6.2 RQ1: Vulnerability Detection Effectiveness

We first conduct evaluations on Dataset1 by comparing
three tools: ContractFuzzer, sFuzz and xFuzz.
Clairvoyance is not included because it is a static analysis
tool. For the sake of space, we present part of the
results in Table 5 with an overall summary and leave the
whole list available online1.

In this evaluation, ContractFuzzer fails to find any vulnerability
among the contracts. sFuzz missed 3 vulnerabilities
and outputted 9 incorrect reports. Comparatively, xFuzz
missed 2 vulnerabilities and outputted 6 incorrect reports.
The reason for the missed vulnerabilities and incorrect reports
lies in difficult branch conditions (e.g., an if statement
with 3 conditions) which block the fuzzer from traversing
vulnerable branches. Note that xFuzz is equipped with
model guidance so that it can focus on fuzzing suspicious
functions and find more vulnerabilities than sFuzz.

While we compare our tool with existing works on the
publicly available Dataset1, the dataset only provides non-cross-contract
labels and thus cannot be used to verify our
detection ability on cross-contract ones. To complete this,
we further evaluate the effectiveness of cross-contract and
non-cross-contract fuzzing on Dataset2. To reduce the effect
of randomness, we repeat each setting 20 times and report
the averaged results.

1. https://anonymous.4open.science/r/xFuzzforReview-ICSE/Evaluation%20on%20Open-dataset.pdf
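As background for the selector handling in Section 5.3 (steps 3 and 5), the sketch below mimics how a call is routed by the first 4 bytes of its calldata, which hold keccak256(signature)[:4]. Python's standard library has no Keccak-256, so the selector table is prefilled: 0xa9059cbb is the well-known selector of transfer(address,uint256), while the second entry is a made-up placeholder.

```python
# Selector table as a fuzzer might store it (step 3 of Section 5.3).
SELECTORS = {
    bytes.fromhex("a9059cbb"): "transfer(address,uint256)",
    bytes.fromhex("deadbeef"): "example()",  # hypothetical entry, illustration only
}

def dispatch(calldata: bytes) -> str:
    """Resolve the target function of a (cross-contract) call from its selector."""
    if len(calldata) < 4:
        return "<fallback>"
    return SELECTORS.get(calldata[:4], "<fallback>")

# A transfer(...) call: 4-byte selector followed by two ABI-encoded 32-byte words.
calldata = bytes.fromhex("a9059cbb") + b"\x00" * 64
print(dispatch(calldata))  # transfer(address,uint256)
```

Finding the right selector in the local state is what lets the fuzzer trigger a cross-contract call directly instead of hoping a random input happens to hit the target function.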

TABLE 5: Evaluations on Dataset1. A ✓ indicates that the tool
successfully finds the vulnerability in the contract; otherwise
the tool is marked with ✗.

Address     ContractFuzzer  xFuzz  sFuzz
0x7a8721a9  ✗               ✓      ✗
0x4e73b32e  ✗               ✓      ✓
0xb5e1b1ee  ✗               ✓      ✓
0xaae1f51c  ✗               ✓      ✓
0x7541b76c  ✗               ✓      ✗
...         ...             ...    ...
Summary     0/18            9/18   5/18

TABLE 6: Performance of xFuzz, Clairvoyance (C.V.),
ContractFuzzer (C.F.) and sFuzz on cross-contract vulnerabilities.

        reentrancy         delegatecall       tx-origin
        P%    R%    #N     P%   R%   #N      P%   R%   #N
C.F.    0     0     0      0    0    0       0    0    0
sFuzz   0     0     0      0    0    0       0    0    0
C.V.    43.7  43.7  16     0    0    0       0    0    0
xFuzz   100   81.2  13     100  100  3       100  100  2

6.2.1 Cross-contract Vulnerability.

The results are summarized in Table 6. Note that
"P%" and "R%" represent the precision rate and recall rate,
and "#N" is the number of vulnerability reports. "C.V." means
Clairvoyance and "C.F." means ContractFuzzer. Cross-contract
vulnerabilities are currently not supported by ContractFuzzer
and sFuzz, and thus they report no vulnerabilities.

Precision. Clairvoyance managed to find 7 true cross-contract
reentrancy vulnerabilities. In comparison, xFuzz
found 9 cross-contract reentrancy, 3 cross-contract delegatecall
and 2 cross-contract tx-origin vulnerabilities. The two
tools found 21 cross-contract vulnerabilities in total. Clairvoyance
reported 16 vulnerabilities, but only 43.7% of them
are true positives. In contrast, xFuzz generates 18 (13+3+2)
reports across the three types of cross-contract vulnerabilities,
and all of them are true positives. The high false
positive rate of Clairvoyance is mainly due to its static
analysis based approach, without runtime validation. We
further checked the 18 vulnerabilities on some third-party
security disclosure websites [50], [40], [31] and found that 15 of
them are not flagged.

Recall. The 9 vulnerabilities missed by Clairvoyance
all result from the abuse of detection rules, i.e., the
vulnerable contracts are filtered out by unsound rules. In
total, 3 cross-contract vulnerabilities are missed by xFuzz.
A close investigation shows that they are missed due to
complex path conditions, which block the input from
penetrating the function. We also carefully checked the false
negatives missed by xFuzz, and found they are not reported
by ContractFuzzer or sFuzz either. While existing
works all fail to penetrate the complex path conditions, we
believe this limitation can be addressed by future work.

TABLE 7: Performance of xFuzz, Clairvoyance, ContractFuzzer
and sFuzz on non-cross-contract evaluations.

        reentrancy          delegatecall       tx-origin
        P%    R%    #N      P%   R%    #N     P%   R%   #N
C.F.    100   1.7   3       0    0     0      0    0    0
sFuzz   84.2  33.5  70      100  54.3  19     0    0    0
C.V.    48.3  40.4  145     0    0     0      0    0    0
xFuzz   95.5  84.6  156     100  100   35     100  100  25

6.2.2 Non-Cross-contract Vulnerability.

The experiment results show that xFuzz improves the detection
of non-cross-contract vulnerabilities as well (see Table 7).
For reentrancy, ContractFuzzer achieves the best precision
rate (100%) but the worst recall rate (1.7%). sFuzz and
Clairvoyance identified 33.5% and 40.4% of the vulnerabilities,
respectively. xFuzz has a precision rate of 95.5%, which is slightly lower
than that of ContractFuzzer, and, more importantly, the
best recall rate of 84.6%. xFuzz exhibits a strong capability in
detecting vulnerabilities, finding a total of 209 (149+35+25)
vulnerabilities.

Precision. For reentrancy, Clairvoyance reports 75
false positives, because of (1) the abuse of detection rules and
(2) unexpected jumps to unreachable paths due to program
errors. The 11 false positives of sFuzz are due to
misconceived ether transfers: sFuzz captures ether transfers
to locate dangerous calls; however, ethers sent from the attacker
to the victim are also falsely captured. The 7 false alarms of xFuzz
are due to mistakes of contract programmers who call
nonexistent functions; these calls are misconceived
as vulnerabilities by xFuzz.

Recall. Clairvoyance missed 59.6% of the true positives.
The root cause is the adoption of unsound rules during
static analysis. sFuzz missed 117 reentrancy vulnerabilities
and 16 delegatecall vulnerabilities due to (1) timeouts and (2)
the incapability of finding feasible paths to the vulnerability. xFuzz
missed 27 vulnerabilities due to complex path conditions.

Answer to RQ1: Our tool xFuzz achieves a precision
of 95.5% and a recall of 84.6%. Among the evaluated
four methods, xFuzz achieves the best recall. Besides,
xFuzz successfully finds 209 real-world non-cross-contract
vulnerabilities as well as 18 real-world cross-contract
vulnerabilities.

Fig. 9: Comparison of reported vulnerabilities between
xFuzz and sFuzz regarding reentrancy.
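The coverage rate of Definition 7, which also underlies the cross-checking of reports between tools throughout this evaluation, is a one-liner over report sets. The addresses below are placeholders, not actual findings.

```python
def coverage_rate(ml_reports: set, tool_reports: set) -> float:
    """CR(t) = |Rm ∩ Rt| / |Rt| from Definition 7."""
    if not tool_reports:
        raise ValueError("the other tool has no true positive reports")
    return len(ml_reports & tool_reports) / len(tool_reports)

# Toy report sets: the ML model covers 2 of the other tool's 5 true positives.
ml = {"0xaaa", "0xbbb", "0xccc", "0xddd"}
other_tool = {"0xbbb", "0xccc", "0xeee", "0xfff", "0x111"}
print(f"CR = {coverage_rate(ml, other_tool):.0%}")  # CR = 40%
```

A CR of 1.0 would mean the ML model subsumes the other tool's true positives entirely, which is the sense in which Table 4's high percentages justify using the model as a stand-in for the static analyzers.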

TABLE 8: The paths reported by xFuzz and sFuzz. The
vulnerable paths found by the two tools are counted respectively.

                                Number in the Top
Found by  Vul           Total   Top10   Other
xFuzz     Reentrancy    172     152     20
sFuzz     Reentrancy    59      57      2
xFuzz     Delegatecall  33      32      1
sFuzz     Delegatecall  19      19      0

Fig. 10: Comparison of reported vulnerabilities between
xFuzz and sFuzz regarding delegatecall.

6.3 RQ2: The Effectiveness of Guided Testing

This RQ investigates the usefulness of the ML model and of
path prioritization for guiding the fuzzing. To answer it,
we compare sFuzz with a customized version of xFuzz
which differs from sFuzz only by adopting the ML
model (without focusing on cross-contract vulnerabilities).
The intuition is to check whether the ML model enables us
to reduce the time spent on benign contracts and thus reveal
vulnerabilities more efficiently. That is, we implement xFuzz
such that each contract is only allowed to be fuzzed for tl
seconds if the ML model considers the contract benign, or
otherwise for 180 seconds, which is also the time limit adopted
in sFuzz. Note that if tl is 0, the contract is skipped entirely
when it is predicted to be benign by the ML model. The
goal is to see whether we can safely set tl to a value smaller
than 180 (i.e., without missing vulnerabilities). We thus
systematically vary the value of tl and observe the number
of identified vulnerabilities.

The results are summarized in Figure 9 and Figure 10.
Note that the tx-origin vulnerability is not included since
it is not supported by sFuzz. The red line represents vulnerabilities
found only by xFuzz, the green line represents
vulnerabilities reported only by sFuzz, and the blue line
denotes the reports shared by the two tools. We can see that
the curves climb/drop sharply at the beginning and then
saturate/flatten after 30s, indicating that most vulnerabilities
are found in the first 30s.

We observe that when tl is set to 0s (i.e., contracts
predicted as benign are skipped entirely), xFuzz still detects
82.8% (i.e., 111 out of 134, or equivalently 166% of that of
sFuzz) of the reentrancy vulnerabilities as well as 65.0%
of the delegatecall vulnerabilities (13 out of 20). The result
further improves if we set tl to 30 seconds: almost
all (all except 2 out of 174 reentrancy vulnerabilities, and none
of the delegatecall vulnerabilities missed) are identified. Based on
this result, we conclude that the ML model indeed enables us to
significantly reduce the fuzzing time on likely benign contracts
(i.e., from 180 seconds to 30 seconds) without missing almost
any vulnerability.

The Effectiveness of Path Prioritization. To evaluate
the relevance of path prioritization, we further analyze the
results of the customized version of xFuzz discussed
above. Recall that path prioritization allows us to explore
likely vulnerable paths before the remaining ones. Thus, if path
prioritization works, we would expect the vulnerabilities
to be mostly found in the paths that xFuzz explores first. We
thus systematically count the number of vulnerabilities found
in the first 10 paths explored by xFuzz. The results
are summarized in Table 8, where column "Top10" shows
the number of vulnerabilities detected in the first 10 paths
explored.

The results show that xFuzz finds a total of 152 (out of
172) reentrancy vulnerabilities in the first 10 explored paths.
In particular, the number of vulnerabilities found in the first
10 explored paths by xFuzz is almost three times as many
as that by sFuzz. Similarly, xFuzz also finds 32 (out of 33)
delegatecall vulnerabilities in the first 10 explored paths. The
results thus clearly suggest that path prioritization allows
us to focus on relevant paths effectively, which has practical
consequences for fuzzing large contracts.

Answer to RQ2: The ML model enables us to significantly
reduce the fuzzing time on likely benign contracts
without missing almost any vulnerabilities. Furthermore,
most vulnerabilities are detected efficiently
through our path prioritization. Overall, xFuzz finds
twice as many reentrancy or delegatecall vulnerabilities
as sFuzz.

6.4 RQ3: Detection Efficiency

Next, we evaluate the efficiency of our approach. We record
the time taken for each step during fuzzing, and the results are
summarized in Table 9. To eliminate randomness during
fuzzing, we replay our experiments five times and report
the averaged results. In this table, "MPT" means model
prediction time; "ST" means search time for vulnerable paths
during fuzzing; "DT" means detection time for Clairvoyance
and fuzzing time for the fuzzers. "N.A." means that
the tool has no such step in fuzzing or that the vulnerability
is currently not supported by it, and thus the time is not
recorded.

TABLE 9: The time cost of each step in the fuzzing procedures.

                          sFuzz     C.V.   xFuzz
MPT(min)    Reentrancy    N.A.      N.A.   630.6
            Delegatecall  N.A.      N.A.   630.6
            Tx-origin     N.A.      N.A.   630.6
ST(min)     Reentrancy    21,930.0  N.A.   3,621.0
            Delegatecall  22,131.0  N.A.   3,678.0
            Tx-origin     N.A.      N.A.   3,683.0
DT(min)     Reentrancy    54.1      246.2  86.6
            Delegatecall  2.8       N.A.   4.2
            Tx-origin     N.A.      N.A.   2.9
Total(min)  Reentrancy    21,984.1  246.2  4,338.2
            Delegatecall  22,133.8  N.A.   4,312.8
            Tx-origin     N.A.      N.A.   4,316.5
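As a sanity check on the totals in Table 9, the relative saving of xFuzz over sFuzz can be recomputed directly (a back-of-the-envelope script, not part of the toolchain):

```python
# Totals (in minutes) copied from Table 9.
totals = {
    "reentrancy": {"sFuzz": 21984.1, "xFuzz": 4338.2},
    "delegatecall": {"sFuzz": 22133.8, "xFuzz": 4312.8},
}

for vul, t in totals.items():
    saving = 1 - t["xFuzz"] / t["sFuzz"]
    print(f"{vul}: {saving:.1%} of the total time saved")
```

Both vulnerability types come out at roughly 80% saved, consistent with the claim discussed under RQ3.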

1 function buyOne(address _exchange, uint256 1 contract SolidStamp{


_value, bytes _data) payable public 2 function audContract(address _auditor) public
2 { onlyRegister
3 ... 3 {
4 buyInternal(_exchange, _value, _data); 4 ...
5 } 5 _auditor.transfer(reward.sub(commissionKept
6 function buyInternal(address _exc, uint256 ));
_value, bytes _data) internal 6 }
7 { 7 }
8 ... 8 contract SolidStampRegister{
9 require(_exc.call.value(_value)(_data)); 9 address public CSolidStamp;
10 balances[msg.sender] = balances[msg.sender 10 function registerAudit(bytes32 _codeHash)
].sub(_value); public
11 } 11 {
12 ...
13 SolidStamp(CSolidStamp).audContract(msg.
Fig. 11: A real-world reentrancy vulnerability found by sender);
14 }
X F UZZ ,
in which the vulnerable path relies on internal calls. 15 }

the tool has no such step in fuzzing or the vulnerability Fig. 12: A cross-contract vulnerability found by X F UZZ. This
is currently not supported by it, and thus the time is not contract is used in auditing transactions in real-world.
recorded.
The efficiency of our method (i.e., by reducing the search 1 if ((random()%2==1) && (msg.value == 1 ether)
space) is evidenced as the results show that X F UZZ is && (!locked))
2 \\at 0x11F4306f9812B80E75C1411C1cf296b04917b2f0
obviously faster than S F UZZ, i.e., saving 80% of the time. 3
The main reason for the saving is the reduction in search time (i.e., an 80% reduction). We also observe that xFuzz is slightly slower than sFuzz in terms of the effective fuzzing time, i.e., an additional 32.5 (86.6-54.1) min is used for fuzzing cross-contract vulnerabilities. This is expected, as the number of paths (even after the reduction thanks to the ML model and path prioritization) is much larger in the presence of more than 2 interacting contracts. Note that Clairvoyance is faster than all tools because it is a static detector that does not execute contracts at runtime.

Answer to RQ3: Owing to the reduced search space of suspicious functions, the guided fuzzer xFuzz saves over 80% of the searching time and reports more vulnerabilities than sFuzz in less than 20% of the time.

6.5 RQ4: Real-world Case Studies

In this section, we present 2 typical vulnerabilities reported by xFuzz to qualitatively show why xFuzz works. In general, the ML model and path prioritization help xFuzz find vulnerabilities in three ways, i.e., (1) locating vulnerable functions, (2) identifying paths from internal calls and (3) identifying feasible paths from external calls.

Real-world Case 1: xFuzz is enhanced with path prioritization, which enables it to focus on vulnerabilities related to internal calls. In Figure 11², the modifier internal limits access to internal member functions only. The attacker can, however, steal ethers via the path buyOne → buyInternal. By applying xFuzz, the vulnerability is identified in 0.05 seconds and the vulnerable path is also efficiently exposed.

Real-world Case 2: The path prioritization also enables xFuzz to find cross-contract vulnerabilities efficiently. For example, a real-world cross-contract vulnerability³ is shown in Figure 12. This example audits real-world transactions and involves over 2,000 dollars. In this example, function registerAudit has a cross-contract call to a public address CSolidStamp at line 13, which intends to forward the call to function audContract. While this function may only be accessed by registered functions, as limited by the modifier onlyRegister, we can bypass this restriction via the cross-contract call registerAudit → audContract. Eventually, an attacker would be able to steal the ethers in seconds.

Real-world Case 4: During our investigation of the experiment results, we gained the insight that xFuzz can be further improved in terms of handling complex path conditions. Complex path conditions often lead to prolonged fuzzing time or block penetration altogether. We identified a total of 3 cross-contract and 24 non-cross-contract vulnerabilities that are missed for this reason. Two such complex-condition examples (from two real-world false negatives of xFuzz) are shown in Figure 13, e.g.:

require(msg.value == 0 || (_amount == msg.value && etherTokens[fromToken]));
// at 0x1a5f170802824e44181b6727e5447950880187ab

Fig. 13: Complex path conditions involving multiple variables and values.

Function calls, values, variables and arrays are involved in these conditions, which are difficult to satisfy for xFuzz and for fuzzers in general (e.g., sFuzz failed to penetrate these paths too). This problem can potentially be addressed by integrating xFuzz with a theorem prover such as Z3 [51], tasked with solving these path conditions. That is, a hybrid fuzzing approach that integrates symbolic execution in a lightweight manner is likely to further improve xFuzz.

Answer to RQ4: With the help of model predictions and path prioritization, xFuzz is capable of rapidly locating vulnerabilities in real-world contracts. The main reason for false negatives is complex path conditions, which could potentially be addressed by integrating hybrid fuzzing into xFuzz.

2. deployed at 0x0695B9EA62C647E7621C84D12EFC9F2E0CDF5F72
3. deployed at 0x165CFB9CCF8B185E03205AB4118EA6AFBDBA9203
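Cases 1 and 2 above follow the same pattern: an access guard checks who the immediate caller is, and an unguarded public entry point on a reachable path satisfies that check on the attacker's behalf. The following Python sketch is a minimal model of this pattern, not the deployed contracts' actual code; the names Vault, Forwarder, buy_one and buy_internal are illustrative stand-ins.

```python
class Vault:
    """Toy stand-in for a guarded contract (all names are hypothetical)."""

    def __init__(self, balance=100):
        self.balance = balance
        self.authorized = set()  # models an onlyRegister-style allow list

    def buy_internal(self, caller, amount):
        # Guarded function: only authorized callers may withdraw.
        if caller not in self.authorized:
            raise PermissionError("caller not authorized")
        self.balance -= amount
        return amount


class Forwarder:
    """Toy stand-in for a registered contract with a public entry point."""

    def __init__(self, vault):
        self.vault = vault

    def buy_one(self, amount):
        # Public, unguarded entry: it forwards the call, so the *forwarder*
        # (which is authorized) is the caller the guard sees.
        return self.vault.buy_internal(caller=self, amount=amount)


vault = Vault()
proxy = Forwarder(vault)
vault.authorized.add(proxy)

# A direct call is rejected by the guard...
try:
    vault.buy_internal(caller="attacker", amount=10)
    blocked = False
except PermissionError:
    blocked = True

# ...but the path buy_one -> buy_internal bypasses it.
stolen = proxy.buy_one(10)
```

This is why path prioritization targets call chains that end in guarded functions: the guard holds for direct calls, but not necessarily for every path that reaches the function.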
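To see why a condition like the require in Fig. 13 blocks mutation-based fuzzing, while a solver-backed hybrid step would not be, consider the following Python sketch. It only mirrors the shape of the condition (the token registry and concrete values are hypothetical): the equality _amount == msg.value holds for two independent random 256-bit values with probability about 2^-256, so random mutation essentially never takes the second disjunct, whereas a solver-style step simply derives a satisfying assignment.

```python
import random

ETHER_TOKENS = {"TOKEN": True}  # hypothetical registry of ether-like tokens

def path_condition(msg_value, amount, from_token):
    # Mirrors require(msg.value == 0 ||
    #                 (_amount == msg.value && etherTokens[fromToken]))
    return msg_value == 0 or (amount == msg_value
                              and ETHER_TOKENS.get(from_token, False))

# Random mutation, as a fuzzer would do: draw independent 256-bit values.
# msg_value is forced nonzero so only the second disjunct can fire, and
# amount == msg_value then holds with probability ~2**-256.
random.seed(0)
hits = sum(
    path_condition(random.getrandbits(256) | 1,
                   random.getrandbits(256),
                   "TOKEN")
    for _ in range(100_000)
)

# A solver-style step instead treats the condition as a constraint and
# produces a model directly: pick a registered token and set the amount
# equal to msg.value.
msg_value = 7
solved = path_condition(msg_value, msg_value, "TOKEN")
```

In a hybrid setup, a prover such as Z3 [51] would produce this assignment from the symbolic path condition, and the fuzzer would then concretize it into a transaction.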
7 RELATED WORK

In this section, we discuss the works that are most relevant to ours.

Program analysis. We draw valuable development experience and domain-specific knowledge from existing work [8], [10], [3], [4], [5]. Among them, Slither [10], Oyente [8] and Atzei et al. [5] provide a transparent overview of smart contract vulnerability detection and enhance our understanding of vulnerabilities. Chen et al. [3] and Durieux et al. [4] offer evaluations of the state of the art, which helps us find the limitations of existing tools.

Cross-contract vulnerability. Our study is closely related to previous works focusing on interactions between multiple contracts. Zhou et al. [52] present work to analyze the relevance between smart contract files, which inspires us to focus on cross-contract interactions. He et al. [24] report that existing tools fail to exercise functions that can only execute at deeper states. Xue et al. [19] studied the cross-contract reentrancy vulnerability. They propose to construct an ICFG (combining CFGs with call graphs) and then track vulnerabilities by taint analysis.

Smart contract testing. Our study is also relevant to previous fuzzing work on smart contracts. Smart contract testing plays an important role in smart contract security. Zou et al. [7] report that over 85% of developers intend to do heavy testing when programming. The work of Jiang et al. [15] makes an early attempt to fuzz smart contracts: ContractFuzzer instruments the Ethereum virtual machine and then collects execution logs for further analysis. Wüstholz et al. [21] present a guided fuzzer to better mutate inputs. A similar method is implemented by He et al. [24], who propose to learn fuzzing strategies from inputs generated by a symbolic expert. The above two methods inspire us to leverage a guide to reduce the search space. Nguyen et al. [14] implement a user-friendly AFL fuzzing tool for smart contracts, on which we build our fuzzing framework. Different from these existing works, our work places a special focus on proposing a novel ML-guided method for fuzzing cross-contract vulnerabilities, which is highly important but largely untouched by existing work. Additionally, our comprehensive evaluation demonstrates that our proposed technique indeed outperforms the state of the art in detecting cross-contract vulnerabilities.

Machine learning practice. This work is also inspired by previous work [53], [54], [55], which proposes learning behavior automata to facilitate vulnerability detection. Zhuang et al. [56] propose to build graph networks on smart contracts to extend the understanding of malicious attacks. Their work inspires us to introduce machine learning methods for detection. We also improve our model selection following the work of Liu et al. [39], whose algorithm helps us select the best models with satisfactory recall and precision on a highly imbalanced dataset. Yan et al. [55] have proposed a method to mimic the cognitive process of human experts. Their work inspires us to find the consensus of vulnerability evaluators to better train the machine learning models.

Smart contract security to society. Smart contracts have drawn a number of security concerns since they came into being. As pointed out by Zou et al. [7], over 75% of developers agree that smart contract software has a much higher security requirement than traditional software. According to [7], the reasons behind this requirement are: 1) the frequent operations on sensitive information (e.g., digital currencies, tokens); 2) transactions are irreversible; 3) deployed code cannot be modified. Considering the close connection between smart contracts and financial activities, the security of smart contracts largely affects the stability of society.

8 CONCLUSION

In this paper, we propose xFuzz, a novel machine learning guided fuzzing framework for smart contracts, with a special focus on cross-contract vulnerabilities. We address two key challenges during its development: reducing the search space of fuzzing and enabling cross-contract fuzzing. The experiments demonstrate that xFuzz is much faster and more effective than existing fuzzers and detectors. In the future, we will extend our framework with more static approaches to support more vulnerabilities.

REFERENCES

[1] V. K. Das, "Top blockchain platforms of 2020," https://www.blockchain-council.org/blockchain/topblockchainplatformsof2020thateveryblockchainenthusiastmustknow/, 2020, online; accessed September 2020.
[2] Ethereum, "Ethereum daily transaction chart," https://etherscan.io/chart/tx, 2017, online; accessed 29 January 2017.
[3] H. Chen, M. Pendleton, L. Njilla, and S. Xu, "A survey on ethereum systems security: Vulnerabilities, attacks, and defenses," ACM Computing Surveys (CSUR), 2020.
[4] T. Durieux, J. F. Ferreira, R. Abreu, and P. Cruz, "Empirical review of automated analysis tools on 47,587 ethereum smart contracts," in Proceedings of the ACM/IEEE 42nd ICSE, 2020, pp. 530–541.
[5] N. Atzei, M. Bartoletti, and T. Cimoli, "A survey of attacks on ethereum smart contracts (sok)," in International Conference on Principles of Security and Trust. Springer, 2017, pp. 164–186.
[6] O. G. Güçlütürk, "The dao hack explained: Unfortunate take-off of smart contracts," https://medium.com/@ogucluturk/the-dao-hack-explained-unfortunate-take-off-of-smart-contracts-2bd8c8db3562, 2018, online; accessed 22 January 2018.
[7] W. Zou, D. Lo, P. S. Kochhar, X.-B. D. Le, X. Xia, Y. Feng, Z. Chen, and B. Xu, "Smart contract development: Challenges and opportunities," IEEE Transactions on Software Engineering, vol. 47, no. 10, pp. 2084–2106, 2019.
[8] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, and A. Hobor, "Making smart contracts smarter," in Proceedings of the 2016 ACM SIGSAC CCS, 2016, pp. 254–269.
[9] P. Tsankov, A. Dan, D. Drachsler-Cohen, A. Gervais, F. Buenzli, and M. Vechev, "Securify: Practical security analysis of smart contracts," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 67–82.
[10] J. Feist, G. Grieco, and A. Groce, "Slither: a static analysis framework for smart contracts," in 2019 IEEE/ACM 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB), 2019, pp. 8–15.
[11] Protofire, "Solhint," https://github.com/protofire/solhint, 2018, online; accessed September 2018.
[12] S. Kalra, S. Goel, M. Dhawan, and S. Sharma, "Zeus: Analyzing safety of smart contracts," in NDSS, 2018.
[13] S. Tikhomirov, E. Voskresenskaya, I. Ivanitskiy, R. Takhaviev, E. Marchenko, and Y. Alexandrov, "Smartcheck: Static analysis of ethereum smart contracts," in WETSEB, 2018, pp. 9–16.
[14] T. D. Nguyen, L. H. Pham, J. Sun, Y. Lin, and Q. T. Minh, "Sfuzz: An efficient adaptive fuzzer for solidity smart contracts," in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ser. ICSE '20, New York, NY, USA, 2020, pp. 778–788.
[15] B. Jiang, Y. Liu, and W. Chan, "Contractfuzzer: Fuzzing smart contracts for vulnerability detection," in 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2018, pp. 259–269.
[16] G. Grieco, W. Song, A. Cygan, J. Feist, and A. Groce, "Echidna: effective, usable, and fast fuzzing for smart contracts," in Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020, pp. 557–560.
[17] Q. Zhang, Y. Wang, J. Li, and S. Ma, "Ethploit: From fuzzing to efficient exploit generation against smart contracts," in 2020 IEEE 27th SANER. IEEE, 2020, pp. 116–126.
[18] J. Gao, H. Liu, Y. Li, C. Liu, Z. Yang, Q. Li, Z. Guan, and Z. Chen, "Towards automated testing of blockchain-based decentralized applications," in IEEE/ACM 27th ICPC, 2019, pp. 294–299.
[19] X. Yinxing, M. Mingliang, L. Yun, S. Yulei, Y. Jiaming, and P. Tianyong, "Cross-contract static analysis for detecting practical reentrancy vulnerabilities in smart contracts," in 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2020.
[20] G. A. Oliva, A. E. Hassan, and Z. M. J. Jiang, "An exploratory study of smart contracts in the ethereum blockchain platform," Empirical Software Engineering, pp. 1–41, 2020.
[21] V. Wüstholz and M. Christakis, "Harvey: A greybox fuzzer for smart contracts," in Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020, pp. 1398–1409.
[22] X. Du, B. Chen, Y. Li, J. Guo, Y. Zhou, Y. Liu, and Y. Jiang, "Leopard: Identifying vulnerable code for vulnerability assessment through program metrics," in 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 2019, pp. 60–71.
[23] P. Godefroid, H. Peleg, and R. Singh, "Learn&fuzz: Machine learning for input fuzzing," in 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2017, pp. 50–59.
[24] J. He, M. Balunović, N. Ambroladze, P. Tsankov, and M. Vechev, "Learning to fuzz from symbolic execution with application to smart contracts," in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019, pp. 531–548.
[25] S. T. Help, "7 principles of software testing: Defect clustering and pareto principle," https://www.softwaretestinghelp.com/7-principles-of-software-testing/, accessed March, 2021.
[26] A. Ghaleb and K. Pattabiraman, "How effective are smart contract analysis tools? evaluating smart contract static analysis tools using bug injection," in Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020, pp. 415–427.
[27] M. Ren, Z. Yin, F. Ma, Z. Xu, Y. Jiang, C. Sun, H. Li, and Y. Cai, "Empirical evaluation of smart contract testing: what is the best choice?" in Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2021, pp. 566–579.
[28] Y. Zhuang, Z. Liu, P. Qian, Q. Liu, X. Wang, and Q. He, "Smart contract vulnerability detection using graph neural network," in IJCAI, 2020, pp. 3283–3290.
[29] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
[30] M. Ren, Z. Yin, F. Ma, Z. Xu, Y. Jiang, C. Sun, H. Li, and Y. Cai, "Empirical evaluation of smart contract testing: What is the best choice?" 2021.
[31] J. F. Ferreira, P. Cruz, T. Durieux, and R. Abreu, "Smartbugs: A framework to analyze solidity smart contracts," arXiv preprint arXiv:2007.04771, 2020.
[32] xFuzz, "Machine learning guided cross-contract fuzzing," https://anonymous.4open.science/r/xFuzzforReview-ICSE, 2020, online; accessed September 2020.
[33] ethervm, "Ethereum virtual machine opcodes," https://ethervm.io/, 2019, online; accessed September 2019.
[34] T. D. Nguyen, L. H. Pham, and J. Sun, "sguard: Towards fixing vulnerable smart contracts automatically," arXiv preprint arXiv:2101.01917, 2021.
[35] Protofire, "Decentralized application security project," https://dasp.co/, accessed September, 2018.
[36] N. Stephens, J. Grosen, C. Salls, A. Dutcher, R. Wang, J. Corbetta, Y. Shoshitaishvili, C. Kruegel, and G. Vigna, "Driller: Augmenting fuzzing through selective symbolic execution," in NDSS, vol. 16, 2016.
[37] W. Drewry and T. Ormandy, "Flayer: Exposing application internals," 2007.
[38] T. Chen and C. Guestrin, "Xgboost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.
[39] X.-Y. Liu, J. Wu, and Z.-H. Zhou, "Exploratory undersampling for class-imbalance learning," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), pp. 539–550, 2008.
[40] S. C. Security, "Smart contract weakness classification registry," https://github.com/SmartContractSecurity/SWC-registry, 2019, online; accessed September 2019.
[41] U. Alon, M. Zilberstein, O. Levy, and E. Yahav, "code2vec: Learning distributed representations of code," Proceedings of the ACM on Programming Languages, 2019.
[42] Z. Li, D. Zou, J. Tang, Z. Zhang, M. Sun, and H. Jin, "A comparative study of deep learning-based vulnerability detection system," IEEE Access, pp. 103184–103197, 2019.
[43] G. Grieco, G. L. Grinblat, L. Uzal, S. Rawat, J. Feist, and L. Mounier, "Toward large-scale vulnerability discovery using machine learning," in Proceedings of the 6th ACM Conference on Data and Application Security and Privacy, 2016, pp. 85–96.
[44] T. Kamiya, S. Kusumoto, and K. Inoue, "Ccfinder: a multilinguistic token-based code clone detection system for large scale source code," IEEE Transactions on Software Engineering, pp. 654–670, 2002.
[45] J. L. Leevy, T. M. Khoshgoftaar, R. A. Bauder, and N. Seliya, "A survey on addressing high-class imbalance in big data," Journal of Big Data, p. 42, 2018.
[46] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "Smote: synthetic minority over-sampling technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
[47] Z. Huang, W. Xu, and K. Yu, "Bidirectional lstm-crf models for sequence tagging," arXiv preprint arXiv:1508.01991, 2015.
[48] Z. Tian, J. Tian, Z. Wang, Y. Chen, H. Xia, and L. Chen, "Landscape estimation of solidity version usage on ethereum via version identification," International Journal of Intelligent Systems, vol. 37, no. 1, pp. 450–477, 2022.
[49] S. Contract, "Function selector," https://solidity-by-example.org/function-selector/, accessed March, 2021.
[50] Dedaub, "Security technology for smart contracts," https://contract-library.com/, 2020, online; accessed 29 January 2020.
[51] L. De Moura and N. Bjørner, "Z3: An efficient smt solver," in International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2008, pp. 337–340.
[52] E. Zhou, S. Hua, B. Pi, J. Sun, Y. Nomura, K. Yamashita, and H. Kurihara, "Security assurance for smart contract," in 2018 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS). IEEE, 2018, pp. 1–5.
[53] H. Xiao, J. Sun, Y. Liu, S.-W. Lin, and C. Sun, "Tzuyu: Learning stateful typestates," in 2013 28th IEEE/ACM ASE. IEEE, 2013, pp. 432–442.
[54] Y. Xue, J. Wang, Y. Liu, H. Xiao, J. Sun, and M. Chandramohan, "Detection and classification of malicious javascript via attack behavior modelling," in Proceedings of the 2015 ISSTA, 2015, pp. 48–59.
[55] G. Yan, J. Lu, Z. Shu, and Y. Kucuk, "Exploitmeter: Combining fuzzing with machine learning for automated evaluation of software exploitability," in 2017 IEEE Symposium on Privacy-Aware Computing (PAC). IEEE, 2017, pp. 164–175.
[56] Y. Zhuang, Z. Liu, P. Qian, Q. Liu, X. Wang, and Q. He, "Smart contract vulnerability detection using graph neural network," International Joint Conferences on Artificial Intelligence Organization, 2020, pp. 3283–3290.