BuSS: A solid-state drive/magnetic disk hybrid storage architecture to
improve wear-leveling
Brett L. Dennis, Sushanth Kumar Reddy, Seung-jin Kim
University of Arizona
{dennisb, sushanth, seungjin}@email.arizona.edu
Abstract

Traditional flash translation layers (FTLs) exist to map user-level write operations to physical solid-state flash memory drives (SSDs). These FTLs use log-structured block allocation to facilitate wear leveling and increase the overall lifetime of the SSD. In many of these schemes, a user-level logical block is mapped to one or more physical blocks in the drive. When it becomes necessary to recycle a physical block, a merge operation is performed to consolidate primary and secondary data mapped to multiple physical SSD blocks into a single physical SSD block. The secondary blocks are then erased and may be reused. These erase operations are expensive in part because they create wear on the physical SSD medium and reduce its overall lifetime. This paper introduces and explains the BuSS FTL, which employs a large SSD as primary storage and a small magnetic disk to store secondary blocks (also called replacement blocks), thus reducing and in some cases eliminating the need for erase operations on the SSD. The number of erase operations of the BuSS FTL is compared with that of other FTLs.

1. Introduction

NAND flash based solid-state drives (SSDs) are being used more frequently in today's storage systems, due at least in part to the nonvolatile nature of NAND flash, increases in performance and capacity, and decreases in cost. NAND flash memory is built on an architecture of blocks and pages. A block is the smallest unit of erase on such systems and a page is the smallest unit of read/write. A block contains multiple pages (e.g., 64 pages of 2,048 bytes each for a block size of 128 KB). NAND flash memory allows for only a limited number of erase operations before the medium becomes worn out and unreliable. To decrease the amount of wear on an SSD (a process referred to as wear leveling), flash translation layers (FTLs) have been developed to map user-level read and write requests to physical pages in the SSD using a log-structured approach [1] [2]. In typical FTLs, a user-level page will be mapped to one or more physical pages. Early attempts at FTLs included the ANAND mapping, which mapped a user block to one primary block and up to n secondary blocks, where n was the number of pages per block and each page in the block was mapped in a single-associative fashion to a single page location in each of its secondary blocks. The FMAX mapping improved on this idea by mapping a user block to one primary block and at most one secondary block. FMAX mapped its secondary block in a block-associative manner, meaning that a page could be written to any location within the block (the concept of associativity is analogous to the associativity of a cache in a memory hierarchy, and is explained in greater detail in [3]). The FAST FTL [4] made further improvements to the layout of the physical SSD by making page writes fully associative.

No matter which FTL is used, wear will impose a definite and limited read/write/erase lifetime on an SSD. Typical numbers of read-write operations on an SSD range from 300,000 to 1,000,000 [5]. This motivates research into ways of reducing the number of reads, writes, and erases. Since wear is not as significant a factor on magnetic disk, we propose a system called BuSS, which integrates a small magnetic disk into an SSD storage architecture. The magnetic disk is used for replacement blocks and allows in-place replacement. This system differs from existing hybrid SSD/HDD architectures [6] in two respects. First, existing hybrid architectures use a large magnetic disk as the primary storage medium and flash memory as a
backup. BuSS uses a large amount of flash memory and a small magnetic disk and thus is intended as an alternative to flash FTLs. Second, existing hybrid systems employ a combination of magnetic disk and flash memory in order to improve read/write performance, whereas the objective of the BuSS approach is to improve wear leveling. Section 2 describes the design of this system. Section 3 presents basic calculations that describe the wear of BuSS compared with other FTLs when data is written to the disk in a trivial looping pattern. Section 4 describes an environment in which BuSS and the traditional FTLs were simulated using a large amount of random data and reports the results. Section 5 concludes.

2. Design

2.1 Basic Architecture

Physically, BuSS consists of a small hard disk and an array of one or more large SSDs. Logically, the BuSS system has a controller between the small hard disk and the SSD. This controller works as the gateway of the BuSS system. The solid-state portion acts as the system's primary storage, and its size is always greater than that of the magnetic disk. A RAID layout may be used to structure the larger SSD volume.

Figure 1: BuSS System (controller, solid-state disk, magnetic disk)

2.2 Operation

The solid-state disk portion of the BuSS system works as the main storage, and the magnetic disk portion contains the replacement blocks for the solid-state disk portion. Since a magnetic disk allows in-place replacement of a page, the system does not need to erase a whole SSD block to write new pages. It can instead overwrite the existing pages on the magnetic disk portion. In this case, the system will require fewer erases in the SSD portion.

The controller is the gateway of the whole BuSS system. Since the BuSS system has two storage locations, the operating system (OS) does not know which portion holds the data to be read. The controller receives the request from the OS and checks the magnetic portion of the system first. If the magnetic portion has the data, the controller returns the data from the magnetic disk to the OS. If the magnetic portion does not have the requested data, the controller reads from the SSD portion and returns that data to the OS.

2.3 Research Approach

The BuSS system is designed to lower SSD flash memory wear. To reach this goal, the BuSS system minimizes the number of disk erases per file system operation in the SSD portion of the system. To assess the efficiency of BuSS and evaluate this proposal, we used mathematical system models to determine expected BuSS performance when run using trivial read/write patterns as input. We also simulated expected real-world usage scenarios with randomly generated data and compared the results of the BuSS system with other FTLs.

2.4 Flash Translation Layer for BuSS

The FTL design in BuSS is similar to the FAST FTL. Replacement blocks are stored on the magnetic disk. The magnetic disk will act as a
replacement pool for the SSD. Since the magnetic disk will contain only replacement blocks and pages, it is smaller in capacity than the SSD. Read and write behavior is defined as follows:

Write Operation: When a write request is received by the controller, the controller will check the following:
• If the page is not written in the SSD, then the system will write it to the SSD.
• If the page is present in the SSD and not present on the magnetic disk, then the replacement page will be written to the magnetic disk.
• If the page is already present on the magnetic disk, then the system will perform in-place replacement of the page.

Read Operation: Whenever a read request is received, the controller will check the following:
• Whether the page is present on the magnetic disk. If it is, the copy of the page on the magnetic disk (MD) is the most current version of the page, so the system will read the page from the magnetic disk.
• If the page is not present on the magnetic disk, then the system will look for it in the SSD.

Garbage collection: Garbage collection is required when the replacement blocks fill up the space on the magnetic disk. Garbage collection is triggered when the system is idle and the number of pages filled is 70% of the total pages that the magnetic disk can accommodate. The percentage is arbitrary, and further research is required to determine the optimal value. We keep 30% free to allow the system to keep writing to the magnetic disk during a long write. The garbage collector will clean up the whole of the magnetic disk when triggered. So only when the garbage collector is triggered will we see erase operations in the SSD.

3. System model

In this system model, we use mathematical equations to estimate the efficiency of the BuSS system. From the given information, we calculate the expected number of erases for various existing FTLs (ANAND, FMAX, and FAST) and for the BuSS model with different data writing patterns.

3.1 FTL Calculations Assumption

We have made the following assumptions in our analysis.

Page size            2 KB
Block size           128 KB (1 block = 64 pages)
Magnetic disk size   10 GB
SSD size             100 GB
Table 1

For simplicity it is assumed that all page writes would need replacement blocks, that is, every page write overwrites an existing page. Since NAND flash doesn't do in-place replacement, we need to allocate a replacement block and write the page there. Let us assume that the page write sequence is

1 2 3 4 5 1 2 3 4 5 1 2 and so on...

The numbers represent the index of the file system page number from the user's perspective; for example, the user may be writing iteratively to the first five blocks of a file. The pattern is a sequential loop write where the page number of the write sequence repeats after every 5 writes.

3.2 ANAND FTL Calculation

In the ANAND FTL, a replacement block is created whenever a page has to be replaced. The position of the page in the replacement block must be the same as in the original block. For example, consider the following figure.
Main block   Rep. block 1   Rep. block 2
1            1_1            1_11
2            2_1            2_11
3            3_1
4            4_1
5            5_1
Figure 2

Here 1_1 denotes the first rewritten copy of page 1 and 1_11 the second. We can see in Figure 2 that the new page 1_1 cannot overwrite page 1 in place. So we create replacement block 1 and then write page 1_1 in the same position. We do the same for 1_11, which goes into replacement block 2. For a 100 GB disk, starting with the 6th write, every pass over the five-page loop creates one replacement block. The 800k blocks in the flash will be filled after 4000k (800k * 5) writes. Once the flash drive is full, the garbage collector has to be invoked. When the garbage collector is called it has to perform 800k - 1 erases.

Graph 1 shows how the number of erases increases as the number of writes increases, keeping other parameters constant.

Graph 1

We see that the number of erases depends upon the size of the sequential write sequence.

3.3 FMAX FTL Calculation

FMAX has one primary block and one secondary block. When the secondary block is filled, both blocks are erased.

Primary   Secondary
1         1_1
2         2_1
3         3_1
4         4_1
5         5_1
Figure 3

In the FMAX FTL, the first five pages are written in the primary block. When the FTL has to overwrite a page, it creates a new secondary block and then keeps writing pages to the secondary block until it is full. Once it is full, the FTL erases both blocks.

Graph 2
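The loop-pattern arithmetic for ANAND and FMAX described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' model: the function names are hypothetical, "k" is taken to mean 1024 (which makes 100 GB / 128 KB come out to exactly 800k blocks), and for FMAX it is assumed that after a merge the current data sits in a fresh primary block, so every later write in the loop is again an overwrite.

```python
# Simplified erase counts for the looping write pattern 1 2 3 4 5 1 2 ...
# Assumptions (not from the paper): binary units (k = 1024) and the
# hypothetical function names used below.

PAGES_PER_BLOCK = 64             # Table 1: 128 KB block / 2 KB page
BLOCK_BYTES = 128 * 1024
SSD_BYTES = 100 * 1024**3        # Table 1: 100 GB SSD
LOOP_LEN = 5                     # pages 1..5 are rewritten in a loop


def anand_fill_point(ssd_bytes=SSD_BYTES, block_bytes=BLOCK_BYTES, loop_len=LOOP_LEN):
    """Return (blocks, writes needed to fill them, erases at garbage collection)."""
    blocks = ssd_bytes // block_bytes
    writes_to_fill = blocks * loop_len   # one new block per pass over the loop
    erases_at_gc = blocks - 1            # the text's figure of 800k - 1 erases
    return blocks, writes_to_fill, erases_at_gc


def fmax_erases(total_writes, loop_len=LOOP_LEN, pages_per_block=PAGES_PER_BLOCK):
    """Count erases when one primary and one secondary block serve the loop."""
    erases = 0
    primary_written = 0      # pages placed in the primary block so far
    secondary_used = 0       # page slots consumed in the secondary block
    for _ in range(total_writes):
        if primary_written < loop_len:
            primary_written += 1          # first write of each page
            continue
        secondary_used += 1               # an overwrite goes to the secondary block
        if secondary_used == pages_per_block:
            erases += 2                   # secondary is full: erase both blocks
            secondary_used = 0            # assumed: merged data sits in a fresh primary
    return erases


if __name__ == "__main__":
    print(anand_fill_point())
    print(fmax_erases(LOOP_LEN + PAGES_PER_BLOCK))
```

Under these assumptions the ANAND figures reproduce the 800k blocks and 4000k writes quoted in the text, and FMAX pays two erases for every 64 overwrites regardless of the loop length.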
Graph 2 shows the number of erases against the number of writes in the FMAX FTL.

We see similar erase behavior to the ANAND FTL, but with a significant reduction in the number of erases for the same number of page write operations.

3.4 FAST FTL Calculation

The FAST FTL is a bit different from ANAND and FMAX. It has a replacement pool where all replacement pages are written. Once the pool is full, one block is chosen from the pool to be erased. This system doesn't depend on the looping behavior of the write sequence, but the number of erases does depend on the replacement pool size. As the number of blocks allocated to the replacement pool increases, the number of erases decreases. We can see this behavior in Graph 3.

Graph 3

The legend gives the size of the replacement pool in blocks, in units of K (e.g. 10 means 10*1024 blocks).

3.5 BuSS FTL Calculation

A BuSS drive consists of a magnetic disk (MD) and an SSD. The MD contains the replacement blocks. Since the magnetic disk allows in-place replacement of a page, we need not erase a whole block to write a page when we overwrite an existing page. So if writes arrive in a sequential loop as in the example above, we would see zero erases.

Graph 4

3.6 Analysis of Random data

The number of erases in the BuSS system depends on the type of write sequence. If the write sequence is a pure looping pattern such as 1,2,3,4,1,2,3,4,1,2,3,4,... BuSS can prevent many of the erase operations that would take place in traditional FTLs. However, page writes do not always follow a pure looping pattern. We needed to consider a more realistic sequence of writes to calculate the number of erases.

We assume that X% of the writes in the write sequence will overwrite existing pages, and that the other writes don't overwrite. For example, consider the following sequence of writes:

1 2 3 4 5 6 7 8 9 3

We can see that page number 3 is repeated, so it would be written over the existing page. In this sequence, 1 out of 10 writes is an overwrite. We say that this write sequence has 10% overwriting. In the same manner, the write sequence

1 2 3 4 5 2 3 4 6 7

has 2, 3, 4 repeated. So we can say this sequence
has 3/10 overwriting, or 30% overwriting. We derived a formula to calculate the number of erases given the percentage of overwrites in the write sequence. With this formula we can calculate the number of erases for different percentages. It follows that the greater the overwrite percentage, the smaller the number of erases.

E = number of erases
T = total number of writes
n = number of pages in a block
p = overwrite percentage

E = ((T - n) * (100 - p) / 100) / n
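As a quick check of the formula, the sketch below evaluates it directly. The function name `buss_erases` and the sample values are illustrative assumptions; only the expression itself comes from the text.

```python
def buss_erases(total_writes, pages_per_block, overwrite_pct):
    """Evaluate E = ((T - n) * (100 - p) / 100) / n as stated in the text."""
    if not 0 <= overwrite_pct <= 100:
        raise ValueError("overwrite_pct must be between 0 and 100")
    non_overwrites = (total_writes - pages_per_block) * (100 - overwrite_pct) / 100
    return non_overwrites / pages_per_block


if __name__ == "__main__":
    # Illustrative values: T = 1,000,064 writes, n = 64 pages per block.
    for pct in (0, 30, 100):
        print(pct, buss_erases(1_000_064, 64, pct))
```

With these sample values the result falls from 15625.0 erases at 0% overwriting to 10937.5 at 30% and to 0.0 at 100%, which matches the statement that a higher overwrite percentage means fewer erases.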
Graph 5

The legend represents the percentage of writes that will overwrite pages that already exist in the replacement blocks. We can see that the number of erases decreases drastically as the percentage of overwriting increases.

In the worst case, i.e. with 0% overwriting, BuSS incurs the same number of erases as the FAST FTL.

Graph 6: Comparison of FTLs

We compared the number of erases in the SSD for different FTLs and found that BuSS performs better than the other FTLs.

4. Simulation

4.1 Simulation method

As we mentioned previously, page writes do not always follow a pure looping pattern. A more realistic pattern of data reads and writes follows the "Pareto principle", also known as the 80-20 rule. We assume that 20% of disk blocks are accessed 80% of the time and 80% of disk blocks are accessed 20% of the time. Based on this assumption, we generated a random access pattern with the stipulation that 80% of all disk accesses would draw from a "hot set" array of randomly generated indices, the size of which was 20% of the total accesses. This was intended to simulate access patterns that are more typical in the real world. We developed simulation software which simulated the flash translation layers FMAX, FAST and BuSS. We input the generated numbers into this system and calculated the number of erases for each FTL. We assumed the size of the magnetic disk to be 1 GB and the size of the SSD to be 16 GB.
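A hot-set access pattern of the kind described above could be generated along the following lines. This is only a sketch under stated assumptions, not the authors' simulator: the function name and parameters are hypothetical, the hot set is sized as a fraction of the page address space (the text sizes it relative to the total number of accesses), and the remaining accesses are drawn uniformly from all pages, so a few of them also land in the hot set.

```python
import random


def pareto_accesses(num_accesses, num_pages, hot_fraction=0.2, hot_prob=0.8, seed=0):
    """Return (accesses, hot_set) where about hot_prob of accesses hit the hot set."""
    rng = random.Random(seed)                            # seeded for repeatable runs
    hot_size = max(1, int(num_pages * hot_fraction))
    hot_pages = rng.sample(range(num_pages), hot_size)   # randomly chosen hot pages
    accesses = []
    for _ in range(num_accesses):
        if rng.random() < hot_prob:
            accesses.append(rng.choice(hot_pages))       # access drawn from the hot set
        else:
            accesses.append(rng.randrange(num_pages))    # access drawn from all pages
    return accesses, set(hot_pages)


if __name__ == "__main__":
    accesses, hot = pareto_accesses(10_000, 1_000)
    share = sum(page in hot for page in accesses) / len(accesses)
    print(f"share of accesses that hit the hot set: {share:.2f}")
```

The resulting list of page indices can then be fed, one write at a time, into whichever FTL model is being compared.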
We assumed the replacement pool size to be equivalent to the size of the magnetic disk for the FAST FTL. We also assumed a 70% threshold for garbage collection in BuSS, as described in Section 2.4.

Writes (millions)   FAST       BuSS     FMAX
1                   111515     0        1301170
2                   752632     5735     3200550
3                   1400128    11470    4967260
4                   2051677    17205    6683134
5                   2711160    22945    8377690
Table 2

Graph 7

4.2 Simulation analysis

The preceding simulation shows a dramatic improvement in wear leveling when the BuSS architecture is used, due to a reduced number of erases compared with the other simulated FTLs. In all FTLs, we observed that the number of erases increases approximately linearly with the number of writes. In FMAX, erases increase by roughly 1.7 to 1.9 million for every million writes (Table 2). The FAST mapping shows an increase of about 600,000 to 700,000 erases per million writes. In the BuSS system, the number of erases is only about 5,700 per million writes (Table 2), an improvement of about 100 times over the FAST method.

5. Conclusion

By integrating a small disk into the SSD storage architecture, BuSS significantly reduces the amount of wear on an SSD, while retaining the advantages of both magnetic disk and flash memory. These advantages include lower energy consumption than a pure magnetic disk solution, lower latency of small reads and writes than could be found in a pure magnetic disk solution, and, as the price of SSD continues to fall, a lower cost-benefit ratio of the overall system. This paper has shown, furthermore, that wear leveling is greatly improved over a pure SSD solution, as BuSS performs fewer erases than any of the existing FTLs in our simulations.

6. References

[1] M. Rosenblum, J. Ousterhout. The Design and Implementation of a Log-Structured File System. In ACM Transactions on Computer Systems. 1991.

[2] Intel Corp. Understanding the Flash Translation Layer (FTL) Specification. 1998.

[3] J. Hennessy, D. Patterson. Computer Architecture: A Quantitative Approach. Fourth Ed., Morgan Kaufmann Publishers. 2007.

[4] S. Lee, D. Park, T. Chung, D. Lee, S. Park, and H. Song. A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation. In ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, 2007.

[5] "Flash SSDs – Inferior Technology or Closet Superstar?" Bit Micro Networks website, 2008: http://www.bitmicro.com/press_resources_flash_ssd.php.

[6] "A Design for High Performance Flash Disks." Microsoft Research website, 2005: http://research.microsoft.com/research/pubs/view.aspx?type=Technical%20Report&id=1032.