Analytic Top-N Queries
One of the more advanced tricks I like to exploit is the analytic
Top-N query. Although I have been using them for quite a while, I
recently discovered a “limitation” that I was not aware of.
Actually—to be honest—it’s not a limitation; it is a missing
optimization in a rarely used feature that can easily be worked
around. I must admit that I am asking for quite a lot in that case.
The article starts with a general introduction to Top-N
queries, applies that technique to analytic queries, and
explains the case where I miss an optimization. But is it
really worth all that effort? The article concludes with my
answer to that question.
Please find the CREATE and INSERT statements at the end of
the article.
Top-N Queries
Top-N queries are queries for the first N rows according to a
specific sort order—e.g., the first three rows like that:
select * from (
  select start_key, group_key, junk
  from demo
  where start_key = 'St'
  order by group_key
) where rownum <= 3;
That’s well known and very straightforward. However, the
interesting part is performance—as usual. A naïve
implementation executes the inner SQL first—that is, it fetches
and sorts all the matching records—before limiting the result
set to the first three rows. In the absence of a useful index, that
is exactly what happens:
START_KEY GROUP_KEY JUNK
---------- --------- ----------
St 1 junk
St 3 junk
St 10 junk
3 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 142682949

---------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost |
---------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     3 |  1032 | 8240 |
|*  1 |  COUNT STOPKEY           |      |       |       |      |
|   2 |   VIEW                   |      |   370 |   124K| 8240 |
|*  3 |    SORT ORDER BY STOPKEY |      |   370 | 76960 | 8240 |
|*  4 |     TABLE ACCESS FULL    | DEMO |   370 | 76960 | 8239 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)
   4 - filter("START_KEY"='St')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
30370 consistent gets
30365 physical reads
0 redo size
998 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3 rows processed
A full table scan is performed—more on that in a few
seconds—to retrieve all the rows that match the where
clause; about 370 according to the optimizer’s estimate. The
next step sorts the entire result set. Finally, the limit is applied—
the COUNT STOPKEY step—and the number of rows is reduced
to three.
The performance problem of this query is obviously the full
table scan. Let’s create an index to make it go away:
create index demo_idx on demo (start_key);
exec dbms_stats.gather_index_stats(null, 'DEMO_IDX');
That’s much better:
START_KEY GROUP_KEY JUNK
---------- --------- ----------
St 1 junk
St 3 junk
St 10 junk
3 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1129354520

--------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |     3 |  1032 |  372 |
|*  1 |  COUNT STOPKEY                 |          |       |       |      |
|   2 |   VIEW                         |          |   370 |   124K|  372 |
|*  3 |    SORT ORDER BY STOPKEY       |          |   370 | 76960 |  372 |
|   4 |     TABLE ACCESS BY INDEX ROWID| DEMO     |   370 | 76960 |  371 |
|*  5 |      INDEX RANGE SCAN          | DEMO_IDX |   370 |       |    3 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)
   5 - access("START_KEY"='St')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
360 consistent gets
201 physical reads
0 redo size
998 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3 rows processed
You can see that the full table scan was replaced by an index
lookup and the corresponding table access. The other steps
remain unchanged.
However, this is still a bad execution plan because all
matching records are fetched and sorted just to throw most
of them away. The following index allows a much better
execution plan:
drop index demo_idx;
create index demo_idx on demo (start_key, group_key);
exec dbms_stats.gather_index_stats(null, 'DEMO_IDX');
The new execution plan looks like this:
ID START_KEY GROUP_KEY
---------- ---------- ---------
936196 St 1
232303 St 3
759212 St 10
3 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1891928015

--------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |     3 |   465 |    7 |
|*  1 |  COUNT STOPKEY                 |          |       |       |      |
|   2 |   VIEW                         |          |     3 |   465 |    7 |
|   3 |    TABLE ACCESS BY INDEX ROWID | DEMO     |     3 |    36 |    7 |
|*  4 |     INDEX RANGE SCAN           | DEMO_IDX |   370 |       |    3 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   4 - access("START_KEY"='St')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
609 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
Well, that is efficient. The sort operation has vanished entirely
because the index definition supports the ORDER BY clause.
Even more powerful: the STOPKEY takes effect down to
the index range scan. You can see the reduced number of
table accesses in the plan. Although not visible in the
execution plan, the index range scan is also aborted after
fetching the first three records.
Well, that optimization has been in the Oracle database for quite a
while—at least since 8i, I guess. After that preparation, I can
demonstrate what 10R2 has to offer on top of that.
Analytic Top-N Queries
It is actually the very same story with a small extension: I
don’t want to retrieve the first N rows, but all the rows
where the group_key value is at its minimum for the
respective start_key. A very straightforward solution is this:
select id, start_key, group_key
from demo
where start_key = 'St'
and group_key = (select min(group_key)
from demo
where start_key = 'St'
);
That statement is perfectly legal—even performance-wise:
ID START_KEY GROUP_KEY
---------- ---------- ---------
936196 St 1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1142136980

--------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |     1 |    12 |    8 |
|   1 |  TABLE ACCESS BY INDEX ROWID   | DEMO     |     1 |    12 |    5 |
|*  2 |   INDEX RANGE SCAN             | DEMO_IDX |     1 |       |    3 |
|   3 |    SORT AGGREGATE              |          |     1 |     7 |      |
|   4 |     FIRST ROW                  |          |     1 |     7 |    3 |
|*  5 |      INDEX RANGE SCAN (MIN/MAX)| DEMO_IDX |     1 |     7 |    3 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("START_KEY"='St' AND "GROUP_KEY"=
              (SELECT MIN("GROUP_KEY") FROM "DEMO" "DEMO"
               WHERE "START_KEY"='St'))
   5 - access("START_KEY"='St')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
550 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Analytic functions
Analytic functions can perform calculations on the basis of multiple rows.
They are, however, not to be confused with aggregate functions: analytic
functions work without GROUP BY. A very typical use for analytic functions
is a running balance; that is, the sum of all rows up to and including the
current row.
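For illustration only, a running balance over the demo table of this article
could look like the following sketch (the alias running_balance is made up):

select id, group_key,
       -- running balance: sum of group_key over all rows up to and
       -- including the current row, in order of id
       sum(group_key) over (order by id) running_balance
  from demo
 where start_key = 'St';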
The function used in the example (dense_rank) returns the rank of the
current row according to the supplied OVER(ORDER BY) clause—that is, in
turn, not to be confused with a regular ORDER BY.
orafaq.com has a nice intro to Oracle analytic functions.
The first step is to fetch the smallest group_key. Because of
the min/max optimization in combination with a well-suited
index, the database doesn’t need to sort the data—it just picks
the first record from the index, which must be the smallest
anyway. The second step is to perform a regular index lookup
for the start_key and the group_key that was just retrieved
from the sub-query.
Another possible implementation for that is to use
an analytic function:
select * from (
  select id, start_key, group_key,
         dense_rank() OVER (order by group_key) rnk
  from demo
  where start_key = 'St'
) where rnk <= 1;
Do you recognize the pattern? It is very similar to the
traditional Top-N query that was described at the beginning
of this article. Instead of limiting on
the rownum pseudocolumn we use an analytic function. The
execution plan reveals the performance characteristic of
that statement:
ID START_KEY GROUP_KEY RNK
---------- ---------- --------- ----------
936196 St 1 1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3221234897

--------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |   370 | 62160 |  374 |
|*  1 |  VIEW                          |          |   370 | 62160 |  374 |
|*  2 |   WINDOW NOSORT STOPKEY        |          |   370 |  4440 |  374 |
|   3 |    TABLE ACCESS BY INDEX ROWID | DEMO     |   370 |  4440 |  373 |
|*  4 |     INDEX RANGE SCAN           | DEMO_IDX |   370 |       |    3 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"<=1)
   2 - filter(DENSE_RANK() OVER ( ORDER BY "GROUP_KEY")<=1)
   4 - access("START_KEY"='St')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
610 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
What was the COUNT STOPKEY operation for the traditional
Top-N query has become the WINDOW NOSORT
STOPKEY operation for the analytic function. However, the
expected number of rows is not known to the optimizer—any
number of rows could have the lowest group_key value. Still,
the index range scan is aborted once the required rows have
been fetched. On the one hand, the consistent gets are even
better than with the sub-query statement. On the other hand,
the cost value is higher. Whenever you use analytic
functions, run a benchmark to know the actual
performance.
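For example, a quick comparison in SQL*Plus (presumably the tool that
produced the statistics shown above) can be as simple as this sketch:

-- show elapsed time and the execution statistics for each statement
set timing on
set autotrace traceonly statistics

-- then run the sub-query variant and the analytic variant back to back
-- and compare elapsed time, consistent gets and physical reads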
Let’s spend a few thoughts on this optimization. The
database knows that the index order corresponds to
the OVER (ORDER BY) clause and avoids the sort
operation. But even more impressive is that it can abort
the range scan as soon as the first value that doesn’t match
the rnk <= 1 expression is fetched. That is only possible
because the dense_rank() function cannot decrease if the
rows are fetched in the order of the OVER (ORDER BY) clause.
That’s impressive, isn’t it?
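To see why, the ranks can be listed in fetch order; a small sketch against
the demo table:

select group_key,
       dense_rank() over (order by group_key) rnk
  from demo
 where start_key = 'St'
 order by group_key;

-- rnk can only stay the same or grow from one row to the next,
-- so once it exceeds 1, no later row can qualify for rnk <= 1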
Mass Top-N Queries
The next step towards the issue that made me write this
article is a mass Top-N query. With the previous
statement as a basis, it is actually quite simple: just remove
the inner where clause to get the result for
all start_key values and add a partition clause to make sure
the rank is built individually for each start_key:
select * from (
  select start_key, group_key, junk,
         dense_rank() OVER (partition by start_key
                            order by group_key) rnk
  from demo
) where rnk <= 1;
Declaring the partition is required to make sure that
start_keys that don’t have a group_key of one still
show up, with their lowest group_key value.
With that query, we have reached the end of the optimizer’s
smartness—as of release 11r2. At first sight, the plan is
not surprising:
3260 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1766530486

-----------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |  1000K|   340M| 8239 |
|*  1 |  VIEW                     |      |  1000K|   340M| 8239 |
|*  2 |   WINDOW SORT PUSHED RANK |      |  1000K|   198M| 8239 |
|   3 |    TABLE ACCESS FULL      | DEMO |  1000K|   198M| 8239 |
-----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"<=1)
   2 - filter(DENSE_RANK() OVER ( PARTITION BY "START_KEY"
              ORDER BY "GROUP_KEY")<=1)
Statistics
----------------------------------------------------------
22 recursive calls
20 db block gets
30370 consistent gets
33163 physical reads
0 redo size
59215 bytes sent via SQL*Net to client
2806 bytes received via SQL*Net from client
219 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
3260 rows processed
It’s a full table scan. However, a “mass” query performs a
full table scan for a good reason—that did not catch my
attention. What did catch my attention is the following:
select * from (
  select * from (
    select start_key, group_key, junk,
           dense_rank() OVER (partition by start_key
                              order by group_key) rnk
    from demo
  ) where rnk <= 1
) where start_key = 'St';
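Side note: the inner two query blocks can also be wrapped in a view so
that one definition serves both the mass query and the individual query;
a sketch (the view name demo_top is made up and does not appear in the
original statements):

create view demo_top as
select start_key, group_key, junk,
       dense_rank() over (partition by start_key
                          order by group_key) rnk
  from demo;

-- mass query:        select * from demo_top where rnk <= 1;
-- individual query:  select * from demo_top
--                     where rnk <= 1 and start_key = 'St';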
It is actually the individual Top-N query again. This time it is
built on the basis of the mass Top-N query—the part that could
be set up as a view, as sketched above. That way, a single
database view can be used for any mass query as well as for
individual Top-N queries—that’s a maintainability benefit. If the
advanced magic that aborts the index range scan were still
working, it would be extremely efficient as well. The execution
plan proves the opposite:
START_KEY GROUP_KEY JUNK RNK
---------- --------- ---------- ----------
St 1 junk 1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1309566133

--------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |   370 |   128K|  373 |
|*  1 |  VIEW                          |          |   370 |   128K|  373 |
|*  2 |   WINDOW NOSORT                |          |   370 | 76960 |  373 |
|   3 |    TABLE ACCESS BY INDEX ROWID | DEMO     |   370 | 76960 |  373 |
|*  4 |     INDEX RANGE SCAN           | DEMO_IDX |   370 |       |    3 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"<=1)
   2 - filter(DENSE_RANK() OVER ( PARTITION BY "START_KEY"
              ORDER BY "GROUP_KEY")<=1)
   4 - access("START_KEY"='St')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
363 consistent gets
0 physical reads
0 redo size
808 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Although no sort is required, the STOPKEY has disappeared
from the WINDOW NOSORT operation. That means that the full
index range scan is performed, for all 359 rows where
start_key='St'. On top of that, the number of consistent
gets is rather high. A closer look at the execution plan
reveals that the entire row is fetched from the
table before the filter on the analytic expression is applied.
The junk column that is fetched from the table is not
required to evaluate this predicate; it would be
possible to fetch that column only for those rows that pass
the filter.
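A possible manual workaround (just a sketch of the idea, not something the
optimizer does on its own) is to rank on the indexed columns only and join
back to the table for the few surviving rows. Whether the inner query block
is really answered from the index alone is up to the optimizer:

select d.start_key, d.group_key, d.junk, t.rnk
  from (
        -- the ranking needs only start_key, group_key and the rowid,
        -- all of which could be taken from demo_idx
        select rowid rid,
               dense_rank() over (partition by start_key
                                  order by group_key) rnk
          from demo
         where start_key = 'St'
       ) t
  join demo d
    on d.rowid = t.rid
 where t.rnk <= 1;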
The “premature table access” is the reason why the full
table scan is more efficient for the mass query than an index
full scan. Have a look at the (hinted) index full scan
execution plan for the mass query:
3260 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1402975529

---------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |  1000K|   340M| 1002K |
|*  1 |  VIEW                          |          |  1000K|   340M| 1002K |
|*  2 |   WINDOW NOSORT                |          |  1000K|   198M| 1002K |
|   3 |    TABLE ACCESS BY INDEX ROWID | DEMO     |  1000K|   198M| 1002K |
|   4 |     INDEX FULL SCAN            | DEMO_IDX |  1000K|       |  2504 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"<=1)
   2 - filter(DENSE_RANK() OVER ( PARTITION BY "START_KEY"
              ORDER BY "GROUP_KEY")<=1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1002692 consistent gets
817172 physical reads
0 redo size
59215 bytes sent via SQL*Net to client
2806 bytes received via SQL*Net from client
219 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3260 rows processed
The expensive step in this execution plan is the table access.
If the table access were moved up to take place after the
window filter, the cost for this step would be 3260 (one for
each fetched row). The total cost for the plan would
probably stay below 7000; that is, lower than the cost of the
full table scan plan.
Conclusion
Just to re-emphasize the motivation behind the view that can
serve both needs: it is about multidimensional optimization—
and that has nothing to do with OLAP!
Typically, performance optimization takes only one
dimension into account; that is, performance. So far so good,
but what about long-term maintenance? Very often,
performance optimization reduces the maintainability of the
software. That’s not a coincidence; it’s because
maintainability is the only degree of freedom left during
optimization. Unfortunately, reduced maintainability is very
hard to notice. If it is noticed at all, it is probably years later.
I have been in both worlds for some years—operations and
development—and try to optimize for all dimensions
whenever possible because all of them are important for the
business.
Create and Insert Statements
To try it yourself:
create table demo (
id number not null,
start_key varchar2(255) not null,
group_key number not null,
junk char(200),
primary key (id)
);
insert into demo (
select level,
dbms_random.string('A', 2) start_key,
trunc(dbms_random.value(0,1000)),
'junk'
from dual
connect by level <= 1000000
);
commit;
exec DBMS_STATS.GATHER_TABLE_STATS(null, 'DEMO');
My tests were conducted on 11R2.
Finding the Best Match With a Top-N Query
In Performance on 2010-09-29 at 11:16
There was an interesting index related performance problem
on Stack Overflow recently. The problem was to check an
input string against a table that holds about 2000 prefix
patterns (e.g., LIKE 'xyz%'). A fast select is needed that
returns one row if any pattern matches the input string, or
no row otherwise.
I believe my solution is worth a few extra words to explain it
in more detail. Even though it’s a perfect fit for Use The
Index, Luke, it’s a little early to put it there as an exercise. It
is, however, a very good complement to my previous
article, Analytic Top-N Queries—so I put it here.
Although the problem was raised for a MySQL database, my
solution applies to all databases that can properly optimize
Top-N queries.
The original SQL statement in the question was like that:
select 1
from T1
where 'fxg87698x84' like concat (C1, '%')
T1.C1 is the column that holds the prefix patterns—one per
row. Although a prefixed LIKE filter can use an index range
scan, the problem is that it is the wrong way around: it’s not
searching for a string that matches the pattern, it’s
searching for a pattern that matches the string.
The query, as written, must check all the patterns against
the string, e.g., by a full table scan or a (fast) full index scan.
Either way, it’s always a full scan. Can that be improved?
Let’s start step-by-step. The simplest case is that the exact
input string is a pattern in the table. A SQL statement to
check for the exact pattern is very simple:
select C1
from T1
where C1 = 'fxg87698x84'
The next case is that the exact pattern doesn’t exist in the
table, but a prefix pattern that matches the input string does.
That pattern must be shorter than the input string—
otherwise it cannot match. Because we aim to solve the
problem with an index, let’s imagine the patterns as they
would be stored in an index:
axt3
fxg
<- place where 'fxg87698x84' would be
tru56
If the exact pattern doesn’t exist, the preceding index entry
is the best possible match (precondition: no overlapping
patterns exist). That’s because shorter strings are
considered “smaller” when sorted. So, let’s extend the select
to find the preceding record if the exact pattern is not in the
table:
select C1
from T1
where C1 <= 'fxg87698x84'
order by c1 desc
limit 1
The less than or equals condition will match the exact
pattern, if it exists, and all entries that precede it. The
reverse ORDER BY clause makes sure that the index is
traversed upwards. In conjunction with the where clause, it
means that the tree traversal is done to find the input string,
and the leaf node scan continues upwards from there.
The LIMIT 1 clause is the MySQL way to make a Top-N
query so that the leaf node scan aborts after the first record.
Voilà, this statement will return the best candidate pattern
(or none at all) by performing a very small index range scan.
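This, of course, assumes an index on the pattern column; something along
these lines (the index name is made up):

-- the range condition (C1 <= ...) and the ORDER BY C1 DESC ... LIMIT 1
-- can both be served from this index
create index t1_c1_idx on T1 (C1);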
The final case we need to take care of is that no pattern
matches the input string. There are two sub-variants:
(a) the input string would sort before the very first entry in
the index; in that case the Top-N query will not return any
row and we are done; (b) the Top-N query returns a pattern
that is not a prefix of the input string. That can be handled
by wrapping the Top-N query and filtering its result through
the original LIKE expression:
select 1
from (
select C1
from T1
where C1 <= 'fxg87698x84'
order by C1 desc
limit 1
) tmp
where 'fxg87698x84' like concat (C1, '%')
Done.
Simple? With a good understanding of index fundamentals,
it is simple! That’s why I am writing a Web-Book about
indexing basics: Use The Index, Luke!. Funny enough, the
basics are the same for all databases—we all put our pants
on one leg at a time.
Closing Note
The precondition for all that is that there are no overlapping
patterns in the table. E.g., the statement doesn’t work with
the following patterns:
axt3
fxg
fxg1
<- place where 'fxg87698x84' would be
tru56
In that case, the closest entry doesn’t match although there
is a matching entry. However, the FXG entry matches
everything that FXG1 can possibly match—the two patterns
are overlapping.
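Whether that precondition holds can be verified with a self-join over the
pattern table; a sketch in the same MySQL flavor as the rest of the example:

-- list all overlapping pattern pairs; an empty result means
-- the precondition holds
select shorter.C1 as covering_pattern,
       longer.C1  as covered_pattern
  from T1 shorter
  join T1 longer
    on longer.C1 like concat(shorter.C1, '%')
   and longer.C1 <> shorter.C1;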
Second Closing Note
The original problem posted on Stack Overflow mentioned
that this lookup must be performed 1 million times—within
half an hour. The author did not mention if that target was
reached, nor if the process is single-threaded.
However, considering the overall problem, the most
efficient solution in terms of computing resources would
probably be to sort both sets—the patterns and the input
strings—and implement a manual merge. But that’s probably
much more effort to implement. The index solution is very
efficient in terms of human resources. Which solution is best
for the business is up to the company to decide.
Choosing NoSQL For The Right Reason
In Performance, Reliability, Scalability on 2011-05-13 at 09:42
Observing the NoSQL hype through the eyes of an SQL
performance consultant is an interesting experience. It is,
however, very hard to write about NoSQL because there are
so many forms of it. After all, NoSQL is nothing more than a
marketing term. A marketing term that works pretty well
because it goes to the heart of many developers that
struggle with SQL every day.
My unrepresentative observation is that NoSQL is often chosen
for performance reasons, probably because SQL performance
problems are an everyday experience. NoSQL, on the other
hand, is known to “scale well”. However, performance is
often a bad reason to choose NoSQL—especially if the side
effects, like eventual consistency, are poorly understood.
Most SQL performance problems result out of improper
indexing. Again, my unrepresentative observation. But I
believe it so strongly that I am writing a book about SQL
indexing. But indexing is not only a SQL topic, it applies to
NoSQL as well. MongoDB, for example, claims to support
“Index[es] on any attribute, just like you’re used to“. Seems
like there is no way around proper indexing—no matter if you
use SQL or NoSQL. The latest release of my book, “Response
Time, Throughput and Horizontal Scalability“, describes that
in more detail.
Performance is—almost always—the wrong reason for NoSQL.
Still there are cases where NoSQL is a better fit than SQL. As
an example, I’ll describe a NoSQL system that I use almost
every day. It is the distributed revision control system Git.
Wait! Git is not NoSQL? Well, let’s have a closer look.
Git doesn’t have an SQL front end
Git has specialized interfaces to interact with the
repository, either on the command line or integrated
into an IDE. There isn’t anything that remotely compares
to SQL or a relational model. I never missed it.
Git doesn’t use an SQL back-end
Honestly, if I had to develop a revision control
system, I wouldn’t take an SQL database as the back-end.
There is no benefit in putting BLOBs into a relational
model, and handling BLOBs all the time is just too
awkward.
Git is distributed
That’s my favourite Git feature. Working offline is
exactly what is meant by ‘partition tolerance’
in Brewer’s CAP Theorem. I can use all Git features
without Internet connection. Others can, of course, still
use the server if they can connect to it. Full functionality
on either end. It is partition tolerant.
Conflicts happen anyway
If there is one thing we learned in the 25 years
since Larry Wall introduced patch, it is that conflicts
happen. No matter what. Software development has a
very long “transaction time” and we are mostly using
optimistic locking—conflicts are inevitable. But here
comes the famous CAP Theorem again. If we cannot
have consistency anyway, let’s focus on the other two
CAP properties: availability and partition tolerance.
Acknowledging inconsistencies means to take care of
methods and tools to find and resolve them. That
involves the software (e.g., Git) as well as the user. But
here comes one last unrepresentative observation from
my side: most NoSQL users just ignore that. They
assume that the system magically resolves
contradicting writes. It’s like using a CVS
workflow with Git—it works for a while, but you’ll end up
in trouble soon.
I’m not aware of a minimum feature set for NoSQL datastores
—it’s therefore hard to tell whether Git fulfils it or not. However,
Git feels to me like using NoSQL for the right reason.
It’s about choosing the right tool for the job. But I can’t get
rid of the feeling that NoSQL is too often taken for the wrong
reasons—query response time, in particular. No doubt,
NoSQL is a better fit for some applications. However,
an index review would often solve the performance problems
within a few days. SQL is no better than NoSQL, nor vice-
versa. Because the question is not what’s better. The
question is what is a better fit for a particular problem.
6 Responses
1. “Most SQL performance problems result out of improper indexing.”
I’m sorry, but that is simply not true. The main reason you run into
performance problems with relational databases is because the data
model has problems. Indices are more like “inherent opportunities”
to speed things up here and there, but the main performance comes
from understanding the data, understanding the access patterns
and understanding how the SQL engine will calculate a plan. And
then designing a relational model that will balance these things out.
Designing a relational model that performs well is hard. Which is
why there are so few database professionals available who seem
able to do this well. It requires deep understanding of both the
domain that needs to be modeled as well as the technology it will
run on.
Just tweaking indices is going to work for a small subset of problems,
but the real performance gains are made during the design of the
relational model you will use. And unfortunately, they do not teach
this in school.
Also, the whole NoSQL vs SQL debate is artificial. There is no real
debate: they address entirely different classes of problems. It can be
boiled down to this: SQL is about consistency and flexibility during
querying at the expense of scalability and performance.
NoSQL is about “performance at scale” and flexibility when writing
data at the expense of consistency and querying flexibility. Note
that I said “performance at scale”, because many of the NoSQL
databases do not have particularly impressive performance for small
datasets.
Sacrificing absolute consistency is hard, but as it turns out, for a lot
of “new” problems lack of consistency is less of a problem than long
response times. The “new” problems here are online systems with
massive numbers of users. For some classes of companies you can
feel this directly. Most online banks are still pretty slow. Sites like
Amazon, on the other hand are quite snappy given that they have a
lot more web traffic than any bank. (And when it comes to online
commerce, every millisecond counts).
A system can be said to be scalable when the cost of increasing its
size is sublinear with respect to the dimension you need to scale.
Since the relational model is inherently expensive to apply in a
distributed manner, it is relatively easy to show that you cannot get
sublinear cost for arbitrary scale along any dimension.
However, with certain sacrifices you can get sublinear cost. For
instance by breaking the relational model somewhat and
partitioning the data into independent instances that have no
dependencies on other instances.
Note that when we say “cost” we mostly talk in terms of latency and
processing power. Not dollars. Although it will end up costing dollars.
SQL has its place and NoSQL has its place, but it is important to
understand that they address different types of problems. I have
worked at companies that have naively used SQL databases for
NoSQL type problems and vice versa. It is unhelpful that people
keep comparing them directly instead of trying to develop and
disseminate the kind of knowledge needed to reason about this.
Also it doesn’t help that Stonebraker et al., to draw attention to
themselves, muddy the waters and confuse the issues by planting
the idea that NoSQL is somehow the antithesis of SQL. In fact the
label “NoSQL” has been incredibly unhelpful because it suggests
that there is a problem with SQL and that NoSQL is the magic
solution. This is, at best, naive. And unfortunately leads people to
get hung up on the wrong ideas.
Gruntle Grüber, 14 May 2011 at 10am
“The main reason you run into performance problems with
relational databases is because the data model has problems.”
Well, I made another observation. In fact, everybody knows that
database design is important and must be done carefully. There
are many books covering that in more or less detail. I do not say
that database design is not important, but I say it’s usually done
carefully anyway because everybody knows it’s important.
What I find at client sites is that developers are not aware how to
index properly and how to write queries that can benefit from
indexing. The DBAs, on the other hand, know about indexing but
don’t have the deep domain knowledge to know how the data is
queried.
Nobody ignores schema design but indexing is almost always
ignored until it’s too late. Adding some more or less random
indexes might improve the situation, but it is exactly what I refer
to as “improper” indexing. Indexing without a plan. In fact, my
position is that indexing must be designed with the same care as
the schema.
I pretty much agree with your statements about NoSQL.
Markus Winand, 14 May 2011 at 5pm
“No doubt, NoSQL is a better fit for some applications.”
Other than performance, could you provide examples of
applications that are more suitable to NoSQL semantics? SQL has
all other advantages, like persistence frameworks, consistency,
and other tools. NoSQL solutions like Cassandra are targeted at
scaling writes. Other than sharding, how would you scale
writes on an RDBMS?
Deniz Oguz, 16 May 2011 at 12pm
Well, in the absence of a definition for “NoSQL semantics”, I like to
see the CAP Theorem as the central star that NoSQL systems orbit
around.
That said, I believe that any application where partition tolerance
is more important than consistency is a good fit for NoSQL.
Partition tolerance seems to be poorly understood in the field. I
took the Git example because many developers know what
“distributed” means in the context of Git. I could have taken any
other distributed, partition-tolerant revision control system for
that purpose. Source code repositories are a particularly good fit
because they hardly ever reach consistency anyway.
The question—what is more important, partition tolerance or
consistency—depends on the data. Huge social networks have to
cope with tons of data that has very little value. Strict
consistency doesn’t pay off for that. The damage caused by
conflicts is little compared to the costs to establish strict
consistency. That argument is, however, nonexistent for small
sites because consistency is easy to achieve there.
I also mentioned BLOBs in the article. As a software architect, I
have been involved in many discussions where to store low
value binary data like user uploads. I remember a meeting
where the DBA smashed the proposal to use BLOBs for vast
amounts of user data by proclaiming that “BLOBs have no
business in my database”. BLOBs have, quite often, little value—
even if connected to high value relational data. From that angle,
I believe that some NoSQL systems make a great distributed
BLOB store which can coexist with a relational database.
Scaling writes is subject to Brewer’s CAP Theorem—take two out
of three. Sharding and similar methods bypass it by not
distributing the data at large—that is, only a small subset of the
nodes is responsible for maintaining a particular subset of the data.
I feel, however, that the need to scale out is constantly
decreasing for most applications. I have observed a multi-
national banking system over the past decade. It was initially
running on a huge two node active-active cluster to distribute
load. A few years later, it was moved to a hot-standby cluster—
just one node active. Today, it’s being migrated to a virtualized
server running other databases on the same hardware. “Scale-
out” is not the trend—virtualization is, at least in enterprise
environments. Huge social sites being the obvious exception.
Markus Winand, 16 May 2011 at 3pm
2. Nice post, Git being close to a NoSQL system shows how little the
word “NoSQL” actually means.
Giorgio, 16 May 2011 at 1pm
3. […] Winand made the case earlier this year that the version control
system Git is actually a NoSQL datastore. The blog […]