Alvin Richards
[email protected]
Topics
Overview
Data modeling
Replication & Sharding
Developing with Java
Deployment
Drinking from the fire hose
Part One
MongoDB Overview
Strong adoption of MongoDB
• 90,000 database downloads per month
• Over 1,000 production deployments
Web 2.0 companies started out using this, but now also:
- enterprises
- the financial industry
3 Reasons
- Performance
- Large number of readers / writers
- Large data volume
- Agility (ease of development)
NoSQL really means: non-relational, next-generation operational datastores and databases
past: one size fits all
• RDBMS (Oracle, MySQL)

present: business intelligence and analytics is now its own segment
• RDBMS (Oracle, MySQL)
• New Gen. OLAP (Vertica, Aster, Greenplum)

future
• RDBMS (Oracle, MySQL)
• New Gen. OLAP (Vertica, Aster, Greenplum)
• Non-relational Operational Stores ("NoSQL")

We claim the NoSQL segment will be:
* large
* not fragmented
* 'platformitize-able'
Philosophy: maximize features, up to the "knee" in the curve, then stop

[Chart: depth of functionality vs. scalability & performance; memcached and key/value stores sit at the high-scalability, low-functionality end, RDBMS at the high-functionality end]

no joins + no complex transactions
Horizontally Scalable Architectures
• no joins + no complex transactions
• New data models
• Improved ways to develop
Platform and Language support
MongoDB is implemented in C++ for best performance
Platforms 32/64 bit
• Windows
• Linux, Mac OS X, FreeBSD, Solaris
Language drivers for
• Java
• Ruby / Ruby-on-Rails
• C#
• C / C++
• Erlang
• Python, Perl, JavaScript
• Scala
• others...
Ease of development is a surprisingly big benefit: faster to code, faster to change, and you avoid upgrades and scheduled downtime
More predictable performance
Fast single-server performance means developers spend less time manually coding around the database
Bottom line: developers usually like it much better after trying it
Part Two
Data Modeling in MongoDB
So why model data?
A brief history of normalization
• 1970 E. F. Codd introduces 1st Normal Form (1NF)
• 1971 E. F. Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
• 1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)
• 2002 Date, Darwen, Lorentzos define 6th Normal Form (6NF)
Goals:
• Avoid anomalies when inserting, updating or deleting
• Minimize redesign when extending the schema
• Make the model informative to users
• Avoid bias towards a particular style of query
* source : wikipedia
The real benefit of relational
• Before relational
• Data and Logic combined
• After relational
• Separation of concerns
• Data modeled independent of logic
• Logic freed from concerns of data design
• MongoDB continues this separation
Relational made normalized data look like this: [diagram of many joined tables]
Document databases make normalized data look like this: [diagram of a single nested document]
Terminology
RDBMS          MongoDB
Table          Collection
Row(s)         JSON Document
Index          Index
Join           Embedding & Linking
Partition      Shard
Partition Key  Shard Key
DB Considerations

How can we manipulate this data?
• Dynamic Queries
• Secondary Indexes
• Atomic Updates
• Map Reduce

Access Patterns?
• Read / Write Ratio
• Types of updates
• Types of queries
• Data life-cycle

Considerations
• No Joins
• Document writes are atomic
So today’s example will use...
Design Session
Design documents that simply map to your application
post = {author: "Hergé",
        date: new Date(),
        text: "Destination Moon",
        tags: ["comic", "adventure"]}

>db.posts.save(post)
Find the document
>db.posts.find()
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
author : "Hergé",
date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
text : "Destination Moon",
tags : [ "comic", "adventure" ] }
Notes:
• ID must be unique, but can be anything you’d like
• MongoDB will generate a default ID if one is not
supplied
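For example (the key value here is ours, for illustration):

// supply a natural key instead of the generated ObjectId
>db.posts.save({_id: "destination-moon", author: "Hergé"})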
Add an index, find via the index
Secondary index for “author”
// 1 means ascending, -1 means descending
>db.posts.ensureIndex({author: 1})
>db.posts.find({author: 'Hergé'})
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
author : "Hergé",
... }
Verifying indexes exist
>db.system.indexes.find()
// Index on ID
{ name : "_id_",
ns : "test.posts",
key : { "_id" : 1 } }
// Index on author
{ _id : ObjectId("4c4ba6c5672c685e5e8aabf4"),
ns : "test.posts",
key : { "author" : 1 },
name : "author_1" }
Query operators

Conditional operators:
$lt, $lte, $gt, $gte, $ne, $in, $nin, $mod, $all, $size, $exists, $type, ...

// find posts with any tags
>db.posts.find({tags: {$exists: true}})

Regular expressions:
// posts where author starts with h
>db.posts.find({author: /^h/i})

Counting:
// posts written by Hergé
>db.posts.find({author: "Hergé"}).count()
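Two more of the listed operators in use (the tag values are illustrative):

// posts tagged either "comic" or "politics"
>db.posts.find({tags: {$in: ["comic", "politics"]}})

// posts with exactly two tags
>db.posts.find({tags: {$size: 2}})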
Extending the Schema

new_comment = {author: "Kyle",
               date: new Date(),
               text: "great book"}

>db.posts.update({_id: "..."},
                 {$push: {comments: new_comment},
                  $inc: {comments_count: 1}})
Extending the Schema

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Destination Moon",
  tags : [ "comic", "adventure" ],
  comments_count: 1,
  comments : [
    { author : "Kyle",
      date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
      text : "great book"
    }
  ]}
Extending the Schema

// create index on nested documents:
>db.posts.ensureIndex({"comments.author": 1})
>db.posts.find({"comments.author": "Kyle"})

// find last 5 posts:
>db.posts.find().sort({date: -1}).limit(5)

// most commented post:
>db.posts.find().sort({comments_count: -1}).limit(1)
When sorting, check if you need an index
Explain a query plan

>db.blogs.find({author: 'Hergé'}).explain()
{
  "cursor" : "BtreeCursor author_1",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 5,
  "indexBounds" : {
    "author" : [ [ "Hergé", "Hergé" ] ]
  }
}
Watch for full table scans

>db.blogs.find({text: 'Destination Moon'}).explain()
{
  "cursor" : "BasicCursor",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 0,
  "indexBounds" : { }
}
Map Reduce
Map reduce : count tags
mapFunc = function () {
this.tags.forEach(function (z) {emit(z, {count:1});});
}
reduceFunc = function (k, v) {
var total = 0;
for (var i = 0; i < v.length; i++) { total += v[i].count; }
return {count:total};
}
res = db.posts.mapReduce(mapFunc, reduceFunc)
>db[res.result].find()
{ _id : "comic", value : { count : 1 } }
{ _id : "adventure", value : { count : 1 } }
Group
• Equivalent to a Group By in SQL
• Specify the attributes to group the data by
• Process the results in a Reduce function
Group
cmd = { key: { "author": true },
        initial: { count: 0 },
        reduce: function(obj, prev) { prev.count++; }
      };

result = db.posts.group(cmd);

[
  { "author" : "Hergé", "count" : 1 },
  { "author" : "Kyle", "count" : 3 }
]
Review
So Far:
- Started out with a simple schema
- Queried Data
- Evolved the schema
- Queried / Updated the data some more
Single Table Inheritance
>db.shapes.find()
{ _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1}
{ _id: ObjectId("..."), type: "square", area: 4, d: 2}
{ _id: ObjectId("..."), type: "rect", area: 10, length: 5, width: 2}
// find shapes where radius > 0
>db.shapes.find({radius: {$gt: 0}})
// create index
>db.shapes.ensureIndex({radius: 1})
One to Many

- Embedded Array / Array Keys
  - $slice operator to return a subset of an array
  - some queries are hard, e.g. find the latest comments across all documents
- Embedded tree
  - Single document
  - Natural
  - Hard to query
- Normalized (2 collections)
  - most flexible
  - more queries (see the sketch below)
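A minimal sketch of the normalized option; the post_id link field is our choice for illustration:

>db.posts.save({_id: 1, author: "Hergé", text: "Destination Moon"})
>db.comments.save({post_id: 1, author: "Kyle", text: "great book", date: new Date()})

// the hard query above becomes easy: latest comments across all posts
>db.comments.find().sort({date: -1}).limit(5)
// but fetching a post and its comments is now two queries
>db.comments.find({post_id: 1})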
Many - Many
Example:
- Product can be in many categories
- Category can have many products
Relational join-table model:
Products:            product_id
Product_Categories:  product_id, category_id
Category:            category_id
Many - Many

products:
{ _id: ObjectId("4c4ca23933fb5941681b912e"),
  name: "Destination Moon",
  category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                  ObjectId("4c4ca25433fb5941681b92af")]}

categories:
{ _id: ObjectId("4c4ca25433fb5941681b912f"),
  name: "Adventure",
  product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                 ObjectId("4c4ca30433fb5941681b9130"),
                 ObjectId("4c4ca30433fb5941681b913a")]}

// All categories for a given product
>db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
Alternative

products:
{ _id: ObjectId("4c4ca23933fb5941681b912e"),
  name: "Destination Moon",
  category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                  ObjectId("4c4ca25433fb5941681b92af")]}

categories:
{ _id: ObjectId("4c4ca25433fb5941681b912f"),
  name: "Adventure"}

// All products for a given category
>db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})

// All categories for a given product
product = db.products.findOne({_id: some_id})
>db.categories.find({_id: {$in: product.category_ids}})
Trees

Full Tree in Document

{ comments: [
    { author: "Kyle", text: "...",
      replies: [
        { author: "Fred", text: "...", replies: [] }
      ]}
  ]
}
Pros: Single Document, Performance, Intuitive
Cons: Hard to search, Partial Results, 4MB limit
Trees
Parent Links
- Each node is stored as a document
- Contains the id of the parent
Child Links
- Each node contains the ids of its children
- Can support graphs (multiple parents per child); both patterns are sketched below
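Minimal sketches of both patterns (collection and field names are ours):

// Parent links: each node stores the id of its parent
>db.tree.save({_id: "b", parent: "a"})
>db.tree.find({parent: "a"})     // children of "a"

// Child links: each node stores the ids of its children
>db.tree.save({_id: "a", children: ["b", "e"]})
>db.tree.find({children: "b"})   // parents of "b"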
Array of Ancestors
- Store Ancestors of a node
{ _id: "a" }
{ _id: "b", ancestors: [ "a" ], parent: "a" }
{ _id: "c", ancestors: [ "a", "b" ], parent: "b" }
{ _id: "d", ancestors: [ "a", "b" ], parent: "b" }
{ _id: "e", ancestors: [ "a" ], parent: "a" }
{ _id: "f", ancestors: [ "a", "e" ], parent: "e" }
{ _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }
//find all descendants of b:
>db.tree2.find({ancestors: 'b'})

//find all ancestors of f:
>ancestors = db.tree2.findOne({_id: 'f'}).ancestors
>db.tree2.find({_id: {$in: ancestors}})
findAndModify
Queue example
//Example: find the highest-priority job and mark it in progress
job = db.jobs.findAndModify({
    query: {inprogress: false},
    sort: {priority: -1},
    update: {$set: {inprogress: true,
                    started: new Date()}},
    new: true})
Part Three
Replication & Sharding
Scaling
• Data size only goes up
• Operations/sec only go up
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
What is scaling? Well, hopefully a question everyone here will face.
Traditional Horizontal Scaling
• read only slaves
• caching
• custom partitioning code
Scaling isn’t new; neither is sharding. Manual re-balancing is painful at best.
New methods of Scaling
• relational database clustering
• consistent hashing (Dynamo)
• range based partitioning (BigTable/PNUTS)
Read Scalability: Replication

[Diagram: ReplicaSet 1 with one Primary and two Secondaries; writes go to the Primary, reads can go to the Primary or either Secondary]
Basics
• MongoDB replication is a bit like MySQL replication: asynchronous master/slave at its core
• Variations:
Master / slave
Replica Pairs (deprecated – use replica sets)
Replica Sets
Replica Sets
• A cluster of N servers
• Any (one) node can be primary
• Consensus election of primary
• Automatic failover
• Automatic recovery
• All writes to primary
• Reads can be to primary (default) or a secondary
Replica Sets – Design Concepts
1. A write is durable once it is available on a majority of members
2. Writes may be visible before a cluster-wide commit has completed
3. On failover, any data that had not been replicated from the primary is dropped (see #1)
Replica Set lifecycle:
• Establishing: Members 1, 2 and 3 come online
• Electing primary: Member 2 is elected PRIMARY
• Failure of master: Member 2 goes DOWN; Members 1 and 3 negotiate a new master
• Reconfiguring: Member 3 is now PRIMARY, Member 2 still DOWN
• Member recovers: Member 2 rejoins in RECOVERING state
• Active: all members up; Member 3 remains PRIMARY
Set Member Types
Normal (priority == 1)
Passive (priority == 0)
Arbiter (no data, but can vote)
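A sketch of a replica set config covering all three member types (the host names are hypothetical):

>rs.initiate({_id: "rs1", members: [
    {_id: 0, host: "host1:27017"},                     // normal (priority 1)
    {_id: 1, host: "host2:27017", priority: 0},        // passive
    {_id: 2, host: "host3:27017", arbiterOnly: true}   // arbiter: votes, holds no data
]})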
Write Scalability: Sharding

[Diagram: key ranges 0..30, 31..60 and 61..100 map to ReplicaSet 1, ReplicaSet 2 and ReplicaSet 3; each replica set has a Primary and two Secondaries; reads and writes are routed by key range]
Sharding
• Scale horizontally for data size, index size, write and consistent read scaling
• Distribute databases, collections or the objects in a collection
• Auto-balancing, migrations and management happen with no downtime
• Replica Sets for inconsistent read scaling
Sharding
• Choose how you partition data
• Can convert from single master to sharded system
with no downtime
• Same features as a non-sharded single master
• Fully consistent
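From the shell, choosing the partitioning looks roughly like this (run against a mongos; the database, collection and shard key are our examples):

>use admin
>db.runCommand({enablesharding: "blogdb"})
>db.runCommand({shardcollection: "blogdb.posts", key: {author: 1}})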
Range Based
• a collection is broken into chunks by range
• chunks default to 200MB or 100,000 objects
Architecture

[Diagram: clients connect to one or more mongos routers; each mongos routes operations to the shards (mongod processes) and reads chunk metadata from the config servers (also mongod processes)]
Config Servers
• Hold the metadata about where chunks are located
• 1 or 3 of them (3 for availability)
• changes are made with 2-phase commit
• if a majority are down, metadata goes read-only
• the system stays online as long as 1 of the 3 is up
Shards
• Hold the actual data
• Can be a single master, master/slave or a replica set
• Replica sets give sharding + full auto-failover
• Regular mongod processes
mongos
• Sharding Router (or Switch)
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on the app server, so no extra network traffic
Writes
• Inserts : require shard key, routed
• Removes: routed and/or scattered
• Updates: routed or scattered
Queries
• By shard key: routed
• Sorted by shard key: routed in order
• By non shard key: scatter gather
• Sorted by non shard key: distributed merge sort
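For example, if the shard key were {author: 1} (our assumption for illustration):

>db.posts.find({author: "Hergé"})    // routed to one shard
>db.posts.find({text: /Moon/})       // scatter gather across all shards
>db.posts.find().sort({author: 1})   // routed in order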
Operations
• split: breaking a chunk into 2
• migrate: move a chunk from 1 shard to another
• balancing: moving chunks automatically to keep the system in balance
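Splits and migrations can also be invoked by hand; a sketch using the admin commands (the namespace, split point and target shard name are illustrative):

// run from the admin database, against a mongos
>db.runCommand({split: "blogdb.posts", middle: {author: "M"}})
>db.runCommand({moveChunk: "blogdb.posts", find: {author: "M"}, to: "shard0001"})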
Part Four
Java Development
Library Choices
• Raw MongoDB Driver
Map<String, Object> view of objects
Rough but dynamic
• Morphia (type-safe mapper)
POJOs
Annotation based (similar to JPA)
Syntactic sugar and helpers
• Others
Code generators, other JVM languages
MongoDB Java Driver
• BSON Package
Types
Encode/Decode
DBObject (Map<String, Object>)
Nested Maps
Directly encoded to binary format (BSON)
• MongoDB Package
Mongo
DBObject (BasicDBObject/Builder)
DB/DBCollection
DBQuery/DBCursor
BSON Package
Types
int and long
Array/ArrayList
String
byte[] – binData
Double (IEEE 754 FP)
Date (msecs since epoch)
Null
Boolean
JavaScript String
Regex
MongoDB Package
• Mongo
Connection, ThreadSafe
WriteConcern*
• DB
Auth, Collections
getLastError()
Command(), eval()
RequestStart/Done
• DBCollection
Insert/Save/Find/Remove/Update/
FindAndModify
ensureIndex
Simple Example

DBCollection coll = new Mongo().getDB("blogdb")
                               .getCollection("posts");

ArrayList<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");

coll.save(BasicDBObjectBuilder.start("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date())
    .append("tags", tags)
    .get());
Simple Example, Again

DBCollection coll = new Mongo().getDB("blogdb")
                               .getCollection("posts");

ArrayList<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");

Map<String, Object> fields = new HashMap<String, Object>();
fields.put("author", "Hergé");
fields.put("text", "Destination Moon");
fields.put("date", new Date());
fields.put("tags", tags);

coll.insert(new BasicDBObject(fields));
DBObject <-> (B/J)SON

{author: "kyle", text: "Destination Moon", date: …}
DBObject dbObj = BasicDBObjectBuilder.start()
    .append("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date())
    .get();

String text = (String) dbObj.get("text");
JSON.parse(…)

DBObject dbObj = (DBObject) JSON.parse(
    "{'author': 'Hergé', " +
    " 'text': 'Destination Moon', " +
    " 'date': 'Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)'}");
Lists

// continuing with the dbObj parsed above:
List<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");
dbObj.put("tags", tags);

// {…, tags: ['comic', 'adventure']}
Maps of Maps
Can represent object graph/tree
Always keyed off String (field)
Morphia: MongoDB Mapper
Maps POJOs
Type-safe
Access Patterns: DAO/Datastore/???
Data Types
JPA like
Many concepts came from Objectify (GAE)
Annotations
@Entity(“collectionName”)
@Id
@Transient (not transient)
@Indexed(…)
@Property(“fieldAlias”)
@AlsoLoad({aliases})
@Reference
@Serialized
[@Embedded]
Lifecycle Events
@PrePersist
@PreSave
@PostPersist
@PreLoad
@PostLoad
EntityListeners
EntityInterceptor
Basic POJO
@Entity
class Person {
@Id
String author;
@Indexed
Date date;
String text;
}
Datastore Basics
get(class, id)
find(class, […])
save(entity, […])
delete(query)
getCount(query)
update/First(query, upOps)
findAndModify/Delete(query, upOps)
Add, Get, Delete

Blog entry = new Blog("Hergé", new Date(), "Destination Moon");

Datastore ds = new Morphia().createDatastore();
ds.save(entry);

Blog foundEntry = ds.get(Blog.class, "Hergé");

ds.delete(entry);
Queries

Datastore ds = …
Query<Blog> q = ds.createQuery(Blog.class);
q.field("author").equal("Hergé").limit(5);

for (Blog e : q.fetch())
    print(e);

Blog entry = q.field("author").startsWith("H").get();
Update

Datastore ds = …
Query<Blog> q = ds.find(Blog.class, "author", "Hergé");

UpdateOperations<Blog> uo = ds.createUpdateOperations(Blog.class);
uo.inc("views", 1).set("lastUpdated", new Date());

UpdateResults res = ds.update(q, uo);
if (res.getUpdatedCount() > 0)
    ; // do something?
Update Operations

set(field, val)
unset(field)
inc(field, [val])
dec(field)
add(field, val)
addAll(field, vals)
removeFirst/Last(field)
removeAll(field, vals)
Relationships

[@Embedded]
- Loaded/Saved with Entity
- Update

@Reference
- Stored as DBRef(s)
- Loaded with Entity
- Not automatically saved

Key<T> (DBRef)
- Stored as DBRef(s)
- Just a link, but resolvable by Datastore/Query
MongoDB features in Java
• Durability
• Replication
• Sharding
• Connection options
Durability
What failures do you need to recover from?
• Loss of a single database node?
• Loss of a group of nodes?
Durability - Master only
• Write acknowledged when in memory on the master only
Durability - Master + Slaves
• Write acknowledged when in memory on master + slave
• Will survive failure of a single node
Durability - Master + Slaves + fsync
• Write acknowledged when in memory on master + slaves
• Pick a "majority" of nodes
• fsync in batches (since it is blocking)
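In shell terms, these levels map onto getLastError options (a sketch; the w, wtimeout and fsync values are illustrative):

>db.runCommand({getlasterror: 1})                         // master only
>db.runCommand({getlasterror: 1, w: 2, wtimeout: 1000})   // master + slave
>db.runCommand({getlasterror: 1, fsync: true})            // also flush to disk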
Setting default error checking

// Do not check or report errors on write
com.mongodb.WriteConcern.NONE;

// Use the default level of error checking. Do not send
// a getLastError(), but raise an exception on error
com.mongodb.WriteConcern.NORMAL;

// Send getLastError() after each write. Raise an
// exception on error
com.mongodb.WriteConcern.STRICT;

// Set the concern
db.setWriteConcern(concern);
Customized WriteConcern

// Wait for three servers to acknowledge the write
WriteConcern concern = new WriteConcern(3);

// Wait for three servers, with a 1000ms timeout
WriteConcern concern = new WriteConcern(3, 1000);

// Wait for three servers, 1000ms timeout, and fsync
// data to disk
WriteConcern concern = new WriteConcern(3, 1000, true);

// Set the concern
db.setWriteConcern(concern);
Using Replication from Java

slaveOk()
- tells the driver it may send read requests to Secondaries
- the driver will always send writes to the Primary

Can be set on:
- DB.slaveOk()
- Collection.slaveOk()
- find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
Using sharding from Java

Before sharding:

coll.save(BasicDBObjectBuilder.start("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date())
    .get());

Query<Blog> q = ds.find(Blog.class, "author", "Hergé");

After sharding:

No code change required!
Connection options

MongoOptions mo = new MongoOptions();

// Restrict the number of connections
mo.connectionsPerHost = MAX_THREADS + 5;

// Auto reconnect on connection failure
mo.autoConnectRetry = true;
Part Five
Deploying MongoDB
• Performance tuning
• Sizing
• O/S Tuning / File System layout
• Backup
Backup
• Typically backups are driven from a slave
• This eliminates the impact on client / application traffic to the master
Backup
•Two strategies
• mongodump / mongorestore
• fsync + lock
mongodump
• binary, compact object dump
• each object written is consistent
• but the dump is not necessarily consistent from start to finish
fsync + lock
• fsync - flushes buffers to disk
• lock - blocks writes
db.runCommand({fsync:1,lock:1})
• Use file-system / LVM / storage snapshot
• unlock
db.$cmd.sys.unlock.findOne();
Slave delay
• Protection against app faults
• Protection against administration mistakes
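A sketch of adding a delayed member (the host name and one-hour delay are illustrative; a delayed member should also have priority 0):

>rs.add({_id: 3, host: "host4:27017", priority: 0, slaveDelay: 3600})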
O/S Config
• RAM - lots of it
• Filesystem
• EXT4 / XFS
• Better file allocation & performance
• I/O
• The more disks the better
• Consider RAID10 or other RAID configs
Monitoring
• Munin, Cacti, Nagios
Primary function:
• Measure stats over time
• Tells you what is going on with
your system
• Alerts when threshold reached
Remember me?
Summary
MongoDB makes building Java web applications simple
You can focus on what the app needs to do
MongoDB has built-in:
• Horizontal scaling (reads and writes)
• Simplified schema evolution
• Simplified deployment and operations
• Best match for development tools and agile processes