@@ -30,7 +30,7 @@ First create a new Storm project using its Leiningen template:

[source,console]
----
- $ lein new storm-project feeds
+ $ lein new cookbook-storm-project feeds
----

In the project directory, run the default storm topology (which the
@@ -109,7 +109,7 @@ and only if the user is following the user who triggered the event:

Finally, add a bolt that accepts a user and an event and stores the event
in a hash of sets like +{:user1 #{event1 event2} :user2 #{event1 event2}}+ -
- these are the activity streams we'll present to users.
+ these are the activity streams you'll present to users.

[source,clojure]
----
@@ -124,18 +124,18 @@ these are the activity streams we'll present to users.
(ack! collector tuple)))))
----

- This gives us all the pieces you'll need, but you'll still need
+ This gives you all the pieces you'll need, but you'll still need
to assemble them together into a computational topology. Open up
+src/feeds/topology.clj+ and use the topology DSL to wire the spouts
and bolts together:

[source,clojure]
----
- (defn topology []
+ (defn storm-topology []
  (topology
   {"events" (spout-spec event-spout)}

-  {"active users" (bolt-spec {"eventst" :shuffle} active-user-bolt :p 2)
+  {"active users" (bolt-spec {"events" :shuffle} active-user-bolt :p 2)
    "follows" (bolt-spec {"active users" :shuffle} follow-bolt :p 2)
    "feeds" (bolt-spec {"follows" ["user"]} feed-bolt :p 2)}))
----
@@ -150,9 +150,6 @@ You'll also need to update the +:require+ statement in that file:
[backtype.storm [clojure :refer [topology spout-spec bolt-spec]] [config :refer :all]])
----

- Now delete the old stormy-topology function, and change the +run!+
- function to point to the new +topology+ function you just defined.
-
Run the topology again. Feeds will be printed to the console by the
final bolts in the topology:
@@ -191,7 +188,7 @@ better picture of how these primitives work together.
+defspout+ looks much like Clojure's standard +defn+ with one
difference - the second argument to +defspout+ is a list of names that
will be assigned to elements of each tuple this spout produces. This
- lets us use tuples like vectors or maps interchangeably. The third
+ lets you use tuples like vectors or maps interchangeably. The third
argument to +defspout+ is a list of arguments that will be bound to
various components of Storm's operational infrastructure - +collector+
is used below, ignoring the other two for now.
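
To make the shape concrete, here is a minimal sketch of a spout written this way (the spout name, field name, and emitted values are illustrative only, not the recipe's actual +event-spout+; it assumes +backtype.storm.clojure+ is referred as in the topology namespace above):

[source,clojure]
----
;; Sketch only: "event" is the single named field of each emitted tuple;
;; conf, context, and collector are bound by Storm when the spout runs.
(defspout sketch-spout ["event"]
  [conf context collector]
  (let [events [{:user "user1" :type :like}]] ; this body runs once, at spout creation
    (spout
     (nextTuple []
       (emit-spout! collector [(rand-nth events)])))))
----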
@@ -206,8 +203,8 @@ is used below, ignoring the other two for now.
----

++defspout++'s body will be evaluated once, when the spout instance is
- created, which gives us an opportunity to create in-memory state. In
- this case we'll create a list of events this spout will produce, but
+ created, which gives you an opportunity to create in-memory state. In
+ this case you'll create a list of events this spout will produce, but
usually this will be a connection to a database or distributed queue.

[source,clojure]
@@ -245,7 +242,7 @@ the seller of the item. In this world, you need to consider a variety
of factors for each user in the system for every event and determine
whether the event should be added to that user's feed.

- THe first bolt starts this process by generating a tuple of +(user,
+ The first bolt starts this process by generating a tuple of +(user,
event)+ for each user in the system every time an event is generated
by the +event-spout+:
@@ -308,8 +305,8 @@ provides the actual bolt definition inside a call to +bolt+:

Note that the tuple argument is inside the bolt's definition of
+execute+ in this case, and may be destructured as usual. In cases
- where the event's user is not following the user in the tuple, we do
- not emit a new tuple and simply acknowledge that we received our
+ where the event's user is not following the user in the tuple, it does
+ not emit a new tuple and simply acknowledges that it received the
input.
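
To illustrate that emit-or-just-ack pattern, here is a hedged sketch in the same DSL (this is not the recipe's +follow-bolt+; the +follows+ map, the field order, and the +:user+ key on events are assumptions made for the example):

[source,clojure]
----
;; Sketch: an assumed in-memory follow graph, mapping a user to the set of users they follow.
(def follows {"user1" #{"user2"}})

(defbolt follow-bolt-sketch ["user" "event"] {:prepare true}
  [conf context collector]
  (bolt
   (execute [tuple]
     (let [user  (.getValue tuple 0)
           event (.getValue tuple 1)]
       ;; Emit only when `user` follows the event's author (assumed check); always ack the input.
       (when (contains? (get follows user #{}) (:user event))
         (emit-bolt! collector [user event] :anchor tuple))
       (ack! collector tuple)))))
----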
As noted earlier, this particular system could be implemented much
@@ -444,8 +441,8 @@ create.

===== deployment

- The real magic of Storm comes out in deployment. Storm gives the
- tools us to build small, independent components that make no
+ The real magic of Storm comes out in deployment. Storm gives you the
+ tools to build small, independent components that make no
assumptions about how many identical instances are running in the same
topology. This means that the topology itself is essentially
infinitely scalable. The edges of the system, which receive data
@@ -461,7 +458,7 @@ A simple deployment strategy is built into the Storm library:
(.submitTopology "my first topology"
  {TOPOLOGY-DEBUG (Boolean/parseBoolean debug)
   TOPOLOGY-WORKERS (Integer/parseInt workers)}
- (topology)))
+ (storm-topology)))
----

+LocalCluster+ is an in-memory implementation of a Storm cluster. You can
@@ -481,7 +478,7 @@ as you can see in +src/feeds/TopologySubmitter.clj+:
"feeds topology"
  {TOPOLOGY-DEBUG (Boolean/parseBoolean debug)
   TOPOLOGY-WORKERS (Integer/parseInt workers)}
- (topology)))
+ (storm-topology)))
----

This file uses Clojure's Java interop to generate a Java class with a
@@ -499,7 +496,7 @@ $ storm jar path/to/thejariuploaded.jar feeds.TopologySubmitter "workers" 5

This command will tell the cluster to allocate 5 dedicated workers for
this topology and begin polling +nextTuple+ on all of its spouts, as
- it did when we used +LocalCluster+. A cluster may run any number of
+ it did when you used +LocalCluster+. A cluster may run any number of
topologies simultaneously - each worker is a physical JVM and may end
up running instances of many different bolts and spouts.