Fountain is a streaming API server for 4chan, similar to the Twitter firehose. Fountain replicates 4chan through the official 4chan API, then exposes discrete events as Server-Sent-Events or whitespace-delimited JSON.
Among other things, you can use fountain to lurk every thread, stress test your browser's CSS 3D transforms, make use of your advanced 256-color virtual terminal, or you can just dump the stream into elasticsearch. Other more interesting uses could be machine learning/sentiment analysis, as well as live-updating thread watchers in userscripts.
If you're just interested in the API and not running fountain yourself, I host
a publicly-accessible server at fountain.hakase.org that streams /a/ and /g/.
If you want to stream other boards or have heavy usage requirements, you can
also run fountain locally or on your own server.
Fountain has been in development for the past couple months, and is currently in fairly rough condition code-wise; I'm releasing it in this state because I suspect moot is going to make an official version of "streaming 4chan" available soon, and I want to be able to say that I did it first.
However, despite the current state of the code, fountain is pretty feature complete and stable when running. Depending on what moot's new thing is and when it's released, I expect to clean up the code and finalize the API for a stable release in the next few weeks.
These examples will hit fountain.hakase.org. Change the host to
localhost:3500 if you want to hit your local copy.
var es = new EventSource('http://fountain.hakase.org/v1/a/stream');
es.addEventListener('new-posts', function (e) {
  var posts = JSON.parse(e.data);
  posts.forEach(function (it) {
    var d = document.createElement('div');
    d.innerHTML = it.com;
    document.body.appendChild(d);
  });
});

For more complete examples (written in LiveScript), check out the clients
directory, or the hosted versions at
http://fountain.hakase.org/demo/
View the raw json stream:
curl --compressed -s http://fountain.hakase.org/v1/a/json
Prettify with jq:
curl --compressed -s http://fountain.hakase.org/v1/a/json |\
jq --unbuffered '.'
Stalk moot on /g/:
curl --compressed -s http://fountain.hakase.org/v1/g/json |\
jq --unbuffered 'select(.name and (.name == "moot"))'
View images on /a/ as ANSI art (requires img2txt from libcaca, and
zsh for its temporary file process substitution =() syntax):
#!/usr/bin/env zsh
BOARD=a
curl --compressed -s http://fountain.hakase.org/v1/$BOARD/json |\
while read -r line; do
  if [[ -n "$line" ]]; then
    (jq -r 'select(.tim) | [.tim, (if .resto == 0 then .no else .resto end)]|@sh' <<< "$line") \
    | while read tim tno; do
      echo;
      img2txt -W 80 -f utf8 -d fstein \
        =(curl -s http://phosphene.hakase.org/$BOARD/thumbs/$tno/${tim}s.jpg);
    done;
  fi;
done

The ANSI art demo is also hosted on fountain.hakase.org:4747 for /a/ and 4848 for /g/:
nc fountain.hakase.org 4747
GET /v1/<board-name>/stream
- returns a `text/event-stream` with the following event types:
  - `new-posts`: `data` is a JSON-serialized array of posts as defined by the 4chan API. The OPs of new threads will be present in this event.
  - `deleted-posts`: `data` is a JSON-serialized array of strings identifying the `no` of deleted posts. Only individual post deletions will show up here. Posts of deleted threads will not be present in this stream.
  - `changed-posts`: `data` is a JSON-serialized array of posts in 4chan API format. `changed-posts` events are emitted for events such as moderation, deleted images, or "USER WAS BANNED FOR THIS POST".
  - `new-threads`: `data` is a JSON-serialized array of threads in 4chan API format, i.e., thread-level data + a `posts` field which contains an array of posts, the first of which is the OP of the thread. Note that the OP will also be emitted in the `new-posts` event.
  - `deleted-threads`: `data` is a JSON-serialized array of strings identifying the `no` of deleted threads.
  - `changed-threads`: `data` is a JSON-serialized array of thread-level data for changed threads, e.g. stickiness changes.
- If the query parameter `catalog` is set to `true`, one additional event called `catalog` will be emitted at the beginning of the stream, the `data` being a JSON-serialized hash of thread `no` to the 4chan API thread data for each active thread at the time of the request, as well as a `posts` array containing only the OP of the thread.
- Events--especially changed post events--are provided on a "best-effort" basis due to the inexact nature of 4chan replication. `new-posts` posts are usually ordered by `no` and most post deletions will be picked up, but there will be holes, especially if fountain loses network connectivity.
- `Last-Event-Id` is currently ignored, so if your client disconnects, you will miss events. TODO fix this
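As a rough illustration of how a client might consume the post-level events, the sketch below keeps a local map of posts keyed by `no` and applies `new-posts`, `changed-posts`, and `deleted-posts` payloads to it. The name `applyEvent` and the `Map`-based store are hypothetical and not part of fountain; the sketch only assumes the payload shapes described above.

```javascript
// Hypothetical client-side helper; not part of fountain itself.
// Applies one stream event to a Map of posts keyed by post number (as a string).
function applyEvent(posts, type, data) {
  const payload = JSON.parse(data);
  switch (type) {
    case 'new-posts':
    case 'changed-posts':
      // Payload is an array of posts in 4chan API format.
      payload.forEach(function (post) {
        posts.set(String(post.no), post);
      });
      break;
    case 'deleted-posts':
      // Payload is an array of strings identifying deleted post numbers.
      payload.forEach(function (no) {
        posts.delete(String(no));
      });
      break;
    default:
      // Thread-level and catalog events are ignored in this sketch.
      break;
  }
  return posts;
}
```

In a browser, each event type needs its own listener, e.g. `es.addEventListener('deleted-posts', function (e) { applyEvent(posts, 'deleted-posts', e.data); })`.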
GET /v1/<board-name>/json
- returns an `application/json+stream` that emits a JSON-serialized post for each `new-post` event, separated by whitespace. Only new post events are emitted on this stream.
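For clients that read the JSON stream without curl and jq, a minimal sketch of a chunk-tolerant parser follows. It assumes each post arrives on its own line, which is what the line-by-line shell example above relies on; the endpoint only promises "whitespace", so treat the newline assumption as unverified. `createJsonLineParser` is a hypothetical name.

```javascript
// Hypothetical helper: turns arbitrary text chunks into parsed posts.
// Assumption: posts are separated by newlines (one JSON value per line).
function createJsonLineParser(onPost) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    const lines = buffer.split('\n');
    // The last element is an incomplete line (or '') until more data arrives.
    buffer = lines.pop();
    lines.forEach(function (line) {
      if (line.trim() !== '') {
        onPost(JSON.parse(line));
      }
    });
  };
}
```

In node.js, the returned `feed` function could be attached to the `data` event of an HTTP response after calling `setEncoding('utf8')` on it.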
Fountain is a node.js-based HTTP server. Install the dependencies with
npm install
Then run with
npm start
By default, fountain will replicate and stream /a/ on port 3500. To change settings, set the appropriate environment variables:
BOARD="g" PORT="3600" npm start
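For illustration only, this is roughly how a node.js process can pick up those two variables with the documented defaults. `readConfig` is a hypothetical helper and fountain's real startup code may differ.

```javascript
// Hypothetical sketch; fountain's actual configuration handling may differ.
// Defaults mirror the documented ones: board /a/ on port 3500.
function readConfig(env) {
  const port = parseInt(env.PORT || '3500', 10);
  if (Number.isNaN(port)) {
    throw new Error('PORT must be a number, got: ' + env.PORT);
  }
  return { board: env.BOARD || 'a', port: port };
}
```

A server would typically call it once at startup as `readConfig(process.env)`.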
Fountain takes about 3 minutes to replicate an entire board's threads. After
this "initial sync", fountain is able to keep in sync with 4chan with
a median latency of 5 seconds between the 4chan timestamp and emission of a
new-post event.
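If you want to check that latency figure from the client side, one rough approach is to compare each post's `time` field (the UNIX timestamp in seconds from the 4chan API) with the moment the event arrived. The helper below is a hypothetical sketch, not fountain's own measurement, and it includes network delay and any clock skew between 4chan and your machine.

```javascript
// Hypothetical measurement helper; not fountain's own metric code.
// posts: array of 4chan API posts; receivedAtMs: e.g. Date.now() on arrival.
function medianLatencySeconds(posts, receivedAtMs) {
  const latencies = posts
    .map(function (post) { return receivedAtMs / 1000 - post.time; })
    .sort(function (a, b) { return a - b; });
  if (latencies.length === 0) return null;
  const mid = Math.floor(latencies.length / 2);
  // Odd count: middle value. Even count: mean of the two middle values.
  return latencies.length % 2 === 1
    ? latencies[mid]
    : (latencies[mid - 1] + latencies[mid]) / 2;
}
```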
Fountain runs its unit tests at startup, so it'll spit out a whole bunch of output. Don't worry, it's running normally. TODO fix this
Fountain operates by holding an entire board in memory, i.e., it does not require a backing persistent database. However, if the process is killed, the memory state is lost. With the ~3 minute initial sync time, restarts are thus not as seamless as I'd like.
As a hack, fountain will dump its state to /tmp/org.hakase.fountain.<board-name>.json
every 30 seconds, and upon receiving SIGINT or SIGPIPE before exiting. When starting,
fountain attempts to read from the same file. This papers over most temporary
hiccups as well as development restarts, while still not requiring a database server.
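The dump-and-restore idea can be sketched in a few lines. This is not fountain's actual code: `saveState` and `loadState` are hypothetical names, and writing to a temporary file before renaming is an extra precaution added here so that a crash mid-write cannot leave a truncated dump.

```javascript
const fs = require('fs');

// Hypothetical sketch of the dump/restore idea; not fountain's actual code.
function saveState(path, state) {
  // Write to a temporary file first, then rename it over the real dump.
  const tmp = path + '.tmp';
  fs.writeFileSync(tmp, JSON.stringify(state));
  fs.renameSync(tmp, path);
}

function loadState(path) {
  try {
    return JSON.parse(fs.readFileSync(path, 'utf8'));
  } catch (err) {
    // Missing or corrupt dump: the caller falls back to a full initial sync.
    return null;
  }
}
```

A periodic dump would then be something like `setInterval(function () { saveState(path, state); }, 30000)`, plus a `process.on('SIGINT', ...)` handler that saves once more before exiting.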
Phosphene is a companion server to fountain that proxies requests for thumbnails to get around 4chan's referer checks. The demos assume that phosphene is running on localhost:3700, which it will do by default:
npm run phosphene
Fountain's normal memory usage averages around 120M allocated and 60M resident set, mainly due to holding an entire 4chan board in memory.
Fountain logs to STDOUT with ANSI colors. Pipe through ts from moreutils if
you want timestamped logging.
Fountain also spits out a whole bunch of metrics in StatsD format at
localhost:8125 over UDP. If you care to run a StatsD server and a backend
like graphite, you can collect some interesting data.
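If you only want a quick look at the metrics without running a full StatsD server, the plain-text StatsD line format (`name:value|type`, optionally followed by `|@rate`) is easy to parse yourself. The sketch below is a hypothetical helper, and the metric name in the comment is made up because this document does not list fountain's metric names.

```javascript
// Parses one StatsD line such as "example.counter:3|c" (metric name is made up).
// Returns null for lines that do not match the basic format.
function parseStatsdLine(line) {
  const match = /^([^:|]+):([^|]+)\|([a-z]+)(?:\|@([\d.]+))?$/.exec(line.trim());
  if (!match) return null;
  return {
    name: match[1],
    value: Number(match[2]),
    type: match[3],
    sampleRate: match[4] === undefined ? 1 : Number(match[4])
  };
}
```

To feed it, bind a `dgram` UDP socket to port 8125 and split each incoming message on newlines before parsing.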
Fountain employs a similar polling strategy to Asagi, Foolz's board dumper.
However, fountain achieves tighter sync latency by polling catalog.json, from
which new posts can be found most of the time.
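To make the catalog-polling idea concrete, here is a hedged sketch. It assumes the catalog.json shape from the 4chan API: an array of pages, each with a `threads` array whose entries carry the OP's fields plus a short `last_replies` array. `newPostsFromCatalog` is a hypothetical name and fountain's real logic is not shown in this document.

```javascript
// Hypothetical sketch; fountain's actual polling logic may differ.
// Returns posts from a catalog snapshot whose number exceeds lastSeenNo.
function newPostsFromCatalog(catalog, lastSeenNo) {
  const found = [];
  catalog.forEach(function (page) {
    page.threads.forEach(function (thread) {
      // The thread object itself carries the OP's fields.
      if (thread.no > lastSeenNo) found.push(thread);
      (thread.last_replies || []).forEach(function (reply) {
        if (reply.no > lastSeenNo) found.push(reply);
      });
    });
  });
  // Emit in post-number order, matching the usual ordering of new posts.
  return found.sort(function (a, b) { return a.no - b.no; });
}
```

Because `last_replies` only holds the most recent few replies, a busy thread can gain more posts between polls than the catalog shows, which is presumably why new posts are found this way only "most of the time"; such threads would need a separate per-thread fetch.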
Fountain is written in a dataflow-like style using Bacon.js.
TODO elaborate
Development is coordinated through the Github repository:
http://github.com/qqueue/fountain
Please submit bug reports and pull requests there.
Fountain uses the User-Agent `Fountain/0.1.0` and respects the 1 req/s rate limit. pls no bully.