Learn the secret to Github's success: open source!
I chatted with Tom Preston-Werner about Github and its server and software architecture. Tom is one of the cofounders of Github and has written a ton of open source software, including jekyll, ernie, chronic, and god.
A Github request hits the load balancer and goes to the front end, which runs nginx; nginx then shuttles the request to Unicorn, which runs the Rails app.
The Rails app either needs a git repo or it doesn’t. If it does, it makes a request to the sharded file system servers using grit.
Tom explains grit communications with “hand gestures” ;)
Github uses BERT and Ernie to let the grit front-end and back-end communicate.
Tom decided to use Erlang because of his previous work on Fuzed with Dave Fayram.
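For readers who have not seen BERT-RPC, the exchange between the grit front-end and back-end can be pictured as a length-prefixed request and reply. The sketch below is only an illustration under stated assumptions, not Github's code: JSON stands in for the BERT binary encoding, and the module and function names (`repo`, `list_branches`) are hypothetical. What it keeps from BERT-RPC is the shape of the messages: a `call` term goes out, a `reply` term comes back, and each is framed with a 4-byte big-endian length header.

```python
import json
import struct

def frame(term):
    """Serialize a term and prefix it with a 4-byte big-endian length."""
    # Assumption: JSON stands in for BERT, which really uses Erlang's
    # binary term format.
    body = json.dumps(term).encode("utf-8")
    return struct.pack(">I", len(body)) + body

def unframe(packet):
    """Read the length header and decode exactly that many bytes."""
    (length,) = struct.unpack(">I", packet[:4])
    return json.loads(packet[4:4 + length].decode("utf-8"))

# Hypothetical back-end functions, keyed by (module, function).
HANDLERS = {
    ("repo", "list_branches"): lambda path: ["master"] if path else [],
}

def serve(packet):
    """Back-end side: decode a call, dispatch it, frame the reply."""
    kind, module, function, args = unframe(packet)
    if kind != "call" or (module, function) not in HANDLERS:
        return frame(["error", "no such call"])
    return frame(["reply", HANDLERS[(module, function)](*args)])

def call(module, function, *args):
    """Front-end side: frame a call and return the decoded result."""
    response = unframe(serve(frame(["call", module, function, list(args)])))
    if response[0] != "reply":
        raise RuntimeError(response[1])
    return response[1]
```

In a real deployment `serve` would sit behind a socket on the file server; here it is called directly so the framing and dispatch are easy to follow.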
Github uses the Linux HA load balancers, which they’ve set up to rewrite the TCP headers so that responses bypass the load balancers on the trip back to the client.
Github uses Anchor Systems to handle lower-level server tasks.
Larry Wright asked “What does Github use Redis and MongoDB for?”
Tom wrote proxymachine to work with Redis to do sharding on the file system.
Redis is also used for Resque (job worker) and stats tracking.
MongoDB is used for exception tracking in Haystack, a closed source project.
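The sharding lookup mentioned above can be pictured as a small routing table: given a repository path, find which file server holds it. This is a rough sketch and not proxymachine itself. It assumes repositories are sharded by owner, uses a plain dict in place of Redis, and the key format and hostnames are made up for illustration.

```python
# Assumption: a plain dict stands in for Redis; keys and hostnames are
# hypothetical.
ROUTES = {
    "route:mojombo": "fs1.example.internal",
    "route:defunkt": "fs2.example.internal",
}

def owner_of(repo_path):
    """Extract the user from a path like 'mojombo/grit.git'."""
    user, _, name = repo_path.partition("/")
    if not user or not name:
        raise ValueError(f"expected 'user/repo', got {repo_path!r}")
    return user

def backend_for(repo_path, routes=ROUTES):
    """Return the file server that holds this user's repositories."""
    host = routes.get(f"route:{owner_of(repo_path)}")
    if host is None:
        raise LookupError(f"no route for {repo_path!r}")
    return host
```

Keeping the table in a fast key-value store means a proxy can make this decision on every connection without touching the file servers themselves.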
Wikis will become git backed soon!
Github has made open source commits every day this year except two.
A Twitter follower asked what they use for a shared FS or backups: they build their own stuff!
Backups are stored offsite in Australia, just in case the feds raid Github.
Github moved to Rackspace to use custom hardware and to move away from GFS.
Jack Dempsey wanted to know what could be learned from Github for a normal Rails site.
Tom recommends using Unicorn.
Tom also recommends using Resque.
If you are writing a server, either don’t use Ruby or use EventMachine.
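To give a feel for the Resque recommendation, here is a minimal sketch of the pattern it follows: the web app pushes a small JSON payload naming a job and its arguments onto a queue, and a separate worker pops payloads and runs them. It is written in Python rather than Ruby, an in-memory deque stands in for the Redis list, and the `build_archive` job is hypothetical.

```python
import json
from collections import deque

QUEUES = {}  # queue name -> deque of JSON strings (stands in for Redis lists)
JOBS = {}    # job name -> callable

def job(func):
    """Register a function so a worker can look it up by name."""
    JOBS[func.__name__] = func
    return func

def enqueue(queue, name, *args):
    """Push a JSON payload onto the tail of the queue."""
    payload = json.dumps({"class": name, "args": list(args)})
    QUEUES.setdefault(queue, deque()).append(payload)

def work_one(queue):
    """Pop the oldest payload and run it; return None if the queue is empty."""
    pending = QUEUES.get(queue)
    if not pending:
        return None
    payload = json.loads(pending.popleft())
    return JOBS[payload["class"]](*payload["args"])

@job
def build_archive(repo, ref):
    # Hypothetical slow task that should not run inside a web request.
    return f"{repo}-{ref}.tar.gz"
```

The web request only pays for `enqueue`; the slow work happens whenever a worker calls `work_one`, which is the main lesson for an ordinary Rails site.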