How Sinatra Works, or How We Can 山寨 (Shanzhai) a mini Sinatra in 23 Lines of Code

Sinatra is great. It is simple, it is elegant, it is easy to use, and it is wildly popular.

It's arguably the second most popular Ruby web framework, after (rather obviously) Rails.

Ideally we want to thoroughly study how Rails works, but the codebase of Rails is quite complicated and difficult to navigate. Ignoring tests, Rails 6.0 has 80,000+ lines of Ruby:

➜  rails git:(6-0-stable) cloc . --exclude-dir=test,spec
    2719 text files.
    2691 unique files.
     269 files ignored.

github.com/AlDanial/cloc v 1.80  T=10.11 s (245.7 files/s, 90203.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                           937         314353            786         380505
Ruby                          1313          19213          46175          83455
Markdown                        84          16266              0          40204
JavaScript                      44            376            212           3525
CSS                             18            368            153           2706
ERB                             54            229              0           1402
YAML                            14            103            165            838
CoffeeScript                    12             97            125            411
JSON                             6              1              0            303
yacc                             1              4              0             46
Sass                             1              3              9             24
-------------------------------------------------------------------------------
SUM:                          2484         351013          47625         513419
-------------------------------------------------------------------------------

Sinatra v2.0.5, by contrast, has fewer than 2,000 lines of code:

➜  sinatra git:((v2.0.5)) cloc lib
       6 text files.
       6 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 1.80  T=0.07 s (90.9 files/s, 39707.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Ruby                             6            385            338           1897
-------------------------------------------------------------------------------
SUM:                             6            385            338           1897
-------------------------------------------------------------------------------

Much easier to handle.

Later we will have a full series dedicated to discussing the inner workings of Rails. For now, let's put our focus on a smaller target: to survey how Sinatra works, and try to rebuild a minimal clone (or a 山寨 version) of our own.

Test Drive Sinatra: Fly Me to the Moon

How do we write a web app using Sinatra? Per their documentation, we can create a very simple web app like this:

require 'sinatra'
get '/frank-says' do
  'Put this in your pipe & smoke it!'
end

Save the file as app.rb and start it with ruby app.rb:

➜ ruby app.rb
[2019-04-26 12:19:23] INFO  WEBrick 1.4.2
[2019-04-26 12:19:23] INFO  ruby 2.6.2 (2019-03-13) [x86_64-darwin18]
== Sinatra (v2.0.5) has taken the stage on 4567 for development with backup from WEBrick
[2019-04-26 12:19:23] INFO  WEBrick::HTTPServer#start: pid=32041 port=4567

If we read the output (we always should!) we will find WEBrick 1.4.2 and WEBrick::HTTPServer#start, port=4567.

This means Sinatra is using WEBrick 1.4.2 to serve HTTP requests, and listens on port number 4567.

If we point our browser to http://localhost:4567/frank-says we will see:

It works.

But how?

Rebuilding Sinatra, a Naive Approach

Let's look at our code again.

require 'sinatra'
get '/frank-says' do
  'Put this in your pipe & smoke it!'
end

Here get is a Ruby method defined in Sinatra. We called get with two arguments: a string '/frank-says' and a block which returns a string that eventually becomes the response of the server.
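
To make the "a string and a block" reading concrete, here is a minimal sketch in plain Ruby. The method name remember and the STORED hash are made up for illustration and are not part of Sinatra; the only point is that &block turns the do ... end block into a Proc object that can be kept around and called later.

```ruby
# Hypothetical helper, not part of Sinatra: it keeps the block instead of running it.
STORED = {}

def remember(path, &block)
  STORED[path] = block # block is now an ordinary Proc object
end

remember '/frank-says' do
  'Put this in your pipe & smoke it!'
end

# Nothing has run yet; the body of the block executes only when we call it.
puts STORED['/frank-says'].call
```

Keeping the block and calling it later is the same idea our fake Sinatra relies on further down.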


That's good, but not very dynamic: we could create a static HTML file and achieve the same effect. In order to show off some backend muscle, let's create a route that returns the current time:

# app.rb
require 'sinatra'
get '/frank-says' do
  'Put this in your pipe & smoke it!'
end
get '/time' do
  Time.now.to_s
end

That's more like it. But wait, how does it work, really?

When we run ruby app.rb, the program runs indefinitely, until you CTRL-C to terminate it.

Apparently the loop is not performed by the get method call; otherwise we would never be able to register a second route, as we did with GET /time.

Let's try our first naive approach to reimplement Sinatra, and see what would happen if the loop is run by the get method:

# fake.rb

######### begin fake Sinatra

def handle_request
  # pretending to be handling requests...
end

def get(path, &block)
  puts "[FAKE] registering new route: GET #{path}"
  loop { handle_request }
end

######### end fake Sinatra 

get '/frank-says' do
  'Put this in your pipe & smoke it!'
end

puts 'this will never be executed..'

get '/time' do
  Time.now.to_s
end

$ ruby fake.rb
[FAKE] registering new route: GET /frank-says

...
# GET /time is not registered!!

As we've seen, the approach above will only allow registering one route per app. So apparently Sinatra uses a different trick.

Sinatra: an Autopsy

Time to open our terminal, clone the source code of Sinatra to our laptop, and fire up our editor and start exploring its content.

Gems and Gemspec

Sinatra is a gem, and the way to read the source code of a gem is to start from the .gemspec file; in this case, sinatra.gemspec:

# https://github.com/sinatra/sinatra/blob/v2.0.5/sinatra.gemspec

version = File.read(File.expand_path("../VERSION", __FILE__)).strip

Gem::Specification.new 'sinatra', version do |s|
  s.description       = "Sinatra is a DSL for quickly creating web applications in Ruby with minimal effort."
  s.summary           = "Classy web-development dressed in a DSL"
  s.authors           = ["Blake Mizerany", "Ryan Tomayko", "Simon Rozet", "Konstantin Haase"]
  s.email             = "[email protected]"
  s.homepage          = "http://sinatrarb.com/"
  s.license           = 'MIT'
  s.files             = Dir['README*.md', 'lib/**/*', 'examples/*'] + [
    ".yardopts",
    "AUTHORS.md",
    "CHANGELOG.md",
    "CONTRIBUTING.md",
    "Gemfile",
    "LICENSE",
    "MAINTENANCE.md",
    "Rakefile",
    "SECURITY.md",
    "sinatra.gemspec",
    "VERSION"]

  # omitted...

  s.required_ruby_version = '>= 2.2.0'

  # gem dependencies:
  s.add_dependency 'rack', '~> 2.0'
  s.add_dependency 'tilt', '~> 2.0'
  s.add_dependency 'rack-protection', version
  s.add_dependency 'mustermann', '~> 1.0'
end

The most important section here is probably the gem dependencies part. Sinatra 2.0.5 depends on 4 other gems: rack, tilt, rack-protection and mustermann. We will take a look at them a bit later.

Then we shift our attention to Gem::Specification#files, which describes the files that will be packaged into the released gem (https://rubygems.org/downloads/sinatra-2.0.5.gem). Actually, you can download the sinatra-2.0.5.gem file and use tar to extract and inspect its content. (tar is basically a tool that combines multiple files and folders into one archive, or does the reverse, with optional compression.)

$ mkdir sinatra-gem; cd sinatra-gem

# download the gem file
$ wget https://rubygems.org/downloads/sinatra-2.0.5.gem
--2019-04-26 14:57:14--  https://rubygems.org/downloads/sinatra-2.0.5.gem
Resolving rubygems.org... 151.101.192.70, 151.101.128.70, 151.101.0.70, ...
Connecting to rubygems.org|151.101.192.70|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 364544 (356K) [application/octet-stream]
Saving to: ‘sinatra-2.0.5.gem’

sinatra-2.0.5.gem                100%[========================================================>] 356.00K  28.2KB/s    in 10s

2019-04-26 14:57:25 (34.2 KB/s) - ‘sinatra-2.0.5.gem’ saved [364544/364544]

# use `tar` to eXtract the content of the File, Verbosely:
$ tar xvf sinatra-2.0.5.gem
x metadata.gz
x data.tar.gz
x checksums.yaml.gz

# the source code is in the `data.tar.gz`, which is a gzip-compressed tar file
# use `tar` to eXtract the content of the File
# pass the content through gunZip filter to decompress, Verbosely:
$ tar xzvf data.tar.gz
x .yardopts
x AUTHORS.md
x CHANGELOG.md
x CONTRIBUTING.md
x Gemfile
x LICENSE
x MAINTENANCE.md
x README.de.md
x README.es.md
x README.fr.md
x README.hu.md
x README.ja.md
x README.ko.md
x README.malayalam.md
x README.md
x README.pt-br.md
x README.pt-pt.md
x README.ru.md
x README.zh.md
x Rakefile
x SECURITY.md
x VERSION
x examples/chat.rb
x examples/simple.rb
x examples/stream.ru
x lib/sinatra.rb
x lib/sinatra/base.rb
x lib/sinatra/images/404.png
x lib/sinatra/images/500.png
x lib/sinatra/indifferent_hash.rb
x lib/sinatra/main.rb
x lib/sinatra/show_exceptions.rb
x lib/sinatra/version.rb
x sinatra.gemspec

Voila! Doesn't it look like the content of the git repo already?

But let's see, don't guess. With a wonderful app called Kaleidoscope on the Mac, we will be able to easily compare the contents of two folders.

In the following screenshot, on the left-hand side are the files and folders extracted from sinatra-2.0.5.gem; on the right-hand side is the freshly cloned Sinatra source code, checked out at the v2.0.5 tag.

We find that the gem we require has exactly the same examples and lib folders, but without the extra content like rack-protection and sinatra-contrib etc. Also the test files are not included in the released gem.

That's awesome. That means the gems we use every day are simply archives, and their contents are specified in the .gemspec. Every time we run bundle install or gem install, we are just downloading the .gem files to our computers and unpacking them. When we require a gem, Ruby simply loads the unpacked code from the local disk. Simple and straightforward.

By default, when we require 'sinatra', it is requiring the file lib/sinatra.rb in the Sinatra gem.
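
If you want to see where a require actually ends up, the small sketch below may help. It uses optparse only because it ships with Ruby (any library would do), and relies on two standard globals: $LOAD_PATH, the list of directories require searches, and $LOADED_FEATURES, the list of files loaded so far.

```ruby
require 'optparse' # any library that ships with Ruby will do

# $LOAD_PATH lists the directories that `require` searches, in order.
puts "#{$LOAD_PATH.size} directories on the load path"

# $LOADED_FEATURES records the path of every file that has been loaded so far.
loaded_from = $LOADED_FEATURES.find { |path| path.end_with?('/optparse.rb') }
puts "optparse was loaded from #{loaded_from}"
```

Running the same check after require 'sinatra' should point at the lib/sinatra.rb file of the installed gem.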

So let's look at the lib folder, which is where the source code of Sinatra lives. First, the lib/sinatra.rb:

# https://github.com/sinatra/sinatra/blob/v2.0.5/lib/sinatra.rb

require 'sinatra/main'

enable :inline_templates

Hmm, almost nothing. Let's follow the require into the lib/sinatra/main.rb file:

# https://github.com/sinatra/sinatra/blob/v2.0.5/lib/sinatra/main.rb

require 'sinatra/base'

module Sinatra
  class Application < Base
    # ... omitted
  end

  at_exit { Application.run! if $!.nil? && Application.run? }
end

# ... omitted

The key is this line:

at_exit { Application.run! if $!.nil? && Application.run? }

at_exit is a Ruby trick that registers a user-provided block to be run right before the program exits. Let's write a small program to test it:

# at_exit.rb

at_exit { puts 'at_exit called!' }

puts "ending program...."

Run it with ruby at_exit.rb:

$ ruby at_exit.rb
ending program....
at_exit called!

Great. So at_exit does what it is advertised to do.
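
Two more details are worth a quick experiment: handlers run in reverse order of registration, and $! holds the exception when the program is dying from an error (which is what the $!.nil? check in Sinatra's at_exit line guards against). The sketch below runs two throwaway scripts in child processes; the run_script helper is our own invention, built on Open3 and RbConfig from the standard library.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run a snippet in a child Ruby process and return its stdout.
def run_script(source)
  stdout, _stderr, _status = Open3.capture3(RbConfig.ruby, '-e', source)
  stdout
end

# A program that ends normally: handlers run last-registered-first, and $! is nil.
clean = run_script(<<~'RUBY')
  at_exit { puts "first registered, $! is #{$!.inspect}" }
  at_exit { puts 'second registered' }
  puts 'main program done'
RUBY

# A program that dies from an exception: $! is set, so a server should not start.
crashed = run_script(<<~'RUBY')
  at_exit { puts "would start server? #{$!.nil?}" }
  raise 'boom'
RUBY

puts clean
puts crashed
```

So if app.rb blows up while defining routes, Sinatra skips booting the server instead of serving a half-loaded app.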

With this trick, our fake Sinatra can be rewritten as:

# fake2.rb

######### begin fake Sinatra

def handle_request
  # pretending to be handling requests...
end

def get(path, &block)
  puts "[FAKE] registering new route: GET #{path}"
end

at_exit {
  puts "starting to listen for requests..."
  loop { handle_request }
}

######### end fake Sinatra

get '/frank-says' do
  'Put this in your pipe & smoke it!'
end

get '/time' do
  Time.now.to_s
end

Run it with ruby fake2.rb:

$ ruby fake2.rb
[FAKE] registering new route: GET /frank-says
[FAKE] registering new route: GET /time
starting to listen for requests...

Awesome.

Application.run!

Now let's dig in further to see how Sinatra listens for and handles web requests. Remember that in at_exit, Sinatra simply calls Application.run!, but since Sinatra::Application does not implement the run! method, let's enter lib/sinatra/base.rb to look at its parent class Sinatra::Base:

# https://github.com/sinatra/sinatra/blob/v2.0.5/lib/sinatra/base.rb

# ...omitted

module Sinatra
  class Request < Rack::Request
    # ...omitted
  end

  class Response < Rack::Response
    # ...omitted
  end
  
  class Base
    # ...omitted

    class << self
      def run!(options = {}, &block)
        # ...omitted
        handler         = detect_rack_handler
        handler_name    = handler.name.gsub(/.*::/, '')
        # ...omitted
        begin
          start_server(handler, server_settings, handler_name, &block)
        rescue Errno::EADDRINUSE
          $stderr.puts "== Someone is already performing on port #{port}!"
          raise
        ensure
          quit!
        end
      end

      alias_method :start!, :run!
    end
    
    # ...omitted
  end
end

If you don't already know, class << self opens the singleton class of Base, so the methods defined inside it become singleton (class-level) methods of Base.

Doing:

class Base
  class << self
    def run!
    end
  end
end

Is basically the same as:

class Base
  def self.run!
  end
end

It's just a fancier (and purer) way to express the same idea.
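
A quick way to convince ourselves is to ask each class for its singleton methods. The two class names below are invented for this check.

```ruby
# Both class names are made up for this comparison.
class ViaSingletonBlock
  class << self
    def run!
      'running'
    end
  end
end

class ViaSelfDot
  def self.run!
    'running'
  end
end

# Passing false limits the list to methods defined directly on each class.
p ViaSingletonBlock.singleton_methods(false) # => [:run!]
p ViaSelfDot.singleton_methods(false)        # => [:run!]
```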

Sinatra::Base.run! simply calls start_server, so how is the Sinatra::Base.start_server method implemented?

# https://github.com/sinatra/sinatra/blob/v2.0.5/lib/sinatra/base.rb#L1520

def start_server(handler, server_settings, handler_name)
  handler.run(self, server_settings) do |server|
    # ...omitted
  end
end

Hmm. It basically delegates to handler.run.

What is this handler object? As we saw in Sinatra::Base.run!, handler = detect_rack_handler, so the handler is simply the return value of the method Sinatra::Base.detect_rack_handler:

# https://github.com/sinatra/sinatra/blob/v2.0.5/lib/sinatra/base.rb#L1707

def detect_rack_handler
  servers = Array(server)
  servers.each do |server_name|
    begin
      return Rack::Handler.get(server_name.to_s)
    rescue LoadError, NameError
    end
  end
  fail "Server handler (#{servers.join(',')}) not found."
end

The Return Keyword

A Ruby tip: although blocks are like anonymous methods, they are not full-featured methods. One key difference between blocks and methods is the effect of the return keyword: inside a block, return exits the method that encloses the block, not just the block itself. A simple test reveals this:

# return.rb

def magic!
  (1..1_000_000_000_000_000).each do |i|
    return i
  end
  return 'never reached!'
end

puts magic!

$ ruby return.rb
1
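
For contrast, here is a small sketch that puts return next to next, which only leaves the current run of the block. The method names first_even and doubled_evens are ours, chosen just for this illustration.

```ruby
# return inside the block leaves first_even itself, like in detect_rack_handler.
def first_even(numbers)
  numbers.each do |n|
    return n if n.even?
  end
  nil # only reached when no even number was found
end

# next only ends the current run of the block; map keeps going.
def doubled_evens(numbers)
  numbers.map do |n|
    next 0 if n.odd?
    n * 2
  end
end

p first_even([1, 3, 4, 6])    # => 4
p doubled_evens([1, 2, 3, 4]) # => [0, 4, 0, 8]
```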

So detect_rack_handler basically returns the first handler it can find with Rack::Handler.get. The list of options is defined here:

set :server, %w[HTTP webrick]

if ruby_engine == 'macruby'
  server.unshift 'control_tower'
else
  server.unshift 'reel'
  server.unshift 'puma'
  server.unshift 'mongrel'  if ruby_engine.nil?
  server.unshift 'thin'     if ruby_engine != 'jruby'
  server.unshift 'trinidad' if ruby_engine == 'jruby'
end
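
Since unshift prepends to the array, the last unshift that fires ends up first in the list. The sketch below wraps the same statements in a method so we can try different engines; the method name candidate_servers and the idea of passing ruby_engine as a parameter are our own assumptions for illustration ('ruby' is what MRI reports in RUBY_ENGINE, 'jruby' is what JRuby reports).

```ruby
# Hypothetical wrapper around the statements above, so we can vary the engine.
def candidate_servers(ruby_engine)
  server = %w[HTTP webrick]

  if ruby_engine == 'macruby'
    server.unshift 'control_tower'
  else
    server.unshift 'reel'
    server.unshift 'puma'
    server.unshift 'mongrel'  if ruby_engine.nil?
    server.unshift 'thin'     if ruby_engine != 'jruby'
    server.unshift 'trinidad' if ruby_engine == 'jruby'
  end

  server
end

p candidate_servers('ruby')  # => ["thin", "puma", "reel", "HTTP", "webrick"]
p candidate_servers('jruby') # => ["trinidad", "puma", "reel", "HTTP", "webrick"]
```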

If we modify detect_rack_handler a little bit to print debugging information, we will get a clearer idea of how it actually works:

def detect_rack_handler
  servers = Array(server)
  puts "candidate servers:"
  p servers
  servers.each do |server_name|
    puts "trying #{server_name} ..."
    begin
      handler = Rack::Handler.get(server_name.to_s)
      puts "success!"
      return handler
    rescue LoadError, NameError
      puts "failed."
    end
  end
  fail "Server handler (#{servers.join(',')}) not found."
end

In order to load the modified Sinatra gem we need a Gemfile:

# Gemfile
gem 'sinatra', path: '../path/to/your/sinatra/folder'

And start the server with bundle exec ruby app.rb. The bundle exec here is required; otherwise Ruby will load the system default Sinatra gem instead.

$ bundle exec ruby app.rb
candidate servers:
["thin", "puma", "reel", "HTTP", "webrick"]
trying thin ...
failed.
trying puma ...
failed.
trying reel ...
failed.
trying HTTP ...
failed.
trying webrick ...
success!
[2019-04-26 19:58:32] INFO  WEBrick 1.4.2
[2019-04-26 19:58:32] INFO  ruby 2.6.2 (2019-03-13) [x86_64-darwin18]
== Sinatra (v2.0.5) has taken the stage on 4567 for development with backup from WEBrick
[2019-04-26 19:58:32] INFO  WEBrick::HTTPServer#start: pid=47401 port=4567

Interesting. Here ["thin", "puma", "reel", "HTTP", "webrick"] are all Ruby web servers that are (or used to be) popular.

If instead of MRI we use JRuby, which is a Ruby implementation on the Java Virtual Machine (JVM):

$ bundle exec jruby app.rb
candidate servers:
["trinidad", "puma", "reel", "HTTP", "webrick"]
trying trinidad ...
failed.
trying puma ...
failed.
trying reel ...
failed.
trying HTTP ...
failed.
trying webrick ...
success!
[2019-04-27 11:26:32] INFO  WEBrick 1.4.2
[2019-04-27 11:26:32] INFO  ruby 2.5.3 (2019-02-11) [java]
== Sinatra (v2.0.5) has taken the stage on 4567 for development with backup from WEBrick
[2019-04-27 11:26:32] INFO  WEBrick::HTTPServer#start: pid=66952 port=4567

We see that trinidad is now JRuby's first option.

If we add the puma gem to the Gemfile:

$ bundle exec ruby app.rb
candidate servers:
["thin", "puma", "reel", "HTTP", "webrick"]
trying thin ...
failed.
trying puma ...
success!
== Sinatra (v2.0.5) has taken the stage on 4567 for development with backup from Puma
Puma starting in single mode...
* Version 3.12.1 (ruby 2.6.2-p47), codename: Llamas in Pajamas
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://localhost:4567
Use Ctrl-C to stop

Instead of WEBrick, we are now booting Puma!

So our chase inside the Sinatra codebase ends here... The trace leads to the Rack::Handler class, but it seems it's not directly a part of Sinatra. So what is this Rack thing? And how come adding the puma gem changes everything?

Rack and Roll: Diving into Source Code of Rack

Time to go even deeper, and inspect the source code of Rack.

Semantic Versioning

Before we immerse ourselves into the codebase of rack, we have one question: which version of rack should we look at?

According to sinatra.gemspec, Sinatra depends on Rack version ~> 2.0:

# https://github.com/sinatra/sinatra/blob/v2.0.5/sinatra.gemspec
# ...omitted
s.add_dependency 'rack', '~> 2.0'
# ...omitted

What does ~> 2.0 mean? Well, like many other software communities, most Ruby gem developers use a system called Semantic Versioning. To quote the Summary section:


Given a version number MAJOR.MINOR.PATCH, increment the:

  1. MAJOR version when you make incompatible API changes,
  2. MINOR version when you add functionality in a backwards-compatible manner, and
  3. PATCH version when you make backwards-compatible bug fixes.

That means, if a gem called foo follows the Semantic Versioning principle, we could have:

  • foo 1.0.0 initial public release
  • foo 1.0.1, 1.0.2 bug fixes while keeping backward compatibility
  • foo 1.1.0 new functionalities added
  • foo 1.1.1, 1.1.2 bug fixes upon 1.1.0
  • foo 2.0.0 major functionalities added, potentially backward incompatible

Easy, and it all makes sense. By following the Semantic Versioning principle when versioning your gems, you make their upgrade behavior predictable to users and to other gems that depend on them.

Be a good citizen, conform to the social contract.

Since rack follows the Semantic Versioning principle, we have a safe yet not too strict way of defining which rack versions Sinatra depends on, using a pattern called Pessimistic Version Constraint.

Basically, it means we should be pessimistic when declaring dependencies: generally Sinatra wants rack version >= 2.0, but locking to a specific version (say, 2.0.7) is too rigid. Being pessimistic, we want to make sure we use rack version < 3.0, since according to semver 3.0 might be backward incompatible.

The constraint could be written as:

s.add_dependency 'rack', ['>= 2.0', '< 3.0']

Since this pattern is so common (that's why it's called a pattern!) we have a special operator for it (~>), and it's nicely called the twiddle-wakka:

s.add_dependency 'rack', '~> 2.0'

Similarly, if you want to be more pessimistic, you can write:

s.add_dependency 'library', '~> 2.2.0'

to achieve the same effect as:

s.add_dependency 'library', ['>= 2.2.0', '< 2.3.0']

All good.
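
We don't have to take this on faith: RubyGems ships the Gem::Requirement and Gem::Version classes, so we can ask it directly. The version numbers below are arbitrary samples picked for the check.

```ruby
require 'rubygems' # normally loaded already; listed to keep the snippet self-contained

twiddle     = Gem::Requirement.new('~> 2.0')
spelled_out = Gem::Requirement.new('>= 2.0', '< 3.0')
stricter    = Gem::Requirement.new('~> 2.2.0')

# Arbitrary sample versions: both spellings should agree on every one of them.
%w[1.9.9 2.0.0 2.0.7 2.9.1 3.0.0].each do |number|
  version = Gem::Version.new(number)
  puts "#{number}: ~> 2.0 says #{twiddle.satisfied_by?(version)}, " \
       "explicit range says #{spelled_out.satisfied_by?(version)}"
end

p stricter.satisfied_by?(Gem::Version.new('2.2.9')) # => true
p stricter.satisfied_by?(Gem::Version.new('2.3.0')) # => false
```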

How Rack::Handler.get Works

Now that we know version ~> 2.0 of rack is what we want, we clone the source code of rack and check out the 2-0-stable branch.

Quickly we navigate to lib/rack/handler.rb which is where the Rack::Handler class resides.

The Rack::Handler.get method is defined as:

# https://github.com/rack/rack/blob/2-0-stable/lib/rack/handler.rb

module Rack
  # ...omitted
  module Handler
    def self.get(server)
      # ...omitted
      if klass = @handlers[server]
        klass.split("::").inject(Object) { |o, x| o.const_get(x) }
      else
        const_get(server, false)
      end
      # ...omitted
    end
    
    # ...omitted
    
    def self.register(server, klass)
      @handlers ||= {}
      @handlers[server.to_s] = klass.to_s
    end

    # ...omitted

    register 'cgi', 'Rack::Handler::CGI'
    register 'fastcgi', 'Rack::Handler::FastCGI'
    register 'webrick', 'Rack::Handler::WEBrick'
    register 'lsws', 'Rack::Handler::LSWS'
    register 'scgi', 'Rack::Handler::SCGI'
    register 'thin', 'Rack::Handler::Thin'
  end
end

const_get is a method on Ruby's Module class. It takes a String (or Symbol) and returns the corresponding constant, or raises a NameError if the constant cannot be found.
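
The split/inject line is worth trying in isolation. The sketch below wraps it in a method we call lookup (our own name, not Rack's) and feeds it constants that ship with Ruby.

```ruby
# Hypothetical helper wrapping the same split/inject idiom used in Rack::Handler.get.
def lookup(name)
  # Start at Object and walk one namespace level per step:
  # Object -> File -> File::Stat
  name.split('::').inject(Object) { |scope, part| scope.const_get(part) }
end

p lookup('File::Stat')      # => File::Stat
p lookup('Process::Status') # => Process::Status

begin
  lookup('File::NoSuchThing')
rescue NameError => e
  puts "NameError: #{e.message}"
end
```

This also explains why detect_rack_handler rescues NameError next to LoadError: a name that does not resolve to a constant fails in exactly this way.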

Since the test code of rack is very good, we can actually read it to understand how this method works:

# https://github.com/rack/rack/blob/2-0-stable/test/spec_handler.rb

# ...omitted

describe Rack::Handler do
  it "has registered default handlers" do
    Rack::Handler.get('cgi').must_equal Rack::Handler::CGI
    Rack::Handler.get('webrick').must_equal Rack::Handler::WEBrick
    # ...omitted
  end

  it "raise LoadError if handler doesn't exist" do
    lambda {
      Rack::Handler.get('boom')
    }.must_raise(LoadError)

    lambda {
      Rack::Handler.get('Object')
    }.must_raise(LoadError)
  end

  # ...omitted
end

Among our candidate servers (["thin", "puma", "reel", "HTTP", "webrick"]), the only handler rack can load without any extra gem installed is Rack::Handler::WEBrick (the registered Thin handler still needs the thin gem itself), so it is the default one that gets loaded.

A Brief Detour to Puma

If we take a small detour to puma's source code we will see that puma implements Rack::Handler::Puma, and it takes precedence over webrick in Sinatra's search list. Thus when we add puma to our Gemfile, Sinatra will use it instead of webrick.
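
To see the whole mechanism in one place, here is a toy version of the registry plus the "first one that loads wins" loop. TinyHandlers, Builtin, Fancy and detect are all hypothetical names standing in for Rack::Handler, the webrick handler, a third-party handler and detect_rack_handler; this is a sketch of the idea, not Rack's or Puma's actual code.

```ruby
# TinyHandlers is a hypothetical stand-in for Rack::Handler.
module TinyHandlers
  def self.register(name, class_name)
    @handlers ||= {}
    @handlers[name.to_s] = class_name.to_s
  end

  def self.get(name)
    class_name = @handlers.fetch(name.to_s) { raise LoadError, "no handler named #{name}" }
    class_name.split('::').inject(Object) { |scope, part| scope.const_get(part) }
  end

  class Builtin; end
  register 'builtin', 'TinyHandlers::Builtin'
end

# Same shape as detect_rack_handler: return the first candidate that loads.
def detect(candidates)
  candidates.each do |name|
    begin
      return TinyHandlers.get(name)
    rescue LoadError, NameError
      # not available, try the next candidate
    end
  end
  raise "Server handler (#{candidates.join(',')}) not found."
end

before = detect(%w[fancy builtin])

# Simulates adding a gem that ships its own handler.
module TinyHandlers
  class Fancy; end
  register 'fancy', 'TinyHandlers::Fancy'
end

after = detect(%w[fancy builtin])

p before # => TinyHandlers::Builtin
p after  # => TinyHandlers::Fancy
```

The candidate list never changed; only the set of loadable handlers did, which is why adding a gem is enough to switch servers.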

Rack::Handler::WEBrick.run

So now we've found the webrick handler's code, let's look at Rack::Handler::WEBrick.run:

# https://github.com/rack/rack/blob/2-0-stable/lib/rack/handler/webrick.rb#L25-L35

module Rack
  module Handler
    class WEBrick < ::WEBrick::HTTPServlet::AbstractServlet
      def self.run(app, options={})
        # ...omitted
        @server = ::WEBrick::HTTPServer.new(options)
        @server.mount "/", Rack::Handler::WEBrick, app
        yield @server  if block_given?
        @server.start
      end

      # ...omitted
    end
  end
end

Again, it simply delegates to ::WEBrick::HTTPServer#start.

Module#const_get

Another small Ruby tip here. We have to use ::WEBrick instead of simply WEBrick because we are looking for the top-level WEBrick constant. Look at the example below:

# const_lookup.rb

class B
  def initialize
    puts "instance of ::B created"
  end
end

module A
  class B
    def initialize
      puts "instance of ::A::B created"
    end
  end

  class C
    def initialize
      @inner = B.new
      p @inner
      @outmost = ::B.new
      p @outmost
    end
  end
end

A::C.new

Test it:

$ ruby const_lookup.rb
instance of ::A::B created
#<A::B:0x00007fa52b061ff8>
instance of ::B created
#<B:0x00007fa52b061e40>

So we know the WEBrick in ::WEBrick::HTTPServer.new is not the same as the one in Rack::Handler::WEBrick. Actually, using same-named classes/modules under different namespaces is a very common Ruby pattern; we will see more of this in the future.

After reading the documentation of webrick we quickly update our fake Sinatra:

# fake3.rb

require 'webrick'

def get(path, &block)
  puts "[FAKE] registering new route: GET #{path}"
end

at_exit {
  server = WEBrick::HTTPServer.new(Port: 4567)
  server.start
}

get '/frank-says' do
  'Put this in your pipe & smoke it!'
end

get '/time' do
  Time.now.to_s
end

Test run:

$ ruby fake3.rb
[FAKE] registering new route: GET /frank-says
[FAKE] registering new route: GET /time
[2019-04-27 13:13:15] INFO  WEBrick 1.4.2
[2019-04-27 13:13:15] INFO  ruby 2.6.2 (2019-03-13) [x86_64-darwin18]
[2019-04-27 13:13:15] INFO  WEBrick::HTTPServer#start: pid=70070 port=4567

Awesome! It seems to work. But now whatever route we hit we always get a 404 response:

http://localhost:4567/time

Of course. We haven't registered any routes with webrick yet. Let's do this in the next section.

All Routes Lead to Rome: Routing Implementation

If we go back to the Rack::Handler::WEBrick.run method, we immediately notice that the key ingredient we are missing is this line:

# https://github.com/rack/rack/blob/2-0-stable/lib/rack/handler/webrick.rb#L32

@server.mount "/", Rack::Handler::WEBrick, app

As good developers we immediately run out to read the documentation of WEBrick::HTTPServer#mount:

https://ruby-doc.org/stdlib-2.6.3/libdoc/webrick/rdoc/WEBrick/HTTPServer.html#method-i-mount

It says nothing. Honestly, Ruby's documentation is not known to be very good. Fortunately, we have the ultimate weapon:

Use the Source, Luke!

Since webrick is a standard library of Ruby, we can find its source code directly inside Ruby's repository:

# https://github.com/ruby/ruby/blob/ruby_2_6/lib/webrick/server.rb
# drastically simplified to make it easy to read

module WEBrick
  class SimpleServer
    def SimpleServer.start
      yield
    end
  end

  class GenericServer
    def start(&block)
      server_type = @config[:ServerType] || SimpleServer

      server_type.start{
        @logger.info \
          "#{self.class}#start: pid=#{$$} port=#{@config[:Port]}"

        if svrs = IO.select([sp, *@listeners])
          svrs[0].each{|svr|
            if sock = accept_client(svr)
              th = start_thread(sock, &block)
            end
          }
        end
      }
    end
    
    def accept_client(svr)
      case sock = svr.to_io.accept_nonblock(exception: false)
      when :wait_readable
        nil
      else
        sock
      end
    end

    def start_thread(sock, &block)
      Thread.start{
        addr = sock.peeraddr
        @logger.debug "accept: #{addr[3]}:#{addr[1]}"
        block ? block.call(sock) : run(sock)
      }
    end
  end
end

# https://github.com/ruby/ruby/blob/ruby_2_6/lib/webrick/httpserver.rb

# ...omitted
module WEBrick
  # ...omitted
  class HTTPServer < ::WEBrick::GenericServer
    # ...omitted
    def run(sock)
      while true
        req = create_request(@config)
        res = create_response(@config)
        server = self
        begin
          # ...omitted
          server.service(req, res)
        ensure
          # ...omitted
        end
        # ...omitted
      end
    end
    # ...omitted
    def service(req, res)
      # ...omitted
      servlet, options, script_name, path_info = search_servlet(req.path)
      # ...omitted
      si = servlet.get_instance(self, *options)
      @logger.debug(format("%s is invoked.", si.class.name))
      si.service(req, res)
    end
    # ...omitted
    def search_servlet(path)
      script_name, path_info = @mount_tab.scan(path)
      servlet, options = @mount_tab[script_name]
      if servlet
        [ servlet, options, script_name, path_info ]
      end
    end
    # ...omitted
  end
end

Hmm. It seems a bit complicated. Let's break it down:

  1. we start the webrick server with WEBrick::HTTPServer#start
  2. WEBrick::HTTPServer does not implement the start method, so we turn to its parent class WEBrick::GenericServer
  3. GenericServer#start calls SimpleServer.start with a block; SimpleServer.start does nothing but yield, so the block simply runs as is, with no side effects
  4. inside the block, IO.select and GenericServer#accept_client find newly connected HTTP clients
  5. for each client, GenericServer#start_thread starts a new Thread to handle the request
  6. GenericServer#start_thread simply calls #run, which in this case is implemented by WEBrick::HTTPServer#run
  7. WEBrick::HTTPServer#run calls HTTPServer#service, which looks up the mounted servlet and eventually calls Rack::Handler::WEBrick#service
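
Steps 4 to 6 can be reproduced with nothing but Ruby's socket library. The sketch below is our own toy, not WEBrick code: it listens on a port picked by the OS, waits with IO.select, accepts one client with accept_nonblock, handles it in a new Thread, and then stops. A real server would keep looping and would parse HTTP properly.

```ruby
require 'socket'

listener = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
port = listener.addr[1]

server_thread = Thread.new do
  handled = false
  until handled
    # Block until the listening socket has a client waiting.
    ready, = IO.select([listener])
    ready.each do |svr|
      sock = svr.accept_nonblock(exception: false)
      next if sock == :wait_readable # nobody there after all, select again

      # One thread per client, like GenericServer#start_thread.
      worker = Thread.start(sock) do |client|
        request_line = client.gets
        client.write("you asked for: #{request_line}")
        client.close
      end
      worker.join
      handled = true
    end
  end
end

# Play the browser: connect, send a request line, read until the server closes.
client = TCPSocket.new('127.0.0.1', port)
client.puts 'GET /time HTTP/1.1'
reply = client.read
client.close

server_thread.join
listener.close
puts reply
```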

Don't worry too much about the IO or Thread related parts, we will have more blog posts on these topics.

In app.rb we can hack webrick to make it dump more verbose logs:

# app.rb

require 'sinatra'

# force webrick to log as verbose as possible
# https://github.com/ruby/ruby/blob/ruby_2_6/lib/webrick/server.rb#L91
require 'webrick'
set :server_settings, {
  Logger: WEBrick::Log.new(nil, WEBrick::Log::DEBUG)
}

# ... continue with the routes we defined

Boot it and hit http://localhost:4567/time:

$ bundle exec ruby app.rb
[2019-04-27 14:00:23] INFO  WEBrick 1.4.2
[2019-04-27 14:00:23] INFO  ruby 2.6.2 (2019-03-13) [x86_64-darwin18]
[2019-04-27 14:00:23] DEBUG Rack::Handler::WEBrick is mounted on /.
== Sinatra (v2.0.5) has taken the stage on 4567 for development with backup from WEBrick
[2019-04-27 14:00:23] INFO  WEBrick::HTTPServer#start: pid=72440 port=4567

[2019-04-27 14:06:25] DEBUG accept: ::1:55186
[2019-04-27 14:06:25] DEBUG Rack::Handler::WEBrick is invoked.
::1 - - [27/Apr/2019:14:06:25 +0800] "GET /time HTTP/1.1" 200 25 0.0009
::1 - - [27/Apr/2019:14:06:25 CST] "GET /time HTTP/1.1" 200 25
- -> /time
[2019-04-27 14:06:25] DEBUG close: ::1:55186

After inspecting the log and cross-referencing it with Rack::Handler::WEBrick.run and Rack::Handler::WEBrick#service, we now understand that the whole setup flow is:


Note that req is short for Request, and res is short for Response.

Connecting FakeSinatra to WEBrick

So we only need to handle the rack part of the diagram, which is the handler#service method. Let's move forward and improve our fake Sinatra implementation:

# fake4.rb
require 'webrick'

class FakeSinatra
  # no need to inherit from WEBrick::HTTPServlet::AbstractServlet
  def self.get_instance(server)
    new
  end

  def service(req, res)
    res.body = 'hello world'
  end
end

def get(path, &block)
  puts "[FAKE] registering new route: GET #{path}"
end

at_exit {
  server = WEBrick::HTTPServer.new(Port: 4567)
  server.mount('/', FakeSinatra)
  server.start
}

# ... continue with route definitions

When we run it and hit /time (actually, any route) we are welcomed with hello world:

http://localhost:4567/anything

Which Path Is It?

Great. What's left is to make it dynamic: depending on the route requested, we call the matching block the user provided. Naturally we want to look at the req object:

# fake4.rb
require 'webrick'
require 'pp'

class FakeSinatra
  def self.get_instance(server)
    new
  end

  def service(req, res)
    pp req # prettier p
  end
end

# ... continue with definitions

When we run it and hit /time we get a huge dump, but at this moment we only care about @path and @path_info, which are basically the same:

#<WEBrick::HTTPRequest:0x00007f9beb12a118
 # ... omitted
 @path="/time",
 @path_info="/time",
 # ... omitted

With that knowledge and some Ruby tricks, we can happily finish our fake Sinatra app:

# fake_sinatra.rb
require 'webrick'

class FakeSinatra
  def self.get_instance(server=nil)
    @instance ||= new
  end

  def service(req, res)
    res.body = "#{@routes[req.path].call} - served by FakeSinatra"
  end

  def register(path, block)
    @routes ||= {}
    @routes[path] = block
  end

  module Helpers
    def get(path, &block)
      FakeSinatra.get_instance.register(path, block)
    end
    # TODO: implement more HTTP methods
  end
end

# so the user can call `get` etc
include FakeSinatra::Helpers

at_exit {
  server = WEBrick::HTTPServer.new(Port: 4567)
  server.mount('/', FakeSinatra)
  server.start
}

Now we just need to replace the require 'sinatra' line in our app.rb:

# app.rb
require_relative 'fake_sinatra'

get '/frank-says' do
  'Put this in your pipe & smoke it!'
end

get '/time' do
  Time.now.to_s
end

Boot it with ruby app.rb and hit http://localhost:4567/time:

And. It. Works.
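
One rough edge is worth knowing about: for a path that was never registered, @routes[req.path] is nil, and calling .call on it raises NoMethodError. A possible way to handle that is sketched below. To keep it runnable without booting a server, FakeRequest and FakeResponse are hypothetical Struct stand-ins for the WEBrick request and response objects, and SaferFakeSinatra is an invented name; only the body of service is the part that would move into our real class.

```ruby
# Hypothetical stand-ins for the WEBrick request/response objects.
FakeRequest  = Struct.new(:path)
FakeResponse = Struct.new(:status, :body)

class SaferFakeSinatra
  def initialize
    @routes = {}
  end

  def register(path, block)
    @routes[path] = block
  end

  def service(req, res)
    handler = @routes[req.path]
    if handler
      res.status = 200
      res.body = "#{handler.call} - served by FakeSinatra"
    else
      # Unknown path: answer with a 404 instead of crashing on nil.call.
      res.status = 404
      res.body = "No route registered for #{req.path}"
    end
  end
end

app = SaferFakeSinatra.new
app.register('/frank-says', proc { 'Put this in your pipe & smoke it!' })

hit = FakeResponse.new
app.service(FakeRequest.new('/frank-says'), hit)

miss = FakeResponse.new
app.service(FakeRequest.new('/favicon.ico'), miss)

p [hit.status, hit.body]
p [miss.status, miss.body]
```

We left this out of the main listing to keep it tiny, but browsers routinely ask for /favicon.ico, so the miss case comes up sooner than you might expect.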

Wrapping Up

So this wraps it up. In this post we looked into the source code of Sinatra, together with rack and webrick, and at the end we were able to create our own minimal 山寨 version of it. Our version is so minimal that it has no external dependencies other than Ruby's built-in webrick library.

There is still a lot that we did not cover, to name a few:

  • how to handle query params of GET requests
  • how to handle POST requests and the POST body
  • how to use a handler other than webrick when available
  • how to use a template engine
  • how to make it modular and therefore mountable by other web apps
  • ...

And most importantly, in our final implementation we completely bypassed rack, which is THE CORE PART of any serious Ruby app server. But that could be a topic for another post.

Thanks for reading this blog post. In the future we will have more posts like this, and eventually we will tackle Rails, to break it down and analyze individual parts of it, and, most importantly, rebuild our own clone of it.

The source code of this blog post lives on GitHub. Don't hesitate to drop me a message at [email protected] if you have any feedback or questions.

Update:

I have a follow-up post on this topic where I took it further and implemented more features. Go read it now!