canary, v1

I recently released v1 of canaryio/canary, which I consider the spiritual successor to canaryio/sensord and canaryio/canaryd. After six months of exploration, some clarity has emerged, and with it the need for a course correction.

small, sharp, composable

The previous incarnation of the project aimed unnecessarily at holism. We were building up an entire toolchain of snowflakes to help you understand the availability and performance of your website. We were experimenting with dashboards and long-term storage of data. There was talk of building monitoring and alerting components, as if we could do it better.

It soon became clear that this was a mistake. Instead of trying to own the entire experience, I should reduce scope and do what I know how to do best - take measurements and make sure those outputs can be integrated into existing telemetry ecosystems. Today, it is perfectly reasonable for a company to invest in a SaaS product such as Librato or to roll its own in-house solution based on Graphite and Grafana. Context-rich logs are made searchable via services like Papertrail or in-house via Heka, Elasticsearch, and Kibana. If canary is going to provide value, it needs to be easy to integrate into such environments.

v1 takes a step towards correcting the project and offers a way forward for future experimentation.

what's in v1?

The v1 release contains a core set of interfaces along with a CLI tool, canary. Using it looks like so:

$ canary http://www.canary.io
2014/12/27 16:14:59 http://www.canary.io 200 96 true
2014/12/27 16:15:00 http://www.canary.io 200 95 true
2014/12/27 16:15:01 http://www.canary.io 200 102 true
^C

As you can see, we ask the tool to monitor a single website (http://www.canary.io), and the results are emitted to STDOUT. The canary tool is meant to be used in a similar fashion to ping - it gives you quick insight into the basic availability and performance of your target.

This release also introduces two interfaces, Sampler and Publisher, which you can read about in the docs. I believe these will reduce friction for future expansion.
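To make that concrete, here's a rough sketch of the shape these interfaces take. The field names and exact signatures below are illustrative guesses on my part - the docs remain the source of truth:

package canary

import "time"

// Target identifies a site to be measured. The fields here are
// illustrative, not the real definition.
type Target struct {
	URL string
}

// Sample holds the outcome of a single measurement. Again, the
// exact fields are assumptions made for the sake of the sketch.
type Sample struct {
	StatusCode int
	Latency    time.Duration
	OK         bool
}

// A Sampler knows how to take a measurement against a Target.
type Sampler interface {
	Sample(Target) (Sample, error)
}

// A Publisher ships a finished measurement somewhere useful -
// STDOUT, Librato, a log pipeline, and so on.
type Publisher interface {
	Publish(Target, Sample) error
}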

what's coming in v2?

v2 is an interim release that will introduce the canaryd command. canaryd is similar to canary, but it is capable of monitoring multiple sites and receives its configuration via a JSON manifest. A representative manifest can be found here.
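For reference, a manifest along these lines would do the job. The exact schema lives with the project, so treat the field names here as an educated guess rather than a spec:

{
  "targets": [
    { "url": "http://www.canary.io" },
    { "url": "https://github.com" },
    { "url": "https://www.heroku.com/" }
  ]
}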

An example of what it currently looks like when executed:

$ MANIFEST_URL=http://www.canary.io/manifest.json canaryd
2014/12/27 15:20:09 http://www.canary.io 200 128 true
2014/12/27 15:20:09 https://www.simple.com/ 200 252 true
2014/12/27 15:20:09 https://github.com 200 384 true
2014/12/27 15:20:09 https://www.heroku.com/ 200 413 true
2014/12/27 15:20:10 https://www.simple.com/ 200 76 true
2014/12/27 15:20:10 http://www.canary.io 200 94 true
2014/12/27 15:20:10 https://github.com 200 306 true
2014/12/27 15:20:10 https://www.heroku.com/ 200 306 true
^C

Note that things may change before v2 is merged into master.

After v2, I'll build other publishers, beginning with one for Librato.

A roadmap and deprecation

The initial roadmap is housed here and contains a short list of goals that I need to reach in order to scratch a personal itch. Once those have been completed, the project should be in good shape for further improvement.

At this time I also plan to deprecate canaryio/canaryd, canaryio/sensord, and canaryio/meta. The repositories will remain intact, but all issues will be closed and the READMEs will be updated accordingly. Anyone is welcome to fork and run with those projects as they see fit, but they will no longer be supported by me.

I will also be shutting down the existing api.canary.io and watch.canary.io sites since the core project is now heading in a much simpler direction. I am very grateful for all of the community support, and am especially thankful to Rackspace for hosting us up to this point and to the talented Jeremy Green for all the time he spent on watch.canary.io. Thank you very much for helping make this experiment possible.

Unhelpful AWS Tip 1 - Compute Instances, Not Computers

I've been hacking around on AWS for five years now, and am starting to compile a list of tips and tricks to help make the most of things. I'm hoping to use this series of posts to help me clarify my own thinking, and if you find any of this helpful, even better.

When working in a bare metal environment, I'd likely provision a new box and hook it up to a Chef Server or Puppet Master. I'd spend most of my time thinking about my on-instance configuration management, as that matters quite a bit since these boxes are going to hang around for a long time.

This isn't the right way to approach EC2. In the AWS model, you want your EC2 instances to be stateless and thrown away often. You should push state onto dedicated services such as Heroku Postgres or Amazon RDS / DynamoDB. Your compute instances should be as thin as possible, focusing as much of their compute power as possible on the task at hand.

It's useful to think of EC2 instances as nothing more than dynamic compute containers awaiting instruction. For example, take a look at this small Ruby script:

#!/usr/bin/env ruby

require 'aws-sdk'

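# everything the launch needs to know: AWS placement details, plus the
# app slug to fetch and the command that starts it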
metadata = {
  image_id:      'ami-9aaa1cf2',
  instance_type: 't2.micro',
  subnet:        'subnet-a55587fc',
  key_name:      'ops',
  slug_url:      'https://s3-us-west-1.amazonaws.com/example-slugs/demo.tar.gz',
  cmd:           'bin/web'
}

script = <<ENDSCRIPT
#!/bin/bash

set -exo pipefail

apt-get update
apt-get install -y runit ruby

# our hero
useradd -r app
mkdir -p /srv/app
chown app:app -R /srv/app

# fetch the app
su - app -c 'curl -s #{metadata[:slug_url]} | tar xvz -C /srv/app'

# write our runit script
mkdir -p /etc/sv/app
cat >> /etc/sv/app/run <<EOF
#!/bin/bash

cd /srv/app
exec chpst -u app #{metadata[:cmd]}
EOF
chmod +x /etc/sv/app/run

# write our runit logger script
mkdir -p /etc/sv/app/log
mkdir -p /var/log/app
cat >> /etc/sv/app/log/run <<EOF
#!/bin/bash
exec svlogd -t /var/log/app
EOF
chmod +x /etc/sv/app/log/run

# symlink to turn our app on
ln -s /etc/sv/app /etc/service/app
ENDSCRIPT

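# launch a disposable instance: cloud-init runs the user-data script as
# root on first boot, and a shutdown terminates (rather than stops) the box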
ec2 = AWS::EC2.new
i = ec2.instances.create({
  image_id:                             metadata[:image_id],
  instance_type:                        metadata[:instance_type],
  subnet:                               metadata[:subnet],
  associate_public_ip_address:          true,
  key_name:                             metadata[:key_name],
  instance_initiated_shutdown_behavior: 'terminate',
  user_data:                            script
})

puts "launched #{i.id}..."

Let's walk through this:

  • we define some metadata for our instance that describes not only the necessary AWS elements but also a URL for the app we want to run and the command needed to invoke it
  • we render a shell script based on those parameters. It sets up a process supervisor (runit in this case), fetches our app, and starts it up
  • we launch an instance with these inputs

And that's about it. If you were to visit port 5000 with a browser, you'd find a 'Hello World' web app running.
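Or check from a terminal. Assuming you've looked up the public IP that EC2 assigned and exported it as PUBLIC_IP (and assuming this particular demo slug responds with a plain greeting):

$ curl http://$PUBLIC_IP:5000
Hello World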

The Good Parts

  • things are data-driven and fairly functional - data in, app instance out
  • we just launched our app on a compute instance with minimal code
  • it is repeatable - you could run N of these if you so pleased
  • at this point, we're not required to use more complex configuration management - a shell script is just fine
  • we stand a fair chance of being able to run any app we want - just give us a tarball, and we're set as long as the right libs are installed on the host

Room for Improvement

Things that immediately jump to mind:

  • what if we want to configure this app - what then?
  • how do we know that it booted okay?
  • how do I maintain a fleet of these with minimal overhead?
  • what if I need a load balancer?
  • what if I want to run a lot of different apps, with minimal overhead?
  • I don't like installing packages at boot time
  • how do we reliably support apps that are not ruby?

In Summary

The more you treat EC2 instances like bare metal computers, the more you'll hate your life. This is a proven ratio, embedded deep within the universe. I'm pretty sure it was featured in a recent Dan Brown novel. Treat them like single-task compute instances, and you'll find yourself working with something more like Lego Blocks. Start treating complexity as a smell (as God intended), and watch your life improve.

Future posts will likely dig into improvements and higher level concerns. I'll probably also go all hipster on you and demonstrate how docker could be applied here.

Canary Gets Experimental Websocket Support

This morning, I upgraded the public version of canaryd to the 1.0.0 branch. The big public-facing benefit here is websocket support for streaming down per-check measurements, rather than having to continuously poll the measurements endpoint.

Big thanks to Karl Kirch for doing all of the work.

You can test this yourself like so:

$ go get github.com/gorsuch/ws

$ ws -url wss://api.canary.io:443/ws/checks/https-github.com/measurements
{"check":{"id":"https-github.com","url":"https://github.com"},"id":"38dbd030-2821-4468-40f9-ed33c8eb11ab","location":"rax-syd","t":1406383947,"timestamp":"","exit_status":0,"http_status":200,"local_ip":"119.9.20.212","primary_ip":"192.30.252.131","namelookup_time":0.000561,"connect_time":0.213167,"starttransfer_time":0.864768,"total_time":1.0773899999999998,"size_download":15383}
#...

As stated in the title, this is considered experimental for the time being and is subject to change.

Enjoy, and show Karl some love!

Canary Component Versioning

I wanted to drop a small note mentioning that canary.io components are now being versioned via SemVer.

It is my hope that this will make it much easier to communicate the scope of change, help prevent breakage, and allow for the use of services such as gopkg.in.
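For example, gopkg.in maps a versioned import path to the newest matching tag, so consumers could pin to the v1 series with something like the snippet below. The import path is illustrative - check the repos for the real one:

package main

// The blank import is only here to demonstrate the pinned path;
// gopkg.in/canaryio/canary.v1 would resolve to the newest v1.x.x tag
// of github.com/canaryio/canary.
import _ "gopkg.in/canaryio/canary.v1"

func main() {}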

More updates soon! <3

Status Update

It's been a long, long time since I've written anything. Here are some brief notes on current status:

Family

I am married and have two kids. Twin daughters, to be more specific. They are wonderful and give me a good reason to keep my nose clean.

We have a dog, Starbuck, named after the character from the reimagined Battlestar Galactica. I really have no idea why we did that, but it works.

Work

I am happily employed at Simple, focusing on technical operations. They are a remote-friendly company, and I live and work out of my home in Oklahoma City.

Current Readings

I'm currently trying to make sense of the Satipatthana Sutta in both theory and practice, leaning on the following works:

Political Leanings

I'm still pretty damned liberal, regardless of my cynical trips.

Side Projects

canary.io is my baby. I've neglected it over the past month as I transitioned to my new job, but it has a lot of good things coming up.

For a preview, keep an eye on gorsuch/canary, my experimental reboot of the service offering.

Development Environment

I do most of my work inside a Vagrant instance, maintained by Ansible. I ssh in and use tmux, vim, and irssi to get things done. I don't like to customize.

Programming Language of Choice

Go wherever applicable. I'm still a newb, but the language and community have been wonderful so far.