Building a Simple Olympic Medals API

I’m shamelessly excited about the upcoming Olympic Games. I’m a sucker for both the competition and the cheesy human-interest stories. I thought the games would make a good excuse to show how a simple API can be built and launched from scratch with modern tools.

Put on your propeller beanie and let’s take a gentle geeky look at how I built it.

Olympics Medals API

The project was to launch a web-based API that returns JSON data on the current medal count for the Sochi 2014 games. In plain English, that means a URL:

which returns raw data that can easily be consumed by another computer program.

Why would we want this? This is very similar to almost every API that powers mobile apps today. Most iPhone and Android applications are constantly visiting URLs like this to get the data they need to update views in response to user input, loading a new screen, etc. These things power nearly every interaction you do on mobile, and a good chunk of the web too.

In our specific case, we return JSON text as seen below, with the latest medal counts for all the Olympic countries. You’ll get the full data if you click the link above.[1]

[
    {
      "country_id": "united-states",
      "country_name": "United States",
      "rank": 1,
      "gold_count": 12,
      "silver_count": 14,
      "bronze_count": 6,
      "medal_count": 32
    },
    {
      "country_id": "germany",
      "country_name": "Germany",
      "rank": 2,
      "gold_count": 8,
      "silver_count": 16,
      "bronze_count": 1,
      "medal_count": 25
    },
    and so on, for all 94 countries represented.
]  

There’s also another URL for retrieving the medal counts for a particular country:

That one returns a much smaller bit of text:

{
    "country_name": "United States",
    "rank": 1,
    "gold_count": 12,
    "silver_count": 14,
    "bronze_count": 6,
    "medal_count": 32
}  
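
To give a feel for what "consumed by another computer program" means, here is a minimal Python sketch of a consumer. It parses a payload shaped like the list response shown earlier and checks that each total matches its parts. The payload is pasted in as a string rather than fetched over the network, and the name `parse_standings` is made up for illustration; it is not part of the API.

```python
import json

# Sample payload copied from the example list response (first two countries only).
SAMPLE = """
[
  {"country_id": "united-states", "country_name": "United States", "rank": 1,
   "gold_count": 12, "silver_count": 14, "bronze_count": 6, "medal_count": 32},
  {"country_id": "germany", "country_name": "Germany", "rank": 2,
   "gold_count": 8, "silver_count": 16, "bronze_count": 1, "medal_count": 25}
]
"""


def parse_standings(text):
    """Parse the JSON text and return a dict keyed by country_id.

    Raises ValueError if a country's medal_count does not equal
    gold + silver + bronze.
    """
    standings = {}
    for row in json.loads(text):
        total = row["gold_count"] + row["silver_count"] + row["bronze_count"]
        if total != row["medal_count"]:
            raise ValueError(f"inconsistent totals for {row['country_id']}")
        standings[row["country_id"]] = row
    return standings


if __name__ == "__main__":
    for country_id, row in parse_standings(SAMPLE).items():
        print(f"{row['rank']}. {row['country_name']}: {row['medal_count']} medals")
```

A real client would swap the pasted string for the body of an HTTP response; the parsing step stays the same.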

Getting the Data

This app was a fun reason to try out a newly launched tool called Kimono. They offer a service which scrapes structured data off web pages for you. I created a Kimono scraper in only a few clicks which retrieves the raw data directly from Sochi2014.com. Wouldn’t have been hard to do myself, but developers love shortcuts wherever we can find them.

It’s worth noting here that my API is a wrapper for a Kimono API, which is scraping the official Sochi website, which is displaying raw data from the International Olympic Committee medal standings API. These kinds of services-built-on-services are what make the modern web so exciting and powerful, while simultaneously confusing and often fragile. If I were building a real production-quality API for Olympic medal standings, I’d almost certainly try to license the raw data source to make my app faster and more reliable. But this approach will work for our purposes, and allowed me to get the whole API built and deployed in only a couple hours.

Building the App

I chose the lightweight Ruby Padrino framework for this app. It doesn’t have as many advanced features and support as something like Ruby on Rails, but it’s fast and easy to work with for a tight small project that doesn’t need a fancy front-end or even a database (though you can do all that with Padrino too).

You can find all the source code for this application open-sourced on GitHub. If you haven’t poked around at an app like this before, indulge yourself, and go take a look at just three files:

  1. The main application file shows three simple URLs. Our two API endpoints, and the root, which redirects to our documentation.
  2. The MedalData class which does the work of grabbing the raw data and arranging it to match what we return via JSON.
  3. A simple automated test for MedalData that makes sure future changes to my code or the Kimono scraper don’t break the behavior I’m expecting. This is a great example of how simple an automated test can be.

All the rest of the files in the project are just decoration, configuration, documentation, the boilerplate plumbing that Ruby and Padrino require to do the work. Not that hard, right?
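
If you would rather see the idea than click through, here is a rough Python sketch of the reshaping job described in item 2, plus a tiny test in the spirit of item 3. This is not the project's Ruby code. The raw row shape, the function names, and the ranking rule (golds first, then silvers, then bronzes) are all assumptions made for illustration.

```python
import re


def slugify(name):
    """Turn 'United States' into 'united-states'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def build_standings(raw_rows):
    """Reshape scraped rows into the structure the API returns.

    Assumes each raw row looks like
    {"country": "Germany", "gold": "8", "silver": "16", "bronze": "1"},
    which is a made-up shape, not the real Kimono output.
    """
    rows = []
    for raw in raw_rows:
        gold, silver, bronze = (int(raw[k]) for k in ("gold", "silver", "bronze"))
        rows.append({
            "country_id": slugify(raw["country"]),
            "country_name": raw["country"],
            "gold_count": gold,
            "silver_count": silver,
            "bronze_count": bronze,
            "medal_count": gold + silver + bronze,
        })
    # Assumed ranking rule: most golds first, then silvers, then bronzes.
    rows.sort(key=lambda r: (-r["gold_count"], -r["silver_count"], -r["bronze_count"]))
    for position, row in enumerate(rows, start=1):
        row["rank"] = position  # ties get consecutive ranks in this sketch
    return rows


def test_build_standings():
    """A small automated check on the reshaped output."""
    raw = [
        {"country": "Germany", "gold": "8", "silver": "16", "bronze": "1"},
        {"country": "United States", "gold": "12", "silver": "14", "bronze": "6"},
    ]
    result = build_standings(raw)
    assert [r["country_id"] for r in result] == ["united-states", "germany"]
    assert result[0]["medal_count"] == 32
    assert result[1]["rank"] == 2


if __name__ == "__main__":
    test_build_standings()
    print("ok")
```

The test is only a few lines long, yet it would catch a scraper change that reorders countries or breaks the totals.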

Documenting the API

Developer tool Apiary maintains an open standard for documenting APIs like this one, called the API Blueprint.

I wrote up a similar description as above, but in their specified format, which is shown when a user visits http://olympics.clearlytech.com/.

Simple documentation like this goes a really long way towards convincing others to consume your API. Developers love this stuff.

Deploying It To The World

I decided to launch it on the mind-bogglingly easy Heroku platform. I created a new app, ran some git commands (Heroku manages your code by using the git source control tool that your developers are probably using anyway), and voilà! Instant public application.

Technically, the Heroku app runs at http://olympics-api.herokuapp.com/, but I told it to answer to http://olympics.clearlytech.com/ as well, by putting an entry in my DNS zone, managed by Amazon Route53. This may seem like a lot of moving parts, but wiring this kind of thing up is second-nature stuff to any full-stack developer worth her salt.

The whole process of setting this up on Heroku (including signing up for the service, setting up the app, deploying it, and changing my DNS) took about 10 minutes. There isn’t a faster way right now to deploy a low-volume application for public consumption.


  1. The code at the raw URL is not nicely formatted like our example, but another piece of code consuming this service doesn’t care how pretty it looks.  ↩

Beware of These Five “SaaSumptions”

The intoxicating siren song of a new generation of Software as a Service (SaaS) offerings is a promise that you can bring awesome products to market faster than ever, or run your business operations with dramatically less friction. Heck, it’s what ClearlyTech Recommends is all about. But caveat emptor if you blindly assume that the latest greatest SaaS offering is automatically manna from heaven. Oscar Wilde taught us all what happens when we assume.

And so we present here the five SaaSumptions that may come back to haunt you.

  1. They’re free! Many SaaS offerings take a first-hit-is-free approach. Take the time to understand when the premium part of their “freemium” offer kicks in. Understand how much it will cost and what the ROI is before you do a deep integration.
  2. They will always exist. How sure are you that this partner will be in business longer than you will? How much do you trust them to tell you far enough in advance of an impending shutdown that you will be able to respond gracefully? Note: almost none tell you in their terms of service what happens to your data if they shut down.
  3. I can trust them with my private data. Data is at the heart of everything you do. Sales data, log files, billing invoices, customer information, user generated content, etc. Before you trust a provider to own a copy (or the only copy) of that data, find out if they do backups and have an appropriate disaster recovery plan. Consider doing regular exports of the data into your own backups. And consider what happens if their security is compromised. You will still be liable if all your customer information gets stolen, at least in the eyes of the marketplace, and potentially the law.
  4. Their service is up 24/7. What happens if you are serving traffic, but your partners are down? Maybe you are hosting all your videos on Amazon S3, or your images on Cloudinary. Make sure you know what your site looks like if those providers have a temporary hiccup. Murphy’s law promises that it will happen during your big marketing push, of course. Even if they are up, your SaaS provider might be hosted in another part of the country or the world, meaning that general Internet routing issues might cut you off from their services. Note: most don’t provide rock-solid SLAs, so don’t expect to ever get money back if they go offline.
  5. They totally meet my needs. They might have exactly what you need right now. What happens as you grow and change? Make sure their other customers are like you; that will help ensure they have your best interests in mind. See if their customer support is quick in responding to your specific needs. Can they scale with you? Do they have international support? Be careful you don’t design your product around the services you choose. You owe it to your customers to give them a stellar product, even if you have to get custom to make that happen.

When it comes to picking SaaS partners, remember to pay particular attention to your pillars, as those are the areas where a bet on SaaS partners or a decision to build it yourself will matter the most. Beyond that, do your diligence, and design your dependencies on outside services so that you could shift to another one if business needs warrant it. The buck stops with you.
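
For the developers in the room, here is one hedged way to picture "design your dependencies so that you could shift to another one" in Python. The app talks to a small interface, the provider sits behind it, and a fallback covers outages. Every class name, the example URL, and the `available` flag that simulates an outage are invented for this sketch; no real provider's API is shown.

```python
class ImageHost:
    """Minimal interface the rest of the app depends on."""

    def url_for(self, image_id):
        raise NotImplementedError


class HostedImages(ImageHost):
    """Stand-in for a SaaS image host. `available` simulates an outage."""

    def __init__(self, base_url, available=True):
        self.base_url = base_url
        self.available = available

    def url_for(self, image_id):
        if not self.available:
            raise ConnectionError("image provider unreachable")
        return f"{self.base_url}/{image_id}.jpg"


class LocalPlaceholder(ImageHost):
    """Fallback that always works: one static placeholder served by us."""

    def url_for(self, image_id):
        return "/static/placeholder.jpg"


def image_url(image_id, primary, fallback):
    """Ask the primary provider first; degrade gracefully if it is down."""
    try:
        return primary.url_for(image_id)
    except ConnectionError:
        return fallback.url_for(image_id)


if __name__ == "__main__":
    up = HostedImages("https://images.example.com")
    down = HostedImages("https://images.example.com", available=False)
    placeholder = LocalPlaceholder()
    print(image_url("logo", up, placeholder))
    print(image_url("logo", down, placeholder))
```

Because the rest of the code only ever calls `image_url`, switching providers means writing one new class rather than touching every page that shows an image.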

Don’t be afraid to go with SaaS (it’s a powerful advantage you have as a small company), but be smart about it.

3 Rules for Choosing Startup Technologies

Black, White, and Oh-So-Gray

Despite our more impassioned wishes, technology choices (like all worthwhile choices in life) are rarely black or white.

Show me a piece of technology advice on the Internet, and I’ll show you three pieces of conflicting advice. Such is the nature of both debating on the Internet, and making decisions in a rapidly evolving and still relatively nascent industry.

I see founders all the time paralyzed by technology choices, deciding to wait to build until the right tech falls in their lap. Or deciding to keep up with horrible home-grown solutions just in case some software package might not exactly meet their needs.

How To Make A Technology Choice

When in doubt, follow these three rules:

  1. Choose tech that has lots of customers like you and a thriving ecosystem. Avoid solutions that are only used by the Fortune 500, or by the one pre-revenue startup that hired the guy who built the tech in the first place.
  2. Choose tech that’s specifically designed for your purpose. Avoid solutions that try to solve all your problems with one monolithic package.
  3. Choose tech that your developers are excited about. They will work harder, faster, and more creatively with tools and services that inspire them. Avoid forcing anything on your team that will be an excuse for building anything less-than-awesome.

Embrace the impossibility of a single best choice, and relax. Choose with informed data, and your gut. There are as many ways to solve a given technology problem as there are technologists, so chances are good that whatever informed choice you make will be capable of delivering for you.

Only You Can Choose

Choices aren’t made by committee (at least not smart choices…)

Make a confident choice, and go solve some problems with it!

Best Cloud Infrastructure Provider

New Cloud IaaS providers are coming out of the woodwork these days. Here are ClearlyTech’s current picks for the top-tier solutions you should consider before you deploy to the Cloud.

Large Scale Production Needs

Amazon AWS is the 400lb. gorilla in the space. Their pure-play IaaS product is their Elastic Compute Cloud (EC2). It’s the largest, oldest, and most feature-rich of its peers.

Amazon has had its issues over the years, sometimes bugs, sometimes bizarro pricing, and their share of highly publicized downtime. But they are a giant, and with all that experience comes a maturity that should count a lot when hunting for a reliable partner to run your high-traffic application. They are a tried-and-true solution at this point, warts and all.

Note that Amazon may not be (and probably isn’t, in fact) the best technical solution. Other newer entrants to the game have had the opportunity to watch Amazon’s trials over the past 7 years, and improve in the few areas that Amazon has shown weakness.

I count two auxiliary areas where I’ve seen AWS fall short.

  1. Amazon has terrible customer support for AWS. Even if you pay for the Gold support, I’ve found it’s impossible to get a truly knowledgeable person on the phone, especially during a crisis situation. Amazon appears to be taking the Google approach to support: make sure everything runs right, so you never have to interact with your customers, and when things don’t work, throw them just enough of a bone so they don’t leave while you fix the problem.
  2. Amazon is starting to feel like an aging platform. Network and Disk I/O in particular have inconsistent performance. While they are upgrading all the time and rolling out new tiers of machine (like Provisioned IOPS, which is consistent, but still not particularly fast), and recently SSD drives, AWS is at risk of getting passed by some players investing heavily in newer infrastructure with a fresher architectural approach.

Despite these minor issues, ClearlyTech favors Amazon AWS for large scale production needs. Gartner Group agrees so strongly that AWS lives in a quadrant all its own among cloud hosting providers in their 2013 report.

Gartner 2013 Cloud Hosting Comparison

Another Mature Player

ClearlyTech recommends Rackspace as another long-standing player in the hosting business worth a look. Rackspace was a pioneer in the managed-hosting space (they run your traditional infrastructure), and has recently put a lot of attention into a Virtualized Cloud hosting model to compete with AWS.

If the lack of good customer support from Amazon is a deal-breaker for you, Rackspace is a very appealing alternative. Every time I’ve called Rackspace, a real person answers the phone, is sufficiently technical, and handles my needs quickly and painlessly. Their commitment to customer support shows.

Personal Projects, Prototypes, and Developer Playgrounds

For pure ease of use and low prices, ClearlyTech supports Digital Ocean, the self-described “Simple Cloud Hosting”. They are built on a new, clean technology stack, including all SSD drives, mitigating some of the risk of bad disk I/O slowing down an otherwise sufficient server.

This is a great option if your developers need extra horsepower without a lot of hassle, or if you need to get a prototype up for customer testing on a public IP address with an easy setup/teardown process. They are a pleasure to use, and time will tell whether they keep expanding their offering into something mature enough to use for more serious production deployments.

Your Own Iron

If you want dedicated hardware, you could go to Rackspace. But for that, ClearlyTech supports SoftLayer.

Thousands of reputable companies rely on them for no-frills managed data-center hosting. You may still need some sysadmins to run the show, but SoftLayer consistently has good prices on excellent hardware and extremely reliable hosting.

Some to Watch

There are so many players out there, we won’t even try to mention more than a few alternatives. But we’re keeping an eye on a few:

  • Google Compute and associated services are going to make a run at AWS in the next few years. They’ve learned a lot from watching Amazon, from their own experience with huge scale provisioning of cloud resources, and from App Engine, their initial platform as a service cloud play, which failed to gain traction among serious startups and open-source developers.
  • Microsoft Azure has a tough row to hoe, as fewer and fewer companies are deploying Microsoft technologies to the cloud vs the rapidly evolving open-source stacks. But Microsoft isn’t taking it lying down. We’ll be watching what innovations they can bring to the table, especially given that they control the hardware and the software stack. And if you need to deploy Windows to the cloud, we prefer them over Amazon EC2.

Choosing A Web Framework

Here’s a paraphrase of one of the most common questions non-technical founders ask when they start building a web application, web service, or mobile API: what framework should we be developing in?

In reality, the question usually takes one of these forms:

  • “The outsourced team we hired wants to build in Java, but I’ve heard that newer languages are better, is that right?”
  • “I found a Ruby developer and he wants to use something called Padrino instead of Rails. But isn’t everyone using Rails?”
  • “We built a prototype in PHP, but now we want to rewrite it. No developers want to be doing PHP. Should we switch to Python and Django so we get the best people?”

You know that all frameworks are really just a set of helpful libraries built to be leveraged by a particular programming language. In many ways, the choice of language will have a longer lasting impact on your organization than the choice of framework itself.

Worrying (or not) About Scale

“But I heard framework X doesn’t scale![1] because it’s written in language Y!”
Honestly, don’t pay too much attention to such blanket statements. When it comes to making sure your app can grow to support tens of millions of users, here are two truths:

  • Whether your app will scale has everything to do with your architecture. You should care immensely about how the code is structured, the relationship between web servers, database servers, caching layers, services, and asynchronous queues. A properly architected application will scale in almost any language or framework. When you read about huge-scale applications, you’ll rarely find them talking about what language or framework they used, but rather what architecture works at scale.
  • How efficiently your application scales, on the other hand, may well depend on the language and framework used. For a given application load, you might find that code built in Ruby on Rails requires 10 servers, while the same application running in Erlang on Chicago Boss requires only 2. Or vice versa, to be clear: some kinds of applications suited particularly well to Rails might turn that table.
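
To put rough numbers on the efficiency point, here is a back-of-the-envelope Python sketch. The peak load, the per-server throughput figures, and the monthly price are all hypothetical, chosen only so the result lands on the 10-versus-2 servers comparison; they are not benchmarks of any real framework.

```python
import math


def servers_needed(peak_requests_per_sec, requests_per_sec_per_server):
    """Round up: you cannot rent a fraction of a server."""
    return math.ceil(peak_requests_per_sec / requests_per_sec_per_server)


def monthly_cost(server_count, dollars_per_server_month):
    """Total hosting bill for a fleet of identical servers."""
    return server_count * dollars_per_server_month


if __name__ == "__main__":
    peak = 1000  # hypothetical peak load, requests per second
    price = 150  # hypothetical price per server per month, in dollars
    for label, throughput in (("framework A", 100), ("framework B", 500)):
        count = servers_needed(peak, throughput)
        print(f"{label}: {count} servers, ${monthly_cost(count, price)}/month")
```

With these made-up inputs the gap is eight servers. That is real money at scale, but it only matters once the architecture lets you add servers at all.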

Just Get It Built

While you might be tempted to demand the most resource efficient framework out of the gate, don’t! You’d be making the mistake of premature optimization. You don’t even know if you have product-market fit yet, let alone what the performance profile of your application will be.

Two factors mitigate the risk associated with choosing a “less scalable” web framework.

  1. The ability for your early developers to implement new features and changes rapidly, and with confidence, is absolutely crucial for you to have any chance at your business succeeding.[2]
  2. By the time you have millions of users, your application will most likely be large enough and complex enough that your engineering team will start to rebuild various component pieces using more specialized languages and tools anyway. If you properly architect your application from the start, you won’t be “stuck” forever with any particular language/framework choice.

Facebook is an interesting example. They started in PHP, which was the de facto standard for dynamic web sites back in 2004. Today, as one of the largest web properties in the world, scale is incredibly important to them. A 10% difference in efficiency makes a multi-million dollar difference in hosting costs and user-retention (faster site = more users sticking around). Rather than change languages or frameworks (an arguably impossible task at their scale), Facebook wrote a custom compiler that translates PHP code into C++ code, so it can run without the interpreter layer on their datacenter hardware. So once you get to Facebook’s scale, then you can worry about all sorts of things. For now, use the framework that gets your site actually built the fastest, over the one that runs it the fastest.

Great Developers Trump Great Frameworks

If you don’t hire a rockstar founding engineer primarily because he wants to build your site in CakePHP when you were leaning towards Scala Lift, then you’re a fool.[3]

Developers will work best in the tools they are passionate about, and nowhere is that more true than the framework of choice for major web applications. So listen seriously to the reasons your developers want to use a particular framework. Quiz them. Make sure they understand the pros and cons of their choice. And then, in the absence of other compelling reasons, encourage them to go build with it.

Maturity and Community

Different web frameworks, and for that matter programming languages, have different philosophical approaches. Ideally, the kind of software you are building and your process for building it will be aligned with the philosophy of the framework you choose.

For example, if you are building a high-volume messaging or realtime gaming system, you might be better served with a language that’s designed to be really good at concurrency like Erlang, or perhaps Node.js. If you are building a rich interactive content site, you might find great support and like-mindedness from the Ruby on Rails community. If you are building a banking application, the legacy and enterprise adoption of Java J2EE might be a safe bet that assures you support from heavyweight customers who care about your backend platform choices.

There’s rarely a reason to choose a framework without a great community, great documentation, and a mature set of companies using it to run successful production applications. For whatever framework you choose:

  1. Check out their website. For comparison, here are a few: Rails, Django, ASP.NET, Yii, Play, ExpressJS. Read their main marketing points.
  2. Look for other companies using a given framework. Often the frameworks include lists of companies using them right on their site. Do they represent businesses with products like yours? Businesses you respect? Ones whose applications are fast, modern, and running reliably in your estimation?
  3. Read the documentation for the major frameworks. Even if you don’t understand it all, does it seem comprehensive to you? Do they include lots of guides and clear English descriptions and code samples?

Finally, evaluate the community around a given language however you can. Talk to developers in your network who have used it (can’t track any down? might be a bad sign). Read threads on some mailing lists or Google groups. Search GitHub for open-source projects, check out their activity and bug lists. Look for communities that are productive and friendly and supportive. Be wary of ones that are too exclusive, too new, or too elitist.

The more mature and well-supported a framework is, the longer it will be around. And the higher-quality the community, the more support your developers will get, both by way of documentation and friendly online help from others in the community, and by way of great open-source libraries to help you get more done, more quickly.

A quality community will attract more developers of all levels, which means a better experience for your developers. But also, it means more potential candidates for your business when you look to expand your development team. Candidates looking to build on a modern, fun, and thoughtfully-chosen web framework.

Blah, blah, blah. Just tell me which one is the best?

ClearlyTech will publish a series of articles on various framework choices, so stay tuned. In the meantime, keep an eye on ClearlyTech Recommends for up-to-date suggestions.

Additional Resources


  1. Case in point: Can Rails Scale?  ↩

  2. If you were truly optimizing just for long-term application performance, you could have developers write all your code in x86_64 assembly code. Unfortunately, it would take even the best developers so many years to write a substantial web application in raw assembly, you might as well give up before you even began.  ↩

  3. On the other hand, if his insistence on PHP is because he is afraid of new tools, has stopped learning and growing, or thinks that PHP is a legitimate contender for “best web language of the decade”, then for all our sakes, he never should have gotten as far as the offer stage…  ↩

11 Parts of Every Application Stack

All modern web applications — whether they are delivering a website, powering an API for a mobile application, or providing a B2B web service — are made with the same family of key ingredients. Developers love to apply physical metaphors to our abstract virtual world, so we describe this as the “application stack”.

When discussing the eleven components below with your technology team, ask three questions for each item:

  • Do we need this kind of functionality?
  • If we don’t have it now, when will we need it?
  • Are our needs unique enough to require something other than a tried-and-true solution for this?

For 80% of the items on the list below, your answers to the above questions should be yes, soon, and no.

Without further ado, here’s a list of things you will almost certainly need to address as you build your application:

  1. HTTP Server — your face to the world, the waiter who shepherds URL requests to all the other parts of your stack. A straightforward, but important piece.

    Apache? nginx? See our recommendations…

  2. Web Framework — the server code that powers your site. Picking a language and framework represents one of the largest decisions you’ll make, affecting both the product and the kind of developers you’ll hire and groom.

    Java? Ruby on Rails? Python? PHP? Drupal? Node.js?

  3. Transactional Data Store — one or more databases which will support all the data being read and written by your web framework code. Picking the right data stores for your use case is critical for scale and performance.

    MySQL? MongoDB? Oracle? Redis? Relational? NoSQL?

  4. File Data Store — somewhere to serve downloads, videos, images, and static content.

    RAID array? Amazon S3? Akamai? Dropbox?

  5. Background Queueing — all the stuff that happens when your users aren’t watching. Sending email notifications, calculating leaderboards, pre-processing image uploads.

    RabbitMQ? Resque? ActiveMQ? ZeroMQ? Amazon SQS? Kestrel? Kafka?

  6. Caching Layer — when all that complexity starts to act like quicksand and you need to speed things up.

    Memcache? Redis? Cachely? Ehcache?

  7. Search — searching through text is a very specialized problem, and typically deserves a specialized solution.

    Elasticsearch? Solr? Lucene? Sphinx?

  8. Email / SMS / Push Notifications — outbound messaging to your customers, another specialty field you likely don’t want to build yourself.

    Qmail? Sendgrid? Mailchimp? Urban Airship? PubNub?

  9. Automated Deployment — rapid, reliable updates to your software, operating systems, and configuration, a must-have for any mature product.

    Capistrano? Fabric? Cargo? Make?

  10. Monitoring — keep tabs on the health of both your systems and applications, get alerted when something goes wrong.

    Nagios? Munin? LogicMonitor? Pingdom? ServerDensity? PagerDuty?

  11. Reporting and Analytics — business intelligence that powers data driven decision making for your organization.

    Data warehouse? Dashboards? GoodData? Pentaho? Jasper? Crystal? Google Analytics? OLAP? RedShift?

If you don’t know how your business is handling each of these, ask your tech team. If they don’t know, then raise this list with them, or reach out and ask for help. Or pray.
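
If items 5 and 6 feel abstract, here is a toy Python sketch of both ideas using only the standard library: a cache that remembers a slow result for a while, and a background worker that runs jobs off a queue. The class and function names are invented for illustration. A production stack would use a dedicated caching layer and queueing system like the ones listed above rather than in-process stand-ins.

```python
import queue
import threading
import time


class TTLCache:
    """Tiny in-process cache; a stand-in for a real caching layer."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._items = {}

    def get_or_compute(self, key, compute):
        entry = self._items.get(key)
        now = time.monotonic()
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]          # cache hit
        value = compute()            # cache miss: do the slow work once
        self._items[key] = (now, value)
        return value


def start_worker(jobs, results):
    """Run queued jobs on a background thread until a None sentinel arrives."""
    def loop():
        while True:
            job = jobs.get()
            if job is None:
                break
            results.append(job())
    thread = threading.Thread(target=loop)
    thread.start()
    return thread


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    calls = []

    def slow_leaderboard():
        calls.append(1)
        return ["alice", "bob"]

    cache.get_or_compute("leaderboard", slow_leaderboard)
    cache.get_or_compute("leaderboard", slow_leaderboard)
    print("slow function ran", len(calls), "time(s)")

    jobs, results = queue.Queue(), []
    worker = start_worker(jobs, results)
    jobs.put(lambda: "welcome email sent")
    jobs.put(None)  # sentinel: tell the worker to stop
    worker.join()
    print(results)
```

The point is the shape, not the code: slow work is either remembered (cache) or moved out of the user's way (queue).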