
Flask in Production: Minimal Web APIs

Flask is a popular 'micro-framework' for building web APIs in Python. However, getting a Flask API 'into production' can be a little tricky for newcomers. This post provides a minimal template project for a Flask API, and gives some tips on how to build out basic production Flask APIs.

Mark Douthwaite

Jan 28, 2021 — 22 min read

What is Flask?

If you work in the world of (or a world adjacent to) cloud software and are familiar with Python, the chances are you'll have come across Flask – the excellent, minimal 'micro' web framework that's been at the heart of the Python web community for around a decade. It has found its way into a huge number of applications, reportedly including aspects of the LinkedIn and Pinterest platforms, as well as innumerable other commercial, private and research projects too. This popularity ensures it has a vibrant ecosystem of extensions, documentation and tutorials.

So why another Flask blog post? Over the last five or so years, I've used Flask for dozens of personal and professional projects, and I've learned a lot. However, it can be difficult to find simple, good-quality guides on how to get started with a 'production-ready' Flask app. I thought I'd put together some of my thoughts on what a simple, minimal Flask 'production-ready' application could look like – a (not comprehensive) list of a few things I wish I'd have known over the years. If you have suggestions for improvements to the advice in this post, do make sure to get in touch and let me know.

Additionally, for this post I'm particularly focussing on Flask apps developed to provide APIs as web services. This is a fundamental use-case for many software professionals, and it's increasingly relevant to many Machine Learning practitioners too. It is also something that can be a bit bewildering to newcomers to the software world. This post is therefore aimed at giving you a simple but solid 'production-ready' Flask service template to build upon, and at sharing some rationale for the structure I've provided. I've also given a few basic steps to get the template project deployed to Google Cloud Run. Let's dive in.

A little terminology...

As always with technical fields, it's easy to trip up over the language of APIs. Before we take a look at the template project itself, it's worth taking a quick review of a few of the more important terms to get your head around. If you'd rather skip to the code, then here you go:

What do we mean by 'production-ready'?

The software community uses the word 'production-ready' as an almost talismanic, self-evident term. However, it can be a little unclear specifically what makes a piece of software production-ready, especially to newcomers to the field, and perhaps less technical teams too. So what does it mean (for a software professional) when we say a piece of software is 'production-ready'? It is a little subjective, but I think these are the key elements:

The software satisfies its requirements. Your software does what it is supposed to, and it does it well enough to (typically) generate business value.
The software has been adequately tested. There's a (preferably) automated test suite that effectively exercises your code.
The software is adequately documented. There's sufficient documentation for another practitioner to pick up the software and get started with it.
The software is 'well-architected'. It's extensible, maintainable and scalable. This includes ensuring the software includes graceful error handling.
The software has been peer-reviewed. You've had a fellow practitioner review and sign-off on your code.

Clearly, I can't help you with a few of these points: I can't determine if your software satisfies the requirements you've been given, and I can't check your test suite is sound, your app well-architected and your documentation sufficient. However, I can get you started with a project template that indicates how you might go about structuring your work to satisfy each of these things, where related code and information could/should sit, and template code providing an initial scaffold for you to use to get your project to this standard. That's what I'm aiming for with this post.

What is the difference between an 'API' and a 'service'?

There's a chance you may have only heard the term 'API' used in the context of web services, in which case you may have deemed them synonymous. That's not quite an accurate view.

API - An API is an 'Application Programming Interface', and defines a set of functionality that other applications can access and utilise. This can be exposed through different mediums, including over the web and as the public interface for languages, packages and libraries, for example.
Service - A service is a resource exposed over the web. In other words, it's an entity that can be interacted with over a network. This entity could be a collection of functions, or simply a static file. In practice, this means that a web service is also an API, but it is not the only kind of API.

In other words, a web service exposes an API, but not all APIs are exposed over the web. If you write a Python package, it's correct to say that the package provides an API, too.

What is the difference between a route and an endpoint?

Here's another subtle technical distinction. If you've glanced at a Flask app in the past, you'll no doubt have seen the route decorator, and perhaps heard references to 'routes' in general. If you've read much about web APIs, you may have also heard mention of 'endpoints' too. Sometimes these terms appear to be used almost synonymously. So what do they mean?

Route - The route is the URL. It's a way to locate a resource (a resource being some functionality or information, for example). For example, this might be localhost:8080/users.
Endpoint - An endpoint is one specific end of a communication channel. Importantly, there might be multiple endpoints for a single route. For example, GET localhost:8080/users and DELETE localhost:8080/users would be two distinct endpoints for the single route localhost:8080/users. This has important implications for how you might choose to design your APIs.
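To make the route/endpoint distinction concrete, here is a minimal sketch of one route (/users) serving two endpoints, one per HTTP method. The in-memory USERS list and the handler names are assumptions made purely for illustration, not part of the template.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory store, used purely for illustration.
USERS = ["jane", "joe"]


@app.route("/users", methods=["GET"])
def list_users():
    """Endpoint 1: GET /users returns the current users."""
    return jsonify(USERS)


@app.route("/users", methods=["DELETE"])
def delete_users():
    """Endpoint 2: DELETE /users empties the store. Same route, different method."""
    USERS.clear()
    return "", 204
```

Any method you haven't registered for the route (POST, in this sketch) is rejected by Flask with a 405 Method Not Allowed response.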

That'll do for terminology for now. Let's take a look at the code!

Enter the template

For the rest of this post, I'm going to be referring to a GitHub template repository that I've put together to demonstrate what I regard as some of the essential elements of a minimal production-ready Flask API. You can find the repository over here:

To get set up, you can either create a new repository from this template directly, or you can fork the repository to your own account and clone it from there instead. Whatever works for you! Finally, if you'd like to deploy your API with Google Cloud Run, make sure to check out the google-cloud-run branch. All set? Great!

Project structure

Next question: what have you just pulled down? Here's a high-level run-through of a few of the most important aspects of the project's structure you're looking at:

bin - As you might expect, this directory is where any directly executable files for the project are stored. By default, there is a file in here called run.sh. This file contains the bash command that will initialise a Gunicorn server and expose your Flask app. If this doesn't make much sense yet, don't worry, we'll cover it later!
docs - This directory should store any documentation about your API. You'll see a file called swagger.yaml in this directory. This stores a standard specification for your API. In practice, this can make it easier for other teams and services to make use of your API, and you should consider making one for every API you build. We won't be digging into Swagger or the OpenAPI standard in this post, but you can find out more on the Swagger website.
requirements - This directory stores your classic Python requirements.txt-type files. Note that there are actually two files in here: common.txt and develop.txt. This helps break out the dependencies you'll need when deploying your API (the essential dependencies are stored in common.txt) from the dependencies you'll need while developing your API (these are stored in develop.txt and are typically testing tools and styling/linting utilities). In other words, when developing your API, you'll need to install both common.txt and develop.txt, but when deploying your API, you can install just common.txt.
api - This directory stores the source code for your API. The file app.py sets up your Flask app and creates your app's 'routes'. We'll look at this in more detail later. You'll also see the errors.py file. This stores a Flask Blueprint for handling errors in your API code. We're going to take a look at Blueprints shortly. Finally, the handlers directory should store your model handlers. In other words, this is where you provide functions to load and query your models. We'll come back to this shortly too.
Dockerfile - This file contains the definition of a particular Docker image. In other words, it contains commands to build a Docker image, which can then be run as an isolated container. If this doesn't make much sense to you, you can think of it simply as a way of defining the specific environment you want your code to run in. I'm not going to be covering the basics of Docker here, so if Docker and container technologies are new to you, make sure to check out the Docker documentation, plus there's some great introductory tutorials on YouTube, if you're interested. Having an understanding of Docker will come in handy if you follow the deployment steps at the end of this post.
Makefile - This file provides a simple mechanism for running useful commands for your project. We'll touch on a few of these later in this post.
tests - This directory stores files for your test suite, including a basic example showing how to use pytest to test your Flask app, and a minimal locustfile that you can use to load test your API too.
wsgi.py - This file is your API's WSGI entrypoint. More on this in a moment.

Make sense? Great! Let's dig a little deeper.

Getting to know Flask

While Flask is a compact framework, there are still a few important concepts that can really help you build cleaner, more scalable applications, so it is well worth understanding a few of these a little better. Doing so can be the difference between having a neat, performant API, and a spaghettified, spluttering mess of an API. I'd also like to emphasise that this is far from all there is to Flask, but I think these are a few areas that you might find particularly useful for basic APIs.

Application routes

It's worth starting at the very beginning. One of the most basic concepts in Flask (and web services generally) is the route. In Flask, routes are declared with decorators that indicate that a specific function should be executed when a given route is called. I'm not going to go into the technical details of decorators here, but if you aren't familiar with them, it's well worth reading some of the material out there explaining what they do in more detail.

Take a look at the example below:

import flask

app = flask.Flask(__name__)

@app.route("/")
def home():
    return "You are home"

This code is creating a Flask app with a single route. This route tells the app to execute the home function when the / route is called. The Flask route decorator can be used to indicate and unpack URL parameters. For example, here's how you could define a new route to extract a name parameter from a URL:

@app.route("/users/<name>")
def user(name):
    return f"Hey, {name}!"

If you were then to call the route /users/Jane, this handler would return Hey, Jane!. As you might see, these are the building blocks of any web service, and by extension, understanding the capabilities of the route decorator will set you well on your way to building APIs.
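URL parameters can also be typed. As a small sketch of this, the snippet below repeats the name route and adds a hypothetical /orders route (an assumption for illustration, not part of the template) that uses Flask's built-in int converter, so the handler receives an integer and non-numeric values never reach it.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/users/<name>")
def user(name):
    # <name> is captured as a string by default.
    return f"Hey, {name}!"


@app.route("/orders/<int:order_id>")
def order(order_id):
    # The int converter only matches digits and passes order_id in as an int.
    return f"Order {order_id}"
```

With this in place, a request to /orders/abc does not match any route, so Flask responds with a 404 rather than calling the handler with bad input.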

Application Blueprints

Another key concept in the Flask framework is the idea of a Blueprint. Blueprints are great. They can be used to compartmentalise your API. For example, let's say you want to create an API to manage your online store. You'll want a handful of key functions in there. Three obvious groups of functions would be functions to handle your users, your products and your checkout process. You can easily capture this behaviour with the route decorator discussed above. You could create individual routes in your app.py for the following:

Figure 1: Example routes you may expect to see on a shopping API. These would be individual 'standard' routes in your Flask app.

However – while a contrived example – there's clearly some well-defined subsets of functionality in this API. You could capture these subsets as standalone Python modules, each defining a distinct Blueprint for that subset. One way of compartmentalising this functionality into Blueprints could be as follows:

Figure 2: Example routes for a shopping API, where the colour indicates functionality grouped into an individual Blueprint and injected into the Flask app.

You can then register each of these Blueprints in your 'core' app. Registering a Blueprint immediately adds all routes defined in the Blueprint to the 'parent' Flask app. The benefit here is that you can quickly add and remove entire subsets of a Flask app's functionality, and develop subsets in isolation from each other. As you might imagine, this can be extremely useful when you're developing more complex APIs.

To see a practical example of Blueprints in action, take a look at the api/errors.py file in the template repository. This is a super practical use-case for Blueprints. In this case, the Blueprint defines a simple pattern to gracefully handle uncaught errors in your API. It uses a special type of decorator, an error handler, that will, usefully, handle errors in your app. Again, I'm not going to go into detail on the specifics of how decorators work, but suffice it to say that this special decorator catches unhandled Exceptions in your app. In this case it returns a stock string response and an error code (500).

This Blueprint is registered in the main app with the line:

app.register_blueprint(errors)

This injects the error handling behaviour defined in your Blueprint into your app. In other words, your whole app can now make use of this Blueprint to catch and handle otherwise uncaught errors wherever they may crop up. This will work on all routes in your app. Cool, eh?
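As a rough sketch of the pattern described above (not a copy of the template's own file, which may differ), an error-handling Blueprint could look something like the following. It uses Flask's app_errorhandler decorator so the handler applies to the whole app, and it passes HTTP errors such as 404 through unchanged. The /boom route and the response text are assumptions for illustration.

```python
from flask import Blueprint, Flask, Response
from werkzeug.exceptions import HTTPException

errors = Blueprint("errors", __name__)


@errors.app_errorhandler(Exception)
def server_error(error):
    """Handle any otherwise uncaught exception raised anywhere in the app."""
    if isinstance(error, HTTPException):
        # Let ordinary HTTP errors (404, 405, ...) keep their own status code.
        return error
    return Response("Something went wrong.", status=500)


app = Flask(__name__)
app.register_blueprint(errors)


@app.route("/boom")
def boom():
    # Hypothetical route that always fails, to exercise the handler.
    raise RuntimeError("unexpected failure")
```

Calling /boom now returns the stock 500 response instead of leaking a traceback, while a request to an unknown route still returns a normal 404.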

Why use Gunicorn?

One thing that can sometimes trip up newcomers to the world of Flask is the need to use a WSGI-compliant web server to wrap Flask apps. You should not deploy a Flask app using the bundled Flask development server. In practice, this development server can be handy for quickly testing your application, but it isn't designed for high-demand applications, and therefore is unlikely to play nicely when you push it into a production environment.

Instead, Flask is designed to be used with a WSGI-compliant web server. For this post (and in the template), I've used Gunicorn. It's a solid piece of kit. However, it isn't the only option: there's Twisted and uWSGI too, for example. If you'd like to find out more about your deployment options, Flask comes with some excellent pointers on how to prepare some of these alternatives:

Deploy to Production — Flask Documentation (1.1.x)


Within the template repository, the configuration of Gunicorn with Flask is handled by two files. The first is the bin/run.sh bash script. If you inspect this file, you'll see a single command. This command points a Gunicorn server at your Flask app, binds the app to 0.0.0.0:8080 (i.e. port 8080 on all network interfaces, which includes localhost), sets your app to log at debug level (i.e. log everything), and initialises the server with 4 workers (i.e. the number of worker processes that will be initialised for handling requests to your API, and therefore roughly the number of requests that can be handled in parallel). This latter parameter (workers) is often tied to the number of available cores on the server you're running your API on: one worker per core is a simple choice, and the Gunicorn documentation suggests (2 x cores) + 1 as a starting point.
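Gunicorn can also read these settings from a Python config file instead of command-line flags. The sketch below is a hypothetical gunicorn.conf.py that mirrors the settings described above; the template itself uses the bin/run.sh script, so treat this file as an illustrative alternative rather than part of the project.

```python
# gunicorn.conf.py: a hypothetical config file mirroring the settings described above.
import multiprocessing

# Listen on port 8080 on every network interface.
bind = "0.0.0.0:8080"

# Log everything while debugging; "info" is a quieter choice once deployed.
loglevel = "debug"

# One worker process per available core, rather than a hard-coded 4.
workers = multiprocessing.cpu_count()
```

You could then start the server with something like gunicorn -c gunicorn.conf.py wsgi:app, where wsgi:app assumes your entrypoint module is wsgi.py and exposes an object named app.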

The second file of interest is the wsgi.py Python module. What's that all about?

WSGI and WSGI Entrypoints

The wsgi.py is typically referred to as a WSGI entrypoint. What is WSGI, you ask? It stands for 'Web Server Gateway Interface', and – in short – it's a specification defining how a web server can interact with Python applications.

In the case of this example project, it is simply a file that lets your web server (in the case of this example: Gunicorn) hook into your application. You'll see and possibly need to provide similar WSGI entrypoints whenever you find yourself working with common Python web frameworks (including Django, for example).

In the case of this example, I've followed a relatively standard convention of having this file be separate from the definition of the 'core' Flask app. However, you can keep all of your code together in a single file (e.g. in api/app.py) if you so wish.
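To demystify the interface a little, here is a minimal sketch of a bare WSGI application written with only the standard library, plus a tiny helper that plays the role of the server. The names application and call_wsgi are assumptions for illustration; neither appears in the template.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI app: takes the request environ, returns an iterable of bytes."""
    body = b"Hello from raw WSGI"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_wsgi(app, path="/"):
    """Play the server's part: build an environ, call the app, collect the output."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

A Flask app object is itself a callable with this (environ, start_response) signature, which is why a WSGI server such as Gunicorn only needs to be told where to import it from.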

Getting setup

At this point, you're probably about ready to spin up a server. Fair enough. First, you'll need to install the development requirements for this project. To do this, in the top level of the project directory, you should run:

make develop

This will install the testing and styling tools, plus the core Python packages needed to run your app. When these have installed, you can start your server by running:

make start

This will start your Gunicorn server (exposing your Flask app). Before continuing, make sure you've installed curl (a command line tool for interacting with your API). You can check your server has successfully started up by running (possibly in a separate terminal window):

curl localhost:8080/health

You should get the response OK. This /health route provides a simple means of checking if your API is still responsive. Some cloud services require that these routes exist, and in general, it's a good idea to set them up for your own monitoring (and sanity!). Clearly, the example project comes with one bundled. You can now call:

curl localhost:8080

And you should see the response Hello, world! with the default handler. You have a working, minimal Flask API running on your local machine. Nice. Now, what if you wanted to do something a little more complicated?
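For reference, the two routes exercised above could be defined along these lines. This is a sketch under the assumption that each handler simply returns a fixed string; the template's own api/app.py may be organised differently.

```python
from flask import Flask, Response

app = Flask(__name__)


@app.route("/health")
def health():
    """Health check: deliberately cheap, so it answers whenever the app can respond."""
    return Response("OK", status=200)


@app.route("/")
def main():
    """Default handler for the root route."""
    return Response("Hello, world!", status=200)
```

Keeping the health handler free of database calls or other dependencies means it reports on the process itself rather than on everything the process talks to.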

Modifying your API

As you might expect, many 'real-world' web APIs are going to be at least a little more complicated than that. So how can you go about extending your application? There's a few ways, but here's a couple of basic suggestions.

Creating new routes and handlers

The obvious option is to simply create more routes. For example, take a look at the snippet below:

from flask import jsonify, request

# ... existing routes and handlers

@app.route("/custom", methods=["POST"])
def custom():
    """A handler that receives a POST request with a JSON payload, and may or may not return a friendly JSON-formatted response!"""
    # Parse the JSON body sent with the request.
    payload = request.get_json()

    if payload.get("say_hello") is True:
        output = jsonify({"message": "Hello!"})
    else:
        output = jsonify({"message": "..."})

    return output

This snippet adds the /custom route to your app and sets it to accept only requests made with the POST method. In this case, the custom function attempts to extract JSON data from the request made to the server, so it expects a JSON-formatted body to be sent with the request. Finally, it returns a JSON-formatted response.

If you add the above snippet (remember the additional import statement!) to your api/app.py file and restart your server, you can then call this new route with:

curl -X POST 'localhost:8080/custom' --data-raw '{"say_hello": true}' --header 'Content-Type: application/json'

And you should see the response:

{
    "message": "Hello!"
}

As you can see, you've added a slightly more complicated piece of functionality to your API with that small addition. You may also be able to imagine how you could build up an increasingly complex API by adding more routes and associated functions. However, for a large application (i.e. API), adding lots of routes this way can become burdensome. Fortunately, that's where Flask's Blueprints (discussed earlier) come in.
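One thing the handler above does not cover is a request whose body is missing or isn't valid JSON. As a hedged sketch of one way to handle that, the variant below uses the silent=True option of get_json, which returns None instead of raising, and answers with a 400. The error message wording is an assumption for illustration.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/custom", methods=["POST"])
def custom():
    # silent=True makes get_json return None on a bad or missing JSON body.
    payload = request.get_json(silent=True)

    if not isinstance(payload, dict):
        # Reject anything that isn't a JSON object with a client error.
        return jsonify({"error": "Expected a JSON object in the request body."}), 400

    message = "Hello!" if payload.get("say_hello") is True else "..."
    return jsonify({"message": message})
```

Returning a 400 here tells the caller the problem is with their request, rather than letting the failure surface as a generic server error.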

Compartmentalising your application

Time to make the API a little more complicated still. Let's say you want to add a whole bunch of functionality related to manipulating user data. As discussed earlier, you could create a separate Blueprint to capture this functionality.

To see how, create a new Python file in the api directory called users.py. In this file, copy the following snippet:

from flask import Blueprint, Response


users = Blueprint("users", __name__, url_prefix="/users")


@users.route("/list")
def index():
    return Response("Looks like there are no registered users!")


@users.route("/purchases")
def history():
    return Response("Looks like there are no purchases!")

In this Blueprint, you're adding two new routes (/list and /purchases, respectively) to your Blueprint. Importantly, the url_prefix parameter in the Blueprint initialisation indicates that the routes for this Blueprint should be mounted to the /users prefix. Concretely, this means the routes /list and /purchases will be available at /users/list and /users/purchases when calling the app after the Blueprint is registered. To register the Blueprint, you can include the following code in your api/app.py file:

# ... existing imports

from .users import users

app = Flask(__name__)  # you don't need this twice.
# ... register other blueprints
app.register_blueprint(users)

# ... 'core' handlers

When you now restart your Gunicorn server, you'll be able to call:

curl localhost:8080/users/list

And see the response Looks like there are no registered users! in this case. Practically, then, all of the routes defined in the users module have now been added to your API. Hopefully you can see how you could apply this pattern to each major subset of functionality in your API, which in turn would help you develop a nicely compartmentalised, maintainable web service.

Updating your test suite

As you saw earlier, the tests directory contains your test code for your API. I've sketched out a super basic test suite built with pytest and locust. The former is a great testing framework that you can use to test functional properties of your API (e.g. your API implements required functionality), while the latter is used to test non-functional properties of your API (e.g. your API can handle expected traffic).

I'll be giving an in-depth look at locust and pytest for web APIs in future articles, so I'm going to cheekily skip over them here. However, if you'd like to boot up locust and have a play with its web-based UI in the meantime, you can run:

make load-test

Similarly, you can run the template pytest test suite with:

make test

Of course, if you're updating the routes in your API as in the previous sections, you need to make sure to update the test suite to capture these changes. In fact, if you're being good and following a Test Driven Development (TDD) workflow, you should add tests to tests/test_api.py (and any other test files you add) before you write any extensions to your core API code. The tests/test_api.py file is set up to (hopefully) make writing tests for your API that little bit easier.
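As an illustration of what such tests might look like for the users Blueprint from the previous section, here is a minimal, self-contained sketch using Flask's built-in test client. The make_client helper and the test names are assumptions; the template's own tests/test_api.py may be structured differently.

```python
from flask import Blueprint, Flask, Response

users = Blueprint("users", __name__, url_prefix="/users")


@users.route("/list")
def index():
    return Response("Looks like there are no registered users!")


def make_client():
    """Build a fresh app with the Blueprint registered and return a test client."""
    app = Flask(__name__)
    app.register_blueprint(users)
    return app.test_client()


def test_users_list_is_reachable():
    response = make_client().get("/users/list")
    assert response.status_code == 200
    assert b"no registered users" in response.data


def test_unknown_users_route_returns_404():
    response = make_client().get("/users/does-not-exist")
    assert response.status_code == 404
```

Because the functions follow the test_ naming convention, pytest will discover and run them without any extra configuration, and no real server needs to be running.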

Deploying to Google Cloud Run

At this point, we've touched on most of the key files in the project and related concepts, so let's take the final step and deploy the minimal API as a live web service. If you want to follow along with this last step, make sure you've checked out the google-cloud-run branch of the template, and (if you haven't already) sign up for a free account with Google Cloud. At the time of writing you can get $300 of free credit for signing up, which will more than cover the costs of anything you'll do here. Do remember to disable your account afterwards if you don't intend to do any more work on Google Cloud.

Be aware too that you need to make sure your API runs locally before continuing, so if you've made any changes to the files in the template project while following along, make sure to check those changes work before pushing on.

Ready? Let's deploy.

Understanding your `Dockerfile`

Before deploying to Cloud Run, you'll need to build and push a Docker image to Google Cloud. That means it's useful to understand what's going on in that mysterious Dockerfile you'll see in the project. Being familiar with container technologies will definitely stand you in good stead when you're trying to deploy your services.

However, this post isn't about Docker, and I won't be going into detail here. I'm going to assume some basic familiarity with Docker and Dockerfiles. If you're not familiar with Docker, there are loads of great resources online and I encourage you to check them out. It also goes without saying: make sure you've downloaded Docker for your system before continuing.

Now, onto your Dockerfile itself. First of all, you need to specify the base image you want to build your image from. This is captured in the line:

FROM python:3.7.8-slim

This tells Docker you'd like to use the official Python 3.7.8 image as the base for your new image. In other words, this sets up a standard Python 3.7.8 environment. The -slim suffix tells Docker to use the 'slimline' version of this image. This removes some 'clutter' included in the 'full' Python 3.7.8 image. Practically, this helps keep your image size small, which can help you keep costs down and performance up when your API is in service.

It's worth noting that if you're creating an API with very minimal dependencies (including system dependencies), you should consider using the alpine base image instead. This is extra 'slimline'. However, alpine-based images can cause a bit of a headache for new users: alpine images typically don't ship with common compilers and other tools. This is great if you don't need them, but can be fiddly if you're not confident in how to add such tools to an image. For example, many numerical libraries depend on certain compilers and specialised libraries that don't ship in alpine images.

Next up, it's time to install dependencies. This is handled by the lines:

COPY requirements/common.txt requirements/common.txt
RUN pip install -U pip && pip install -r requirements/common.txt

This snippet copies the requirements/common.txt file from your system environment into the image. We then upgrade pip and install the requirements in the requirements/common.txt file. This means we're only installing the dependencies we need to run your service and not installing dependencies used only for development (like the testing tools, for example).

You may see some example images elsewhere that use COPY . . instead. This copies everything in your current working directory (actually: Docker context) into the image. This is a bad idea, as while Docker does its best to avoid re-running steps that it doesn't need to, it does this by looking for changes in files in each 'layer' as the image builds. If you use the COPY . . command followed by RUN pip install -r requirements.txt, Docker will be forced to reinstall all of your requirements each time it builds whenever you make a change to any file in your current working directory. By copying your requirements in isolation, Docker will instead only reinstall dependencies in this file (or files) when you update the files themselves. This can save you a lot of time.

The image then copies all relevant API code and the WSGI entrypoint into the container, into a directory that isn't at the top level of the image's filesystem, and sets this directory as its working directory. This ensures your code executes away from important system files. Additionally, by adding a user and assuming the role of this user with:

RUN useradd demo
USER demo

You limit the access your app code has to the system. This can be useful for various security and usability reasons, and is regarded as good practice.

Finally, the image exposes port 8080 (as required by Cloud Run), and runs the bash script discussed above to boot up the Gunicorn server. With that, you have a containerised API. You can check your API is running nicely by executing:

docker build . -t flask-demo
docker run -t -p 8080:8080 flask-demo

This will build and then run your containerised web service on port 8080 on your local machine. You should then be able to once again run:

curl localhost:8080/health

To confirm your API is running and healthy inside your Docker container. With that done, you're ready to push to Cloud Run.

Pushing to Cloud Run

Before pushing, you'll need to make sure you have GCP tools installed. When you've done this, navigate to the top-level directory of the template (the one with the cloudbuild.yaml on the google-cloud-run branch). You may need to log in to your GCP account from the command line, so make sure you've done that before continuing. You'll also need to make sure the environment variable PROJECT_ID has been set. If you're unsure where to find the value of this variable, take a look at Google's help docs. Next, run:

gcloud builds submit

This will trigger the Google Cloud Build tool, which will first push your project files to Google Cloud Storage, then build your image in the cloud. It'll then spin up your API as a containerised web service from this image in Cloud Run. The great thing about Cloud Run is that it manages a lot of the network settings and routing for you, saving you a lot of potentially fiddly work. After a few minutes, you should see a success message.

You should now be able to call:

curl {your-cloud-run-url}/health

Here, {your-cloud-run-url} is the URL shown when your deployment has completed. You should once again see the response OK. That's it, your API is live and ready to receive requests from the public! Not too shabby.

Next steps

It goes without saying that this template is far from the end of the story. As with any production system, building out a 'prime-time' web service can involve many other tools, technologies and processes. However, there's a few obvious next steps you might want to account for depending on how you're intending to deploy your API. These are:

If you're not making use of a managed service like Google Cloud Run or AWS Fargate, you should consider setting up some form of reverse-proxy for your API. This could be an NGINX (pronounced engine x) server, for example. DigitalOcean have a good guide to explain how this could work for you (on their platform!).
I'd strongly recommend adding some Continuous Integration/Continuous Delivery (CI/CD) functionality to this project too. For most small projects, the free allowance from GitHub for their GitHub Actions is likely to be sufficient for your needs, and a great place to start.
If you're looking to build a REST API with Flask, you should check out Flask RESTful. While Flask itself can be used to build REST APIs, Flask RESTful provides a few useful odds and ends to get you started that bit faster. There's a great tutorial on building a RESTful API over on Miguel Grinberg's blog.
I've not discussed authentication here, but clearly it is a good idea – at least in most applications – to add an authentication layer to your APIs. If you're using Cloud Run, there are a few options on that front.

And that's it for this post! As always, feedback is very welcome, so feel free to drop me a line on LinkedIn or Twitter.
