Deploying an NLP model on the web using FastAPI

This API receives unpunctuated sentences in the English language and returns them as punctuated.

The goal: deploy a model on the web that accepts unpunctuated English text and returns punctuated text.


There are several open-source APIs for speech recognition, and in my experience, the following two work really well:

  1. WebSpeechAPI
  2. Vosk API

WebSpeechAPI is meant to integrate voice data into your web apps and Vosk has two kinds of models - small and big. The small models are meant for mobile applications while the big ones are for usage on servers. This article is not meant for the comparison between the two APIs, so I will keep it limited to what's important for this post.

Both of them have one significant limitation: no punctuation. This means your transcribed text will have no commas, full stops, question marks, or exclamation marks.

However, in the last couple of years several open-source punctuation models have become available for the English language.

After trying all of these, I found that the Deep Multilingual Punctuation model worked the best: great accuracy and fast inference time, even on CPU. Although the dedicated page for this Python package cautions that the model is trained on a dataset of political speeches and might perform differently on texts from other domains, we have never had issues with our text from different domains. So, in a nutshell, it's a fairly accurate punctuator model.

Problem Statement

I have a punctuator model working on my server, and I could choose to deploy it right on my machine or deploy it on the web as an API. I chose the latter option since it is far more modular. Now, I had to find a way to deploy this model such that I can make a POST request with my unpunctuated text (i.e. the output of WebSpeechAPI or the Vosk API) and receive punctuated text in return.

Approach

Use FastAPI.

Implementation Steps

1. The code for running the punctuation model

The code to run the model is provided in its Python package, so please follow the steps provided here.

Please test it on your development environment before proceeding to the next step.

2. Deployment of the ML model using FastAPI on a local development environment

Let's first install FastAPI:

pip install fastapi

Next, install the ASGI server Uvicorn:

pip install "uvicorn[standard]"

Now, create the application:
from fastapi import FastAPI, Request
from deepmultilingualpunctuation import PunctuationModel

app = FastAPI()

model = PunctuationModel()

# we use the /api endpoint to accept POST requests
@app.post("/api/")
async def get_en_punctuated(request: Request):
    data = await request.json()
    text_str = data["text"]
    ans = model.restore_punctuation(text_str)
    return ans

Put this code in main.py and run the server with uvicorn main:app --reload

You will find Uvicorn running on (the default local address).

Now, let's make a request to the API with the following code:

import requests
import ast

data = {'text': 'hello how are you my name is Alex I am living in US i work for a construction company how are you i hope things are going well at your end '}

# default local Uvicorn address with the /api endpoint defined above
url = ""

response = requests.post(url, json=data)
ans = response.text
ans = ast.literal_eval(ans)

This will produce the result: hello, how are you? my name is Alex. I am living in US. i work for a construction company. how are you? i hope things are going well at your end.
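A note on the ast.literal_eval step: since the endpoint returns a plain string, FastAPI serializes it as JSON, so response.text arrives wrapped in double quotes. ast.literal_eval works here because a JSON string literal is also a valid Python string literal, though response.json() (equivalently, json.loads) is the more conventional way to decode it. A quick illustration using only the standard library:

```python
import ast
import json

# what response.text looks like when the API returns a plain string
raw = '"hello, how are you?"'

# both decoders recover the same unquoted string
assert ast.literal_eval(raw) == "hello, how are you?"
assert json.loads(raw) == "hello, how are you?"
```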

3. Deployment of the ML model as a FastAPI app on a remote server such as GCP/DigitalOcean

I use DigitalOcean as my hosting provider. Deploying FastAPI is pretty much the same as deploying a Flask app with Nginx and Gunicorn, so if you are not sure what to do, I recommend following this tutorial.

The package list from the requirements.txt file is as follows:


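A minimal set for the stack used in this post would look something like this (my sketch, with version pins omitted; adjust to your actual environment):

```
fastapi
uvicorn[standard]
gunicorn
deepmultilingualpunctuation
requests
```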
Let's create the systemd service unit file. Creating a systemd unit file will allow Ubuntu's init system to automatically start Gunicorn and serve the FastAPI application whenever the server boots. The Description and ExecStart lines below are from my setup; the section headers and remaining lines follow the standard layout from the tutorial linked above, so adjust the user and paths to yours.

[Unit]
Description=Gunicorn instance to serve myproject
After=network.target

[Service]
User=raghav
Group=www-data
WorkingDirectory=/home/raghav/myproject
Environment="PATH=/home/raghav/myproject/myprojectenv/bin"
ExecStart=/home/raghav/myproject/myprojectenv/bin/gunicorn --bind unix:myproject.sock -m 007 main:app --worker-class uvicorn.workers.UvicornWorker

[Install]
WantedBy=multi-user.target

As you can see, my code file is placed inside the /home/raghav/myproject/ directory, and myprojectenv is my Python virtual environment. Note that the worker class uvicorn.workers.UvicornWorker is required to run a FastAPI (ASGI) app under Gunicorn, which is otherwise a WSGI server.

Now, you can deploy it on your domain name quite easily by following Step 5: Configuring Nginx to Proxy Requests from here.
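For reference, the resulting Nginx server block from that step looks roughly like this (a sketch assuming the socket path from my setup; your_domain is a placeholder for your own domain name):

```
server {
    listen 80;
    server_name your_domain www.your_domain;

    location / {
        include proxy_params;
        proxy_pass http://unix:/home/raghav/myproject/myproject.sock;
    }
}
```

After adding this, test the configuration with nginx -t and reload Nginx for it to take effect.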

I hope this helps. If you face any difficulty, please feel free to send me your queries.