20 November 2022
How serverless works
Last time, we figured out how the cloud works at the lowest level. Now I suggest looking at the cloud from the other side. I think everyone knows what serverless functions are (if you don't, read on anyway - you will find out :). One example of a serverless technology is Azure Functions, a powerful tool that lets you solve many computational problems easily and elegantly. So let's figure out how serverless functions work and whether there really are no servers there...
Serverless computing is an example of a PaaS (Platform as a Service) offering. The idea of the serverless approach is that the service provider, without the user's involvement, manages and configures the physical infrastructure, virtual machines, storage systems, and the environment that runs the client’s code.

The user also gets automatic scaling and fault tolerance. Serverless services are designed so that your code keeps running even if one or several data centers fail. If your code is called more often over time, the service takes on the task of scaling the infrastructure it runs on.

Let's consider a company that rents out electric scooters. Its system needs to collect telemetry from the rental stations. In practice, stations may fail, and over time their number will grow. This means that the traffic from the stations will vary: the amount of data will be sometimes large, sometimes small, but on average it will increase over time. With serverless functions, the system reacts to all these changes automatically and runs fewer or more function instances depending on the volume of traffic. The company does not need to do anything for this.
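
To make the scaling idea more concrete, here is a minimal sketch of how a platform might estimate the number of function instances for a given load. It assumes each instance handles one request at a time, so the number of requests in flight is roughly the arrival rate times the average duration. The function name, the traffic numbers, and the cap of 200 instances are all hypothetical and are not taken from any real service.

```python
import math


def instances_needed(requests_per_second: float,
                     avg_duration_seconds: float,
                     max_instances: int = 200) -> int:
    """Estimate how many concurrent function instances a load requires.

    Assumes one request per instance at a time, so the number of requests
    in flight is roughly arrival rate times average duration.
    """
    in_flight = requests_per_second * avg_duration_seconds
    # Scale to zero when idle, never exceed the (hypothetical) platform cap.
    return min(max_instances, math.ceil(in_flight))


# Hypothetical telemetry traffic: a quiet night, a morning peak, an outage.
for label, rps in [("night", 2), ("morning peak", 150), ("outage", 0)]:
    print(label, instances_needed(rps, avg_duration_seconds=0.25))
```

A real scheduler works from measured metrics rather than a single formula, but the direction is the same: more traffic means more instances, and no traffic means none.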

The company will not even have to worry about server failures, because the service behind the functions is designed to tolerate them. The company can be confident that its system will adapt to changes in the flow of data, and that all data will be processed. With IaaS, many tasks, such as adjusting the system to changing loads and handling software errors or hardware failures, would fall on the company's employees.

The concept of serverless computing is useful when you need to quickly create a REST API, write a chatbot, process a message queue, process data from a device, etc. The approach lets you stop worrying about virtual machines, operating systems, runtime settings, scaling, and fault tolerance - all of these tasks are handled by the service. At the same time, the solution is often cheaper than using virtual machines, especially under irregular load, because you pay only for the time your code runs.

However, the concept imposes two important restrictions on the user's code:

  • The function's run time must be limited. It cannot be designed to run forever in a loop.
  • The function cannot store state between runs. Each time the function is launched, it has no information about previous runs.
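
Both restrictions can be illustrated with a small sketch. It assumes a hypothetical external store (here just a dictionary standing in for a database or queue) and a hypothetical time limit of five seconds. On a real platform the limit is enforced by the service itself, not by your code.

```python
import time

# Hypothetical stand-in for an external store (a database, queue, or blob
# storage). Only data written here survives between invocations.
EXTERNAL_STORE: dict[str, int] = {}


def count_telemetry(station_id: str, deadline_seconds: float = 5.0) -> int:
    """Handle one event, keeping the running total outside the function."""
    started = time.monotonic()

    # Read the previous state from the store, never from a local variable.
    total = EXTERNAL_STORE.get(station_id, 0) + 1
    EXTERNAL_STORE[station_id] = total

    # A real platform enforces the limit itself; this check only shows that
    # the work is expected to be bounded.
    if time.monotonic() - started > deadline_seconds:
        raise TimeoutError("function exceeded its time limit")
    return total


print(count_telemetry("station-7"))
print(count_telemetry("station-7"))
```

The function does a bounded amount of work per call, and everything it needs to remember goes to the store, so any worker can handle the next call.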

Azure Functions is a classic example of serverless, combining all the advantages of this approach.

When using the Azure Functions service, you choose the runtime environment (essentially the programming language), the trigger that will launch the function, and add the function code itself. That's enough to get your code up and running. Sounds very easy, doesn't it? Here you can find a step-by-step guide on how to create a simple Python function.

But let's finally get back to our main topic. Let's take a look at how functions are structured internally.
A function is a logical container with a set of parameters. It is similar to a familiar Docker container, only simpler and smaller. And this container needs to be executed somewhere.
So does the server exist after all? Of course it does. It's just invisible to the user, which is why the approach is called serverless.

The architecture of the service looks something like this:

The service uses virtual machines internally, but this is hidden from the user. The key parts of the service are special routers and schedulers. They determine how many virtual machines are needed to run user code and select a machine for each function call. These components are what let users benefit from the serverless approach: the service takes over many routine tasks that the user would otherwise have to do to run code, such as:

  • Selecting virtual machine parameters based on the complexity of the code and creating the machine.
  • Installing and configuring the operating system, such as Ubuntu.
  • Installing additional software, such as the Python runtime environment.
  • Configuring the software to run user code, for example setting environment variables or downloading additional libraries such as boto3 for working with a queue service.
  • Running the code by connecting to the virtual machine and starting it manually or writing a script to do it automatically.

All internal and external requests to call functions arrive at a load balancer, which forwards each request to a router in one of the availability zones (there are probably several routers per zone, with a local balancer in front of them). The router then asks a scheduler (there are also several per zone) which worker should run the function. A worker is a virtual machine with an isolated execution environment for each function. Once the router receives the worker's address from the scheduler, it sends that worker a command to run the function with the given parameters.

Where does the worker come from? From the scheduler. Schedulers analyze the number and duration of function executions, and start and stop virtual machines with different execution environments. If a suitable worker already exists for an incoming request, the function is launched on it. If not, a new worker is launched to serve the request.
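
The interaction between the router, the scheduler, and the workers can be sketched roughly like this. It is a toy model under simplifying assumptions: one scheduler, workers represented by plain objects instead of virtual machines, and no load balancer. The class and function names (Worker, Scheduler, route) are hypothetical and do not reflect the real internals of any service.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Worker:
    """Stands in for a VM with an isolated environment for one function."""
    address: str
    function_name: str
    busy: bool = False


@dataclass
class Scheduler:
    """Tracks workers and decides where each call should run."""
    workers: list[Worker] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def acquire(self, function_name: str) -> tuple[Worker, bool]:
        # Reuse an idle worker that already hosts this function (warm start).
        for worker in self.workers:
            if worker.function_name == function_name and not worker.busy:
                worker.busy = True
                return worker, False
        # Otherwise launch a new worker for this request (cold start).
        worker = Worker(f"worker-{next(self._ids)}", function_name, busy=True)
        self.workers.append(worker)
        return worker, True

    def release(self, worker: Worker) -> None:
        worker.busy = False


def route(scheduler: Scheduler, function_name: str, payload: dict) -> str:
    """Router: ask the scheduler for a worker, then run the call on it."""
    worker, cold = scheduler.acquire(function_name)
    try:
        kind = "cold" if cold else "warm"
        return f"{function_name}({payload}) on {worker.address} [{kind}]"
    finally:
        scheduler.release(worker)


scheduler = Scheduler()
print(route(scheduler, "collect_telemetry", {"station": 7}))
print(route(scheduler, "collect_telemetry", {"station": 8}))
print(route(scheduler, "send_invoice", {"ride": 42}))
```

In this sketch the first call to each function is a cold start on a new worker, while the second call to collect_telemetry reuses the worker that is already running.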

There are two important points to note here:

  • Functions should start as quickly as possible. Most of the startup time is spent starting the virtual machine and configuring it. But even in this case, launching the function typically takes only a few hundred milliseconds.
  • Functions of different users can run on the same virtual machine. So it is extremely important to isolate them from each other.

The only thing left is to understand how to call a function. Usually, there are many integrations with cloud services for this.

In the case of Azure Functions, the list looks like this:

The simplest and most accessible are HTTP calls and timers.

In the case of HTTP, you need to send an HTTP request with certain parameters to some endpoint and the system will automatically call your function.
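
Here is a minimal sketch of what happens on the platform side for an HTTP trigger: the request URL is matched to a function, and the query parameters are passed to it. The routing table, the /api/hello path, and the example.invalid host are hypothetical. A real service also handles authentication, HTTP methods, and request bodies.

```python
import json
from urllib.parse import parse_qs, urlparse


def hello(params: dict) -> dict:
    """The user's function: receives request parameters, returns a result."""
    return {"message": f"Hello, {params.get('name', 'world')}!"}


# Hypothetical routing table kept by the platform: endpoint path -> function.
ROUTES = {"/api/hello": hello}


def handle_request(url: str) -> tuple[int, str]:
    """Map an incoming request URL to a function call and a JSON response."""
    parsed = urlparse(url)
    function = ROUTES.get(parsed.path)
    if function is None:
        return 404, json.dumps({"error": "no such function"})
    # parse_qs returns a list per key; keep the first value of each parameter.
    params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    return 200, json.dumps(function(params))


print(handle_request("https://example.invalid/api/hello?name=scooter"))
print(handle_request("https://example.invalid/api/missing"))
```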

In the case of timers, you can set up a periodic function call at specified time intervals.
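
A timer trigger can be pictured as a list of future moments at which the platform calls your function. The sketch below assumes a simple fixed interval. Real services usually accept a cron-style schedule expression instead, and the function and schedule here are hypothetical.

```python
from datetime import datetime, timedelta


def next_runs(start: datetime, interval: timedelta, n: int) -> list[datetime]:
    """Return the next n trigger times for a fixed-interval timer."""
    return [start + interval * i for i in range(1, n + 1)]


def report() -> str:
    """The user's function that the timer calls."""
    return "collected scooter statistics"


start = datetime(2022, 11, 20, 12, 0)
for run_at in next_runs(start, timedelta(minutes=15), 3):
    # A real platform would wait until run_at; here the call is immediate.
    print(run_at.isoformat(), report())
```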

And of course, the function can also be called “by hand” by clicking a button in the cloud UI.

I hope that now you understand why serverless functions have this name and how they work.