Profiling your Web application
In this article, we will learn how to profile a Python web application and identify the factors that contribute to its performance. So, what is profiling? Profiling lets us find the bottlenecks in our application, so that we can do the least amount of work and get the biggest possible performance gain.
Any measurable resource of a system can be profiled: CPU time, memory usage, network bandwidth, disk I/O, and so on. With profiling, you can identify the biggest contributing factors and address them first to reach the desired performance.
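For CPU time specifically, Python ships a profiler in the standard library. A minimal sketch using `cProfile` and `pstats` (the `workload` function here is a made-up example, not from the application above):

```python
import cProfile
import io
import pstats

def workload():
    # Deliberately wasteful work so something shows up in the profile.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Print the five most expensive calls, ordered by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The printed report lists each function with its call count and time, which is exactly the kind of data the tools discussed below build on.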
In a web application, we usually measure the response time of an API call. We can create a simple decorator that measures the execution time of an API function. A sample decorator can be written as follows:
```python
from functools import wraps
from time import time

def timer(fn):
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time()
        result = fn(*args, **kwargs)
        t2 = time()
        print(f"@timer: {fn.__name__} took {t2 - t1} seconds")
        return result
    return measure_time
```
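Applying the decorator looks like this; `get_users` is a stand-in name for one of your real API handlers:

```python
from functools import wraps
from time import time

def timer(fn):
    # Same decorator as above, repeated so this example is self-contained.
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time()
        result = fn(*args, **kwargs)
        t2 = time()
        print(f"@timer: {fn.__name__} took {t2 - t1} seconds")
        return result
    return measure_time

@timer
def get_users():
    # Stand-in for a real API handler.
    return ["alice", "bob"]

users = get_users()  # prints "@timer: get_users took ... seconds"
```

Because of `@wraps`, the wrapped function keeps its original `__name__`, so the printed label matches the handler you are timing.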
Adding the above timer decorator lets you capture the time taken by each API function. With this, you can identify the time-consuming APIs and concentrate on improving them.
If your application is built using Flask and Flask-RESTful, profiling gets much easier. We just need to add the following two lines before `app.run()`:
```python
from werkzeug.middleware.profiler import ProfilerMiddleware
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[5], profile_dir='.')
```
`ProfilerMiddleware` takes two parameters here: `restrictions`, which limits the number of calls displayed in the console, and `profile_dir`, the directory in which the profile files are created.
The console log looks like the following:
```
--------------------------------------------------------------------------------
PATH: '/'
         654 function calls (630 primitive calls) in 0.001 seconds

   Ordered by: internal time, call count
   List reduced from 175 to 5 due to restriction <5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     13/1    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
       25    0.000    0.000    0.000    0.000 {method 'split' of 're.Pattern' objects}
        1    0.000    0.000    0.000    0.000 {built-in method _socket.gethostname}
        1    0.000    0.000    0.000    0.000 C:\ProgramData\Anaconda3\envs\microservices\lib\site-packages\werkzeug\http.py:470(parse_accept_header)
       12    0.000    0.000    0.000    0.000 C:\ProgramData\Anaconda3\envs\microservices\lib\site-packages\werkzeug\local.py:162(top)
--------------------------------------------------------------------------------
```
On my system, the profile file was created with the following name: `GET.root.0ms.1625933197.prof`
Wondering whether the console log above makes any sense? Even I find it difficult to pick out the required stats from a long console log. Fortunately, there is a tool that gives us a visual representation of this data: snakeviz. All we have to do is run the following two commands in a terminal:
```
pip install snakeviz
snakeviz <profile_file_name>
```

snakeviz starts a local web server and prints something like:

```
snakeviz web server started on 127.0.0.1:8080; enter Ctrl-C to exit
http://127.0.0.1:8080/snakeviz/service2%5Capp%5CGET.root.0ms.1625933197.prof
```
snakeviz then opens the report in a web browser, where you can interactively inspect each function and its computation time. This is a good starting point for understanding the call flow of an API request, seeing where the time goes, and concentrating on the parts that consume most of it. A sample screenshot of the report is shown below:
Profiling is a very large area, and this is just the start. In future articles, I will try to cover how to profile other measurable resources.
Let me know if this was useful and if there are any other tools that I could test out.