PolarSPARC |
Introduction to Asyncio in Python
Bhaskar S | 07/26/2020 |
Overview
asyncio is a popular library for writing concurrent asynchronous applications in Python.
One may wonder why we need one more addition to the existing list of choices (multiprocessing, threading, concurrent.futures, etc.). The simple answer: asyncio is a safer and less error-prone concurrency paradigm than the threading model, which is non-trivial to use correctly and is susceptible to race conditions.
Before we go further, it is worth clarifying the difference between Concurrency and Parallelism, as it is often a source of some confusion.
In our fictitious example, Alice has to run a few errands - go to a Pharmacy to get a prescription filled, take-out pizza for dinner from Bob's Pizza, and get some items from the Grocery store. Each of the three tasks takes some time.
The following illustration depicts the act of Alice performing each of the three tasks sequentially:
A more optimal approach for Alice would be to go to the Pharmacy to submit the prescription, then head to Bob's Pizza to place an order for take-out, and then go to the Grocery store to pick up the items. Once she has completed the grocery shopping, she will go back to pick up the pizza, as it will be ready by then, and finally pick up the prescription. This is an example of Concurrency.
The following illustration depicts the act of Alice performing each of the three tasks concurrently:
If Alice gets help from Charlie to get the prescription and the pizza while she gets the groceries, that is an example of Parallelism. The following illustration depicts the act of both Alice and Charlie dividing and performing the tasks in parallel:
In summary, Concurrency means multiple tasks can run in an overlapping manner, while Parallelism means multiple tasks run at the same time independently.
The recommended pattern is to use Concurrency (concurrent.futures and/or threading) for I/O-intensive workloads (networking, storage, etc.) and Parallelism (multiprocessing) for compute-intensive workloads.
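As a rough illustration of the I/O-bound case, here is a minimal sketch that runs Alice's three errands on a thread pool. The function names `errand` and `run_errands` and the 0.2 second waits are assumptions made up for this sketch; `time.sleep` merely stands in for a blocking I/O call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def errand(name, seconds):
    # time.sleep stands in for a blocking I/O wait (network, disk, ...)
    time.sleep(seconds)
    return name


def run_errands():
    # Hypothetical errands, each waiting 0.2 seconds
    errands = [("pharmacy", 0.2), ("pizza", 0.2), ("grocery", 0.2)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(errand, name, secs) for name, secs in errands]
        done = [future.result() for future in futures]
    return done, time.perf_counter() - start


if __name__ == "__main__":
    names, elapsed = run_errands()
    print(f"{names} finished in {elapsed:.2f} secs")
```

Because the three waits overlap, the total elapsed time should be close to the longest single wait rather than the sum of all three.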
Now that we have clarified the difference between Concurrency and Parallelism, we turn our attention to Asynchronous. So, what is it?
Asynchronous is a simpler concurrency paradigm that uses a single thread within a single process, along with cooperative (non-preemptive) multitasking, to let the different tasks take turns making progress. When a task has to wait (for example, on I/O), it voluntarily yields control so that another ready task can move forward. In other words, tasks overlap each other, giving the illusion that they are all running at the same time - it is *NOT* parallel, but concurrent.
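The turn-taking can be seen in a minimal sketch like the one below. The `worker` coroutine and the shared `log` list are hypothetical names used only for illustration; `asyncio.sleep(0)` is used as a way to hand control back to the event loop without actually waiting.

```python
import asyncio
import threading


async def worker(name, steps, log):
    for step in range(1, steps + 1):
        # Record which thread runs this step, then yield control to the event loop
        log.append((name, step, threading.get_ident()))
        await asyncio.sleep(0)


async def main():
    log = []
    # Both workers are driven by the same event loop and interleave at each await
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


if __name__ == "__main__":
    for name, step, thread_id in asyncio.run(main()):
        print(f"{name} step {step} on thread {thread_id}")
```

Every entry reports the same thread id, and the two workers alternate steps: no task is interrupted against its will, it only gives up control at an `await`.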
From our example, once Alice gets the groceries, she picks up the pizza as it is ready, then she goes to check on the prescription and realizes it is not ready yet. So she heads off to get a cup of coffee and checks back later. The following illustration depicts the act of Alice performing the three tasks in an asynchronous manner:
Now that we have explained what we mean by Asynchronous, we will start to dig a little deeper into the asyncio library in Python.
The core components of asyncio are as follows:
Component | Description |
---|---|
Event Loop | Manages the execution of a set of Python functions and switches between them as they block and unblock |
Coroutines | Special Python functions that behave like generators and yield control back to the event loop when they block |
Tasks & Futures | Objects that represent the state of the coroutine(s) that may or may not have completed execution. A task is a subclass of future. A task object can be used to monitor the status of the underlying coroutine |
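
To make the relationship between these components a little more concrete, here is a minimal sketch. The coroutine name `compute` and the value 42 are made up for illustration; the calls themselves (`asyncio.iscoroutine`, `asyncio.create_task`, `task.done`) are standard asyncio.

```python
import asyncio


async def compute():
    await asyncio.sleep(0.01)
    return 42


async def main():
    # Calling a coroutine function only creates a coroutine object; nothing runs yet
    coro = compute()
    is_coro = asyncio.iscoroutine(coro)
    # Wrapping it in a Task schedules it on the running event loop
    task = asyncio.create_task(coro)
    done_before = task.done()
    result = await task
    return is_coro, isinstance(task, asyncio.Future), done_before, task.done(), result


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The task reports that it is not done right after it is created, and done once it has been awaited to completion.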
Setup
For the demonstration, we will be using Python version 3.8 or above.
Hands-on Asyncio
The following is the first simple example demonstrating the use of asyncio in Python:
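A minimal sketch of such a program is given here, assuming two coroutines named `main_task` and `sub_task`. The names and messages are taken from the output shown for example-01.py; the bodies and the 0.1 second pause are assumptions, not the original listing.

```python
import asyncio
import time


async def sub_task():
    print(time.ctime(), "sub_task - World")
    # Assumed short pause; awaiting it suspends sub_task and returns control to the event loop
    await asyncio.sleep(0.1)
    print(time.ctime(), "sub_task - Hello")


async def main_task():
    print(time.ctime(), "main_task - Hello")
    # Awaiting the coroutine runs sub_task to completion before main_task resumes
    await sub_task()
    print(time.ctime(), "main_task - World")


if __name__ == "__main__":
    asyncio.run(main_task())
```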
Executing example-01.py produces the following output:
    Sun Jul 19 20:20:07 2020 main_task - Hello
    Sun Jul 19 13:20:07 2020 sub_task - World
    Sun Jul 19 13:20:07 2020 sub_task - Hello
    Sun Jul 19 13:20:07 2020 main_task - World
The following are brief descriptions for some of the keyword(s) and method(s) used in example-01.py above:
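async :: used with def to define a coroutine function. Calling it returns a coroutine object, which can be suspended and later resumed
await :: suspends the current coroutine until the awaited object completes, letting the event loop run other coroutine(s) in the meantime. This keyword takes an awaitable (a coroutine, a task, or a future) as its operand
asyncio.run() :: function that runs the given coroutine on a new event loop and returns its result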
The following illustration depicts the execution of example-01.py:
The following example demonstrates a variation of the previous example but using the event loop explicitly:
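A minimal sketch of this variation follows, under stated assumptions: the coroutine bodies and the 1 second pause are guesses based on the output shown for example-02.py, and `run_with_explicit_loop` is a hypothetical helper name. The sketch creates the loop with `asyncio.new_event_loop()` instead of `asyncio.get_event_loop()`, because newer Python releases deprecate relying on `get_event_loop()` to create a loop implicitly.

```python
import asyncio
import time


async def sub_task():
    print(time.ctime(), "sub_task - Mundo")
    await asyncio.sleep(1)  # assumed pause, matching the 1 second gap in the output
    print(time.ctime(), "sub_task - Hola")


async def main_task():
    print(time.ctime(), "main_task - Hola")
    await sub_task()
    print(time.ctime(), "main_task - Mundo")


def run_with_explicit_loop(coro):
    # Create the event loop, drive the coroutine to completion, then release the loop
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


if __name__ == "__main__":
    run_with_explicit_loop(main_task())
```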
Executing example-02.py produces the following output:
    Sun Jul 19 13:25:44 2020 main_task - Hola
    Sun Jul 19 13:25:44 2020 sub_task - Mundo
    Sun Jul 19 13:25:45 2020 sub_task - Hola
    Sun Jul 19 13:25:45 2020 main_task - Mundo
The following are brief descriptions for some of the keyword(s) and method(s) used in example-02.py above:
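asyncio.get_event_loop() :: returns the current event loop. In Python 3.8 it creates one if none exists in the main thread; newer Python releases deprecate that implicit creation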
loop.run_until_complete() :: executes the given coroutine until it runs to completion
loop.close() :: typically called at the end to clean and release the resources
Under the hood, the call to asyncio.run() performs steps similar to those in example-02.py, with one difference: it always creates a new event loop rather than calling asyncio.get_event_loop(). It then invokes the method run_until_complete() on the given coroutine, and finally calls the method close() on the loop.
The following example demonstrates three tasks executed concurrently by the event loop:
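A minimal sketch of such a program is given here. The message format and the task names P1, P2, and P3 follow the output shown for example-03.py; the `task` and `main` bodies, the optional `delays` parameter, and the way the random delays are drawn are assumptions, not the original listing.

```python
import asyncio
import random
import time


async def task(name, delay):
    print(f"{time.ctime()}:: {name} [{delay}] - step - 1")
    # The sleep stands in for an I/O wait; other tasks run while this one is suspended
    await asyncio.sleep(delay)
    print(f"{time.ctime()}:: {name} [{delay}] - step - 2")


async def main(delays=None):
    # Assumed: one random delay between 0.1 and 1.0 seconds per task unless delays are given
    delays = delays or [random.randint(1, 10) / 10 for _ in range(3)]
    print("Starting ...")
    # gather() takes the awaitables as positional arguments, hence the unpacking
    await asyncio.gather(*(task(f"P{i}", d) for i, d in enumerate(delays, start=1)))
    print("Done !!!")


if __name__ == "__main__":
    asyncio.run(main())
```

All three tasks print their first step immediately, and the second steps appear in order of increasing delay.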
Executing example-03.py produces the following output:
    Starting ...
    Sun Jul 19 13:31:29 2020:: P1 [0.6] - step - 1
    Sun Jul 19 13:31:29 2020:: P2 [0.9] - step - 1
    Sun Jul 19 13:31:29 2020:: P3 [0.3] - step - 1
    Sun Jul 19 13:31:29 2020:: P3 [0.3] - step - 2
    Sun Jul 19 13:31:30 2020:: P1 [0.6] - step - 2
    Sun Jul 19 13:31:30 2020:: P2 [0.9] - step - 2
    Done !!!
The following is a brief description of the method used in example-03.py above:
asyncio.gather() :: runs the given awaitable(s), passed as positional arguments, concurrently. If all of them complete successfully, it returns a list of their results in the order in which they were passed in, not the order in which they finished
The following illustration depicts the execution of example-03.py:
The following example demonstrates returning values from coroutine(s):
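A minimal sketch of such a program follows. The printed messages mirror the output shown for example-04.py; the random integer result, the default delays, and the function bodies are assumptions, not the original listing.

```python
import asyncio
import random
import time


async def task(name, delay):
    print(f"{time.ctime()}:: {name} [{delay}] - step - 1")
    await asyncio.sleep(delay)
    print(f"{time.ctime()}:: {name} [{delay}] - step - 2")
    result = random.randint(1, 10)  # assumed: a random integer stands in for real work
    print(f"{time.ctime()}:: {name} [{delay}] - result - {result}")
    return result


async def main(delays=(0.1, 0.7, 0.8)):
    print("Starting ...")
    # create_task() schedules each coroutine right away and returns a Task handle
    t1 = asyncio.create_task(task("P1", delays[0]))
    t2 = asyncio.create_task(task("P2", delays[1]))
    t3 = asyncio.create_task(task("P3", delays[2]))
    print("Ready to start tasks t1, t2, and t3...")
    # Each await expression evaluates to the value returned by the wrapped coroutine
    r1 = await t1
    r2 = await t2
    r3 = await t3
    for label, value in (("t1", r1), ("t2", r2), ("t3", r3)):
        print(f"Result from {label}: {value}")
    print("Completed Tasks t1, t2, and t3 !!!")
    print("Done !!!")
    return r1, r2, r3


if __name__ == "__main__":
    asyncio.run(main())
```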
Executing example-04.py produces the following output:
    Starting ...
    Ready to start tasks t1, t2, and t3...
    Sun Jul 19 13:36:58 2020:: P1 [0.1] - step - 1
    Sun Jul 19 13:36:58 2020:: P2 [0.7] - step - 1
    Sun Jul 19 13:36:58 2020:: P3 [0.8] - step - 1
    Sun Jul 19 13:36:58 2020:: P1 [0.1] - step - 2
    Sun Jul 19 13:36:58 2020:: P1 [0.1] - result - 5
    Sun Jul 19 13:36:58 2020:: P2 [0.7] - step - 2
    Sun Jul 19 13:36:58 2020:: P2 [0.7] - result - 5
    Sun Jul 19 13:36:59 2020:: P3 [0.8] - step - 2
    Sun Jul 19 13:36:59 2020:: P3 [0.8] - result - 7
    Result from t1: 5
    Result from t2: 5
    Result from t3: 7
    Completed Tasks t1, t2, and t3 !!!
    Done !!!
The following are brief descriptions for some of the keyword(s) and method(s) used in example-04.py above:
asyncio.create_task() :: it schedules the given coroutine to be run in the event loop and immediately returns a task object that wraps the specified coroutine. The task object can be used to monitor the status of the coroutine
await :: awaiting a coroutine or task also evaluates to the value that it returns, which is how results are retrieved from a coroutine
The following example demonstrates returning values from coroutine(s) as they complete, the quickest one first, then the next quickest, and so on:
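A minimal sketch of such a program is given here. The messages follow the output shown for example-05.py; returning a `(name, result)` tuple, the default delays, and the function bodies are assumptions, not the original listing.

```python
import asyncio
import random
import time


async def task(name, delay):
    print(f"{time.ctime()}:: {name} [{delay}] - step - 1")
    await asyncio.sleep(delay)
    print(f"{time.ctime()}:: {name} [{delay}] - step - 2")
    result = random.randint(1, 10)  # assumed: a random integer stands in for real work
    print(f"{time.ctime()}:: {name} [{delay}] - result - {result}")
    return name, result


async def main(delays=(1.0, 0.6, 0.9)):
    print("Starting ...")
    tasks = [asyncio.create_task(task(f"P{i}", d)) for i, d in enumerate(delays, start=1)]
    print("Ready to get values from tasks t1, t2, and t3...")
    finished = []
    # as_completed() hands back awaitables in completion order, quickest first
    for next_done in asyncio.as_completed(tasks):
        name, result = await next_done
        print(f"Result from {name}: {result}")
        finished.append(name)
    print("Completed Tasks t1, t2, and t3 !!!")
    print("Done !!!")
    return finished


if __name__ == "__main__":
    asyncio.run(main())
```

The returned list holds the task names in the order they finished, which depends only on the delays and not on the order in which the tasks were created.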
Executing example-05.py produces the following output:
    Starting ...
    Ready to get values from tasks t1, t2, and t3...
    Sun Jul 19 13:44:57 2020:: P1 [1.0] - step - 1
    Sun Jul 19 13:44:57 2020:: P2 [0.6] - step - 1
    Sun Jul 19 13:44:57 2020:: P3 [0.9] - step - 1
    Sun Jul 19 13:44:58 2020:: P2 [0.6] - step - 2
    Sun Jul 19 13:44:58 2020:: P2 [0.6] - result - 8
    Result from P2: 8
    Sun Jul 19 13:44:58 2020:: P3 [0.9] - step - 2
    Sun Jul 19 13:44:58 2020:: P3 [0.9] - result - 6
    Result from P3: 6
    Sun Jul 19 13:44:58 2020:: P1 [1.0] - step - 2
    Sun Jul 19 13:44:58 2020:: P1 [1.0] - result - 3
    Result from P1: 3
    Completed Tasks t1, t2, and t3 !!!
    Done !!!
The following is a brief description of the method used in example-05.py above:
asyncio.as_completed() :: returns an iterator of awaitable(s); awaiting each one in turn yields the results in the order in which the coroutine(s) complete
The following example demonstrates returning values from coroutine(s) within a specific duration or else timeout:
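A minimal sketch of such a program follows. The messages and the 0.5 second limit follow the output shown for example-06.py; wrapping `asyncio.gather()` in `asyncio.wait_for()`, the `delays` and `timeout` parameters, and the function bodies are assumptions, not the original listing.

```python
import asyncio
import random
import time


async def task(name, delay):
    print(f"{time.ctime()}:: {name} [{delay}] - step - 1")
    await asyncio.sleep(delay)
    print(f"{time.ctime()}:: {name} [{delay}] - step - 2")
    result = random.randint(1, 10)  # assumed: a random integer stands in for real work
    print(f"{time.ctime()}:: {name} [{delay}] - result - {result}")
    return name, result


async def main(delays=None, timeout=0.5):
    # Assumed: one random delay between 0.1 and 1.0 seconds per task unless delays are given
    delays = delays or [random.randint(1, 10) / 10 for _ in range(3)]
    print("Starting ...")
    print(f"Ready to get values within {timeout} secs ...")
    try:
        # wait_for() cancels the gather (and with it the three tasks) if the timeout expires
        results = await asyncio.wait_for(
            asyncio.gather(*(task(f"P{i}", d) for i, d in enumerate(delays, start=1))),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        print(f"Could not complete the tasks in under {timeout} secs")
        return None
    print(f"Completed Tasks t1, t2, and t3 - {results}")
    return results


if __name__ == "__main__":
    asyncio.run(main())
    print("Done !!!")
```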
Executing example-06.py produces the following output for a successful run:
    Starting ...
    Ready to get values within 0.5 secs ...
    Ready to get values from tasks t1, t2, and t3 ...
    Sun Jul 19 13:52:36 2020:: P1 [0.3] - step - 1
    Sun Jul 19 13:52:36 2020:: P2 [0.4] - step - 1
    Sun Jul 19 13:52:36 2020:: P3 [0.1] - step - 1
    Sun Jul 19 13:52:36 2020:: P3 [0.1] - step - 2
    Sun Jul 19 13:52:36 2020:: P3 [0.1] - result - 2
    Sun Jul 19 13:52:36 2020:: P1 [0.3] - step - 2
    Sun Jul 19 13:52:36 2020:: P1 [0.3] - result - 6
    Sun Jul 19 13:52:36 2020:: P2 [0.4] - step - 2
    Sun Jul 19 13:52:36 2020:: P2 [0.4] - result - 6
    Completed Tasks t1, t2, and t3 - [('P1', 6), ('P2', 6), ('P3', 2)]
    Got all values !!!
    Done !!!
Executing example-06.py produces the following output for a timeout run:
    Starting ...
    Ready to get values within 0.5 secs ...
    Ready to get values from tasks t1, t2, and t3 ...
    Sun Jul 19 13:53:02 2020:: P1 [1.0] - step - 1
    Sun Jul 19 13:53:02 2020:: P2 [0.7] - step - 1
    Sun Jul 19 13:53:02 2020:: P3 [0.2] - step - 1
    Sun Jul 19 13:53:02 2020:: P3 [0.2] - step - 2
    Sun Jul 19 13:53:02 2020:: P3 [0.2] - result - 4
    Could not complete the tasks in under 0.5 secs
    Got all values !!!
    Done !!!
The following is a brief description of the method used in example-06.py above:
asyncio.wait_for() :: waits for the given awaitable to complete within a timeout (in seconds). If the timeout expires, the awaitable is cancelled (here, cancelling the gathered task(s) along with it) and asyncio.TimeoutError is raised
The following example demonstrates the ability to cancel a task that wraps a coroutine if it exceeds an SLA:
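A minimal sketch of such a program is given here. The messages follow the output shown for example-07.py; the coroutine name `process_data`, its `db_delay` and `work_delay` parameters, the `sla` parameter, and the randomly sampled dataset are all assumptions, not the original listing.

```python
import asyncio
import random
import time


async def process_data(db_delay, work_delay):
    print(f"{time.ctime()}:: [{db_delay}] - retrieve from DB...")
    await asyncio.sleep(db_delay)
    print(f"{time.ctime()}:: [{work_delay}] - process dataset...")
    await asyncio.sleep(work_delay)
    data = random.sample(range(1, 100), 10)  # assumed stand-in for a real dataset
    print(f"{time.ctime()}:: [{work_delay}] - result - {len(data)}")
    return data


async def main(db_delay, work_delay, sla):
    print("Ready to start data processing...")
    task = asyncio.create_task(process_data(db_delay, work_delay))
    # Give the task until the SLA expires, then check on it
    await asyncio.sleep(sla)
    if task.done():
        print(f"Completed data processing: {task.result()}")
        return task.result()
    print("Cancelling data processing due to SLA breach...")
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        # Awaiting a cancelled task re-raises the CancelledError here
        print("Task processing cancelled !!!")
    return None


if __name__ == "__main__":
    asyncio.run(main(db_delay=0.2, work_delay=0.2, sla=1.0))
```

Calling `main` with delays that add up to more than `sla` takes the cancellation path instead.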
Executing example-07.py produces the following output for a successful run:
    Ready to start data processing...
    Sun Jul 19 14:03:49 2020:: [0] - retrieve from DB...
    Sun Jul 19 14:03:49 2020:: [2] - process dataset...
    Sun Jul 19 14:03:51 2020:: [2] - result - 10
    Completed data processing: [1, 53, 30, 18, 65, 27, 5, 51, 64, 37]
Executing example-07.py produces the following output for a timeout run:
    Ready to start data processing...
    Sun Jul 19 14:04:18 2020:: [9] - retrieve from DB...
    Cancelling data processing due to SLA breach...
    Task processing cancelled !!!
The following are brief descriptions of some of the method(s) used in example-07.py above:
task.done() :: returns True if the task is done, that is, it has returned a result, raised an exception, or been cancelled
task.cancel() :: requests cancellation of the underlying coroutine. This causes an asyncio.CancelledError to be thrown into the coroutine the next time it gets to run