Multi-threading in Python

Process:

A process is an instance of a computer program that is being executed. Any process has 3 basic components:


An executable program.

The associated data needed by the program (variables, work space, buffers, etc.)

The execution context of the program (State of process)


Thread:

A thread is an entity within a process that can be scheduled for execution. 

It is also the smallest unit of processing that an operating system can schedule.


A thread is a sequence of instructions within a program that can be executed independently of other code.

For simplicity, you can assume that a thread is simply a subset of a process!


A thread keeps the following information in a Thread Control Block (TCB):


Thread Identifier: a unique ID (TID) is assigned to every new thread.

Stack pointer: Points to thread’s stack in the process. Stack contains the local variables under thread’s scope.

Program counter: a register that stores the address of the next instruction the thread will execute.

Thread state: can be running, ready, waiting, start or done.

Thread’s register set: registers assigned to thread for computations.

Parent process pointer: a pointer to the Process Control Block (PCB) of the process that the thread belongs to.


Multiple threads can exist within one process where:


Each thread contains its own register set and local variables (stored in stack).

All threads of a process share global variables, heap memory, and the program code.
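
The sharing rules above can be seen in a small sketch. The names shared_results and worker are made up for illustration: each thread gets its own local variable on its own stack, while the module-level list is visible to every thread.

```python
import threading

shared_results = []  # module-level object: visible to every thread in the process


def worker(num):
    # local_square lives on this thread's own stack; other threads cannot see it
    local_square = num * num
    # appending to the shared list makes the result visible to all threads
    shared_results.append(local_square)


threads = [threading.Thread(target=worker, args=(n,)) for n in range(1, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The order in which threads finish is not guaranteed, so sort before printing
print(sorted(shared_results))  # [1, 4, 9]
```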


Context switching:


On a simple, single-core CPU, multithreading is achieved by switching frequently between threads.

This is termed context switching. In context switching, the state of one thread is saved and the state of another thread is loaded whenever an interrupt (due to I/O or manually set) takes place.


Context switching takes place so frequently that all the threads appear to be running in parallel (this is termed multitasking).
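
One rough way to observe this is to let two threads wait at the same time. In the sketch below, time.sleep stands in for a blocking I/O call, and the helper names wait_for_io and run_concurrently are hypothetical. While one thread waits, the other gets to run, so the total time is close to one delay rather than two. Exact timings depend on the machine.

```python
import threading
import time


def wait_for_io(delay):
    # time.sleep stands in for blocking I/O; the thread gives up the CPU while it waits
    time.sleep(delay)


def run_concurrently(delay=0.3, count=2):
    """Start `count` waiting threads and return the total wall-clock time."""
    threads = [threading.Thread(target=wait_for_io, args=(delay,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_concurrently()
print("Elapsed: {:.2f}s".format(elapsed))
```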


# Python program to illustrate the concept of threading

# importing the threading module
import threading


def print_cube(num):
    """
    function to print cube of given num
    """
    print("Cube: {}".format(num * num * num))


def print_square(num):
    """
    function to print square of given num
    """
    print("Square: {}".format(num * num))


if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()

    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()

    # both threads completely executed
    print("Done!")


Critical Section:


Thread synchronization is a mechanism that ensures two or more concurrent threads do not simultaneously execute a particular program segment, known as the critical section.

A critical section is a part of the program that accesses a shared resource.




Race Condition:


A race condition occurs when two or more threads can access shared data and they try to change it at the same time. 

As a result, the values of the variables may be unpredictable and vary depending on the timing of the context switches between the threads.
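
A lost update of this kind can be sketched as follows. Real races depend on timing and are hard to reproduce, so this example uses a threading.Barrier purely as a teaching device to force the unlucky interleaving: both threads read the counter before either one writes it back. The names counter, both_have_read and unsafe_increment are illustrative.

```python
import threading

counter = 0
both_have_read = threading.Barrier(2)  # only here to force the bad timing


def unsafe_increment():
    global counter
    seen = counter            # step 1: read the shared value
    both_have_read.wait()     # both threads have now read the same old value
    counter = seen + 1        # step 2: write back, overwriting the other update


threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Two increments ran, but one update was lost
print(counter)  # 1
```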



Lock:

The threading module provides a Lock class to deal with race conditions.

A Lock is implemented on top of a synchronization object provided by the operating system (a semaphore or a mutex, depending on the platform and Python version).
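
A minimal sketch of how a Lock protects a critical section is shown below. The names counter, counter_lock and safe_increment are illustrative. The with statement acquires the lock on entry and releases it on exit, so only one thread at a time runs the increment.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def safe_increment(times):
    global counter
    for _ in range(times):
        # critical section: only one thread at a time may hold the lock
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=safe_increment, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```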


Semaphore:


A semaphore is a synchronization object that controls access by multiple processes/threads to a common resource in a parallel programming environment. 

It is simply a value in a designated place in operating system (or kernel) storage that each process/thread can check and then change. 

Depending on the value that is found, the process/thread can use the resource or will find that it is already in use and must wait for some period before trying again.

Semaphores can be binary (0 or 1) or can have additional values. Typically, a process/thread using semaphores checks the value and then, if it is using the resource, changes the value to reflect this so that subsequent semaphore users will know to wait.



Semaphores were originally a key part of railway system architecture, and it was Edsger Dijkstra who translated this real-world concept into the computing world.


These semaphores have an internal counter that is decremented whenever an acquire call is made and incremented whenever a release call is made.


Say we protected a block of code with a semaphore, and set the semaphore’s initial value to 2. If one worker acquired the semaphore, its value would be decremented to 1. If a second worker came along, the value would be decremented to 0.


At this point, if another worker comes along and tries to acquire the semaphore, it has to wait until one of the other workers releases it.

The value of these semaphores is that they allow us to protect resources from being overused.
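
The walk-through above can be reproduced with threading.Semaphore. This is a minimal single-threaded sketch with illustrative variable names. It passes blocking=False so that the third attempt returns False right away; a default acquire() call would wait instead.

```python
import threading

sem = threading.Semaphore(2)          # internal counter starts at 2

first = sem.acquire()                 # counter 2 -> 1
second = sem.acquire()                # counter 1 -> 0
third = sem.acquire(blocking=False)   # counter is 0: returns False instead of waiting
print(first, second, third)           # True True False

sem.release()                         # counter 0 -> 1
fourth = sem.acquire(blocking=False)  # succeeds again after the release
print(fourth)                         # True
```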


Bounded Semaphores

There is a subtle difference between a normal semaphore and a bounded semaphore.

A bounded semaphore differs only in that it does not allow more releases to be made than acquires.

If a release would push the counter above its initial value, a ValueError is raised.
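
The difference can be sketched side by side. The variable names below are illustrative. A plain Semaphore accepts an extra release, while a BoundedSemaphore raises ValueError.

```python
import threading

plain = threading.Semaphore(1)
plain.release()  # no error: the counter goes from 1 to 2
# Both non-blocking acquires now succeed, although the initial value was 1
extra = (plain.acquire(blocking=False), plain.acquire(blocking=False))
print(extra)  # (True, True)

bounded = threading.BoundedSemaphore(1)
try:
    bounded.release()  # a release without a matching acquire
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)  # True
```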








Advantages:


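It doesn’t block the user, because a long-running task can run in its own thread while another thread stays responsive.

Better use of system resources is possible since threads execute tasks concurrently.

Enhanced performance on multi-processor machines, although in standard CPython the global interpreter lock lets only one thread run Python bytecode at a time, so the gain is mostly for I/O-bound work.

Multi-threaded servers and interactive GUIs use multithreading extensively.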

Disadvantages:


As the number of threads increases, complexity increases.

Synchronization of shared resources (objects, data) is necessary.

It is difficult to debug, and the result is sometimes unpredictable.

Deadlocks and starvation are possible, i.e. with a bad design some threads may never be served.

Constructing and synchronizing threads is CPU/memory intensive.


Methods of Thread Class:


The threading module, as shown earlier, has a Thread class that is used for implementing threads. The class provides predefined methods for multi-threaded programming, and the module offers a few related helpers. These are:


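run(): the entry point of the thread; it holds the code the thread executes.

start(): starts the thread, which then calls run() in the new thread.

is_alive(): checks whether the thread is still executing (the older isAlive() spelling was removed in Python 3.9).

getName(): returns the name of a thread (deprecated since Python 3.10 in favor of the name attribute).

setName(): sets the name of the thread (deprecated since Python 3.10 in favor of the name attribute).

notify_all(): a method of threading.Condition, not of Thread, that wakes up all threads waiting on the condition (notifyAll() is the older spelling).

threading.enumerate(): a module-level function that returns the list of all active threads.

join(): waits until the thread terminates.

stop() and wait(): Thread has neither method. A blocking thread is usually ended cooperatively, for example by having it wait on a threading.Event that another thread sets.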
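
Several of these calls can be tried together in a short sketch. The names stop_event and worker and the thread name "demo-worker" are illustrative. A threading.Event keeps the worker alive until the main thread tells it to finish, which is one cooperative way to end a blocking thread.

```python
import threading

stop_event = threading.Event()


def worker():
    # Block until the main thread sets the event
    stop_event.wait()


t = threading.Thread(target=worker, name="demo-worker")
t.start()

alive_while_running = t.is_alive()   # True: the worker is blocked on the event
listed = t in threading.enumerate()  # True: it appears among the active threads

stop_event.set()                     # let the worker finish
t.join()                             # wait until it has terminated
alive_after_join = t.is_alive()      # False

print(t.name, alive_while_running, listed, alive_after_join)
# demo-worker True True False
```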

