Operating Systems
PROCESS MANAGEMENT
Process Concept: Process, Process Control Blocks, Operations on Processes, Inter process
Communication
Multithreaded Programming: Multicore programming, Multithreading Models, Thread Libraries,
Threading Issues
Process Scheduling: Scheduling Criteria, scheduling algorithms (FCFS, SJF, Round Robin, and
Priority) and their evaluation, Multiprocessor scheduling. Case Study: Linux
What is a process?
A program does nothing unless its instructions are executed by a CPU. A program in execution is
called a process. In order to accomplish its task, a process needs computer resources.
More than one process may exist in the system, and several processes may require the same
resource at the same time. Therefore, the operating system has to manage all the processes and the
resources in a convenient and efficient way.
Attributes of a process:
The attributes of a process are used by the operating system to create the process control block
(PCB) for each process; the PCB is also called the context of the process. The attributes stored in
the PCB are described below.
1. Process ID
When a process is created, a unique id is assigned to the process which is used for unique
identification of the process in the system.
2. Program counter
The program counter stores the address of the next instruction to be executed, i.e. the point at
which the process was suspended. The CPU uses this address when the execution of this process is
resumed.
3. Process State
The process, from its creation to its completion, goes through various states, which are new,
ready, running, and waiting. We will discuss them later in detail.
4. Priority
Every process has its own priority. The process with the highest priority among the available
processes is executed first.
5. I/O information
Each process may need I/O devices during its execution; the PCB records the I/O information
for the process.
6. CPU scheduling information
Each process is scheduled for execution using a process scheduling algorithm such as FCFS, SJF,
etc.; the PCB records the scheduling information for the process.
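To make this concrete, the PCB can be pictured as a record holding the attributes above. The
following C sketch is illustrative only; the field names (pid, program_counter, and so on) are
hypothetical, and real kernels (for example, Linux's task_struct) hold many more fields.

#include <stdint.h>

/* Possible process states (see the next section). */
enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

/* A simplified process control block holding the attributes above. */
struct pcb {
    int             pid;             /* 1. unique process ID                    */
    uintptr_t       program_counter; /* 2. address at which to resume execution */
    enum proc_state state;           /* 3. current process state                */
    int             priority;        /* 4. scheduling priority                  */
    int             io_devices;      /* 5. I/O information (simplified)         */
    int             time_quantum;    /* 6. CPU-scheduling information           */
};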
Process states:
During its lifetime a process moves through the states new, ready, running, and waiting.
(Figure: process state transition diagram.)
Process vs Program
Let us take a look at the differences between a process and a program:

Process: requires resources such as memory, CPU, and input-output devices.
Program: is stored on the hard disk and does not require any resources while it is not running.

Process: is a dynamic instance of code and data.
Program: has static code and static data.

Process: basically, a process is the running instance of the code.
Program: on the other hand, a program is the executable code.
Operations on the process:
1. Creation
Once the process is created, it enters the ready queue (in main memory) and is ready for execution.
2. Scheduling
Out of the many processes present in the ready queue, the operating system chooses one process
and starts executing it. Selecting the process to be executed next is known as scheduling.
3. Execution
Once the process is scheduled for execution, the processor starts executing it. If the process moves
to the blocked or wait state during execution, the processor starts executing other processes.
4. Deletion/killing
Once the process has served its purpose, the OS kills it. The context of the process (the PCB) is
deleted and the process is terminated by the operating system.
Inter Process Communication (IPC):
Processes often need to communicate and synchronize with one another. Common IPC and
synchronization mechanisms include the following.
● Semaphore
A semaphore is a variable that controls the access to a common resource by multiple
processes. The two types of semaphores are binary semaphores and counting semaphores.
● Mutual Exclusion
Mutual exclusion requires that only one process or thread can enter the critical section at a
time. This is useful for synchronization and also prevents race conditions.
● Spinlock
This is a type of lock. The processes trying to acquire this lock wait in a loop while checking
if the lock is available or not. This is known as busy waiting because the process is not
doing any useful operation even though it is active.
● Pipe
A pipe is a unidirectional data channel; two pipes can be used to create a two-way data
channel between two processes. This uses standard input and output methods. Pipes are
used in all POSIX systems as well as Windows operating systems (a short example follows
this list).
● Socket
The socket is the endpoint for sending or receiving data in a network. This is true for data
sent between processes on the same computer or data sent between different computers on
the same network. Most of the operating systems use sockets for interprocess
communication.
● File
A file is a data record that may be stored on a disk or acquired on demand by a file server.
Multiple processes can access a file as required. All operating systems use files for data
storage.
● Signal
Signals are useful in interprocess communication in a limited way. They are system
messages that are sent from one process to another. Normally, signals are not used to
transfer data but are used for remote commands between processes.
● Shared Memory
Shared memory is the memory that can be simultaneously accessed by multiple processes.
This is done so that the processes can communicate with each other. All POSIX systems, as
well as Windows operating systems use shared memory.
● Message Queue
Multiple processes can read and write data to the message queue without being connected to
each other. Messages are stored in the queue until their recipient retrieves them. Message
queues are quite useful for interprocess communication and are used by most operating
systems.
(Figure: the message queue and shared memory methods of inter-process communication.)
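As a brief illustration of the pipe mechanism mentioned in the list above, here is a minimal POSIX
sketch, assuming a Unix-like system: the parent process writes a message into a pipe and its child
reads it.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];                         /* fd[0] = read end, fd[1] = write end */
    char buf[64];

    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                 /* child: reads from the pipe */
        close(fd[1]);                  /* close the unused write end */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("child received: %s\n", buf);
        }
        close(fd[0]);
        return 0;
    }

    close(fd[0]);                      /* parent: writes into the pipe */
    write(fd[1], "hello", strlen("hello"));
    close(fd[1]);
    wait(NULL);                        /* wait for the child to finish */
    return 0;
}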
Types of Parallelism
The concept of multicore programming is to have multiple system tasks executing in parallel. Types
of parallelism include:
● Data parallelism
● Task parallelism
Data Parallelism
Data parallelism involves processing multiple pieces of data independently in parallel: the
processor performs the same operation on each piece of data, and parallelism is achieved by
feeding the pieces of data in parallel.
The figure (a timing diagram, not reproduced here) shows the input divided into four chunks,
A, B, C, and D. The same operation F() is applied to each of these chunks, producing the outputs
OA, OB, OC, and OD respectively. All four tasks are identical, and they run in parallel.
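A minimal Pthreads sketch of this pattern, in which the same operation F() (here, squaring an
element) is applied to four chunks of an array in parallel; the array size, chunking, and function
names are all illustrative.

#include <pthread.h>
#include <stdio.h>

#define N      16
#define CHUNKS 4

static int data[N];

static int F(int x) { return x * x; }      /* the common operation */

/* Apply F() to one chunk of the array; all workers are identical. */
static void *apply_chunk(void *arg) {
    int chunk = (int)(long)arg;
    int per = N / CHUNKS;
    for (int i = chunk * per; i < (chunk + 1) * per; i++)
        data[i] = F(data[i]);
    return NULL;
}

int main(void) {
    pthread_t tid[CHUNKS];

    for (int i = 0; i < N; i++)
        data[i] = i;

    /* Chunks A, B, C, D are handed to four identical parallel tasks. */
    for (long c = 0; c < CHUNKS; c++)
        pthread_create(&tid[c], NULL, apply_chunk, (void *)c);
    for (int c = 0; c < CHUNKS; c++)
        pthread_join(tid[c], NULL);

    for (int i = 0; i < N; i++)
        printf("%d ", data[i]);
    printf("\n");
    return 0;
}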
Task Parallelism
Task parallelism, in contrast, involves distributing different tasks (threads) across the cores, with
each task performing a distinct operation.

Multiprocessor vs multicore systems:
Reliability – A multiprocessor system is more reliable than a multicore system: if one of the
processors in the system fails, the other processors are not affected. A multicore system is less
reliable than a multiprocessor system.
Traffic – A multiprocessor system has higher traffic than a multicore system; a multicore system
has less traffic than a multiprocessor system.
Multithreading Models:
Multithreading allows an application to divide its task into individual threads. In multithreading,
the same process or task can be performed by a number of threads; in other words, there is more
than one thread to perform the task. With the use of multithreading, multitasking can be achieved.
For example:
(Figure: several clients accessing a multithreaded web server.)
In this example, client1, client2, and client3 access the web server without any waiting. In
multithreading, several tasks can run at the same time.
In an operating system, threads are divided into user-level threads and kernel-level threads.
User-level threads are managed above the kernel and without kernel support, whereas the
operating system directly manages kernel-level threads. Nevertheless, there must be some form of
relationship between user-level and kernel-level threads. Three established multithreading models
classify these relationships:
o Many to one multithreading model
o One to one multithreading model
o Many to Many multithreading models
Many to one model
The many to one model maps many user-level threads to one kernel thread. This type of
relationship facilitates an effective context-switching environment, easily implemented even on a
simple kernel with no thread support; the drawback is that, because there is only one kernel thread,
a blocking system call made by any thread blocks the entire process. In this model, all user-level
threads are associated with a single kernel-level thread.
One to one model
The one to one model associates each user-level thread with a single kernel-level thread.
Many to many model
In this type of model, there are several user-level threads and several kernel-level threads. The
number of kernel threads created depends upon the particular application; the developer can create
threads at both levels, and the numbers need not be the same. The many to many model is a
compromise between the other two models. In this model, if any thread makes a blocking system
call, the kernel can schedule another thread for execution. Also, the complexity introduced in the
previous models is not present here. Though this model allows the creation of multiple kernel
threads, true concurrency cannot be achieved by this model, because the kernel can schedule only
one process at a time.
Thread Libraries:
● Thread libraries provide programmers with an API for creating and managing
threads.
● Thread libraries may be implemented either in user space or in kernel space. The
former involves API functions implemented solely within user space, with no
kernel support. The latter involves system calls, and requires a kernel with thread
library support.
● There are three main thread libraries in use today:
1. POSIX Pthreads - may be provided as either a user or kernel library, as an
extension to the POSIX standard.
2. Win32 threads - provided as a kernel-level library on Windows systems.
3. Java threads - Since Java generally runs on a Java Virtual Machine, the
implementation of threads is based upon whatever OS and hardware the
JVM is running on, i.e. either Pthreads or Win32 threads depending on the
system.
● The following sections will demonstrate the use of threads in all three systems for
calculating the sum of integers from 0 to N in a separate thread, and storing the
result in a variable "sum".
Pthreads:
● The POSIX standard (IEEE 1003.1c) defines the specification for Pthreads, not the
implementation.
● Pthreads are available on Solaris, Linux, Mac OS X, Tru64, and via public-domain
shareware for Windows.
● Global variables are shared amongst all threads.
● One thread can wait for the others to rejoin before continuing.
● Pthreads begin execution in a specified function, in this example the runner() function, as
in the sketch below.
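A minimal sketch of such a program, assuming the classic structure: main() creates one thread
that executes runner(), which sums the integers from 0 to N (taken from the command line) into
the shared variable sum. Compile with -lpthread.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum;                                   /* shared by all threads */

/* The thread begins execution in this function. */
void *runner(void *param) {
    int upper = atoi(param);
    sum = 0;
    for (int i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}

int main(int argc, char *argv[]) {
    pthread_t tid;                         /* thread identifier */
    pthread_attr_t attr;                   /* thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: %s <integer>\n", argv[0]);
        return 1;
    }
    pthread_attr_init(&attr);              /* default attributes */
    pthread_create(&tid, &attr, runner, argv[1]);
    pthread_join(tid, NULL);               /* wait for the thread to rejoin */
    printf("sum = %d\n", sum);
    return 0;
}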
Windows Threads:
● Similar to Pthreads. Examine the code example below to see the differences, which are
mostly matters of syntax and nomenclature.
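A comparable Win32 sketch of the same summation; the differences from the Pthreads version are
mainly the CreateThread()/WaitForSingleObject() calls and the naming conventions (the
upper bound is fixed here for illustration).

#include <windows.h>
#include <stdio.h>

DWORD Sum;                                 /* shared by the thread */

/* The thread begins execution in this function. */
DWORD WINAPI Summation(LPVOID Param) {
    DWORD Upper = *(DWORD *)Param;
    for (DWORD i = 0; i <= Upper; i++)
        Sum += i;
    return 0;
}

int main(void) {
    DWORD ThreadId;
    HANDLE ThreadHandle;
    DWORD Param = 100;                     /* upper bound N, fixed for illustration */

    ThreadHandle = CreateThread(NULL, 0, Summation, &Param, 0, &ThreadId);
    if (ThreadHandle != NULL) {
        WaitForSingleObject(ThreadHandle, INFINITE);  /* analogous to pthread_join */
        CloseHandle(ThreadHandle);
        printf("sum = %lu\n", (unsigned long)Sum);
    }
    return 0;
}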
Java Threads
Java threads are created either by extending the Thread class or by implementing the Runnable
interface. Since Java runs on a Java Virtual Machine, the thread implementation ultimately maps
onto whatever thread library the host OS provides (Pthreads or Win32 threads, as noted above).
Threading Issues in OS
1. System Calls
2. Thread Cancellation
3. Signal Handling
4. Thread Pool
5. Thread Specific Data
1. System Calls
fork() and exec() are system calls. The fork() call creates a duplicate of the process that
invokes it; the new duplicate process is called the child process, and the process invoking fork()
is called the parent process. Both the parent process and the child process continue their execution
from the instruction just after the fork().
Let us now discuss the issue with the fork() system call. Consider that a thread of a multithreaded
program has invoked fork(), so fork() creates a new duplicate process. The issue is whether the
new duplicate process created by fork() should duplicate all the threads of the parent process or
be single-threaded.
The exec() system call, when invoked, replaces the program, along with all its threads, with the
program specified in the parameter to exec(). Typically, the exec() system call is lined up right
after fork().
The issue here is that if exec() is invoked just after fork(), then duplicating all the threads of the
parent process in the child is useless, since exec() will replace the entire process with the program
passed to it as a parameter. In such a case, the version of fork() that duplicates only the thread
that invoked fork() is the more appropriate one.
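A minimal sketch of the usual fork()-then-exec() pattern on a Unix-like system; the ls program
is used purely as an illustration.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();                    /* duplicate the calling process */

    if (pid < 0) { perror("fork"); exit(1); }

    if (pid == 0) {
        /* child: replace the whole process image with a new program */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");                  /* reached only if exec fails */
        exit(1);
    }

    wait(NULL);                            /* parent waits for the child */
    printf("child complete\n");
    return 0;
}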
2. Thread cancellation
Termination of a thread in the middle of its execution is termed thread cancellation. Let us
understand this with the help of an example: consider a multithreaded program whose threads
concurrently search through a database, where, as soon as one thread returns the result, the
remaining threads can be cancelled. The thread to be cancelled is called the target thread, and
cancelling it raises two issues:
● What if resources had already been allotted to the target thread that is being cancelled?
● What if the target thread is terminated while it is updating data that it shares with some
other thread?
Asynchronous cancellation, in which a thread immediately cancels the target thread without
checking whether it is holding any resources, is what creates trouble here.
In deferred cancellation, by contrast, one thread indicates to the target thread that it is to be
cancelled, and the target thread periodically checks a flag to confirm whether it should be
cancelled. The points at which a thread can be cancelled safely are termed cancellation points by
Pthreads.
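A minimal Pthreads sketch of deferred cancellation: the worker below loops, checking for pending
cancellation requests at explicit cancellation points (the work loop itself is illustrative).

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Worker that loops, honouring deferred cancellation requests. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        /* ... perform one safe unit of work ... */
        pthread_testcancel();              /* explicit cancellation point */
        sleep(1);                          /* sleep() is also a cancellation point */
    }
    return NULL;
}

int main(void) {
    pthread_t tid;

    pthread_create(&tid, NULL, worker, NULL);
    sleep(2);
    pthread_cancel(tid);                   /* request (deferred) cancellation */
    pthread_join(tid, NULL);               /* wait until the target has exited */
    printf("worker cancelled\n");
    return 0;
}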
3. Signal Handling:
Signal handling is more straightforward in a single-threaded program, as the signal is delivered
directly to the process. But in a multithreaded program, the issue arises of which thread of the
program the signal should be delivered to.
In general, the signal may be delivered to:
● the thread to which the signal applies,
● every thread in the process,
● certain threads in the process, or
● a specific thread designated to receive all signals for the process.
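On POSIX systems, pthread_kill() can be used to deliver a signal to one specific thread. A
minimal sketch, using SIGUSR1 and an illustrative worker thread:

#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void handler(int sig) {
    (void)sig;
    const char msg[] = "signal received\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);   /* async-signal-safe output */
}

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 3; i++)
        sleep(1);                          /* just stay alive for a while */
    return NULL;
}

int main(void) {
    pthread_t tid;

    signal(SIGUSR1, handler);              /* the handler is shared by all threads */
    pthread_create(&tid, NULL, worker, NULL);
    sleep(1);
    pthread_kill(tid, SIGUSR1);            /* deliver the signal to one thread */
    pthread_join(tid, NULL);
    return 0;
}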
4. Thread Pool
When a user requests a webpage from the server, the server creates a separate thread to service the
request. However, this approach has potential issues. If there is no bound on the number of active
threads in the system and a new thread is created for every new request, the result is eventually the
exhaustion of system resources.
We are also concerned about the time it takes to create a new thread. It must not be the case that
the time required to create a new thread exceeds the time the thread spends servicing the request
before being discarded, as that would be a waste of CPU time.
The solution to this issue is the thread pool. The idea is to create a finite number of threads when
the process starts; this collection of threads is referred to as the thread pool. The threads stay in the
pool and wait until they are assigned a request to service, as in the sketch below.
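A minimal sketch of a thread pool, assuming a fixed-size request queue protected by a mutex and
condition variable; the pool size, queue capacity, and integer "requests" are all illustrative.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define POOL_SIZE 4
#define QUEUE_CAP 16

/* A fixed-size queue of pending requests, protected by a mutex. */
static int queue[QUEUE_CAP];
static int head, tail, count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

static void submit(int job) {
    pthread_mutex_lock(&lock);
    if (count < QUEUE_CAP) {               /* requests beyond the cap are dropped */
        queue[tail] = job;
        tail = (tail + 1) % QUEUE_CAP;
        count++;
        pthread_cond_signal(&not_empty);   /* wake one idle pool thread */
    }
    pthread_mutex_unlock(&lock);
}

/* Pool threads wait for requests instead of being created per request. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&not_empty, &lock);
        int job = queue[head];
        head = (head + 1) % QUEUE_CAP;
        count--;
        pthread_mutex_unlock(&lock);
        printf("servicing request %d\n", job);
    }
    return NULL;
}

int main(void) {
    pthread_t tid[POOL_SIZE];

    for (int i = 0; i < POOL_SIZE; i++)    /* create the pool once, up front */
        pthread_create(&tid[i], NULL, worker, NULL);

    for (int job = 1; job <= 8; job++)     /* hand incoming requests to the pool */
        submit(job);

    sleep(1);                              /* crude: let the workers drain the queue */
    return 0;
}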
5. Thread Specific Data
We are all aware that threads belonging to the same process share the data of that process. The
issue here is what happens if each thread of the process needs its own copy of some data: the data
associated with a specific thread is referred to as thread-specific data.
Consider a transaction processing system in which each transaction is processed in a different
thread. To identify each transaction uniquely, we associate a unique identifier with it and keep that
identifier as thread-specific data.
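A minimal Pthreads sketch of thread-specific data for this transaction example, using
pthread_key_create() and pthread_setspecific(); the transaction identifiers are
illustrative.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_key_t txn_key;              /* each thread holds its own value */

static void *transaction(void *arg) {
    int *id = malloc(sizeof *id);
    *id = (int)(long)arg;                  /* illustrative unique transaction id */
    pthread_setspecific(txn_key, id);      /* store this thread's private copy */

    int *mine = pthread_getspecific(txn_key);
    printf("processing transaction %d\n", *mine);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;

    pthread_key_create(&txn_key, free);    /* 'free' runs per thread at exit */
    pthread_create(&t1, NULL, transaction, (void *)101L);
    pthread_create(&t2, NULL, transaction, (void *)102L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_key_delete(txn_key);
    return 0;
}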
Process Scheduling
Scheduling Criteria:
Different CPU scheduling algorithms have different properties, and the choice of a particular
algorithm depends on various factors. Many criteria have been suggested for comparing CPU
scheduling algorithms.
The criteria include the following:
1. CPU utilisation –
The main objective of any CPU scheduling algorithm is to keep the CPU as busy as
possible. Theoretically, CPU utilisation can range from 0 to 100 percent, but in a real
system it varies from 40 to 90 percent depending on the load on the system.
2. Throughput –
A measure of the work done by CPU is the number of processes being executed and
completed per unit time. This is called throughput. The throughput may vary
depending upon the length or duration of processes.
3. Turnaround time –
For a particular process, an important criterion is how long it takes to execute that
process. The time elapsed from the submission of a process to its completion is known
as the turnaround time. Turnaround time is the sum of the periods spent waiting to get
into memory, waiting in the ready queue, executing on the CPU, and waiting for I/O.
Turnaround time = completion time – arrival time
4. Waiting time –
A scheduling algorithm does not affect the time required to complete the process once it
starts execution. It only affects the waiting time of a process i.e. time spent by a
process waiting in the ready queue.
Waiting time = Turnaround time – burst time
5. Response time –
In an interactive system, turnaround time is not the best criterion. A process may
produce some output fairly early and continue computing new results while
previous results are being output to the user. Thus, another criterion is the time taken
from the submission of a request until the first response is produced. This measure is
called response time.
Pre-emption means the ability of the operating system to pre-empt (that is, stop or pause) a
currently scheduled task in favour of a higher priority task. The resource being scheduled
may be the processor or I/O, among others.
Scheduling Algorithms (FCFS, SJF, Round Robin, and Priority) and their
evaluation:
FCFS:
The first come first serve (FCFS) scheduling algorithm simply schedules jobs according to their
arrival time: the job that arrives first in the ready queue gets the CPU first. The lower the arrival
time of a job, the sooner it gets the CPU. FCFS scheduling may cause the problem of starvation if
the burst time of the first process is the longest among all the jobs.
Advantages of FCFS
o Simple
o Easy
o First come, First serve
Disadvantages of FCFS
1. The scheduling method is non-pre-emptive: a process runs to completion.
2. Due to the non-pre-emptive nature of the algorithm, the problem of starvation may occur.
3. Although it is easy to implement, it is poor in performance, since the average waiting time
is higher compared to other scheduling algorithms.
Example
Consider four processes, all arriving at time 0:

Process   Arrival time   Burst time   Turnaround time (CT - AT)   Waiting time (TAT - BT)
P1        0              21           21 - 0 = 21                 21 - 21 = 0
P2        0              3            24 - 0 = 24                 24 - 3 = 21
P3        0              6            30 - 0 = 30                 30 - 6 = 24
P4        0              2            32 - 0 = 32                 32 - 2 = 30

Gantt chart
| P1 | P2 | P3 | P4 |
0    21   24   30   32

Turn Around Time = Completion Time - Arrival Time
Waiting Time = Turn Around Time - Burst Time
The average waiting time is determined by summing the waiting times of all the processes and
dividing the sum by the total number of processes: (0 + 21 + 24 + 30) / 4 = 18.75. Similarly, the
average turnaround time is (21 + 24 + 30 + 32) / 4 = 26.75.
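A small sketch that computes the completion, turnaround, and waiting times for the FCFS example
above (burst times taken from the table; all arrival times are 0):

#include <stdio.h>

int main(void) {
    /* Burst times from the table above; all arrival times are 0. */
    int burst[] = {21, 3, 6, 2};
    int n = sizeof burst / sizeof burst[0];
    int time = 0;
    double total_tat = 0, total_wt = 0;

    for (int i = 0; i < n; i++) {
        time += burst[i];                  /* completion time of Pi     */
        int tat = time - 0;                /* turnaround = CT - AT      */
        int wt  = tat - burst[i];          /* waiting = turnaround - BT */
        printf("P%d: CT=%d TAT=%d WT=%d\n", i + 1, time, tat, wt);
        total_tat += tat;
        total_wt  += wt;
    }
    printf("average TAT = %.2f, average WT = %.2f\n",
           total_tat / n, total_wt / n);
    return 0;
}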
Priority Scheduling:
In priority scheduling, a priority number is assigned to each process. In some systems, the lower
the number, the higher the priority, while in others, the higher the number, the higher the priority.
The process with the highest priority among the available processes is given the CPU. Two types
of priority scheduling algorithm exist: pre-emptive priority scheduling and non-pre-emptive
priority scheduling.
Example:
Process Arrival time Burst time
P1 0 11
P2 5 28
P3 12 2
P4 2 10
P5 9 16
Gantt chart
| P1 | P2 | P4 | P3 | P5 |
0    11   39   49   51   67
Round Robin:
The Round Robin scheduling algorithm is one of the most popular scheduling algorithms and can
be implemented in most operating systems. It is the pre-emptive version of first come first serve
scheduling: each process is given a fixed time slice (the time quantum), and when the quantum
expires the running process is pre-empted and placed at the back of the ready queue.
Disadvantages
1. The higher the time quantum, the higher the response time in the system.
2. The lower the time quantum, the higher the context switching overhead in the system.
3. Deciding a perfect time quantum is really a very difficult task in the system.
Example Gantt chart (time quantum = 2):
| P1 | P2 | P3 | P1 | P4 | P2 | P1 |
0    2    4    6    8    9    11   12
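A minimal sketch that simulates round-robin accounting with a time quantum of 2, assuming four
processes that all arrive at time 0 with illustrative burst times totalling the 12 time units shown
above:

#include <stdio.h>

int main(void) {
    /* Illustrative burst times; all processes arrive at time 0, quantum = 2. */
    int burst[]     = {5, 4, 2, 1};
    int remaining[] = {5, 4, 2, 1};
    int n = 4, quantum = 2, time = 0, done = 0;

    while (done < n) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0)
                continue;
            int slice = remaining[i] < quantum ? remaining[i] : quantum;
            time += slice;                 /* run Pi for one time slice */
            remaining[i] -= slice;
            if (remaining[i] == 0) {
                done++;
                /* waiting time = completion - arrival(0) - burst */
                printf("P%d: completion=%d waiting=%d\n",
                       i + 1, time, time - burst[i]);
            }
        }
    }
    return 0;
}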
Multiprocessor scheduling:
There are two approaches to multiple processor scheduling in the operating system: Symmetric
Multiprocessing (SMP), in which each processor is self-scheduling, and Asymmetric
Multiprocessing, in which a single master processor handles all scheduling decisions, I/O
processing, and other system activities while the remaining processors execute only user code.
Processor Affinity
Processor Affinity means a process has an affinity for the processor on which it is currently
running. When a process runs on a specific processor, there are certain effects on the cache
memory. The data most recently accessed by the process populate the cache for the processor. As a
result, successive memory access by the process is often satisfied in the cache memory.
Now, suppose the process migrates to another processor. In that case, the contents of the cache
memory must be invalidated for the first processor, and the cache for the second processor must be
repopulated. Because of the high cost of invalidating and repopulating caches, most
SMP(symmetric multiprocessing) systems try to avoid migrating processes from one processor to
another and keep a process running on the same processor. This is known as processor affinity.
There are two types of processor affinity:
1. Soft affinity – the operating system attempts to keep a process running on the same processor
but does not guarantee that it will do so.
2. Hard affinity – the process specifies a subset of processors on which it may run, and the
operating system will not schedule it elsewhere.
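On Linux (the case study for this unit), hard affinity can be requested with the
sched_setaffinity() system call. A minimal sketch that pins the calling process to CPU 0:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(0, &set);                      /* allow CPU 0 only */

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to CPU 0 (hard affinity)\n");
    return 0;
}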
Load Balancing