What is Unix?
Unix is an operating system that is the ancestor of, or the model for, many later operating systems, such as
Solaris and Linux distributions like Ubuntu. The POSIX standard was later written to describe its interfaces. It
was developed from 1969 onward by Ken Thompson, Dennis Ritchie, and others at AT&T Bell Laboratories. It was
originally meant for programmers developing software rather than non-programmers.
Unix and the C language were developed at AT&T and distributed to government and academic institutions, which
led to both being ported to a wider variety of machine families than any other operating system. The main focus
of the developers of this operating system was the kernel, which is considered to be the heart of the
operating system.
UNIX is a family of multitasking, multiuser computer operating systems whose development began in 1969 at Bell
Labs. It was originally developed for minicomputers and has since been ported to various hardware platforms.
UNIX has a reputation for stability, security, and scalability, making it a popular choice for enterprise-level
computing.
The basic design philosophy of UNIX is to provide simple, powerful tools that can be combined to perform
complex tasks. It features a command-line interface that allows users to interact with the system through a series
of commands, rather than through a graphical user interface (GUI).
Some of the key features of UNIX include:
1. Multiuser support: UNIX allows multiple users to simultaneously access the same system and share
resources.
2. Multitasking: UNIX is capable of running multiple processes at the same time.
3. Shell scripting: UNIX provides a powerful scripting language that allows users to automate tasks.
4. Security: UNIX has a robust security model that includes file permissions, user accounts, and network
security features.
5. Portability: UNIX can run on a wide variety of hardware platforms, from small embedded systems to
large mainframe computers.
The structure of Unix OS
Layer-1: Hardware - This layer consists of the physical hardware of the machine on which UNIX runs.
Layer-2: Kernel - The kernel is the core of the operating system and is responsible for maintaining its full
functionality. The kernel runs directly on the machine hardware and manages all interaction with it.
Layer-3: The Shell - The shell is an interpreter that reads the commands submitted by the user at the
terminal and runs the requested programs.
Layer-4: Application Programs Layer - This is the outermost layer, in which user applications and other
external programs execute.
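The way the shell layer hands a command to the kernel can be sketched in C. This is only a minimal illustration, not how any particular shell is written; the function name run_command is hypothetical. It assumes a POSIX system and uses fork to create a process, execvp to turn that process into the requested program, and waitpid to collect the result.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* run_command: run the program argv[0] with the NULL-terminated argument
   list argv and return its exit status (127 if it could not be executed),
   or -1 if fork or waitpid failed or the child was killed by a signal */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();              /* ask the kernel for a new process */

    if (pid == -1)
        return -1;
    if (pid == 0) {                  /* child: become the requested program */
        execvp(argv[0], argv);
        _exit(127);                  /* reached only if execvp failed */
    }
    if (waitpid(pid, &status, 0) == -1)   /* parent: wait for the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_command with the argument list { "ls", "-l", NULL } would run ls -l and return its exit status.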
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
FILE DESCRIPTORS
In the UNIX operating system, all input and output is done by reading or writing files, because all
peripheral devices, even keyboard and screen, are files in the file system. This means that a single
homogeneous interface handles all communication between a program and peripheral devices.
In the most general case, before you read and write a file, you must inform the system of your intent to
do so, a process called opening the file.
The system checks your right to do so and if all is well, returns to the program a small non-negative
integer called a file descriptor.
Whenever input or output is to be done on the file, the file descriptor is used instead of the name
to identify the file. All information about an open file is maintained by the system; the user program
refers to the file only by the file descriptor.
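As a small sketch of this idea (the function name first_byte is made up for illustration, and the open call it uses is described later in these notes), note that the file name appears only in the call that opens the file; every later operation uses the descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/* first_byte: return the first byte of the file named path, or -1 if the
   file cannot be opened or is empty */
int first_byte(const char *path)
{
    unsigned char c;
    int result;
    int fd = open(path, O_RDONLY);   /* the name is used only here */

    if (fd == -1)
        return -1;
    result = (read(fd, &c, 1) == 1) ? c : -1;   /* from now on, only fd */
    close(fd);
    return result;
}
```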
Since input and output involving keyboard and screen is so common, special arrangements exist to
make this convenient.
When the command interpreter (the "shell") runs a program, three files are open, with file descriptors
0, 1, and 2, called the standard input, the standard output, and the standard error. If a program reads 0
and writes 1 and 2, it can do input and output without worrying about opening files.
The user of a program can redirect I/O to and from files with < and >:
$ prog <infile >outfile      # prog is the name of the program being run
In this case, the shell changes the default assignments for the file descriptors 0 and 1 to the named files.
Normally file descriptor 2 remains attached to the screen, so error messages can go there.
In all cases, the file assignments are changed by the shell, not by the program. The program does not
know where its input comes from nor where its output goes, so long as it uses file 0 for input and 1 and
2 for output.
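One way a shell could perform such a redirection is sketched below. This is an assumption about the mechanism, not a description of a specific shell: the helper name redirect_fd is hypothetical, and it relies on the POSIX call dup2, which makes one descriptor refer to the same open file as another.

```c
#include <fcntl.h>
#include <unistd.h>

/* redirect_fd: make descriptor fd refer to the file named path, opened
   with the given flags; returns 0 on success, -1 on error */
int redirect_fd(int fd, const char *path, int flags)
{
    int newfd = open(path, flags, 0666);

    if (newfd == -1)
        return -1;
    if (newfd != fd) {
        if (dup2(newfd, fd) == -1) {  /* fd now refers to the same file */
            close(newfd);
            return -1;
        }
        close(newfd);                 /* the extra descriptor is not needed */
    }
    return 0;
}
```

Before starting prog, a shell could call redirect_fd(0, "infile", O_RDONLY) and redirect_fd(1, "outfile", O_WRONLY | O_CREAT | O_TRUNC) in the new process; prog itself is unchanged and still just reads 0 and writes 1.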
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Low Level I/O - Read and Write
Input and output uses the read and write system calls, which are accessed from C programs through
two functions called read and write.
For both, the first argument is a file descriptor. The second argument is a character array in your
program where the data is to go to or to come from. The third argument is the number of
bytes to be transferred.
int n_read = read(int fd, char *buf, int n);
int n_written = write(int fd, char *buf, int n);
Each call returns a count of the number of bytes transferred. On reading, the number of bytes returned
may be less than the number requested. A return value of zero bytes implies end of file, and -1
indicates an error of some sort. For writing, the return value is the number of bytes written; an error
has occurred if this isn't equal to the number requested.
Any number of bytes can be read or written in one call. The most common values are 1, which means
one character at a time ("unbuffered"), and a number like 1024 or 4096 that corresponds to a physical
block size on a peripheral device. Larger sizes will be more efficient because fewer system calls will be
made.
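The return-value rules above suggest a small wrapper around write. The sketch below uses a hypothetical name, write_all, and makes one assumption beyond the text: on POSIX systems a write to a pipe or terminal may transfer fewer bytes than requested without being a hard error, so the wrapper retries with the remaining bytes instead of giving up at once.

```c
#include <stddef.h>
#include <unistd.h>

/* write_all: write all n bytes of buf to fd, retrying after short writes;
   returns n on success, -1 on error */
long write_all(int fd, const char *buf, size_t n)
{
    size_t done = 0;

    while (done < n) {
        ssize_t w = write(fd, buf + done, n - done);
        if (w <= 0)                  /* error, or no progress at all */
            return -1;
        done += (size_t) w;          /* w bytes went out; continue with the rest */
    }
    return (long) done;
}
```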
Putting these facts together, here is a simple program that copies its input to its output:
#include "syscalls.h"

int main(void) /* copy input to output */
{
    char buf[BUFSIZ];
    int n;

    while ((n = read(0, buf, BUFSIZ)) > 0)
        write(1, buf, n);
    return 0;
}
We have collected function prototypes for the system calls into a file called syscalls.h so we can include it in the
programs. The parameter BUFSIZ is also defined in syscalls.h; its value is a good size for the local system. If
the file size is not a multiple of BUFSIZ, some read will return a smaller number of bytes to be written by write;
the next call to read after that will return zero.
It is instructive to see how read and write can be used to construct higher-level routines like getchar, putchar,
etc. Here is a version of getchar() that does unbuffered input, by reading the standard input one character at a
time.
#include "syscalls.h"

/* getchar: unbuffered single character input */
int getchar(void)
{
    char c;

    return (read(0, &c, 1) == 1) ? (unsigned char) c : EOF;
}
c must be a char, because read needs a character pointer. Casting c to unsigned char in the return statement
avoids any problem of sign extension.
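A buffered variant can be built the same way: read a block at a time and hand out characters one by one. The sketch below is an illustration under stated assumptions. The name buffered_getchar and the fd parameter are not from the notes (the descriptor is passed in so that the function can be tried on any open file), and the single static buffer means it can serve only one descriptor at a time.

```c
#include <stdio.h>
#include <unistd.h>

/* buffered_getchar: return the next character from fd, or EOF at end of
   file or on error; refills a static buffer with one read when empty */
int buffered_getchar(int fd)
{
    static char buf[BUFSIZ];
    static char *bufp = buf;
    static int n = 0;                /* characters left in buf */

    if (n <= 0) {                    /* buffer is empty: refill it */
        n = (int) read(fd, buf, sizeof buf);
        bufp = buf;
        if (n <= 0)
            return EOF;
    }
    n--;
    return (unsigned char) *bufp++;
}
```

Compared with the unbuffered version, this makes one system call per BUFSIZ characters instead of one per character.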
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
open, creat, close, unlink
We have to open files in order to read or write them.
There are two system calls for this, open and creat.
open is rather like fopen, except that instead of returning a file pointer, it returns a file descriptor,
which is just an int. open returns -1 if any error occurs.
The declarations and flag values are found in <fcntl.h>.
Consider the code:
#include <fcntl.h>
int fd;
int open(char *name, int flags, int perms);
fd = open(name, flags, perms);
As with fopen, the name argument is a character string containing the filename. The second argument, flags, is
an int that specifies how the file is to be opened; the main values are
O_RDONLY open for reading only
O_WRONLY open for writing only
O_RDWR open for both reading and writing
To open an existing file for reading,
fd = open(name, O_RDONLY, 0);
The perms argument is always zero for the uses of open discussed here.
It is an error to try to open a file that does not exist.
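A hedged sketch of checking for this error follows. The function name open_existing is hypothetical; errno and strerror are the standard C and POSIX facilities for finding out why a system call failed, which these notes do not otherwise cover.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* open_existing: open path for reading; on failure print the reason on
   the standard error and return -1 with errno still describing the error */
int open_existing(const char *path)
{
    int fd = open(path, O_RDONLY, 0);

    if (fd == -1) {
        int saved = errno;           /* fprintf may change errno */
        fprintf(stderr, "can't open %s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

For a file that does not exist, errno is set to ENOENT.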
The system call creat is provided to create new files, or to re-write old ones.
int creat(char *name, int perms);
For example,
fd = creat(name, perms);
returns a file descriptor if it was able to create the file, and -1 if not.
If the file already exists, creat will truncate it to zero length, thereby discarding its previous contents; it
is not an error to creat a file that already exists.
If the file does not already exist, creat creates it with the permissions specified by the perms argument.
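On POSIX systems, creat(name, perms) behaves like open with the flags O_WRONLY | O_CREAT | O_TRUNC. The sketch below uses that form to replace the contents of a file; the function name rewrite_file and the mode 0644 are choices made for this illustration only.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* rewrite_file: replace the contents of the file named path with the n
   bytes in data; returns the number of bytes written, or -1 on error */
long rewrite_file(const char *path, const char *data, size_t n)
{
    ssize_t w;
    /* same effect as creat(path, 0644): create if missing, truncate if present */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd == -1)
        return -1;
    w = write(fd, data, n);
    if (close(fd) == -1)             /* close releases the descriptor */
        return -1;
    return (long) w;
}
```

Calling it twice on the same name leaves only the second contents in the file. A file that is no longer needed can be removed from the file system with unlink(path).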
In the UNIX file system, there are nine bits of permission information associated with a file that control
read, write and execute access for the owner of the file, for the owner's group, and for all others.
Thus a three-digit octal number is convenient for specifying the permissions. For
example, 0755 specifies read, write and execute permission for the owner, and read and execute
permission for the group and everyone else.
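To make the nine bits concrete, the following sketch converts a mode into the rwx notation printed by ls -l. The function name mode_string is hypothetical. Each octal digit covers one group of three bits: 4 is read, 2 is write, 1 is execute.

```c
#include <string.h>

/* mode_string: write the nine permission bits of mode into out in the
   form "rwxrwxrwx", with '-' for each missing permission; out must have
   room for 10 bytes */
void mode_string(unsigned mode, char out[10])
{
    const char *letters = "rwxrwxrwx";
    int i;

    memcpy(out, "---------", 10);    /* nine dashes plus the terminator */
    for (i = 0; i < 9; i++)          /* bit 8 is owner read, bit 0 is others execute */
        if (mode & (1u << (8 - i)))
            out[i] = letters[i];
}
```

With this function, 0755 gives rwxr-xr-x, 0775 gives rwxrwxr-x, and 0666 gives rw-rw-rw-.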
To illustrate, here is a simplified version of the UNIX program cp, which copies one file to another. Our version
copies only one file, it does not permit the second argument to be a directory, and it invents permissions instead
of copying them.
#include <stdio.h>
#include <fcntl.h>
#include "syscalls.h"
#define PERMS 0666 /* read and write permission for owner, group, others */

void error(char *fmt, ...); /* prints a message and exits; defined elsewhere */

/* cp: copy the file named by argv[1] to the file named by argv[2] */
int main(int argc, char *argv[])
{
    int f1, f2, n;
    char buf[BUFSIZ];

    if (argc != 3)
        error("Usage: cp from to");
    if ((f1 = open(argv[1], O_RDONLY, 0)) == -1)
        error("cp: can't open %s", argv[1]);
    if ((f2 = creat(argv[2], PERMS)) == -1)
        error("cp: can't create %s, mode %03o", argv[2], PERMS);
    while ((n = read(f1, buf, BUFSIZ)) > 0)
        if (write(f2, buf, n) != n)
            error("cp: write error on file %s", argv[2]);
    return 0;
}
This program creates the output file with fixed permissions of 0666. The function error, which prints a
message and terminates the program, is assumed to be defined elsewhere. With the stat system call, we can
determine the mode of an existing file and thus give the same mode to the copy.
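Giving the copy the same mode as the original could look like the sketch below. It is one possible approach, not part of the cp program above: the helper name copy_mode is hypothetical, and it applies the mode afterwards with chmod. Passing the mode to creat instead would also work, but the result would then be filtered through the process's umask.

```c
#include <stdio.h>
#include <sys/stat.h>

/* copy_mode: give the file named to the same permission bits as the file
   named from; returns 0 on success, -1 on error */
int copy_mode(const char *from, const char *to)
{
    struct stat st;

    if (stat(from, &st) == -1) {     /* st.st_mode holds type and permissions */
        perror(from);
        return -1;
    }
    return chmod(to, st.st_mode & 07777);   /* keep only the permission bits */
}
```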
random access - lseek
Input and output are normally sequential: each read or write takes place at a position in the file right after the
previous one. When necessary, however, a file can be read or written in any arbitrary order. The system
call lseek provides a way to move around in a file without reading or writing any data:
long lseek(int fd, long offset, int origin);
sets the current position in the file whose descriptor is fd to offset, which is taken relative to the location
specified by origin. Subsequent reading or writing will begin at that position. origin can be 0, 1, or 2 to specify
that offset is to be measured from the beginning, from the current position, or from the end of the file
respectively.
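For example, to append to a file, seek to the end before writing:
lseek(fd, 0L, 2);
To get back to the beginning ("rewind"),
lseek(fd, 0L, 0);
Notice the 0L argument; it could also be written as (long) 0 or just as 0 if lseek is properly declared.
With lseek, a file can be treated more or less like a large array, at the price of slower access. The
following function reads any number of bytes from an arbitrary position in a file: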
#include "syscalls.h"

/* get: read n bytes from position pos */
int get(int fd, long pos, char *buf, int n)
{
    if (lseek(fd, pos, 0) >= 0) /* get to pos */
        return read(fd, buf, n);
    else
        return -1;
}
The return value from lseek is a long that gives the new position in the file, or -1 if an error occurs. The standard
library function fseek is similar to lseek except that the first argument is a FILE * and the return is non-zero if
an error occurred.
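The append idiom can be put together as follows. This sketch assumes a current POSIX system, where lseek takes and returns off_t rather than long and the origin values 0, 1, and 2 have the names SEEK_SET, SEEK_CUR, and SEEK_END in <unistd.h>. The function name append_bytes is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* append_bytes: add n bytes to the end of the file named path, creating
   the file if necessary; returns the new file length, or -1 on error */
long append_bytes(const char *path, const char *data, size_t n)
{
    off_t end = -1;
    int fd = open(path, O_WRONLY | O_CREAT, 0644);

    if (fd == -1)
        return -1;
    if (lseek(fd, 0, SEEK_END) != -1           /* origin 2: go to the end */
        && write(fd, data, n) == (ssize_t) n)
        end = lseek(fd, 0, SEEK_CUR);          /* origin 1: report the position */
    close(fd);
    return (long) end;
}
```

Seeking by 0 from the current position moves nothing but returns the current offset, which after the write is the length of the file.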
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Storage Allocator
Rather than allocating from a compiled-in fixed-size array, malloc will request space from the
operating system as needed. Since other activities in the program may also request space without
calling this allocator, the space that malloc manages may not be contiguous.
Thus its free storage is kept as a list of free blocks. Each block contains a size, a pointer to the next
block, and the space itself. The blocks are kept in order of increasing storage address, and the last block
(highest address) points to the first.
When a request is made, the free list is scanned until a big-enough block is found.
This algorithm is called "first fit," by contrast with "best fit," which looks for the smallest block that
will satisfy the request.
If the block is exactly the size requested it is unlinked (deleted) from the list and returned to the user.
If the block is too big, it is split, and the proper amount is returned to the user while the residue remains
on the free list.
If no big-enough block is found, another large chunk is obtained from the operating system and linked
into the free list.
Freeing also causes a search of the free list, to find the proper place to insert the block being freed.
If the block being freed is adjacent to a free block on either side, it is coalesced with it into a single
bigger block, so storage does not become too fragmented.
Determining the adjacency is easy because the free list is maintained in order of increasing address.
In malloc, the requested size in characters is rounded up to the proper number of header-sized units; the block
that will be allocated contains one more unit, for the header itself, and this is the value recorded in the size field
of the header. The pointer returned by malloc points at the free space, not at the header itself. The user can do
anything with the space requested, but if anything is written outside of the allocated space the list is likely to be
scrambled.
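The scheme described above can be sketched compactly. To keep the example self-contained it makes several assumptions that differ from a real malloc: memory comes from a fixed static array instead of the operating system, so a request that does not fit simply fails. The names Header, sk_malloc, sk_free, and ARENA_UNITS are invented for this illustration. Element 0 of the array acts as a zero-size list head that is never handed out.

```c
#include <stddef.h>

typedef union header {               /* block header */
    struct {
        union header *next;          /* next block on the circular free list */
        size_t size;                 /* size of this block, in header units */
    } s;
    max_align_t align;               /* forces worst-case alignment of blocks */
} Header;

#define ARENA_UNITS 1024
static Header arena[ARENA_UNITS];    /* stands in for memory from the system */
static Header *freep = NULL;         /* where the last search stopped */

/* sk_malloc: first-fit allocation of nbytes from the arena */
void *sk_malloc(size_t nbytes)
{
    Header *p, *prevp;
    /* round up to whole units, plus one unit for the header itself */
    size_t nunits = (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;

    if (freep == NULL) {             /* first call: build the free list */
        arena[0].s.size = 0;
        arena[0].s.next = &arena[1];
        arena[1].s.size = ARENA_UNITS - 1;
        arena[1].s.next = &arena[0];
        freep = &arena[0];
    }
    prevp = freep;
    for (p = prevp->s.next; ; prevp = p, p = p->s.next) {
        if (p->s.size >= nunits) {           /* big enough: first fit */
            if (p->s.size == nunits)         /* exact fit: unlink it */
                prevp->s.next = p->s.next;
            else {                           /* too big: hand out the tail */
                p->s.size -= nunits;
                p += p->s.size;
                p->s.size = nunits;
            }
            freep = prevp;
            return (void *) (p + 1);         /* space just after the header */
        }
        if (p == freep)                      /* wrapped around: nothing fits */
            return NULL;
    }
}

/* sk_free: put the block ap back on the free list, joining neighbours */
void sk_free(void *ap)
{
    Header *bp, *p;

    if (ap == NULL)
        return;
    bp = (Header *) ap - 1;                  /* point at the block header */
    for (p = freep; !(bp > p && bp < p->s.next); p = p->s.next)
        if (p >= p->s.next && (bp > p || bp < p->s.next))
            break;                           /* bp belongs at an end of the list */
    if (bp + bp->s.size == p->s.next) {      /* join to upper neighbour */
        bp->s.size += p->s.next->s.size;
        bp->s.next = p->s.next->s.next;
    } else
        bp->s.next = p->s.next;
    if (p + p->s.size == bp) {               /* join to lower neighbour */
        p->s.size += bp->s.size;
        p->s.next = bp->s.next;
    } else
        p->s.next = bp;
    freep = p;
}
```

Because the list is kept in address order, sk_free only has to compare the end of one block with the start of the next to detect adjacency.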
The storage allocator typically operates through a set of system calls and functions that programs can use to
interact with the memory management system. Here are some key components and concepts related to a storage
allocator in Unix operating systems:
1. malloc and free:
   malloc: Stands for "memory allocation"; this function is used by programs to request a block
   of memory of a specified size.
   free: This function is used to release or deallocate a previously allocated block of memory.
2. Heap:
   The memory space from which dynamic memory is allocated is often referred to as the
   "heap." Unlike the stack (used for function call management), the heap allows for dynamic
   allocation and deallocation of memory.
3. Memory Pools and Strategies:
   Memory allocators often use strategies like first-fit, best-fit, or worst-fit to determine where in
   the heap to allocate memory.
   Some allocators implement memory pools, where they manage separate pools of memory for
   different types or sizes of allocations.
4. Memory Fragmentation:
   Fragmentation can occur in the heap over time, leading to inefficient use of memory. There
   are two types of fragmentation: external fragmentation (free memory scattered throughout the
   heap) and internal fragmentation (unused memory within allocated blocks).
5. Memory Allocators in C Library:
   The C Standard Library provides the memory allocation functions malloc, free, calloc, and
   realloc. On Unix systems these are typically implemented on top of system calls such as sbrk
   or mmap, which obtain memory from the kernel.
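As a short usage sketch of these library functions, the following hypothetical helper, append_int, grows an array of int on demand with realloc. The doubling strategy and the initial capacity of 8 are arbitrary choices for the example.

```c
#include <stdlib.h>

/* append_int: append value to the dynamic array *arr, which holds *len
   items in space for *cap; grows the array with realloc when it is full.
   Returns 0 on success, -1 if memory could not be obtained. */
int append_int(int **arr, size_t *len, size_t *cap, int value)
{
    if (*len == *cap) {
        size_t newcap = (*cap == 0) ? 8 : *cap * 2;
        int *p = realloc(*arr, newcap * sizeof **arr);

        if (p == NULL)
            return -1;               /* *arr is still valid and unchanged */
        *arr = p;
        *cap = newcap;
    }
    (*arr)[(*len)++] = value;
    return 0;
}
```

The caller starts with a NULL array and zero length and capacity, and passes the array to free when it is no longer needed.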
LISTING DIRECTORIES
The following is a list of key directories and their contents in the Unix operating system:
/bin - Essential command binaries (programs) needed by all users.
/dev - Special files only, including those that represent devices.
/etc - System-specific configuration files and files essential for system startup.
/home - The home directories of all users of the system.
/opt - Add-on (optional) software packages that are not part of the base operating system installation.
/tmp - As the name implies, this directory is used for holding temporary files.
/usr - Shareable, read-only programs, libraries, and documentation for the users of a system.
/var - Variable data such as system log files, mail system files, and print spooling system files.
/media - Default mount point for removable devices, such as USB sticks, media players, etc.
/mnt - Similar to /media, but used by the administrator to mount file systems manually.
/lib - System libraries and some critical files such as kernel modules or device drivers.
/boot - The kernel and the other files needed to boot the system.
/proc - Information about currently running processes.
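Whether a given path is a directory, a device special file, or an ordinary file can be checked from C with the stat system call. The following is a minimal sketch; the function name file_kind and its one-letter codes (modelled on the first column of ls -l) are choices made for this illustration.

```c
#include <sys/stat.h>

/* file_kind: return 'd' for a directory, 'c' or 'b' for a character or
   block special file, '-' for a regular file, '?' for anything else,
   and 0 if path cannot be examined */
int file_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return 0;
    if (S_ISDIR(st.st_mode))
        return 'd';
    if (S_ISCHR(st.st_mode))
        return 'c';
    if (S_ISBLK(st.st_mode))
        return 'b';
    if (S_ISREG(st.st_mode))
        return '-';
    return '?';
}
```

On a typical system, file_kind("/etc") reports a directory and file_kind("/dev/null") reports a character special file.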
The ls Command in Linux:
The ls command is used to list the files and directories within a directory in Linux. The following steps give
hands-on experience with the ls command:
1. Listing files/directories in a specific directory:
To list the files and folders within a specific directory, we make use of the following syntax:
$ ls [path_of_target_directory]
Example: In the below example, we are in the home directory and will list the files and folders inside the
Downloads directory using the below command:
$ ls Downloads
2. Listing files/directories in the parent directory:
To list the files and directories in the parent directory, one level above the current directory, we make use
of the following command:
$ ls ..
3. Listing only the directories in a directory:
We can make use of the following command to list only the directories within a directory:
$ ls -d */
4. Listing only the files in a directory:
ls has no option that lists only files (the -f option merely lists all entries, unsorted). A common way is to
have ls mark each directory with a trailing / by using -p, and then filter those lines out with grep:
$ ls -p | grep -v /
5. Listing all files together with the contents of subdirectories:
The shell expands * to every name in the current directory, so the following command lists each file and
the contents of each subdirectory:
$ ls *
6. Listing files recursively:
We can make use of the following command to list all the files and directories recursively, starting from the
current directory:
$ ls -R
7. Listing all files with their sizes:
We can make use of the following command to list all the files with their respective sizes (in blocks) within
a directory:
$ ls -s
8. Listing all files in long format:
We can make use of the following command to list all the files in long format, which includes the following
pieces of information:
file/directory name
size of file or directory
content owner and group
number of links to the content
file/directory permissions
time of last modification
$ ls -l
9. Listing all files including hidden files:
We can make use of the following command to list all the files including the hidden files within a directory:
$ ls -a
10. Listing all files ordered by size:
We can make use of the following command to list all the files within a directory and order them by size,
largest first:
$ ls -S
To reverse the sorting order we can make use of the -r flag as shown in the below command:
$ ls -Sr
11. Listing all files ordered by date and time:
We can make use of the following command to list all the files within a directory and order them by
modification time, newest first:
$ ls -t
To reverse the sorting order we can make use of the -r flag as shown in the below command:
$ ls -tr
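The core of what ls does can be sketched in C with the POSIX directory routines opendir, readdir, and closedir. This is only an illustration, not the implementation of ls: the function name list_dir is hypothetical, and unlike ls the names come out in whatever order the file system returns them, not sorted.

```c
#include <dirent.h>
#include <stdio.h>

/* list_dir: print the names in the directory path, one per line, skipping
   names that begin with '.' unless show_hidden is nonzero; returns the
   number of names printed, or -1 if the directory cannot be opened */
int list_dir(const char *path, int show_hidden)
{
    struct dirent *entry;
    int count = 0;
    DIR *dir = opendir(path);

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.' && !show_hidden)
            continue;                /* like plain ls: hide dot files */
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_dir(".", 0) corresponds roughly to ls, and list_dir(".", 1) to ls -a.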
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@