DATA STRUCTURE
A data structure is a logical arrangement of data elements.
Types of data structure:
Array (unordered): quick insertion, slow search
Ordered Array: fast search (binary search is possible) but slow insertion
Stack: Last in first out, Slow access
Queue: First in first out, Slow access
Linked List: Quick insert and delete
Binary Tree: Everything is quick but algo is complex
Sparse Array: most elements have the same value (usually zero).
In linked list representation of a string, characters are stored in memory
cells called nodes, and each node points to the next one.
To access substring in a string, we need:
Name of string
First character of sub string
Length of sub string
Iteration: process of executing statements repeatedly until the required
condition is met.
Notation of expressions: Infix A+B, Prefix +AB, Postfix AB+
Process of adding in STACK is called PUSH while deleting is called POP.
PEEK means reading the value at the top without removing it.
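The PUSH, POP and PEEK operations can be sketched in Python as below. This is only a minimal illustration: the class name Stack and the helper evaluate_postfix are assumed names, and the postfix evaluator assumes space-separated tokens with the four basic operators.

```python
class Stack:
    """A minimal LIFO stack backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)   # PUSH: add on top

    def pop(self):
        return self._items.pop()    # POP: remove and return the top value

    def peek(self):
        return self._items[-1]      # PEEK: read the top value, do not remove

    def is_empty(self):
        return not self._items


def evaluate_postfix(expression):
    """Evaluate a space-separated postfix expression such as '2 3 +'."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = Stack()
    for token in expression.split():
        if token in operators:
            right = stack.pop()     # second operand was pushed last
            left = stack.pop()
            stack.push(operators[token](left, right))
        else:
            stack.push(float(token))
    return stack.pop()
```

For example, the infix expression 2 + 3 * 4 is written 2 3 4 * + in postfix, and evaluate_postfix("2 3 4 * +") gives 14.0.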
Two ends of a QUEUE are called front and rear. Insertion is done at rear end
while deletion is done from front end.
Circular Queue: last position is connected back to the first, so slots freed
at the front are used again.
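A circular queue on a fixed-size array can be sketched as below. The class name CircularQueue and its method names are assumed for illustration; the key point is the modulo step that wraps the rear position back to index 0.

```python
class CircularQueue:
    """Fixed-capacity FIFO queue whose indices wrap around the array."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._capacity = capacity
        self._front = 0     # index of the oldest element
        self._size = 0      # number of stored elements

    def enqueue(self, value):
        if self._size == self._capacity:
            raise OverflowError("queue is full")
        rear = (self._front + self._size) % self._capacity  # wrap around
        self._data[rear] = value
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue is empty")
        value = self._data[self._front]
        self._data[self._front] = None
        self._front = (self._front + 1) % self._capacity     # wrap around
        self._size -= 1
        return value
```

With capacity 3, after inserting 1, 2, 3 and deleting 1, the next inserted value is stored in slot 0, which was freed by the deletion.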
Advantages of linked list:
This is dynamic structure
Memory allocation is properly used
Operations can be easily performed
Concatenation is the process of joining two linked lists end to end.
There are three prominent types of linked lists 1) Single linked 2) Doubly
linked 3) Circular linked
Check below image to understand root, node, level, child etc of a tree:
Degree at any node = Number of connected sub trees.
Degree of tree is maximum degree of any node.
Leaf Node: where structure ends or degree=0
Height or depth of tree is number of levels involved in tree.
There are three ways of binary tree traversal:
Pre Order: Root, Left, Right
In Order: Left, Root, Right
Post Order: Left, Right, Root
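The three traversal orders can be sketched with short recursive functions. The Node class and the sample tree (A at root, B and C as children, D and E under B) are assumed here only for illustration.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """Root, Left, Right."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """Left, Root, Right."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """Left, Right, Root."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


# Sample tree:      A
#                  / \
#                 B   C
#                / \
#               D   E
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
```

For this tree, pre order gives A B D E C, in order gives D B E A C and post order gives D E B C A.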
AVL tree is a self balancing binary search tree. Heights of the two sub trees
of any node differ by at most 1.
Undirected Graph: Where edges have no direction. Directed graph is
opposite.
Spanning tree is a sub graph of a connected undirected graph that includes
all vertices and has no cycle.
Trivial Graph: graph with only one vertex and no edges.
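Time complexity is a measure of how running time of an algorithm grows with
the size of input.
Fibonacci search is also a way of searching a sorted array like binary search,
but it splits the array using Fibonacci numbers and needs only addition and
subtraction.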
Bubble sort is a simple sorting method. Each element is compared with its
adjacent element and the two are swapped if out of order, to make an
increasing order.
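A minimal sketch of bubble sort is given below. The function name bubble_sort is assumed, and the early exit when no swap happens in a pass is a common optional improvement, not a required part of the method.

```python
def bubble_sort(values):
    """Return a new list sorted in increasing order using bubble sort."""
    items = list(values)            # work on a copy, leave the input as it is
    n = len(items)
    for done in range(n - 1):
        swapped = False
        # After each pass the largest remaining element is at the end.
        for i in range(n - 1 - done):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:             # no swap means the list is already sorted
            break
    return items
```

In the worst case the number of comparisons grows with the square of the number of elements, which is why quick sort, merge sort and heap sort are preferred for large data.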
Three most popular sorting techniques are quick sort , merge sort and heap
sort.
DATA COMMUNICATION
Digital data can traverse both serially as well as in parallel.
Asynchronous transmission sends each character between a START and a STOP
bit. Synchronous transmission sends blocks of data timed by a shared clock.
Simplex: information flows in one direction.
Half Duplex: communication in both directions, but only one way at a time.
Full Duplex: Simultaneous connection of both ways.
Multiplexing is the technique that allows multiple signals to traverse across
single data link.
To complete digital communication, multiplexer and de-multiplexer both are
required.
Multiplexing is of three types:
FDM: it stands for frequency division multiplexing. Total bandwidth
is divided in frequencies. Each channel gets its own frequency.
WDM: it stands for wavelength division multiplexing. Light of different
wavelengths is multiplexed into one fibre.
TDM: It stands for time division multiplexing. Each channel gets the
link for its own time slot. Switches are used to carry this type of
multiplexing.
OSI stands for open systems interconnection. This is a standard reference
model for telecommunication and computer communication.
There are seven layers in OSI as listed below:
1. Physical layer: physical data system is defined in this layer.
Transmission mode is defined in this layer ( Simplex, Duplex etc ).
Network topology is defined in this layer of communication.
2. Data Link Layer: it provides node to node data transfer. Data link layer
is sub divided into two parts a) media access control b) logical link
control
3. Network layer: this layer provides functional and procedural means for
data transfer between nodes on different networks (logical addressing and
routing).
4. Transport layer: It is responsible for delivering message between
network hosts. It also accepts data from upper layers and prepares it
for addressing at network layer.
5. Session layer: it controls connection between computers in network.
6. Presentation layer: Data compression, decompression, encryption,
decryption are completed in this layer
7. Application layer: this layer communicates directly with user. It
provides user services like user login, naming network devices,
formatting messages, and e-mails, transfer of files etc.
Piggy Backing: in bi-directional data transmission, acknowledgement is
attached to an outgoing data frame instead of being sent as a separate frame.
Poll and Select is also network communication system. Based on master
slave configuration.
Error detection is done in Data Link Layer while error correction is done in
Transport Layer.
Datagram: data transfer system in which arrival time and order of packets
are not fixed.
Quick Pack: it makes adjustment between PC rate and dial up connection
transfer rate.
BAUD Rate: Number of signal units (symbols) transferred per second.
Parity Bit: ASCII is actually a 7 bit code but in general it is stored as 8
bits. An extra bit is added to the 7 data bits, usually as the eighth (most
significant) bit. This extra bit works for error check.
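The parity check can be sketched as below. This sketch assumes even parity and assumes the parity bit is placed in the most significant (eighth) position; the function names are made up for illustration.

```python
def add_even_parity(char):
    """Return the 8-bit value of a 7-bit ASCII character with even parity."""
    code = ord(char)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2   # 1 if the count of 1 bits is odd
    return (parity << 7) | code         # parity bit goes in bit 7 (assumed)


def check_even_parity(byte):
    """True if the total number of 1 bits is even, i.e. no error detected."""
    return bin(byte).count("1") % 2 == 0
```

'A' is 1000001 with two 1 bits, so its parity bit is 0. 'C' is 1000011 with three 1 bits, so its parity bit is 1 and the byte becomes 11000011. A single flipped bit makes the check fail, but two flipped bits go undetected.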
Cyclic Redundancy Check: this is an error detecting code system.
Hamming code detects as well as corrects errors.
WAN coverage area is at least a country or continent. WAN connection is
made using Subnet. Subnet means part of a big network.
Repeater operates in physical layer. On travelling distance, signal gets
attenuated and this loss is restored by repeater.
Router decides the route of data packets.
A gateway is a network node that connects two networks using different
protocols together. While a bridge is used to join two similar types of
networks.
Handshaking is a process in which communication rules are set prior to
beginning of communication.
IPv6 is gradually replacing IPv4, but IPv4 is still widely used.
Network topology is an arrangement of various elements of network. Nodes,
Links etc.
Peer to peer is decentralized communication system where each node can
communicate with other.
Topology can be further divided in physical and logical topology. Physical
topology is concerned with components like cables, modems etc. On other
hand logical topology is concerned with data flow in network.
Various Types of Topologies:
Bus: central cable is used for connection .
Star: central hub is used to make connection. This is based on point to
point connection. Central hub can be a router or switch.
Ring Topology: every connected device works as repeater to keep
signal alive. There is no central controller in this.
Tree topology is combination of Bus and Star.
Daisy Chaining: process of adding new computer in series. This is not
possible in STAR topology.
DBMS
DBMS is a collection of files and program through which files can be
accessed and modified.
Flat file is file of records with no structured relation.
Master file is the main file in database which changes rarely.
If instead of database, traditional style of file access used then two common
problems are data redundancy and data security.
There are three levels of architecture of DBMS:
External view level
Conceptual schema
Internal Physical level
DBMS entity is identified by its attributes.
Strong entity: which has primary key in attributes.
Weak entity: No primary key available
A weak entity can be turned to a strong one by addition of some attributes.
ER model stands for entity relationship model.
Hierarchical data model: data is organized in parent child hierarchy.
Network data model is based on many to many relation.
Rows other than the heading row in a table are called tuples.
Some popular RDBMS are Oracle, Sybase, Informix, Microsoft SQL Server etc.
Candidate key: column or set of columns through which database records
can be uniquely identified. Best candidate key is called primary key.
Foreign key: primary key of one table is considered as foreign key of
another table. This is used to make relation from one table to another table.
Normalization is the process of organizing data in a database. This includes
creating tables and establishing relationships between those tables according
to rules designed both to protect the data and to make the database more
flexible by eliminating redundancy and inconsistent dependency. Through
normalization, redundancy as well as insert, update and delete anomalies are
controlled.
Normalization rules are 1NF, 2NF, 3NF and BCNF.
1NF: any row must not have a column with more than one value.
Example of 1NF below:
2NF: there must not be any partial dependency on primary key. A column that
depends on only a part of a composite key is moved to a separate table.
3NF: every non prime attribute of table must be dependent only on primary
key. If this is not the case then separate values from that table. Check image
below:
Transitive dependency: if first factor depends on second and second depends
on third then first is also dependent on third, this is called transitive
dependency.
Types of statement:
DDL: CREATE, ALTER, DROP, TRUNCATE
DML: SELECT, INSERT INTO, UPDATE, DELETE FROM
DCL: GRANT, REVOKE
TCL: COMMIT, ROLLBACK
In competitive exams, questions ask which keyword belongs to which
part. Example: does REVOKE come in DDL, DML or DCL? Answer: DCL
Truncate statement is used to delete all rows from table. Data gets deleted
but table definition remains intact.
DELETE and TRUNCATE seem same but TRUNCATE comes under DDL
while DELETE comes under DML. In most DBMS, data deleted through TRUNCATE
cannot be recovered while data deleted with DELETE can be recovered using
ROLLBACK before COMMIT.
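Recovery of deleted rows with ROLLBACK can be tried with Python's built-in sqlite3 module, as in the sketch below. SQLite has no TRUNCATE statement, so only the DELETE side is shown; the table name student and the sample names are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")      # temporary in-memory database
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student (name) VALUES (?)",
    [("Asha",), ("Ravi",), ("Meena",)],
)
conn.commit()                           # the three rows are now permanent


def row_count():
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


conn.execute("DELETE FROM student")     # DML: runs inside a transaction
rows_after_delete = row_count()         # 0 rows visible inside the transaction

conn.rollback()                         # undo the uncommitted DELETE
rows_after_rollback = row_count()       # the 3 rows are back
```

If conn.commit() were called after the DELETE, the rollback would have nothing left to undo.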
Data Control Languages are used to give and take back privileges.
BCNF is a stricter version of 3NF. A 3NF table which does not have
multiple overlapping candidate keys is also in BCNF.
Constraints are used to specify rule for data in table.
A table can have only one primary key, but it may be made of more than one
column. This is called composite primary key.
1NF: Attribute values must be atomic. Example: there is column named
COLOR and one row holds two values in it, RED and BLUE. Because of two
values in one cell this is not 1NF.
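The COLOR example can be brought to 1NF by giving each value its own row. The sketch below does this on plain Python dictionaries; the function name, the ITEM column and the sample rows are assumed only for illustration.

```python
def to_first_normal_form(rows, column, separator=","):
    """Split a multi-valued column so that every cell holds one value."""
    result = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)             # copy the other columns as they are
            new_row[column] = value.strip()
            result.append(new_row)
    return result


# Hypothetical table that is not in 1NF: COLOR holds two values in one cell.
items = [
    {"ITEM": "Shirt", "COLOR": "RED, BLUE"},
    {"ITEM": "Cap", "COLOR": "GREEN"},
]
normalized = to_first_normal_form(items, "COLOR")
```

After the split the Shirt row appears twice, once with RED and once with BLUE, and every COLOR cell is atomic.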
2NF: table must be in 1NF. All non key attributes must depend on the whole
primary key. Problem comes in case of composite primary key, when an
attribute depends on only a part of it.
3NF: 2NF must be followed. There should be no transitive dependency
among attributes.
BCNF: this is advanced version of 3NF with no overlapping of candidate
keys.
4NF: Works in case of multi-valued dependency. Multi-valued dependencies
are eliminated.
5NF: this is also called project join normal form. Table is divided into more
than one table without loss of information.
Super key is a set of columns whose combined values are unique for every row.
DATA MINING
Data Mining means extracting useful patterns from large sets of data.
There are 4 main techniques of data mining:
Clustering
Classification
Regression: a numeric value is predicted from other values
Association rule: finding items bought together, e.g. customer habits in
supermarket.
There are two types of data processing systems:
OLTP: Online transaction processing
OLAP: Online analytical processing
For data updates in Ware House, ETL is used. ETL stands for extraction,
transformation and loading.
There are two prominent schema for data ware house:
Star Schema
Snowflake Schema
Data mart: subset of a data warehouse holding a particular type of data.
OPERATING SYSTEM
What is Operating System?
An operating system (OS) is a collection of software that manages computer
hardware resources and provides common services for computer programs.
An Operating System (OS) is an interface between a computer user and computer
hardware.
What is purpose of operating system?
An operating system is a software which performs all the basic tasks like file
management, memory management, process management, handling input and
output, and controlling peripheral devices such as disk drives and printers.
Memory Management
Processor Management
Device Management
File Management
Security
Control over system performance
Job accounting
Error detecting aids
Coordination between other software and users
Which are popular operating systems?
Some popular Operating Systems include Linux Operating System, Windows
Operating System, VMS, OS/400, AIX, z/OS, etc.
What is memory management?
Memory management refers to management of Primary Memory or Main Memory
for execution of any program. In multiprogramming, the OS decides which process
will get memory, when and how much.
What is Processor Management?
In multiprogramming environment, the OS decides which process gets the
processor when and for how much time. This function is called process scheduling.
What is file management?
A file system is normally organized into directories for easy navigation and usage.
These directories may contain files and other directories. The OS keeps track
of information, location, uses, status etc.
Types of Operating System:
What is Batch Operating System?
Batch processing is a technique in which an Operating System collects the
programs and data together in a batch before processing starts. In this type
of operating system the user does not interact with the computer directly.
Example: punch card based systems
What is time sharing operating system?
Time-sharing is a technique which enables many people, located at various
terminals, to use a particular computer system at the same time. This is an
extension of multi programming. Example: Unix
Multi programming: single processor and multiple programs. Computer running
Google Chrome and Word file simultaneously is example of multi programming.
What is distributed operating system?
Distributed systems use multiple central processors to serve multiple real-time
applications and multiple users. Data processing jobs are distributed among the
processors accordingly. Example: Plan 9, MOSIX
Distributed operating system is also known as loosely coupled system.
What is Network Operating System?
A Network Operating System runs on a server and provides the server the
capability to manage data, users, groups, security, applications, and other
networking functions. Examples: Windows, Linux, Mac etc.
What is Real Time Operating System?
Real time operating system is intended to take and execute data instantly without
any buffer delay.
What is multitasking?
Multitasking is when multiple jobs are executed by the CPU simultaneously by
switching between them. Switches occur so frequently that the users may interact
with each program while it is running.
Multitasking Operating Systems are also known as Time-sharing systems.
Difference between multi programming and multi tasking?
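Multiprogramming keeps several programs in memory so that the CPU switches to
another program when one waits for I/O, whereas Multitasking also switches on
fixed time slices so that every program stays responsive. Both can work on a
single processor. Use of multiple processors is called multiprocessing.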
What is spooling?
Spooling is an acronym for simultaneous peripheral operations on line. Spooling
refers to putting data of various I/O jobs in a buffer.
When a program is loaded into the memory and it becomes a process, it can be
divided into four sections : stack, heap, text and data.
Stack contains the temporary data such as method/function parameters, return
address and local variables.
Heap: this is dynamically allocated memory to a process during its run time.
What is process scheduling?
The process scheduling is the activity of the process manager that handles the
removal of the running process from the CPU and the selection of another process
on the basis of a particular strategy.
A Process Scheduler schedules different processes to be assigned to the CPU based
on particular scheduling algorithms.
Which are popular CPU scheduling algorithms?
First-Come, First-Served (FCFS) Scheduling
Shortest-Job-Next (SJN) Scheduling
Priority Scheduling
Shortest Remaining Time
Round Robin(RR) Scheduling
Multiple-Level Queues Scheduling
These algorithms are either non-preemptive or preemptive. Non-preemptive
algorithms are designed so that once a process enters the running state, it cannot be
preempted until it completes its allotted time, whereas the preemptive scheduling is
based on priority where a scheduler may preempt a low priority running process
anytime when a high priority process enters into a ready state.
What is priority scheduling?
Priority scheduling can be preemptive or non-preemptive. In its basic form it
is a non-preemptive algorithm and one of the most common
scheduling algorithms in batch systems. Each process is assigned a priority.
Process with highest priority is to be executed first and so on. Processes with same
priority are executed on first come first served basis. Priority can be decided based
on memory requirements, time requirements or any other resource requirement.
What is round robin scheduling?
Round Robin is a preemptive process scheduling algorithm. Each process is
provided a fixed time to execute, called a quantum. Once a process has
executed for the given time period, it is preempted and another process
executes for the given time period.
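Round robin can be simulated in a few lines. The sketch below assumes that all processes arrive at time 0 and that context switch time is zero; the function name and the process names P1, P2, P3 are made up for illustration.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Return the completion time of each process under round robin.

    burst_times maps process name to CPU time needed.
    Assumes every process is ready at time 0.
    """
    remaining = dict(burst_times)
    ready = deque(burst_times)          # ready queue in arrival order
    clock = 0
    completion = {}
    while ready:
        name = ready.popleft()
        run = min(quantum, remaining[name])
        clock += run
        remaining[name] -= run
        if remaining[name] == 0:
            completion[name] = clock    # process has finished
        else:
            ready.append(name)          # preempted, goes to the back
    return completion
```

With burst times P1 = 5, P2 = 3, P3 = 1 and quantum 2, the order of execution is P1, P2, P3, P1, P2, P1 and the completion times are P3 = 5, P2 = 8 and P1 = 9.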
What is thread?
A thread is a flow of execution through the process code, with its own program
counter that keeps track of which instruction to execute next.
What is program counter?
A program counter is a register in a computer processor that contains the address
(location) of the next instruction to be executed.
How many types of threads?
There are two primary types of threads:
1. User level thread: user manages various aspects
2. Kernel level thread: Operating system manages
What is Swapping?
Swapping is a mechanism in which a process can be swapped temporarily out of
main memory (or move) to secondary storage (disk) and make that memory
available to other processes.
What is Fragmentation?
As processes are loaded and removed from memory, the free memory space is
broken into little pieces. It happens after some time that processes cannot be
allocated to memory blocks because the blocks are too small, and these memory
blocks remain unused. This problem is known as Fragmentation.
What is paging?
In computer operating systems, paging is a memory management scheme by which
a computer stores and retrieves data from secondary storage for use in main
memory.
What is Direct Memory Access?
Direct memory access (DMA) is a method that allows an input/output (I/O) device
to send or receive data directly to or from the main memory, bypassing the CPU to
speed up memory operations.
What is polling I/O?
Polling, or polled operation, in computer science, refers to actively sampling the
status of an external device by a client program.
Most of the time, devices will not require attention and when one does it will have
to wait until it is next interrogated by the polling program. This is an inefficient
method and much of the processor's time is wasted on unnecessary polls.
What is Interrupt I/O?
An alternative scheme for dealing with I/O is the interrupt-driven method. An
interrupt is a signal to the microprocessor from a device that requires attention.
What is Semaphore?
In computer science, a semaphore is a variable or abstract data type used to control
access to a common resource by multiple processes in a concurrent system such as
a multitasking operating system.
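A counting semaphore can be tried with Python's threading module. In the sketch below the semaphore allows at most 2 threads in the shared section at a time; the function name run_workers and the bookkeeping dictionary are assumed only to make the effect visible.

```python
import threading
import time


def run_workers(worker_count=6, permits=2):
    """Start workers and return the highest number active at the same time."""
    semaphore = threading.Semaphore(permits)    # at most `permits` holders
    lock = threading.Lock()                     # protects the counters below
    state = {"active": 0, "max_active": 0}

    def worker():
        with semaphore:                         # wait (P) on entry, signal (V) on exit
            with lock:
                state["active"] += 1
                state["max_active"] = max(state["max_active"], state["active"])
            time.sleep(0.01)                    # pretend to use the resource
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["max_active"]
```

However many workers are started, the returned value never exceeds the number of permits. A semaphore with one permit behaves like a mutual exclusion lock.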
What is deadlock?
Deadlock is a situation where a set of processes are blocked because each process
is holding a resource and waiting for another resource acquired by some other
process. Banker's algorithm, given by Dijkstra, is used to avoid deadlock.
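Deadlock avoidance is usually illustrated with the Banker's algorithm, which is attributed to Dijkstra. The sketch below shows only its safety check: a state is safe if the processes can finish in some order with the resources available. The function name is_safe and the sample matrices (a common textbook style example with 5 processes and 3 resource types) are assumed for illustration.

```python
def is_safe(available, allocation, maximum):
    """Banker's algorithm safety check.

    Returns (safe, sequence): safe is True if all processes can finish,
    sequence is one order in which they can do so.
    """
    work = list(available)
    count = len(allocation)
    need = [
        [m - a for m, a in zip(maximum[i], allocation[i])]
        for i in range(count)
    ]
    finished = [False] * count
    sequence = []
    progressed = True
    while progressed:
        progressed = False
        for i in range(count):
            if not finished[i] and all(n <= w for n, w in zip(need[i], work)):
                # Process i can finish and give back what it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                sequence.append(i)
                progressed = True
    return all(finished), sequence


allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
safe, order = is_safe([3, 3, 2], allocation, maximum)
```

With 3, 3 and 2 units available the state is safe and one safe order is P1, P3, P4, P0, P2. With nothing available, no process can finish and the state is unsafe.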
QUESTIONS ASKED IN OTHERS EXAMS OF COMPUTER SCIENCE
C Rangarajan was chairman of committee that brought in computerization of
banks.
UNIX is an example of multi user and multi tasking operating system.
Large organization like Bank runs through EDP ( Electronic Data Processing
). EDP is also popularly known as MIS ( Management Information System ).
BIT System was developed by Claude E. Shannon.
IC was first developed by Jack Kilby.
PARAM was India’s first super computer while Atlas was first in the world.
Frame is digital data transmission unit used in the telecommunication.
ASCII stands for American Standard Code for Information Interchange.
Nowadays UTF-8, which is backward compatible with ASCII, is the most widely
used encoding.
BER: Bit error rate is number of data bits received with some distortion per
unit time.
Firewall is used for security of networks.
DSL or Digital Subscriber Line is technology used for data transmission
over telephone lines.
Data warehouse: this is central data repository system under which data
analysis is performed.
Head Quarter of C-DAC is Pune.
XML stands for extensible markup language. This can be read by human as
well as machine.
DMA stands for direct memory access that is a feature in which RAM is
accessed without intervention of CPU.
FLOPS stands for floating point operations per second. This measures
computer’s performance.
ICM or image color matching system comes in Windows and assures that
color remains intact while printing multiple devices on same platforms.
Transducer device is used to convert Physical quantity into analog signal.
BCC in mailing stands for Blind carbon copy.
8 Bits = 1 Byte, 4 Bits = 1 Nibble
BISDN stands for broadband integrated services digital network.
Bandwidth: amount of data that can flow through channel.
Clock produces pulses and later these pulses are used for synchronization.
Through Defragmentation speed of computer can be improved.
Modems (modulators/demodulators) are data communication devices that
convert digital signals to analog signals, and vice versa.
Domain Name System is a decentralized naming system for computers in
network. It translates domain names into IP addresses.
LAN is prepared through Ethernet cable. Wi-Fi and some others options can
be also used to create a LAN.
FAT stands for file allocation table. This is the way file is organized on hard
disk. FAT 16 and 32 are two prominent options.
Handshaking: A method of dataflow control between two devices so that one
device ensures data is transmitted only when other device is ready.
Spoofing: an act of pretending to be a trusted source to get unauthorized
access to someone’s computer.
Trunk is aggregation of multiple telecom lines to get high bandwidth.
Netizens: citizen of any country with access of internet.
Informix and Ingres are two popular databases.
Artificial intelligence is concerned with fifth generation of computer.
First generation Computer: Vacuum tube
Second generation Computer: Transistor
Third generation Computer: IC
Fourth generation Computer: VLSI
Fifth generation Computer: Artificial Intelligence
Freeware software programs are those that are free to use while shareware
are free in beginning and later that will become paid.
Address path inside Windows Operating System is not case sensitive.
Resource sharing is the main feature of networking. Here resource means
RAM, Hard Disk, Internet Connection etc.
Peer to peer connection has no centralized system like a computer connected
with internet.
In STAR topology HUB is used for central connections. HUB works like
router in this topology
Port number is written after IP address with colon. Example
11.21.33.255:18
Port 80 is the default port where a web (HTTP) server listens to clients.
Port 110 is used by POP3 mail server.
POP or post office protocol and IMAP are used for receiving mail, while
SMTP is used for sending mail.
Digital divide is the term used for the gap between people connected with
internet and people without internet.
Encryption is also called text masking. Initially it was used by military.
Token key is a password that changes with time.
In some exams conversion of Binary to Decimal and vice versa asked so
check image below:
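The conversion can also be followed step by step in code. The sketch below uses the usual hand methods: multiply by 2 and add the next bit for binary to decimal, and repeated division by 2 for decimal to binary. The function names are assumed, and only non-negative whole numbers are handled.

```python
def binary_to_decimal(bits):
    """Convert a string of 0s and 1s to its decimal value."""
    value = 0
    for bit in bits:
        if bit not in "01":
            raise ValueError("not a binary digit: " + bit)
        value = value * 2 + int(bit)    # shift left, then add the new bit
    return value


def decimal_to_binary(number):
    """Convert a non-negative whole number to a binary string."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        digits.append(str(number % 2))  # remainder is the next bit
        number //= 2
    return "".join(reversed(digits))    # remainders are read bottom to top
```

For example, binary 101 is 1×4 + 0×2 + 1×1 = 5, and decimal 10 gives remainders 0, 1, 0, 1, which read in reverse are 1010.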
SWADHAN network is first ever banking network developed by Indian
bank association for shared ATM system.
HWAK is a type of ATM
Cryptography can be called digital identification certification.
MICR stands for magnetic ink character recognition. This is used for bank
cheque.
Data mining: a way to retrieve data from database.
DSS stands for decision support system. Nowadays commercial computers
are equipped with DSS.
IT ACT was introduced in year 2000.
Von Neumann is a computer design model.
There are three types of buses for computer communication namely address,
data and control buses.
Registers are small data storage units used during data processing.
RAID stands for redundant array of independent disks. This is used for
data storage virtualization.
Program execution takes place in two phases a. instruction fetch b.
instruction execute
IR hold current instruction and clock is primarily used for synchronization.
Acknowledgement comes from control bus.
There are two design approaches for control unit: hardwired and Micro
Programming
Hardwired: control signals are generated through hardware. Flip flop, logic
gates, decoders, multiplexers etc.
Microprogramming: all instructions are executed with help of stored
programs.
Hardwired control is typical of RISC while Microprogramming is typical of
CISC.
RISC stands for reduced instruction set computing
CISC stands for complex instruction set computing
Nowadays RISC is mainly used in computer system.
Logic gates are considered building blocks of digital circuit.
Gate delay is the delay in computation when we change input value.
Common arithmetic operations are ADD, SUBTRACT, MULTIPLY,
DIVIDE, NEGATE, INCREMENT, DECREMENT.
There are three types of instruction set a) data transfer instructions b)
Arithmetic instructions c) Logical and program instruction
Three types of data transfers are possible namely register to register, register
to memory and memory to register, while memory to memory is not possible in
load/store (RISC) designs.
RISC is better than CISC because
Low number of instructions
Low number of addressing modes
Low number of instruction formats
Execution through register to register so minimum memory space
required.
Cache is a small and fast memory storage place between
processor and RAM.
HIT and MISS are terms used for data found and not found in cache.
Cache memory organization are of three types a) Direct mapped b) Fully
associative c) Set associative
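HIT and MISS in a direct mapped cache can be sketched as below. In direct mapping each memory block can go to only one cache line, found here as block number modulo number of lines. The function name and the sample block numbers are assumed for illustration.

```python
def simulate_direct_mapped(block_numbers, num_lines):
    """Return 'HIT' or 'MISS' for each accessed memory block."""
    lines = [None] * num_lines          # block currently stored in each line
    results = []
    for block in block_numbers:
        index = block % num_lines       # the only line this block may use
        if lines[index] == block:
            results.append("HIT")
        else:
            results.append("MISS")
            lines[index] = block        # replace whatever was in the line
    return results


accesses = simulate_direct_mapped([0, 4, 0, 1, 1], num_lines=4)
```

Blocks 0 and 4 both map to line 0, so they keep replacing each other and the third access is a MISS even though block 0 was loaded earlier. Fully associative and set associative mapping reduce this kind of conflict.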
RAM memory is divided in block while cache is divided in lines.
Associative memory can retrieve data with partial information about data.
Flip-Flop or Latch always has two stable states and stores data information.
Flip Flop are considered as basic block building in any digital circuit.
There are 4 prominent types of flip flops namely SR (Set Reset), D (Data),
T (Toggle) and J-K.
Virtual memory is a concept in which processor recognizes more memory
than actually available.
Volatile memory: those which lose data when power is switched off.
Example: Cache
It is named Random access memory because data can be accessed in any order.
Static RAM is made through Flip Flop while Dynamic RAM is made of
transistor and capacitor.
Dynamic RAM has to be refreshed periodically while running and the same is
not needed for Static RAM.
Dynamic RAM is used when speed is not a major aspect and the requirement is
large capacity at low cost.
IEEE stands for Institute of Electrical and Electronics Engineers.
Fragmentation means scattering of file among various hard disk clusters.
Input output can be either Interrupt driven or polling driven.
Interrupt is better than polling and nowadays widely used because it is fast.
LDR stands for light dependent resistor. Its resistance changes with the
light falling on it.
YACC or yet another compiler compiler is actually a parser generator.
Loader: it takes code in machine language and place in the main memory.
Parser is software that takes input data and builds a data structure mainly
some kind of parse tree, abstract syntax tree or other structure.
Compiler checks for error in program and list them. Interpreter checks error
statement by statement.
Lexer breaks code into token.
Dirty bit is associated with computer block memory and indicates whether or
not the block memory has been changed.
Virus: expands within a computer to other program.
Trojan horse, or Trojan, is any malware which misleads users of its true
intent.
Worm: expands to other computer also
Logic bomb: Trigger action when condition occurs.
Polymorphic virus: mutates with every infection
Metamorphic virus: this re-writes itself every time
Sniffing: It means capturing data packets travelling over a network, for
example to read passwords.
Spoofing: legitimate looking email ID or address used for information
retrieval.
In RACE Condition, computed values depend on the timing or order of
execution.
2011 STET QUESTIONS
1. Which type of computer is portable?
2. Which word is not related to email? a) Power point b) Inbox c) Outbox d)
receiver
3. Printer and Monitor are? Answer: Hardware
4. How many megabytes equal one Giga Byte?
5. Educational institutes have domain extension? Answer: EDU
6. Part of computer that can’t be touched? Answer: Software
7. What is ROM in computer?
8. What can be alternate name of computer program?
9. What is called computer brain?
10. How to delete any part of document?
11. Which type of work is not done by computer?
12. What is full form of BIT?
13. What is full form of PROM?
14. What is full form of RAM?
15. Which type of device is floppy?
16. How to edit excel worksheet in power point?
17. What is full form of ROM?
18. What is an example of Track ball?
19. Which computer part can be touched?
20. Convert binary to decimal: 101
21. Which is the most common input device?
22. Which term is related with internet?
23. What is advantage of DRAM over SRAM?
24. What is other name of RAM?
25. Which is faster? SRAM or DRAM
26. What is device driver?
27. What is LINUX?
28. What is full form of POST?
29. Topology that uses common cable?
30. What is meaning of HOST in internet?
31. What is unique identification of any website?
32. Which type of software Microsoft Word is?
33. What is use of numeral named folder in database?
34. What is used to write code of web page?