DMC Unit 4 Notes

The document discusses the evolution and characteristics of Distributed Database Management Systems (DDBMS), highlighting their advantages over centralized systems, such as improved data access and reduced costs, as well as their disadvantages, including management complexity and security issues. It outlines the different levels of data and process distribution, transparency features, and the importance of transaction and performance management in DDBMS. Additionally, it covers query optimization techniques and the need for standard protocols to ensure efficient operation and data integrity across distributed environments.

Uploaded by

unnatibariya046

Sem 4 (NEP 2020) Subject = DMC Unit = 4

* The Evolution of DDBMS


- A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems in which both data and processing are distributed among several sites.
- During the 1970s, corporations implemented centralized database management systems to meet their structured information needs.
- Over time, two database requirements became obvious:
1) Rapid ad hoc data access became crucial in the quick-response decision-making environment.
2) The decentralization of management structures, based on the decentralization of business units, made decentralized multiple-access and multiple-location databases a necessity.
- The dynamic business environment and the centralized database’s shortcomings spawned a demand for applications based on accessing data from different sources at multiple locations. Such a multiple-source/multiple-location database environment is best managed by a DDBMS.

* Advantages & Disadvantages of DDBMS:


- A DDBMS has several advantages over a centralized database:
1) Data are located near the greatest-demand site
2) Faster data access
3) Faster data processing
4) Growth facilitation
5) Improved communications
6) Reduced operating costs
7) User-friendly interface
8) Less danger of a single-point failure
9) Processor independence

- DDBMSs are also subject to some problems (disadvantages):

1) Complexity of management and control
2) Technological difficulty
3) Security
4) Lack of standards
5) Increased training cost
6) Various costs

- The inherently complex distributed data environment increases the urgency for standard protocols governing transaction management, concurrency control, security, backup, recovery, query optimization, access path selection, and so on.

* DISTRIBUTED PROCESSING & DISTRIBUTED DATABASE


- Distributed processing does not require a distributed database, but a distributed database requires distributed processing.
- Distributed processing may be based on a single database located on a single computer; in a distributed database, the processing functions must be distributed to all data storage sites.
- Both require a network connection for communication.

- CHARACTERISTICS OF DISTRIBUTED DATABASE MANAGEMENT SYSTEMS [EXTRA NOTES]


- Application interface
- Validation
- Transformation
- Query optimization
- Mapping
- I/O interface
- Formatting
- Security
- Backup and recovery
- DB administration
- Concurrency control
- Transaction management
- all of the functions of a centralized DBMS
- all necessary functions imposed by the distribution of data and processing
- perform those additional functions transparently to the end user

- DDBMS COMPONENTS [EXTRA NOTES] = figure 12.5


- computer workstations or remote devices
- network hardware and software components
- communication media
- the Transaction Processor (TP) = also known as the Application Processor (AP) or the Transaction Manager (TM)
- the Data Processor (DP)
- protocols (a specific set of rules) used by the DDBMS for communication between TPs and DPs

* LEVEL OF DATA AND PROCESS DISTRIBUTION (Table 12.2)


1) single-site processing, single-site data (SPSD)
2) multiple-site processing, single-site data (MPSD)
3) multiple-site processing, multiple-site data (MPMD)
4) single-site processing, multiple-site data (SPMD)

* 1) single-site processing, single-site data (SPSD) = figure 12.6


- The single host computer may be a single-processor server, a multiprocessor server, or a mainframe system.
- The DBMS of the host computer is accessed by dumb terminals connected to it.
- The functions of the TP and the DP are embedded within the DBMS, which usually runs under a time-sharing, multitasking OS.

* 2) multiple-site processing, single-site data (MPSD) = figure 12.7


- multiple processes run on different computers sharing a single data
repository.
- requires a network file server running conventional applications that are
accessed through a network.
- features:
1) The TP on each workstation acts only as a redirector to route all
network data requests to the file server.
2) The end user sees the file server as just another hard disk.
3) The end user must make a direct reference to the file server.
4) All data selection, search, and update functions take place at the
workstation.
- Client/server architecture is also used for MPSD scenario.

* 3) multiple-site processing, multiple-site data (MPMD) = table 12.3


- describes a fully distributed DBMS with support for multiple data
processors and transaction processors at multiple sites.
- DDBMSs are classified here by their support for different types of centralized DBMSs:
1) Homogeneous DDBMS
= integrates only one type of centralized DBMS over a network.
2) Heterogeneous DDBMS
= integrates different types of centralized DBMSs over a network.
- Some DDBMS implementations support several platforms, operating systems,
and networks and allow remote data access to another DBMS.

* 4) single-site processing, multiple-site data (SPMD)


- not applicable, because multiple processes are generally required for multiple-site data.

* Distributed Database Transparency Features


- the functional characteristics of a distributed database are referred to as transparency features
- their common property = allowing the end user to feel like the database’s only user
- the user believes that (s)he is working with a centralized DBMS; all complexities of a distributed database are hidden, or transparent, to the user.
- transparency features are:
1) Distribution transparency
2) Transaction transparency
3) Failure transparency
4) Performance transparency
5) Heterogeneity transparency

* Distribution transparency
- a distributed database is treated as a single logical database
- a physically dispersed database is managed as a centralized database.
- the user does not need to know:
1) that the data are partitioned
2) that the data can be replicated at several sites.
3) the data location.
- Three levels of distribution transparency are:
1) Fragmentation transparency = page: 493, case 1
= the highest level of transparency
= neither fragment names nor fragment locations need to be specified
2) Location transparency = page: 493, case 2
= the medium level of transparency
= the user must specify the database fragment names, but not the fragment locations
3) Local mapping transparency = page: 493, case 3
= the lowest level of transparency
= the user must specify both the fragment names and their locations using pseudo-SQL
- Distribution transparency is supported by:
1) a distributed data dictionary (DDD)
- contains the description of the entire database (known as the distributed global schema) as seen by the DBA
2) a distributed data catalog (DDC)
- is itself distributed and replicated at the network nodes, so it must maintain consistency at all sites.
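
The three levels of distribution transparency can be pictured with a small Python sketch. It is only an illustration: the table name EMPLOYEE, the fragments E1 to E3, the site names, and the resolve() helper are all made up here, and the dictionary stands in for a distributed data catalog.

```python
# Hypothetical catalog: one logical table split into fragments stored at sites.
# All table, fragment, and site names are invented for illustration.
CATALOG = {
    "EMPLOYEE": {"E1": "SITE_A", "E2": "SITE_B", "E3": "SITE_C"},
}


def resolve(table, fragments=None, locations=None):
    """Return the (fragment, site) pairs a query must touch.

    fragments=None, locations=None  -> fragmentation transparency
    fragments given, locations=None -> location transparency
    fragments and locations given   -> local mapping transparency
    """
    known = CATALOG[table]
    if fragments is None:
        # The DDBMS finds every fragment and its site on its own.
        return sorted(known.items())
    if locations is None:
        # The user names the fragments; the DDBMS looks up the sites.
        return [(frag, known[frag]) for frag in fragments]
    # The user names both; the catalog is only used to check the request.
    pairs = list(zip(fragments, locations))
    for frag, site in pairs:
        if known.get(frag) != site:
            raise ValueError(f"{frag} is not stored at {site}")
    return pairs
```

All three calls reach the same fragments; what changes is how much the user has to spell out, which is the difference between the highest and the lowest level of transparency.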

* Transaction transparency
- it ensures that database transactions will maintain the distributed
database’s integrity and consistency.
- it ensures that the transaction will be completed only when all database
sites involved in the transaction complete their part of the transaction.
- Distributed database systems require complex mechanisms to manage transactions. The following concepts help in managing them:
1) remote request = a single SQL statement that accesses data at a single remote DP site (figure 12.9)
2) remote transaction = composed of several requests, all accessing data at a single remote DP site (figure 12.10)
3) distributed transaction = refers to several different local or remote DP sites; each single request references only one site, but the transaction as a whole references several (figure 12.11)
4) distributed request = a single SQL statement references data located at several different local or remote DP sites (figure 12.12, 12.13)
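
The four concepts differ only in how many statements are involved and how many DP sites each statement touches. The sketch below is one way to express that rule in Python; the classify() function and the site labels are hypothetical, and each statement is reduced to the set of sites it references.

```python
def classify(statements, local_site):
    """Classify a unit of work from the DP sites its statements touch.

    statements: list of sets; each set holds the sites one SQL statement uses.
    local_site: the site where the request originates.
    """
    if not statements:
        raise ValueError("need at least one statement")
    # A single statement that spans several sites is a distributed request.
    if any(len(sites) > 1 for sites in statements):
        return "distributed request"
    all_sites = set().union(*statements)
    # Each statement uses one site, but the whole transaction uses several.
    if len(all_sites) > 1:
        return "distributed transaction"
    if all_sites == {local_site}:
        return "local"
    # Everything goes to one remote site: one statement or several.
    return "remote request" if len(statements) == 1 else "remote transaction"
```

For example, with "A" as the local site, classify([{"B"}, {"C"}], "A") is a distributed transaction, while classify([{"B", "C"}], "A") is a distributed request.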

- Distributed Concurrency Control


- multisite, multiple-process operations are more likely to create data
inconsistencies and deadlocked transactions than
single-site systems are.
- an inconsistent database inevitably leads to integrity problems
- Two-phase commit protocol
- it is a solution to the concurrency control problems of a distributed DBMS
- it guarantees that if a portion of a transaction operation cannot be committed, all changes made at the other sites participating in the transaction will be undone to maintain a consistent database state.
- it requires a DO-UNDO-REDO protocol and a write-ahead protocol.
- it defines the operations between two types of nodes: the coordinator and one or more subordinates, or cohorts.
- two phases:
1) the preparation phase
2) the final commit phase
- The objective of the two-phase commit =
1) to ensure that each node commits its part of the transaction;
otherwise, the transaction is aborted.
2) if one of the nodes fails to commit, the information necessary to
recover the database is in the transaction log, and the database
can be recovered with the DO-UNDO-REDO protocol.
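
A minimal Python sketch of the two phases, under strong simplifying assumptions: there is no network, no timeout handling, and no coordinator log, and the Participant class and its method names are invented for illustration. Each list named log stands in for a node's transaction log.

```python
class Participant:
    """A subordinate node holding its part of the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []            # stand-in for the node's transaction log
        self.state = "ACTIVE"

    def prepare(self):
        # Write-ahead: the log entry is recorded before the vote is returned.
        self.log.append("PREPARED" if self.can_commit else "NOT PREPARED")
        return self.can_commit

    def commit(self):
        self.log.append("COMMIT")
        self.state = "COMMITTED"

    def abort(self):
        # In a real node this is where the local changes would be undone.
        self.log.append("ABORT")
        self.state = "ABORTED"


def two_phase_commit(participants):
    """Coordinator logic: commit everywhere or abort everywhere."""
    # Phase 1 (preparation): ask every subordinate whether it can commit.
    votes = [p.prepare() for p in participants]
    # Phase 2 (final commit): act on the collected votes.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.abort()
    return "ABORTED"
```

If even one subordinate answers that it is not prepared, every node aborts, so no site is left holding a partial update.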

* Performance transparency
- The system will not suffer any performance degradation due to its use on a
network or due to the network’s platform differences.
- Performance transparency also ensures that the system will find the most
cost-effective path to access remote data.
- Compared to a centralized database (all data reside at a single site), query evaluation, access, and translation in a DDBMS are more complex because:
- it must decide which fragment of the database to access.
- it must decide which copy of the data to access when the data are replicated at several different sites.
- query optimization techniques = used by the DDBMS to deal with such problems and to ensure acceptable database performance.
- The objective of a query optimization routine = to minimize the total cost associated with the execution of a request.
- The total cost consists of:
- Access time (I/O) cost of physical data
- Communication cost of nodes
- CPU time cost of managing transactions
- steps to keep in mind when evaluating query optimization:
- the TP must receive data from the DP
- synchronize it
- assemble the answer and
- present it to the end user or an application.
- One of the most important characteristics of query optimization:
- it must provide distribution transparency as well as replica transparency.
- Replica transparency = Ability of DDBMS to hide the existence of multiple
copies of data from the user.
- two principles behind the algorithms of query optimization:
1) The selection of the optimum execution order.
2) The selection of sites to be accessed to minimize communication costs.
- Evaluation basis of a query optimization algorithm:
1) its operation mode
2) the timing of its optimization
- Two types of operation modes:
1) Automatic query optimization = the DDBMS finds the most cost-effective
access path without user intervention (more desirable)
2) Manual query optimization = the optimization is selected and scheduled by the end user / programmer.
- classification of query optimization algorithms by timing:
1) Static query optimization = the best optimization strategy is selected at compilation time. (common approach)
2) Dynamic query optimization = the database access strategy is defined at run time / execution time. (efficient)
- classification of query optimization algorithms by the type of information used:
1) statistically based query optimization algorithm
- dynamic statistical generation mode = DDBMS automatically evaluates and
updates the statistics after each access
- manual statistical generation mode = statistics must be updated
periodically through a user-selected utility
2) rule-based query optimization algorithm
- set of user-defined (end user/DBA) rules to determine the best query
access strategy.
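
The cost objective described above (access time + communication + CPU) can be sketched in Python as picking the cheapest of several candidate access plans. The AccessPlan class, the site names, and the cost figures used with it are assumptions for illustration; a real optimizer would derive the costs from statistics or rules.

```python
from dataclasses import dataclass


@dataclass
class AccessPlan:
    """One candidate way to answer a request (hypothetical structure)."""
    site: str
    io_cost: float     # access time (I/O) cost of the physical data
    comm_cost: float   # communication cost between nodes
    cpu_cost: float    # CPU time cost of managing the transaction

    @property
    def total_cost(self):
        return self.io_cost + self.comm_cost + self.cpu_cost


def cheapest_plan(plans):
    """Pick the plan with the lowest total cost."""
    return min(plans, key=lambda plan: plan.total_cost)
```

With a replicated table, a copy at a nearby site can win even if its I/O cost is higher, because the communication cost usually dominates. For instance, AccessPlan("SITE_A", 5.0, 20.0, 1.0) loses to AccessPlan("SITE_B", 8.0, 2.0, 1.0).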
* Failure transparency
- DDBMS ensures that the system will continue to operate in the event of a
node failure.

* Heterogeneity transparency
- allows the integration of several different local DBMSs (relational,
network, and hierarchical) under a common, or global, schema.
- The DDBMS is responsible for translating the data requests from the global
schema to the local DBMS schema.
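
As a rough picture of that translation step, the Python sketch below rewrites a column list expressed in global-schema names into the names used by each local DBMS. Everything in it is hypothetical: the site names, table names, column names, and the translate() helper.

```python
# Hypothetical mapping from global schema names to each local DBMS's names.
LOCAL_SCHEMAS = {
    "SITE_A": {"table": "EMP",
               "columns": {"emp_name": "ENAME", "emp_dob": "BIRTH_DT"}},
    "SITE_B": {"table": "staff",
               "columns": {"emp_name": "full_name", "emp_dob": "dob"}},
}


def translate(site, global_columns):
    """Rewrite a global-schema column list into the local site's names."""
    local = LOCAL_SCHEMAS[site]
    columns = [local["columns"][col] for col in global_columns]
    return f"SELECT {', '.join(columns)} FROM {local['table']}"
```

The same global request produces a different local statement at each site, while the user only ever sees the global names.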

**********
