Disconnected Operations in CODA File System
Presentation By : CG
www.powerpointpresentationon.blogspot.com
CODA File System
Developed for distributed machines.
Used in environments with disconnected operation.
Based on AFS (Andrew File System).
Developed in a UNIX environment.
Replicates servers for high availability.
Uses cache management for high availability.
Motivation
Mechanisms to Increase Availability
Server replication.
Managing disconnected operation using Venus.
Updating servers when disconnection ends.
Updating replicas after disconnection.
An Example
Design Rationale
Scalability
Whole-file caching.
Placing the burden on clients rather than servers.
Avoidance of system-wide changes.
Portable Workstations
They are mobile.
Users access only a selective set of files.
Users can predict their disconnections in most cases.
Design Rationale (contd)
First Vs. Second Class Replication
First-class replicas reside on servers and are reliable, clean, available, and complete.
Second-class replicas reside on clients and are inferior along all of these dimensions.
A cache coherence protocol is used to synchronize both types of replica. A degraded second-class replica is brought up to date after reconnection.
CODA also supports the use of a sole server replica when performance or cost concerns rule out further server replication.
Design Rationale (contd)
Pessimistic Vs. Optimistic Replica Control
The pessimistic approach requires a client to acquire exclusive control of an object before caching it and disconnecting, and to retain that control until reconnection.
The pessimistic approach is acceptable when disconnections are short, but it degrades performance during long disconnections.
The optimistic approach lets each replica be updated independently and propagates the updates once the replicas can communicate again.
The optimistic approach requires sophisticated hardware and software to manage replicas.
In the optimistic approach, each client has its own accessible universe, to which it sends updates and from which it receives them.
CODA Design & Implementation
[Figure: Structure of the CODA Client. Labeled components: Application, System Call Interface, Vnode Interface, CODA MiniCache, VENUS, with a link from Venus to the CODA servers.]
Client structure
Venus is a user-level process.
The MiniCache contains no support for remote access, disconnected operation, or server replication.
These functions are handled by Venus.
CODA Design & Implementation (contd)
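[Figure: Venus States & Transitions. States: Hoarding, Emulation, Reintegration. Transitions: Disconnection (Hoarding to Emulation), Physical Reconnection (Emulation to Reintegration), Logical Reconnection (Reintegration to Hoarding).]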
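The states and transitions named in the figure can be sketched as a small state machine. This is a minimal illustration, not Venus code: the event strings and the `next_state` and `run` helpers are hypothetical names, and the sketch assumes only the three transitions labeled in the figure.

```python
from enum import Enum


class VenusState(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"


# Assumed transition table: event names are illustrative labels
# taken from the figure, not identifiers from the CODA sources.
TRANSITIONS = {
    (VenusState.HOARDING, "disconnection"): VenusState.EMULATION,
    (VenusState.EMULATION, "physical_reconnection"): VenusState.REINTEGRATION,
    (VenusState.REINTEGRATION, "logical_reconnection"): VenusState.HOARDING,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}") from None


def run(events, start=VenusState.HOARDING):
    """Apply a sequence of events and return the final state."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

A full cycle of disconnection, physical reconnection, and logical reconnection brings the sketch back to the hoarding state.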
CODA Design & Implementation (contd)
Hoarding
Hoarding is the process of caching and managing data in anticipation of disconnection.
Many factors, such as cache miss ratio, disconnection frequency, cache space, and freshness of cached objects, affect how well hoarding performs.
Prioritized Cache Management
Venus combines implicit and explicit sources of information.
The hoard database (HDB) contains entries supplied by hoard profiles.
Hoard profiles support meta-expansion, for example: a /usr/bin 100:d+
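A rough sketch of how explicit and implicit information might be combined into one cache priority follows. Everything here is an assumption made for illustration: the entry syntax `a <path> <priority>:<meta>`, the weighting formula, the constants `MAX_HOARD_PRIORITY`, `alpha`, and `horizon`, and all function names. It is not the function Venus actually uses.

```python
from dataclasses import dataclass

MAX_HOARD_PRIORITY = 1000  # assumed upper bound, for normalization only


@dataclass
class CacheEntry:
    path: str
    hoard_priority: int  # explicit: taken from the HDB, 0 if not hoarded
    last_ref_tick: int   # implicit: logical time of the last reference


def parse_profile_line(line):
    """Parse an assumed 'a <path> [<priority>:]<meta>' hoard profile line."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "a":
        raise ValueError(f"unsupported profile line: {line!r}")
    path, priority, meta = parts[1], 0, ""
    if len(parts) > 2:
        spec = parts[2]
        if ":" in spec:
            prio_text, meta = spec.split(":", 1)
            priority = int(prio_text)
        elif spec.isdigit():
            priority = int(spec)
        else:
            meta = spec
    return path, priority, meta


def current_priority(entry, now, alpha=0.75, horizon=1000):
    """Blend explicit hoard priority with recency of reference (both in 0..1)."""
    explicit = entry.hoard_priority / MAX_HOARD_PRIORITY
    age = now - entry.last_ref_tick
    recency = max(0, horizon - age) / horizon
    return alpha * explicit + (1 - alpha) * recency


def choose_victim(entries, now):
    """Pick the entry with the lowest blended priority for replacement."""
    return min(entries, key=lambda e: current_priority(e, now))
```

With these weights, an object hoarded at a high priority outlives a recently referenced but unhoarded object, which is the intended effect of mixing the two sources.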
Hoard walking
Cache equilibrium.
Venus periodically restores equilibrium by performing an operation known as a hoard walk.
A hoard walk has two phases:
First, the name bindings of the HDB entries are reevaluated to reflect update activity by other CODA clients.
Second, the priorities of all entries in the cache and HDB are reevaluated.
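The two phases of a hoard walk can be sketched as follows. This is a simplified model under stated assumptions: the namespace is a plain dictionary from directory to children, cache capacity is counted in objects rather than bytes, the meta suffixes `c` and `d` are taken to mean children and descendants, and `expand_entry` and `hoard_walk` are hypothetical names.

```python
def expand_entry(namespace, path, meta):
    """Phase 1 helper: names currently bound under one HDB entry."""
    names = [path]
    if meta.startswith("c"):      # immediate children only
        names += namespace.get(path, [])
    elif meta.startswith("d"):    # all descendants
        stack = list(namespace.get(path, []))
        while stack:
            child = stack.pop()
            names.append(child)
            stack.extend(namespace.get(child, []))
    return names


def hoard_walk(hdb, namespace, cached, capacity):
    """Return (to_fetch, to_evict) after re-evaluating bindings and priorities.

    hdb:       list of (path, priority, meta) entries
    namespace: dict mapping a directory path to its child paths
    cached:    dict mapping cached path to its current priority
    capacity:  maximum number of cached objects (assumed unit)
    """
    # Phase 1: re-evaluate name bindings, picking up objects that
    # other clients created since the last walk.
    wanted = {}
    for path, priority, meta in hdb:
        for name in expand_entry(namespace, path, meta):
            wanted[name] = max(priority, wanted.get(name, 0))

    # Phase 2: re-evaluate priorities of everything in the cache and HDB,
    # then keep the highest-priority objects that fit.
    candidates = dict(cached)
    for name, priority in wanted.items():
        candidates[name] = max(priority, candidates.get(name, 0))
    ranked = sorted(candidates, key=lambda n: (-candidates[n], n))
    keep = set(ranked[:capacity])
    to_fetch = sorted(keep - set(cached))
    to_evict = sorted(set(cached) - keep)
    return to_fetch, to_evict
```

In this model, a file newly added under a hoarded directory is fetched on the next walk, and a low-priority cached object is evicted to make room for it.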
CODA Design & Implementation (contd)
Emulation
Logging
During emulation, Venus logs enough information about update activity for reintegration to be managed easily.
The generated log is called the replay log.
Venus uses a number of optimizations to reduce the length of the replay log.
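The slide does not list the log optimizations, so the sketch below assumes two plausible ones for illustration: a later store to a file supersedes earlier stores to the same file, and removing an object that was created during the same disconnection cancels all of its records. The `ReplayLog` class and its record layout are hypothetical.

```python
class ReplayLog:
    """Toy replay log holding (op, fid, data) records in order."""

    def __init__(self):
        self.records = []

    def _drop(self, op, fid):
        """Remove earlier records of kind `op` for object `fid`."""
        self.records = [r for r in self.records
                        if not (r[0] == op and r[1] == fid)]

    def append(self, op, fid, data=None):
        if op == "store":
            # Assumed optimization 1: only the last store of a file matters.
            self._drop("store", fid)
        elif op == "remove":
            created_here = any(r[0] == "create" and r[1] == fid
                               for r in self.records)
            # Stores to an object that is being removed are no longer needed.
            self._drop("store", fid)
            if created_here:
                # Assumed optimization 2: create followed by remove cancels out,
                # so neither record needs to reach the server.
                self._drop("create", fid)
                return
        self.records.append((op, fid, data))
```

Keeping the log short matters twice: it limits the disk space consumed during emulation and it shortens the replay that follows reconnection.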
Persistence
Venus uses RVM (Recoverable Virtual Memory) to store metadata about cached objects so that it can be recovered after a crash.
This lets a user resume work after a shutdown or crash from where they left off.
Emulation can also exhaust resources, because the replay log and the file cache, filled with modified objects, grow too large to manage.
CODA Design & Implementation (contd)
Reintegration
Reintegration is a transitory state through which Venus passes in changing roles from pseudo-server to cache manager.
During reintegration, Venus propagates the changes recorded in the replay log and updates its cache to reflect current server state.
It has two major parts: 1) Replay Algorithm 2) Conflict Handling
CODA Design & Implementation (contd)
Replay Algorithm
First, Venus obtains permanent fids from the server for objects created while disconnected and replaces the temporary fids in the replay log.
Next, the replay log is parsed so that the updates to cached objects can be processed.
After a successful replay, Venus frees the replay log; if reintegration fails, Venus writes the replay log to a local replay file and then refetches the objects of the affected cache entries.
Replay can be performed at different granularities according to user needs; for example, an object can be reintegrated together with its related objects.
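The client side of these steps can be sketched as below. The `tmp-` prefix for temporary fids, the record layout, and the three callbacks (`allocate_fid`, `replay_on_server`, `save_replay_file`) are hypothetical stand-ins for the real server interactions, which the slides do not describe.

```python
def translate_fids(log, allocate_fid):
    """Replace temporary fids in the log with permanent ones.

    log:          list of (op, fid, data) records
    allocate_fid: callable returning a fresh permanent fid (stands in
                  for the request Venus would send to a server)
    """
    mapping = {}
    translated = []
    for op, fid, data in log:
        if fid.startswith("tmp-"):  # assumed marker for temporary fids
            if fid not in mapping:
                mapping[fid] = allocate_fid()
            fid = mapping[fid]
        translated.append((op, fid, data))
    return translated, mapping


def reintegrate(log, allocate_fid, replay_on_server, save_replay_file):
    """Run one reintegration attempt; return True if the replay succeeded.

    replay_on_server: callable taking the translated log, True on success
    save_replay_file: callable that preserves the log locally on failure
    """
    translated, _ = translate_fids(log, allocate_fid)
    if replay_on_server(translated):
        return True  # the log can now be freed
    # Failure: keep the user's updates in a local replay file so they
    # are not lost while the cached objects are refetched.
    save_replay_file(translated)
    return False
```

Saving the log on failure reflects the point made above: a failed reintegration should not silently discard work done while disconnected.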
CODA Design & Implementation (contd)
Conflict Handling
Disconnected operations of one client may conflict with activity at servers or at other disconnected clients.
Conflicts are detected using a storeid kept for each cached object.
During reintegration, the server compares the storeid of every object mentioned in a log entry with the storeid of its own replica; a mismatch indicates a conflict.
Different algorithms are used to manage conflicts on directories and on files.
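The storeid comparison can be sketched as a simple check over the log. The record layout and the `find_conflicts` name are hypothetical, and the sketch assumes that an object with no replica at the server (one created while disconnected) cannot conflict on storeid. It covers detection only, not the separate file and directory handling mentioned above.

```python
def find_conflicts(log, server_storeids):
    """Return the (op, fid) pairs whose storeid no longer matches the server.

    log:             list of (op, fid, storeid_seen_by_client) records
    server_storeids: dict mapping fid to the storeid of the server's replica
    """
    conflicts = []
    for op, fid, client_storeid in log:
        if fid not in server_storeids:
            # Assumption: nothing to compare against for new objects.
            continue
        if server_storeids[fid] != client_storeid:
            # Someone else updated the object during the disconnection.
            conflicts.append((op, fid))
    return conflicts
```

An empty result means every logged operation was based on the version the server still holds, so the replay can proceed.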
Resource exhaustion
Three alternatives to free up disk space during emulation:
Compress the file cache and RVM contents.
Allow the user to selectively back out updates made while disconnected.
Allow portions of the file cache and RVM contents to be written out to removable disk.
Status and evaluation
How long does reintegration take?
Time to allocate permanent fids.
Time for replay at the servers.
Time for the second phase of the update protocol used for server replication.
How large a local disk does one need?
How likely are conflicts?