BASICS OF DISTRIBUTED SYSTEMS
SERVICE MODELS (APPLICATION ARCHITECTURES)
CENTRALIZED MODEL
No networking
Traditional time-sharing system
Single workstation/PC or direct connection of multiple terminals to a
computer
One or several CPUs
Not easily scalable
Limiting factor: number of CPUs in system
Contention for same resources (memory, network, devices)
CLIENT-SERVER MODEL
Clients send requests to servers
A server is a system that runs a service
Clients do not communicate with other clients
LAYERED ARCHITECTURES IN SOFTWARE DESIGN
Break functionality into multiple layers
Each layer handles a specific abstraction
Hides implementation details and specifics of hardware, OS, network
abstractions, data encoding, …
TIERED ARCHITECTURES IN NETWORKED SYSTEMS
Tiered (multi-tier) architectures
Distributed systems analogy to a layered architecture
Each tier (layer)
Runs as a network service
Is accessed by surrounding layers
The basic client-server architecture is a two-tier model
MULTI-TIER EXAMPLE
PEER-TO-PEER (P2P) MODEL
No reliance on servers
Machines (peers) communicate with each other
Goals
Robustness
Self-scalability
Examples
BitTorrent, Skype
HYBRID MODEL
Many peer-to-peer architectures still rely on a server
Look up, track users
Track content
Coordinate access
But traffic-intensive workloads are delegated to peers
PROCESSOR POOL MODEL
Collection of CPUs that can be assigned processes on demand
Similar to hybrid model
Coordinator dispatches work requests to available processors
Render farms, big data processing, machine learning
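A coordinator dispatching work to available processors can be sketched with Python's standard `concurrent.futures`. This is only an illustration: a thread pool stands in for the pool of CPUs, and `render_frame` and `coordinate` are hypothetical names, not part of any real render farm.

```python
from concurrent.futures import ThreadPoolExecutor

def render_frame(frame_number):
    # Hypothetical unit of work; stands in for rendering one frame.
    return frame_number * frame_number

def coordinate(frames, workers=4):
    # The executor plays the coordinator: each request goes to
    # whichever worker is free, and results come back in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_frame, frames))

results = coordinate(range(5))
```

In a real processor pool the workers would be separate machines reached over the network rather than threads in one process.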
CLOUD COMPUTING
Resources are provided as a network (Internet) service
IP VS. OSI
TCP and UDP are implemented at the transport layer
Sockets are used to implement the connection
SOCKET OPERATIONS
Read/write-based communication
SOCKET IMPLEMENTATION IN DIFFERENT LANGUAGES
Python
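A minimal sketch of socket use in Python, using only the standard `socket` and `threading` modules. It runs a one-shot echo server on the loopback interface; the helper names `serve_once` and `echo_roundtrip` are assumptions made for this example.

```python
import socket
import threading

def serve_once(listener):
    # Accept one connection and echo back whatever the client sends.
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

def echo_roundtrip(message):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS to pick any free port for this demo.
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    # Client side: connect, write the request, read the reply.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join()
    listener.close()
    return reply

reply = echo_roundtrip(b"hello")
```

Note that the application sees only reads and writes of bytes; any structure in the data is up to the program.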
SOCKET BASED COMMUNICATION
Socket API: all we get from the OS to access the network
Socket = distinct end-to-end communication channels
Read/write model
To make distributed computing look more like centralized computing, I/O
(read/write) is not the way to go
Line-oriented, text-based protocols common
Not efficient but easy to debug & use
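A line-oriented, text-based exchange might look like the sketch below. The `ADD` command and the `OK`/`ERR` replies are a made-up protocol for illustration, and `socket.socketpair()` stands in for a real client-server connection.

```python
import socket

def handle_line(line):
    # Hypothetical protocol: "ADD <a> <b>" is answered with "OK <sum>".
    parts = line.split()
    if len(parts) == 3 and parts[0] == "ADD":
        try:
            return "OK %d" % (int(parts[1]) + int(parts[2]))
        except ValueError:
            pass
    return "ERR bad request"

def exchange(request):
    client, server = socket.socketpair()
    with client, server:
        # Client writes one CRLF-terminated text line.
        client.sendall((request + "\r\n").encode("ascii"))
        # Server reads a line through a file-like view of the socket.
        with server.makefile("rwb") as stream:
            line = stream.readline().decode("ascii").strip()
            stream.write((handle_line(line) + "\r\n").encode("ascii"))
            stream.flush()
        return client.recv(1024).decode("ascii").strip()

answer = exchange("ADD 2 3")
```

Because every message is readable text, the exchange can be inspected or typed by hand, which is the debugging advantage; the cost is parsing and larger messages.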
RPC – REMOTE PROCEDURE CALL
1984: Birrell & Nelson
Mechanism to call procedures on other machines
Implementing RPC
No architectural support for remote procedure calls
Simulate it with tools we have (local procedure calls)
Simulation makes RPC a language-level construct instead of an operating system construct
The compiler creates code to send messages to invoke remote functions
The OS gives us sockets
IMPLEMENTING RPC
Create stub functions to make it appear to the user that the call is
local
On the client
The stub function (proxy) has the function’s interface
Packages parameters and calls the server
On the server
The stub function (skeleton) receives the request and calls the local function
STUB FUNCTIONS
1. Client calls stub (params on stack)
2. Stub marshals params to network message
3. Network message sent to server
4. Receive message: send it to server stub
5. Unmarshal parameters, call server function
6. Return from server function
7. Marshal return value and send message
8. Transfer message over network
9. Receive message: client stub is receiver
10. Unmarshal return value(s), return to client code
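The ten steps can be traced in a small sketch. Everything here is an assumption for illustration: JSON is used as the marshaling format, the `transport` function replaces the network with a direct call, and `add` is a hypothetical remote procedure.

```python
import json

def add(a, b):
    # The actual procedure, living on the "server".
    return a + b

def server_stub(request_bytes):
    # Steps 4-7: unmarshal parameters, call the function, marshal the result.
    request = json.loads(request_bytes.decode("utf-8"))
    result = add(*request["params"])
    return json.dumps({"result": result}).encode("utf-8")

def transport(request_bytes):
    # Steps 3 and 8: stand-in for the network; a real system would use sockets.
    return server_stub(request_bytes)

def add_client_stub(a, b):
    # Steps 1-2: same signature as add(); marshal parameters into a message.
    request = json.dumps({"proc": "add", "params": [a, b]}).encode("utf-8")
    reply = transport(request)
    # Steps 9-10: unmarshal the return value and hand it to the caller.
    return json.loads(reply.decode("utf-8"))["result"]

total = add_client_stub(2, 3)
```

The caller writes `add_client_stub(2, 3)` exactly as it would write a local call; the message traffic is hidden inside the stubs.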
A SERVER STUB CONTAINS TWO PARTS
1. Dispatcher – the listener
Receives client requests
Identifies appropriate function (method)
2. Skeleton – the unmarshaller & caller
Unmarshals parameters
Calls the local server procedure
Marshals the response & sends it back to the dispatcher
All this is invisible to the programmer
The programmer doesn’t deal with any of this
Dispatcher + Skeleton may be integrated
Depends on implementation
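One possible split between dispatcher and skeleton, as a sketch. The message format (procedure name, a space, then a JSON parameter list) and the names `SKELETONS` and `make_skeleton` are assumptions, not a real RPC framework.

```python
import json

def add(a, b):
    return a + b

def upper(text):
    return text.upper()

def make_skeleton(procedure):
    def skeleton(marshaled_params):
        params = json.loads(marshaled_params)   # unmarshal parameters
        result = procedure(*params)             # call the local procedure
        return json.dumps({"result": result})   # marshal the response
    return skeleton

# One skeleton per remotely callable procedure.
SKELETONS = {"add": make_skeleton(add), "upper": make_skeleton(upper)}

def dispatcher(message):
    # Assumed message format: "<procedure name> <JSON parameter list>".
    name, _, marshaled_params = message.partition(" ")
    skeleton = SKELETONS.get(name)
    if skeleton is None:
        return json.dumps({"error": "unknown procedure " + name})
    # The skeleton's marshaled response goes back through the dispatcher.
    return skeleton(marshaled_params)

response = dispatcher("add [2, 3]")
```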
RPC BENEFITS
RPC gives us a procedure call interface
Writing applications is simplified
RPC hides all network code in stub functions
Application programmers don’t have to worry about details
Sockets, port numbers, byte ordering
Where is RPC in the OSI model?
Layer 5: Session layer: Connection management
Layer 6: Presentation: Marshaling/data representation
Uses the transport layer (4) for communication (TCP/UDP)
RPC ISSUES
Parameter passing
Pass by value or pass by reference?
Pointerless representation
Service binding. How do we locate the server endpoint?
Central DB
DB of services per host
When things go wrong
Opportunities for failure
Performance
RPC is slower … a lot slower (why?)
Security
Messages may be visible over the network – do we need to hide them?
Authenticate client? Authenticate server?
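The parameter-passing issue can be seen in a short sketch. Here `call_remote` is a hypothetical helper that imitates an RPC by marshaling parameters to JSON and back, so the procedure only ever sees a copy.

```python
import json

def append_item(items):
    # "Server-side" procedure that modifies its argument.
    items.append("added on server")
    return items

def call_remote(procedure, *params):
    # Marshaling turns the parameters into a pointerless message,
    # so the procedure works on a copy, as it would on another machine.
    request = json.dumps(params)
    result = procedure(*json.loads(request))
    return json.loads(json.dumps(result))

local_list = ["a"]
returned = call_remote(append_item, local_list)
```

After the call, `local_list` is unchanged and only `returned` carries the modification. Pass by reference has to be simulated, for example by copying the data over and copying the modified data back.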
PROGRAMMING WITH RPC
Language support
Many programming languages have no language-level concept of remote procedure calls
(C, C++, Java <J2SE 5.0, …)
These compilers will not automatically generate client and server stubs
Some languages have support that enables RPC
(Java, Python, Haskell, Go, Erlang)
But we may need to deal with heterogeneous environments (e.g., Java communicating with a Python service)
Common solution
Interface Definition Language (IDL): describes remote procedures
Separate compiler that generates stubs (pre-compiler)
INTERFACE DEFINITION LANGUAGE (IDL)
Allow programmer to specify remote procedure interfaces (names,
parameters, return values)
Pre-compiler can use this to generate client and server stubs
Marshaling code
Unmarshaling code
Network transport routines
Conform to defined interface
An IDL looks similar to function prototypes
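To illustrate what a stub generator does, the sketch below builds client stubs from an interface description. The dictionary `INTERFACE` is a hypothetical stand-in for a real IDL file, and `fake_server` replaces both the network and the server stubs.

```python
import json

# Hypothetical interface description: procedure name -> parameter names.
INTERFACE = {"add": ["a", "b"], "greet": ["name"]}

def generate_client(interface, send):
    # Build one stub per declared procedure. Each stub checks the
    # argument count, marshals the call, and unmarshals the reply.
    def make_stub(name, param_names):
        def stub(*args):
            if len(args) != len(param_names):
                raise TypeError("%s expects %d arguments" % (name, len(param_names)))
            request = json.dumps({"proc": name, "params": dict(zip(param_names, args))})
            return json.loads(send(request))["result"]
        return stub
    return {name: make_stub(name, params) for name, params in interface.items()}

def fake_server(request):
    # Stand-in for the network and the server side.
    call = json.loads(request)
    implementations = {
        "add": lambda a, b: a + b,
        "greet": lambda name: "hello " + name,
    }
    return json.dumps({"result": implementations[call["proc"]](**call["params"])})

client = generate_client(INTERFACE, fake_server)
total = client["add"](2, 3)
```

A real pre-compiler emits source code for the stubs ahead of time; generating them at run time here just keeps the example short.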
RPC COMPILER
SENDING DATA OVER THE NETWORK
No such thing as incompatibility problems on local system
Remote machine may have:
Different byte ordering
Different sizes of integers and other types
Different floating point representations
Different character sets
Alignment requirements
REPRESENTING DATA
Big endian: Most significant byte in low memory
SPARC < V9, Motorola 680x0, older PowerPC
Little endian: Most significant byte in high memory
Intel/AMD IA-32, x64
Bi-endian: Processor may operate in either mode
ARM, PowerPC, MIPS, SPARC V9, IA-64 (Intel Itanium)
IP headers forced everyone to use big endian byte ordering for 16- and 32-bit values
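Byte ordering is easy to see with Python's standard `struct` module. The value `0x01020304` is an arbitrary example.

```python
import struct

value = 0x01020304

big = struct.pack(">I", value)      # most significant byte first
little = struct.pack("<I", value)   # least significant byte first
network = struct.pack("!I", value)  # "!" selects network (big-endian) order

# Reading big-endian bytes as little-endian yields a different number.
misread = struct.unpack("<I", big)[0]
```

Here `big` and `network` are the bytes `01 02 03 04`, `little` is `04 03 02 01`, and `misread` is `0x04030201`, which is why both sides must agree on an ordering.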
REPRESENTING DATA: SERIALIZATION
Need standard encoding to enable communication between heterogeneous
systems
Serialization
Convert data into a pointerless format: an array of bytes
Examples
XDR (eXternal Data Representation), used by ONC RPC
JSON (JavaScript Object Notation)
W3C XML Schema Language
ASN.1 (ISO Abstract Syntax Notation)
Google Protocol Buffers
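Two ways to turn the same record into a pointerless array of bytes, as a sketch. The binary layout is in the style of XDR (4-byte big-endian integers, length-prefixed strings padded to a multiple of 4 bytes) but is not a full XDR implementation; the record fields and function names are assumptions.

```python
import json
import struct

def pack_record(part_id, name):
    # 4-byte big-endian integer, then a length-prefixed string
    # zero-padded to a multiple of 4 bytes (XDR-style layout).
    data = name.encode("utf-8")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">iI", part_id, len(data)) + data + b"\x00" * padding

def unpack_record(buf):
    # Read the two header integers, then slice out the string bytes.
    part_id, length = struct.unpack_from(">iI", buf, 0)
    name = buf[8:8 + length].decode("utf-8")
    return part_id, name

binary = pack_record(345, "Widget-A")
text = json.dumps({"id": 345, "name": "Widget-A"})
```

The binary form is compact but needs both sides to know the layout in advance; the JSON form names its fields and is readable, at the cost of size and parsing.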
THE NEXT GENERATION OF RPCS
Distributed objects:
support for object-oriented languages
DOA: Distributed Object Architecture
JAVA RMI
Java language had no mechanism for invoking remote methods
1995: Sun added extension
Remote Method Invocation (RMI)
Allow programmer to create distributed applications where methods of remote
objects can be invoked from other JVMs
RMI COMPONENTS
Client
Invokes method on remote object
Server
Process that owns the remote object
Object registry
Name server that relates objects with names
INTEROPERABILITY
RMI is built for Java only!
No goal of OS interoperability (as CORBA)
No language interoperability (goals of SUN, DCE, and CORBA)
No architecture interoperability
No need for external data representation
All sides run a JVM
Benefit: simple and clean design
RMI SIMILARITIES
Similar to local objects
References to remote objects can be passed as parameters (not as pointers, of
course)
You can execute methods on a remote object
Objects can be passed as parameters to remote methods
Object can be cast to any of the set of interfaces supported by the
implementation
Operations can be invoked on these objects
RMI DIFFERENCES
Objects (parameters or return data) passed by value
Changes will be visible only locally
Remote objects are passed by reference
Not by copying remote implementation
The “reference” is not a pointer. It’s a data structure:
{ IP address, port, time, object #, interface of remote object }
RMI generates extra exceptions
CLASSES TO SUPPORT RMI
remote class: needed for remote objects
One whose instances can be used remotely
Within its address space: regular object
Other address spaces: remote methods can be referenced via an object handle
serializable class: needed for parameters
Object that can be marshaled
Support serialization of parameters or return values
If a parameter is a remote object, only the object handle is copied
STUB & SKELETON GENERATION
Automatic stub generation since Java 1.5
Need stubs and skeletons for the remote interfaces
Automatically built from java files
Pre 1.5 (still supported) generated by separate compiler: rmic
Auto-generated code:
Skeleton
Server-side code that calls the actual remote object implementation
Stub
Client-side proxy for the remote object
Communicates method invocations on remote objects to the server
NAMING SERVICE
We need to look an object up by name
Get back a remote object reference to perform remote object
invocations
Object registry does this: rmiregistry running on the server
SERVER AND CLIENT
Register object(s) with Object Registry
Client contacts rmiregistry to look up name
rmiregistry service returns a remote object reference.
lookup method gives reference to local stub.
The stub now knows where to send requests
Invoke remote method(s):
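The register, look up, and invoke sequence can be imitated in a few lines. This is a Python analogy only, not the Java RMI API: `Registry`, `Server`, `Stub`, and `Counter` are hypothetical classes, and everything runs in one process with direct calls in place of the network.

```python
class Registry:
    # Name server: maps names to remote object references.
    def __init__(self):
        self._names = {}

    def bind(self, name, reference):
        self._names[name] = reference

    def lookup(self, name):
        return self._names[name]

class Counter:
    # The object owned by the server process.
    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount
        return self.value

class Server:
    def __init__(self, registry):
        self.objects = {}
        self.registry = registry

    def export(self, name, obj):
        object_id = len(self.objects)
        self.objects[object_id] = obj
        # The reference is data, not a pointer: it says where the object lives.
        self.registry.bind(name, {"server": self, "object_id": object_id})

    def invoke(self, object_id, method, args):
        return getattr(self.objects[object_id], method)(*args)

class Stub:
    # Client-side proxy: forwards every method call to the owning server.
    def __init__(self, reference):
        self._ref = reference

    def __getattr__(self, method):
        def remote_call(*args):
            return self._ref["server"].invoke(self._ref["object_id"], method, args)
        return remote_call

registry = Registry()
server = Server(registry)
server.export("counter", Counter())      # server registers the object
stub = Stub(registry.lookup("counter"))  # client looks up the name
first = stub.increment(2)                # client invokes remote methods
second = stub.increment(3)
```

The state lives in the server's `Counter`; the client holds only the stub and the reference inside it.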
JAVA RMI INFRASTRUCTURE
RPC HAD PROBLEMS
Distributed objects mostly ended up in intranets of homogeneous systems and low latency networks
Interoperability – different languages, OSes, hardware
Transparency – not really there
Memory access, partial failure
Firewalls – dynamic ports
State – load balancing, resources
No group communication – no replication
No asynchronous messaging
Large streaming responses not possible
Notifications of delays not possible
No subscribe-publish models
WEB BROWSING AND WEB SERVICES
Web browser:
Dominant model for user interaction on the Internet
Not good for programmatic access to data or manipulating data
UI is a major component of the content
Site scraping is a pain!
Web services
Remotely hosted services – that programs can use
Machine-to-machine communication
WEB SERVICES
Set of protocols by which services can be published, discovered, and
used in a technology neutral form
Language & architecture independent
Applications will typically invoke multiple remote services
Service Oriented Architecture (SOA)
App is integration of network-accessible services (components)
Each service has a well-defined interface
Components are unassociated & loosely coupled
BENEFITS OF SOA
Autonomous modules
Each module does one thing well
Supports reuse of modules across applications
Loose coupling
Requires minimal knowledge – don’t need to know implementation
Migration: Services can be located and relocated on any servers
Scalability: new services can be added/removed on demand … and on different
servers – or load balanced
Updates: Individual services can be replaced without interruption
GENERAL PRINCIPLES OF WEB SERVICES
Coarse-grained
Usually few operations & large messages
Platform neutral
Messages don’t rely on the underlying language, OS, or hardware
Standardized protocols & data formats
Payloads are text (XML or JSON)
Message-oriented
Communicate by exchanging messages
HTTP often used for transport
Use existing infrastructure: web servers, authentication, encryption, firewalls, load-balancers
WEB SERVICES VS. DISTRIBUTED OBJECTS
Web services
Document oriented
- Exchange documents
Document design is the key
- Interfaces are just a way to pass documents
Stateless computing
- State is contained within the documents that are exchanged (e.g., customer ID)
Distributed objects
Object oriented
- Instantiate remote objects
- Request operations on a remote object
- Receive results
- …
- Eventually release the object
Interface design is the key
- Data structures just package data
Stateful computing
- Remote object maintains state
WEB SERVICE IMPLEMENTATION EXAMPLES
XML RPC
SOAP: (Simple) (Object) Access Protocol
JAX-WS: Java Web Services
AJAX & XMLHTTP
REST: REpresentational State Transfer
REST
Stay with the principles of the web
Four HTTP commands let you operate on data (a resource):
PUT (create)
GET (read)
POST (update)
DELETE (delete)
CRUD: Create, Read, Update, Delete
And a fifth one:
OPTIONS (query) - determine options associated with a resource
Rarely used … but it’s there
Messages contain representation of data
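A minimal sketch of a resource-oriented service using Python's standard `http.server` and `http.client`. It follows the mapping above (PUT creates, GET reads, DELETE deletes); the in-memory `STORE`, the JSON representation, and the `/parts/00345` path are assumptions for the example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}  # resource path -> representation (JSON-compatible data)

class ResourceHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body):
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_PUT(self):
        # Store the representation sent in the request body.
        length = int(self.headers.get("Content-Length", 0))
        STORE[self.path] = json.loads(self.rfile.read(length))
        self._reply(201, STORE[self.path])

    def do_GET(self):
        if self.path in STORE:
            self._reply(200, STORE[self.path])
        else:
            self._reply(404, {"error": "not found"})

    def do_DELETE(self):
        existed = STORE.pop(self.path, None) is not None
        self._reply(200 if existed else 404, {"deleted": existed})

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

def request(port, method, path, body=None):
    # Send one HTTP request and return (status code, decoded JSON body).
    conn = http.client.HTTPConnection("127.0.0.1", port)
    data = None if body is None else json.dumps(body).encode("utf-8")
    conn.request(method, path, body=data)
    response = conn.getresponse()
    result = (response.status, json.loads(response.read()))
    conn.close()
    return result

server = HTTPServer(("127.0.0.1", 0), ResourceHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_port

created = request(port, "PUT", "/parts/00345", {"name": "Widget-A"})
fetched = request(port, "GET", "/parts/00345")
removed = request(port, "DELETE", "/parts/00345")
missing = request(port, "GET", "/parts/00345")
server.shutdown()
server.server_close()
```

The URL names the resource and the HTTP command names the operation, so no procedure names appear in the interface.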
RESOURCE-ORIENTED SERVICES
Blog example
Get a snapshot of a user’s blogroll:
HTTP GET http://example.com/listsubs
HTTP authentication handles user identification
To get info about a specific subscription:
HTTP GET http://example.com/getitems?s={subid}
RESOURCE-ORIENTED SERVICES
Get parts info
HTTP GET http://www.example.com/parts
Returns a document containing a list of parts
<?xml version="1.0"?>
<p:Parts xmlns:p="http://www.example.com" xmlns:xlink="http://www.w3.org/1999/xlink">
<Part id="00345" xlink:href="http://www.example.com/parts/00345"/>
<Part id="00346" xlink:href="http://www.example.com/parts/00346"/>
<Part id="00347" xlink:href="http://www.example.com/parts/00347"/>
<Part id="00348" xlink:href="http://www.example.com/parts/00348"/>
</p:Parts>
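A client could pull the part links out of such a document with Python's standard `xml.etree.ElementTree`. The sketch uses a shortened copy of the document; `part_links` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

PARTS_XML = """<?xml version="1.0"?>
<p:Parts xmlns:p="http://www.example.com" xmlns:xlink="http://www.w3.org/1999/xlink">
<Part id="00345" xlink:href="http://www.example.com/parts/00345"/>
<Part id="00346" xlink:href="http://www.example.com/parts/00346"/>
</p:Parts>"""

# ElementTree spells a namespaced attribute as "{namespace-uri}name".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

def part_links(document):
    # Map each part id to the URL of its detailed representation.
    root = ET.fromstring(document)
    return {part.get("id"): part.get(XLINK_HREF) for part in root.findall("Part")}

links = part_links(PARTS_XML)
```

Each link can then be fetched with another HTTP GET to obtain the detailed representation of that part.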
RESOURCE-ORIENTED SERVICES
Get detailed parts info:
HTTP GET http://www.example.com/parts/00345
Returns a document with information about a specific part
<?xml version="1.0"?>
<p:Part xmlns:p="http://www.example.com" xmlns:xlink="http://www.w3.org/1999/xlink">
<Part-ID>00345</Part-ID>
<Name>Widget-A</Name>
<Description>This part is used within the frap assembly</Description>
<Specification xlink:href="http://www.example.com/parts/00345/specification"/>
<UnitCost currency="USD">0.10</UnitCost>
<Quantity>10</Quantity>
</p:Part>
REST VS. RPC INTERFACE PARADIGMS
Example from Wikipedia:
RPC
getUser(), addUser(), removeUser(), updateUser(), getLocation(), addLocation(), removeLocation()
exampleObject = new ExampleApp("example.com:1234");
exampleObject.getUser();
REST
http://example.com/users
http://example.com/users/{user}
http://example.com/locations
userResource = new Resource("http://example.com/users/001");
userResource.get();
EXAMPLES OF REST SERVICES
Various Amazon & Microsoft APIs
Facebook Graph API
Yahoo! Search APIs
Flickr
Twitter
Tesla Cars
…