The Protocol Informatics Project

Automating Network Protocol Analysis

Debuted at Toorcon 2004 by
Marshall Beddoe ([email protected])
Copyright © 2004 Baseline Research

http://www.baselineresearch.net
Before We Start

• HTTP will be used to visualize the concept
  - Most people know HTTP
  - PowerPoint slides are only so large

• Questions will be gladly answered at the end

My email: [email protected]
PI Homepage: http://www.baselineresearch.net/PI
Objective of Protocol Analysis

• Determine protocol fields

• Understand structure of requests and responses
• Simplified plaintext example: HTTP
  - GET /index.html HTTP/1.0
  - GET: Keyword
  - /index.html: Filename
  - HTTP/1.0: Keyword
• Why is this knowledge important?
  - Understanding proprietary protocols
  - Finding vulnerabilities in unknown or badly documented protocols
Problems with Protocol Analysis

• Binary protocols
• Large amount of data
• Dynamically sized fields
• Time consuming
• Amazingly boring

• There must be a better way…
  - Enter bioinformatics
Bioinformatics

• What is Bioinformatics?
  - “The use of mathematical and informational techniques, including statistics, to solve biological problems” - Wikipedia
  - Processing of large amounts of structured, yet complex data
  - Operates on large sequences of strings to find patterns
  - Objective: To find genes that produce specific proteins by performing a series of comparisons.
    - Mapping of phenotypes to genotypes
    - Example: Attached earlobes to the sequence: ATTGAC
Protocol Analysis & Bioinformatics

• Similarities
  - Both operate on large sequences of data
  - Whereas bioinformatics helps find specific genes that produce proteins, protocol analysis finds specific fields in a packet
  - Both work through a series of compares and contrasts between a large number of samples
• Creating an application that helps understand structured, complex data would be an asset when doing this type of analysis.
Tech Behind the Talking Points

• Sequence Alignment
  - Needleman-Wunsch
• Similarity Matrices
  - BLOSUM, PAM
• Phylogenetic Trees
  - UPGMA
• Multiple Alignment
  - Phylogeny
Sequence Alignment

• Base technology used in bioinformatics

• Idea: Take two sequences regardless of length and align them to each other so both have equal length
• Gaps are inserted when needed to achieve the maximum alignment of the sequences
• Example of amino acid alignment:
  TCAT---CAA
  ||||   |||
  TCATGGGCAA
• Notice the gaps inserted into sequence one to force length alignment
• Simple concept, right?
Needleman-Wunsch Algorithm

• Dynamic programming algorithm

• Performs global alignment on a pair of sequences
  - Global means that all characters in the sequence participate in the alignment
  - What goes in, comes out

• Used for analyzing closely related structures

Dynamic Programming

• Dynamic programming is not coding

• Idea: Break problem into sub-problems

• Operations mainly on matrices

• Results of previous computations are saved and used by the remaining sub-problems

• Needleman-Wunsch is a DP algorithm

How NW Works

• Sequence one is placed in the top-most row and sequence two is placed in the left-most column.
• For each cell, perform the following:
  - Assign similarity values
  - Assess possible pathways through the matrix (left, up and diagonal), assigning the current cell the value of the maximum scoring pathway using:
    $$M_{i,j} = \max\left(M_{i-1,j-1} + S_{i,j},\; M_{i,j-1} + w,\; M_{i-1,j} + w\right)$$
    where w is the gap penalty (currently 0) and S is the similarity weight
  - Construct a pathway from the highest scoring cell to the beginning of the matrix to get the maximum global alignment
• A gap penalty is used to decrease the number of gaps in the final alignment
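
The procedure above can be sketched in Python. This is a minimal illustration, not the PI implementation: the function name `needleman_wunsch`, the default scoring function (1 for identical characters, 0 otherwise) and the `_` gap character are assumptions chosen to match the slides.

```python
def needleman_wunsch(seq1, seq2, score=lambda a, b: 1 if a == b else 0, w=0, gap="_"):
    """Globally align seq1 (top row) with seq2 (left column).

    Returns (alignment score, aligned seq1, aligned seq2).
    `score` plays the role of S and `w` is the gap penalty.
    """
    rows, cols = len(seq2) + 1, len(seq1) + 1
    # M[i][j] is the best score for aligning seq2[:i] with seq1[:j].
    M = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        M[i][0] = M[i - 1][0] + w
    for j in range(1, cols):
        M[0][j] = M[0][j - 1] + w
    for i in range(1, rows):
        for j in range(1, cols):
            M[i][j] = max(
                M[i - 1][j - 1] + score(seq2[i - 1], seq1[j - 1]),  # diagonal
                M[i][j - 1] + w,                                    # left
                M[i - 1][j] + w,                                    # up
            )

    # Walk back from the bottom-right cell to the origin.
    out1, out2 = [], []
    i, j = rows - 1, cols - 1
    while i > 0 or j > 0:
        if i > 0 and j > 0 and M[i][j] == M[i - 1][j - 1] + score(seq2[i - 1], seq1[j - 1]):
            out1.append(seq1[j - 1]); out2.append(seq2[i - 1])
            i, j = i - 1, j - 1
        elif j > 0 and M[i][j] == M[i][j - 1] + w:
            out1.append(seq1[j - 1]); out2.append(gap)   # left: gap in sequence 2
            j -= 1
        else:
            out1.append(gap); out2.append(seq2[i - 1])   # up: gap in sequence 1
            i -= 1
    return M[-1][-1], "".join(reversed(out1)), "".join(reversed(out2))


total, top, left = needleman_wunsch("GET /index.html HTTP/1.0", "GET / HTTP/1.0")
print(total)
print(top)
print(left)
```

With these defaults the traceback prefers the diagonal whenever it is consistent with the cell value, so ties are broken toward pairing characters rather than inserting gaps. Other tie-breaking orders give different but equally scored alignments.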
In Other Words: Step One

[Figure: similarity matrix with "GET /index.html HTTP/1.0" across the top row and "GET / HTTP/1.0" down the left column; each cell whose row and column characters match is marked 1.]

• Characters that are similar receive a scoring of 1 (for now)

In Other Words: Step Two

    G E T ␣ / i n d e x . h t m l ␣ H T T P / 1 . 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
G 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
E 0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
T 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
␣ 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
/ 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
␣ 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6
H 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 7 7 7 7 7 7 7 7
T 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 7 8 8 8 8 8 8 8
T 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 7 8 9 9 9 9 9 9
P 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 7 8 9 A A A A A
/ 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 7 8 9 A B B B B
1 0 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 6 7 8 9 A B C C C
. 0 1 2 3 4 5 5 5 5 5 5 6 6 6 6 6 6 7 8 9 A B C D D
0 0 1 2 3 4 5 5 5 5 5 5 6 6 6 6 6 6 7 8 9 A B C D E

(␣ marks a space character; values above 9 are written in hexadecimal, so E = 14)

Starting at position 1,1

For each cell:
$$M_{i,j} = \max\left(M_{i-1,j-1} + S_{i,j},\; M_{i,j-1} + w,\; M_{i-1,j} + w\right)$$
In Other Words: Step Three

• Starting in the cell with the highest value (0xE), traverse the matrix to the beginning
What did this do?

• Now that we have computed a path through the matrix, we can apply the rules of NW to obtain two aligned sequences

• Anytime the path travels upwards or to the left, a gap is inserted into a sequence

• Upwards affects sequence 1 (row)

• Left affects sequence 2 (column)
The Result

GET /index.html HTTP/1.0
|||||          |||||||||
GET /__________ HTTP/1.0
Analyzing the Results

GET /index.html HTTP/1.0
|||||          |||||||||
GET /__________ HTTP/1.0

• We can easily discern the protocol fields from these results

1. GET / is considered a keyword
2. index.html had no alignment, and is therefore considered a variable length field
3. Followed by keyword HTTP/1.0
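
Reading fields off an aligned pair can be mechanized. The sketch below is illustrative only: `split_fields` is a hypothetical helper, and it assumes `_` is reserved as the gap character (real traffic containing underscores would need a different marker).

```python
from itertools import groupby


def split_fields(aligned1, aligned2, gap="_"):
    """Split an aligned pair into runs of 'keyword' and 'variable' columns.

    A column is a keyword column when both sequences carry the same
    non-gap character; every other column is treated as variable.
    The text of each run is taken from aligned1.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have equal length")
    labels = [
        "keyword" if a == b and a != gap else "variable"
        for a, b in zip(aligned1, aligned2)
    ]
    fields, pos = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        fields.append((label, aligned1[pos:pos + n]))
        pos += n
    return fields


for label, text in split_fields("GET /index.html HTTP/1.0",
                                "GET /__________ HTTP/1.0"):
    print(f"{label:8} {text!r}")
```

For the pair shown on this slide the function yields three runs: the keyword `GET /`, the variable field `index.html`, and the keyword ` HTTP/1.0` (with its leading space).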
Similarity Matrices

• Each character similarity is weighted
• In the earlier NW example, the value of S was 1 for matching characters
• In bioinformatics, similarity matrices are used to optimize alignments of sequences.
  - Markov chain probability table
  - Based on observed mutations accepted in evolution. Adenine can mutate into thymine, etc.
• Applications to protocol analysis? Data types
  - Binary data mutates into other binary data, as ASCII mutates into other ASCII
PI Similarity Matrices

• 256x256 matrix
• Contains mutation probabilities between every character
• Direct match has probability of 1
• Others are categorized and weighted
• Arbitrary example:
  - ASCII character set, probability = 0.3
  - ASCII printable, probability = 0.4
  - Binary, probability = 0.4
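
A 256x256 table of this kind could be built as follows. This is a sketch under stated assumptions, not PI's actual matrix: the weights are the arbitrary ones from the slide, "binary" is taken to mean byte values of 128 and above, and the order in which the categories are tested is a guess.

```python
import string

# Byte values that Python's string.printable considers printable ASCII.
PRINTABLE = {ord(c) for c in string.printable}


def build_similarity_matrix(match=1.0, ascii_w=0.3, printable_w=0.4, binary_w=0.4):
    """Return S where S[a][b] weights byte a 'mutating' into byte b."""
    S = [[0.0] * 256 for _ in range(256)]
    for a in range(256):
        for b in range(256):
            if a == b:
                S[a][b] = match            # direct match
            elif a in PRINTABLE and b in PRINTABLE:
                S[a][b] = printable_w      # both printable ASCII
            elif a < 128 and b < 128:
                S[a][b] = ascii_w          # both 7-bit ASCII, at least one non-printable
            elif a >= 128 and b >= 128:
                S[a][b] = binary_w         # both high ("binary") bytes; an assumption
            # mixed ASCII/binary pairs keep weight 0.0
    return S


S = build_similarity_matrix()
print(S[ord("a")][ord("a")], S[ord("a")][ord("b")], S[0x00][ord("a")], S[0xC8][0xC9])
```

A table like this can replace a flat match score of 1 in a pairwise alignment by looking up `S[x][y]` for each pair of byte values.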
What this Allows

• This allows more optimized alignments, with sequences converging on similar data types, and reduces the number of incorrect gaps

• Similarity matrices must be tweaked
  - It is not uncommon to spend a lot of time creating these matrices
  - Bioinformatics scientists spend years perfecting their versions of similarity matrices (BLOSUM, PAM, etc.)
What Now?

• Illustrated the ability to align two sequences to each other and discern protocol fields

• Shown how similarity matrices can be used to optimize alignment

• Is it really useful to compare only two sequences?
Multiple Sequence Alignment

• Act of aligning more than 2 sequences

• Uses NW as alignment algorithm

• Computation issues
Computation of Multiple NW

• To perform the NW algorithm on multiple sequences, a hypercube would be traversed
• This leads to NP-completeness
• The cost grows as $2^n \times L^n$, where n is the number of sequences and L is the length of the sequences
• In other words, our sun will supernova before finishing the alignment of 1000 sequences of 800 bytes each
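
To get a feel for the size of that number, the estimate can be evaluated on a logarithmic scale. The helper name below is hypothetical; the formula is the one quoted on this slide.

```python
import math


def hypercube_cost_log10(n, length):
    """Return log10 of 2**n * length**n for n sequences of the given length."""
    return n * (math.log10(2) + math.log10(length))


exponent = hypercube_cost_log10(1000, 800)
print(f"roughly 10^{exponent:.0f} steps")
```

For 1000 sequences of 800 bytes the exponent is a little over 3204, far beyond anything that can be enumerated.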
Heuristic Sequence Alignment

• Sacrificing accuracy for time

• Objective: To align every sequence to each other in a reasonable amount of time

• However, results are never perfect

Phylogenetic Trees

• A tree of evolutionary development
  - Used in biology to construct taxonomic groupings based purely on DNA analysis as opposed to fossil records
• Typically binary trees
• Interesting parallel in protocol analysis
  - A protocol mimics evolution by changing fields
  - This can be characterized as a mutation
• What came first? GET /index.html or GET / ?
Phylogeny in Biology
Creating Phylogenetic Trees

• UPGMA cluster distance algorithm
  - Unweighted Pair Group Method using Arithmetic Averages

$$d_{i,j} = \frac{1}{|C_i|\,|C_j|} \sum_{p \in C_i,\; q \in C_j} d_{p,q}$$

where $d_{i,j}$ is the distance between two clusters $C_i$ and $C_j$
Building the Tree

1. Place each sequence into an individual cluster, insert cluster into universal set
2. Use UPGMA algorithm to calculate distance between each cluster, finding two clusters where $d_{i,j}$ is minimal
3. Create a new cluster k: $C_k = C_i \cup C_j$
4. Define a node k with child nodes i and j
5. Add $C_k$ to the universal set and remove $C_i$ and $C_j$
6. Repeat steps 2 to 5 until a single cluster remains
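
The tree-building steps can be sketched as below. This is a naive illustration that recomputes average distances from the original pairwise table on every round; the `upgma` function name, the `frozenset`-keyed distance table and the nested-tuple tree are assumptions made for the example.

```python
from itertools import combinations


def upgma(dist):
    """Build a tree from pairwise distances.

    dist maps frozenset({i, j}) to the distance between sequences i and j.
    Returns nested tuples; each tuple is a node with two children.
    """
    leaves = sorted({i for pair in dist for i in pair})
    # The "universal set": cluster (frozenset of leaf ids) -> tree node.
    clusters = {frozenset([i]): i for i in leaves}

    def d(ci, cj):
        # Arithmetic average of all leaf-to-leaf distances between clusters.
        total = sum(dist[frozenset((p, q))] for p in ci for q in cj)
        return total / (len(ci) * len(cj))

    while len(clusters) > 1:
        # Find the two clusters whose average distance is minimal.
        ci, cj = min(combinations(clusters, 2), key=lambda pair: d(*pair))
        # New node k with children i and j; Ck replaces Ci and Cj.
        node = (clusters.pop(ci), clusters.pop(cj))
        clusters[ci | cj] = node
    return next(iter(clusters.values()))


# Hypothetical distances between three sequences 0, 1 and 2.
example = {frozenset((0, 1)): 1.0, frozenset((0, 2)): 4.0, frozenset((1, 2)): 5.0}
print(upgma(example))
```

In the example, sequences 0 and 1 are closest, so they are joined first and sequence 2 attaches to that pair at the root.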
Phylogeny in Protocol Analysis

Phylogenetic tree of the SMB protocol

More Than a Pretty Picture

Phylogenetic tree of the SMB protocol

The Tree is your Guide

• Helps categorize subtypes of a particular protocol
  - SMB contains at least 11 main subtypes as illustrated
• Tree acts as a guide to perform actual multiple sequence alignment
• As opposed to NP-complete hypercube traversal, the UPGMA tree performs n comparisons where n is equal to the depth of the tree.
Multiple Sequence Alignment

• Rule: Once a gap, always a gap

• Recursive Traversal Mechanism
  - If root is NULL, go left, then right
  - If left is !NULL and right is !NULL, align sequences and choose the sequence with the fewest gaps inserted.
    - Seq1: GET /index.html HTTP/1.0
    - Seq2: GET /__________ HTTP/1.0
    - Therefore: Seq1 is chosen to be the representative
  - Place new sequence in root
  - Keep track of edits in edge
Tree Traversal Algorithm

[Figure, shown in three steps: leaves 1, 2 and 3. Leaves 1 and 2 are aligned first, producing node 1’ with edges labeled E(1, 2) and E(2, 1). Node 1’ is then aligned with leaf 3, producing root 1’’ with edges labeled E(1’, 3) and E(3, 1’).]
Therefore

Sequence 1 Aligned = E(1,2) + E(1’,3)

Sequence 2 Aligned = E(2,1) + E(1’,3)
Sequence 3 Aligned = E(3, 1’)
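
One way to read the "+" above is as applying the gap edits stored on each edge in order, from the leaf up to the root. The sketch below assumes an edit is simply the list of output columns where a gap was inserted; that representation, the function names and the sample edits are all hypothetical.

```python
GAP = "_"


def apply_edit(seq, gap_columns):
    """Insert GAP into seq so that gaps occupy the given output columns."""
    gaps = set(gap_columns)
    total = len(seq) + len(gaps)
    if any(col < 0 or col >= total for col in gaps):
        raise ValueError("gap column outside the resulting sequence")
    chars = iter(seq)
    return "".join(GAP if col in gaps else next(chars) for col in range(total))


def apply_edits(seq, edits):
    """Apply a chain of edits in order; existing gaps are never removed."""
    for gap_columns in edits:
        seq = apply_edit(seq, gap_columns)
    return seq


# Hypothetical edits: E(2,1) opens ten gap columns after "GET /",
# and a later edit from the parent node opens one more at column 4.
e_2_1 = range(5, 15)
e_parent = [4]
print(apply_edit("GET / HTTP/1.0", e_2_1))
print(apply_edits("GET / HTTP/1.0", [e_2_1, e_parent]))
```

Because each edit only inserts gaps, a gap introduced low in the tree survives every later step, which is the "once a gap, always a gap" rule.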
Analyzing the Results Qualitatively

Example

GET /cgi-bin__/whois.pl HTTP/1.0 Host: _____a___rin.net User-Agent: __Opera____ Accept: text/xml

GET /__i___ndex.h___tml HTTP/1.0 Host: www.yahoo___.com User-Agent: Mozilla/5.0 Accept: text/xml
GET /__________________ HTTP/1.0 Host: www.__google.com User-Agent: ______IE4.0 Accept: text/xml

GET /?????????????????? HTTP/1.0 Host: ????????????.??? User-Agent: ??????????? Accept: text/xml

Conclusion:
GET / <variable> HTTP/1.0 Host: <variable>.<variable> User-Agent: <variable> Accept: text/xml

Definitely works on binary protocols, but isn’t as apparent on slides.

Analyzing the Results Quantitatively
• Statistical analysis on columns
  - Histograms
  - Build a consensus sequence, as performed on the previous slide
  - Mutation rates & offset comparison
• Group based on mutation rate: sequence IDs, checksums
• Beware of junk data
  - In the last example, junk data could have been a POST in a sea of GETs
• Classification is your friend
  - If you can adequately classify in the beginning, the results will be clearer
  - Entropic edit distance
  - N-gram analysis
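
The column statistics can be sketched with a per-column histogram. The function names and the definition of mutation rate used here (the share of sequences that differ from the most common value in a column) are assumptions for illustration; the talk does not define the measure precisely.

```python
from collections import Counter


def column_histograms(aligned):
    """Return one Counter per column of a list of equal-length aligned sequences."""
    return [Counter(column) for column in zip(*aligned)]


def consensus(aligned, wildcard="?"):
    """Keep a character where every sequence agrees, otherwise emit the wildcard."""
    return "".join(
        next(iter(hist)) if len(hist) == 1 else wildcard
        for hist in column_histograms(aligned)
    )


def mutation_rates(aligned):
    """Per column, the fraction of sequences that differ from the most common value."""
    n = len(aligned)
    return [1 - hist.most_common(1)[0][1] / n for hist in column_histograms(aligned)]


# Small hypothetical alignment; "_" marks a gap.
sample = ["GET /a_ H", "GET /_b H", "GET /__ H"]
print(consensus(sample))          # GET /?? H
print(mutation_rates(sample))
```

Columns with a mutation rate of zero are candidates for keywords; columns with similar non-zero rates can then be grouped and inspected together.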
Experimental Phase

• Initial thought: Simply separate dynamic data from static data; however, this is not descriptive enough
• Identifying integer fields: Build n-gram frequency tables for 1, 2 and 4 byte window sizes
• Observe rate of mutation for each n-gram
  - Example 1: If two consecutive bytes mutate at the same rate, chances are they are part of the same field, perhaps a checksum
  - Example 2: If, in two consecutive bytes, the least significant byte increments faster than the most significant byte, it may be a 16-bit sequence identifier field.
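
Example 2 can be illustrated by measuring how often each byte offset changes between consecutive packets. This is a simplified sketch: it looks at single bytes rather than full n-gram tables, and the `change_rates` helper and the synthetic capture are hypothetical.

```python
import struct


def change_rates(packets):
    """Per byte offset, the fraction of consecutive packet pairs where the byte changed."""
    if len(packets) < 2:
        raise ValueError("need at least two packets")
    width = min(len(p) for p in packets)
    pairs = list(zip(packets, packets[1:]))
    return [sum(a[k] != b[k] for a, b in pairs) / len(pairs) for k in range(width)]


# Synthetic capture: a 16-bit big-endian counter followed by one constant byte.
packets = [struct.pack(">HB", seq, 0x7F) for seq in range(600)]
rates = change_rates(packets)
for offset, rate in enumerate(rates):
    print(f"offset {offset}: changes in {rate:.1%} of consecutive packets")
```

In this synthetic capture the low byte at offset 1 changes on every packet, the high byte at offset 0 changes only when the low byte wraps, and offset 2 never changes. Two adjacent offsets with that pattern are what the slide describes as a likely 16-bit sequence identifier.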
Next Steps

• Current Ideas
  - Building protocol profile on each sequence individually, filtering out deviants
  - Build single consensus sequence to describe entire protocol
    - Not usually feasible since many block-based protocols such as ISAKMP, SMB, etc. have many layers.
  - Present data in an intuitive way to allow improved human estimation and understanding
    - Colors, interface design, etc.
    - This can never be fully automated if accuracy is in mind
Applications

• This technology can be used for:
  - Understanding unknown protocols
  - Fuzzing network protocols more efficiently
    - Instead of writing protocol specifications to fuzz against, have them be auto-generated from a tcpdump sample
  - Learning the structure of any sequence containing complex and somewhat random data
• Do you have any ideas?
Conclusions

• Will never be 100% automated
• Experimental technology
• Framework under development
  - Python/C++, cross platform
  - Widget based visual programming interface similar to the Orange data mining application (http://magix.fri.uni-lj.si/orange/)
  - Open source and looking for interested people to help
• Closing note: Solutions to computer related problems can be found in other sciences. It is important to expand your horizons.
Questions/Comments/Ideas?

• Thanks for coming

Marshall Beddoe [email protected]

Baseline Research http://www.baselineresearch.net/PI

If you are interested in contributing, please contact me.
