Data-Structure-Graph-Book

This is the GitHub repository for the book Data Structure Graphs.

A Data Structure Graph is a group of atomic entities that are related to each other, stored in a repository, and moved from one persistence layer to another, rendered as a graph.

Essentially, it is a way to interpret either a data architecture or a data model as a graph. A number of metrics can be gathered from that graph to quantify the architectural design. The code in this repository demonstrates this, as documented in the book.
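
To make the idea concrete, here is a minimal sketch in Python that treats tables as nodes and foreign keys as directed edges, then gathers a few basic graph metrics. The table names, the `edges` list, and the choice of metrics (node count, edge count, in-degree and out-degree) are illustrative assumptions, not the specific metrics documented in the book or the code in this repository.

```python
from collections import Counter

# Hypothetical data model: each pair is (child_table, parent_table),
# meaning the child table holds a foreign key to the parent table.
edges = [
    ("order_item", "order"),
    ("order_item", "product"),
    ("order", "customer"),
    ("invoice", "order"),
]

def graph_metrics(edges):
    """Return basic metrics for a directed graph given as (source, target) pairs."""
    nodes = {name for edge in edges for name in edge}
    out_degree = Counter(src for src, _ in edges)
    in_degree = Counter(dst for _, dst in edges)
    return {
        "node_count": len(nodes),
        "edge_count": len(edges),
        # Counter returns 0 for tables that never appear on that side.
        "in_degree": {name: in_degree[name] for name in sorted(nodes)},
        "out_degree": {name: out_degree[name] for name in sorted(nodes)},
    }

metrics = graph_metrics(edges)
print(metrics["node_count"], metrics["edge_count"])
print(metrics["in_degree"])
```

In this toy model, a table with a high in-degree is one that many other tables reference, which is one simple way a graph view can highlight the central entities of a design.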

Here are some instructions for trying this yourself:

Download and install Gephi. You will probably need to modify the gephi.conf file to give it more resources; the settings I use are -J-Xms2048m -J-Xmx8192m -J-Xverify:none.
Follow these instructions to set up the Vertabelo reverse engineering tool: https://vertabelo.com/blog/reverse-engineering/
Run the Python script: ERD_to_DSG.py vertabelo_file.xml
This will produce two files: dsg_nodes_datetimestamp.csv and dsg_edges_datetimestamp.csv.
Open Gephi.
In Data Laboratory, click Import Spreadsheet.
Select the dsg_nodes file.
Ensure “Import as:” is set to nodes table.
Select Append to current workspace.
Click Import Spreadsheet again.
Select the dsg_edges file.
Ensure “Import as:” is set to edges table.
Select Append to current workspace.
Select Overview.
For the layout, choose “ForceAtlas 2”.
For the Behavior alternatives, select “Dissuade Hubs”, “LinLog Mode”, and “Prevent Overlap”.
Click Start and let it run for a few seconds until the graph is visually appealing, then click Stop.
On the right you should see a statistics window.
Choose Connected Components. A dialog box will pop up; choose Directed and click OK.
A graph will be displayed; just close it.
Go back to Data Laboratory.
Sort on the Component ID, and compare the resulting groups with the table names and subject areas you may already have had.
Each Component ID marks a group of tables that have a high affinity for each other. Essentially, these are the subject areas of the data model.
So long as good foreign keys (real or virtual) are defined, Gephi will identify the clusters of tables that belong to the same subject area.
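
The grouping produced by the Connected Components step can be approximated outside Gephi. The sketch below is an illustration under stated assumptions, not the code from this repository: it treats each foreign key as an undirected link and groups tables into weakly connected components with a breadth-first search. The table names and the `connected_components` helper are hypothetical, and the component numbers it assigns will not necessarily match the IDs Gephi assigns.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Map each node to a component ID, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    component_of = {}
    next_id = 0
    for start in sorted(neighbors):
        if start in component_of:
            continue
        # Breadth-first search labels every table reachable from `start`.
        component_of[start] = next_id
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbors[node]:
                if other not in component_of:
                    component_of[other] = next_id
                    queue.append(other)
        next_id += 1
    return component_of

# Hypothetical foreign keys: (child_table, parent_table).
edges = [
    ("order_item", "order"),
    ("order", "customer"),
    ("employee", "department"),
    ("timesheet", "employee"),
]

components = connected_components(edges)
for table, component in sorted(components.items(), key=lambda kv: (kv[1], kv[0])):
    print(component, table)
```

Here the sales tables and the staffing tables fall into two separate components because no foreign key links them. A table with no foreign keys at all never appears in an edge list, so it would end up isolated, which is why well-defined foreign keys matter for this analysis.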

Please leave comments or questions as you read through the book!

Thank you,

Doug Needham
