© 2014 TIMi S.A.S. – TIMi: Faster predictions, better decisions.
Training objectives
What will you be able to do at the end of the training?
• Build Anatella scripts to transform data.
• Extract and load data to and from Anatella.
• Become proficient at using Anatella.
Presentation Exercises
Training Agenda
• What is Anatella
• The Anatella Environment
• Basic Operations of Anatella
• Anatella boxes you cannot live without
• Practical exercise
What is Anatella?
Anatella is an ETL: it extracts, transforms, and
loads data
Anatella is a data Transformation tool
• known as an “ETL tool”, an acronym for “Extract, Transform and Load”
[Diagram: Data file → Extract → Transformations → Load → Transformed results]
Anatella is User-Friendly:
• Most data transformations are metadata-free: you don’t need to care about the data type of a column. In this
regard, Anatella is like MS-Excel: in MS-Excel you don’t need to specify the data type of your columns or cells, and
neither do you in Anatella. Anatella is only slightly more complex than MS-Excel.
• Most data-transformations are code-free: You only need to connect "boxes":
Filter (where) → Group by → Order by
Equivalent SQL:
SELECT education, COUNT(education) AS count FROM table
WHERE sex = "Female"
GROUP BY education
ORDER BY count
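As a rough illustration of the box chain above, the same filter, group-by and order-by steps can be sketched in plain Python. This is not Anatella code: the sample rows and the function names are made up for the example, and each function simply stands in for one box.

```python
from collections import Counter

# Hypothetical sample rows; the column names follow the SQL query above.
rows = [
    {"sex": "Female", "education": "Masters"},
    {"sex": "Male", "education": "Bachelors"},
    {"sex": "Female", "education": "Bachelors"},
    {"sex": "Female", "education": "Bachelors"},
]

def filter_rows(rows):
    # "Filter" box: WHERE sex = "Female"
    return [r for r in rows if r["sex"] == "Female"]

def group_by_education(rows):
    # "Group by" box: COUNT(education) per education value
    return Counter(r["education"] for r in rows)

def order_by_count(counts):
    # "Order by" box: ascending count
    return sorted(counts.items(), key=lambda item: item[1])

# Each step consumes the output of the previous one, like linked boxes.
result = order_by_count(group_by_education(filter_rows(rows)))
print(result)
```

Because every step returns its own intermediate result, each one can be inspected separately, which mirrors the idea of viewing data after each box.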
The advantage of Anatella
Fast to execute & debug, easier to understand
Fast to execute
• Anatella is heavily optimized for execution speed
Fast to debug
• Data can be viewed after each step (not only at the final output)
Easy to understand
• Graphical interface accessible to non-programmers
• Step-by-step logic; no coding required.
[Figure: side-by-side comparison of Anatella and SQL]
The Anatella environment
Anatella has a user-friendly interface with no coding required.
Menu: list of menus for quick actions.
Data Table: results of an action are displayed here.
Action Properties: where action box properties are modified.
Log: the log file of actions and transformations is kept here.
Data is extracted, transformed, and loaded in an easy, intuitive way:
• Extraction from scratch, from text files, from Gel files, from a DB
• Transformations on the data: sorts, aggregations, calculations, graph analysis, …
• Load into flat files, Gel files, a DB
Building and running Anatella scripts
Anatella transformation graphs are composed of boxes linked by arrows
• Boxes indicate operations on the underlying data.
• Arrows indicate that the data at the output pin of one box is used as input for the following box.
• The flag marks the end of the graph.
Example of how to build a transformation graph
1. Select the “connect” mode to build arrows. In this mode, click on the output pin of the outgoing box, then on the input pin of the incoming box, to create an arrow.
2. Drag and drop boxes from the right panel to the middle frame to add them to the graph.
3. Double-click on a box to edit its properties in the lower left frame.
4. Right-click on the flag and select the green arrow to run the complete graph.
Testing scripts for intermediate results
• Click on “run” to switch to run mode.
• In run mode, when you click on an output pin, the graph runs from the last saved result up to this output pin.
• The status bar shows the overall progress of the calculation.
• This icon (the rotating cube) shows that the box is currently running.
Running a transformation graph
Box Description
Run to finish line:
Click on this flag to run the graph from the last cached point to the flag. This is a quick and
efficient way to test scripts; however, it is not advised in production.
Delete all caches and run to finish line:
Click on this flag to run the graph from its beginning to the flag. This
method deletes all saved caches on the graph. It is best practice to use this method
in production.
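The difference between the two run modes can be pictured with a small Python sketch. It only models the idea of resuming from the last cached result versus clearing every cache first; how Anatella actually stores its caches is not described here, and the function name `run_to_flag` and the toy steps are assumptions made for the illustration.

```python
def run_to_flag(steps, source, cache, clear_cache=False):
    """Run (name, function) steps in order; return the result and the steps that ran."""
    if clear_cache:
        cache.clear()  # "Delete all caches and run to finish line"
    # Look for the last step whose output is already cached and resume after it.
    start, data = 0, source
    for i in range(len(steps) - 1, -1, -1):
        name = steps[i][0]
        if name in cache:
            start, data = i + 1, cache[name]
            break
    executed = []
    for name, func in steps[start:]:
        data = func(data)
        cache[name] = data  # keep every intermediate result
        executed.append(name)
    return data, executed

# Toy graph of three steps (hypothetical).
steps = [
    ("double", lambda rows: [x * 2 for x in rows]),
    ("keep_big", lambda rows: [x for x in rows if x > 2]),
    ("total", lambda rows: sum(rows)),
]
cache = {}
first, ran_first = run_to_flag(steps, [1, 2, 3], cache)
del cache["total"]  # pretend only the last step needs to be recomputed
second, ran_second = run_to_flag(steps, [1, 2, 3], cache)
third, ran_third = run_to_flag(steps, [1, 2, 3], cache, clear_cache=True)
print(ran_first, ran_second, ran_third)
```

The second call reruns only the last step, which is why the cached mode is fast for testing; the third call recomputes everything from the source, so no stale intermediate result can leak into the output.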
Operations of Standard Boxes
It is important to understand the most commonly used boxes. They fall into three groups:
• Extraction and Loading
• Transformation
• Automation
Extraction boxes
Boxes Description
Extraction types:
These boxes are used to extract data from flat files or from Gel files. Gel is a highly optimized
data file format that is unique to the Anatella software.
Example of box parameters (Read .csv):
• File name
• Column delimiter
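To show what the column delimiter parameter does, here is a minimal Python sketch using the standard csv module. It is not the Read .csv box itself; the sample text, the ";" delimiter and the function name `read_delimited` are assumptions, and in-memory text replaces a file name so that the example is self-contained.

```python
import csv
import io

def read_delimited(text, delimiter=";"):
    """Parse delimited text into a list of row dicts keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

# Hypothetical file content; every value is read as a string.
sample = "Office;Agent;Amount\nDBN;Adam;400\nJHB;Paul;450\n"
rows = read_delimited(sample, delimiter=";")
print(rows)
```

With the wrong delimiter the whole line would be read as a single column, which is why this parameter is usually the first thing to check when an extraction looks wrong.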
Automation boxes
Box Description
Global runner:
Scripts often need to run automatically. The global runner box is used as
an “end” box for a script. Anatella’s top menu has a global runner icon; when it
is clicked, all boxes linked to a global runner box will run.
Parallel run:
The parallel run box is used to run a series of Anatella scripts from one script. Scripts
are often built in isolation, each doing a certain transformation; the parallel run box
lets users create a list of scripts and runs all of them in the specified order.
Transformation boxes
Box Description
Append box:
This box is used to append one table to another. It is equivalent to a “UNION” statement in SQL.
Table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
Table 2:
Office  Agent  Amount
PTA     Dela   R 320
ELN     Chris  R 470
CPT     Adam   R 800
JHB     John   R 120
Result (Union):
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
PTA     Dela   R 320
ELN     Chris  R 470
CPT     Adam   R 800
JHB     John   R 120
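The append operation shown above can be sketched in Python as a plain concatenation of rows. Note the assumption: this sketch keeps every row, duplicates included (like SQL's UNION ALL); the slide does not say how the Append box treats duplicate rows. The function name is made up for the example.

```python
def append_tables(*tables):
    """Concatenate tables (lists of row dicts) that share the same columns."""
    combined = []
    for table in tables:
        combined.extend(table)
    return combined

# Data from the slide, with amounts as plain numbers.
table_1 = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
]
table_2 = [
    {"Office": "PTA", "Agent": "Dela", "Amount": 320},
    {"Office": "ELN", "Agent": "Chris", "Amount": 470},
    {"Office": "CPT", "Agent": "Adam", "Amount": 800},
    {"Office": "JHB", "Agent": "John", "Amount": 120},
]
union = append_tables(table_1, table_2)
print(len(union))
```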
Single Join box:
Used to join two tables on a specific key. Both tables must be sorted on the key beforehand.
Table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
Table 2:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112
Result (Left Join):
Office  Agent  Amount  Area
DBN     Adam   R 400   21
JHB     Paul   R 450   63
CPT     Lilly  R 620   112
GMR     Jenny  R 300
Single Join box (Inner Join):
Table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
Table 2:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112
Result (Inner Join):
Office  Agent  Amount  Area
DBN     Adam   R 400   21
JHB     Paul   R 450   63
CPT     Lilly  R 620   112
Single Join box (Full Outer Join):
Table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
Table 2:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112
Result (Full Outer Join):
Office  Agent  Amount  Area
DBN     Adam   R 400   21
JHB     Paul   R 450   63
CPT     Lilly  R 620   112
GMR     Jenny  R 300
PTA                    83
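The three join types can be sketched in Python with a merge join, a technique that walks two sorted tables in parallel and therefore needs both inputs sorted on the key. This is an illustration, not Anatella's implementation: the function name `merge_join` is made up, the sketch assumes each key occurs at most once per table (true for the slide data), and missing values are represented by None.

```python
def merge_join(left, right, key, how="inner"):
    """Join two lists of row dicts that are already sorted on `key`.

    `how` is "inner", "left" or "full". Assumes unique keys in each table.
    """
    blank_left = {c: None for c in (left[0] if left else {}) if c != key}
    blank_right = {c: None for c in (right[0] if right else {}) if c != key}
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk == rk:  # key present on both sides
            out.append({**left[i], **right[j]})
            i, j = i + 1, j + 1
        elif lk < rk:  # left row has no partner
            if how in ("left", "full"):
                out.append({**left[i], **blank_right})
            i += 1
        else:  # right row has no partner
            if how == "full":
                out.append({**blank_left, **right[j]})
            j += 1
    if how in ("left", "full"):
        out.extend({**row, **blank_right} for row in left[i:])
    if how == "full":
        out.extend({**blank_left, **row} for row in right[j:])
    return out

# Slide data, sorted on the join key before joining.
sales = sorted([
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
], key=lambda r: r["Office"])
areas = sorted([
    {"Office": "JHB", "Area": 63},
    {"Office": "DBN", "Area": 21},
    {"Office": "PTA", "Area": 83},
    {"Office": "CPT", "Area": 112},
], key=lambda r: r["Office"])

left_join = merge_join(sales, areas, "Office", how="left")
inner_join = merge_join(sales, areas, "Office", how="inner")
full_join = merge_join(sales, areas, "Office", how="full")
print(len(left_join), len(inner_join), len(full_join))
```

The comparison `lk < rk` only makes sense when both tables are sorted on the key, which is one way to see why a sort step comes before this kind of join.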
Multi Join box:
Similar to the Single Join box; however, tables can be joined on multiple join keys specified by
the user. The keys do not have to be sorted, as the complete slave tables are stored in memory.
Master table:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
Slave table 1 (key: Office):
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112
Slave table 2 (key: Agent):
Agent   Target
Adam    R 750
Lilly   R 750
Michel  R 650
Result (Multiple Left Join):
Office  Agent  Amount  Area  Target
DBN     Adam   R 400   21    R 750
JHB     Paul   R 450   63
CPT     Lilly  R 620   112   R 750
GMR     Jenny  R 300
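Keeping the slave tables in memory is what removes the sorting requirement: each slave can be indexed by its key and looked up row by row. The Python sketch below illustrates that idea with dictionaries. It is a simplified model, not the Multi Join box itself; the function name is made up, each slave is assumed to be non-empty with unique keys, and missing values are represented by None.

```python
def multi_left_join(master, slaves):
    """Left-join `master` with several slave tables.

    `slaves` is a list of (rows, key) pairs. Every slave table is indexed
    in memory, so neither the master nor the slaves need to be sorted.
    """
    result = [dict(row) for row in master]
    for rows, key in slaves:
        index = {row[key]: row for row in rows}  # whole slave table in memory
        extra_cols = [c for c in rows[0] if c != key]
        for out in result:
            match = index.get(out[key])
            for c in extra_cols:
                out[c] = match[c] if match else None
    return result

# Slide data, left unsorted on purpose.
master = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
]
areas = [
    {"Office": "JHB", "Area": 63},
    {"Office": "DBN", "Area": 21},
    {"Office": "PTA", "Area": 83},
    {"Office": "CPT", "Area": 112},
]
targets = [
    {"Agent": "Adam", "Target": 750},
    {"Agent": "Lilly", "Target": 750},
    {"Agent": "Michel", "Target": 650},
]
joined = multi_left_join(master, [(areas, "Office"), (targets, "Agent")])
print(joined)
```

The trade-off is memory: this approach is convenient when the slave tables are small enough to hold in memory, while a sorted join avoids that cost.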
Sort box:
This box is used to sort data. It is very commonly used in Anatella because sorting is a
prerequisite for several other operations. For example, data has to be sorted on the
join key before joining.
Input:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
Result (sorted on Amount, descending):
Office  Agent  Amount
CPT     Lilly  R 620
JHB     Paul   R 450
DBN     Adam   R 400
GMR     Jenny  R 300
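In Python terms, the sort shown above corresponds to ordering the rows on the Amount column in descending order; sorting on Office instead would prepare the same table for a join on that key. This sketch uses the built-in `sorted` function and is only an illustration of the operation.

```python
rows = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
]

# Sort on Amount, largest first (the order shown in the result table).
by_amount_desc = sorted(rows, key=lambda r: r["Amount"], reverse=True)

# Sort on the join key, as required before a sorted join on Office.
by_office = sorted(rows, key=lambda r: r["Office"])

print([r["Agent"] for r in by_amount_desc])
print([r["Office"] for r in by_office])
```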
Aggregation box:
All aggregations are done with this box. It is equivalent to a “GROUP BY” statement in SQL.
Input:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
PTA     Dela   R 320
ELN     Chris  R 470
CPT     Adam   R 800
JHB     John   R 120
Result (sum of Amount per Office):
Office  Amount_sum
DBN     R 400
JHB     R 570
CPT     R 1420
GMR     R 300
PTA     R 320
ELN     R 470
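The aggregation above (sum of Amount per Office) can be reproduced with a short Python sketch. The function name `sum_by` is made up for the example; it accumulates one running total per key value, which is the essence of a GROUP BY with SUM.

```python
def sum_by(rows, key, value):
    """Return {key value: sum of `value`} over all rows."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals

rows = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
    {"Office": "PTA", "Agent": "Dela", "Amount": 320},
    {"Office": "ELN", "Agent": "Chris", "Amount": 470},
    {"Office": "CPT", "Agent": "Adam", "Amount": 800},
    {"Office": "JHB", "Agent": "John", "Amount": 120},
]
amount_sum = sum_by(rows, "Office", "Amount")
print(amount_sum)
```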
Data type wizard box:
Used to convert data types. For example, change integers to float values or to string values.
String manipulation box:
This box is used to treat/clean strings (text). E.g.: remove brackets, convert all letters to
capitals, or replace words with other words.
Column rename box:
This box is used to rename columns in your data. It is often useful before loading the
data into Excel/Tableau/QlikView.
Column selection box:
This box is used to choose certain columns in the data. It is equivalent to a “SELECT” statement in SQL.
Date formatter box:
This box is used to convert dates to a specific format.
For example, “2012-02-02” to “12/02/02”
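The date conversion in the example above can be sketched with Python's standard datetime module. The format codes are Python's, not the Date formatter box's own settings, and the output is assumed to be year/month/day with a two-digit year (for this particular date, other orderings would produce the same text).

```python
from datetime import datetime

def reformat_date(text, in_fmt="%Y-%m-%d", out_fmt="%y/%m/%d"):
    """Parse `text` using `in_fmt` and write it back using `out_fmt`."""
    return datetime.strptime(text, in_fmt).strftime(out_fmt)

converted = reformat_date("2012-02-02")
print(converted)
```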
Calculator box:
Used to perform calculations based on several columns and various data types.
With this box, you can create and/or update columns. Below are examples of how to
use this box. Use the help tab for more information about available functions:
Calculating profit:
Qty * (Price_per_unit - Cost_per_unit)
Concatenating name and surname:
name // "-" // surname
Returning “yes” if X is greater than 10:
X > 10 ? "yes" : "no"
Calculating profit example:
Qty * (Price_per_unit - Cost_per_unit)
Office          Price per unit  Cost per unit  Quantity  Profit
Yokohama Tires  R 40            R 30           1200      R 12 000
Dunlop Tires    R 57            R 45           1400      R 16 800
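The profit formula can be checked against the table with a few lines of Python. The function and parameter names mirror the column names of the expression and are otherwise arbitrary.

```python
def profit(qty, price_per_unit, cost_per_unit):
    """Same arithmetic as Qty * (Price_per_unit - Cost_per_unit)."""
    return qty * (price_per_unit - cost_per_unit)

yokohama = profit(qty=1200, price_per_unit=40, cost_per_unit=30)
dunlop = profit(qty=1400, price_per_unit=57, cost_per_unit=45)
print(yokohama, dunlop)
```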
Concatenating name and surname example:
name // "-" // surname
Agent  Surname  Concatenation
Jacob  Zuma     Jacob-Zuma
Helen  Zille    Helen-Zille
Tony   Stark    Tony-Stark
Returning “yes” if X is greater than 10 example (here X is the Millions column):
X > 10 ? "yes" : "no"
Agent  Millions  Rich
Jacob  210       yes
Helen  70        yes
Tony   6         no
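For readers more familiar with general-purpose languages, the concatenation and conditional expressions used in the Calculator box examples map onto Python as sketched below. The function names are hypothetical; string concatenation uses `+` in Python, and the `? :` condition becomes a conditional expression.

```python
def concatenate(name, surname, sep="-"):
    """Counterpart of: name // "-" // surname"""
    return name + sep + surname

def is_rich(x, threshold=10):
    """Counterpart of: X > 10 ? "yes" : "no" """
    return "yes" if x > threshold else "no"

full_names = [concatenate(n, s) for n, s in
              [("Jacob", "Zuma"), ("Helen", "Zille"), ("Tony", "Stark")]]
rich_flags = [is_rich(m) for m in (210, 70, 6)]
print(full_names, rich_flags)
```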
Filter box:
This box keeps only the rows of data that satisfy a given criterion. It is
equivalent to a “WHERE” clause in SQL.
Below are examples of how to use this box. Use the help tab for more
information about available functions:
Keep only waybills from CPT with a weight greater than 6:
Loading == "CPT" && weight > 6
Remove nulls from File Reference:
not(isNull(FileRef))
Keep names whose first three letters are “Dav”:
left(name,3) == "Dav"
Keep only waybills from CPT with a weight greater than 6:
Loading == "CPT" && weight > 6
Input:
Waybill  Loading  Weight
1234     CPT      5
1235     CPT      12
1236     JHB      14
1237     JHB      7
1238     JHB      5
1239     DBN      14
Result:
Waybill  Loading  Weight
1235     CPT      12
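The waybill filter above combines two conditions with a logical AND. A Python sketch of the same row selection, using the slide data, looks like this (the variable names are arbitrary):

```python
waybills = [
    {"Waybill": 1234, "Loading": "CPT", "Weight": 5},
    {"Waybill": 1235, "Loading": "CPT", "Weight": 12},
    {"Waybill": 1236, "Loading": "JHB", "Weight": 14},
    {"Waybill": 1237, "Loading": "JHB", "Weight": 7},
    {"Waybill": 1238, "Loading": "JHB", "Weight": 5},
    {"Waybill": 1239, "Loading": "DBN", "Weight": 14},
]

# Keep a row only when BOTH conditions hold (&& in the filter expression).
kept = [w for w in waybills if w["Loading"] == "CPT" and w["Weight"] > 6]
print(kept)
```

Waybill 1234 is from CPT but too light, and waybills 1236 and 1239 are heavy enough but not from CPT, so only 1235 passes.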
Remove nulls from File Reference example:
not(isNull(FileRef))
Input:
FileRef  Loading  Weight
HANLB2   CPT      5
D4355    CPT      12
C3452    JHB      14
(null)   JHB      7
(null)   JHB      5
23NULL3  DBN      14
Result:
FileRef  Loading  Weight
HANLB2   CPT      5
D4355    CPT      12
C3452    JHB      14
23NULL3  DBN      14
Keep names whose first three letters are “Dav”:
left(name,3) == "Dav"
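The last two filter examples, removing rows with a null file reference and keeping names that start with “Dav”, can be sketched in Python as follows. Two assumptions are made for the illustration: a missing value is represented by None, and the list of names is invented because the slides give no sample data for this case.

```python
rows = [
    {"FileRef": "HANLB2", "Loading": "CPT", "Weight": 5},
    {"FileRef": "D4355", "Loading": "CPT", "Weight": 12},
    {"FileRef": "C3452", "Loading": "JHB", "Weight": 14},
    {"FileRef": None, "Loading": "JHB", "Weight": 7},
    {"FileRef": None, "Loading": "JHB", "Weight": 5},
    {"FileRef": "23NULL3", "Loading": "DBN", "Weight": 14},
]

# Counterpart of not(isNull(FileRef)): a true null is removed, while a
# text value that merely contains the letters "NULL" is kept.
not_null = [r for r in rows if r["FileRef"] is not None]

# Counterpart of left(name,3) == "Dav", on hypothetical names.
names = ["David", "Dave", "Daniel", "Davina"]
dav_names = [n for n in names if n[:3] == "Dav"]

print([r["FileRef"] for r in not_null], dav_names)
```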
Transformation boxes: calculator & filterRows
Below is a short summary of the main functions available:
Operators: + - * / ^
Comparison: ==, >, <, <=, >=, !=
Logical: &&, ||
Condition: (x > a ? "True" : "False")
Format: ftoa, atof, itoa
Math: abs, floor, ceil, round, sum, max, min, sqrt, …
Char: right, left, substr, strlen, toupper, tolower, indexof, …
Special: isNull, nDaysInMonth, nvl
Constants: _pi, _e, _n, _null
Thank you for your attention
For more information, please visit our website:
http://www.business-insight.com
Backup Slides
The following slides are not part of the
presentation. They are used occasionally to
answer specific technical questions.
Is it possible to comment code?
Yes, it is. You can put comments everywhere:
• Directly on the graph.
• In the JavaScript.
• In the SQL (put "--" at the beginning of a line).
Can we directly edit .anatella files?
Yes, for a technician.
• This file is a simple XML file (a text file) that is formatted so that a human can directly and easily
understand and change it.
• For example, you can directly and easily edit the SQL statements inside the .anatella file.
• You can use any Unicode-capable text editor to edit .anatella files. For example, you can use the free editor
“EditPadLite7”.