Components & Runtime Behaviour

The document outlines various components of AbInitio, categorized into folders such as Sort, Transform, Departition, Partition, Datasets, Database, Miscellaneous, and Validate. Each component is described with its parameters and functionalities, detailing operations like sorting, filtering, joining, and partitioning data records. Additionally, it includes examples of string functions and programming constructs used within the AbInitio environment.

AbInitio Components:

==============================
Sort Folder
--------------
Sort
Sort Within Groups
Partition by key and sort

Transform Folder
-------------------
Dedup Sorted
Filter By Expression
Fuse
Join
Normalize
ReFormat
RollUp
Scan

Departition
----------------
Concatenate
Gather
InterLeave
Merge

Partition
-------------
Partition By Round-Robin
Partition By Key
Partition By Expression
Partition By Range
Partition By Percentage
Partition with Load Balance
Broadcast

Datasets
--------------
Input File
Output File
LookUp File & Dynamic LookUp File
Intermediate File
Input Table
Output Table

Database
---------------
Input Table
Output Table
Run SQL
Update Table
Truncate Table

Miscellaneous
-------------------
Gather Logs
Meta Pivot
Redefine Format
Replicate
Run Program
Trash

Validate
-----------
Generate Records
Validate Records
Check Order
Compare Records
Compute Checksums
Compare Checksum

AbInitio Components
----------------------
Runtime Behaviour
Parameters

=============
Sort
=============
Sort reads records from its in port, orders them by the key, and writes the sorted
records to its out port.
Parameters:
Key: the field (column) or fields to sort on. It must be specified.
Max_Core: 10MB
Maximum memory the component may use for the whole sort operation.

===================
Sort Within Groups
====================
The input must already be sorted on the major key. Sort Within Groups then orders
the records of each major-key group by the minor key.
Parameters:
MajorKey
MinorKey
Max_Core:10MB
Allow unsorted:false
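
The following is a rough Python sketch of the idea, not Ab Initio code. It assumes the records arrive already ordered by the major key and reorders each run by the minor key. The field names `region` and `amount` and the helper `sort_within_groups` are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def sort_within_groups(records, major_key, minor_key):
    """Sort by minor_key inside each run of equal major_key values.

    Assumes the records already arrive ordered by major_key.
    """
    out = []
    for _, group in groupby(records, key=itemgetter(major_key)):
        out.extend(sorted(group, key=itemgetter(minor_key)))
    return out

rows = [
    {"region": "east", "amount": 30},
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 20},
    {"region": "west", "amount": 5},
]
result = sort_within_groups(rows, "region", "amount")
```

Only one group has to be held in memory at a time, which is presumably why the component asks for input that is already sorted on the major key.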

==========================
Partition by key and sort
==========================
Repartitions the data records by key values and then sorts the records within each
partition. The number of input and output partitions can be different.
Parameters:
Key
InputLayout
MaxCore
OutPutLayout

====================
Dedup Sorted
====================
Dedup Sorted separates one specified data record in each group of data records from
the rest of the records in the group.

Dedup Sorted requires grouped input.

Ports: input --> output, duplicate
Optional ports: reject, error, log
Parameters:
Key: Name(s) of the key field(s) Dedup Sorted uses to determine groups of data
records.
Select: Filter applied to records before Dedup Sorted separates duplicates.
Keep: first, last, or unique-only
first keeps the first record of a group. This is the default.
last keeps the last record of a group.
unique-only keeps only records with unique key values.
logging
reject-threshold
-----------------
Abort on first reject: the component stops the execution of the graph at the first
reject event it generates.
Never abort: the component does not stop the execution of the graph, no matter how
many reject events it generates.
Use ramp/limit: the component uses the settings in the ramp and limit parameters
to determine how many reject events to allow before it stops the execution of the
graph.
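
The three Keep settings can be modelled with a small Python sketch. This is an illustration under assumptions, not the component's implementation: the helper `dedup_sorted` is hypothetical, the two returned lists stand in for the output and duplicate ports, and under unique-only the sketch assumes every record of a multi-record group goes to the duplicate port.

```python
from itertools import groupby
from operator import itemgetter

def dedup_sorted(records, key, keep="first"):
    """Return (out, dup) lists for input that is already grouped by key."""
    out, dup = [], []
    for _, group in groupby(records, key=itemgetter(key)):
        group = list(group)
        if keep == "first":
            out.append(group[0])
            dup.extend(group[1:])
        elif keep == "last":
            out.append(group[-1])
            dup.extend(group[:-1])
        elif keep == "unique-only":
            if len(group) == 1:
                out.append(group[0])
            else:
                dup.extend(group)  # assumption: whole group is treated as duplicates
        else:
            raise ValueError(f"unknown keep value: {keep}")
    return out, dup

rows = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
first_out, first_dup = dedup_sorted(rows, "id", keep="first")
last_out, _ = dedup_sorted(rows, "id", keep="last")
unique_out, unique_dup = dedup_sorted(rows, "id", keep="unique-only")
```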

=========================
Filter By Expression
=========================
Filter by Expression filters data records according to a specified DML expression.
It can be compared with the WHERE clause of a SQL SELECT statement.
The select expression can use DML functions, including lookup.
Ports: input --> output, deselect
Optional ports: reject, error, log
Parameters:
Select expr: the condition used to filter data records.
reject-threshold
logging
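
As a rough Python analogy (not DML), the select expression acts as a predicate that routes each record to one of two ports. The helper `filter_by_expression` and the `amount` field are hypothetical.

```python
def filter_by_expression(records, select_expr):
    """Route each record to out (predicate true) or deselect (predicate false)."""
    out, deselect = [], []
    for rec in records:
        (out if select_expr(rec) else deselect).append(rec)
    return out, deselect

rows = [{"amount": 50}, {"amount": 150}, {"amount": 300}]
# Comparable to: SELECT * FROM rows WHERE amount > 100
selected, deselected = filter_by_expression(rows, lambda r: r["amount"] > 100)
```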

=============
ReFormat
=============
Reformat changes the record format of data records by dropping fields, or by using
DML expressions to add fields, combine fields, or transform the data in the
records.

Example expression in a transform:
lookup("lkp_file", in.ProductID).Category
Parameters:
Count
select
transform()
reject-threshold
logging
Output_index
out::output_index(in)=
begin

end;
Output_indexes

=================
LookUp File
=================
It is used as a reference file. An input file can be matched against the lookup file
on a key, so that the output can take the required columns from both.
Key: the column(s) on which records are matched.
RecordFormat: specifies the columns in the lookup file.

==============
Join
==============
2 input ports(default)
1 output port
2 unused ports(default)
2 reject ports(default)
2 error ports(default)
1 log port(default)
Parameters:
count
sorted-input
key
transform
join-type
record-required0
record-required1
dedup0
dedup1
select0
select1
override-key0
override-key1
driving
maintain-order
max-core
reject-threshold
logging

===================
Checkpoint Sort
===================
Parameters:
Key
Max_Core:100 MB

========
Fuse
========
Fuse combines multiple input flows into a single output flow by applying a
transform function to corresponding records of each flow.
2 i/p ports,1 o/p port
optional ports:reject,log,error
Parameters:
Count
Transform
Reject-Threshold
Logging
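
A minimal Python sketch of "corresponding records of each flow": the first record of one flow is paired with the first record of the other, and so on. The helper `fuse` is hypothetical, and the sketch simply stops at the shorter input, which is an assumption and may differ from what the component does with flows of unequal length.

```python
def fuse(flow0, flow1, transform):
    """Apply transform to the i-th record of each flow, in order."""
    return [transform(a, b) for a, b in zip(flow0, flow1)]

names = [{"name": "ann"}, {"name": "bob"}]
amounts = [{"amount": 10}, {"amount": 20}]
fused = fuse(names, amounts, lambda a, b: {**a, **b})
```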

=============
Normalize
=============
Generates multiple output data records from each input data record.
Normalize can separate a data record with a vector field into several individual
records, each containing one element of the vector.
Parameters:
transform
reject-threshold
Logging
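
The vector case can be pictured with a short Python sketch. It is only an analogy: the helper `normalize` and the field names `id`, `phones` and `phone` are made up.

```python
def normalize(records, vector_field, element_field):
    """Emit one output record per element of each record's vector field."""
    out = []
    for rec in records:
        base = {k: v for k, v in rec.items() if k != vector_field}
        for element in rec[vector_field]:
            out.append({**base, element_field: element})
    return out

rows = [{"id": 1, "phones": ["111", "222"]}, {"id": 2, "phones": ["333"]}]
flat = normalize(rows, "phones", "phone")
```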

=========
RollUp
=========
Generates data records that summarize groups of data records. Rollup in Memory
maximizes performance by keeping intermediate results in main memory.
Parameters:
sorted-input
key-method:key specifier/key change Function
key
transform
reject-threshold
logging
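
A rough Python model of an in-memory rollup that sums one field per key. The helper `rollup`, the output field `total` and the sample field names are assumptions for illustration. The intermediate results live in a dictionary, so the input does not need to be sorted.

```python
def rollup(records, key, value_field):
    """Summarize each group of records into one record holding a total."""
    totals = {}
    for rec in records:
        totals[rec[key]] = totals.get(rec[key], 0) + rec[value_field]
    return [{key: k, "total": v} for k, v in totals.items()]

rows = [
    {"dept": "A", "sal": 100},
    {"dept": "B", "sal": 50},
    {"dept": "A", "sal": 25},
]
summary = rollup(rows, "dept", "sal")
```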

=======
Scan
=======
Generates a series of cumulative summary records--such as year-to-date totals--for
groups of data records. Scan Sorted requires grouped input.
Parameters:
sorted-input
key-method:key specifier/key change Function
key
transform
reject-threshold
logging
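
Unlike a rollup, a scan emits one output record per input record, each carrying the running summary so far. A minimal Python sketch, assuming grouped input. The helper `scan` and the field name `running_total` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def scan(records, key, value_field):
    """Attach a cumulative total to every record, restarting for each group."""
    out = []
    for _, group in groupby(records, key=itemgetter(key)):
        running = 0
        for rec in group:
            running += rec[value_field]
            out.append({**rec, "running_total": running})
    return out

rows = [
    {"year": 2000, "amt": 10},
    {"year": 2000, "amt": 5},
    {"year": 2001, "amt": 7},
]
ytd = scan(rows, "year", "amt")
```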

===========
Concatenate
============
Appends multiple flow partitions of data records one after another.

========
Gather
=========
Combines data records from multiple flow partitions arbitrarily.

===========
InterLeave
===========
Combines blocks of data records from multiple flow partitions in round-robin
fashion.
Parameters:
Blocksize

=========
Merge
=========
Combines data records from multiple flow partitions that have been sorted according
to the key specifier, and maintains the sort order.
Parameters:
key
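
The order-preserving combination can be sketched in Python with `heapq.merge`, which repeatedly takes the smallest head record among inputs that are already sorted. This is an analogy only, and the helper `merge` and the `id` field are made up.

```python
import heapq
from operator import itemgetter

def merge(partitions, key):
    """Combine partitions that are each sorted on key, keeping the sort order."""
    return list(heapq.merge(*partitions, key=itemgetter(key)))

p0 = [{"id": 1}, {"id": 4}, {"id": 9}]
p1 = [{"id": 2}, {"id": 3}, {"id": 10}]
merged = merge([p0, p1], "id")
```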

=========================
Partition By Round-Robin
=========================
Distributes data records evenly to each output flow in round-robin fashion.

Use the Interleave component to reverse the effects of Partition by Round-robin.


Parameters:
Blocksize
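
A small Python sketch of how round-robin distribution and Interleave undo each other when both use the same block size. The helpers `partition_round_robin` and `interleave` are hypothetical models, not the components themselves.

```python
def partition_round_robin(records, n_partitions, block_size=1):
    """Deal consecutive blocks of records to the partitions in turn."""
    parts = [[] for _ in range(n_partitions)]
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        parts[(start // block_size) % n_partitions].extend(block)
    return parts

def interleave(parts, block_size=1):
    """Read one block from each partition in turn until all are exhausted."""
    out = []
    offsets = [0] * len(parts)
    while any(off < len(p) for off, p in zip(offsets, parts)):
        for i, p in enumerate(parts):
            out.extend(p[offsets[i]:offsets[i] + block_size])
            offsets[i] += block_size
    return out

records = list(range(10))
parts = partition_round_robin(records, n_partitions=3, block_size=2)
restored = interleave(parts, block_size=2)
```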

================
Partition By Key
================
Distributes data records to its output flow partitions according to key values.
Parameters:
key
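
The property that matters is that records with the same key value always land in the same partition. A Python sketch of that idea follows. The CRC32 hash is a stand-in chosen for this example, since the notes do not say which function the component uses, and the helper `partition_by_key` is hypothetical.

```python
import zlib

def partition_by_key(records, key, n_partitions):
    """Send each record to the partition chosen by a hash of its key value."""
    parts = [[] for _ in range(n_partitions)]
    for rec in records:
        digest = zlib.crc32(str(rec[key]).encode("utf-8"))  # stand-in hash
        parts[digest % n_partitions].append(rec)
    return parts

rows = [{"cust": c, "n": i} for i, c in enumerate(["a", "b", "a", "c", "b", "a"])]
parts = partition_by_key(rows, "cust", 3)
```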

========================
Partition By Expression
========================
Distributes data records to its output flow partitions according to a specified DML
expression.
Parameters:
Function

==================
Partition By Range
==================
Distributes data records to its output flow partitions according to ranges of key
values specified for each partition.
Parameters:
key
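
Range partitioning can be pictured as a binary search of each key against a sorted list of boundary values. In this Python sketch the helper `partition_by_range` and the `splitters` argument are assumptions, as is the choice that a key equal to a boundary falls into the lower partition.

```python
from bisect import bisect_left

def partition_by_range(records, key, splitters):
    """Place each record in the partition whose key range contains its key."""
    parts = [[] for _ in range(len(splitters) + 1)]
    for rec in records:
        parts[bisect_left(splitters, rec[key])].append(rec)
    return parts

rows = [{"age": a} for a in (5, 18, 19, 40, 65, 70)]
low, middle, high = partition_by_range(rows, "age", [18, 65])
```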

========================
Partition By Percentage
========================
Distributes a specified percentage of the total number of input data records to
each output flow.
Parameters:
Percentages

=============================
Partition with Load Balance
=============================
Distributes data records to output flow partitions, writing more records to the
flow partitions that consume records faster.

===========
Broadcast
===========
Distributes data by combining input data records into a single flow and writing a
copy of that flow to each output flow partition.

===========
Gather Logs
===========
Collects the output from log ports of components for analysis of a graph after
execution.
Parameters:
LogFile
StartText
EndText

============
Meta Pivot
============
Pivots around one or more fields in the input
Parameters:
name_field
value_field
pivot1
pivot2
pivot3

=============
Redefine Format
=============
Copies data records from its input to its output without changing the values. Use
Redefine Format to change a record format or rename fields.

=============
Replicate
=============
Arbitrarily combines all the data records it receives into a single flow and writes
a copy of that flow to each of its output flows.

===============
Run Program
===============
Executes a standard UNIX or Windows NT program.
Parameters:
commandline

=======
Trash
=======
Ends a flow by discarding all input data records.

===========
LookUp File
===========
Lookup Files are components containing shared data. Use lookup files with the DML
lookup functions to access records according to a key.
Parameters:
key
RecordFormat

String functions
===================
string_length("abc def")->7
string_length("")->0
string_compare("aaa","bbb")-> -1
string_compare("bbb","aaa")->1

string_index("ABCD,FG,HJ,KL",",")->5
string_index("to be late be","be")->4
string_index("abc","x")->0
string_index("abc","")->1

string_substring("abcdefgh",3,4)->cdef

string("|") str = "ABCD,FG,HJ,KL";
integer(4) l = string_length(str);-->13
integer("|") n = string_index(str,",");-->5
string_substring(str,1,n-1)-->"ABCD"
l=l-n;-->8
string_substring(str,n+1,l)-->"FG,HJ,KL"
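
The results above depend on positions being counted from 1, with 0 meaning "not found". A Python sketch that mimics this behaviour may help when checking such expressions by hand. The two functions below are stand-ins written for this note, not the DML functions themselves.

```python
def string_index(s, sub):
    """1-based position of the first occurrence of sub in s, or 0 if absent."""
    return s.find(sub) + 1

def string_substring(s, start, length):
    """Substring of s beginning at 1-based position start, at most length long."""
    start = max(start, 1)
    return s[start - 1:start - 1 + length]

text = "ABCD,FG,HJ,KL"
n = string_index(text, ",")
head = string_substring(text, 1, n - 1)
rest = string_substring(text, n + 1, len(text) - n)
```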

string_split("ABCD,FG,HJ,KL",",")-->[vector "ABCD","FG","HJ","KL"];
string_split("Rini Jain"," ")-->[vector "Rini","Jain"];
first_name=string_split(in.FULLNAME," ")[0];
last_name=string_split(in.FULLNAME," ")[1];

string(16) str = "abc";


string(16)[2] str = [vector "abc","def"]

string_rindex(s,"n")--->9

string_filter("ABC","ABC")-->ABC
string_filter("AxByCz","ABC")-->ABC

string_like("abcdef","abc%")-->1
string_like("abcdef","abc_")-->0
string_like("abcdef","abc_ef")-->1

decimal(",") phone = "9870651233";


decimal(",")[2] phone = [vector "5432167898","1234567654"];

type my_rec=
record
string(",") s;
decimal(",") d;
end;

my_rec r=[record s "abc" d 3000];


====================================
out::function(c,n)=
begin

end;
dt is the same record type as lookup file
lookup("dept",in.key);
========================================
while(i<9)
begin
end
===================
integer(4) i=0;
for(i,i<4)
begin
end
=====================
integer(4) i = 2000;
string(16) s = (string(16))(i);

date("YYYY-DD-MM") dt = "2000-02-03";
date("MMDDYYYY") dt1 = (date("MMDDYYYY"))(dt);
======================================
Topics covered above: record types; syntax of functions, loops and type conversion;
vector basics; return value of the lookup function.
==========================================
string(",") FULLNAME="Shri Ganeshaya Namah";
string(",")[3] s=string_split(FULLNAME," ");
s[0]="Shri"
s[1]="Ganeshaya"
s[2]="Namah"

firstName=string_split(FULLNAME," ")[0];
middleName=string_split(FULLNAME," ")[1];
lastName=string_split(FULLNAME," ")[2];

out::reformat(in) =
begin
  let string("|") s = "";
  let integer(4) i = 0;
  let integer(4) c = lookup_count("DEPT", in.dn);

  for (i, i < c)
  begin
    s = string_concat(s, "|", lookup_next("DEPT").dname);
  end

  out.id :: in.id;
  out.name :: in.name;
  out.dn :: in.dn;
  out.dname :: lookup("DEPT", in.dn).dname;
  out.lkp_count :: c;
  out.lkp_s :: s;
end;
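
To see what the loop in this transform builds, here is a Python model of it. It is a sketch under assumptions: the lookup file is represented as a dictionary from key to a list of matching records, and `collect_dnames` is a made-up helper that mirrors the count-then-iterate pattern of lookup_count and lookup_next.

```python
def collect_dnames(dept_lookup, dn):
    """Return (match count, '|'-joined dnames) for all records matching dn."""
    matches = dept_lookup.get(dn, [])
    s = ""
    for rec in matches:
        # Mirrors string_concat(s, "|", ...): every name gets a leading "|".
        s = s + "|" + rec["dname"]
    return len(matches), s

dept = {10: [{"dname": "Sales"}, {"dname": "Ops"}], 20: [{"dname": "HR"}]}
count_10, names_10 = collect_dnames(dept, 10)
count_99, names_99 = collect_dnames(dept, 99)
```

Note that the concatenated string starts with a separator, so a caller that wants a clean list would need to strip the first character.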

Dataware house class


SQL
Unix
