1. What is the exact difference between the BASIC Transformer and the normal Transformer?
A. 1. The Transformer stage is inherent PX functionality, whereas the BASIC Transformer uses a Server interface to call a Server Transformer stage. There is a severe performance impact as well as partitioning limitations, but it does give a PX job some access to existing Server functionality.
2. There are two types of transformer: (i) the BASIC Transformer and (ii) the active Transformer. The BASIC Transformer is used on SMP systems and not on MPP or cluster systems. BASIC is the language supported by the DataStage server engine and is available in Server jobs. In DataStage PX, the active Transformer is used.
3. Transformer stages are always active stages. The BASIC Transformer stage is part of the Server product, but the PX engine allows this stage to be called (the opposite, using a PX stage in a Server job, is not possible).
2. Does the Sequential File stage accept .xl and XML files, and how?
Yes, it accepts them. Use a fixed line pattern.
3. What is the main difference between the Change Capture and Change Apply stages?
The Change Capture stage compares two data sets (before and after) and makes a record of the differences.
The Change Apply stage combines the changes from the Change Capture stage with the original before data set to reproduce the after data set.
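The capture-then-apply round trip described above can be sketched in plain Python. This is only an illustration of the idea, not how DataStage implements it: the function names `change_capture` and `change_apply`, the string change codes, and the `id` key column are all hypothetical choices made for this sketch, and unchanged rows are simply not emitted.

```python
# Hypothetical change codes for this sketch (not DataStage's own values).
INSERT, DELETE, EDIT = "insert", "delete", "edit"


def change_capture(before, after, key):
    """Compare two keyed data sets and return a list of change records."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}
    changes = []
    # Rows present in "after": either new (insert) or modified (edit).
    for k, row in after_by_key.items():
        if k not in before_by_key:
            changes.append({**row, "change_code": INSERT})
        elif row != before_by_key[k]:
            changes.append({**row, "change_code": EDIT})
    # Rows that exist only in "before" were deleted.
    for k, row in before_by_key.items():
        if k not in after_by_key:
            changes.append({**row, "change_code": DELETE})
    return changes


def change_apply(before, changes, key):
    """Apply change records to the before data set to rebuild the after data set."""
    result = {row[key]: dict(row) for row in before}
    for change in changes:
        row = {col: val for col, val in change.items() if col != "change_code"}
        if change["change_code"] == DELETE:
            result.pop(row[key], None)
        else:  # INSERT or EDIT: the change record carries the new row values
            result[row[key]] = row
    return sorted(result.values(), key=lambda r: r[key])


before = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
]
after = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Robert"},
    {"id": 4, "name": "Dee"},
]

changes = change_capture(before, after, "id")
rebuilt = change_apply(before, changes, "id")
```

Here `changes` holds one edit (id 2), one insert (id 4) and one delete (id 3), and `rebuilt` matches `after`, which is the property the two stages are meant to provide together.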
4. What is the difference between a server shared container and a parallel shared container?
1. Server shared containers contain server stage types; parallel shared containers contain parallel stage types.
2. When we use a parallel shared container, the logic can be reused across many jobs.
Introduction
DataStage Enterprise Edition is a package of three products: DataStage Server Edition, the parallel extender with parallel ETL jobs, and the MetaStage product described on the Metadata Workbench entry. The flagship tool of Enterprise Edition is parallel ETL jobs.
History
During the 1990s, data integration vendors such as Ascential and Informatica were competing to deliver tools that provided a wide range of data connectivity and transformation functions in a mostly code-free environment. Towards the late 1990s data stores were becoming large, and data warehouses and business intelligence were demanding larger volumes of data loads. The physical architecture of these loads was hitting a limit on the volume that a single server could handle and was moving towards clusters or grids of servers. The data integration vendors needed to be able to integrate data across a massively scalable architecture to keep up with the increased data volumes.
Ascential started to roll out a parallel capability in the DataStage Server Edition product called multiple instance jobs. This allowed some additional manual programming to partition and process data in parallel. In November 2001 they switched to a buy approach and purchased Torrent Systems for $46 million. Torrent had the capability to run tools on a massively parallel processing (MPP) platform.
Versions
This section lists each major release of DataStage Enterprise Edition and the enhancements for DataStage parallel jobs. For a list of enhancements to the client tools, see the versions on the DataStage Server Edition page, as it is the version that has been delivered with every release going back to DataStage 1. All releases of DataStage 7 can import and upgrade DataStage 6 export files. DataStage 8 can only import and upgrade DataStage 7.5.1 or 7.5.2 jobs.
DataStage 6
Released in September 2002, ten months after the acquisition of Torrent, it was the first version of DataStage to feature the Parallel Extender (PX), the parallel platform that allows processes to run in parallel across a multiple-processor environment.
New parallel job type with a new set of parallel stages, some with the same name as server job stages but with different properties and options.
Server job shared container for parallel jobs.
CPU-based licensing instead of server-based licensing.
Support for SAS 6.12 and 8.2.
This release was followed by the client-only 6.0.1 release, which fixed a number of problems.
DataStage 7
Released in September 2003, it uses much the same architecture as the previous version, with improvements to usability. This was the first release to have no server job improvements but many parallel job improvements.
XML Pack 2.0 provides improved XML metadata support for parallel jobs.
National Language Support (NLS) for parallel jobs, but not for all parallel stages.
Parallel shared and local stages.
Enhanced transformer with improved reject row handling, string handling, timestamp conversion and compile performance.
DataStage 7.5
Unknown release date.
Parallel complex flat file stage.
Modify, Switch and Filter stages added.
Multiple-instance parallel jobs.
Non-blocking funnel stage.
A parallel job message handler for demoting or removing warning messages from the job log.
Lookup stage changes from a property screen to a drag-and-drop mapping screen.
Multi-node import of sequential files.
Additional options for sequential file and file set stages, such as Read First Rows, Row Number Column and First Line is Column Names.
View data support for custom stages.
New Parallel Advanced Job Developers Guide.
DataStage 7.5.1
Released in March 2005.
New SQL Builder for building SQL query statements from a database plugin stage.
Command line job search function added.
DataStage parallel jobs for Unix System Services (USS) on the mainframe.
Remote job deployment to deliver and run jobs across a cluster or grid.
Vector support in the parallel transformer stage.
Sybase and ODBC stages added to parallel jobs.
Complex Flat File stage improvements: multiple output links, automatically generated fillers, MVS dataset support.
DataStage 7.5X2
Released in December 2004, this was the first release of parallel jobs that could run on Windows. The Server runs on all the same Unix and Linux platforms as 7.5.1, and this release adds Windows 2003 Standard or Enterprise on the Intel x86 processor family. There were no changes to parallel jobs in this release apart from the capability to compile and run them on Windows.
DataStage 8
Released in October 2006 for Windows and April 2007 for Unix, this is the first version to run on the IBM Information Server. There are a number of parallel job improvements in this release:
Lookup stage now supports two new lookup types: range lookup and caseless lookup.
Thread-based job monitoring for parallel jobs.
New Slowly Changing Dimension stage.
New QualityStage stages for parallel jobs.
What is the difference between a Filter and a Swit...
A Filter stage is used to filter the incoming data. For example, if you want the details of customer 20, give customer 20 as the constraint in the Filter stage and it will output only the customer 20 records. You can also add a reject link, in which case the rest of the records go to the reject link. In the Switch stage, by contrast, we define cases, such as case1 = 10 and case2 = 20, and the stage outputs the customer 10 and customer 20 records. The Switch stage checks each record against the cases and routes it accordingly.
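The contrast between the two stages can be sketched in Python. This is a rough analogy only, not DataStage code: `filter_stage`, `switch_stage`, the `customer` column and the sample rows are hypothetical names chosen for this sketch. The filter applies one constraint and sends everything else to a reject list, while the switch compares a selector column against a set of case values and routes each row to the matching output.

```python
def filter_stage(rows, predicate):
    """Split rows into (output, reject) according to a single constraint."""
    output, reject = [], []
    for row in rows:
        (output if predicate(row) else reject).append(row)
    return output, reject


def switch_stage(rows, selector, cases):
    """Route each row to the output whose case value equals row[selector]."""
    outputs = {name: [] for name in cases.values()}
    unmatched = []  # rows whose selector value matches no case
    for row in rows:
        name = cases.get(row[selector])
        if name is None:
            unmatched.append(row)
        else:
            outputs[name].append(row)
    return outputs, unmatched


rows = [
    {"customer": 10, "amount": 5},
    {"customer": 20, "amount": 7},
    {"customer": 30, "amount": 9},
]

# Filter: keep customer 20, send the rest to the reject list.
kept, rejected = filter_stage(rows, lambda r: r["customer"] == 20)

# Switch: case1 = 10, case2 = 20, as in the answer above.
outputs, unmatched = switch_stage(rows, "customer", {10: "case1", 20: "case2"})
```

With this sample data, `kept` contains only the customer 20 row and `rejected` the other two, while `outputs["case1"]` and `outputs["case2"]` receive the customer 10 and customer 20 rows respectively and customer 30 ends up in `unmatched`.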