Module 3 - Working With Files

This document covers the Talend Data Integration module focusing on working with files, including supported formats like CSV, Excel, and JSON. It details components for reading and writing these file types, defining schemas, debugging, and iterating over multiple files. The document concludes with a summary of key functionalities and a hands-on exercise.

Uploaded by rizqi ardiansyah
Copyright: © All Rights Reserved
Talend Data Integration and Big Data
Module 3: Working with Files
Working with Files
• Understanding supported file formats in Talend
• Reading files using tFileInputDelimited, tFileInputExcel, and tFileInputJSON
• Writing data to files with output components
• Defining and previewing file schema
• Debugging with tLogRow
• Iterating over multiple files using tFileList
Supported File Formats
Format            Use Case                        File Extension
CSV (Delimited)   Tabular, simple export/import   .csv, .tsv
Excel             Structured spreadsheets         .xls, .xlsx
JSON              Hierarchical, API data          .json
Reading CSV – tFileInputDelimited
• Component: tFileInputDelimited
• Reads row-by-row from delimited text files
• Key settings:
    • File Path
    • Field Separator
    • Header row
    • Schema definition
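As a rough analogy for how these settings interact, here is a minimal Python sketch of a row-by-row delimited reader. It is not Talend's generated code (Talend jobs compile to Java); the function name `read_delimited`, the `SCHEMA` mapping, and the sample data are all hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical schema: column name -> type, standing in for "Edit Schema".
SCHEMA = {"id": int, "name": str, "amount": float}

def read_delimited(stream, field_separator=";", header_rows=1, schema=SCHEMA):
    """Yield one typed dict per data row, skipping the header rows."""
    reader = csv.reader(stream, delimiter=field_separator)
    for _ in range(header_rows):
        next(reader, None)  # skip header lines, like the Header row setting
    for raw in reader:
        if not raw:
            continue  # ignore blank lines
        # Pair each schema column with the value at the same position and cast it.
        yield {col: cast(value) for (col, cast), value in zip(schema.items(), raw)}

# In-memory sample standing in for the File Path setting.
sample = io.StringIO("id;name;amount\n1;Alice;10.5\n2;Bob;7\n")
rows = list(read_delimited(sample))
print(rows)
```

The point of the sketch is that the separator and header count decide how lines are split and which lines count as data, while the schema decides how each field is named and typed.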
Reading Excel – tFileInputExcel
• Component: tFileInputExcel
• Allows:
    • Selecting sheet
    • Setting header and data range
    • Mapping schema
• Supports .xls and .xlsx
Reading JSON – tFileInputJSON
• Component: tFileInputJSON
• Reads hierarchical JSON using JSONPath
• Key parts:
    • Loop JsonPath
    • Field Mapping
    • Schema definition
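The relationship between the loop path and the field mapping can be sketched in plain Python. This is an illustration under assumptions, not Talend's implementation: it does not use a JSONPath library, the loop is approximated by a single top-level key (roughly what a path like `$.orders[*]` would select), and the sample document, `MAPPING`, and function names are hypothetical.

```python
import json

# Hypothetical sample document with one repeating element: "orders".
document = json.loads("""
{"store": "north",
 "orders": [
   {"id": 1, "customer": {"name": "Alice"}, "total": 10.5},
   {"id": 2, "customer": {"name": "Bob"}, "total": 7.0}
 ]}
""")

# Field mapping: output column -> key path relative to each loop element.
MAPPING = {
    "order_id": ["id"],
    "customer_name": ["customer", "name"],
    "total": ["total"],
}

def resolve(node, path):
    """Walk a list of keys down into a nested dict."""
    for key in path:
        node = node[key]
    return node

def read_json_rows(doc, loop_key, mapping):
    """Emit one flat row per element found under the loop key."""
    for element in doc[loop_key]:
        yield {col: resolve(element, path) for col, path in mapping.items()}

json_rows = list(read_json_rows(document, "orders", MAPPING))
print(json_rows)
```

Each element matched by the loop becomes one output row, and each mapped path fills one schema column of that row.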
Writing CSV – tFileOutputDelimited
• Component: tFileOutputDelimited
• Writes row-by-row to CSV
• Options:
    • Include header
    • Delimiter settings
    • Overwrite or append mode
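The three options can be illustrated with a small Python writer. This is a sketch, not Talend's generated code; `write_delimited` and its parameters are hypothetical names, and the choice to skip the header when appending to a non-empty file is a design decision of this sketch rather than a documented guarantee.

```python
import csv
import os
import tempfile

def write_delimited(path, rows, columns, field_separator=";",
                    include_header=True, append=False):
    """Write rows to a delimited file, overwriting or appending."""
    file_has_data = os.path.exists(path) and os.path.getsize(path) > 0
    # Sketch-level choice: never repeat the header when appending to existing data.
    write_header = include_header and not (append and file_has_data)
    mode = "a" if append else "w"
    with open(path, mode, newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=field_separator, lineterminator="\n")
        if write_header:
            writer.writerow(columns)
        for row in rows:
            writer.writerow([row[col] for col in columns])

# Usage: first call overwrites, second call appends to the same file.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")
write_delimited(out_path, [{"id": 1, "name": "Alice"}], ["id", "name"])
write_delimited(out_path, [{"id": 2, "name": "Bob"}], ["id", "name"], append=True)
with open(out_path, encoding="utf-8") as handle:
    written = handle.read()
print(written)
```

Running the second call without `append=True` would replace the file contents, which is the practical difference between the two modes.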
Writing Excel – tFileOutputExcel
• Component: tFileOutputExcel
• Exports data into .xls or .xlsx
• Customizable:
    • Sheet name
    • Append mode
    • Cell formatting
Writing JSON – tFileOutputJSON
• Component: tFileOutputJSON
• Outputs structured JSON
• Requires:
    • Name of data block (the top-level key that wraps the output rows; root and row tags belong to tFileOutputXML)
    • Proper schema
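To show what "structured JSON" output can look like, here is a minimal Python sketch that wraps flat rows under a single top-level key. The key name `data`, the function `rows_to_json`, and the sample rows are assumptions for illustration; the layout a real job produces depends on the component's settings.

```python
import json

def rows_to_json(rows, data_block="data"):
    """Serialize rows as one JSON object holding an array of row objects."""
    return json.dumps({data_block: list(rows)}, indent=2)

payload = rows_to_json([
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
])
print(payload)
```

The schema matters here because the column names become the JSON keys of every row object.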
Defining File Schema & Preview
• Use “Edit Schema” to define columns and types
• Use “Preview” to check file structure before running
• Reuse schema via Repository for consistency
Debugging with tLogRow
• Component: tLogRow
• Displays row data in:
    • Table
    • Basic
    • Vertical format
• Used for testing and validation
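The three display modes differ only in layout. The Python sketch below approximates each one so the difference is easy to see; the exact characters and borders Talend prints are not reproduced here, and all function names are hypothetical.

```python
def format_basic(row, separator="|"):
    """One line per row, values joined by a separator."""
    return separator.join(str(value) for value in row.values())

def format_vertical(row):
    """One line per column, as 'name : value' pairs."""
    width = max(len(key) for key in row)
    return "\n".join(f"{key.ljust(width)} : {value}" for key, value in row.items())

def format_table(rows):
    """All rows in a bordered grid with a header line."""
    columns = list(rows[0])
    widths = [max(len(col), *(len(str(r[col])) for r in rows)) for col in columns]

    def line(values):
        cells = (str(v).ljust(w) for v, w in zip(values, widths))
        return "| " + " | ".join(cells) + " |"

    border = "+-" + "-+-".join("-" * w for w in widths) + "-+"
    body = [line([r[col] for col in columns]) for r in rows]
    return "\n".join([border, line(columns), border, *body, border])

sample_rows = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
print(format_basic(sample_rows[0]))
print(format_vertical(sample_rows[0]))
print(format_table(sample_rows))
```

A compact one-line layout suits high row counts, while the vertical and table layouts are easier to read when checking a handful of rows by eye.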
Iterating Files – tFileList
• Component: tFileList
• Allows looping over files in a directory
• Common use case:
    • tFileList → Iterate → tFileInputDelimited
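The iterate pattern amounts to "for each matching file, run the reader once". A minimal Python sketch of that idea follows; it is an analogy rather than Talend's implementation, and the file mask, the `source_file` column, and the temporary sample files are assumptions made for the example.

```python
import csv
import tempfile
from pathlib import Path

def iterate_files(directory, filemask="*.csv"):
    """Yield matching file paths in a stable order, one per iteration."""
    yield from sorted(Path(directory).glob(filemask))

def read_all(directory, filemask="*.csv", field_separator=";"):
    """Run the delimited reader once per file and tag rows with their source."""
    for path in iterate_files(directory, filemask):
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle, delimiter=field_separator):
                yield {"source_file": path.name, **row}

# Hypothetical input directory with two CSV files and one file the mask ignores.
workdir = Path(tempfile.mkdtemp())
(workdir / "a.csv").write_text("id;name\n1;Alice\n", encoding="utf-8")
(workdir / "b.csv").write_text("id;name\n2;Bob\n", encoding="utf-8")
(workdir / "notes.txt").write_text("ignored\n", encoding="utf-8")

all_rows = list(read_all(workdir))
print(all_rows)
```

In Talend Studio the reader's file path is commonly set to an expression such as `((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))`, where the `tFileList_1` prefix depends on the component's name in your job.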
Summary and Hands-On
✅ Read and write CSV, Excel, JSON
✅ Define and reuse schemas
✅ Debug with tLogRow
✅ Loop files with tFileList
💻 Now let’s try it hands-on!
