Talend Data Integration and Big Data
Module 3
Working with Files
• Understanding supported file formats in Talend
• Reading files using tFileInputDelimited, tFileInputExcel, and tFileInputJSON
• Writing data to files with output components
• Defining and previewing file schema
• Debugging with tLogRow
• Iterating over multiple files using tFileList
Supported File Formats
Format            Use Case                        File Extension
CSV (Delimited)   Tabular, simple export/import   .csv, .tsv
Excel             Structured spreadsheets         .xls, .xlsx
JSON              Hierarchical, API data          .json
Reading CSV – tFileInputDelimited
• Component: tFileInputDelimited
• Reads row-by-row from delimited text files
• Key settings:
  • File Path
  • Field Separator
  • Header row
  • Schema definition
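The settings above can be illustrated in plain Java. This is only a sketch of the general idea (skip the header rows, then split each line on the field separator), not the code Talend generates for tFileInputDelimited. The class name `DelimitedReadSketch`, the method `readRows`, and the sample data are hypothetical, and the sketch ignores quoted fields (text enclosure).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not Talend-generated code.
public class DelimitedReadSketch {

    /**
     * Reads delimited text row by row.
     * fieldSeparator plays the role of the "Field Separator" setting,
     * headerRows the role of the "Header" setting (lines skipped at the top).
     */
    static List<String[]> readRows(Reader source, String fieldSeparator, int headerRows) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (lineNumber <= headerRows) {
                    continue; // header lines carry column names, not data
                }
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                // Pattern.quote treats the separator literally (e.g. "|" or ".");
                // the -1 limit keeps trailing empty fields.
                rows.add(line.split(Pattern.quote(fieldSeparator), -1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // In-memory sample standing in for the file named in "File Path".
        String sample = "id;name;city\n1;Ada;London\n2;Linus;Helsinki\n";
        for (String[] row : readRows(new StringReader(sample), ";", 1)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

In a real job the schema would then assign a name and type to each position in the row; here every field stays a String.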
Reading Excel – tFileInputExcel
• Component: tFileInputExcel
• Allows:
  • Selecting sheet
  • Setting header and data range
  • Mapping schema
• Supports .xls and .xlsx
Reading JSON – tFileInputJSON
• Component: tFileInputJSON
• Reads hierarchical JSON using JSONPath
• Key parts:
  • Loop JsonPath
  • Field Mapping
  • Schema definition
Writing CSV – tFileOutputDelimited
• Component: tFileOutputDelimited
• Writes row-by-row to CSV
• Options:
  • Include header
  • Delimiter settings
  • Overwrite or append mode
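How the three options interact can be sketched in plain Java. This is an illustration under stated assumptions, not Talend's generated code: the class `DelimitedWriteSketch`, the methods `writeRows` and `demo`, and the sample rows are hypothetical. The sketch also makes one design choice of its own: when appending to a file that already has content, it does not write the header again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Talend-generated code.
public class DelimitedWriteSketch {

    /** Writes rows joined by delimiter; header may be null to omit it. */
    static void writeRows(Path target, List<String[]> rows, String[] header,
                          String delimiter, boolean append) {
        try {
            boolean fileHasContent = Files.exists(target) && Files.size(target) > 0;
            List<String> lines = new ArrayList<>();
            // Sketch's own choice: skip the header when appending to existing data.
            if (header != null && !(append && fileHasContent)) {
                lines.add(String.join(delimiter, header));
            }
            for (String[] row : rows) {
                lines.add(String.join(delimiter, row));
            }
            if (append) {
                Files.write(target, lines, StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } else {
                // Overwrite mode: replace whatever the file contained.
                Files.write(target, lines, StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites a temp file with one row, appends a second, returns the file's lines. */
    static List<String> demo() {
        try {
            Path target = Files.createTempFile("customers", ".csv");
            String[] header = {"id", "name"};
            writeRows(target, List.of(new String[][] {{"1", "Ada"}}), header, ";", false);
            writeRows(target, List.of(new String[][] {{"2", "Linus"}}), header, ";", true);
            List<String> written = Files.readAllLines(target, StandardCharsets.UTF_8);
            Files.delete(target);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Running the overwrite call twice instead would leave only the header and the second row, which is the practical difference between the two modes.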
Writing Excel – tFileOutputExcel
• Component: tFileOutputExcel
• Exports data into .xls or .xlsx
• Customizable:
  • Sheet name
  • Append mode
  • Cell formatting
Writing JSON – tFileOutputJSON
• Component: tFileOutputJSON
• Outputs structured JSON
• Requires:
  • Name of data block (the key under which the rows are written)
  • Proper schema
Defining File Schema & Preview
• Use “Edit Schema” to define columns and types
• Use “Preview” to check file structure before running
• Reuse schema via Repository for consistency
Debugging with tLogRow
• Component: tLogRow
• Displays row data in one of three modes:
  • Table
  • Basic
  • Vertical
• Used for testing and validation
Iterating Files – tFileList
• Component: tFileList
• Allows looping over files in a directory
• Common use case:
  • tFileList → Iterate → tFileInputDelimited
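In this pattern the input component's file name is usually set to an expression such as ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")), where tFileList_1 is assumed to be the name of the tFileList component in the job. The plain-Java sketch below imitates the loop: it lists files matching a mask and publishes each path under that key before "processing" it. The class `FileListSketch`, its methods, and the sample file names are hypothetical, and the local `globalMap` is only a stand-in for the one Talend provides.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Talend-generated code.
public class FileListSketch {

    /** Returns regular files in directory whose names match the glob mask, sorted by path. */
    static List<Path> listFiles(Path directory, String fileMask) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, fileMask)) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    matches.add(candidate);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(matches);
        return matches;
    }

    /** Creates sample files, iterates over the CSV ones, returns the names processed. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("incoming");
            Files.writeString(dir.resolve("a.csv"), "id;name\n1;Ada\n");
            Files.writeString(dir.resolve("b.csv"), "id;name\n2;Linus\n");
            Files.writeString(dir.resolve("notes.txt"), "not a data file\n");

            // Stand-in for Talend's globalMap; the key assumes a component named tFileList_1.
            Map<String, Object> globalMap = new HashMap<>();
            List<String> processed = new ArrayList<>();
            for (Path file : listFiles(dir, "*.csv")) {
                globalMap.put("tFileList_1_CURRENT_FILEPATH", file.toString());
                // The downstream reader would take its file name from this expression.
                String currentFile = (String) globalMap.get("tFileList_1_CURRENT_FILEPATH");
                processed.add(Path.of(currentFile).getFileName().toString());
            }
            return processed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(name -> System.out.println("Processing " + name));
    }
}
```

The file mask keeps notes.txt out of the loop, so only the two CSV files reach the reader.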
Summary and Hands-On
✅ Read and write CSV, Excel, JSON
✅ Define and reuse schemas
✅ Debug with tLogRow
✅ Loop files with tFileList
💻 Now let’s try it hands-on!