Course Duration - 7 Weeks | Validity - 1.5 years
CURRICULUM
Databricks Performance Tuning (7 Weeks)
-> Spark Architecture
-> Spark UI in Databricks
-> Understanding why some queries perform faster than others
-> Understanding & tuning data skew
-> Avoid or reduce data shuffle
-> Minimize or avoid data spill
-> Join optimization
-> Dynamic file pruning
-> Caching - Disk Caching vs Spark Cache
-> Performance problems with Serialization
-> Mitigating Serialization issues
-> Best practices for user-defined functions
-> Small file problem
-> Partitioning
-> Liquid clustering
-> Photon acceleration
-> Adaptive query execution
-> Z-ordering
-> Table statistics
-> Predictive optimization
-> Estimating cluster size & choosing the right instance type
THANK YOU