Your central database for all things ETL: advice, suggestions, and best practices
ETL: a definition
ETL refers to the process of extracting data from its source systems, transforming it, and loading it into a new destination, such as a data warehouse. It's a necessary process if you want to optimize your data for analytics. ETL enables better analytics by solving two core problems:
- Analysis in a purpose-built environment: Transactional databases like MySQL and Postgres are excellent at processing transactional workloads. They're great at reading and updating single rows of data with low latency. They're much less suited to large-scale analytical queries that scan huge datasets, so ETL moves the data into an environment optimized for that purpose.
- Cross-domain analysis: By joining data from disparate data sources, business leaders can answer deeper business questions. This demand from business leaders is becoming more urgent as enterprises become more complex, operate at a faster pace, and deploy systems in the cloud.
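To make the three steps concrete, here is a minimal sketch in Python. It uses two in-memory SQLite databases as stand-ins for a transactional source and a data warehouse. The table names (`orders`, `fact_orders`), the columns, and the sample rows are all hypothetical and chosen only for illustration; a real pipeline would likely use a dedicated ETL tool and a real warehouse.

```python
import sqlite3


def extract(source):
    """Extract: read raw order rows from the transactional source."""
    return source.execute(
        "SELECT id, customer_id, amount_cents FROM orders"
    ).fetchall()


def transform(rows):
    """Transform: drop rows with no amount and convert cents to dollars."""
    return [
        (order_id, customer_id, cents / 100.0)
        for order_id, customer_id, cents in rows
        if cents is not None
    ]


def load(warehouse, rows):
    """Load: write the cleaned rows into an analytics-friendly table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount_usd REAL)"
    )
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()


# Hypothetical transactional source with a few sample orders.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, amount_cents INTEGER)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 1250), (2, 11, None), (3, 10, 700)],
)

# Hypothetical warehouse; in practice this would be a separate system.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source)))

# An analytical query now runs against the warehouse, not the source.
totals = warehouse.execute(
    "SELECT customer_id, SUM(amount_usd) FROM fact_orders GROUP BY customer_id"
).fetchall()
print(totals)
```

Keeping each step as its own function mirrors the extract, transform, load split described above, and makes it easy to swap any one stage without touching the others.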
You can navigate to the ETL Process Overview section, which provides an explanation of exactly what's going on in the ETL process, as well as current best practices. Or just click one of the images below for more specific information about each step: