“Data preparation is the process of gathering, combining, structuring, and organizing data so it can be analyzed as part of data visualization, analytics, and machine learning applications. The components of data preparation include pre-processing, profiling, cleansing, validation, and transformation; it often also involves pulling together data from different internal systems and external sources” (TechTarget).

“Data preparation is necessary to manipulate and transform raw data so that the information content enfolded in the data set can be exposed, or made more easily accessible. This is the first step in data analytics projects for data wrangling and can include many discrete tasks such as loading data or data ingestion, data fusion, data cleansing, data augmentation, and data delivery” (Wikipedia).

“The preparation of data dictates the types of analysis that can be performed from the front end of the data analytics solution, and how difficult it will be for end users to answer their business questions. … Effective data modeling and ETL processes could have a major impact on the overall performance of the BI solution” (Sisense).
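As a rough illustration of the components named in the definitions above (profiling, cleansing, validation, and transformation), the following minimal sketch runs a handful of records through each step using only the Python standard library. The records, field names, and rules (for example, treating any string containing "@" as a plausible email) are assumptions made up for this example; real pipelines typically use dedicated tooling and much stricter checks.

```python
from collections import Counter

# Hypothetical raw records; every field name and value here is invented.
raw_rows = [
    {"id": "1", "email": " Ann@Example.com ", "amount": "19.99"},
    {"id": "2", "email": "bob@example.com", "amount": ""},
    {"id": "2", "email": "bob@example.com", "amount": ""},  # exact duplicate
    {"id": "3", "email": "not-an-email", "amount": "5"},
]


def profile(rows):
    """Profiling: count missing (empty) values per field."""
    missing = Counter()
    for row in rows:
        for key, value in row.items():
            if value.strip() == "":
                missing[key] += 1
    return dict(missing)


def cleanse(rows):
    """Cleansing: trim whitespace, lowercase emails, drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        fixed = {k: v.strip() for k, v in row.items()}
        fixed["email"] = fixed["email"].lower()
        key = tuple(sorted(fixed.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(fixed)
    return cleaned


def validate(rows):
    """Validation: keep rows with a plausible email and a non-empty amount."""
    return [r for r in rows if "@" in r["email"] and r["amount"] != ""]


def transform(rows):
    """Transformation: cast string fields to typed values for analysis."""
    return [
        {"id": int(r["id"]), "email": r["email"], "amount": float(r["amount"])}
        for r in rows
    ]


missing_report = profile(raw_rows)
prepared = transform(validate(cleanse(raw_rows)))
```

Keeping each step as a separate function mirrors the way the definitions list them as discrete tasks, and makes it easy to inspect the data between stages.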
