Organizations today struggle to adopt, integrate, and manage the enterprise data that moves through their systems. Data may be the currency that drives business, but without a holistic enterprise data management strategy, businesses are unable to harness its full value.
Business leaders can gain a key competitive advantage from their data ecosystems, but research suggests that most leaders realize they don’t have their data ducks in a row. According to Experian’s 2018 Global Data Benchmark Report, US organizations believe 33% of their customer and prospect data is inaccurate in some way. A UK article reports that nearly 85% of businesses say they’re operating databases with between 10–40% bad records. Poor data quality affects everything from analytics and business intelligence (wrong conclusions drawn) to employee productivity (lost time due to rework and poor communication).
Enterprise data management (EDM) refers to a set of processes, practices, and activities focused on data accuracy, quality, security, availability, and good governance.
Enterprise data takes many forms
Enterprise data is the totality of the digital information flowing through an organization. This includes structured data, such as records in spreadsheets and relational databases, and unstructured data, such as images and video content. Some examples include:
- Operational data, such as customer orders and transaction records, billing and accounting systems, or internal labor statistics
- Network alerts and logs used in managing IT infrastructure, by cybersecurity teams, or by application developers
- Strategic data from customer relationship management (CRM) systems, sales reporting, trend and opportunity analyses, or external sources of market data
- Application-specific data, including GPS data for logistics or transportation companies, sensor data for IoT businesses, weather data for news organizations, or web content for social media applications.
The pillars of enterprise data management
The high-level components of EDM are interrelated, and each presents challenges and opportunities. Their true value lies in encouraging data awareness, a holistic perspective, and a focus on data’s meaning and utility over mere volume and velocity. Let’s look at some of the processes, practices, and activities that make up EDM.
|Data management term||What it is||What it does|
|Data Integration||A process of combining data from several different sources into a unified repository, making it actionable and valuable to those who want to access it.||Data integration can result in:
|Master data management (MDM)||A process to ensure an organization works with, and makes decisions based on, one version of current, "true" data. That single version is often referred to as a "golden record."||MDM reconciles varied, integrated data and makes it consistent in downstream applications and analytics. Data stewards use MDM tools to remove duplicates, aggregate records for reporting, and apply rules laid down during modeling.|
|Data governance||A set of disciplines that underpin the success of an MDM program. Includes stakeholders throughout the enterprise.||Ensures that the right people are assigned the right data responsibilities.|
|Data quality management||Activities focused on discovering and addressing underlying problems in data||Supports the defined data management system. May include:
|Data stewardship||Activities focused on execution and operationalization||Manages the lifecycle of data from curation to retirement. Ensures data is consistent with the data governance plan, is linked with other data assets, and is under control in terms of data quality, compliance, or security. It includes:
|Data warehouse||A repository that stores current and historical data from disparate sources. May be on-premises or cloud-based.||A data warehouse is a key component of a data analytics architecture that serves as a platform for decision support, analytics, business intelligence, and data mining.|
|ETL/ELT||The processes a data pipeline uses to replicate data from a source system into a target system such as a data warehouse. ETL stands for extract, transform, load; ELT loads the data before transforming it.||Extracts data from sources that are not optimized for analytics and moves it to a central host that is optimized for analytics.|
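As a rough illustration of the ETL and MDM rows above, the following Python sketch extracts records from two hypothetical source exports, transforms them into one deduplicated "golden record" per customer, and loads the result into an in-memory SQLite table that stands in for a data warehouse. The sample data, the field names, and the "most recently updated record wins" rule are all assumptions made for this example; real MDM rules are defined during data modeling and are usually more involved.

```python
import csv
import io
import sqlite3

# Hypothetical exports from two source systems (assumed for illustration).
CRM_CSV = """email,name,updated
Ada@Example.com,Ada Lovelace,2018-03-01
bob@example.com,Bob Smith,2018-01-15
"""
BILLING_CSV = """email,name,updated
ada@example.com ,A. Lovelace,2018-05-20
carol@example.com,Carol Jones,2018-02-10
"""


def extract(csv_text):
    """Extract: parse one source's CSV export into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(records):
    """Transform: normalize the email key and keep the newest record per email."""
    golden = {}
    for rec in records:
        key = rec["email"].strip().lower()
        clean = {"email": key, "name": rec["name"].strip(), "updated": rec["updated"]}
        # ISO-formatted dates compare correctly as strings, so the latest update wins.
        if key not in golden or clean["updated"] > golden[key]["updated"]:
            golden[key] = clean
    return sorted(golden.values(), key=lambda r: r["email"])


def load(conn, records):
    """Load: write the reconciled records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(email TEXT PRIMARY KEY, name TEXT, updated TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customers VALUES (:email, :name, :updated)", records
    )
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load, then return the loaded rows."""
    raw = extract(CRM_CSV) + extract(BILLING_CSV)
    load(conn, transform(raw))
    return conn.execute("SELECT email, name FROM customers ORDER BY email").fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    for row in run_pipeline(warehouse):
        print(row)
```

The two "Ada" rows differ in capitalization and whitespace, so they only collapse into one record once the key is normalized. This is the kind of reconciliation that the data integration and MDM steps are meant to handle.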
What’s your EDM strategy?
The flow of data relies on end-to-end management across ingestion, storage, transformation, reporting, and analytics layers. How an organization designs its EDM strategy depends on its particular needs, but as discussed in our data strategy guide, the answers to four key questions must guide any EDM strategy:
- How do we collect the data we need to analyze? Businesses generate massive amounts of data, and selecting the most relevant subset for analytics or business intelligence can be daunting. A modern ETL tool can ingest raw, rich data and send it to a data warehouse at minimal cost.
- How should we consolidate our disparate data sources? A data pipeline refers to the technology and processes an organization employs to extract data from all of the various systems from which it originates and make it ready for analysis. A business should consider its particular needs and choose a data pipeline accordingly.
- What technology should we use to store and analyze our data? A data warehouse is usually the most appropriate and performant solution.
- How should we facilitate data exploration? In a typical data exploration process, an analyst is asked a broad question about the business and needs to come up with theories that can be tested against the data. The analyst may use statistical programming, data visualization, or business intelligence tools to derive real value from the data.
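To make the data exploration step a little more concrete, here is a minimal Python sketch of an analyst testing one theory against stored data. It assumes a hypothetical `orders` table with `channel` and `amount` columns, and it uses an in-memory SQLite database as a stand-in for a data warehouse. The table, the sample values, and the theory itself are invented for the example.

```python
import sqlite3

# Hypothetical order data; in practice this would already sit in the warehouse.
ORDERS = [
    ("email", 120.0),
    ("email", 80.0),
    ("social", 40.0),
    ("social", 60.0),
    ("search", 90.0),
]


def build_demo_warehouse():
    """Create an in-memory database holding the sample orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (channel TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)
    return conn


def average_order_by_channel(conn):
    """Return {channel: average order amount} from the orders table."""
    query = "SELECT channel, AVG(amount) FROM orders GROUP BY channel"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    averages = average_order_by_channel(build_demo_warehouse())
    # Theory under test: email-driven orders are larger than social ones.
    print("email > social:", averages["email"] > averages["social"])
    print(averages)
```

A real analysis would also check sample sizes and statistical significance before acting on a difference like this; the sketch only shows the shape of the question-to-query loop.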
ETL: an important part of the process
Businesses with robust EDM policies, procedures, and tools have a better chance of keeping their data accurate, high-quality, secure, and available. They also have distinct competitive advantages in the form of accurate and timely analytics and business intelligence, increased employee productivity (accurate data means no rework), and new revenue and business opportunities due to reliable insights.
An ETL tool is an important part of the EDM ecosystem that should make the process of moving data from sources to destinations simple.
The Stitch ETL tool can load data to your cloud data warehouse from more than 90 data sources. Stitch provides a secure, easy-to-use data pipeline that’s also a bridge to business intelligence. Sign up for a free trial and get data into your data warehouse in minutes.