Oracle Database empowers businesses to collect vast amounts of data. Oracle’s ecosystem is expansive, but with the right tool, you can quickly and reliably bring your data to any cloud data warehouse, extracting the maximum value from your data with analytics and business intelligence tools.
Oracle Database 101
Oracle Database is a relational database management system (RDBMS) that uses Structured Query Language (SQL). It can be deployed on-premises, in the cloud, or in a hybrid configuration. It’s typically used to store data for online transaction processing (OLTP) systems, and sold in several editions with different feature sets and price points.
Although Oracle Database is a popular data warehouse for companies in all sectors, you’ll want to take a close look at the platform’s cost, lock-in strategy, and licensing requirements before deciding on exactly how (and when) to use Oracle.
In most cases, companies incorporate Oracle into a broader data management strategy that includes other databases and data warehouses. This allows them to take advantage of each platform’s unique features and capabilities, and provides opportunities to make optimal use of their data.
Oracle Database architecture
Oracle Database manages the storage of data, organizes it in a manner for easy retrieval, and provides access to data for multiple users. Let’s look at some of the concepts and components in the Oracle architecture.
Physical vs. logical storage
Oracle database architecture depends upon two types of storage: physical and logical. Physical (disk) storage contains all the files in the database. Logical storage structures, such as tablespaces, segments, extents, and blocks, appear on the disk but are not part of the dataset. Logical storage helps users locate specific data and improves the efficiency of the retrieval process, enabling a modular data storage system in which capacity can be adjusted without affecting performance.
A book serves as a helpful analogy for the difference between physical and logical storage. A book’s contents are printed on paper (physical storage), but additional information in the form of chapters, page numbers, and footnotes (logical storage) helps readers navigate the book and its contents.
Logical storage is the contextual information that appears in the book, but isn’t part of the book’s story. Rather, it’s a guide to help you find specific information or mark your place.
Understanding the difference between physical storage structures and logical storage structures is an excellent foundation for understanding database objects.
Logical storage structures differ from database objects in their degree of visibility. While logical storage structures reside within the database to help organize data, database objects consist of conceptual representations of data.
Examples of database objects include rows, tables, indexes, and views that display data.
Database objects represent the logical view of data stored in files from multiple locations. To achieve this, database objects rely on the logical storage structures, which locate the targeted data.
Database vs. instance
The term “database” is often used to refer to the database and instance as a whole. It’s more precise to use the term “database” when referring specifically to the physical disk or file storage that contains the data and metadata. In contrast, an “instance” is the processes and memory set aside to access the information within the database.
Databases exist on disk, while instances reside in memory and run as processes. Memory structures store data and metadata. Processes help run the database, enable communication between various components, and keep the data between memory and disk in sync.
While memory structures contain information, processes control what happens within the data server. Put another way, processes determine the series of steps that are applied to the data. Oracle Database processes can be organized into four primary categories:
- Client processes — allow client applications to connect to the server and run its code.
- Server processes — bridge the client and the database. In dedicated server mode, a server caters only to one client. In shared server mode, a server can cater to multiple clients simultaneously.
- Background processes — handle administration activities such as I/O. Examples include process monitor (PMON), system monitor (SMON), and log writer (LGWR).
- Slave processes — assist the other processes to achieve their objectives.
Oracle Database stores in memory information that includes program code, sessions, locking details, and program execution requirements, along with data. Oracle organizes this information into four distinct areas:
- System global area (SGA): Every Oracle database instance has one SGA. The data inside SGA is shared across multiple processes and sessions. Examples of data stored here include cached data and shared SQL.
- Program global area (PGA): PGA contains data and control information, exclusive to each process. A process such as database log writer has details specific to its logs in its PGA.
- User global area (UGA): UGA is memory allocated exclusively for a session. As a user session ends, its UGA is deleted. When you connect to a database instance and execute SQL commands and transactions, these happen inside your UGA. In dedicated server mode, UGA is stored inside PGA, whereas in a shared server mode, UGA is stored inside SGA.
- Code areas: These are memory areas used to store and run code.
Moving your Oracle data
That’s a brief look at the fundamental structures of Oracle Database, but it’s only part of the story.
Most companies rely on a transactional database like Oracle to collect, store, and manage data, and a column-oriented data warehouse to analyze their data. They need to replicate data from the transactional platform to the analytical one quickly and reliably. To do that, you have to know about data models and data mapping.
All databases, including data warehouses, support a variety of data types and organizational structures to create data models. Understanding the compatibility between the data models of your source and destination databases is a great starting point for integration.
Part of the process is mapping the data model from the data source to that of the destination. If the source and destination don’t support the same numeric, data, or other data types, you have to convert or transform the source data into a format that the destination supports.
Two options exist for preparing to move data between platforms: traditional ETL and ELT.
Moving Oracle data with traditional ETL
Network congestion can be a consideration for large data integration jobs. Some companies solve this problem by scheduling ETL jobs during non-peak hours, planning their ETL strategy carefully to minimize disruption of normal operations.
A more significant challenge is finding developers with expertise in Oracle, ETL programming, and the data warehouse platform you’re using. And security is another important consideration. Does the data going across your network need to be encrypted? Compliance requirements might dictate encryption; keeping all of your data behind a firewall might mean it’s not necessary.
If the prospect of managing a complex data replication strategy and navigating potential latency, data integrity, and security issues doesn’t excite you, you’re not alone. Many companies today forgo writing their own ETL code to bypass these complexities. In this scenario, the priority becomes extracting data from its source and immediately loading it into the destination warehouse — a tactic called ELT.
ELT is cost-efficient because cloud-native data warehouses such as Google BigQuery, Amazon Redshift, Snowflake, and Microsoft Azure SQL Data Warehouse can transform data after it’s been loaded. The result is a simpler and more flexible data pipeline.
Stitch — the ETL alternative
When you’re ready to use your Oracle data for analytics, you don’t have to budget for time-intensive, expensive ETL development. Stitch streamlines the ETL process — extracting Oracle data with speed and efficiency, and accelerating your time to value.
Designed for self-service with features including row-based pricing and flexible scheduling, Stitch is the ideal ETL tool for extracting Oracle data to a cloud data warehouse — it’s one of 90+ sources we support. Set up a free trial in minutes and start leveraging Oracle Database to its fullest potential today.