Manage your Amazon Redshift data in HDFS with Talend

Amazon Redshift is a petabyte-scale cloud-based data warehouse. Redshift was one of the first cloud data warehouses, and it remains a market leader. It supports structured and semistructured data, and shares the AWS platform with the Amazon S3 data lake. Manage Amazon Redshift data in HDFS with Talend's suite of data integration tools.

Connecting to Amazon Redshift

To connect to Redshift, use the tRedshiftConnection component. Enter your host’s IP address and port (5439 by default), along with the names of the database and schema you want to use. You must also specify a username and password.
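
As a rough illustration of the settings listed above, the sketch below collects them in one place and derives a JDBC-style URL from them. The class name RedshiftSettings, its methods, and the sample host are hypothetical and are not part of Talend; only the default port 5439 comes from the text, and the URL shape follows the usual jdbc:redshift://host:port/database pattern.

```python
from dataclasses import dataclass

DEFAULT_REDSHIFT_PORT = 5439  # default port mentioned in the text


@dataclass(frozen=True)
class RedshiftSettings:
    """Hypothetical container for the fields entered in tRedshiftConnection."""

    host: str
    database: str
    schema: str
    username: str
    password: str
    port: int = DEFAULT_REDSHIFT_PORT

    def validate(self) -> None:
        # Reject empty required fields before any connection attempt.
        for name in ("host", "database", "schema", "username", "password"):
            if not getattr(self, name):
                raise ValueError(f"missing required setting: {name}")
        # TCP ports must fall in the range 1..65535.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"invalid port: {self.port}")

    def jdbc_url(self) -> str:
        # Common Redshift JDBC URL shape: jdbc:redshift://host:port/database
        return f"jdbc:redshift://{self.host}:{self.port}/{self.database}"


# Sample values only; replace them with your own cluster details.
settings = RedshiftSettings(
    host="example-cluster.example.com",
    database="dev",
    schema="public",
    username="analyst",
    password="change-me",
)
settings.validate()
print(settings.jdbc_url())
```

Validating the fields up front is a design choice for the sketch: it surfaces a missing schema or password before a job runs instead of at connection time.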

More about integrating Amazon Redshift data

Talend has detailed documentation on how to ETL your Amazon Redshift data for a better view of the business.

Connecting to HDFS

Apache HDFS (Hadoop Distributed File System) is the storage layer of Apache Hadoop, a software framework for distributed storage and processing of big data. In combination with MapReduce, YARN, and other core modules, HDFS lets organizations build Apache Hadoop clusters of hundreds or thousands of nodes that can handle datasets of terabyte size. A robust ecosystem of other tools can take advantage of data stored in HDFS.

To connect to HDFS, use the Component tab of the tHDFSExist component. Enter the Hadoop distribution and version, the HDFS directory, and the name of the file you want to use.
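
To make the directory and file name settings concrete, here is a minimal sketch, outside Talend, of how the two values combine into a single HDFS path and how a comparable existence check could be run with the Hadoop command-line client. The function names and sample paths are hypothetical. The sketch assumes the hdfs client is installed and configured for your cluster, and it is not a description of how tHDFSExist works internally.

```python
import posixpath
import subprocess
from typing import List


def hdfs_file_path(directory: str, file_name: str) -> str:
    # HDFS paths use forward slashes on every platform, so posixpath is used.
    # Leading slashes are stripped so the file name cannot replace the directory.
    return posixpath.join(directory, file_name.lstrip("/"))


def exists_command(directory: str, file_name: str) -> List[str]:
    # `hdfs dfs -test -e PATH` exits with status 0 when PATH exists.
    return ["hdfs", "dfs", "-test", "-e", hdfs_file_path(directory, file_name)]


def file_exists(directory: str, file_name: str) -> bool:
    # Assumption: the Hadoop client (`hdfs`) is on PATH and points at your cluster.
    result = subprocess.run(exists_command(directory, file_name), check=False)
    return result.returncode == 0


# Sample values only; this prints the command without contacting a cluster.
print(" ".join(exists_command("/user/talend/in", "orders.csv")))
```

Calling file_exists("/user/talend/in", "orders.csv") on a machine with a configured Hadoop client would run the printed command and report whether the file is present.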
