Manage your HDFS data in Snowflake with Talend

Apache HDFS (Hadoop Distributed File System) is the distributed storage layer of the Apache Hadoop framework for big data. In combination with core modules such as MapReduce and YARN, HDFS lets organizations build Hadoop clusters of hundreds or thousands of nodes that can handle terabyte-scale datasets. A robust ecosystem of other tools can take advantage of data stored in HDFS. Manage HDFS data in Snowflake with Talend's suite of data integration tools.

Connecting to HDFS

To connect to HDFS, use the Component tab of the tHDFSExist component. Enter the Hadoop distribution and version, the HDFS directory, and the name of the file you want to check for.
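
Behind the scenes, tHDFSExist boils down to a file-existence check against the cluster. The sketch below shows an equivalent call with the Hadoop Java client (org.apache.hadoop:hadoop-client on the classpath); the NameNode URI, directory, and file name are placeholders standing in for the values you enter in the Component tab.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExistCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder NameNode address; use your cluster's fs.defaultFS value.
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            // HDFS directory and file name, as entered in the Component tab.
            Path target = new Path("/user/talend/input/orders.csv");
            System.out.println(target + " exists: " + fs.exists(target));
        }
    }
}
```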

Learn more about connecting to HDFS

More about integrating HDFS data

Talend has detailed documentation on how to ETL your HDFS data for a better view of the business.

Connecting to Snowflake

Snowflake is a SaaS analytic data warehouse that uses a new SQL database engine with a unique architecture designed for the cloud.

Connect to Snowflake with our native connector. Just type in your login details and fill in a few fields about your Snowflake data. Follow these three steps (a JDBC sketch of the same connection appears after the steps):

  1. In the Project repository, look for the native connection.
  2. Find the Metadata section and locate the Snowflake icon.
  3. Right-click it and select the Create Snowflake menu option.

See the steps >
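
For context on what the wizard configures, here is a minimal sketch of the same connection made directly through Snowflake's JDBC driver (net.snowflake:snowflake-jdbc on the classpath). The account URL, credentials, warehouse, database, and schema are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class SnowflakeConnect {
    public static void main(String[] args) throws Exception {
        // Hypothetical account and credentials; replace with your own login details.
        Properties props = new Properties();
        props.put("user", "TALEND_USER");
        props.put("password", System.getenv("SNOWFLAKE_PASSWORD"));
        props.put("warehouse", "COMPUTE_WH");
        props.put("db", "ANALYTICS");
        props.put("schema", "PUBLIC");

        String url = "jdbc:snowflake://myaccount.snowflakecomputing.com";
        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT CURRENT_VERSION()")) {
            if (rs.next()) {
                System.out.println("Connected to Snowflake " + rs.getString(1));
            }
        }
    }
}
```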

Work with your HDFS data

Compute data with the Hadoop Distributed File System: create a file in a defined directory, move it into and out of HDFS, store it in another local directory, and read it at the end of the Job. See how >
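
Here is a rough Java equivalent of that Job using the Hadoop client's copy helpers; all paths and the NameNode URI are made-up examples.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            // 1. Create a file in a defined local directory.
            java.nio.file.Path local = Paths.get("/tmp/in/sample.txt");
            Files.createDirectories(local.getParent());
            Files.write(local, "hello from the Job\n".getBytes(StandardCharsets.UTF_8));

            // 2. Put it into HDFS.
            Path remote = new Path("/user/talend/sample.txt");
            fs.copyFromLocalFile(new Path(local.toString()), remote);

            // 3. Get it back out into another local directory.
            Files.createDirectories(Paths.get("/tmp/out"));
            fs.copyToLocalFile(remote, new Path("/tmp/out/sample.txt"));

            // 4. Read the HDFS copy at the end of the Job.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(remote), StandardCharsets.UTF_8))) {
                reader.lines().forEach(System.out::println);
            }
        }
    }
}
```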

Get more from your HDFS data

Deliver data your organization can trust. Get started today.

Explore Talend's full suite of apps