To connect to HDFS, use the Component tab of the tHDFSExist component. Enter the Hadoop distribution and version, the HDFS directory, and the name of the file you want to use.
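For readers who want to see what a file-existence check against HDFS looks like outside of Talend Studio, here is a minimal Python sketch. It is not the code Talend generates for tHDFSExist. It assumes that WebHDFS is enabled on the NameNode, that no authentication is required, and that the host name, port, directory, and file name shown are placeholders. The function names `webhdfs_status_url` and `hdfs_file_exists` are hypothetical and chosen only for illustration.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def webhdfs_status_url(host: str, port: int, directory: str, filename: str) -> str:
    """Build a WebHDFS GETFILESTATUS URL for a file inside an HDFS directory."""
    # Join directory and file name into one absolute HDFS path.
    parts = [p for p in (directory.strip("/"), filename.strip("/")) if p]
    path = "/" + "/".join(parts)
    query = urlencode({"op": "GETFILESTATUS"})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def hdfs_file_exists(host: str, port: int, directory: str, filename: str,
                     timeout: float = 10.0) -> bool:
    """Return True if the file exists, False if WebHDFS answers 404."""
    url = webhdfs_status_url(host, port, directory, filename)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # A successful response carries a JSON object with a "FileStatus" key.
            return "FileStatus" in json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # WebHDFS reports a missing path as 404
            return False
        raise  # Any other HTTP error is a real failure, not "missing"


if __name__ == "__main__":
    # Placeholder values: replace with your own NameNode host, port, and path.
    print(webhdfs_status_url("namenode.example.com", 9870, "/user/talend", "in.csv"))
```

The same three inputs appear here as in the component settings: where the cluster is, which directory to look in, and which file to look for. On a secured cluster the request would also need credentials, which this sketch leaves out.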
Learn how to get the most out of your HDFS data with Talend's suite of data integration tools.
Join HDFS and other critical business data in Talend for a holistic view of your organization.
Whether you are on-prem, on-cloud, or somewhere in between, Talend can help you ETL, ELT, clean, govern, transform, and integrate your HDFS data.
Talend Data Fabric lets you integrate HDFS data and ensure that it, along with all your company data, is clean, compliant, and broadly available. With the integration tools Talend Studio and Talend Pipeline Designer, you can construct data pipelines using a drag-and-drop visual interface to extract data from HDFS plus hundreds of other data sources. You can run transformations in the pipeline using hundreds of bundled components. And you can replicate your data to virtually any destination, including cloud data warehouses such as Amazon Redshift, Google BigQuery, Snowflake, Microsoft Azure Synapse Analytics, and Delta Lake on Databricks; on-premises databases such as Oracle, Microsoft SQL Server, MySQL, and others via JDBC; and data warehouse appliances such as SAP HANA.
Talend Data Fabric is the only cloud-native tool that bundles data integration, data integrity, and data governance in a single integrated platform, so you can do more with your HDFS data and ensure its accuracy.
More about integrating HDFS data
Talend has detailed documentation on how to ETL your HDFS data for a better view of the business.
Computing data with Hadoop distributed file system: Create a file in a defined directory, get it into and out of HDFS, store it to another local directory, and read it at the end of the Job.
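The round trip described above (put a local file into HDFS, get it back out into another local directory) can also be sketched with the standard `hdfs dfs` command-line client. The Python snippet below only builds the commands; it does not reproduce the Talend Job itself. The function name `round_trip_commands` and all paths are hypothetical, and the sketch assumes the `hdfs` client is installed and configured on the machine that would run the commands.

```python
import posixpath
from typing import List


def round_trip_commands(local_src: str, hdfs_dir: str, local_dest_dir: str) -> List[List[str]]:
    """Build the `hdfs dfs` commands for a put-then-get round trip."""
    # The file keeps its name inside the HDFS directory.
    hdfs_path = posixpath.join(hdfs_dir, posixpath.basename(local_src))
    return [
        # Create the target directory in HDFS (no error if it already exists).
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Copy the local file into HDFS, overwriting any earlier copy.
        ["hdfs", "dfs", "-put", "-f", local_src, hdfs_path],
        # Copy the file back out of HDFS into a different local directory.
        ["hdfs", "dfs", "-get", hdfs_path, local_dest_dir],
    ]


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    for cmd in round_trip_commands("/tmp/in/data.csv", "/user/talend/demo", "/tmp/out"):
        print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` on a machine with a working Hadoop client. The final step of the scenario, reading the file, is then an ordinary local file read from the destination directory.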
ETL your HDFS data to the destination of your choice.
Deliver data your organization can trust. Get started today.