Manage your FTP data in HDFS with Talend

FTP (File Transfer Protocol) is widely used to transfer files over networks. Most systems support FTP clients and servers, but the protocol's lack of built-in security has led to secure alternatives such as SFTP (SSH File Transfer Protocol) and FTPS (FTP over Transport Layer Security, or TLS). Manage FTP data in HDFS with Talend's suite of data integration tools.

Connecting to FTP

To connect to an FTP server, use the tFTPConnection component. Enter the host, port, username, and password, plus any optional parameters, such as SFTP support.
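
For readers who want to see what those connection parameters amount to outside Talend Studio, here is a minimal sketch using Python's standard ftplib module. It is not what tFTPConnection runs internally. The FtpSettings class, the open_ftp function, and the example host are hypothetical names chosen for illustration. Note that ftplib covers plain FTP and FTPS only; SFTP runs over SSH and would need a separate library.

```python
from dataclasses import dataclass
from ftplib import FTP, FTP_TLS


@dataclass
class FtpSettings:
    """Hypothetical container mirroring the fields entered in the component."""
    host: str
    port: int = 21                 # default FTP control port
    username: str = "anonymous"
    password: str = ""
    use_tls: bool = False          # FTPS (FTP over TLS); SFTP is not covered here


def open_ftp(settings: FtpSettings, timeout: float = 30.0) -> FTP:
    """Open and authenticate a connection described by `settings`."""
    # Pick the TLS-capable client only when FTPS is requested.
    client = FTP_TLS(timeout=timeout) if settings.use_tls else FTP(timeout=timeout)
    client.connect(settings.host, settings.port)
    client.login(settings.username, settings.password)
    if settings.use_tls:
        # Switch the data channel to TLS as well, not only the control channel.
        client.prot_p()
    return client
```

A caller would build the settings once, for example FtpSettings("ftp.example.com", username="etl", password="secret"), pass them to open_ftp, and call quit() on the returned client when finished.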

More about integrating FTP data

Talend has detailed documentation on how to ETL your FTP data for a better view of the business.

Connecting to HDFS

Apache HDFS (Hadoop Distributed File System) provides distributed storage for big data as part of the Apache Hadoop framework. In combination with MapReduce, YARN, and other core modules, HDFS lets organizations build Apache Hadoop clusters of hundreds or thousands of nodes that can handle datasets of terabyte size. A robust ecosystem of other tools can take advantage of data stored in HDFS.

To connect to HDFS, use the Component tab of the tHDFSExist component. Enter the Hadoop distribution and version, the HDFS directory, and the name of the file you want to use.
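
As the component name suggests, the directory and file name are used to look for a file in HDFS. The same kind of check can be sketched outside Talend with Hadoop's WebHDFS REST API and Python's standard library. This is an illustration, not the component's implementation. It assumes WebHDFS is enabled on the NameNode, that the NameNode's HTTP port is 9870 (the Hadoop 3 default), and that no authentication is required. The function names and example host are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request


def webhdfs_status_url(namenode: str, directory: str, filename: str,
                       port: int = 9870) -> str:
    """Build the WebHDFS GETFILESTATUS URL for a file in an HDFS directory."""
    path = f"{directory.rstrip('/')}/{filename}"
    # quote() percent-encodes spaces and other unsafe characters but keeps "/".
    return (f"http://{namenode}:{port}/webhdfs/v1"
            f"{urllib.parse.quote(path)}?op=GETFILESTATUS")


def hdfs_file_exists(namenode: str, directory: str, filename: str,
                     port: int = 9870, timeout: float = 10.0) -> bool:
    """Return True if the file exists, False if WebHDFS answers 404."""
    url = webhdfs_status_url(namenode, directory, filename, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # WebHDFS reports a missing path as HTTP 404.
            return False
        raise  # permission problems and server errors are left to the caller
```

Keeping the URL construction in its own function makes it possible to check the request that would be sent without a running cluster.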

Get more from your FTP data

Deliver data your organization can trust... Get started today.