Sqoop
Sqoop lets you easily import data from relational databases into HDFS.
We have already deployed the Sqoop connectors for the following databases:
MySQL / MariaDB
PostgreSQL
Microsoft SQL Server
Oracle 18c
You can therefore use Sqoop out of the box to import data from any of these databases. For example, to import a table from PostgreSQL:
sqoop import \
--username ${USER} --password ${PASSWORD} \
--connect jdbc:postgresql://${SERVER}/${DB} \
--table mytable \
--target-dir /user/username/mytable \
--num-mappers 1
Note
We recommend that you use a single mapper process (--num-mappers 1) to avoid overloading your database.
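The command above targets PostgreSQL. As a rough sketch of how the same import might look for the other supported databases, the helper below builds a connect string in the form commonly used by each JDBC driver. The names jdbc_url and import_table are hypothetical and not part of Sqoop. The exact URL format depends on the driver deployed, and the MariaDB entry assumes the MySQL-style URL is accepted, so check these against your setup. The wrapper also uses Sqoop's -P option, which prompts for the password on the console instead of passing it on the command line.

```shell
# Hypothetical helper (not part of Sqoop): print a JDBC connect string
# for one of the supported database types.
# Assumption: these are the common URL forms for each driver; the port is
# omitted, so the driver's default port applies.
jdbc_url() {
  local kind="$1" server="$2" db="$3"
  case "$kind" in
    mysql|mariadb) echo "jdbc:mysql://${server}/${db}" ;;
    postgresql)    echo "jdbc:postgresql://${server}/${db}" ;;
    sqlserver)     echo "jdbc:sqlserver://${server};databaseName=${db}" ;;
    # For Oracle, the third argument is treated as the service name.
    oracle)        echo "jdbc:oracle:thin:@//${server}/${db}" ;;
    *)             echo "unsupported database type: ${kind}" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: import one table with a single mapper.
# -P makes Sqoop prompt for the password on the console.
import_table() {
  local kind="$1" server="$2" db="$3" table="$4" target_dir="$5"
  local url
  url="$(jdbc_url "$kind" "$server" "$db")" || return 1
  sqoop import \
    --username "${USER}" -P \
    --connect "$url" \
    --table "$table" \
    --target-dir "$target_dir" \
    --num-mappers 1
}
```

For instance, import_table mysql "${SERVER}" "${DB}" mytable /user/username/mytable would prompt for the password and then run the import with one mapper.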
If you need to import data from a different database, don’t hesitate to contact us.
For further information on how to use Sqoop, see the Sqoop Tutorial that we have prepared to get you started and the Sqoop Guide in the CDH documentation.