且构网

Sharing the ins and outs of software development...

Sqoop - Can I bulk-import multiple MySQL tables into a single HBase/Hive table?

Updated: 2023-01-21 22:53:01

This is definitely possible if the tables are related in some way. Sqoop supports free-form query imports, and in this case the free-form query would be a join across the tables. For example, when importing into Hive:

sqoop import --connect jdbc:mysql:///mydb --username hue --password hue --query "SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS" --split-by users.id --target-dir "/tmp/hue" --hive-import --hive-table hive-table

Similarly, for HBase:

sqoop import --connect jdbc:mysql:///mydb --username hue --password hue --query "SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS" --split-by users.id --hbase-table hue --column-family c1
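
As a rough illustration of what --split-by and the $CONDITIONS placeholder do together: Sqoop replaces the placeholder with a different range predicate on the split column for each map task, so the tasks read disjoint slices of the join result. The sketch below only approximates that behavior in plain shell. The split column (users.id), the ID bounds (1 to 1000), the even split arithmetic, the shortened query, and the split_condition helper are all assumptions made for illustration, not Sqoop's actual splitter. Four tasks matches Sqoop's default mapper count.

```shell
#!/bin/sh
# Illustration only: approximates how $CONDITIONS is filled in per map task.
# Shortened query; single quotes keep $CONDITIONS literal.
QUERY='SELECT * FROM users JOIN customers ON users.id=customers.user_id WHERE $CONDITIONS'

MIN_ID=1      # assumed smallest users.id
MAX_ID=1000   # assumed largest users.id
MAPPERS=4     # Sqoop's default number of map tasks

# Hypothetical helper: prints the range predicate for one task.
# $1 is the zero-based task index.
split_condition() {
  step=$(( (MAX_ID - MIN_ID + 1) / MAPPERS ))
  lo=$(( MIN_ID + $1 * step ))
  if [ "$1" -eq $(( MAPPERS - 1 )) ]; then
    # The last task includes the upper bound.
    echo "users.id >= $lo AND users.id <= $MAX_ID"
  else
    echo "users.id >= $lo AND users.id < $(( lo + step ))"
  fi
}

# Strip the trailing placeholder so each task's predicate can be appended.
QUERY_PREFIX=${QUERY%\$CONDITIONS}

i=0
while [ "$i" -lt "$MAPPERS" ]; do
  printf '%s\n' "${QUERY_PREFIX}$(split_condition "$i")"
  i=$(( i + 1 ))
done
```

Each printed line is the kind of query a single map task would run. This is also why the split column should be one that actually appears in the joined tables and is reasonably evenly distributed.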

The key ingredient in all of this is the SQL statement passed to --query:

SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS
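
The backslash in \$CONDITIONS is shell quoting, not SQL: the commands above wrap the query in double quotes, where the shell would otherwise expand $CONDITIONS as a variable before Sqoop ever sees it. A minimal sketch of the difference, using a shortened query and assuming no shell variable named CONDITIONS is set:

```shell
#!/bin/sh
# Make sure the variable is unset, as it normally would be.
unset CONDITIONS

# Double quotes, no backslash: the shell expands $CONDITIONS to an empty string.
unescaped="SELECT * FROM users WHERE $CONDITIONS"

# Double quotes with a backslash: the literal text $CONDITIONS survives.
escaped="SELECT * FROM users WHERE \$CONDITIONS"

# Single quotes: nothing is expanded, so no backslash is needed.
single='SELECT * FROM users WHERE $CONDITIONS'

printf '%s\n' "$unescaped" "$escaped" "$single"
```

Only the second and third forms hand Sqoop the placeholder it requires in a free-form query.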

For more information on free-form queries, see the Sqoop User Guide: http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_free_form_query_imports