SQOOP COMMANDS
Password File
Step 1: Create the password file
echo -n "cloudera" > /home/cloudera/mysql/pass.txt
Step 2: Copy the file to HDFS
hdfs dfs -put ./mysql/pass.txt ./pass.txt
Step 3: Use the --password-file option
sqoop-import --connect jdbc:mysql://localhost/training --table emp_addr
--target-dir /user/cloudera/emp_addr --username root --password-file
/user/cloudera/pass.txt -m 1;
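Sqoop's documentation recommends restricting the password file so only the owner can read it; a quick way to do that on the HDFS copy:
hdfs dfs -chmod 400 /user/cloudera/pass.txt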
Import
-------Primary Key
1. Without mapper configuration (the default number of mappers is 4)
sqoop import --connect jdbc:mysql://localhost/test --table student_gpa --target-dir
/user/cloudera/sqoop_import/student_gpa --username root -P;
2. With mapper configuration (specify the number of mappers with the -m argument)
sqoop import --connect jdbc:mysql://localhost/test --table student_gpa --target-dir
/user/cloudera/sqoop_import/student_gpa --username root -P -m 2;
--------No Primary Key
3. With one mapper
sqoop import --connect jdbc:mysql://localhost/test --table practice --target-dir
/user/cloudera/sqoop_import/student_gpa --username root -P -m 1;
4. With split by option
sqoop import --connect jdbc:mysql://localhost/test --table practice --split-by id
--target-dir /user/cloudera/sqoop_import/student_gpa --username root -P;
------------------------------------
5. With where condition
sqoop-import --connect jdbc:mysql://localhost/training --table emp --target-dir
/user/cloudera/emp --username root --password cloudera --where "salary > 30000" -m 1;
6. Importing only selected columns from the table
sqoop-import --connect jdbc:mysql://localhost/training --table emp --target-dir
'/user/cloudera/emp_col' --username root --password-file '/user/cloudera/pass'
--columns "id,name,salary"
7. Free-form query
For a free-form query the following are mandatory: a) the token 'WHERE $CONDITIONS'
in the query, b) --target-dir, and c) either --split-by or -m 1.
sqoop-import --connect jdbc:mysql://localhost/training --query 'select a.id,
a.name, b.city, a.salary from emp a join emp_addr b on (a.id = b.id) where
$CONDITIONS' --target-dir '/user/cloudera/emp_join' --username root --password-file
'/user/cloudera/pass' --split-by a.id;
or
sqoop-import --connect jdbc:mysql://localhost/training --query 'select a.id,
a.name, b.city, a.salary from emp a join emp_addr b on (a.id = b.id) where
$CONDITIONS' --target-dir '/user/cloudera/emp_join' --username root --password-file
'/user/cloudera/pass' -m 1;
Using a WHERE condition in the query
sqoop-import --connect jdbc:mysql://localhost/training --query 'select id, name,
salary from emp where salary > 20000 AND $CONDITIONS' --target-dir
'/user/cloudera/emp_join' --username root --password-file '/user/cloudera/pass' -m
1;
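Note: the queries above are wrapped in single quotes. If the query is wrapped in double quotes instead, $CONDITIONS must be escaped as \$CONDITIONS so the shell does not expand it; the same import as above with only the quoting changed:
sqoop-import --connect jdbc:mysql://localhost/training --query "select id, name,
salary from emp where salary > 20000 AND \$CONDITIONS" --target-dir
'/user/cloudera/emp_join' --username root --password-file '/user/cloudera/pass' -m
1;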
8. Incremental Import
Step 1: Do a normal sqoop import to load the data into HDFS
Step 2: Choose one of the two options below based on the requirement
Incremental append (appends newly created records; updated records are appended
again, so duplicates are created for them)
sqoop-import --connect jdbc:mysql://localhost/training --table emp --username
root --password cloudera --incremental append --check-column id --last-value 1202
-m 1;
Incremental lastmodified (appends newly created records and updates the records
whose values have changed)
sqoop-import --connect jdbc:mysql://localhost/training --table sales
--username root --password-file /user/cloudera/pass --target-dir
/user/cloudera/sales_1/ --incremental lastmodified --check-column lastmodified
--last-value '2017-11-12 20:33:34' --merge-key sid
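Instead of tracking --last-value by hand, an incremental import can be saved as a sqoop job; Sqoop then records the last imported value in its metastore after every run. A sketch based on the append example above (the job name emp_incr is only illustrative; --password-file is used so the job can run without prompting):
sqoop job --create emp_incr -- import --connect jdbc:mysql://localhost/training
--table emp --username root --password-file /user/cloudera/pass --incremental append
--check-column id --last-value 1202 -m 1;
sqoop job --exec emp_incr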
HIVE Import:
sqoop-import --connect jdbc:mysql://localhost/poc --table aadhar_state_report
--username root --password-file '/user/cloudera/pass' --target-dir
'/user/cloudera/poc/state_report' --hive-import --create-hive-table --hive-table
poc.state_report -m 1
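To verify the import, the new table can be queried from Hive (table name as above):
hive -e 'select * from poc.state_report limit 5;'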
HBASE Data Load
Step 1: Create the table in the HBase shell
create 'student1', 'stud_details'
Step 2: Sqoop command
Load data into an existing table
sqoop-import --connect jdbc:mysql://localhost/test --table student --username
root --password cloudera --hbase-table student1 --column-family stud_details
--hbase-row-key id
Load data by creating a new table in HBase
sqoop-import --connect jdbc:mysql://localhost/test --table student --username
root --password cloudera --hbase-create-table --hbase-table student1
--column-family stud_details --hbase-row-key id
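To confirm the rows landed in HBase, scan the table from the HBase shell (LIMIT is optional):
scan 'student1', {LIMIT => 5}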
-----------------------------------------------------------------------------------
9. Import all tables from RDBMS into HDFS as Avro format
sqoop import-all-tables --connect jdbc:mysql://localhost/world --username root -P
--as-avrodatafile --warehouse-dir /user/hive/warehouse/world.db -m 10;
The tables are imported into the Hive warehouse directory given by --warehouse-dir.
Create a database with the same name in Hive.
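For example, for the 'world' database imported above, from the Hive shell:
CREATE DATABASE IF NOT EXISTS world;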
9.1 Create table
Once the import in Avro format is completed, the schema of each table is stored as an
.avsc file on the local system; you can find the files under /home/cloudera/.
Move all of the files into HDFS using the put command (a sketch follows). Let's say the
files are loaded into a directory called sqoop_import.
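A sketch of that put step, assuming the .avsc files sit directly under /home/cloudera/:
hdfs dfs -mkdir -p /user/cloudera/sqoop_import
hdfs dfs -put /home/cloudera/*.avsc /user/cloudera/sqoop_import/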
Then log in to Hive and run the command below to create the table.
CREATE EXTERNAL TABLE categories
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS AVRO
LOCATION '/user/hive/warehouse/retail.db/categories'
TBLPROPERTIES ('avro.schema.url' = '/user/cloudera/sqoop_import/categories.avsc');
Repeat the above for all of the tables imported using sqoop import-all-tables.
-----------------------------------------------------------------------------------
Sqoop Export
sqoop-export --connect jdbc:mysql://localhost/training --table emp --export-dir
/user/cloudera/emp --username root -P -m 1;
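Note that the target table must already exist in MySQL; sqoop-export does not create it. A minimal sketch of such a table (the column names and types here are only an assumption for illustration):
CREATE TABLE emp (id INT, name VARCHAR(50), designation VARCHAR(50), salary INT, dept VARCHAR(20));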
Sqoop Job
sqoop job --create myjob -- import --connect jdbc:mysql://localhost/training
--table emp --target-dir /user/cloudera/sqoop-import/ --username root --password
cloudera -m 1;
sqoop job --list
sqoop job --show myjob
sqoop job --exec myjob
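A saved job can be removed once it is no longer needed:
sqoop job --delete myjob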
-----------------------------------------------------------------------------------
Sqoop Eval
Select data from a MySQL table
sqoop eval --connect jdbc:mysql://localhost/training --username root -P --query
"select * from emp"
Insert data into a MySQL table
sqoop eval --connect jdbc:mysql://localhost/training --username root -P --query
"insert into emp values(1209,'Akil', 'CEO', 1000000,'CM')"
sqoop eval --connect jdbc:mysql://localhost/training --username root -P -e "insert
into emp values(1207,'Akil', 'CEO', 1000000,'CM')"
List Databases and Tables
sqoop list-databases --connect jdbc:mysql://localhost/ --username root -P
sqoop list-tables --connect jdbc:mysql://localhost/training --username root -P