Using Advanced Functions in Hive
You need to download the “Student.dat” dataset given below the video.
Creating Table from the data
1. To create the table from the data, use the query below:
CREATE TABLE IF NOT EXISTS students (
name STRING,
id INT,
subjects ARRAY<STRING>,
feeDetails MAP<STRING, FLOAT>,
phoneNumber STRUCT<areacode:INT, number:INT> )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '#'
MAP KEYS TERMINATED BY '|'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
2. Load the data into the table (if the file is stored on HDFS). Remember that this moves the file from its HDFS location into the table's storage directory:
LOAD DATA INPATH 'add path to your file here' OVERWRITE INTO TABLE students;
Note: If the file is located in a local directory on your VM rather than on HDFS, use the query below instead:
LOAD DATA LOCAL INPATH 'add path to your file here' OVERWRITE INTO TABLE students;
3. To verify that the data has been loaded correctly, use the query below:
Select * FROM students;
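Once the data is loaded, individual elements of the complex columns can be read directly. The queries below are a minimal sketch against the students table defined above. The map key 'tuition' is a hypothetical placeholder, since the actual keys in Student.dat are not listed here; replace it with a key that appears in your data (a key that does not exist returns NULL).

```sql
-- Read individual elements of the ARRAY, MAP and STRUCT columns.
SELECT
  name,
  subjects[0]           AS first_subject,  -- ARRAY: zero-based index
  size(subjects)        AS subject_count,  -- number of elements in the array
  feeDetails['tuition'] AS tuition_fee,    -- MAP: lookup by key ('tuition' is a hypothetical key)
  phoneNumber.areacode  AS area_code       -- STRUCT: dot notation
FROM students;
```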
Using Advanced Functions
1. Explode()
Select explode(feedetails) FROM students;
Select explode(subjects) FROM students;
Select explode(feedetails) FROM students WHERE name="Alexa";
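When explode() is used on its own in the SELECT list, Hive does not allow other columns next to it. A common way to keep the student's name beside each exploded value is LATERAL VIEW. The following is a minimal sketch against the students table above; the aliases s, f, subject, fee_type and amount are illustrative names, not part of the dataset.

```sql
-- One output row per (student, subject) pair.
SELECT name, subject
FROM students
LATERAL VIEW explode(subjects) s AS subject;

-- Exploding a MAP yields two columns: the key and the value.
SELECT name, fee_type, amount
FROM students
LATERAL VIEW explode(feeDetails) f AS fee_type, amount;
```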
2. Upper()
Select upper(name) from students;
3. Regexp_Replace()
Select regexp_replace(concat(upper(name),id),' ','') as username from students;
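To see what each function in the username query contributes, the same expression can be run on a literal value. This sketch assumes a Hive version that allows SELECT without a FROM clause (0.13 or later). The name 'John Doe' and the id 101 are made-up values, not rows from Student.dat.

```sql
-- Step 1: upper() converts the name to upper case        -> 'JOHN DOE'
SELECT upper('John Doe');

-- Step 2: concat() appends the id to the name            -> 'JOHN DOE101'
SELECT concat(upper('John Doe'), cast(101 AS STRING));

-- Step 3: regexp_replace() removes every space character -> 'JOHNDOE101'
SELECT regexp_replace(concat(upper('John Doe'), cast(101 AS STRING)), ' ', '');
```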
Note: Be careful when copying queries from documents into Hue or the CLI, especially queries that contain quotation marks. Sometimes the quotes are not copied correctly and the query fails with an error. Retyping the quotes directly in Hue or the CLI should solve the problem.
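As an illustration of the quoting problem, the sketch below contrasts a query pasted with curly quotes (shown only as a comment) with the same query after the quotes have been retyped as straight single quotes.

```sql
-- Curly quotes, as often produced by word processors and PDFs; Hive cannot parse these:
--   SELECT explode(feedetails) FROM students WHERE name=“Alexa”;

-- Straight quotes, retyped in Hue or the CLI:
SELECT explode(feedetails) FROM students WHERE name = 'Alexa';
```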