Hive 1


Employee.csv
id, name, city, department, salary, domain
1,Sachin,Pune,Product Engineering,100000,Big Data
2,Gaurav,Bangalore,Sales,90000,CRM
3,Manish,Chennai,Recruiter,125000,HR
4,Bhushan,Hyderabad,Developer,50000,BFSI
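Note: the first line of Employee.csv is a header row. Hive's LOAD DATA does no parsing or validation, so the header is loaded as an ordinary data row; this shows up later as NULLs in the int columns.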

[cloudera@quickstart ~]$ sudo service zookeeper-server staus
Usage: /etc/init.d/zookeeper-server {start|stop|restart|force-reload|status|force-stop|condrestart|try-restart|init}
[cloudera@quickstart ~]$ sudo service zookeeper-server status
zookeeper-server is running
[cloudera@quickstart ~]$ service hive-server2 status
Hive Server2 is running [ OK ]
[cloudera@quickstart ~]$ service hive-server2 start
Error: root user required
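Starting or stopping hive-server2 requires root, so the start attempt above fails, while checking status works without it. Prefixing the command with sudo (as was done for zookeeper-server) resolves this:

sudo service hive-server2 start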

[cloudera@quickstart ~]$ sudo service zookeeper-server status
zookeeper-server is running
[cloudera@quickstart ~]$ sudo service zookeeper-server start
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... already running as process 5133.
[cloudera@quickstart ~]$ hdfs dfs -mkdir /InputHive
[cloudera@quickstart ~]$ hdfs dfs –put Desktop/emloyee.txt InputHive
–put: Unknown command
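The dash in –put above is an en dash (likely a copy-paste artifact) rather than a plain hyphen, which is why hdfs reports an unknown command. Retyped with a plain hyphen, the command gets past that error, though the next attempts still fail on the path: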
[cloudera@quickstart ~]$ hdfs dfs -put Desktop/emloyee.txt InputHive
put: `Desktop/emloyee.txt': No such file or directory
[cloudera@quickstart ~]$ hdfs dfs -put /Desktop/emloyee.txt InputHive
put: `/Desktop/emloyee.txt': No such file or directory
[cloudera@quickstart ~]$ hdfs dfs -put /Desktop/Emloyee.txt InputHive
put: `/Desktop/Emloyee.txt': No such file or directory
[cloudera@quickstart ~]$ hdfs dfs -put /Desktop/Employee.txt InputHive
put: `/Desktop/Employee.txt': No such file or directory
[cloudera@quickstart ~]$ hdfs dfs -put /Desktop/Employee.csv InputHive
put: `/Desktop/Employee.csv': No such file or directory
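The failed attempts above combine two problems: the filename is misspelled (emloyee/Emloyee, plus a .txt extension the file does not have), and a leading slash makes the path absolute on the local filesystem, where /Desktop does not exist. The file lives at /home/cloudera/Desktop/Employee.csv, so the relative path from the home directory works: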
[cloudera@quickstart ~]$ hdfs dfs -put Desktop/Employee.csv InputHive
[cloudera@quickstart ~]$ hdfs dfs -ls /InputHive
[cloudera@quickstart ~]$ hdfs dfs -ls InputHive/
-rw-r--r-- 1 cloudera cloudera 206 2021-10-17 23:10 InputHive
[cloudera@quickstart ~]$ hdfs dfs -put Desktop/Employee.csv /InputHive/
[cloudera@quickstart ~]$ hdfs dfs -ls InputHive/
-rw-r--r-- 1 cloudera cloudera 206 2021-10-17 23:10 InputHive
[cloudera@quickstart ~]$ hdfs dfs -ls /InputHive
Found 1 items
-rw-r--r-- 1 cloudera supergroup 206 2021-10-17 23:11
/InputHive/Employee.csv
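A relative HDFS path such as InputHive resolves against the user's HDFS home directory (/user/cloudera), so the first successful -put actually created a file named InputHive there; that is what the two ls InputHive/ listings show. The directory created by -mkdir is the absolute path /InputHive, and only the -put with the /InputHive/ target placed Employee.csv inside it.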
[cloudera@quickstart ~]$ ^C
[cloudera@quickstart ~]$ sudo hive

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> create database Org;
OK
Time taken: 1.799 seconds
hive> use Org;
OK
Time taken: 0.077 seconds
hive> create table employee(
> id int,
> name string,
> city string,
> department string,
> salary int,
> domain string)
> row format delimited
> fields terminated by ',';
OK
Time taken: 0.459 seconds
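Because Employee.csv carries a header line, a variant worth knowing (assuming Hive 0.13 or later, which supports the skip.header.line.count table property) declares the table so the header is skipped at read time, avoiding the NULL header row seen in the queries below:

create table employee(
id int,
name string,
city string,
department string,
salary int,
domain string)
row format delimited
fields terminated by ','
tblproperties ("skip.header.line.count"="1");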
hive> show databases;
OK
default
org
Time taken: 0.12 seconds, Fetched: 2 row(s)
hive> show tables;
OK
employee
Time taken: 0.059 seconds, Fetched: 1 row(s)
hive> describe employee;
OK
id int
name string
city string
department string
salary int
domain string
Time taken: 0.157 seconds, Fetched: 6 row(s)
hive> describe formatted employee;
OK
# col_name data_type comment

id int
name string
city string
department string
salary int
domain string

# Detailed Table Information

Database: org
Owner: root
CreateTime: Sun Oct 17 23:17:07 PDT 2021
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://quickstart.cloudera:8020/user/hive/warehouse/org.db/employee
Table Type: MANAGED_TABLE
Table Parameters:
transient_lastDdlTime 1634537827

# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim ,
serialization.format ,
Time taken: 0.116 seconds, Fetched: 32 row(s)
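Table Type: MANAGED_TABLE means Hive owns the directory shown under Location, and dropping the table deletes the data with it. To query data left in place, for example the copy already uploaded to /InputHive, an external table is the usual alternative; a minimal sketch (employee_ext is an illustrative name, not part of this session):

create external table employee_ext(
id int,
name string,
city string,
department string,
salary int,
domain string)
row format delimited
fields terminated by ','
location '/InputHive';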
hive> LOAD DATA LOCAL INPATH 'Deskstop/Employee.csv' INTO TABLE employee;
FAILED: SemanticException Line 1:23 Invalid path ''Deskstop/Employee.csv'': No
files matching path file:/home/cloudera/Deskstop/Employee.csv
hive> LOAD DATA LOCAL INPATH '/home/cloudera/Deskstop/Employee.csv' INTO TABLE
employee;
FAILED: SemanticException Line 1:23 Invalid path
''/home/cloudera/Deskstop/Employee.csv'': No files matching path
file:/home/cloudera/Deskstop/Employee.csv
hive> LOAD DATA LOCAL INPATH 'Desktop/Employee.csv' INTO TABLE employee;
Loading data to table org.employee
Table org.employee stats: [numFiles=1, totalSize=206]
OK
Time taken: 0.818 seconds
hive> select * from employee;
OK
NULL name city department NULL domain
1 Sachin Pune Product Engineering 100000 Big Data
2 Gaurav Bangalore Sales 90000 CRM
3 Manish Chennai Recruiter 125000 HR
4 Bhushan Hyderabad Developer 50000 BFSI
Time taken: 0.386 seconds, Fetched: 5 row(s)
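The first row is the CSV header: name, city, department, and domain are string columns, so the header words load as-is, while id and salary cannot be parsed as int and become NULL. This also explains the aggregates below: count(*) returns 5 rather than 4 because the header row is counted, while min, max, sum, and avg are unaffected since those functions ignore NULL salaries (avg = 365000 / 4 = 91250).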
hive> select count(*) from employee;
Query ID = root_20211017234343_5e82489c-ff51-48cb-a4d6-42ffecd25809
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1634536620613_0001, Tracking URL =
http://quickstart.cloudera:8088/proxy/application_1634536620613_0001/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1634536620613_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-10-17 23:43:59,079 Stage-1 map = 0%, reduce = 0%
2021-10-17 23:44:08,148 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.26 sec
2021-10-17 23:44:17,922 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.69 sec
MapReduce Total cumulative CPU time: 2 seconds 690 msec
Ended Job = job_1634536620613_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.69 sec HDFS Read: 7242 HDFS
Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 690 msec
OK
5
Time taken: 33.029 seconds, Fetched: 1 row(s)
hive> select min(salary) from employee;
Query ID = root_20211017234444_f8d6a967-5715-45a8-9dcb-81545520ed3a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1634536620613_0002, Tracking URL =
http://quickstart.cloudera:8088/proxy/application_1634536620613_0002/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1634536620613_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-10-17 23:44:36,066 Stage-1 map = 0%, reduce = 0%
2021-10-17 23:44:43,851 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2021-10-17 23:44:53,617 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.59 sec
MapReduce Total cumulative CPU time: 2 seconds 590 msec
Ended Job = job_1634536620613_0002
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.59 sec HDFS Read: 7378 HDFS
Write: 6 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 590 msec
OK
50000
Time taken: 28.139 seconds, Fetched: 1 row(s)
hive> select max(salary) from employee;
Query ID = root_20211017234545_28a4c597-e014-4804-a4e6-a24d45a7331e
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1634536620613_0003, Tracking URL =
http://quickstart.cloudera:8088/proxy/application_1634536620613_0003/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1634536620613_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-10-17 23:45:12,563 Stage-1 map = 0%, reduce = 0%
2021-10-17 23:45:20,300 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.15 sec
2021-10-17 23:45:28,940 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.62 sec
MapReduce Total cumulative CPU time: 2 seconds 620 msec
Ended Job = job_1634536620613_0003
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.62 sec HDFS Read: 7378 HDFS
Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 620 msec
OK
125000
Time taken: 26.704 seconds, Fetched: 1 row(s)
hive> select sum(salary) from employee;
Query ID = root_20211017234545_8098d1bc-afb5-4c69-8458-65a4e599cb54
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1634536620613_0004, Tracking URL =
http://quickstart.cloudera:8088/proxy/application_1634536620613_0004/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1634536620613_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-10-17 23:45:45,885 Stage-1 map = 0%, reduce = 0%
2021-10-17 23:45:53,711 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.16 sec
2021-10-17 23:46:03,443 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.63 sec
MapReduce Total cumulative CPU time: 2 seconds 630 msec
Ended Job = job_1634536620613_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.63 sec HDFS Read: 7368 HDFS
Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 630 msec
OK
365000
Time taken: 26.568 seconds, Fetched: 1 row(s)
hive> select avg(salary) from employee;
Query ID = root_20211017234646_a9cb4c5e-6315-4ae6-852a-7fec1d36ba90
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1634536620613_0005, Tracking URL =
http://quickstart.cloudera:8088/proxy/application_1634536620613_0005/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1634536620613_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-10-17 23:46:32,079 Stage-1 map = 0%, reduce = 0%
2021-10-17 23:46:41,698 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.21 sec
2021-10-17 23:46:51,533 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.8 sec
MapReduce Total cumulative CPU time: 2 seconds 800 msec
Ended Job = job_1634536620613_0005
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.8 sec HDFS Read: 7582 HDFS
Write: 8 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 800 msec
OK
91250.0
Time taken: 29.22 seconds, Fetched: 1 row(s)
hive> Select * from employee limit 2;
OK
NULL name city department NULL domain
1 Sachin Pune Product Engineering 100000 Big Data
Time taken: 0.078 seconds, Fetched: 2 row(s)
hive> LOAD DATA LOCAL INPATH 'Desktop/Employee.csv' INTO TABLE employee;
Loading data to table org.employee
Table org.employee stats: [numFiles=2, totalSize=369]
OK
Time taken: 0.302 seconds
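LOAD DATA ... INTO TABLE appends: the stats now report numFiles=2, and every row appears twice in the full select below. To replace the table contents instead, Hive supports the OVERWRITE keyword:

LOAD DATA LOCAL INPATH 'Desktop/Employee.csv' OVERWRITE INTO TABLE employee;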
hive> Select * from employee limit 2;
OK
NULL name city department NULL domain
1 Sachin Pune Product Engineering 100000 Big Data
Time taken: 0.065 seconds, Fetched: 2 row(s)
hive> Select * from employee ;
OK
NULL name city department NULL domain
1 Sachin Pune Product Engineering 100000 Big Data
2 Gaurav Bangalore Sales 90000 CRM
3 Manish Chennai Recruiter 125000 HR
4 Bhushan Hyderabad Developer 50000 BFSI
1 Sachin Pune Product Engineering 100000 Big Data
2 Gaurav Bangalore Sales 90000 CRM
3 Manish Chennai Recruiter 125000 HR
4 Bhushan Hyderabad Developer 50000 BFSI
Time taken: 0.072 seconds, Fetched: 9 row(s)
hive> create table employee1 like employee;
OK
Time taken: 0.143 seconds
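CREATE TABLE ... LIKE copies only the schema and storage layout of employee; the new table starts out empty, as the next select confirms.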
hive> Select * from employee1 ;
OK
Time taken: 0.064 seconds
hive> LOAD DATA LOCAL INPATH 'Desktop/Employee.csv' INTO TABLE employee1;
Loading data to table org.employee1
Table org.employee1 stats: [numFiles=1, totalSize=163]
OK
Time taken: 0.257 seconds
hive> Select * from employee1 ;
OK
1 Sachin Pune Product Engineering 100000 Big Data
2 Gaurav Bangalore Sales 90000 CRM
3 Manish Chennai Recruiter 125000 HR
4 Bhushan Hyderabad Developer 50000 BFSI
Time taken: 0.076 seconds, Fetched: 4 row(s)
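Unlike employee, employee1 shows no NULL header row, and its loaded file is smaller (totalSize=163 versus 206), so the copy of Employee.csv loaded here apparently had the header line removed beforehand.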
hive> Create table emp as select id,name,city from employee1;
Query ID = root_20211018001414_da7dfa9c-f654-4caa-9489-87be58f60f5f
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1634536620613_0006, Tracking URL =
http://quickstart.cloudera:8088/proxy/application_1634536620613_0006/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1634536620613_0006
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2021-10-18 00:15:05,786 Stage-1 map = 0%, reduce = 0%
2021-10-18 00:15:14,976 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.65 sec
MapReduce Total cumulative CPU time: 1 seconds 650 msec
Ended Job = job_1634536620613_0006
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/org.db/.hive-
staging_hive_2021-10-18_00-14-56_959_3838344040900696865-1/-ext-10001
Moving data to: hdfs://quickstart.cloudera:8020/user/hive/warehouse/org.db/emp
Table org.emp stats: [numFiles=1, numRows=4, totalSize=70, rawDataSize=66]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.65 sec HDFS Read: 3410 HDFS Write: 133
SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 650 msec
OK
Time taken: 19.445 seconds
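CREATE TABLE ... AS SELECT (CTAS) derives the new table's schema from the select list, so emp ends up with only id, name, and city, and populates it in the same job; describe confirms this below.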
hive> decribe emp;
NoViableAltException(26@[])
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1028)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:201)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:522)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1356)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1473)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1275)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:775)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:0 cannot recognize input near 'decribe' 'emp' '<EOF>'
hive> describe emp;
OK
id int
name string
city string
Time taken: 0.156 seconds, Fetched: 3 row(s)
hive> Select * from emp ;
OK
1 Sachin Pune
2 Gaurav Bangalore
3 Manish Chennai
4 Bhushan Hyderabad
Time taken: 0.061 seconds, Fetched: 4 row(s)
hive>
hive> ALTER TABLE emp RENAME TO new_emp;
OK
Time taken: 0.153 seconds
hive> show tables;
OK
employee
employee1
new_emp
Time taken: 0.012 seconds, Fetched: 3 row(s)
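For a managed table, RENAME TO also moves the underlying warehouse directory, so the data formerly at .../org.db/emp now lives under org.db/new_emp.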
hive>
