Hadoop is one of the most widely used technologies for analyzing large volumes of data. Hive is a Hadoop-based tool for storing and processing large datasets. This page introduces Apache Hive and walks you through its architecture and installation process.
Data is a valuable asset that helps organizations understand their customers better and, in turn, improve performance. To store and analyze data, organizations need a data warehouse system. This article discusses Apache Hive, an open-source data warehouse system built on Hadoop.
Execution engine: This component executes the tasks in the proper dependency order and interacts with Hadoop.
Step 1: Download the Hive release (a tarball file) from https://hive.apache.org/.
Step 2: Unpack the tarball in a suitable place in your Hadoop installation environment:
$ tar -xzvf hive-0.8.1.tar.gz
Step 3: Set the environment variable HIVE_HOME to point to the installation directory:
$ cd hive-0.8.1
$ export HIVE_HOME=$(pwd)
Step 4: Add $HIVE_HOME/bin to your PATH:
$ export PATH=$HIVE_HOME/bin:$PATH
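The environment set up in the steps above can be sanity-checked with a short shell function. This is a minimal sketch and not part of the Hive distribution: the function name check_hive_env is hypothetical, and it only confirms that HIVE_HOME points to a directory and that $HIVE_HOME/bin is on PATH.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Hive): verifies the HIVE_HOME and
# PATH settings made during installation.
check_hive_env() {
    # HIVE_HOME must be set and point to an existing directory.
    if [ -z "${HIVE_HOME:-}" ] || [ ! -d "$HIVE_HOME" ]; then
        echo "HIVE_HOME is not set or is not a directory" >&2
        return 1
    fi
    # PATH must contain $HIVE_HOME/bin as one of its colon-separated entries.
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*) ;;
        *) echo "\$HIVE_HOME/bin is not on PATH" >&2; return 1 ;;
    esac
    echo "Hive environment looks OK: $HIVE_HOME"
}
```

Calling check_hive_env after the two export lines returns a non-zero status if either variable still needs attention.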
Create the metastore database and a Hive user in MySQL:
Cmd:> mysql
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-0.7.0.mysql.sql;
mysql> CREATE USER 'hiveuser'@'%' IDENTIFIED BY 'password';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE ON metastore.* TO 'hiveuser'@'%';
mysql> REVOKE ALTER,CREATE ON metastore.* FROM 'hiveuser'@'%';
To start the MySQL service:
Cmd:> cd /etc/init.d
> ./mysql start
To stop it:
> ./mysql stop
To start the Hive service:
Cmd:> /usr/bin/hive --service hiveserver
Configure hive-site.xml as shown below:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
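The connection string above points at an embedded Derby database. Because the earlier steps prepare a MySQL database, a MySQL-backed configuration might look like the following sketch. The output directory, host (localhost), user name, and password are placeholder assumptions; the four property names are the standard javax.jdo.option.* connection settings, and com.mysql.jdbc.Driver is the classic MySQL Connector/J driver class.

```shell
#!/bin/sh
# Sketch: write a minimal hive-site.xml that points the metastore at MySQL.
# CONF_DIR, the host, the user name, and the password are assumptions.
CONF_DIR="${CONF_DIR:-./hive-conf-example}"
mkdir -p "$CONF_DIR"

# The here-document delimiter is quoted so the shell copies the XML verbatim.
cat > "$CONF_DIR/hive-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/metastore</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
  </property>
</configuration>
EOF

echo "Wrote $CONF_DIR/hive-site.xml"
```

For Hive to use such a configuration, the MySQL JDBC driver JAR also has to be on Hive's classpath, for example in $HIVE_HOME/lib.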
$HIVE_HOME/bin/hive is a shell utility that can be used to run Hive queries in either interactive or batch mode.
hive -d or --define: variable substitution to apply to Hive commands.
Example: -d A=B or --define A=B
1. hive -e: SQL from the command line.
2. hive -f: SQL from files.
3. hive -h: connect to a Hive server on a remote host.
4. --hiveconf: use the given value for a property.
5. --hivevar: variable substitution to apply to Hive commands.
Example: --hivevar A=B
6. hive -i: initialization SQL file.
7. hive -p: connect to a Hive server on the given port number.
hive -S or --silent: silent mode in the interactive shell.
hive -v or --verbose: verbose mode (echo executed SQL to the console).
Examples:
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a'
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a' --hiveconf hive.exec.scratchdir=/home/my/hive --hiveconf mapred.reduce.tasks=32
$HIVE_HOME/bin/hive -S -e 'select a.col from tab1 a' > a.txt
$HIVE_HOME/bin/hive -f /home/my/hive-script.sql
$HIVE_HOME/bin/hive -i /home/my/hive-init.sql
The CLI, when invoked without the -i option, will attempt to load $HIVE_HOME/bin/.hiverc and $HOME/.hiverc as initialization files.
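An initialization file is simply a list of Hive commands that run before the session starts. As a minimal sketch, the script below writes an example file to a scratch directory rather than to $HOME/.hiverc, so nothing existing is overwritten. The directory name and the two settings chosen are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: create an example initialization file for the Hive CLI.
# EXAMPLE_DIR is a hypothetical scratch location.
EXAMPLE_DIR="${EXAMPLE_DIR:-./hive-init-example}"
mkdir -p "$EXAMPLE_DIR"

# First setting: print column names above query results.
# Second setting: default number of reducers for the session.
cat > "$EXAMPLE_DIR/hive-init.sql" <<'EOF'
set hive.cli.print.header=true;
set mapred.reduce.tasks=32;
EOF

# Pass the file explicitly with -i, but only if the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
    hive -i "$EXAMPLE_DIR/hive-init.sql" -e 'set mapred.reduce.tasks;'
else
    echo "hive not found on PATH; init file written to $EXAMPLE_DIR/hive-init.sql"
fi
```

Copying the same two lines into $HOME/.hiverc would apply them to every session started without -i.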
When $HIVE_HOME/bin/hive is run with the -e or -f option, it executes SQL commands in batch mode.
hive -e '<query-string>' executes the query string.
hive -f <filepath> executes one or more SQL queries from a file.
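Batch mode combines naturally with variable substitution. The following sketch writes a one-line query file and runs it with -f, passing the table name through --hivevar (available from Hive 0.8.0). The table name tab1 is taken from the examples in this article; the working directory and the variable name tbl are assumptions.

```shell
#!/bin/sh
# Sketch: run a query file in batch mode with variable substitution.
# WORK_DIR and the variable name "tbl" are assumptions.
WORK_DIR="${WORK_DIR:-./hive-batch-example}"
mkdir -p "$WORK_DIR"

# ${hivevar:tbl} is replaced by Hive, not by the shell, so the here-document
# delimiter is quoted to keep the shell from expanding it.
cat > "$WORK_DIR/query.sql" <<'EOF'
select a.col from ${hivevar:tbl} a;
EOF

if command -v hive >/dev/null 2>&1; then
    # -S suppresses progress output so that only query results reach the file.
    hive -S --hivevar tbl=tab1 -f "$WORK_DIR/query.sql" > "$WORK_DIR/result.txt"
else
    echo "hive not found on PATH; query written to $WORK_DIR/query.sql"
fi
```

Keeping the table name out of the query file lets the same script be reused against different tables by changing only the --hivevar argument.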
When $HIVE_HOME/bin/hive is run without the -e or -f option, it enters interactive shell mode, which supports the following commands:
SL No. | Command | Description |
---|---|---|
1 | quit or exit | Use quit or exit to leave the interactive shell. |
2 | reset | Resets the configuration to the default values. |
3 | set <key>=<value> | Sets the value of a particular configuration variable (key). Note: If you misspell the variable name, the CLI will not show an error. |
4 | set | Prints a list of configuration variables that are overridden by the user or Hive. |
5 | set -v | Prints all Hadoop and Hive configuration variables. |
6 | add FILE[S] <filepath>*, add JAR[S] <filepath>*, add ARCHIVE[S] <filepath>* | Adds one or more files, jars, or archives to the list of resources in the distributed cache. |
7 | list FILE[S], list JAR[S], list ARCHIVE[S] | Lists the resources that are already added to the distributed cache. |
8 | list FILE[S] <filepath>*, list JAR[S] <filepath>*, list ARCHIVE[S] <filepath>* | Checks whether the given resources are already added to the distributed cache or not. |
9 | delete FILE[S] <filepath>*, delete JAR[S] <filepath>*, delete ARCHIVE[S] <filepath>* | Removes the resource(s) from the distributed cache. |
10 | ! <command> | Executes a shell command from the Hive shell. |
11 | dfs <dfs command> | Executes a dfs command from the Hive shell. |
12 | <query string> | Executes a Hive query and prints results to the standard output. |
13 | source FILE <filepath> | Executes a script file inside the CLI. |
For example:
hive> set mapred.reduce.tasks=32;
hive> set;
hive> select a.* from tab1 a;
hive> !ls;
That wraps up the article; we hope you found it informative. If you have questions about any related concept, feel free to post them in the comment section.
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.