There are three types of UDFs in Hive:
1. UDFs (regular user-defined functions)
2. UDAFs (user-defined aggregate functions)
3. UDTFs (user-defined table-generating functions)
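1) UDF: Operates on a single row and produces a single row as its output. Most functions, such as mathematical functions and string functions, are of this type.
2) UDAF: Works on multiple input rows and produces a single output row. Aggregate functions such as COUNT and MAX are of this type.
3) UDTF: Operates on a single row and produces multiple rows (a table) as its output.
Ex: Consider a table with a single column x which contains arrays of strings. The built-in table-generating function explode() emits one row for each element of an array.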
hive> CREATE TABLE arrays (x ARRAY<STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS TERMINATED BY '\002';
hive> SELECT * FROM arrays;
["a","b"]
["c","d","e"]
hive> SELECT explode(x) AS y FROM arrays;
a
b
c
d
e
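To make the row-to-rows behaviour of a table-generating function concrete, here is a minimal plain-Java sketch of what the explode() query above does to the two sample rows. It is only an illustration of the semantics, not Hive code: the class name ExplodeSketch and its explode() helper are hypothetical and do not exist in Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of explode() semantics; not part of Hive.
class ExplodeSketch {

    // Each input row holds one array; every array element becomes its own output row.
    static List<String> explode(List<List<String>> rows) {
        List<String> out = new ArrayList<>();
        for (List<String> row : rows) {
            out.addAll(row);
        }
        return out;
    }

    public static void main(String[] args) {
        // The two rows of the sample "arrays" table.
        List<List<String>> arrays = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d", "e"));

        // Two input rows produce five output rows, one element per line.
        for (String y : explode(arrays)) {
            System.out.println(y);
        }
    }
}
```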
A regular UDF is written in Java. The following UDF, Strip, strips characters from the ends of strings:

package com.hadoopbook.hive;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class Strip extends UDF {
  // Reused across calls to avoid allocating a new Text per row.
  private Text result = new Text();

  // Strips leading and trailing whitespace.
  public Text evaluate(Text str) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString()));
    return result;
  }

  // Strips any character found in stripChars from both ends of the string.
  public Text evaluate(Text str, String stripChars) {
    if (str == null) {
      return null;
    }
    result.set(StringUtils.strip(str.toString(), stripChars));
    return result;
  }
}
1. A UDF must be a subclass of org.apache.hadoop.hive.ql.exec.UDF.
2. A UDF must implement at least one evaluate() method.
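The evaluate() method is not declared by an interface, because it may take any number and type of arguments; Hive finds a matching evaluate() by inspecting the UDF class. The sketch below imitates that idea with plain Java reflection so it runs without Hive or Hadoop JARs. The names TrimLike and EvaluateLookupSketch are hypothetical, String stands in for Hadoop's Text, and the lookup only matches exact argument types, which is much simpler than what Hive actually does.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a UDF class with two overloaded evaluate() methods.
class TrimLike {
    // One-argument form: trims surrounding whitespace.
    public String evaluate(String s) {
        return s == null ? null : s.trim();
    }

    // Two-argument form: removes the given suffix if present.
    public String evaluate(String s, String suffix) {
        if (s == null) {
            return null;
        }
        return s.endsWith(suffix) ? s.substring(0, s.length() - suffix.length()) : s;
    }
}

class EvaluateLookupSketch {

    // Picks the evaluate() overload whose parameter types equal the runtime
    // types of the arguments, then invokes it. Assumes non-null arguments.
    static Object call(Object udf, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        try {
            Method m = udf.getClass().getMethod("evaluate", types);
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no matching evaluate() method", e);
        }
    }

    public static void main(String[] args) {
        TrimLike udf = new TrimLike();
        System.out.println(call(udf, "  hello  "));          // one-argument overload
        System.out.println(call(udf, "report.csv", ".csv")); // two-argument overload
    }
}
```

A class with no evaluate() method at all would make the lookup fail, which is the reason for the second rule above.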
To use the UDF in Hive, package the compiled Java class in a JAR file and register the file with Hive:
hive> ADD JAR /path/to/hive-examples.jar;
hive> CREATE TEMPORARY FUNCTION strip AS 'com.hadoopbook.hive.Strip';
Alternatively, the JAR can be put on Hive's classpath at launch time with the --auxpath option:
% hive --auxpath /path/to/hive-examples.jar
or by setting the HIVE_AUX_JARS_PATH environment variable before invoking Hive.
Once registered, the function can be used like a built-in function:
hive> SELECT EMPID, strip(EMPNAME), ESAL FROM Employee;
(Or)
hive> SELECT strip('banana', 'ab') FROM dummy;
Output is: nan
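The result of the two-argument call can be checked by hand: every leading or trailing character that appears in the second argument is removed, and stripping stops at the first character that does not. Below is a minimal plain-Java sketch of that behaviour, written without commons-lang so it runs on its own. StripSketch is a hypothetical name, and unlike StringUtils.strip this sketch assumes stripChars is never null.

```java
// Hypothetical re-implementation of the two-argument strip behaviour.
class StripSketch {

    // Removes from both ends of str every character contained in stripChars.
    static String strip(String str, String stripChars) {
        if (str == null) {
            return null;
        }
        int start = 0;
        int end = str.length();
        // Advance past leading characters that are in stripChars.
        while (start < end && stripChars.indexOf(str.charAt(start)) >= 0) {
            start++;
        }
        // Retreat past trailing characters that are in stripChars.
        while (end > start && stripChars.indexOf(str.charAt(end - 1)) >= 0) {
            end--;
        }
        return str.substring(start, end);
    }

    public static void main(String[] args) {
        // 'b' and 'a' are removed from the front, 'a' from the back.
        System.out.println(strip("banana", "ab")); // prints nan
    }
}
```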
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.