Mindmajix

Writing a User Defined Function (UDF) for Pig

Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing.

Pig UDFs can currently be implemented in five languages: Java, Python, JavaScript, Ruby, and Groovy.

Java functions are more efficient because they are implemented in the same language as Pig.

Limited support is provided for Python, JavaScript, Ruby, and Groovy functions.

Pig also provides support for Piggy Bank, a repository for Java UDFs. Through Piggy Bank, you can access Java UDFs written by other users and contribute Java UDFs that you have written.

Example of a User Defined Function: an Eval Function in Pig

To create a simple Java project:

1. Open Eclipse.
2. Go to the File menu.
3. Select New.
4. Select Java Project.
5. Provide a name for the project.
6. Once the project has been created successfully, right-click the Java project name and select New.
7. Select Package and provide the package name (myudfs in this example).
8. Once the package has been created successfully, right-click the package name.
9. Select New and then select Class.
10. Provide a name for the class (UPPER in this example).
11. As part of the Apache Pig customization, the created class should extend EvalFunc<String>, a predefined Pig class.
12. Because the class extends EvalFunc, it has to override the method called exec, which takes a Tuple as its input.
13. Write the code as below.

Ex:

package myudfs;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class UPPER extends EvalFunc<String> {
    // Returns the first field of the input tuple converted to upper case.
    @Override
    public String exec(Tuple input) throws IOException {
        // No input row, or an empty one: nothing to convert.
        if (input == null || input.size() == 0)
            return null;
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            throw new IOException("Caught exception processing input row", e);
        }
    }
}
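To see how exec treats each kind of input without needing the Pig JARs on the classpath, here is a minimal stand-alone sketch. It assumes a plain java.util.List as a stand-in for Pig's Tuple, and the names UpperSketch and upper are hypothetical, used only for illustration; the real UDF is the UPPER class above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-alone sketch of the UPPER logic. A List<Object> stands in for Pig's
// Tuple (an assumption for illustration), so no Pig JARs are required.
public class UpperSketch {

    // Mirrors exec(): null for missing input, otherwise the first field upper-cased.
    public static String upper(List<Object> input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            // A null or non-String first field ends up here.
            throw new IOException("Caught exception processing input row", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample values are made up for illustration.
        System.out.println(upper(Arrays.<Object>asList("alice", 20, 3.5f)));
        System.out.println(upper(Collections.<Object>emptyList()));
        try {
            upper(Arrays.<Object>asList(42));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch shows the three paths through the method: a normal row yields the upper-cased first field, an empty or null row yields null, and a first field that is not a String is reported as an IOException rather than an unchecked exception.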

Execution:

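To execute the Pig customization program, the following classes have to be imported:

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

The third import is only needed if the UDF throws a WrappedIOException, as older Pig examples do; a UDF that throws java.io.IOException imports that class instead.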

These imports require the supporting external Pig JARs on the project's build path. Add them as follows:

1. Right-click the package name.
2. Click Build Path.
3. Select Configure Build Path.
4. In the window that opens, click the Libraries tab.
5. Click Add External JARs.
6. Select the supporting JARs from your local drive (multiple selection is possible).
7. The selected JAR files appear under the Referenced Libraries folder in the Project Explorer window.
8. Check for errors.

To package the compiled program for Pig, create a JAR file for it as follows:

1. Go to the project.
2. Right-click and select Export.
3. Select JAR file as the export destination.
4. Click Next.
5. Provide the JAR file name, with the extension .jar (myudfs.jar in this example), in the JAR file field.
6. Click Finish.

Run Pig with a script that registers the JAR file and calls the UDF, as shown below.

-- myscript.pig
REGISTER myudfs.jar;
A = LOAD 'Student-data.txt' AS (name: chararray, age: int, gpa: float);
B = FOREACH A GENERATE myudfs.UPPER(name);
DUMP B;
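As a rough picture of what the script computes, the following plain-Java sketch mimics the LOAD and FOREACH ... GENERATE steps on a few in-memory rows. It does not use Pig at all: the class ScriptFlowSketch, its methods, and the sample rows are hypothetical, and the tab-separated layout assumes the default field delimiter that Pig uses when LOAD has no USING clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java picture of: B = FOREACH A GENERATE myudfs.UPPER(name);
// Not Pig code; the sample rows in main() are made up for illustration.
public class ScriptFlowSketch {

    // Splits one tab-separated line into (name, age, gpa), like the AS clause.
    public static Object[] parseRow(String line) {
        String[] f = line.split("\t");
        return new Object[] { f[0], Integer.parseInt(f[1]), Float.parseFloat(f[2]) };
    }

    // Keeps only the upper-cased name of each row: one output per input row.
    public static List<String> upperNames(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Object[] row = parseRow(line);
            out.add(((String) row[0]).toUpperCase());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("john\t18\t4.0", "mary\t19\t3.8");
        // DUMP prints each output tuple in parentheses.
        for (String name : upperNames(lines)) {
            System.out.println("(" + name + ")");
        }
    }
}
```

The point of the sketch is that B contains one single-field tuple per input row: the age and gpa fields are loaded but not carried into the output, because only UPPER(name) is generated.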

 


 
