REGISTER: Register a JAR file with the Pig runtime.
DEFINE: Create an alias for a macro, UDF, streaming script, or command specification.
IMPORT: Import macros defined in a separate file into a script.
LOAD: Load data from the file system.
FILTER: Remove unwanted rows from a relation.
FOREACH: Iterate over each tuple of a relation and apply a transformation, usually together with GENERATE.
GENERATE: Add or remove fields from a relation; used inside FOREACH.
GROUP: To group data in a single relation.
COGROUP: To group or join data in two or more relations
UNION: To merge the contents of two or more relations
SPLIT: To partition the contents of a relation into multiple relations
JOIN (Inner or Outer): To join the data in two or more relations
ORDER: To sort a relation by one or more fields
LIMIT: To limit the size of a relation to a maximum number of tuples
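The operators listed above can be seen working together in a short Pig Latin script. This is only a minimal sketch under stated assumptions: the input paths (data/orders.tsv, data/customers.tsv), the field names, the jar name (myudfs.jar) and the macro file (my_macros.pig) are all hypothetical and not taken from any real project. REGISTER and IMPORT are left commented out because they fail when the referenced files do not exist.

```pig
-- REGISTER and IMPORT refer to hypothetical files; uncomment once they exist.
-- REGISTER myudfs.jar;
-- IMPORT 'my_macros.pig';

-- DEFINE: give a short alias to a function (here the built-in UPPER).
DEFINE ToUpper org.apache.pig.builtin.UPPER();

-- LOAD: read tab-delimited records (paths and schemas are assumptions).
orders = LOAD 'data/orders.tsv' USING PigStorage('\t')
         AS (order_id:int, cust_id:int, amount:double);
customers = LOAD 'data/customers.tsv' USING PigStorage('\t')
            AS (cust_id:int, name:chararray);

-- FILTER: keep only the rows that satisfy the condition.
big_orders = FILTER orders BY amount > 100.0;

-- FOREACH ... GENERATE: keep only the fields that are needed.
slim = FOREACH big_orders GENERATE cust_id, amount;

-- GROUP: collect the tuples of one relation by key, then aggregate.
by_cust = GROUP slim BY cust_id;
totals = FOREACH by_cust GENERATE group AS cust_id, SUM(slim.amount) AS total;

-- JOIN (inner): combine two relations on a common key.
joined = JOIN totals BY cust_id, customers BY cust_id;
named = FOREACH joined GENERATE ToUpper(customers::name) AS name,
                                totals::total AS total;

-- ORDER and LIMIT: sort by a field and keep at most 10 tuples.
ranked = ORDER named BY total DESC;
top10 = LIMIT ranked 10;

-- SPLIT: partition one relation into several by condition.
SPLIT orders INTO small_orders IF amount <= 100.0,
                  large_orders IF amount > 100.0;

-- UNION: merge the contents of two relations with the same schema.
all_orders = UNION small_orders, large_orders;

-- COGROUP: group two relations by key, one bag per relation.
per_cust = COGROUP orders BY cust_id, customers BY cust_id;

DUMP top10;
```

Pig evaluates lazily, so nothing runs until an output statement such as DUMP or STORE is reached; relations that are not needed for that output (here all_orders and per_cust) are defined but not computed.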
I am Ruchitha, working as a content writer for MindMajix technologies. My writings focus on the latest technical software, tutorials, and innovations. I am also into research about AI and Neuromarketing. I am a media post-graduate from BCU – Birmingham, UK. Before, my writings focused on business articles on digital marketing and social media. You can connect with me on LinkedIn.