Every piece of data in Pig has one of these four types:
REGISTER: Register a JAR file with the Pig runtime.
DEFINE: Create an alias for a macro, UDF, streaming script, or command specification.
IMPORT: Import macros defined in a separate file into a script.
LOAD: Load data from the file system.
FILTER: Remove unwanted rows from a relation.
FOREACH: Apply a transformation to each row of a relation, typically to select or compute particular columns.
GENERATE: Used with FOREACH to add or remove fields from a relation.
GROUP: To group data in a single relation.
COGROUP: To group or join data in two or more relations
UNION: To merge the contents of two or more relations
SPLIT: To partition the contents of a relation into multiple relations
JOIN (Inner or Outer): To join the data in two or more relations
ORDER: Sort a relation by one or more fields
LIMIT: Limit the size of a relation to a maximum number of tuples
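
The relational operators listed above can be seen together in a short Pig Latin script. This is only a minimal sketch: the input files (students.txt, depts.txt), their field names, and the score thresholds are hypothetical and chosen purely for illustration.

```pig
-- Hypothetical tab-separated inputs (assumed for this sketch):
--   students.txt : name, dept, score
--   depts.txt    : dept, building

-- LOAD: read data from the file system into relations
students = LOAD 'students.txt' USING PigStorage('\t')
           AS (name:chararray, dept:chararray, score:int);
depts    = LOAD 'depts.txt' USING PigStorage('\t')
           AS (dept:chararray, building:chararray);

-- FILTER: keep only the rows that satisfy a condition
passed = FILTER students BY score >= 50;

-- FOREACH ... GENERATE: select or compute fields for each tuple
names = FOREACH passed GENERATE name, score;

-- GROUP: collect the tuples of a single relation by key, then aggregate
by_dept  = GROUP passed BY dept;
dept_avg = FOREACH by_dept GENERATE group AS dept,
                                    AVG(passed.score) AS avg_score;

-- JOIN (inner by default): combine two relations on a common key
located = JOIN students BY dept, depts BY dept;

-- COGROUP: group two relations by key, keeping one bag per relation
cogrouped = COGROUP students BY dept, depts BY dept;

-- SPLIT: partition one relation into several; UNION: merge them back
SPLIT students INTO high IF score >= 80, low IF score < 80;
combined = UNION high, low;

-- ORDER and LIMIT: sort, then keep at most three tuples
ranked = ORDER dept_avg BY avg_score DESC;
top3   = LIMIT ranked 3;

DUMP top3;
```

Assuming the script is saved under a name such as operators.pig, it could be tried in local mode with `pig -x local operators.pig`. Pig evaluates lazily, so only the relations needed to produce `top3` are actually computed when `DUMP` runs.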