Easy Pig Script Generation Tool


Örnek Ö., Gürel U.

4th International Mediterranean Science and Engineering Congress (IMSEC 2019), Antalya, Turkey, 25 - 27 April 2019

  • Publication Type: Conference Paper / Full Text
  • City: Antalya
  • Country: Turkey
  • Eskisehir Osmangazi University Affiliated: Yes

Abstract

With increasing digitalization, the amount, volume, and variety of the data produced have increased. The generated data can be structured or unstructured. Data of this kind has to be handled within the Big Data concept. Big data analytic techniques can be applied to very large data sets containing different types of data, such as structured and unstructured data. Big data comes with many challenges, such as managing, analyzing, storing, and visualizing huge volumes of data. Hadoop, which combines many components, can be a good solution to these challenges. The two most important components of Hadoop are the distributed file system and MapReduce: the distributed file system handles storage, and MapReduce handles processing. Hadoop can be used to analyze both structured and unstructured data. In a Hadoop cluster, data is divided into pieces and distributed across the cluster. This gives the scalability needed for Big Data processing. Apache Pig is one of the Hadoop components for processing Big Data in a short time and with less technical knowledge. However, writing Pig scripts is sometimes not easy. In this study, a graphical user interface is developed; with this tool, a user can build Pig scripts easily.
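
The abstract does not describe how the tool works internally, so the following is only a minimal sketch of the general idea: values a user might enter in form fields are turned into Pig Latin statements. All function names, aliases, paths, and the schema below are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: each helper turns form-style inputs into one
# Pig Latin statement. Names, paths, and schema are assumptions.

def load(alias, path, delimiter, schema):
    """Build a LOAD statement; schema is a list of (field, type) pairs."""
    fields = ", ".join(f"{name}:{typ}" for name, typ in schema)
    return f"{alias} = LOAD '{path}' USING PigStorage('{delimiter}') AS ({fields});"


def filter_by(alias, source, condition):
    """Build a FILTER statement over an existing relation."""
    return f"{alias} = FILTER {source} BY {condition};"


def group_by(alias, source, key):
    """Build a GROUP statement over an existing relation."""
    return f"{alias} = GROUP {source} BY {key};"


def foreach(alias, source, expressions):
    """Build a FOREACH ... GENERATE statement from a list of expressions."""
    return f"{alias} = FOREACH {source} GENERATE {', '.join(expressions)};"


def store(source, path):
    """Build a STORE statement that writes a relation to a path."""
    return f"STORE {source} INTO '{path}';"


def build_script(statements):
    """Join the statements into one script, one statement per line."""
    return "\n".join(statements) + "\n"


# Hypothetical example: total bytes per user for records above a threshold.
script = build_script([
    load("records", "input/logs.csv", ",", [("user", "chararray"), ("bytes", "int")]),
    filter_by("filtered", "records", "bytes > 100"),
    group_by("grouped", "filtered", "user"),
    foreach("totals", "grouped", ["group AS user", "SUM(filtered.bytes) AS total"]),
    store("totals", "output/totals"),
])

print(script)
```

A graphical front end could collect the same arguments through input widgets and show the generated script for review before it is submitted to Pig. A real tool would also need to validate the user's input, which this sketch does not do.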