Big Data Analytics In Different Frameworks
Data is generated daily at enormous speed, and a single machine is insufficient to store and process it. Most of this data
is unstructured, and most of it (about 90%) was generated in the last few years. Such data is characterized by its volume,
velocity, and veracity; it is known as Big Data and is stored in clusters. An efficient framework is required for managing
such data. Existing frameworks include Apache Hadoop, Apache Spark, Apache Flink, and Microsoft REEF (Retainable
Evaluator Execution Framework). Flink is a newer framework that has built-in optimization techniques for serialization and
deserialization. Flink also has a built-in program optimizer that selects the proper runtime operations for each program. This
paper presents a comparative study of these frameworks.
Keywords: Big Data, Hadoop, Spark, Flink, REEF.