4. Conclusion
MapReduce and Apache Spark are among the most important tools for big data processing. The main advantage of MapReduce is that it scales easily to process data across many compute nodes. Apache Spark, on the other hand, offers high-speed computation, flexibility, and relative ease of use, which complement MapReduce well. The two have a symbiotic relationship: Hadoop offers features that Spark does not provide, such as a distributed file system, while Spark provides near-real-time processing for the data sets that need it. MapReduce writes intermediate results to disk, whereas Apache Spark keeps working data in memory where possible. Used together, MapReduce and Apache Spark are powerful tools for processing and analyzing large volumes of data, and for making the Hadoop cluster more reliable.
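The programming model summarized above can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce phases applied to a word count. It uses only the Python standard library; it is not Hadoop or Spark code, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical names chosen for this illustration. In a real Hadoop job the map output would be written to disk before the reduce phase, whereas here everything stays in memory, which is closer in spirit to how Spark handles intermediate data.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key, as the framework
    would do between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the list of counts collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases; all intermediate data stays in memory
    (an assumption of this sketch, not how Hadoop MapReduce behaves)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["Spark runs in memory", "MapReduce runs on disk"]
    print(word_count(sample))
```

Because each phase depends only on its input, the map and reduce steps could in principle be distributed across many nodes, which is the scalability property attributed to MapReduce above.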