MapReduce
-
MapReduce DistributedSystem/HadoopEcosystem 2019. 9. 25. 05:08
1. Overview: MapReduce is a processing technique and a programming model for distributed computing based on Java. The MapReduce algorithm contains two important tasks, namely Map and Reduce. The major advantage of MapReduce is that it makes it easy to scale data processing across multiple computing nodes. Under the MapReduce model, the data processing primitives are called mappers and reducers. Decomposing a data process..
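The mapper/reducer split described in this excerpt can be illustrated with a small word-count sketch. It is a single-process simulation in plain Python, not the Hadoop Java API: the function names `mapper`, `shuffle`, `reducer`, and `run_word_count` are hypothetical, and the in-memory grouping step only stands in for the shuffle that a real framework would perform across nodes.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key (stand-in for the framework's shuffle)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: combine all values for one key into a single result."""
    return word, sum(counts)


def run_word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["map reduce map", "reduce scales out"]
    print(run_word_count(sample))
```

Because each call to `mapper` looks at one line only and each call to `reducer` looks at one key only, the calls are independent of one another, which is the property that lets a framework spread them over many computing nodes.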
-
Apache Hadoop DistributedSystem 2019. 9. 5. 03:44
1. Overview: Apache Hadoop is a set of software components that together form a scalable system optimized for analyzing data. Data analyzed on Hadoop has several typical characteristics. Structured: for example, customer data, transaction data, and clickstream data recorded when people click links while visiting websites. Unstructured: for example, text from web-based news feeds..