» Stored: Some data is stored because it may help answer unclear questions
later. This method relies on techniques to store it immediately and analyze it
later very fast, no matter how massive it is.
» Summarized: Some data is summarized because keeping it all as it is makes
no sense; only the important data is kept.
» Consumed: The remaining data is consumed because its usage is predetermined.
Algorithms can instantly read, digest, and turn the data into information.
After that, the system forgets the data forever.
The book deals with the first point in Chapter 13, which is about distributing data
among multiple computers and understanding the algorithms used to deal with it
(a divide-and-conquer strategy). The following sections address the second and
third points, applying them to data that streams into systems.
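To make the second and third points concrete, here is a minimal Python sketch. It is an illustration only, not code from the book. The names sensor_stream, RunningSummary, summarize, and consume are hypothetical, and the data source is simulated. The summary keeps a few numbers instead of the raw items; the consumer turns each item into information and then lets it go.

```python
from typing import Iterable, Iterator


def sensor_stream(n: int) -> Iterator[float]:
    """Hypothetical data source: yields n readings, one at a time."""
    for i in range(n):
        yield float(i % 10)  # simulated readings cycling through 0.0 .. 9.0


class RunningSummary:
    """Summarized: keeps only count, mean, and maximum, never the raw items."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.maximum = float("-inf")

    def update(self, x: float) -> None:
        # Incremental mean: no list of past values is needed.
        self.count += 1
        self.mean += (x - self.mean) / self.count
        self.maximum = max(self.maximum, x)


def summarize(stream: Iterable[float]) -> RunningSummary:
    """Read the stream once and keep only the summary."""
    summary = RunningSummary()
    for x in stream:
        summary.update(x)
    return summary


def consume(stream: Iterable[float], threshold: float) -> int:
    """Consumed: count readings above threshold, then forget each reading."""
    alerts = 0
    for x in stream:
        if x > threshold:
            alerts += 1  # the information is kept; the item itself is not
    return alerts
```

Calling summarize(sensor_stream(100)) reads each value exactly once and uses the same small amount of memory however long the stream is. Calling consume(sensor_stream(100), 7.0) does the same for a predetermined use, in this case counting readings above a threshold.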
When people talk about massive data arriving in a computer system, you will often
hear it compared to water: streaming data, data streams, the data fire hose.