IN THIS PART . . .
Use various Python data structures.
Work with trees and graphs.
Sort data to make algorithms work faster.
Search data to locate precisely the right information quickly.
Employ hashing techniques to create smaller data indexes.
CHAPTER 6
Structuring Data
IN THIS CHAPTER
» Defining why data requires structure
» Working with stacks, queues, lists, and dictionaries
» Using trees to organize data
» Using graphs to represent data with relations
Raw data is just that: raw. It’s not structured or cleaned in any way. You
might find some parts of it missing or damaged in some way, or simply that
it won’t work for your problem. In fact, you’re not entirely sure just what
you’re getting because it’s raw.
Before you can do anything with most data, you must structure it in some manner
so that you can begin to see what the data contains (and, sometimes, what it
doesn’t). Structuring data entails organizing it in some way so that all the data
has the same attributes, appearance, and components. For example, you might get
data from one source that contains dates in string form and another source that
uses date objects. To use the information, you must make the kinds of data match.
Data sources might also structure the data differently. One source might have the
last and first name in a single field; another source might use individual fields for
the same information. An important part of structuring data is organization. You
aren’t changing the data in any way; you’re simply making the data more useful.
(Structuring data contrasts with remediating or shaping the data, in which you
sometimes do change values to convert one data type to another or experience a
loss of accuracy, such as with dates, when moving between data sources.)
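To make the idea of matching structure concrete, here is a minimal sketch of bringing two sources into the same shape. The record layouts, the field names (name, first, last, joined), the "Last, First" name format, and the helper functions to_date() and split_name() are all assumptions made for this illustration; real sources will differ.

```python
from datetime import date, datetime


def to_date(value, fmt="%Y-%m-%d"):
    """Return a date object whether value arrives as a string or a date."""
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    # Assumes the string uses the given format (ISO year-month-day here).
    return datetime.strptime(value, fmt).date()


def split_name(full_name):
    """Split an assumed 'Last, First' field into individual fields."""
    last, first = [part.strip() for part in full_name.split(",", 1)]
    return {"first": first, "last": last}


# Hypothetical source A: one name field, date stored as a string.
source_a = {"name": "Smith, Ann", "joined": "2021-03-15"}
# Hypothetical source B: separate name fields, date stored as a date object.
source_b = {"first": "Bob", "last": "Jones", "joined": date(2020, 7, 1)}

record_a = {**split_name(source_a["name"]),
            "joined": to_date(source_a["joined"])}
record_b = {"first": source_b["first"],
            "last": source_b["last"],
            "joined": to_date(source_b["joined"])}

# Both records now share the same fields and the same date type.
records = [record_a, record_b]
```

After this step, every record has the same three fields and every date is a date object, so later code can sort or compare the records without checking where each one came from.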
Python provides access to a number of organizational structures for data. The
book uses these structures, especially stacks, queues, and dictionaries, for many
of the examples. Each data structure provides a different means of working with
the data and a different set of tools for performing tasks such as sorting the data
into a particular order. This chapter presents you with the most common
organizational methods, including trees and graphs (both of which are so
important that they appear in their own sections).
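As a quick preview of how these structures differ, the following sketch shows one common way to get a stack, a queue, and a dictionary using only Python's built-in types and the standard library. The sample values and variable names are made up for this illustration, and using a list as the stack and collections.deque as the queue is just one reasonable choice among several.

```python
from collections import deque

# Stack: last in, first out. A plain list works, using append() and pop().
stack = []
for item in ("a", "b", "c"):
    stack.append(item)
top = stack.pop()          # removes "c", the most recent item

# Queue: first in, first out. deque gives fast removal from the left end.
queue = deque()
for item in ("a", "b", "c"):
    queue.append(item)
first = queue.popleft()    # removes "a", the oldest item

# Dictionary: look up values by key, then sort the entries by value.
ages = {"Ann": 34, "Bob": 27, "Cy": 41}
by_age = sorted(ages.items(), key=lambda entry: entry[1])
```

The same three items come back in a different order depending on the structure you choose, which is why picking the right structure matters before you start writing an algorithm.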