Collection data types
As well as the simple data types, Python has several common collection data types:
tuples, lists, sets and dictionaries. These provide a means of bringing multiple items
together into a container.
The simplest collection type is a tuple. A tuple contains a fixed number of items and
once it is created it cannot be modified. You can think of it as a fixed (immutable) and
ordered collection of items. A tuple is defined using a left round parenthesis ‘(’ at the start
and a right round parenthesis ‘)’ at the end. For example, we could have:
x = () # empty tuple
x = ("Ala",) # tuple with one item
x = (123, 54, 92, 54) # tuple with four items
Note the peculiar-looking syntax for tuples with only one item inside; there is a comma
(‘,’) at the end. This is because otherwise Python would interpret the parentheses as an
expression rather than a tuple. For example, (2+3) is a mathematical expression for the
number 5, but (2+3,) is a tuple containing one item (again 5). So parentheses are used in
both these contexts in Python, and the trailing comma is a small irritation that is needed
to avoid ambiguity.
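As a quick check of this point, the short sketch below compares the two forms. The variable names a and b are chosen only for illustration.

```python
# Parentheses alone only group an expression; the trailing comma makes a tuple.
a = (2 + 3)     # an ordinary integer
b = (2 + 3,)    # a tuple containing one item

print(type(a).__name__)  # int
print(type(b).__name__)  # tuple
print(len(b))            # 1
```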
The items inside a tuple can repeat themselves and be of different data types; you can
mix numbers, strings or whatever. In common usage, however, the items tend to all be of
the same type, as illustrated above. Also, an item inside a tuple does not have to be a
simple data type; it can itself be a collection, or even a user-defined type (which we come
to in Chapter 7). A nonsense example of a tuple with mixed types and repetition, where
the last item is another tuple, inside the first, is:
x = ( 2, 2, 'banana', False, ('a','b') )
Like all the collection types, tuples may also be created using variables:
x = 1.2
y = -0.3
z = 0.9
t = (x, y, z)
The next simple collection type is a list. A list contains an arbitrary number of items,
and new items can be added and existing ones removed. As with tuples, the items in a list
remain in their specified order. The major difference between lists and tuples is that the
contents of a list can be modified, whereas a tuple is fixed at the moment it is defined. A
list is defined using a left square bracket ‘[’ at the start and a right square bracket ‘]’ at the
end. For example, we could have:
x = [] # empty list
x = ["Ala"] # list with one item
x = [123, 54, 92, 54] # list with four items
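To make the difference between lists and tuples concrete, here is a minimal sketch that modifies a list in place and then attempts the same kind of change on a tuple. The names numbers, fixed and tuple_changed are illustrative only.

```python
# A list can be changed after it is created.
numbers = [123, 54, 92, 54]
numbers.append(7)      # add an item at the end
numbers.remove(54)     # remove the first matching item
numbers[0] = 0         # replace an item in place
print(numbers)         # [0, 92, 54, 7]

# A tuple cannot: item assignment raises a TypeError.
fixed = (123, 54, 92, 54)
try:
    fixed[0] = 0
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)   # False
```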
As with tuples, the items in a list can repeat and be of different data types, although in
normal usage they tend to all have the same type. And again, an item does not have to be a
simple data type. You can convert a tuple to a list with the inbuilt list() function, and you
can convert a list to a tuple with the tuple() function:
t = (123, 54)
x = list(t) # x is [123, 54], t is still (123, 54)
w = tuple(x) # w is (123, 54), x is still [123, 54]
The next collection type is a set. A set contains an arbitrary number of items, and can be
modified; new items can be added and existing ones removed. Unlike tuples and lists,
however, the items in a set are not in any order. Also, an item can only appear once in a
set; if you try to add the same item twice then the second time it will be ignored. Sets
were introduced relatively late into Python, and so the syntax used a built-in function
name, specifically set(collection) to get a filled set or set() to get an empty set. When we
pass a collection (list, tuple or other set) to the constructor, the contents of the collection
are used to define the contents of the set.
x = set() # Empty set
listData = [123, 54, 92, 54]
x = set(listData) # Set with _three_ items
Note that the second set has three items, not four, because the 54 is repeated and so the
second one is ignored. Because set() does not take more than one argument, extra brackets
are often used to create an inner collection, for specifying multiple items directly:
x = set(1,4,9,16,25) # Fails! – Multiple arguments
x = set([1,4,9,16,25]) # Works – Brackets make a single list
In Python 2.7 and in Python 3, although set() is still used for making sets with the
contents of other collections, a new shorter notation can be used for directly defining
non-empty sets.
x = {123, 54, 92, 54}
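The curly-bracket notation only covers non-empty sets because an empty pair of curly brackets creates an empty dictionary, not a set. The sketch below illustrates this, and also shows a repeated item being ignored when it is added. The variable names are illustrative only.

```python
x = {123, 54, 92, 54}
print(len(x))                       # 3, the repeated 54 is stored once

empty_set = set()                   # the only way to write an empty set
empty_braces = {}                   # this is an empty dictionary
print(type(empty_set).__name__)     # set
print(type(empty_braces).__name__)  # dict

x.add(54)                           # already present, so ignored
x.add(7)                            # a new item
print(len(x))                       # 4
```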
Be aware that creating a set using an inner tuple requires an extra comma if the tuple
contains only one item; otherwise the parentheses will effectively be ignored. Using square
brackets, to make an inner list instead, does not have this issue.
x = set(("Ala")) # A set containing three letters!
x = set(("Ala",)) # Set containing one string item
x = set(["Ala"]) # Set containing one string item
As with tuples and lists, the items inside a set may represent a mixture of different
kinds of data, so you could have a set containing both numbers and text if you wanted,
although in normal usage they tend to all be of the same type. Additionally, an item does
not have to be one of the simple Python types; you could place your own custom data
objects in it.
There is a significant caveat with putting things in sets because it turns out that not all
Python data types can be placed in one. Only items that can be described as hashable can
go in. The concept of hashability is perhaps too complex to describe at this point.
However, the basic essence of the situation is that if an item is to be allowed within a set it
cannot be modified internally, to take on a new value. If such value modifications were
allowed then items inside a set could be changed so that they become indistinguishable,
and this is inconsistent with sets not having repeats. The inbuilt simple types like integers
and strings are not modifiable, because their values define what they are. Thus, these are
hashable and hence are allowed as set items. Modifiable collections like lists, dictionaries
and other sets are not allowed as items, because when their content changes so does their
value.
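A minimal sketch of this restriction is given below. Unmodifiable items are accepted by a set, while trying to add a list raises a TypeError. The tuple case is our own addition for comparison: it works here because a tuple is fixed and its contents are themselves unmodifiable. The variable names are illustrative only.

```python
items = set()
items.add(42)          # integer: allowed
items.add("Ala")       # string: allowed
items.add((1, 2))      # tuple of integers: allowed (added for comparison)

# A list is modifiable, so it is rejected as a set item.
try:
    items.add([1, 2])
    list_accepted = True
except TypeError:
    list_accepted = False

print(len(items))      # 3
print(list_accepted)   # False
```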
The final, main collection type is a dictionary. (Python dictionaries are equivalent to