Protein Data Bank file which is used to represent data about molecules and their three-
Figure 6.2. An example of a plain text format that uses keys and values. Here the
last value extends over multiple lines. This extract is a fragment of a file format called
mmCIF. As with the PDB format mmCIF is used to represent data about biological
molecules and their structure.
Figure 6.3. An example of a file format containing tagged elements. A truncated
fragment of an XML file from the Protein Data Bank (PDBML).
Reading files
Using ‘
open
’
When reading a file within a Python program, you obviously need to specify the location
and the name of the file, collectively termed the path to the file. The file’s path is then
used by the inbuilt open() function, or similar, to get access to the file’s contents. The path
can be absolute, which is to say that it starts at the top or root of the operating-system file
hierarchy, or the path can be relative to the current working directory (folder). Initially the
current working directory is the directory where the Python interpreter starts, although it
can be changed from inside the code if needed. In Python the file path is just a string, so,
for example, you could have:
path = 'examples/dataFile.txt'
fileObj = open(path)
The open function passes back what is termed a file handle, a Python object used to
access and represent the open file. Here the file handle is stored in a variable called
fileObj. It might be tempting to use the more natural variable name ‘file’, but in Python
the word ‘file’ is already used to describe the type (class) of such objects. Hence, it is best
to avoid overriding the internal variable name. An alternative to the descriptive but
verbose ‘fileObj’ is the shorter ‘fh’, meaning ‘file handle’.
There is the possibility that a program may fail to open the file with the specified path.
For example, the stated file may simply not exist (e.g. the name was wrong), or you may
not have permission to read it. Under such circumstances, when you try to open it the
function will throw an error, a standard Python exception (IOError). Otherwise, if all goes
well, when you are done reading a file then you can explicitly say you are finished, by
using the close() function, a method that is known to the file object class. A listing of
methods associated with file objects is given in
Appendix 2
.
fileObj.close()
Usually you do not have to explicitly close a file, because
when the variable name
fileObj goes out of scope and is no longer meaningful, such as at the end of a loop or
function, Python will automatically close it.
2
However, it is generally a good habit to
explicitly close a file handle when you know you are done, just in case it is important.
Also, this lets someone who reads your code know that you have finished using the file.
The open() function can take an optional second argument: a ‘mode’, specifying how
the file should be opened. These modes include ‘r’ to read from a file and ‘w’ to write to a
file, but, as you might have spotted in the above examples, the ‘r’ is optional, given that it
is the default. Although we could still have explicitly stated reading mode:
fileObj = open(path, 'r')
Do'stlaringiz bilan baham: