A computer file is a means by which data is stored on a permanent basis, or at least until it
is deleted. It is held in a place such as a hard disk drive or removable storage device that is
separate from the active, temporary memory of a computer. While the active memory holds
the current program and a limited amount of data, files represent a larger archive of stored
data that is intended to survive when the computer is switched off.
Parts of this saved data may be copied into the active memory as required. Loading data
from files (which may be stored locally or transmitted via a network) places data into the
active memory so that it can be worked upon efficiently. This data might be the code for a
computer program which can then be executed to do a job. Naturally we save program
instruction code as a file so that it may be used as many times as desired, without having
to rewrite anything.
This chapter will focus on data files that store information for programs to work with,
rather than the program files themselves, given that we can trust the Python interpreter to
handle the loading and running of Python code. We will show how data can be read into a
program and written out from a program, e.g. to and from files stored on disk. Such data
files come in a large variety of shapes, sizes and forms (unlike Python files, which
conform to a single, precise standard). Information can be stored in an endless number of
ways, sometimes at the whim of the programmer. Fortunately, in the spirit of
cooperation (and mutual financial interest), particular types of data are often stored in
a standardised way, with a known specification.
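
As a minimal sketch of reading data into a program and writing it out again, the
example below stores a few short strings in a plain text file, one per line, and then
loads them back. The function names, the file name and the example sequences are
hypothetical, chosen only for illustration, and a temporary directory is assumed so
that nothing is left on disk afterwards.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write each string in lines to a plain text file, one per line."""
    with open(path, "w", encoding="utf-8") as file_obj:
        for line in lines:
            file_obj.write(line + "\n")


def read_lines(path):
    """Read a plain text file back into a list of strings, without newlines."""
    with open(path, "r", encoding="utf-8") as file_obj:
        return [line.rstrip("\n") for line in file_obj]


# Hypothetical example data, used only for illustration.
sequences = ["ATGGCC", "TTGACA", "GGCTAA"]

# The temporary directory and its contents are deleted when the block ends.
with tempfile.TemporaryDirectory() as temp_dir:
    path = os.path.join(temp_dir, "example_sequences.txt")
    write_lines(path, sequences)   # data goes from memory out to disk
    loaded = read_lines(path)      # data comes from disk back into memory

print(loaded)
```

Using the with statement means each file is closed as soon as the block finishes, even
if an error occurs part way through.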
Stored data is represented as a series of binary digits, i.e. zeros and ones (merely the
absence or presence of a signal). A connected series of data that goes together, as a
named unit, is what we mean by a file. There are, however, two distinct types of file:
binary and plain text. The difference between these is that plain text files use only a
limited set of binary codes, which describe data as character symbols such as digits and letters.
Binary files are not restricted in this sense, but their interpretation is dependent upon
having the right kind of computer system and/or program to load them. Plain text files are
much more universal and keep to a standard set of binary codes, so that they will be
interpretable in the same way whatever the computer system or programming language. In
this chapter we concentrate on data stored as plain text files, and this will cover most of
the file standards used in biology.
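
The distinction between plain text and binary storage can be illustrated with a small
sketch that stores the same number in both ways. It assumes ASCII character codes for
the text form and a 4-byte integer for the binary form; both are choices made here for
illustration, not something fixed by the discussion above.

```python
import struct

number = 123

# Plain text: the number is stored as the character codes for "1", "2" and "3".
as_text = str(number).encode("ascii")
print(list(as_text))         # [49, 50, 51]

# Binary: the same number packed into 4 bytes. The byte order is a convention
# that the writing and reading programs must agree on.
little_endian = struct.pack("<i", number)
big_endian = struct.pack(">i", number)
print(list(little_endian))   # [123, 0, 0, 0]
print(list(big_endian))      # [0, 0, 0, 123]

# Reading the text form gives the same result under any system that uses
# these standard character codes.
print(int(as_text.decode("ascii")))   # 123

# Reading the binary form with the wrong byte order gives a different number.
misread = struct.unpack("<i", big_endian)[0]
print(misread == number)     # False
```

The text form can be decoded by anything that knows the standard character codes,
whereas the binary form is only meaningful to a program that knows the size and byte
order that were used when it was written.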