230
◾
Linux with Operating System Concepts
Our replacement string would be \3, \1. That is, we would put the third string first, followed by a comma, followed by a space, followed by the first string, and no second string.
The sed command is
sed 's/\([A-Z][a-z]\+\) \([A-Z]\.\) \([A-Z][a-z]\+\)/\3, \1/' names.txt
Unfortunately, the above sed command will not work if the first name is just an initial
or if the person’s name has no middle initial.
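To see both the intended behavior and the limitation just described, the following sketch runs the command on three made-up names. The sample names and the shell variable names are assumptions for illustration only. The pattern writes the one-or-more operator as \+, which is the form GNU sed expects in its default (basic) regular expression mode.

```shell
# Hypothetical sample data: a full name, a name whose first name is an
# initial, and a name with no middle initial.
names_file=$(mktemp)
printf '%s\n' 'John A. Smith' 'J. B. Doe' 'Mary Jones' > "$names_file"

# Remember first name (\1), middle initial (\2), and last name (\3),
# then emit "last, first" and drop the middle initial.
swapped=$(sed 's/\([A-Z][a-z]\+\) \([A-Z]\.\) \([A-Z][a-z]\+\)/\3, \1/' "$names_file")

printf '%s\n' "$swapped"
# Smith, John
# J. B. Doe      (unchanged: first name is only an initial)
# Mary Jones     (unchanged: no middle initial)

rm -f "$names_file"
```

Only the first line is rewritten; the other two lines do not match the pattern, so sed passes them through untouched.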
To solve these problems, we can use multiple pattern/substitution pairs using the -e option. In the first case, we replace the first [A-Z][a-z]\+ with [A-Z]\. giving us
-e 's/\([A-Z]\.\) \([A-Z]\.\) \([A-Z][a-z]\+\)/\3, \1/'
For the second case, we remove the middle portion of the expression and renumber \3 to \2, since there are only two patterns we are remembering now, giving us
-e 's/\([A-Z][a-z]\+\) \([A-Z][a-z]\+\)/\2, \1/'
As you can see, sed commands can become both complicated and cryptic very quickly.
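As a sketch of how the three pattern/substitution pairs might be combined into one command, consider the following. The sample names and variable names are again assumptions, and \+ is the GNU sed spelling of "one or more". Each -e expression is applied in order to every line, so a line already rewritten by an earlier expression is also tested against the later ones; here the inserted comma keeps a rewritten line from matching again.

```shell
# Hypothetical sample data covering all three name shapes.
names_file=$(mktemp)
printf '%s\n' 'John A. Smith' 'J. B. Doe' 'Mary Jones' > "$names_file"

reordered=$(sed \
  -e 's/\([A-Z][a-z]\+\) \([A-Z]\.\) \([A-Z][a-z]\+\)/\3, \1/' \
  -e 's/\([A-Z]\.\) \([A-Z]\.\) \([A-Z][a-z]\+\)/\3, \1/' \
  -e 's/\([A-Z][a-z]\+\) \([A-Z][a-z]\+\)/\2, \1/' \
  "$names_file")

printf '%s\n' "$reordered"
# Smith, John
# Doe, J.
# Jones, Mary

rm -f "$names_file"
```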
We end this section with a brief listing of examples. Assume names.txt is a file of names as described above. The first example below capitalizes every vowel found. The second example fully capitalizes every first name found. Notice the use of * for the lower-case letters. This allows us to specify that the first name could be an initial (in which case there are no lower-case letters). The third example removes all initials from the file. The fourth example replaces every blank space (which we assume is being used to separate first, middle, and last names) with a tab. The fifth example matches every line and repeats the line onto two lines. In this case, the replacement is & (the original line) followed by a new line (\n) followed by & (the original line again).
• sed 's/[aeiou]/\u&/g' names.txt
• sed 's/[A-Z][a-z]*/\U&/' names.txt
• sed 's/[A-Z]\.//g' names.txt
• sed 's/ /\t/g' names.txt
• sed 's/[A-Za-z. ]\+/&\n&/' names.txt
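The five examples can be tried on a single line to see what each one does. This is a minimal sketch: the sample name and variable names are assumptions. The fourth expression is written with a space between the first two slashes, and the fifth with a space inside the brackets and \+ for "one or more", which is what GNU sed needs in order to match a whole line of names. The \u, \U, \t, and \n sequences in the replacement are GNU sed extensions.

```shell
line='John A. Smith'   # hypothetical sample line

vowels=$(printf '%s\n' "$line" | sed 's/[aeiou]/\u&/g')            # JOhn A. SmIth
first_upper=$(printf '%s\n' "$line" | sed 's/[A-Z][a-z]*/\U&/')    # JOHN A. Smith
no_initials=$(printf '%s\n' "$line" | sed 's/[A-Z]\.//g')          # John  Smith (both spaces remain)
tabbed=$(printf '%s\n' "$line" | sed 's/ /\t/g')                   # fields now separated by tabs
doubled=$(printf '%s\n' "$line" | sed 's/[A-Za-z. ]\+/&\n&/')      # the line, printed twice

printf '%s\n' "$vowels" "$first_upper" "$no_initials" "$tabbed" "$doubled"
```

Note that the third example deletes only the initial and its period, so the spaces on either side of it are left behind.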
6.6 awk
The awk program, like sed and grep, will match a regular expression against a file of strings. However, whereas grep returns the matching lines and sed replaces matching strings, the awk program allows you to specify actions to perform on matching lines. Actions can operate on the substring(s) that matched or on other portions of the line, or they can perform other operations entirely. In essence, awk provides the user with a programming language so that the user can specify condition and action pairs to do whatever is desired. We also see in this section that awk does not need to use regular expressions for matches but instead can use different forms of conditions.
Overall, awk (named after its authors Aho, Weinberger, and Kernighan) is a more powerful tool than either grep or sed, giving you many of the same features as a programming language. You are able to search files for literal strings, regexes, and conditions such as whether a particular value is less than or greater than another. Based on a matching line, awk then allows you to specify particular fields of that line to manipulate and/or output, use variables to store values temporarily, and perform calculations on those variables, outputting the results of the calculations.
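As a preview of these ideas, here is a minimal sketch that uses a regex match, a numeric comparison, and a variable with a calculation. The two-column file of names and scores is invented for illustration, as are the shell variable names; $1 and $2 are awk's notation for the first and second fields of a line, and NR is awk's built-in count of lines read.

```shell
# Hypothetical data: a name and a score on each line.
scores_file=$(mktemp)
printf '%s\n' 'alice 82' 'bob 47' 'carol 91' > "$scores_file"

# Regex match: print the first field of lines that start with "c".
regex_hits=$(awk '/^c/ { print $1 }' "$scores_file")

# Comparison instead of a regex: names whose score is at least 50.
passing=$(awk '$2 >= 50 { print $1 }' "$scores_file")

# Variable and calculation: accumulate the scores, then print the average.
average=$(awk '{ sum += $2 } END { printf "%.1f\n", sum / NR }' "$scores_file")

printf '%s\n' "$regex_hits" "$passing" "$average"
rm -f "$scores_file"
```

The first command prints carol, the second prints alice and carol on separate lines, and the third prints 73.3.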
Unlike grep and sed, awk expects that the text file is not just a sequence of lines but that each line is separated into fields (or columns). In this way, awk is intended for use on tabular information rather than ordinary text files. While you could potentially use awk on a text file, say a text document or an email message, its real power emerges when it is used on a file that looks like a spreadsheet or database, with distinct rows and columns. Figure 6.4 illustrates this idea, where we see two matching lines with awk operating on the values of other fields from those lines.
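The idea of matching on one field and then operating on other fields of the same line can be sketched as follows. The four-column table (item, region, quantity, unit price) is a made-up example and is not the data shown in Figure 6.4; awk splits each line into fields on whitespace by default.

```shell
# Hypothetical table: item, region, quantity, unit price.
sales_file=$(mktemp)
printf '%s\n' 'widget east 10 2.50' 'gadget west 4 10.00' 'widget west 6 2.50' > "$sales_file"

# Match rows whose first field is "widget", then work with the other
# fields of those rows: print the region and quantity times price.
widget_rows=$(awk '$1 == "widget" { print $2, $3 * $4 }' "$sales_file")

printf '%s\n' "$widget_rows"
# east 25
# west 15

rm -f "$sales_file"
```

Two of the three lines match, and for each of them awk computes a value from fields that played no part in the match.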
6.6.1 Simple awk Pattern-Action Pairs
The structure of a simple awk command will look something like this:
awk ‘/