Linux with Operating System Concepts



Linux-with-Operating-System-Concepts-Fox-Richard-CRC-Press-2014

global regular expression print. The idea behind grep is to search one or more files line-by-line to match against a specified regular expression. In this case, the string being matched against is the entire line. If the regular expression matches any part of the line, the entire line is returned. The grep program can search multiple files and will return all matching lines. This is a convenient way to search files for specific types of content.
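As a small illustration of this line-oriented behavior, the sketch below pipes three made-up lines to grep. The sample text and the variable name matches are assumptions chosen for the example.

```shell
# Three sample lines (hypothetical data) are piped to grep.
# The pattern 'cat' matches part of two lines; grep prints each whole line.
matches=$(printf '%s\n' 'the cat sat' 'a dog barked' 'concatenate files' | grep 'cat')
printf '%s\n' "$matches"
```

Both "the cat sat" and "concatenate files" are returned in full, even though the pattern matched only three characters of each.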
6.4.1 Using grep/egrep
The format for grep is
grep [options] regex file(s)
The regex is any regular expression that does not include the extended set of metacharacters (such as {}). To use the extended regular expression set, either use grep -E or egrep. The -E option denotes “use egrep,” so the two are identical. Because egrep can apply all metacharacters, it is preferable to always use egrep (or grep -E). We will examine the other options later in this section.
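The difference between the basic and extended sets can be seen with a short experiment. The following is a minimal sketch; the two sample lines and the variable names are assumptions made for the example.

```shell
# Two sample lines (hypothetical data): "aa" and the literal text "a{2}".
sample=$(printf '%s\n' 'aa' 'a{2}')

# Without -E, { and } are ordinary characters, so only the literal text matches.
basic=$(printf '%s\n' "$sample" | grep 'a{2}')

# With -E, {2} means "exactly two of the preceding item", so "aa" matches.
extended=$(printf '%s\n' "$sample" | grep -E 'a{2}')

printf 'basic: %s\nextended: %s\n' "$basic" "$extended"
```

The same pattern text selects a different line depending on whether -E is present, which is one reason to settle on egrep (or grep -E) throughout.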
In addition, it is good practice to place your regex inside single quotes (' '), as in egrep 'regex' file(s). This prevents confusion over metacharacters that Bash also treats as wildcards, such as * and ?. Recall that the Bash interpreter performs filename expansion before executing an instruction. This could result in a * or ? being applied by the Bash interpreter, which replaces the wildcard pattern with a list of matching files, so the character never becomes part of the regular expression. We will examine this problem in more detail at the end of this section. For now, all regular expressions will be placed inside single quote marks.
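To see filename expansion interfere with a pattern, consider the sketch below. It uses echo rather than egrep so that we can view exactly what a command would receive. The scratch directory and the file names 1a and 2b are hypothetical.

```shell
# Create a scratch directory holding two hypothetical files.
dir=$(mktemp -d)
touch "$dir/1a" "$dir/2b"

# Unquoted: Bash expands [0-9]* to the matching file names before echo runs.
unquoted=$(cd "$dir" && echo [0-9]*)

# Quoted: the characters reach the command unchanged.
quoted=$(cd "$dir" && echo '[0-9]*')

printf 'unquoted: %s\nquoted: %s\n' "$unquoted" "$quoted"
rm -rf "$dir"
```

In the unquoted case, the command receives the file names 1a and 2b in place of the pattern. Had the command been egrep, it would have searched 2b for the string 1a.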


Regular Expressions

We have a series of files, some of which include financial information. Assume that such files contain dollar amounts in the form $number.number, as in $1234.56 and $12.00. If we want to quickly identify these files, we might issue the grep instruction
egrep '\$[0-9]+\.[0-9]{2}' *
The \ before the $ is needed because an unescaped $ is the metacharacter that anchors a match to the end of the line.
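The sketch below tries a dollar-amount search on two throwaway files. It assumes the dollar sign is escaped as \$ so that it is taken literally, and it adds the standard -l option, which lists only the names of matching files. The file names and their contents are hypothetical.

```shell
# Scratch directory with two hypothetical files; only one holds a dollar amount.
dir=$(mktemp -d)
printf 'Invoice total: $1234.56\n' > "$dir/invoice.txt"
printf 'Meeting at 12.00 sharp\n' > "$dir/notes.txt"

# \$ matches a literal dollar sign; -l prints file names instead of lines.
found=$(cd "$dir" && grep -El '\$[0-9]+\.[0-9]{2}' *)

printf 'files with dollar amounts: %s\n' "$found"
rm -rf "$dir"
```

Only invoice.txt is reported. The 12.00 in notes.txt has the right digits but no dollar sign in front of it.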
As another example, we want to locate files that contain an address. In this case, let us assume the address will contain the zip code 41099. The obvious instruction is
egrep '41099' *
However, this instruction will match any file that contains that five-digit sequence of numbers. Since this is part of an address, it should only appear within an address, which will include city and state. The 41099 zip code is part of Highland Heights, KY. So we might further refine our regular expression with the following instruction (written here with two spaces between KY and the zip code):
egrep 'Highland Heights, KY  41099' *
In this case, though, our regular expression might be too precise. Perhaps some addresses placed a period after KY and others have only one space after KY. In yet others, Highland Heights may not appear on the same line. We can resolve the first two issues by using a list of “or” possibilities, as in KY |KY|KY\. and we can resolve the last problem by removing Highland Heights entirely. Now our instruction is
egrep '(KY |KY|KY\.) 41099' *
The use of the () makes it clear that there are three choices, KY with a space, KY without a space, and KY with a period, followed by a space and 41099.
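We can check the three choices against some sample addresses. In this sketch, the four address lines are invented for the example, and the standard -c option is used to count matching lines rather than print them.

```shell
# Four hypothetical address lines with different spacing and punctuation.
addresses=$(printf '%s\n' \
  'Highland Heights, KY  41099' \
  'Highland Heights, KY 41099' \
  'Highland Heights, KY. 41099' \
  'Cincinnati, OH 45202')

# The alternation accepts "KY ", "KY", or "KY." before the space and zip code.
count=$(printf '%s\n' "$addresses" | grep -E -c '(KY |KY|KY\.) 41099')
printf 'matching lines: %s\n' "$count"
```

The first three lines match, one for each choice inside the parentheses, and the Cincinnati line does not.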
Let us now turn to a more elaborate example. Here, we wish to use egrep to help us locate a particular file. In this case, we want to find the file in /etc that stores the DNS name server IP address(es). We do not recall the file name, and there are hundreds of files in this directory to examine. The simple approach is to let egrep find any file that contains an IP address and report it by name. How do we express an IP address as a regular expression?
An IP address is of the form 1.2.3.4, where each number can range between 0 and 255. We will want to issue the egrep command
egrep 'regex-for-ip-address' /etc/*
where regex-for-ip-address is the regular expression that we come up with that will match an IP address. The instruction will return all matching lines from all of the files, each preceded by the name of its file. This list should include the matching line(s) from the file that we are looking for (which is resolv.conf).


An IP address (version 4) consists of four numbers separated by periods. If we could permit any number, the regular expression could be
[0-9]+.[0-9]+.[0-9]+.[0-9]+
However, this is not accurate because the . represents any character. We really mean “a period,” so we want the period interpreted literally. We need to modify this by using \. or [.]. Our regular expression is now
[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
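The effect of escaping the period can be checked with two sample lines. This is a minimal sketch; the input lines and variable names are assumptions for the example, and -c (a standard option) counts matching lines.

```shell
# Two hypothetical lines: a dotted address and digits separated by letters.
lines=$(printf '%s\n' '10.11.12.13' '1a2b3c4')

# Unescaped . matches any character, so both lines match.
loose=$(printf '%s\n' "$lines" | grep -E -c '[0-9]+.[0-9]+.[0-9]+.[0-9]+')

# Escaped \. matches only a period, so only the dotted address matches.
strict=$(printf '%s\n' "$lines" | grep -E -c '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')

printf 'loose: %s\nstrict: %s\n' "$loose" "$strict"
```

The unescaped version accepts 1a2b3c4 because each . is free to match a letter.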
The above regular expression certainly matches any IP address, but in fact it can match against any four numbers that are separated by periods. The four numbers that make up the IP address must be within the range of 0–255. How can we express that the number must be no greater than 255? Your first thought may be to use [0-255]. Unfortunately, that does not work nor does it make sense. Recall that in [] we enumerate a list of choices to match one character in the string. The expression [0-255] can match one of three different sets of single characters: a character in the range 0–2, the character 5, and the character 5. This expression is equivalent to [0-25] or [0125] as that second 5 is not needed. Obviously, this expression is not what we are looking for.
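A quick experiment confirms that [0-255] behaves like [0125]. In this sketch, the variable names are assumptions for the example, and the standard -x option is used so that a line matches only if the whole line fits the pattern.

```shell
# Feed the digits 0 through 9, one per line, to each bracket expression.
digits=$(printf '%s\n' 0 1 2 3 4 5 6 7 8 9)

# Collect the digits each expression accepts, joined into one string.
a=$(printf '%s\n' "$digits" | grep -E -x '[0-255]' | tr -d '\n')
b=$(printf '%s\n' "$digits" | grep -E -x '[0125]' | tr -d '\n')

printf '[0-255] accepts: %s\n[0125] accepts: %s\n' "$a" "$b"
```

Both expressions accept exactly the single characters 0, 1, 2, and 5.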
What we need to do is express that the item to match can range from 0, a single digit, 
all the way up to 255, three digits long. How do we accomplish this? Let us consider this 
solution:
[0-9]{1,3}
This regular expression will match any sequence of one to three digits. This includes 0, 1, 2, 10, 11, 12, 20, 21, 22, 99, 100, 101, 102, 201, 202, and 255, so it appears to work. Unfortunately, it is too liberal an expression because it also matches 256, 257, 258, 301, 302, 400, 401, 500, 501, 502, 998, and 999 (as well as 000, 001, up through 099), none of which are permissible as parts of an IP address.
If we do not mind our regular expression matching strings that it should not, then we 
can solve our problem with the command
egrep '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /etc/*
We are asking grep to search all of the files in /etc for a string that consists of 1–3 digits, a 
period, 1–3 digits, a period, 1–3 digits, a period and 1–3 digits. This would match 1.2.3.4 or 
10.11.12.13 or 172.31.185.3 for instance, all of which are IP addresses. It would also match 
999.998.997.996 which is not an IP address.
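Rather than searching /etc, we can try this expression on a few sample lines. The sketch below is illustrative only; the input lines and variable names are assumptions for the example.

```shell
# Hypothetical input: three valid addresses, one out-of-range, one non-address.
candidates=$(printf '%s\n' '1.2.3.4' '10.11.12.13' '172.31.185.3' \
  '999.998.997.996' 'version 2.6')

# Keep the lines that contain four dot-separated groups of one to three digits.
matched=$(printf '%s\n' "$candidates" |
  grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}')

printf '%s\n' "$matched"
```

Four lines are printed. The line "version 2.6" is correctly skipped, but 999.998.997.996 is accepted along with the three legitimate addresses.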
If we want to build a more precise regular expression, then we have some work to do. Let us consider again the range of possible numbers in 0–255. First, we have 0–9. That is easily captured as [0-9]. Next, we have 10–99. This is also easily captured as [1-9][0-9]. We could use the expression [0-9]|[1-9][0-9].
What about 100–255? Here, we have a little bit more of a challenge. We cannot just use [1-2][0-9][0-9] because this would range from 100 to 299. So we need to enumerate even more possible sequences. First, we would have 1[0-9][0-9] to allow for any value from 100 to 199. Next, we would have 2[0-4][0-9] to allow for any combination from 200 to 249. Finally, we would have 25[0-5] to allow for the last few combinations, 250–255.
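As a sanity check on these five pieces, the sketch below counts how many of the numbers from 0 to 999 are accepted by at least one of them. It assumes the seq utility is available and uses three standard grep options: -e to supply each piece as a separate pattern, -x to require that the whole line match, and -c to count matching lines. The variable name count is an assumption for the example.

```shell
# Generate 0..999, one number per line, and count the lines that match
# at least one of the five pieces in full.
count=$(seq 0 999 | grep -E -x -c \
  -e '[0-9]' \
  -e '[1-9][0-9]' \
  -e '1[0-9][0-9]' \
  -e '2[0-4][0-9]' \
  -e '25[0-5]')

printf 'numbers accepted: %s\n' "$count"
```

The count is 256, which is 10 + 90 + 100 + 50 + 6, exactly the values 0 through 255.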
Let us put all this together to find a regex for any legal IP address octet. Combining the 