Linux with Operating System Concepts



Linux-with-Operating-System-Concepts-Fox-Richard-CRC-Press-2014

empty string. Thus, this regex only matches the empty string.
6.2.4 Matching from a List of Options
So far, our expressions have allowed us to match against strings that have a variable number of characters, but only if the characters appear in the order specified. For instance, a+b+c+ can match one or more a’s, then b’s, then c’s, but only if they appear in that order. What if we want to match any string that contains any number of a’s, b’s, and c’s, but in no particular order? If we want to match any three-character sequence that consists of only a’s, b’s, and c’s, as in abc, acb, bca, and so forth, we will need an additional metacharacter.
The [ ] metacharacters, often referred to as brackets or square brackets, allow us to specify a list of options. The list indicates that the next character in the string can match any single character in the list.
Inside of the brackets we can specify characters using one of three notations: an enumerated list as in [abcd], a range as in [a-d], or a class such as [[:alpha:]]. The class :alpha: represents all alphabetic characters. Obviously we would not use :alpha: if we only wanted to match against a subset of letters like a-d. Alternatively, we could use a-zA-Z to indicate all letters. Notice that when describing a class, we use double brackets instead of single brackets: the outer pair delimits the list and the inner [: :] pair names the class.
Consider the regular expression [abc][abc][abc]. This expression will match any string that contains three consecutive characters that are a’s, b’s, or c’s in any combination. This expression will match abc, acb, and bca. And, because we are not restricting the number of times any character appears, it will also match aaa, bbb, aab, aca, and so forth. We could also use a range to define this expression, as in [a-c][a-c][a-c].
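The behavior of this expression can be tried out with grep. The following is only a sketch, assuming a POSIX-style grep with the -E option; the function name three_abc and the sample strings are made up for illustration.

```shell
# Illustrative check of [abc][abc][abc]; the sample words are assumptions.
export LC_ALL=C   # keep bracket lists and ranges byte-based and predictable

three_abc() {
    # Print only the arguments that contain three consecutive
    # characters drawn from a, b, and c.
    printf '%s\n' "$@" | grep -E '[abc][abc][abc]'
}

# Prints abc, acb, bca, aaa, and xxabcxx; "ab" is too short and
# "a1b2c3" never has three such characters in a row.
three_abc abc acb bca aaa xxabcxx ab a1b2c3
```

Replacing the regex with [a-c][a-c][a-c] should select the same lines.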
We can combine [ ] with *, +, and ? to control the number of times we expect the characters to appear. For instance, [abc]+ will match any string that contains a sequence of 1 or more characters in the set a, b, c while [abc]* will also match the empty string. In this latter case, we actually have a regular expression that will match anything because any string can contain 0 a’s, b’s, and c’s. For instance, 12345 contains no a’s, b’s, or c’s, and so it can match [abc]* when * is interpreted as 0.
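The difference between + and * on a bracketed list can be seen in a short test. This is a sketch under the assumption that grep -E is available; the helper name contains is hypothetical.

```shell
export LC_ALL=C

# Hypothetical helper: succeeds (exit status 0) when the string in $2
# contains a match for the extended regex in $1.
contains() {
    printf '%s\n' "$2" | grep -Eq "$1"
}

contains '[abc]+' 'cab'   && echo "[abc]+ matches cab"
contains '[abc]+' '12345' || echo "[abc]+ does not match 12345"
contains '[abc]*' '12345' && echo "[abc]* matches 12345 (zero occurrences)"
contains '[abc]*' ''      && echo "[abc]* matches the empty string"
```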
Now we have a means of expressing a regular expression where order is not important. The expression [abc]+ will match any of these four strings that we saw earlier that matched a*b*c*:
• aaaabbbbcccc
• abcccc
• accccc
• aaaaaabbbbbb
This expression will also match strings like the following.
• abcabcabcabc
• abacab
• aaaaaccccc
• a
• cccccbbbbbbaaaa
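To compare the ordered and unordered expressions on these strings, one option is to count matching lines with grep -c. In this sketch both expressions are wrapped in ^ and $ so that the whole line must match; that anchoring, and the function name count_matches, are assumptions made for the comparison.

```shell
export LC_ALL=C

# Hypothetical helper: print how many of the given strings match the
# regex in $1. grep -c exits non-zero when the count is 0, so that
# status is deliberately ignored.
count_matches() {
    regex=$1
    shift
    printf '%s\n' "$@" | grep -Ec "$regex" || true
}

# Strings whose letters appear in a, b, c order.
count_matches '^a*b*c*$' aaaabbbbcccc abcccc accccc aaaaaabbbbbb   # 4
count_matches '^[abc]+$' aaaabbbbcccc abcccc accccc aaaaaabbbbbb   # 4

# Strings whose letters are out of order.
count_matches '^a*b*c*$' abcabcabcabc abacab cccccbbbbbbaaaa       # 0
count_matches '^[abc]+$' abcabcabcabc abacab cccccbbbbbbaaaa       # 3
```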
We can combine any characters in the brackets as in [abcxyz], [abcd1234], or [abcdABCD]. If we have a number of characters to enumerate, a range is more practical. We would certainly prefer to use a range like [a-z] rather than list all of the letters. We can also combine ranges and enumerations. For instance, the three sequences above could also be written as [a-cx-z], [a-d1-4], and [a-dA-D], respectively. Now consider the list of all lower case consonants. We could enumerate them all as [bcdfghjklmnpqrstvwxyz] or we could use several ranges as in [b-df-hj-np-tv-z].
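One way to confirm that the enumerated and ranged consonant lists agree is to run both over the lower case alphabet. This is a minimal sketch assuming grep -E and tr; the variable names are illustrative, and LC_ALL=C is set because ranges such as b-d can behave differently in other locales.

```shell
export LC_ALL=C

# One lower case letter per line.
letters=$(printf '%s\n' a b c d e f g h i j k l m n o p q r s t u v w x y z)

# Keep the letters each bracket list accepts, then join them on one line.
enumerated=$(printf '%s\n' "$letters" | grep -E '^[bcdfghjklmnpqrstvwxyz]$' | tr -d '\n')
ranged=$(printf '%s\n' "$letters" | grep -E '^[b-df-hj-np-tv-z]$' | tr -d '\n')

echo "$enumerated"
echo "$ranged"
```

Both lines should print the same 21 consonants.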
While we can use ranges for letters and digits, there is no range available for the punctuation marks. You could enumerate all of the punctuation marks in brackets to capture “any punctuation mark” but this would be tedious. Instead, we also have a class named :punct: which is applied in double brackets, as in [[:punct:]]. Table 6.2 provides a listing of the classes available in Linux.
Let us now combine all of the metacharacters we have learned with some examples. We want to find a string that consists only of letters. We can use ^[a-zA-Z]+$ or ^[[:alpha:]]+$. The ^ and $ force the regex to match an entire string. Thus, any string that contains nonletters will not match. If we had used only [a-zA-Z]+, then it could match any string that contains letters but could also have other characters that precede or succeed the letters such as abc123, 123abc, abc!def, as well as ^#!$a*%&. Why do we use the + in this regex? If we had used *, this could also match the empty string, that is, a string with no characters. The + insists that there be at least one letter and the ^ and $ insist that the only characters found are letters.
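The effect of the anchors can be observed by running the anchored and unanchored expressions over the same strings. The sketch below assumes grep -E; the helper name matches is hypothetical.

```shell
export LC_ALL=C

# Hypothetical helper: succeeds when the string in $2 matches the regex in $1.
matches() {
    printf '%s\n' "$2" | grep -Eq "$1"
}

for s in abc abc123 123abc 'abc!def'; do
    if matches '^[a-zA-Z]+$' "$s"; then whole=yes; else whole=no; fi
    if matches '[a-zA-Z]+' "$s"; then part=yes; else part=no; fi
    echo "$s: anchored=$whole unanchored=$part"
done

# With * in place of +, the anchored regex also accepts the empty string.
matches '^[a-zA-Z]*$' '' && echo "empty string matches with *"
matches '^[a-zA-Z]+$' '' || echo "empty string does not match with +"
```

Only abc should report anchored=yes, while all four strings report unanchored=yes.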
We could similarly match a string of only binary digits. Binary digits are 0 and 1. So instead of [a-zA-Z] or [[:alpha:]], we use [01]. The regex is ^[01]+$. Again, we use the ^ and $ to force the expression to match entire strings and we use + instead of * to disallow the empty string. If we wanted to match strings that comprised solely digits, but any digits, we would use either ^[0-9]+$ or ^[[:digit:]]+$.
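These two expressions lend themselves to simple input validation in a script. The following is a sketch only, assuming grep -E; the function names is_binary and is_number and the sample values are invented for the example.

```shell
export LC_ALL=C

# Hypothetical validators built on the two regexes from the text.
is_binary() { printf '%s\n' "$1" | grep -Eq '^[01]+$'; }
is_number() { printf '%s\n' "$1" | grep -Eq '^[[:digit:]]+$'; }

for s in 1011 1021 2022 12a4; do
    if is_binary "$s"; then
        echo "$s: binary"
    elif is_number "$s"; then
        echo "$s: digits only"
    else
        echo "$s: neither"
    fi
done
```

Here 1011 is reported as binary, 1021 and 2022 as digits only, and 12a4 as neither.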
If we want to match a string of only punctuation marks, we would use ^[[:punct:]]+$. Unlike the previous examples, we would not use […] and enumerate the list of punctuation marks. Why not? There are too many and we might (carelessly) miss some. There is no range to indicate all punctuation marks, such as [!-?], so we must either list them all, or use :punct:.
If we want to match a string that consists only of digits and letters where the digits precede the letters, we would use ^[0-9]+[[:alpha:]]+$. If we wanted to match a string that consists only of letters and digits where the first character must be a letter and then can be followed by any (0 or more) letters and digits, we would use ^[[:alpha:]][0-9a-zA-Z]*$.
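The two patterns can be contrasted by classifying a few strings. This is a minimal sketch assuming grep -E; the function name classify, its labels, and the sample strings are all hypothetical.

```shell
export LC_ALL=C

# Hypothetical classifier using the two regexes from the text.
classify() {
    if printf '%s\n' "$1" | grep -Eq '^[0-9]+[[:alpha:]]+$'; then
        echo "digits-then-letters"
    elif printf '%s\n' "$1" | grep -Eq '^[[:alpha:]][0-9a-zA-Z]*$'; then
        echo "letter-first"
    else
        echo "neither"
    fi
}

classify 123abc   # digits-then-letters
classify x27b     # letter-first
classify q        # letter-first (zero following characters)
classify 9        # neither (no letter after the digits)
classify a_b      # neither (underscore is not a letter or digit)
```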
6.2.5 Matching Characters That Must Not Appear
In some cases, you will have to express a pattern that seeks to match a string that does not contain specific character(s). We might want to match a string that has no blank spaces in it. You might think to use [. . .]+ where the . . . is “all characters except the blank space.” That would require enumerating quite a list as it would have to include every letter, every digit, and every punctuation mark. In such a case, we would prefer to indicate “no space” by using the notation [^ ]. The ^, when used as the first character inside of [], means “do not match” against the characters listed in the brackets. The blank space after ^ indicates that the only character we do not want to match against is the blank.
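As a small illustration of what [^ ] accepts as a single character, the sketch below feeds grep one character per line. It assumes grep -E; the function name nonblank_chars is made up, and the ^ and $ outside the brackets are the usual anchors, added here so that each one-character line is tested as a whole.

```shell
export LC_ALL=C

# Hypothetical helper: each argument is a single character; print the
# ones that the negated list [^ ] accepts.
nonblank_chars() {
    printf '%s\n' "$@" | grep -E '^[^ ]$'
}

# Prints a, Z, 7, and !; the blank is the only character rejected.
nonblank_chars a Z 7 '!' ' '
```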
Unfortunately, our regex [^ ] will have the same flaw as earlier expressions in that if it locates any single nonblank character within the string, it is a match to the string. If our string is “hi there,” the [^ ] regex will match the ‘h’ at the beginning of the string because it
TABLE 6.2 
Classes Defined for Regular Expressions
