Linux with Operating System Concepts



Linux-with-Operating-System-Concepts-Fox-Richard-CRC-Press-2014

filename
NOTE: the … indicates additional uppercase words that we want to parenthesize.
Just how many additional -e /pattern/replacement/ pairs would we need? The problem is that we want to replace any fully capitalized word, so the … would have to stand for every possible uppercase word. That would be tens of thousands of entries (or, if we allowed any combination of uppercase letters, an unbounded list). It is clearly ridiculous to try to enumerate every possibility. In fact, we can easily represent “uppercase words” with the single regular expression [A-Z]+.
What we need is a mechanism in sed that can represent the matched string, that is, the portion of the input that matched the regular expression. If we could denote the matched portion with a placeholder, we could then use that placeholder in the replacement string to indicate “use the matched string here.” sed does have such a mechanism: the & character.
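We can now solve our problem much more simply using a single /pattern/replacement/ pair. The pattern is [A-Z]+. The replacement is to put the string in parentheses, so we will use (&). This indicates a replacement of “open paren, the matched string, close paren.” Our sed command is now much simpler.
sed -E 's/[A-Z]+/(&)/g' filename
The -E option selects extended regular expressions, in which + means “one or more.” Without it, GNU sed treats + as a literal plus sign and expects the operator to be written \+.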
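As a quick check of how & behaves, the following sketch runs the command on a single sample line instead of a file. The sample text and the variable names are made up for illustration, and the use of LC_ALL=C is an added precaution rather than something the text requires.

```shell
# Hypothetical sample line; any text containing capitalized words will do.
line='Send the REPORT to NASA by Friday'

# -E selects extended regular expressions so that + means "one or more".
# LC_ALL=C keeps [A-Z] restricted to the ASCII capital letters.
# & in the replacement stands for whatever text the pattern matched.
result=$(printf '%s\n' "$line" | LC_ALL=C sed -E 's/[A-Z]+/(&)/g')

printf '%s\n' "$result"
# prints: (S)end the (REPORT) to (NASA) by (F)riday
```

Notice that [A-Z]+ also matches the lone capital at the start of Send and Friday, so those letters are parenthesized as well.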
There are modifiers that can be applied to &. These modifiers are \U, \u, \L, and \l, which respectively upper case the entire string, capitalize the string (its first letter), lower case the entire string, or lower case the first letter of the string. For instance,
sed 's/[A-Z][a-z]*/\L&/g' filename
will find any capitalized words and lower case them. Notice that we could also use \l, since we are only seeking words that start with a capital letter followed by lowercase letters, and therefore there is only one letter to lower case. The modifiers \U, \u, \L, and \l only affect letters found in the matched strings, not digits or punctuation marks.
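The four modifiers can be compared side by side with a short sketch. It assumes GNU sed, since the case modifiers are GNU extensions; the sample strings and variable names are invented for illustration.

```shell
# \U, \u, \L and \l in the replacement are GNU sed extensions.
# LC_ALL=C keeps [A-Z] and [a-z] restricted to ASCII letters.

# \L lower cases the whole match; \l lower cases only its first letter.
# For a pattern of one capital followed by lowercase letters, both agree.
lowered_all=$(printf '%s\n' 'Homer Simpson Lives In Springfield' |
  LC_ALL=C sed 's/[A-Z][a-z]*/\L&/g')
lowered_first=$(printf '%s\n' 'Homer Simpson Lives In Springfield' |
  LC_ALL=C sed 's/[A-Z][a-z]*/\l&/g')

# \u capitalizes the first letter of each match.
capitalized=$(printf '%s\n' 'frank zappa' |
  LC_ALL=C sed 's/[a-z][a-z]*/\u&/g')

# \U upper cases the whole match; digits and punctuation pass through.
shouted=$(printf '%s\n' 'route 66, exit 4b!' |
  LC_ALL=C sed 's/.*/\U&/')

printf '%s\n' "$lowered_all" "$lowered_first" "$capitalized" "$shouted"
```

The first two results are identical, which illustrates the point made above about \l being sufficient for this particular pattern.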
Aside from the &, sed offers us another facility for representing portions of the matched string. This can be extremely useful if we want to alter one portion of a matching string, or rearrange the string. To denote a portion of a string, we must indicate which portion(s) we are interested in. We denote this in our regular expression by placing that portion of the pattern inside \( and \) marks. We can then reference that matching portion using \1. In sed, we can mark up to nine parts of a pattern and refer to them using \1, \2, and so on up through \9.
Let us assume the file names.txt contains information about people including their full names in the form first_name middle_initial last_name, such as Homer J. Simpson or Frank V. Zappa. Not all entries will have middle initials, such as Michael Keneally. We want to delete the period from any middle initial when found. To locate middle initials, we would seek [A-Z]\. to represent the initial and the period. We embed the [A-Z] in \(\) to mark it. This gives us the pattern
\([A-Z]\)\.
Although this is somewhat ugly looking, it is necessary. Now to remove the period, the replacement string is \1. The \1 refers to whatever matched \([A-Z]\). The rest of the match, the \., is not part of the replacement and so is omitted. Our sed command is
sed 's/\([A-Z]\)\./\1/' names.txt
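To see the command at work, the sketch below builds a small stand-in for names.txt from the three names mentioned in the text and runs the substitution over it. The temporary file and the variable names are assumptions made for the example.

```shell
# Build a throwaway stand-in for names.txt.
names=$(mktemp)
printf '%s\n' 'Homer J. Simpson' 'Frank V. Zappa' 'Michael Keneally' > "$names"

# \( \) marks the capital letter; \1 puts it back without the period.
# LC_ALL=C keeps [A-Z] restricted to the ASCII capital letters.
result=$(LC_ALL=C sed 's/\([A-Z]\)\./\1/' "$names")

printf '%s\n' "$result"
# prints:
# Homer J Simpson
# Frank V Zappa
# Michael Keneally

rm -f "$names"
```

The entry without a middle initial passes through unchanged because nothing in it matches a capital letter followed by a period.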
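This sed instruction says “find the first occurrence of a capital letter followed by a period in each line and replace it with just the capital letter.” Note that if a person’s first name was also given as an initial, as in H. J. Simpson, this replacement would result in H J. Simpson. If we want to ensure that we delete all periods from initials, we would have to add g after the last / in the sed command for a global replacement. Alternatively, if we want to ensure that only a middle initial’s period is deleted, we would have to be more clever. A revised expression could be
sed 's/\([A-Z][a-z.]\+ [A-Z]\)\./\1/' names.txt
Here the marked portion covers the first name or first initial (a capital letter followed by one or more lowercase letters or periods), a space, and the middle initial, so only the period after the middle initial falls outside \1 and is dropped. The repetition operator is written \+ because the \( \) marks belong to sed's basic regular expressions, where GNU sed treats an unescaped + as a literal plus sign.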
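The three variants discussed here can be compared on the H. J. Simpson case. This is a sketch assuming GNU sed (for \+ in a basic regular expression) and a single space between the first name and the middle initial; the variable names are illustrative.

```shell
line='H. J. Simpson'

# Without g, only the first capital-plus-period on the line is changed.
first_only=$(printf '%s\n' "$line" | LC_ALL=C sed 's/\([A-Z]\)\./\1/')

# With g, every initial loses its period.
all_initials=$(printf '%s\n' "$line" | LC_ALL=C sed 's/\([A-Z]\)\./\1/g')

# The revised pattern keeps the first name or initial intact and drops
# only the period that follows the middle initial.
middle_only=$(printf '%s\n' "$line" |
  LC_ALL=C sed 's/\([A-Z][a-z.]\+ [A-Z]\)\./\1/')

printf '%s\n' "$first_only" "$all_initials" "$middle_only"
# prints:
# H J. Simpson
# H J Simpson
# H. J Simpson
```

The same revised pattern turns Homer J. Simpson into Homer J Simpson and leaves Michael Keneally alone, since that line has no period for the final \. to match.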
Let us now consider an example of referencing multiple portions of the pattern. We want 
to rearrange the names.txt file so that the names are changed from the format
