Python Projects for Beginners: A Ten-Week Bootcamp Approach to Python Programming




Installing Beautiful Soup


To install Beautiful Soup, first make sure your virtual environment is activated, then run the following command in the terminal:

$ pip install bs4

After running the command, it should install a few packages that Beautiful Soup requires.
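One way to confirm the installation succeeded is to import the package and print its version. This is a minimal sketch, assuming it is run in the same activated environment; the variable name installed_version is only illustrative.

```python
# Quick check that the install worked (run inside the activated environment).
import bs4

# __version__ holds the version string of the installed Beautiful Soup package.
installed_version = bs4.__version__
print("Beautiful Soup version:", installed_version)
```

If the import raises ModuleNotFoundError, the package was most likely installed into a different environment than the one the notebook is using.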

Importing Beautiful Soup


To follow along with the rest of this lesson, let’s open our previous notebook file, “Week_10,” and add a markdown cell at the bottom that says, “Web Scraping.”
Chapter 10: Introduction to Data Analysis
We need to import requests and the BeautifulSoup class that is within the bs4 library:

# importing the BeautifulSoup class and the requests library
from bs4 import BeautifulSoup
import requests

Go ahead and run the cell. We’ll use the requests module to send a request to a given URL. When the URL is not an API endpoint that returns structured data but rather a web page rendered from HTML and CSS, the response we get back is the source code for that page. To parse this code, we pass it into a BeautifulSoup object, which makes the markup easy to search and traverse.
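To make the parsing step concrete, here is a minimal sketch that skips the network request and feeds a small hand-written HTML string to BeautifulSoup. The HTML snippet, the variable names, and the choice of Python's built-in "html.parser" are assumptions for illustration; they are not taken from the lesson.

```python
from bs4 import BeautifulSoup

# Hypothetical page source standing in for what a real request would return.
html = (
    "<html><head><title>Example Poem</title></head>"
    "<body><p class='line'>First line</p><p class='line'>Second line</p></body></html>"
)

# Build the parse tree using the parser that ships with Python.
soup = BeautifulSoup(html, "html.parser")

# Pull the text out of the <title> tag and out of every <p class="line"> tag.
title_text = soup.title.get_text()
lines = [p.get_text() for p in soup.find_all("p", class_="line")]

print(title_text)
print(lines)
```

Once the markup is wrapped in a BeautifulSoup object, tags can be reached by name (soup.title) or searched for with find_all, rather than being picked out of one long string by hand.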

Requesting Page Content


To begin scraping data, let’s send a request to a simple web page that contains only a poem:

# performing a request and outputting the status code
page = requests.get("http://www.arthurleej.com/e-love.html")
print(page)

Go ahead and run the cell. We’ll get an output of “<Response [200]>”. The 200 status code lets us know that the request to the web page was a success. To see what we received back as a response, though, we need to access the content attribute of the page variable:
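Rather than reading the printed status by eye, the status code can also be checked in code. The sketch below is one possible approach: request_succeeded is a hypothetical helper, and the Response object is built by hand as a stand-in so the example runs without network access.

```python
import requests


def request_succeeded(response):
    """Return True when the response carries the HTTP 200 (OK) status code."""
    return response.status_code == 200


# Stand-in response built by hand so the sketch runs offline; in the notebook,
# the object returned by requests.get() plays this role.
sample = requests.Response()
sample.status_code = 200

print(sample)
print(request_succeeded(sample))
```

In the notebook, the same check could be applied to the page variable before trying to read its content.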

# outputting the request response content
print(page.content)

Go ahead and run the cell. This will output a large byte string containing all the code that was used to write this web page, including tags, styles, scripts, etc. As stated earlier, this URL renders a web page, so the response we get back is the page's source code. The next step is to turn the response into an object that we can work with so that we can parse through the data.
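Because the content attribute holds raw bytes, printing it in full can flood the notebook. A minimal sketch of previewing only the start of the response follows; raw_content is a hypothetical stand-in for page.content, and UTF-8 is an assumed encoding.

```python
# Hypothetical bytes standing in for page.content, so the sketch runs offline.
raw_content = (
    b"<html><head><title>Poem</title></head>"
    b"<body><p>Roses are red</p></body></html>"
)

# Decode the bytes into a regular string (UTF-8 is assumed here).
decoded = raw_content.decode("utf-8")

# Show only the first 30 characters instead of the whole page.
preview = decoded[:30]
print(type(raw_content))
print(preview)
```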
