Searching the Internet for data
With an estimated size of almost 50 billion pages
(http://www.worldwidewebsize.com/), the web isn’t easy to represent. Studies
describe the web as a bowtie-shaped graph (see
http://www.immorlica.com/socNet/broder.pdf and
http://vigna.di.unimi.it/ftp/papers/GraphStructureRevisited.pdf). The
web mainly consists of an interconnected core and other parts that link to that
core or are linked from it. However, some parts are completely unreachable. By
taking any road in the real world, you can go anywhere (you may have to cross
the oceans to do it). On the web, you can’t reach every site just by following
links; some parts aren’t easily accessible (they are disconnected, or you’re on
the wrong side of a one-way link to reach them). If you want to find something
on the web, even when time isn’t a problem, you still need an index.
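
To make the reachability point concrete, here is a minimal sketch in Python. It assumes a made-up, seven-page "toy web" (the page names and links are invented for illustration and don't come from the studies cited above) and uses a plain breadth-first search to list which pages you can reach by following links from a starting page.

```python
from collections import deque


def reachable(graph, start):
    """Return the set of pages reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical toy web: each page maps to the pages it links to.
toy_web = {
    "in1": ["core_a"],              # links into the core, but nothing links back
    "core_a": ["core_b"],           # the core pages all reach one another
    "core_b": ["core_c"],
    "core_c": ["core_a", "out1"],
    "out1": [],                     # linked from the core, links nowhere
    "island": ["island2"],          # disconnected from everything else
    "island2": ["island"],
}

all_pages = set(toy_web)
from_core = reachable(toy_web, "core_a")
unreachable_from_core = all_pages - from_core

print(sorted(from_core))
print(sorted(unreachable_from_core))
```

Starting from core_a, the search finds the three core pages and out1, but never in1 (the link points the wrong way) or the two island pages (no link leads to them at all). A crawler that only follows links hits the same limit, which is one reason finding things on the web calls for an index rather than a walk.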