Applications
The notion of web page ranking will be familiar to most readers: one sends a search query to a search engine, which then identifies and returns web pages related to the query, ordered by relevance. The figure above shows an example of results for the "machine learning" query. That is, given a query, the search engine returns a sorted list of web pages. To achieve this, a search engine must 'know' which pages are relevant and which pages match the query. Such information can be gleaned from a variety of sources: the link structure of web pages, their content, the frequency with which users click on suggested links for a query, and examples of queries combined with manually ranked web pages.
Machine learning is increasingly being used to automate the design of a good search engine, rather than relying on guesswork and ad hoc engineering.
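As a toy illustration of the idea (not how a production search engine works), one might score each page by a hand-weighted mix of query-term frequency and link-based popularity; a learned ranker would instead fit such weights from click data. The pages and fields below are invented for the sketch.

```python
# Toy relevance ranking: score pages by a weighted mix of query-term
# frequency and a precomputed link-based popularity score, then sort.
# In a learned ranker the weights would be fit from click data.
def score(page, query_terms, w_text=1.0, w_links=0.5):
    words = page["text"].lower().split()
    term_freq = sum(words.count(t) for t in query_terms) / max(len(words), 1)
    return w_text * term_freq + w_links * page["popularity"]

pages = [
    {"url": "a.example", "text": "machine learning tutorial", "popularity": 0.9},
    {"url": "b.example", "text": "cooking recipes", "popularity": 0.4},
]
query = ["machine", "learning"]
ranked = sorted(pages, key=lambda p: score(p, query), reverse=True)
print([p["url"] for p in ranked])  # most relevant page first
```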
Collaborative filtering is a related application. Online bookshops such as Amazon and video rental services such as Netflix use this information extensively to encourage users to purchase additional items (or rent more movies). The problem is quite similar to that of web page ranking: once again we want a sorted list, in this case of items. The main difference is that there is no explicit query, so we can only anticipate future viewing and purchasing behavior from a user's past purchases and viewing choices. The key side information here is the judgments made by similar users, which is why the process is called collaborative. As an example, see Figure 1.2. It is clear that an automatic solution to this problem would be preferable, since it eliminates guesswork and saves time.
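A minimal sketch of the collaborative idea, under the simplifying assumption of a tiny 0/1 purchase matrix (user and item names invented): recommend to a user the items bought by the users most similar to them.

```python
import math

# Toy user-based collaborative filtering on a hypothetical purchase matrix.
purchases = {
    "alice": {"book_a", "book_b", "book_c"},
    "bob":   {"book_a", "book_b", "book_d"},
    "carol": {"book_e"},
}

def similarity(u, v):
    # Cosine similarity between two users' purchase sets.
    overlap = len(purchases[u] & purchases[v])
    return overlap / math.sqrt(len(purchases[u]) * len(purchases[v]))

def recommend(user):
    # Score each item the user has not bought by the similarity
    # of the users who did buy it.
    scores = {}
    for other in purchases:
        if other == user:
            continue
        for item in purchases[other] - purchases[user]:
            scores[item] = scores.get(item, 0.0) + similarity(user, other)
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice"))  # 'book_d' first: bob is the most similar user
```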
The challenge of automatic document translation is likewise ill-defined. At one extreme, we could try to fully understand a document before translating it using a curated set of rules devised by a computational linguist fluent in the two languages involved. This is a difficult task, especially because text is not always grammatically correct and document understanding is itself nontrivial. Instead, we could simply use examples of translated documents, such as the proceedings of the Canadian parliament or of other multilingual bodies (United Nations, European Union, Switzerland), to learn how to translate between the two languages. In other words, we can learn how to translate by looking at examples of translations. This machine learning approach has proved highly successful.
Face recognition is one of the components used in many security applications, such as access control: given a photograph (or video clip), identify the person. In other words, the system must classify a face as one of several known individuals (Alice, Bob, Charlie, etc.) or determine that it is unknown. The verification problem is a related but fundamentally different problem: its purpose is to confirm that the individual in question is who they claim to be. This is a yes/no question, in contrast to the previous one. To cope with varying lighting conditions, facial expressions, whether or not a person is wearing glasses, hairstyle, and other factors, a system that learns which features matter for identifying a person is desirable.
Named entity recognition is another area where learning can help (see Figure 1.4). This is the task of identifying entities, such as places, titles, names, and actions, in documents. Such steps are critical for the automatic digestion and understanding of documents. Some current e-mail clients, such as Apple's Mail.app, can already recognize addresses in e-mails automatically and file them in an address book. While hand-crafted rules can produce good results, it is far more efficient to learn such dependencies automatically from examples of marked-up documents, especially if we want to deploy our system in several languages. For example, the words 'bush' and 'rice' are plainly agricultural terms, yet in the context of contemporary politics they clearly refer to members of the Republican Party.
Speech recognition (annotating an audio sequence with text, such as the system included with Microsoft Vista), handwriting recognition (annotating a sequence of strokes with text, a feature found on many PDAs), computer trackpads (e.g. those made by Synaptics, a major manufacturer of such pads whose name derives from the synapses of a neural network), aircraft engine failure detection, and avatar behavior in computer games are all further applications that use learning.
The overriding theme of learning problems is that there exists a nontrivial dependency between certain observations, which we will refer to as x, and a desired response, which we will refer to as y, for which no small set of deterministic rules exists. By learning we can infer such a dependency between x and y in a systematic fashion.
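Stated slightly more formally (a standard formulation, sketched here rather than quoted from the text): given example pairs (x_1, y_1), ..., (x_m, y_m), learning amounts to picking a function f from some class F so that f(x) matches y well on average, e.g.

```latex
f^{*} = \operatorname*{argmin}_{f \in \mathcal{F}} \; \frac{1}{m} \sum_{i=1}^{m} \ell\bigl(f(x_i), y_i\bigr)
```

where \ell measures the discrepancy between the prediction f(x_i) and the desired response y_i, and m is the number of observed pairs.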
We will wrap up this section with the classification problem, which will serve as a prototypical problem for the rest of the book. It occurs frequently in practice: in spam filtering, for instance, we want a yes/no answer as to whether an e-mail contains relevant information or not. Note that this is a very user-dependent question: e-mails from an airline informing a frequent traveller about recent discounts may be useful information, while for many other users they may be a nuisance (for example, when the e-mail pertains to goods available only overseas). Moreover, the nature of annoying e-mails may change over time, for example due to the availability of new products (Viagra, Cialis, Levitra), new fraud opportunities (the Nigerian 419 scam, which took on a new twist after the Iraq war), or new data formats (e.g. spam consisting mainly of images).
To combat these problems we want to build a system which is able to learn how to classify new e-mails. A seemingly unrelated problem, that of cancer diagnosis, shares a common structure: given histological data (e.g. from a microarray analysis of a patient's tissue), infer whether a patient is healthy or not. Again, we are asked to generate a yes/no answer given a set of observations. See Figure 1.5 for an example.
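As a concrete sketch of such a yes/no classifier (a heavily simplified naive-Bayes-style scorer over made-up training e-mails, not a production filter):

```python
import math
from collections import Counter

# Count word occurrences in hand-labeled spam and non-spam e-mails.
spam_texts = ["cheap viagra now", "new discount viagra"]
ham_texts = ["meeting notes attached", "flight discount for travellers"]
spam_counts = Counter(w for t in spam_texts for w in t.split())
ham_counts = Counter(w for t in ham_texts for w in t.split())
spam_total = sum(spam_counts.values())
ham_total = sum(ham_counts.values())

def is_spam(text, smoothing=1.0):
    # Sum the log-ratios of smoothed word frequencies under each class;
    # positive total evidence means "spam" (a yes/no answer).
    score = 0.0
    for w in text.split():
        p_spam = (spam_counts[w] + smoothing) / (spam_total + smoothing)
        p_ham = (ham_counts[w] + smoothing) / (ham_total + smoothing)
        score += math.log(p_spam / p_ham)
    return score > 0

print(is_spam("viagra discount"))  # True on this toy data
```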
Data
It is helpful to characterize learning problems according to the type of data they use. This is extremely useful when encountering new problems, since many problems involving similar data types can be solved with very similar techniques. For example, natural language processing and bioinformatics use very similar tools for strings of natural language text and for DNA sequences.
Vectors
Vectors are the most basic entities we may encounter in our work.
A life insurance company, for example, might be interested in obtaining a vector of variables (blood pressure, heart rate, height, weight, cholesterol level, smoker, gender) in order to estimate a prospective customer's life expectancy.
A farmer might be interested in determining the ripeness of fruit from measurements such as (size, weight, spectral data). An engineer might want to find dependencies in (voltage, current) pairs. Similarly, documents can be represented by a vector of counts describing the occurrence of words. The latter is often referred to as a "bag of words" feature.
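For instance, a bag-of-words vector can be built in a few lines (the vocabulary and document below are invented for illustration):

```python
# Represent a document as a vector of word counts ("bag of words").
vocabulary = ["machine", "learning", "data", "model"]
document = "machine learning uses data to fit a model to data"

words = document.split()
bag = [words.count(term) for term in vocabulary]
print(bag)  # [1, 1, 2, 1] -- word order in the document is discarded
```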
One of the challenges with vectors is that the scales and units of different coordinates can differ dramatically. For example, we could measure weight in kilograms, pounds, grams, tons, or stones, all of which would amount to multiplicative changes. Similarly, depending on whether we express temperature in Celsius, Kelvin, or Fahrenheit, we have a whole class of affine transformations. One way to deal with these issues in an automatic fashion is to normalize the data; we will discuss how to do this later.
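One common normalization (one option among several; how to choose is discussed later) rescales each coordinate to zero mean and unit variance, which cancels both the multiplicative and the affine unit changes mentioned above:

```python
import statistics

# Z-score normalization: subtract the mean and divide by the standard
# deviation, coordinate by coordinate. Changing units (pounds vs. kilograms,
# Celsius vs. Fahrenheit) then no longer affects the normalized values.
heights_cm = [160.0, 170.0, 180.0, 190.0]
mean = statistics.mean(heights_cm)
std = statistics.pstdev(heights_cm)  # population standard deviation
normalized = [(h - mean) / std for h in heights_cm]
print(normalized)  # roughly [-1.34, -0.45, 0.45, 1.34]
```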
Lists
In some cases the vectors we obtain may contain a variable number of features. For example, a physician may not deem it necessary to perform a full battery of diagnostic tests if the patient appears to be healthy.
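Such variable-length records are naturally stored as key-value mappings rather than fixed-length vectors; a minimal sketch with invented patient fields:

```python
# Patients with differing numbers of recorded measurements: one mapping
# per patient rather than a fixed-length vector.
patients = [
    {"blood_pressure": 120, "heart_rate": 60},                      # looked healthy
    {"blood_pressure": 150, "heart_rate": 90, "cholesterol": 260},  # full work-up
]

# A learning method must cope with absent entries, e.g. by treating a
# missing test as "unknown" rather than as zero.
for p in patients:
    print(p.get("cholesterol", "not measured"))
```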
Sets
Sets may occur in learning problems whenever there is a large number of potential causes of an effect which are not well determined. For example, data on the toxicity of mushrooms is relatively easy to come by. It would be useful to use such data to infer the toxicity of a new mushroom from the chemical compounds it contains. Mushrooms, however, contain a mixture of compounds, one or more of which may be toxic. Consequently, we need to infer the properties of an object from a set of features whose composition and number may vary considerably.
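A minimal sketch with invented compound names: each mushroom is represented as a set of compounds of varying size, and a naive learner flags compounds that only ever appear in toxic samples:

```python
# Each mushroom is a set of chemical compounds; the label says whether
# the mushroom as a whole is toxic. (Compound names are invented.)
mushrooms = [
    ({"c1", "c2"}, False),
    ({"c2", "c3", "c4"}, True),
    ({"c3"}, False),
]

# A compound is suspect if it appears only in toxic samples.
seen_in_safe = set().union(*(c for c, toxic in mushrooms if not toxic))
suspects = set().union(*(c for c, toxic in mushrooms if toxic)) - seen_in_safe

def predict_toxic(compounds):
    # Flag a new mushroom if it contains any suspect compound.
    return bool(compounds & suspects)

print(suspects)                     # {'c4'}
print(predict_toxic({"c1", "c4"}))  # True
```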