Multiclass classification
Multiclass classification extends the binary setting. The primary difference is that the label y ∈ {1, ..., n} can now take on one of several values. For example, we may want to categorize a document by the language in which it was written (English, French, German, Spanish, Hindi, Japanese, Chinese, etc.); see Figure 1.6 for an example. A further change from the binary case is that the cost of an error can vary significantly depending on the type of mistake we make. For example, when assessing cancer risk, it makes a significant difference whether we misclassify an early stage of cancer as healthy (in which case the patient is likely to die) or as an advanced stage of cancer (in which case the patient is likely to be inconvenienced by overly aggressive treatment).
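The cancer-screening example above can be made concrete with a cost-sensitive decision rule. The sketch below is illustrative only: the class labels, the cost matrix entries, and the helper `cost_sensitive_predict` are hypothetical choices, not taken from the text. It shows how, given a model's class probabilities, we can pick the label that minimizes expected cost rather than the most probable label.

```python
import numpy as np

# Hypothetical 3-class screening problem:
# 0 = healthy, 1 = early-stage cancer, 2 = advanced-stage cancer.
# cost[i, j] = cost of predicting class j when the true class is i.
# Missing a cancer (predicting "healthy") is assigned a far larger
# cost than overly aggressive treatment.
cost = np.array([
    [0.0,   1.0,  2.0],   # true: healthy
    [100.0, 0.0,  5.0],   # true: early-stage
    [100.0, 10.0, 0.0],   # true: advanced-stage
])

def cost_sensitive_predict(class_probs, cost):
    """Return the label minimizing expected cost under the model's
    class probabilities, instead of the most probable label."""
    expected_cost = class_probs @ cost  # one expected cost per label
    return int(np.argmin(expected_cost))

# The model is slightly more confident in "healthy" than "early-stage",
# yet the asymmetric costs make it safer to predict early-stage cancer.
probs = np.array([0.55, 0.40, 0.05])
print(cost_sensitive_predict(probs, cost))  # prints 1, not argmax = 0
```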
Structured estimation goes beyond basic multiclass estimation by assuming that the labels y have some additional structure that can be exploited in the estimation process. For example, y may be a path in an ontology when categorizing web pages, or a permutation when matching objects, doing collaborative filtering, or ranking documents in a retrieval setting. When doing named entity recognition, y might be an annotation of a text. Each of these problems has its own peculiarities in terms of the set of labels y we may consider admissible, and in how to search this space. We will look at a few of these problems in Chapter ??.
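To make the permutation case concrete, one common structured loss counts the pairs of items that two rankings order oppositely (the Kendall tau distance). This is a minimal illustrative sketch, not a method from the text; the function name and the document identifiers are invented for the example.

```python
def discordant_pairs(y_true, y_pred):
    """Structured loss for permutation-valued labels: the number of
    item pairs ranked in opposite order by the two permutations
    (the Kendall tau distance)."""
    # Position of each item in each ranking.
    pos_t = {item: i for i, item in enumerate(y_true)}
    pos_p = {item: i for i, item in enumerate(y_pred)}
    items = list(y_true)
    n = len(items)
    d = 0
    for i in range(n):
        for j in range(i + 1, n):
            a, b = items[i], items[j]
            # A pair is discordant if the two rankings disagree
            # on which of a, b comes first.
            if (pos_t[a] - pos_t[b]) * (pos_p[a] - pos_p[b]) < 0:
                d += 1
    return d

print(discordant_pairs(["d1", "d2", "d3"], ["d1", "d2", "d3"]))  # 0
print(discordant_pairs(["d1", "d2", "d3"], ["d3", "d2", "d1"]))  # 3
```

Note that, unlike a plain multiclass loss, this loss is graded: rankings that are nearly correct incur a smaller penalty than completely reversed ones.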
Another frequent task is regression. Given a pattern x, the goal is to estimate a real-valued variable y ∈ ℝ (see e.g. Figure 1.7). For example, we might want to predict the next day's value of a stock, the yield of a semiconductor fab given the current process, the iron content of ore based on mass spectroscopy readings, or an athlete's heart rate based on accelerometer data. One of the fundamental ways in which regression problems differ from one another is the choice of loss. When predicting stock prices, for example, our loss for a put option will be asymmetric. A hobby athlete, on the other hand, might only care that our estimate of the heart rate is close to the true average.
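The contrast between a symmetric and an asymmetric regression loss can be sketched as follows. This is an illustrative example rather than anything from the text: the pinball (quantile) loss with the assumed parameter tau = 0.9 penalizes under-prediction nine times more heavily than over-prediction, which is the kind of lopsided penalty the stock example calls for, while the squared loss treats both directions equally, matching the hobby athlete's needs.

```python
def squared_loss(y, y_hat):
    """Symmetric: over- and under-estimation are penalized equally."""
    return (y - y_hat) ** 2

def pinball_loss(y, y_hat, tau=0.9):
    """Asymmetric 'pinball' (quantile) loss: with tau = 0.9,
    under-predicting costs 9x more than over-predicting
    by the same amount."""
    e = y - y_hat
    return tau * e if e >= 0 else (tau - 1) * e

# Errors of equal size, opposite sign:
print(squared_loss(10.0, 9.0), squared_loss(10.0, 11.0))  # 1.0 1.0
print(pinball_loss(10.0, 9.0))   # 0.9  (under-prediction)
print(pinball_loss(10.0, 11.0))  # ~0.1 (over-prediction)
```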
The term "novelty detection" is a bit of a misnomer. It refers to the problem of identifying "abnormal" readings relative to a set of past data. Clearly, deciding what is unusual is a rather subjective matter; the common assumption is that unusual events occur rarely. Hence one possible goal is to design a system that assigns each observation a score indicating how novel it is. Readers familiar with density estimation might argue that the latter would be a reasonable solution. However, we neither need a score that integrates to 1 over the entire domain, nor do we care much about novelty scores for typical observations; we will later see how this somewhat easier goal can be achieved directly. Figure 1.8 shows novelty detection applied to an optical character recognition database.
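One simple way to produce such a novelty score, without estimating a normalized density, is to score each new observation by its distance to the nearest previously seen data. This is a minimal sketch under assumed settings (synthetic 2-D Gaussian "historical" data, k = 3 nearest neighbors, a hand-picked outlier); none of it comes from the text.

```python
import numpy as np

def novelty_scores(train, test, k=3):
    """Score each test point by its mean distance to its k nearest
    training points: a large score means far from previously seen
    data. Unlike a density estimate, scores need not sum to 1."""
    scores = []
    for x in test:
        d = np.sort(np.linalg.norm(train - x, axis=1))
        scores.append(d[:k].mean())
    return np.array(scores)

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))           # "normal" historical readings
test = np.array([[0.0, 0.0], [6.0, 6.0]])   # one typical, one abnormal point
s = novelty_scores(train, test)
print(s[1] > s[0])  # the abnormal point gets the larger novelty score
```

Note that the scores are only used to rank observations by how unusual they are, which is exactly the relaxation of density estimation described above.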