Limitations of Data Analysis & Machine Learning
You might have read in the news and in online articles that machine learning and
advanced data analysis can change the fabric of society (automation, loss of jobs,
universal basic income, artificial intelligence takeover).
In fact, society is being changed right now. Behind the scenes, machine
learning and continuous data analysis are at work, especially in search engines,
social media, and e-commerce. Machine learning now makes it easier and faster
to answer questions such as the following:
● Are there human faces in the picture?
● Will a user click an ad? (is it personalized and appealing to him/her?)
● How can accurate captions be created for YouTube videos? (recognizing
speech and transcribing it into text)
● Will an engine or component fail? (preventive maintenance in
manufacturing)
● Is a transaction fraudulent?
● Is an email spam or not?
These are made possible by the availability of massive datasets and great
processing power. However, advanced data analysis using Python (and machine
learning) is not magic. It’s not the solution to all problems. That’s because the
accuracy and performance of our tools and models depend heavily on the
integrity of the data and on our own skill and judgment.
Yes, computers and algorithms are great at providing answers. But it’s also about
asking the right questions. Those intelligent questions will come from us
humans. It is also up to us whether we use the answers our computers
provide.
Accuracy & Performance
The most common uses of data analysis are prediction (forecasting) and
optimization. Will the demand for our product increase in the next five
years? What are the optimal routes for deliveries that lead to the lowest
operational costs?
At a large enough scale, an accuracy improvement of even just 1% can translate
into millions of dollars of additional revenue. For instance, big stores can stock
up on certain products in advance if the analysis predicts increasing demand.
Shipping and logistics companies can also plan their routes and schedules better
for lower fuel usage and faster deliveries.
Aside from improving accuracy, another priority is ensuring reliable
performance. How well does our analysis perform on new data sets? Should we
consider other factors when analyzing the data and making predictions? Our
work should always produce consistently accurate results. Otherwise, it’s not
scientific at all because the results are not reproducible. We might as well shoot
in the dark instead of exhausting ourselves with sophisticated data analysis.
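One common way to check how an analysis performs on new data is to hold back
part of the data, fit the model on the rest, and compare the errors on both parts.
The sketch below is only an illustration under stated assumptions: the data is
synthetic (a made-up linear demand trend with noise), the function names are
hypothetical, and a plain least-squares line stands in for whatever model a real
project would use. A fixed random seed makes the result reproducible.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and actual values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)) / len(xs)


def holdout_evaluation(seed=0, n=200, test_fraction=0.25):
    """Fit on a training split and report the error on both splits."""
    rng = random.Random(seed)  # fixed seed so every run gives the same result

    # Synthetic data (an assumption for illustration only):
    # demand = 2 * month + 10 + random noise
    xs = [rng.uniform(0, 60) for _ in range(n)]
    ys = [2.0 * x + 10.0 + rng.gauss(0, 5) for x in xs]

    # Shuffle the row indices, then set aside a fraction as unseen "new" data.
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = int(n * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    test_x = [xs[i] for i in test_idx]
    test_y = [ys[i] for i in test_idx]

    model = fit_line(train_x, train_y)
    train_error = mean_abs_error(model, train_x, train_y)
    test_error = mean_abs_error(model, test_x, test_y)
    return model, train_error, test_error


model, train_error, test_error = holdout_evaluation()
print("slope, intercept:", model)
print("training error:", train_error)
print("held-out error:", test_error)
```

If the held-out error is much larger than the training error, the model is
probably not going to perform reliably on new data, no matter how accurate it
looked on the data it was fitted to.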
Apart from successful forecasting and optimization, proper data analysis can
also help us uncover opportunities. Later we may realize that what we did also
applies to other projects and fields. We can also detect outliers and interesting
patterns if we dig deep enough. For example, perhaps customers congregate in
clusters that are big enough for us to explore and tap into. Maybe there are
unusually high concentrations of customers that fall into a certain income
range or spending level.
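As a small illustration of outlier detection, the sketch below flags unusual
values with the interquartile range (IQR) rule, where anything more than 1.5
IQRs outside the middle half of the data is treated as an outlier. This rule is
one common choice and not necessarily the method used later in the book. The
spending figures and the function name are made up for the example.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical monthly spending per customer (illustrative numbers only).
spending = [120, 135, 128, 140, 132, 125, 138, 130, 950, 127]
print(find_outliers(spending))
```

A flagged value is not automatically an error. It may be a data entry mistake,
or it may be exactly the kind of unusual customer worth a closer look.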
Those are just typical examples of the applications of proper data analysis. In the
next chapter, let’s discuss one of the most widely used examples for illustrating
the promising potential of data analysis and machine learning. We’ll also discuss
its implications and the opportunities it presents.