Analysis of Big Data with Neural Network
Big data analytics mainly involves collecting huge amounts of data from different sources and managing that data in such a way that it can be consumed by analysts and finally delivered to the organization as useful business insight. The process includes converting the huge assortment of structured and unstructured raw data that has been collected from different sources. This large assortment of data is collected for better business understanding; the data is then prepared for predictive modeling and subsequently carried through evaluation.
Here the data is sampled, that is, converted into a data set for modeling. The data set is a large volume of data that provides a sufficient amount of information to be retrieved in an efficient manner, and it is explored for better understanding so that abnormalities can be identified and overcome with the help of data visualization. We then focus on preparing the data for modeling, and an evaluation is planned so that the desired output is produced with the highest possible accuracy.
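As an illustration, the following Python sketch samples a fraction of a large data set and plots the distributions of its columns so that abnormalities become visible; the file name big_data.csv, the pandas/matplotlib tooling, and the 10% sampling fraction are assumptions for illustration, not details given in the text.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Work on a manageable random sample of the large data set.
    frame = pd.read_csv("big_data.csv")  # hypothetical input file
    sample = frame.sample(frac=0.1, random_state=42)

    # Histograms of each numeric column expose abnormalities such as
    # outliers, skew, or unexpected gaps before the data is prepared
    # for modeling.
    sample.hist(bins=50, figsize=(12, 8))
    plt.tight_layout()
    plt.show()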
The data processing module includes the creation of a data set design, and the data set design leads to two important portions of data (a simple split is sketched below the list):
• Training data set
• Evaluation data set
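A minimal sketch of producing these two portions, assuming the prepared rows live in a hypothetical prepared.csv and that an 80/20 split is acceptable (the text does not fix a ratio):

    import pandas as pd
    from sklearn.model_selection import train_test_split

    frame = pd.read_csv("prepared.csv")  # hypothetical input file
    training_set, evaluation_set = train_test_split(
        frame,
        test_size=0.2,    # assumed split ratio
        random_state=42,  # reproducible split
    )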
The training data set consists of the data on which the model is trained within the huge organization of data, arranged in the form of rows. Each row consists of an input vector id, an input vector, and a targeted output. The input vector id is unique for each row; the input vector holds the data in the form of dimensions; and the targeted output is either true or false.
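One training row can be pictured as follows; this is a minimal sketch, and the 319-dimension width is the figure stated below for the evaluation vectors, which we assume the training vectors share:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class TrainingRow:
        input_vector_id: int       # unique identifier for the row
        input_vector: List[float]  # the data, one value per dimension
        targeted_output: bool      # either true or false

    # Placeholder values for illustration only.
    row = TrainingRow(input_vector_id=1,
                      input_vector=[0.0] * 319,
                      targeted_output=True)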
The evaluation data set is the data that is evaluated to improve accuracy among different text editors. The data is represented in the form of rows; each row consists of an id and an input vector. The id is a unique identifier used to identify the related input vector, and the input vector is the data represented in the form of 319 dimensions; its purpose is to generate predictions with which the trained data set is evaluated.
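A minimal sketch of how such rows can be held, with a dict keyed by id mirroring how the unique id identifies the related input vector; the ids and vector values are placeholders:

    import numpy as np

    evaluation_rows = {
        101: np.zeros(319),  # id -> 319-dimensional input vector
        102: np.ones(319),
    }

    vector = evaluation_rows[101]  # the unique id retrieves its vector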
The input data set is generated in CSV (Comma Separated Values) format and is loaded in the form of a matrix. The data set has input vectors along with vector ids and targeted outputs, as discussed earlier. Two matrices are constructed from the training data set, an input data matrix and a targeted output matrix, which together hold the information of millions of vector rows. The training data set is used to predict the output, which is checked against the evaluation data set; the prediction takes the value 0 or 1. The vector id is separated from the input vector so that it does not influence the predictions, and it is used to save the predictions by vector id later on, after the evaluation process.
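A minimal sketch of this loading and separation step; the file names train.csv and eval.csv and the column layout (id first, targeted output last) are assumptions for illustration:

    import numpy as np

    train = np.loadtxt("train.csv", delimiter=",")  # id, dims, target

    vector_ids    = train[:, 0]     # set aside so ids cannot bias training
    input_matrix  = train[:, 1:-1]  # input data matrix, one row per vector
    target_matrix = train[:, -1]    # targeted output matrix, values 0 or 1

    # The evaluation set carries only an id and an input vector per row;
    # its ids are separated the same way and used to save the predictions
    # after the evaluation process.
    evaluation = np.loadtxt("eval.csv", delimiter=",")
    eval_ids, eval_matrix = evaluation[:, 0], evaluation[:, 1:]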