VISUALIZATION OF EXPERIMENTAL DATA, PRESENTED IN THE
FORM OF NUMERICAL TABLES
S.M.Maxanov (master's student, TUIT named after Muhammad al-Khorezmi)
The scope and possibilities of numerical experiments grow with the
development of computer technology, and the tasks to be solved are becoming more
complex and more varied. The large amount of information obtained during an
experiment requires adequate ways of presenting it. Instead of arrays of numerical
data and simple graphs, visual images are increasingly used to support a full and
timely understanding of the results. Data visualization is a challenge that
every researcher faces. In practice, it reduces to visualizing experimental data or
the results of theoretical research. The traditional tools in this area are graphs
and charts. The success of a visualization depends directly on how correctly these
tools are applied, namely on the choice of chart type, its correct use, and its
design.
Graph 1. 60% of visualization success depends on the choice of the type of graph, 30% on its
correct use, and 10% on its correct design
The goal of a visualization follows from the main idea of the information: what
the selected data should show and what effect should be achieved. The typical goals
are identifying relationships in the data, displaying the distribution of the data,
showing composition, and comparing data.
Relationships in data describe how the values depend on each other. By examining
relationships, you can identify the presence or absence of dependencies between
variables. If the main idea of the information contains phrases such as
"relates to" or "decreases/increases with", then the visualization should show
the relationship in the data.
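One way to check for a dependency before drawing anything is a correlation coefficient. The sketch below is only an illustration: the choice of the Pearson coefficient and the function name `pearson` are assumptions, not part of the method described here, and the coefficient detects only linear dependence.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative data following y = 2x, so the dependence is perfectly linear.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(pearson(xs, ys))
```

A value close to +1 or -1 suggests that a chart showing the relationship is appropriate, while a value near zero says only that no linear dependence is visible.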
The distribution of data describes how the values are located relative to
something, that is, how many objects fall into successive intervals of numerical
values. The main idea in this case will contain phrases such as "in the range from
x to y", "concentration", "frequency", or "distribution".
Graph 2. The first row shows graphs whose goal is to show relationships in the data and the
distribution of the data; the second row shows graphs whose goal is to show composition and
comparison of data
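Counting how many objects fall into successive intervals of values, as described for the distribution goal, can be sketched as follows. The function name `bin_counts`, the equal-width intervals, and the sample values are illustrative assumptions.

```python
def bin_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v <= high:
            # The right edge of the range is assigned to the last interval.
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    return counts


sample = [1.0, 1.5, 2.2, 2.8, 3.1, 3.3, 3.9, 4.0]
print(bin_counts(sample, low=1.0, high=4.0, n_bins=3))  # prints [2, 2, 4]
```

These counts are exactly the bar heights of a histogram over the same intervals.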
Data composition combines data in order to analyze the picture as a whole and to
compare the components that make up a percentage of the whole. Key phrases for
composition are "accounted for x%", "share", and "percent of the whole". Data
comparison combines data in order to compare some indicators, revealing how objects
relate to each other. It also covers the comparison of components that change over
time. Key phrases for comparison are "more / less than", "equal", "changes", and
"up / down".
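For the composition goal, the quantity actually plotted is each component's share of the whole. A minimal sketch, assuming the components are given as a name-to-value mapping (the function name `shares` and the sample categories are hypothetical):

```python
def shares(parts):
    """Return each component's share of the total, in percent."""
    total = sum(parts.values())
    if total == 0:
        raise ValueError("the total must be non-zero to compute shares")
    return {name: 100.0 * value / total for name, value in parts.items()}


# Illustrative counts for three hypothetical categories.
print(shares({"A": 50, "B": 30, "C": 20}))
```

The resulting percentages sum to 100 and map directly onto the sectors of a pie chart.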
After determining the purpose of the visualization, you need to determine the
data type. Data can be very heterogeneous in type and structure, but in the
simplest case one distinguishes continuous numerical and temporal data, discrete
data, geographic data, and logical data. Continuous numerical data contains
information about the dependence of one numerical value on another, for example,
graphs of functions such as y = 2x. Continuous temporal data describes events
occurring over a certain period of time, such as a graph of the temperature measured
every day. Discrete data may contain dependencies of categorical values, for
example, a graph of the number of sales of goods in different stores. Geographic
data contains various
information related to location, geology, and other geographic indicators; a prime
example is an ordinary geographic map. Logical data shows the logical arrangement
of components in relation to each other, as in a family tree.
Depending on the purpose and the data, you can choose the most suitable chart
for them. It is best to avoid variety for the sake of variety: the simpler the
chart, the better.
Graph 3. Graphs of continuous numeric and temporal data, discrete data, geographic and logical
data
Use specialized chart types only for specialized data; in other cases, the most
common charts are well suited:
linear (line)
with areas
columns and histograms (bar)
pie chart (pie, donut)
polar graph (radar)
scatter, bubble
maps
trees (tree, mind map, tree map)
time diagrams (time line, Gantt, waterfall).
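The selection procedure described above (first the goal, then the data type, then the simplest suitable chart) can be sketched as a lookup. The pairing of goals with chart types in `CHART_BY_GOAL` is an illustrative assumption based on common practice, not a rule given in the text. Only the map and tree cases follow the examples mentioned for geographic and logical data.

```python
# Assumed pairing of visualization goals with common chart types (illustrative only).
CHART_BY_GOAL = {
    "relationship": ["scatter", "bubble"],
    "distribution": ["histogram", "scatter"],
    "composition": ["pie", "area"],
    "comparison": ["bar", "line"],
}

# Specialized data types get specialized charts (geographic map, family tree).
CHART_BY_SPECIAL_DATA = {
    "geographic": "map",
    "logical": "tree",
}


def suggest_charts(goal, data_type=None):
    """Suggest chart types for a goal; specialized data overrides the goal."""
    if data_type in CHART_BY_SPECIAL_DATA:
        return [CHART_BY_SPECIAL_DATA[data_type]]
    return CHART_BY_GOAL[goal]


print(suggest_charts("comparison"))                # ['bar', 'line']
print(suggest_charts("comparison", "geographic"))  # ['map']
```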
In medical and psychological research, experimental results are often presented in
the form of numerical tables. Methods for visualizing this kind of information are
based, as a rule, on the transition from a multidimensional to a two-dimensional
coordinate system (method of principal components, methods of structural ordering).
Consider the algorithm for forming the coordinates of objects in the initial ordering
method.
To estimate the mismatch of structures in $R^L$ and $R^2$, the matrix
$D_N(X) = [d_{nk}]_{1,1}^{N,N}$ of mutual distances $d_{nk}$ between the elements
$X_n$ and $X_k$ from the sample $X$ is calculated:

$$
D_N(X) =
\begin{pmatrix}
d_{11} & d_{12} & \cdots & d_{1N} \\
d_{21} & d_{22} & \cdots & d_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
d_{N1} & d_{N2} & \cdots & d_{NN}
\end{pmatrix}
$$

The n-th row of this matrix contains the distances from the n-th element $X_n$
to all other $(N - 1)$ elements of the set $\{X_n\}_1^N$, and the k-th column of the
matrix is formed by the distances from all elements of the set $\{X_n\}_1^N$ to the
k-th element.
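The matrix of mutual distances can be computed directly from the table of measurements. The sketch below assumes the Euclidean metric, which the text does not specify; the function name `distance_matrix` and the three sample vectors are hypothetical.

```python
import math


def distance_matrix(sample):
    """Return D with D[n][k] equal to the Euclidean distance between sample[n] and sample[k]."""
    return [[math.dist(x_n, x_k) for x_k in sample] for x_n in sample]


# Three illustrative objects described by L = 3 features each.
X = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
D = distance_matrix(X)
for row in D:
    print(row)
```

With a symmetric metric such as this one the matrix is symmetric with a zero diagonal, so the n-th row and the n-th column carry the same distances.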
Any n-th row of the matrix $D_N(X)$ can be considered as the result of ordering the
elements $\{X_n\}_1^N$ relative to the n-th element $X_n$ by mapping this set onto the
numeric axis of real numbers $R_n^+$. By fixing the position of the n-th element on
the $R_n^+$ axis and taking it as the origin (the point $Y_n$, whose coordinate on the
$R_n^+$ axis is equal to zero), we can arrange the images $\{Y_n\}_1^N$ of the sample
$X$ on the $R_n^+$ axis relative to the n-th element, using the distance from the
element $X_n$ to all other $(N - 1)$ elements as the measure of ordering.
From the point $Y_n \in R_n^+$ (the origin in $R_n^+$), we construct another
numerical axis $R_k^+$ perpendicular to the $R_n^+$ axis. In this case, the k-th
element of the sample $X$ is located at the intersection of the $R_n^+$ and $R_k^+$
axes, and on the $R_k^+$ axis we map the set $\{X_n\}_1^N$ in the same way as was done
for the $R_n^+$ axis. The coordinates of the elements $\{Y_n\}_1^N$ on the $R_k^+$ axis
are the distances from the k-th element to all other $(N - 1)$ elements and make it
possible to judge how the vectors $\{X_n\}_1^N$ group around the vector $X_k$. These
two axes, $R_n^+$ and $R_k^+$, define a pseudo-plane $(R^+)^2$.
Thus, by choosing any two rows (or two columns) of the matrix $D_N(X)$, we can
form, in a new pseudo-space $(R^+)^2$, the images $\{Y_n\}_1^N$ of the set
$\{X_n\}_1^N$. The collection of images $\{Y_n\}_1^N$ obtained by projecting the set
$\{X_n\}_1^N$ into $(R^+)^2$ is used as an initial approximation for the iterative
procedure.
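Forming the images from two chosen rows of the distance matrix can be sketched as follows. The function name `pseudo_plane` and the small matrix are illustrative assumptions, and the subsequent iterative procedure is not described in the text and is not shown.

```python
def pseudo_plane(D, n, k):
    """Map every object i to the point (D[n][i], D[k][i]) built from rows n and k of D."""
    return [(D[n][i], D[k][i]) for i in range(len(D))]


# Illustrative symmetric distance matrix for N = 3 objects.
D = [
    [0.0, 5.0, 5.0],
    [5.0, 0.0, 7.0],
    [5.0, 7.0, 0.0],
]

# Use objects 0 and 1 as the centers of ordering.
Y = pseudo_plane(D, n=0, k=1)
print(Y)  # [(0.0, 5.0), (5.0, 0.0), (5.0, 7.0)]
```

Each object's first coordinate is its distance to object n and its second coordinate is its distance to object k, so objects close to either center lie near the corresponding axis.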
This approach was used to visualize experimental data in the information system
for assessing and monitoring the psychophysiological state of pregnant women.
The effectiveness of the method depends on a “good” choice of rows of the
matrix $D_N(X)$, which should not be completely random. Choosing elements $X_n$ and
$X_k$ that are close in $R^L$ as the centers of ordering of the remaining $(N - 1)$
elements on the axes $R_n^+$ and $R_k^+$ is not rational, since it gives essentially
no new information about the ordering of the sample $X$. It is therefore necessary
to choose elements of $X$ that are relatively distant from each other. For this
reason, we chose the "reference" object and the object with the worst parameters as
the centers of ordering (Fig. 1).
Fig. 1. Display of the psychophysiological state of various groups of pregnant women in the space $(R^+)^2$
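When no "reference" object is known in advance, one possible way to obtain two mutually distant centers of ordering is to take the pair with the largest mutual distance. This is an assumed heuristic for illustration, not the choice made in this work; the function name `farthest_pair` is hypothetical.

```python
def farthest_pair(D):
    """Return indices (n, k), n < k, of the two objects with the largest mutual distance."""
    if len(D) < 2:
        raise ValueError("at least two objects are required")
    best = (0, 1)
    for n in range(len(D)):
        for k in range(n + 1, len(D)):
            if D[n][k] > D[best[0]][best[1]]:
                best = (n, k)
    return best


# Illustrative symmetric distance matrix for N = 3 objects.
D = [
    [0.0, 5.0, 5.0],
    [5.0, 0.0, 7.0],
    [5.0, 7.0, 0.0],
]
print(farthest_pair(D))  # (1, 2)
```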
The results of experimental studies make it possible to confidently assert that
visualization is one of the most promising directions for increasing the efficiency of
methods for analyzing and presenting information.