Adjacency list
Vertices are stored as records or objects, and every vertex stores a list of adjacent vertices. This data structure allows the storage of additional data on the vertices. Additional data can be stored if edges are also stored as objects, in which case each vertex stores its incident edges and each edge stores its incident vertices.
Adjacency matrix
A two-dimensional matrix, in which the rows represent source vertices and columns represent destination vertices. Data on edges and vertices must be stored externally. Only the cost for one edge can be stored between each pair of vertices.
Incidence matrix
A two-dimensional matrix, in which the rows represent the vertices and columns represent the edges. The entries indicate the incidence relation between the vertex at a row and edge at a column.
The following table gives the time complexity cost of performing various operations on graphs, for each of these representations, with |V| the number of vertices and |E| the number of edges. In the matrix representations, the entries encode the cost of following an edge. The cost of edges that are not present are assumed to be
In the distributed memory model, the usual approach is to partition the vertex set {\displaystyle V} of the graph into {\displaystyle p} sets {\displaystyle V_{0},\dots ,V_{p-1}}. Here, {\displaystyle p} is the amount of available processing elements (PE). The vertex set partitions are then distributed to the PEs with matching index, additionally to the corresponding edges. Every PE has its own subgraph representation, where edges with an endpoint in another partition require special attention. For standard communication interfaces like MPI, the ID of the PE owning the other endpoint has to be identifiable. During computation in a distributed graph algorithms, passing information along these edges implies communication.
Partitioning the graph needs to be done carefully - there is a trade-off between low communication and even size partitioning But partitioning a graph is a NP-hard problem, so it is not feasible to calculate them. Instead, the following heuristics are used.
1D partitioning: Every processor gets {\displaystyle n/p} vertices and the corresponding outgoing edges. This can be understood as a row-wise or column-wise decomposition of the adjacency matrix. For algorithms operating on this representation, this requires an All-to-All communication step as well as {\displaystyle {\mathcal {O}}(m)} message buffer sizes, as each PE potentially has outgoing edges to every other PE.
2D partitioning: Every processor gets a submatrix of the adjacency matrix. Assume the processors are aligned in a rectangle {\displaystyle p=p_{r}\times p_{c}}, where {\displaystyle p_{r}} and {\displaystyle p_{c}} are the amount of processing elements in each row and column, respectively. Then each processor gets a submatrix of the adjacency matrix of dimension {\displaystyle (n/p_{r})\times (n/p_{c})}. This can be visualized as a checkerboard pattern in a matrix. Therefore, each processing unit can only have outgoing edges to PEs in the same row and column. This bounds the amount of communication partners for each PE to {\displaystyle p_{r}+p_{c}-1} out of {\displaystyle p=p_{r}\times p_{c}} possible ones.
Do'stlaringiz bilan baham: |