Example for NEST graph measures using A*II #

A*II uses a better informed heuristic. For each node and each edge, a maximum similarity is computed, which is used for the heuristic. Also, every edge is tried to map as soon as possible.

Which nodes are chosen first is randomly. So, the order in this example is just one way, how to map the graphs. But the result will be the same in each case.

At the beginning, the similarities for each item are computed. These can be found in the example, which uses A*I. So, the following heuristic is computed:

\(h_{ II }(S) = \sum_{x \in S.N \cup S.E }{ }{ \left( \max\limits_{{y \in N_{ c } \cup E_{ c }}} \left\{ sim_{ E/N } \left( x,y \right) \right\} \right)} * \frac{ 1 }{ \left\lvert N_{ q } \right\rvert + \left\lvert E_{ q } \right\rvert} = (1.0 + 1.0 + 1.0 + 0.0 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0) * \frac{ 9 }{10} = \frac{ 9 }{10} = 0.9\)

\(g_{I}\) is 0.0, so \(f_{I}\) would be \(f_{I} = \frac{ 9 }{ 10 }\) .

The algorithm starts by mapping nodes. As there is only one workflow node, the only possible mapping is \(WN1_{ q }\) to \(WN1_{c}\) . Here, as first node \(TN1_{q}\) is chosen. It can be mapped to \(TN1_{c}\) or \(TN2_{c}\) . The other nodes weren’t considered, because a Task Node can only be mapped to another Task Node. For both, a new search node is created.

The similarity of \(TN1_{q}\) to \(TN1_{c}\) and \(TN2_{c}\) is computed. In this case, the similarity of \(TN1_{q}\) and \(TN1_{c}\) is 1.0, because both are the same. The similarity of \(TN1_{q}\) and \(TN2_{c}\) is lower, the value depends on which similarity measure will be used for the semantic descriptors. For both search nodes, the heuristic \(h_{II}\) is computed. Here, the new similarity values are used for S.N, the other not mapped items are treated as similarity values of 1.0. The search node with the mapping \(TN1_{q}\) to \(TN1_{c}\) has the higher heuristic, so this is chosen and expanded. The value of the heuristic changes to \(f_{I} = \frac{ 7 }{9}\) , but the value for \(f_{I}\) doesn’t change.

Now, the sets have the following items:

\(S.N = \left\{ TN2_{ q }, DN1_{ q } \right\} \\ S.E = \left\{ POE1_{ q }, POE2_{ q }, POE3_{ q }, DE1_{ q }, DE2_{ q }, CE1_{ q } \right\}\)

At next, the other Task Node \(TN2_{ q }\) is mapped. Because there is only one other element, which can be legally mapped, one search node is created. There, \(TN2_{ q }\) is mapped to \(TN2_{ c }\) , where the similarity is also 1.0. The value for the heuristic gets smaller, but the value for \(g_{I}\) gets bigger. So, the value for \(f_{I}\) stays the same.

Now, the sets have the following items:

\(S.N = \left\{ DN1_{ q } \right\} \\ S.E = \left\{ POE1_{ q }, POE2_{ q }, POE3_{ q }, DE1_{ q }, DE2_{ q }, CE1_{ q } \right\}\)

Now, the edge \(CE1_{q}\) can be mapped, because the pre-node and the post-node are mapped, too. They are also compared by their semantic descriptor. The similarity computation isn’t explained further at this point. The similarity value is set to 1.0. The value for the heuristic gets smaller, but the value for \(g_{I}\) gets bigger. So, the value for \(f_{I}\) stays the same.

Now, the sets have the following items:

\(S.N = \left\{ \right\} \\ S.E = \left\{ POE1_{ q }, POE2_{ q }, POE3_{ q }, DE1_{ q }, DE2_{ q } \right\}\)

Because no other edges can be mapped yet, the next node is considered. At next, the Data Nodes are selected. There is also just one element for each graph. So, the only possible mapping is \(DN1_{ q }\) to \(DN1_{ c }\) . Using the String Equals measure, the similarity would be 0.0. Because this similarity was also taken account in the similarity computation before, the value for the heuristic gets smaller, but the value for \(g_{I}\) gets bigger. So, the value for \(f_{I}\) stays the same.

Now, the sets have the following items:

\(S.N = \left\{ \right\} \\ S.E = \left\{ POE1_{ q }, POE2_{ q }, POE3_{ q }, DE1_{ q }, DE2_{ q } \right\}\)

At this point, all nodes are mapped. So, all of the edges can be mapped yet. The mappings are equal to the ones, which were made in the example of A*I. The value for the heuristic gets smaller, but the value for \(g_{I}\) gets bigger. So, the value for \(f_{I}\) stays the same.

At last, the sets have the following items:

\(S.N = \left\{ \right\} \\ S.E = \left\{ \right\}\)

At this point, all nodes and edges are mapped. The heuristic value is now \(0.0\) . The value for \(g_{I}\) is now equal to \(f_{I}\) , which is the best possible similarity value if \(\frac{9}{10} = 0.9\) .