# NEST Graph #

## Basic Information #

ProCAKE provides a generic graph representation that supports directed, multi-labeled, and multi-typed graphs. A graph is represented in ProCAKE as an object named NESTGraph. Two subclasses exist, NESTSequentialWorkflow and NESTWorkflow, which are described in their respective sections.

In general, a NEST graph is a quadruple $$W = (N,E,S,T)$$ where $$N$$ is a set of nodes and $$E \subseteq N \times N$$ is a set of edges. $$T : N \cup E \rightarrow \Omega$$ associates to each node and each edge a type from $$\Omega$$. $$S : N \cup E \rightarrow \Sigma$$ associates to each node and each edge a semantic description from a semantic metadata language $$\Sigma$$.
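
As a reading aid, the quadruple can be mirrored in a few lines of plain Java. This is only a minimal sketch and not ProCAKE's implementation: the class `NestQuadrupleSketch`, its fields, and the use of string ids, string types, and string descriptions are assumptions made for illustration. Edges are keyed by an id here so that each edge can carry its own type and description.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative model of W = (N, E, S, T); hypothetical, not the ProCAKE API.
final class NestQuadrupleSketch {
    final Set<String> nodes = new LinkedHashSet<>();               // N
    final Map<String, String[]> edges = new LinkedHashMap<>();     // E: edge id -> {pre, post}
    final Map<String, String> types = new LinkedHashMap<>();       // T: item id -> type
    final Map<String, String> semantics = new LinkedHashMap<>();   // S: item id -> description

    void addNode(String id, String type, String description) {
        nodes.add(id);
        types.put(id, type);
        semantics.put(id, description);
    }

    void addEdge(String id, String pre, String post, String type, String description) {
        // E is a subset of N x N, so both endpoints must already be nodes.
        if (!nodes.contains(pre) || !nodes.contains(post)) {
            throw new IllegalArgumentException("Edge " + id + " references an unknown node");
        }
        edges.put(id, new String[] {pre, post});
        types.put(id, type);
        semantics.put(id, description);
    }

    // T and S must have an entry for every node and every edge.
    boolean isTotal() {
        for (String id : nodes) {
            if (!types.containsKey(id) || !semantics.containsKey(id)) return false;
        }
        for (String id : edges.keySet()) {
            if (!types.containsKey(id) || !semantics.containsKey(id)) return false;
        }
        return true;
    }
}
```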

A graph object consists of various NESTGraphItems representing the nodes and edges. A NESTGraph uses different types of nodes and edges, which are semantically enriched in an object-oriented manner. Several data classes exist to represent the various types of graph items. Each of the data classes NESTGraph, NESTSequentialWorkflow, and NESTWorkflow restricts which graph item classes may be used; the restrictions are described in the respective sections. The following table gives an overview of the allowed item types per class. (An X indicates that the graph item class is allowed in the data class, and a - indicates that it is not.)

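| Available graph item classes | NESTGraph | NESTWorkflow | NESTSequentialWorkflow |
| --- | --- | --- | --- |
| NESTNode | X | - | - |
| NESTWorkflowNode | X | X | X |
| NESTSubworkflowNode | X | X | X |
| NESTDataNode | X | X | X |
| NESTControlflowNode (+subclasses) | X | X | - |
| NESTEdgeObject | X | - | - |
| NESTPartOfEdge | X | X | X |
| NESTDataflowEdge | X | X | X |
| NESTControlflowEdge | X | X | X |
| NESTConstraintEdge | X | X | X |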
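
The table can also be read as a simple lookup. The following sketch encodes exactly the rows above in plain Java; the class `AllowedItemTable` and the method `isAllowed` are hypothetical names and not part of ProCAKE, which takes the allowed classes from the model definition.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical lookup that mirrors the table above; not part of ProCAKE.
final class AllowedItemTable {
    private static final Set<String> ALL =
        Set.of("NESTGraph", "NESTWorkflow", "NESTSequentialWorkflow");
    private static final Set<String> GRAPH_ONLY = Set.of("NESTGraph");
    private static final Set<String> GRAPH_AND_WORKFLOW = Set.of("NESTGraph", "NESTWorkflow");

    // Graph item class -> graph classes in which the table marks it with an X.
    static final Map<String, Set<String>> ALLOWED_IN = Map.of(
        "NESTNode", GRAPH_ONLY,
        "NESTWorkflowNode", ALL,
        "NESTSubworkflowNode", ALL,
        "NESTDataNode", ALL,
        "NESTControlflowNode", GRAPH_AND_WORKFLOW,
        "NESTEdgeObject", GRAPH_ONLY,
        "NESTPartOfEdge", ALL,
        "NESTDataflowEdge", ALL,
        "NESTControlflowEdge", ALL,
        "NESTConstraintEdge", ALL);

    static boolean isAllowed(String graphClass, String itemClass) {
        Set<String> graphs = ALLOWED_IN.get(itemClass);
        return graphs != null && graphs.contains(graphClass);
    }
}
```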

## NESTGraph #

A generic NEST graph is a general graph that contains nodes and edges with semantic descriptions. It places no restrictions on the allowed graph items: any object whose class is a subclass of NESTNodeObject or NESTEdgeObject can be used.

## NESTWorkflow #

Instances of the class NESTWorkflow are referred to as semantic workflow graphs. These are described in detail in the paper Similarity Assessment and Efficient Retrieval of Semantic Workflows. (Note: The NEST graphs described in this paper, along with their properties, are referred to as NEST workflows in ProCAKE.) A NESTWorkflow allows workflow, subworkflow, task, data, and controlflow nodes, as well as part-of, dataflow, controlflow, and constraint edges. In contrast to the NESTSequentialWorkflow, alternative or parallel task sequences can be represented by means of controlflow nodes. In addition, a NESTWorkflow can have several start and end nodes. This semantic workflow graph can be used to represent different types of workflows, such as scientific or business workflows.

The following illustration shows an exemplary workflow represented as a NESTWorkflow. In this case, it is also a valid NESTSequentialWorkflow.

The illustration shows a workflow that consists of two task nodes and one data node. The first task node, Task A, produces data referred to as Data A, which is then consumed by the task node Task B. Although the workflow has only three components, the NESTWorkflow contains more objects and several edge types. Besides the two task nodes of type NESTTaskNodeObject and the data node of type NESTDataNodeObject, the graph contains a workflow node called Workflow A. Workflow nodes are represented by objects of type NESTWorkflowNodeObject and can hold basic information about the overall workflow. Each workflow must have exactly one workflow node. Task nodes are connected to each other by controlflow edges of type NESTControlflowEdgeObject, and task and data nodes are connected by dataflow edges of type NESTDataflowEdgeObject. In a workflow graph, all nodes except the workflow node itself must be connected to the workflow node via part-of edges of type NESTPartOfEdgeObject.
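
To make the item count of this example concrete, the following plain-Java sketch lists its nodes and edges and tallies the edges by type. It does not use the ProCAKE API; the class `ExampleWorkflowSketch`, the nested `Edge` class, and the type labels are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the example graph; hypothetical, not the ProCAKE API.
final class ExampleWorkflowSketch {
    static final class Edge {
        final String pre, post, type;
        Edge(String pre, String post, String type) {
            this.pre = pre; this.post = post; this.type = type;
        }
    }

    static final List<String> NODES = List.of("Workflow A", "Task A", "Task B", "Data A");

    static List<Edge> edges() {
        List<Edge> edges = new ArrayList<>();
        // Every node except the workflow node is attached to it by a part-of edge.
        for (String node : NODES) {
            if (!node.equals("Workflow A")) edges.add(new Edge(node, "Workflow A", "part-of"));
        }
        edges.add(new Edge("Task A", "Task B", "controlflow"));
        edges.add(new Edge("Task A", "Data A", "dataflow"));   // Task A produces Data A
        edges.add(new Edge("Data A", "Task B", "dataflow"));   // Task B consumes Data A
        return edges;
    }

    // Counts the edges per edge type, sorted by type name.
    static Map<String, Integer> countByType(List<Edge> edges) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Edge e : edges) counts.merge(e.type, 1, Integer::sum);
        return counts;
    }
}
```

The three visible workflow components thus expand to four nodes and six edges: three part-of edges, two dataflow edges, and one controlflow edge.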

The example above presents only the graph items of a workflow. To define a semantic workflow, semantic information should also be added to each item. For example, we might want to give each task node a name and record how long the task takes to execute in real application scenarios. Semantic descriptions are used to add this additional information to the nodes and edges of the workflow. They play a key role when defining NEST workflow graphs because the similarity between two NESTWorkflows depends heavily on the semantic descriptions of nodes and edges, i.e., the metadata of each graph item.

## NESTSequentialWorkflow #

A sequential workflow is a special type of NESTWorkflow that only allows sequential controlflow. It is a simple, linearly ordered collection of task nodes, each of which can have multiple input and output data nodes. Because its tasks are strictly sequential, a sequential workflow could also be represented as a list.
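
The correspondence between a task list and sequential controlflow can be sketched as follows: each pair of neighbouring tasks in the list yields one controlflow edge. The class `SequentialListSketch` and the method `controlflowEdges` are hypothetical and only illustrate the idea with plain strings; they are not ProCAKE methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives controlflow edges from an ordered task list.
final class SequentialListSketch {
    // Returns one {pre, post} pair for each pair of neighbouring tasks.
    static List<String[]> controlflowEdges(List<String> orderedTasks) {
        List<String[]> edges = new ArrayList<>();
        for (int i = 0; i + 1 < orderedTasks.size(); i++) {
            edges.add(new String[] {orderedTasks.get(i), orderedTasks.get(i + 1)});
        }
        return edges;
    }
}
```

A list of n tasks therefore produces n - 1 controlflow edges, and an empty or single-element list produces none.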

It can be used to represent a de facto workflow instance. A de facto instance only consists of task and data nodes and builds a sequence of tasks linked by the edges. De facto instances are a special subset of workflows constrained to the result of a workflow engine that simply records the tasks happening over time.

## Constructing NEST Graphs #

### Class definition #

Using subclasses of the NEST graph classes and the NEST graph item classes, it is possible to specify semantic descriptions for any node and edge class. For this purpose, an entry can be created in the model XML file for each desired class, in which the class for the semantic descriptions is specified. This can look as follows:

```xml
<NESTWorkflowNodeClass name="CustomWorkflowNode" semanticDescriptionClass="String"/>
<NESTSubWorkflowNodeClass name="CustomSubWorkflowNode" semanticDescriptionClass="String"/>
<NESTDataNodeClass name="CustomDataNode" semanticDescriptionClass="String"/>

<NESTPartOfEdgeClass name="CustomPartOfEdge" semanticDescriptionClass="Void"/>
<NESTControlflowEdgeClass name="CustomControlflowEdge" semanticDescriptionClass="Void"/>
<NESTDataflowEdgeClass name="CustomDataflowEdge" semanticDescriptionClass="Void"/>
```


When objects of these classes are created, ProCAKE checks whether the semantic descriptor is an object of the specified class or of one of its subclasses; otherwise, an exception is thrown. Here, the String system class is specified for the nodes, and the edge classes refer to the default Void class. For the system graph item classes, DataClass is always set as the semantic description class, so all possible values are permitted.
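
The kind of check described here can be illustrated with ordinary Java reflection. This is a hedged sketch and not ProCAKE code: `DescriptorCheckSketch` is a hypothetical class, Java's `String` and `Object` stand in for the String system class and the permissive DataClass, and accepting `null` descriptors is an assumption of this sketch.

```java
// Hypothetical sketch of the descriptor type check; not ProCAKE code.
final class DescriptorCheckSketch {
    private final String className;
    private final Class<?> descriptorClass;

    DescriptorCheckSketch(String className, Class<?> descriptorClass) {
        this.className = className;
        this.descriptorClass = descriptorClass;
    }

    // Accepts instances of the declared class or of any subclass and rejects
    // everything else. Assumption of this sketch: a null descriptor is accepted.
    Object checkDescriptor(Object descriptor) {
        if (descriptor != null && !descriptorClass.isInstance(descriptor)) {
            throw new IllegalArgumentException(className + " expects a descriptor of type "
                + descriptorClass.getSimpleName() + " but got "
                + descriptor.getClass().getSimpleName());
        }
        return descriptor;
    }
}
```

With `String.class` as the declared class, a string descriptor passes and an integer descriptor is rejected; with `Object.class`, every descriptor passes, which corresponds to the unrestricted system classes.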

To use these graph item classes within a NEST graph, a subclass of a NEST graph class must be created that specifies all allowed classes. These can be system classes as well as user-defined classes. If an attempt is made to add an object of a class that is not listed there to a graph, an exception is thrown. Such a class definition with specified semantic descriptions may look like the following in XML:

```xml
<NESTWorkflowClass name="CustomNESTWorkflowClass">
  <NESTNode class="CustomWorkflowNode"/>
  <NESTNode class="CustomSubWorkflowNode"/>
  <NESTNode class="CustomDataNode"/>

  <!-- NESTControlflowNodes -->
  <NESTNode class="NESTAndStartNode"/>
  <NESTNode class="NESTAndEndNode"/>
  <NESTNode class="NESTLoopStartNode"/>
  <NESTNode class="NESTLoopEndNode"/>

  <NESTEdge class="CustomPartOfEdge"/>
  <NESTEdge class="CustomControlflowEdge"/>
  <NESTEdge class="CustomDataflowEdge"/>
</NESTWorkflowClass>
```


This example defines a NESTWorkflow class; the definition is analogous for the NESTGraph and the NESTSequentialWorkflow with their respective node types. Because the user-defined classes from above are used, the semantic descriptions of these items are restricted as specified. For AND and LOOP nodes, the system classes are used, so all possible semantic descriptions are allowed. If an attempt is made to add a node of a class that is not listed, for example an XOR or OR node, to an instance of this graph type, an exception is thrown.

The graph item classes and the NEST graph class can also be created at runtime in Java:

```java
// create new NEST workflow class
NESTWorkflowClass nestWorkflowClass = ModelFactory.getDefaultModel().getNESTWorkflowClass();
NESTWorkflowClass customNESTWorkflowClass = (NESTWorkflowClass) nestWorkflowClass.createSubclass("CustomNESTWorkflowClass");

// create NEST node classes and set semantic descriptor classes
NESTWorkflowNodeClass nestWorkflowNodeClass = (NESTWorkflowNodeClass) ModelFactory.getDefaultModel().getNESTWorkflowNodeClass().createSubclass("CustomWorkflowNode");
nestWorkflowNodeClass.setSemanticDescriptorClass(ModelFactory.getDefaultModel().getStringSystemClass());
nestWorkflowNodeClass.finishEditing();

NESTSubWorkflowNodeClass nestSubWorkflowNodeClass = (NESTSubWorkflowNodeClass) ModelFactory.getDefaultModel().getNESTSubWorkflowNodeClass().createSubclass("CustomSubWorkflowNode");
nestSubWorkflowNodeClass.setSemanticDescriptorClass(ModelFactory.getDefaultModel().getStringSystemClass());
nestSubWorkflowNodeClass.finishEditing();

NESTDataNodeClass nestDataNodeClass = (NESTDataNodeClass) ModelFactory.getDefaultModel().getNESTDataNodeClass().createSubclass("CustomDataNode");
nestDataNodeClass.setSemanticDescriptorClass(ModelFactory.getDefaultModel().getStringSystemClass());
nestDataNodeClass.finishEditing();

// create NEST edge classes and set semantic descriptor classes
NESTPartOfEdgeClass nestPartOfEdgeClass = (NESTPartOfEdgeClass) ModelFactory.getDefaultModel().getNESTPartOfEdgeClass().createSubclass("CustomPartOfEdge");
nestPartOfEdgeClass.setSemanticDescriptorClass(ModelFactory.getDefaultModel().getVoidSystemClass());
nestPartOfEdgeClass.finishEditing();

NESTControlflowEdgeClass nestControlflowEdgeClass = (NESTControlflowEdgeClass) ModelFactory.getDefaultModel().getNESTControlflowEdgeClass().createSubclass("CustomControlflowEdge");
nestControlflowEdgeClass.setSemanticDescriptorClass(ModelFactory.getDefaultModel().getVoidSystemClass());
nestControlflowEdgeClass.finishEditing();

NESTDataflowEdgeClass nestDataflowEdgeClass = (NESTDataflowEdgeClass) ModelFactory.getDefaultModel().getNESTDataflowEdgeClass().createSubclass("CustomDataflowEdge");
nestDataflowEdgeClass.setSemanticDescriptorClass(ModelFactory.getDefaultModel().getVoidSystemClass());
nestDataflowEdgeClass.finishEditing();

// finish editing of NEST workflow class
customNESTWorkflowClass.finishEditing();
```


When the modifiers are used, they automatically determine which classes are allowed for a graph class; these are taken from the model definition. If a task node class CustomTaskNode were defined and registered analogously, the method insertNewTaskNode would thus return an object of the class CustomTaskNode. If an attempt is made to create an object of a class not defined in the graph class, for example an XOR node, an exception is thrown.

### NESTWorkflowBuilder and NEST Modifiers #

Graphs can be constructed at run-time using the NESTWorkflowBuilder. Please note that a running ProCAKE instance is required. The NESTWorkflowBuilder offers methods for creating NESTWorkflows via XML or programmatically via modifier methods.

Each graph class has its own modifier, which contains basic methods for inserting and removing nodes and edges as well as class-specific additional methods. For the NESTGraph, the class NESTGraphModifier exists; for the workflow classes, the class NESTAbstractWorkflowModifier; and specifically for the NESTWorkflow, the class NESTWorkflowModifier. Since the modifiers inherit from each other according to the class hierarchy, the submodifiers also contain the methods of their superclasses. The modifier can be obtained from any graph object using the getModifier method, for example:

```java
NESTGraphModifier modifier = graphObject.getModifier();
```


All modifiers contain the methods insertNewNode as well as insertNewEdge and specific methods for the allowed node and edge types. More information about the modifier methods can be found in the JavaDoc.

#### XML Definition #

A NESTWorkflow object can be created from a valid XML representation using the method createNESTWorkflowObject(String xmlString):

```java
NESTWorkflowBuilder<NESTWorkflowObject> workflowBuilder = new NESTWorkflowBuilderImpl();
NESTWorkflowObject workflow = workflowBuilder.createNESTWorkflowObject(xmlString);
```


The XML representation of the graph depicted above is as follows:

```xml
<nest:NESTWorkflow xmlns:cdol="http://cake.wi2.uni-trier.de/xml/cdol"
                   xmlns:nest="http://cake.wi2.uni-trier.de/xml/nest"
                   xmlns:cdop="http://cake.wi2.uni-trier.de/xml/cdop"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://cake.wi2.uni-trier.de/xml/cdop cdop.xsd http://cake.wi2.uni-trier.de/xml/cdol cdol.xsd http://cake.wi2.uni-trier.de/xml/nest nest.xsd"
                   id="MyNESTWorkflow" c="NESTWorkflow">
  <nest:Nodes>
    <nest:Node id="1" c="NESTTaskNode">
      <cdol:A c="String" v="Task A"/>
    </nest:Node>
    <nest:Node id="2" c="NESTTaskNode">
      <cdol:A c="String" v="Task B"/>
    </nest:Node>
    <nest:Node id="DATA_3" c="NESTDataNode">
      <cdol:A c="String" v="Data A"/>
    </nest:Node>
    <nest:Node id="WORKFLOW_MyNESTWorkflow" c="NESTWorkflowNode"/>
  </nest:Nodes>
  <nest:Edges>
    <nest:Edge id="e2" pre="2" post="WORKFLOW_MyNESTWorkflow" c="NESTPartOfEdge"/>
    <nest:Edge id="e4" pre="1" post="DATA_3" c="NESTDataflowEdge"/>
    <nest:Edge id="e3" pre="DATA_3" post="WORKFLOW_MyNESTWorkflow" c="NESTPartOfEdge"/>
    <nest:Edge id="e6" pre="1" post="2" c="NESTControlflowEdge"/>
    <nest:Edge id="e5" pre="DATA_3" post="2" c="NESTDataflowEdge"/>
    <nest:Edge id="e1" pre="1" post="WORKFLOW_MyNESTWorkflow" c="NESTPartOfEdge"/>
  </nest:Edges>
</nest:NESTWorkflow>
```


A NEST graph can also be stored in the case base with the appropriate XML definition and then read back in.

#### Programmatic Definition #

A workflow graph can also be created from scratch using the NESTWorkflowBuilder and NESTWorkflowModifier. Using the NESTWorkflowBuilder, an empty graph is created with createEmptyNESTWorkflowObject(String NESTWorkflowID, String workflowClass), and a workflow graph containing a workflow node is created with createNESTWorkflowGraphObject(String NESTWorkflowId, String workflowClass, DataObject semanticDescriptor). The workflow graph object can afterwards be extended with new nodes and edges using the NESTWorkflowModifier:

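```java
NESTWorkflowBuilder<NESTWorkflowObject> builder = new NESTWorkflowBuilderImpl();
NESTWorkflowObject workflow = builder.createNESTWorkflowGraphObject("MyNESTWorkflow", NESTWorkflowClass.CLASS_NAME, null);

NESTWorkflowModifier modifier = workflow.getModifier();
DataObjectUtils utils = new DataObjectUtils();

// Insert the nodes; each receives a String object as its semantic description.
NESTTaskNodeObject taskA = modifier.insertNewTaskNode(utils.createStringObject("Task A"));
NESTTaskNodeObject taskB = modifier.insertNewTaskNode(utils.createStringObject("Task B"));
NESTDataNodeObject dataA = modifier.insertNewDataNode(utils.createStringObject("Data A"));
// Controlflow and dataflow edges are then added via the modifier's edge methods (see the JavaDoc).
```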



Here, the class DataObjectUtils is used to create string objects more conveniently. The manual definition of String objects can be read here.

The modifier automatically adds the part-of edge to the workflow node if a workflow node exists in the workflow graph. However, all other edges, such as controlflow and dataflow edges, must be added manually after the corresponding nodes have been added. In the code snippet, semantic descriptions are only added to the nodes, although edges could also be annotated.

A NESTGraph can be converted into a new NESTSequentialWorkflow as follows:

```java
NESTSequentialWorkflowObject sequentialWorkflow = (NESTSequentialWorkflowObject) ModelFactory.getDefaultModel().getNESTSequentialWorkflowClass().newObject();
sequentialWorkflow.transformNESTGraphToNESTSequentialWorkflow(nestWorkflow);
```


Similarly, a NESTSequentialWorkflow can also be created from a list containing only task nodes. The order of the list is taken as the sequence of the tasks in the workflow.

```java
ListObject taskNodeList = new DataObjectUtils().createListObject(List.of(taskA, taskB));
NESTSequentialWorkflowObject sequentialWorkflowFromList = (NESTSequentialWorkflowObject) ModelFactory.getDefaultModel().getNESTSequentialWorkflowClass().newObject();
```


Both methods that create a NESTSequentialWorkflow, from a NESTGraph or from a list of task nodes, throw a *NoSequentialGraphException* if the input does not form a valid NESTSequentialWorkflow.
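
What "not sequential" can mean for the controlflow is illustrated by the following plain-Java sketch. It is not the check ProCAKE performs: `SequentialCheckSketch` and `NotSequentialException` are hypothetical stand-ins (the latter for NoSequentialGraphException), tasks are plain strings, and only the controlflow between tasks is considered.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a sequentiality check; ProCAKE's own logic may differ.
final class SequentialCheckSketch {
    static final class NotSequentialException extends RuntimeException {
        NotSequentialException(String message) { super(message); }
    }

    // tasks: all task ids; edges: controlflow edges as {pre, post} pairs.
    static void requireSequential(Set<String> tasks, List<String[]> edges) {
        Map<String, String> next = new HashMap<>();
        Set<String> hasPredecessor = new HashSet<>();
        for (String[] e : edges) {
            if (next.put(e[0], e[1]) != null) throw new NotSequentialException(e[0] + " has two successors");
            if (!hasPredecessor.add(e[1])) throw new NotSequentialException(e[1] + " has two predecessors");
        }
        if (tasks.isEmpty()) return;
        // A chain has exactly one task without a predecessor.
        String start = null;
        for (String t : tasks) {
            if (!hasPredecessor.contains(t)) {
                if (start != null) throw new NotSequentialException("more than one start task");
                start = t;
            }
        }
        if (start == null) throw new NotSequentialException("controlflow contains a cycle");
        // Following the successors from the start must visit every task.
        int visited = 1;
        for (String t = next.get(start); t != null; t = next.get(t)) visited++;
        if (visited != tasks.size()) throw new NotSequentialException("tasks are not all on one chain");
    }
}
```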

### NESTWorkflowEditor #

The NESTWorkflowEditor visualizes a NESTWorkflow, facilitates its modification, and offers the subsequent export of the modified workflow graph to an XML file. Additionally, an editor for modifying DataObjects attached to NESTGraphItems as semantic descriptors is provided. Further explanation of the NESTWorkflowEditor capabilities and usage can be found here.

## Validating NEST Graphs #

To check whether a NESTGraph, NESTWorkflow, or NESTSequentialWorkflow is valid, the NESTGraphValidator, NESTWorkflowValidator, or NESTSequentialWorkflowValidator can be used. The validators provide several methods for performing fine-grained validations.

The result of each method call is cached in the validator instance. Consequently, methods can be invoked in any order or several times without causing unnecessary validations. Please note that if the graph is modified, the cached values must be discarded by creating a new validator instance or by calling the method validator.reset().
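
The caching behaviour, and why a reset is needed after a modification, can be sketched in plain Java. The class `CachingValidatorSketch` with its `check` method and `evaluations` counter is hypothetical and only mimics the described behaviour; it is not the ProCAKE validator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of result caching with reset(); not the ProCAKE validator.
final class CachingValidatorSketch {
    private final Map<String, Boolean> cache = new HashMap<>();
    int evaluations = 0;   // counts how often a validation really ran

    // Runs the validation only if no result is cached under this name.
    boolean check(String name, BooleanSupplier validation) {
        Boolean cached = cache.get(name);
        if (cached != null) return cached;
        evaluations++;
        boolean result = validation.getAsBoolean();
        cache.put(name, result);
        return result;
    }

    // Discards all cached results; needed after the graph has changed.
    void reset() {
        cache.clear();
    }
}
```

Calling `check` twice with the same name runs the validation once; if the underlying graph changes in between, the second call still returns the stale result until `reset()` is called.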

The validation terminates as soon as the first error is detected. The error message can be retrieved with validator.getErrorMessage(). The message is also logged with log level TRACE.

### NESTGraphValidator #

The NESTGraphValidator can be used for both generic graphs and workflow graphs.

```java
NESTGraphValidator graphValidator = new NESTGraphValidatorImpl(nestWorkflow);
graphValidator.isValidGraph();
```


### NESTWorkflowValidator #

The NESTWorkflowValidator provides three main validation methods, which build on each other:

```java
NESTWorkflowValidator workflowValidator = new NESTWorkflowValidatorImpl(nestWorkflow);
workflowValidator.isValidGraph();
workflowValidator.isValidWorkflow();
workflowValidator.isBlockOrientedWorkflow();
```


This validator also provides the following parameter to relax the validation:

```java
workflowValidator.setAllowEmptyControlflowBlocks(true);
```


If this parameter is set to true, the validator allows controlflow blocks that do not contain any elements. The default value is false.

A valid graph must not contain nodes or edges with duplicate ids, all edges have to be fully connected, and two nodes must not be connected redundantly. If allowEmptyControlflowBlocks=true, controlflow nodes may be connected redundantly by two edges.
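
These graph-level rules can be restated as a small fail-fast check. The sketch below is not the ProCAKE validator: `GraphRulesSketch` and its `Edge` class are hypothetical, ids of nodes and edges are treated as one shared namespace, and "redundant" is assumed to mean the same pre node, post node, and edge type. Both interpretations are assumptions of this sketch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical restatement of the graph rules in plain Java; not the ProCAKE validator.
final class GraphRulesSketch {
    static final class Edge {
        final String id, pre, post, type;
        Edge(String id, String pre, String post, String type) {
            this.id = id; this.pre = pre; this.post = post; this.type = type;
        }
    }

    // Returns null if all rules hold, otherwise a message for the first violation found.
    static String firstError(List<String> nodeIds, List<Edge> edges) {
        Set<String> ids = new HashSet<>();
        for (String id : nodeIds) {
            if (!ids.add(id)) return "duplicate id: " + id;
        }
        Set<String> nodes = new HashSet<>(nodeIds);
        Set<String> connections = new HashSet<>();
        for (Edge e : edges) {
            // Assumption: node ids and edge ids share one namespace.
            if (!ids.add(e.id)) return "duplicate id: " + e.id;
            // Fully connected: both endpoints must exist in the graph.
            if (!nodes.contains(e.pre) || !nodes.contains(e.post)) return "dangling edge: " + e.id;
            // Assumption: "redundant" means same pre node, post node, and edge type.
            if (!connections.add(e.pre + "->" + e.post + ":" + e.type)) return "redundant edge: " + e.id;
        }
        return null;
    }
}
```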

A valid workflow graph is a valid graph that contains a unique workflow node. Every other node in the workflow graph has to be connected to a sub-workflow node or the workflow node via a part-of edge. Edges cannot connect a node to itself, i.e., have the same pre and post node. Control-flow edges can only link sequence nodes, while data-flow edges can only link data nodes with sequence nodes. There must not be any unconnected data node. Control-flow blocks must be complete, must not be interleaved, and must not have more than one empty sequence. If allowEmptyControlflowBlocks=true, controlflow blocks may have two empty sequences.
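
Some of these workflow rules can be sketched in plain Java as follows. The sketch covers only the unique workflow node, the part-of rule, self-loops, and the data node rules; control-flow blocks are not modelled. `WorkflowRulesSketch`, its `Edge` class, and the string labels for node kinds and edge types are hypothetical and not taken from ProCAKE.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of some workflow rules; node kinds are plain strings here.
final class WorkflowRulesSketch {
    static final class Edge {
        final String pre, post, type;
        Edge(String pre, String post, String type) {
            this.pre = pre; this.post = post; this.type = type;
        }
    }

    // kinds: node id -> "workflow", "subworkflow", "task", or "data".
    // Returns null if the checked rules hold, otherwise a message for one violation.
    static String firstError(Map<String, String> kinds, List<Edge> edges) {
        long workflowNodes = kinds.values().stream().filter("workflow"::equals).count();
        if (workflowNodes != 1) return "expected exactly one workflow node, found " + workflowNodes;

        Set<String> attached = new HashSet<>();    // nodes with a part-of edge to a (sub-)workflow node
        Set<String> dataInUse = new HashSet<>();   // data nodes touched by a dataflow edge
        for (Edge e : edges) {
            if (e.pre.equals(e.post)) return "self-loop at " + e.pre;
            String preKind = kinds.get(e.pre);
            String postKind = kinds.get(e.post);
            if (e.type.equals("part-of")
                    && ("workflow".equals(postKind) || "subworkflow".equals(postKind))) {
                attached.add(e.pre);
            }
            if (e.type.equals("dataflow")) {
                // Exactly one endpoint must be a data node (the other kind is not checked here).
                if ("data".equals(preKind) == "data".equals(postKind)) {
                    return "invalid dataflow edge " + e.pre + "->" + e.post;
                }
                dataInUse.add("data".equals(preKind) ? e.pre : e.post);
            }
        }
        for (Map.Entry<String, String> node : kinds.entrySet()) {
            if (!"workflow".equals(node.getValue()) && !attached.contains(node.getKey())) {
                return "missing part-of edge for " + node.getKey();
            }
            if ("data".equals(node.getValue()) && !dataInUse.contains(node.getKey())) {
                return "unconnected data node " + node.getKey();
            }
        }
        return null;
    }
}
```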

A valid block-oriented workflow graph is a valid workflow graph that has only a single start and end node in the control-flow.
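
The additional condition, a single start and a single end node in the control-flow, can be sketched by counting nodes without incoming and without outgoing control-flow edges. This covers only that one condition and presupposes an otherwise valid workflow graph. The class `SingleStartEndSketch` and its method are hypothetical and use plain strings instead of ProCAKE objects.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical check for a single start and end node in the control-flow.
final class SingleStartEndSketch {
    // sequenceNodes: ids of all nodes that take part in the control-flow;
    // controlflow: control-flow edges as {pre, post} pairs.
    static boolean hasSingleStartAndEnd(Set<String> sequenceNodes, List<String[]> controlflow) {
        Set<String> withIncoming = new HashSet<>();
        Set<String> withOutgoing = new HashSet<>();
        for (String[] e : controlflow) {
            withOutgoing.add(e[0]);
            withIncoming.add(e[1]);
        }
        int starts = 0;
        int ends = 0;
        for (String node : sequenceNodes) {
            if (!withIncoming.contains(node)) starts++;   // nothing precedes this node
            if (!withOutgoing.contains(node)) ends++;     // nothing follows this node
        }
        return starts == 1 && ends == 1;
    }
}
```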

### NESTSequentialWorkflowValidator #

To check whether a NESTWorkflow or NESTSequentialWorkflow represents a valid sequential workflow, the class NESTSequentialWorkflowValidator and its method isValidSequentialWorkflow() can be used as follows:

```java
NESTSequentialWorkflowValidator sequentialWorkflowValidator = new NESTSequentialWorkflowValidatorImpl(nestWorkflow);
boolean isValid = sequentialWorkflowValidator.isValidSequentialWorkflow();
```