Please note that this page is still a work in progress.

NEST Graph #

Basic Information #

ProCAKE provides a generic graph representation that supports directed, multi-labeled, and multi-typed graphs. A graph is represented in ProCAKE as an object named NESTGraphObject. A graph object consists of various NESTGraphItems representing the nodes and edges. A NESTGraph typically uses different types of nodes and edges, which are also semantically enriched in an object-oriented manner. Two subclasses exist: NESTSequentialWorkflow and NESTWorkflow, the latter being a subclass of NESTSequentialWorkflow. These classes are described in the respective sections.

graph RL;
    NESTSequentialWorkflowObject --> NESTGraphObject;
    NESTWorkflowObject --> NESTSequentialWorkflowObject;

In general, a NEST graph is a quadruple \(W = (N,E,S,T)\), where \(N\) is a set of nodes and \(E \subseteq N \times N\) is a set of edges. \(T : N \cup E \rightarrow \Omega\) assigns each node and each edge a type from \(\Omega\). \(S : N \cup E \rightarrow \Sigma\) assigns each node and each edge a semantic description from a semantic metadata language \(\Sigma\).
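
As an illustration, the small workflow used as the example further down this page (Task A produces Data A, which Task B consumes) can be written as such a quadruple. The symbols \(w, t_A, t_B, d_A\) are names chosen for this sketch only, the type names stand in for elements of \(\Omega\), and the semantic descriptions are reduced to plain name strings; none of this notation is prescribed by ProCAKE.

```latex
\begin{align*}
N &= \{w,\ t_A,\ t_B,\ d_A\} \\
E &= \{(t_A, w),\ (t_B, w),\ (d_A, w),\ (t_A, d_A),\ (d_A, t_B),\ (t_A, t_B)\} \\
T(w) &= \text{workflow node}, \quad T(t_A) = T(t_B) = \text{task node}, \quad T(d_A) = \text{data node} \\
T((t_A, w)) &= T((t_B, w)) = T((d_A, w)) = \text{part-of edge} \\
T((t_A, d_A)) &= T((d_A, t_B)) = \text{dataflow edge}, \quad T((t_A, t_B)) = \text{controlflow edge} \\
S(t_A) &= \text{``Task A''}, \quad S(t_B) = \text{``Task B''}, \quad S(d_A) = \text{``Data A''}
\end{align*}
```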

NEST Graph Items #

Several data classes exist that represent special types of graph items. All of them inherit from the class NESTGraphItemObject, whose direct subclasses are NESTNodeObject and NESTEdgeObject. The full hierarchy is as follows:

graph RL;
    NESTNodeObject --> NESTGraphItemObject;
    NESTEdgeObject --> NESTGraphItemObject;
    NESTSequenceNodeObject --> NESTNodeObject;
    NESTControlflowNodeObject --> NESTSequenceNodeObject;
    NESTAndStartNodeObject --> NESTControlflowNodeObject;
    NESTAndEndNodeObject --> NESTControlflowNodeObject;
    NESTOrStartNodeObject --> NESTControlflowNodeObject;
    NESTOrEndNodeObject --> NESTControlflowNodeObject;
    NESTXorStartNodeObject --> NESTControlflowNodeObject;
    NESTXorEndNodeObject --> NESTControlflowNodeObject;
    NESTLoopStartNodeObject --> NESTControlflowNodeObject;
    NESTLoopEndNodeObject --> NESTControlflowNodeObject;
    NESTTaskNodeObject --> NESTSequenceNodeObject;
    NESTDataNodeObject --> NESTNodeObject;
    NESTWorkflowNodeObject --> NESTNodeObject;
    NESTSubworkflowNodeObject --> NESTNodeObject;
    NESTControlflowEdgeObject --> NESTEdgeObject;
    NESTConstraintEdgeObject --> NESTEdgeObject;
    NESTDataflowEdgeObject --> NESTEdgeObject;
    NESTPartOfEdgeObject --> NESTEdgeObject;

For the data classes NESTGraph, NESTSequentialWorkflow and NESTWorkflow, there are restrictions on which graph item classes may be used. These are described in the respective sections for the data classes. The following table gives an overview of the allowed item types per class. (An X indicates that the item class is allowed in the data class, and a - indicates that it is not.)

Available graph item classes by data class

| Graph item class | NESTGraph | NESTSequentialWorkflow | NESTWorkflow |
|---|---|---|---|
| NESTNode | X | - | - |
| NESTWorkflowNode | - | X | X |
| NESTSubworkflowNode | - | X | X |
| NESTDataNode | - | X | X |
| NESTTaskNode | - | X | X |
| NESTControlflowNode (abstract class; stands for its subclasses) | - | - | X |
| NESTEdge | X | - | - |
| NESTPartOfEdge | - | X | X |
| NESTDataflowEdge | - | X | X |
| NESTControlflowEdge | - | X | X |
| NESTConstraintEdge | - | X | X |

NEST Modifiers #

Each graph class has its own modifier, which contains basic methods for inserting and removing nodes and edges as well as additional class-specific methods. NESTGraph has the class NESTGraphModifier, NESTSequentialWorkflow has NESTSequentialWorkflowModifier, and NESTWorkflow has NESTWorkflowModifier. Since the modifiers inherit from each other according to the class hierarchy, the sub-modifiers also contain the methods of their superclasses. The modifier of any graph object can be obtained with the getModifier method, for example:

NESTGraphModifier modifier = graphObject.getModifier();

All modifiers contain the methods insertNewNode as well as insertNewEdge and specific methods for the allowed node and edge types. More information about the modifier methods can be found in the JavaDoc.
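
The inheritance relationship between graph classes and their modifiers can be pictured with a small self-contained sketch. The classes ToyGraph, ToyWorkflow and their modifiers are hypothetical stand-ins invented for this illustration; they are not ProCAKE classes, and the use of a covariant return type for getModifier is an assumption about one possible design, not a statement about how ProCAKE implements it.

```java
// Toy stand-ins that only mirror the inheritance idea; these are NOT ProCAKE classes.
import java.util.ArrayList;
import java.util.List;

class ToyGraph {
    final List<String> items = new ArrayList<>();

    ToyGraphModifier getModifier() {
        return new ToyGraphModifier(this);
    }
}

class ToyGraphModifier {
    private final ToyGraph graph;

    ToyGraphModifier(ToyGraph graph) {
        this.graph = graph;
    }

    // Generic operation offered by every modifier.
    String insertNewNode(String label) {
        graph.items.add(label);
        return label;
    }
}

class ToyWorkflow extends ToyGraph {
    @Override
    ToyWorkflowModifier getModifier() { // covariant return type
        return new ToyWorkflowModifier(this);
    }
}

class ToyWorkflowModifier extends ToyGraphModifier {
    ToyWorkflowModifier(ToyWorkflow workflow) {
        super(workflow);
    }

    // Class-specific method built on the inherited generic one.
    String insertNewTaskNode(String name) {
        return insertNewNode("task:" + name);
    }
}
```

In this sketch, a ToyWorkflowModifier offers both insertNewNode and insertNewTaskNode, which mirrors the statement that sub-modifiers also contain the methods of their superclasses.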

Generic NEST graph #

A generic NEST graph is a general graph that contains nodes and edges with semantic descriptions. Only objects of the classes NESTNodeObject and NESTEdgeObject are allowed, no special subclasses. There are no other requirements for the graph.

Sequential Workflow #

A sequential workflow is a special type of NESTGraph that represents a de facto workflow instance, i.e., the result of a workflow engine that simply records the tasks as they happen over time. De facto instances are therefore a restricted subset of workflows: they consist only of task and data nodes and allow only sequential control flow. A de facto instance is a linearly ordered collection of task nodes linked by edges, where each task node can have multiple input and output data nodes. Because the tasks are strictly sequential, such an instance can also be represented as a list.

To check whether a NESTSequentialWorkflow is a valid de facto graph, the method isValidSequentialWorkflow can be used as follows:

NESTSequentialWorkflowObject sequentialWorkflow = ...;
boolean isValid = sequentialWorkflow.isValidSequentialWorkflow();

A new NESTSequentialWorkflow can be created from an existing NESTGraph. This can be done as follows:

NESTSequentialWorkflowObject sequentialWorkflow = (NESTSequentialWorkflowObject) ModelFactory.getDefaultModel().getNESTSequentialWorkflowClass().newObject();
sequentialWorkflow.transformNESTGraphToNESTSequentialWorkflow(nestGraph);

Similarly, a NESTSequentialWorkflow can also be created from a list containing only task nodes. The order of the list is taken as the order of the tasks in the workflow.

NESTSequentialWorkflowObject sequentialWorkflow = (NESTSequentialWorkflowObject) ModelFactory.getDefaultModel().getNESTSequentialWorkflowClass().newObject();
sequentialWorkflow.transformListObjectToNESTSequentialWorkflow(taskNodeList);

Both methods, whether they create the NESTSequentialWorkflow from a NESTGraph or from a list of task nodes, throw a NoSequentialGraphException if the input does not form a valid NESTSequentialWorkflow.
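
What "sequential" means for the task nodes can be sketched without ProCAKE. The following self-contained example is an illustration under assumptions: tasks are plain string ids, control-flow edges are {pre, post} pairs, the class SequenceSketch is hypothetical, and an IllegalArgumentException stands in for NoSequentialGraphException. It is not the check that ProCAKE performs internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SequenceSketch {

    /**
     * Orders task ids along control-flow edges given as {pre, post} pairs.
     * Throws IllegalArgumentException (a stand-in for NoSequentialGraphException)
     * if the edges do not form one linear chain over all tasks.
     */
    static List<String> toSequence(Set<String> tasks, List<String[]> controlflowEdges) {
        Map<String, String> next = new HashMap<>();
        Set<String> hasPredecessor = new HashSet<>();
        for (String[] edge : controlflowEdges) {
            String pre = edge[0];
            String post = edge[1];
            if (!tasks.contains(pre) || !tasks.contains(post)) {
                throw new IllegalArgumentException("edge references an unknown task");
            }
            // A second successor or a second predecessor means branching or joining.
            if (next.put(pre, post) != null || !hasPredecessor.add(post)) {
                throw new IllegalArgumentException("not sequential at " + pre + " -> " + post);
            }
        }
        // The start task is the only one without a predecessor.
        List<String> starts = new ArrayList<>();
        for (String task : tasks) {
            if (!hasPredecessor.contains(task)) {
                starts.add(task);
            }
        }
        if (starts.size() != 1) {
            throw new IllegalArgumentException("expected exactly one start task, found " + starts.size());
        }
        // Follow the chain from the start task.
        List<String> sequence = new ArrayList<>();
        for (String task = starts.get(0); task != null; task = next.get(task)) {
            sequence.add(task);
        }
        // Tasks left over at this point sit on a separate cycle.
        if (sequence.size() != tasks.size()) {
            throw new IllegalArgumentException("some tasks are not part of the sequence");
        }
        return sequence;
    }
}
```

The returned list corresponds to the list representation of a de facto instance mentioned above; data nodes are left out of the sketch for brevity.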

NEST Workflow #

Instances of the class NESTWorkflow are referred to as semantic graphs. These are described in detail in the paper Similarity Assessment and Efficient Retrieval of Semantic Workflows. (Note: The NEST graphs described in this paper, along with their properties, are referred to as NEST workflows in ProCAKE.) In a NESTWorkflow, workflow, subworkflow, task, data, and controlflow nodes are allowed, as well as part-of, dataflow, controlflow, and constraint edges. In contrast to the NESTSequentialWorkflow, alternative or parallel task sequences can be represented by means of the controlflow nodes. In addition, a NESTWorkflow can have several start and end nodes. This semantic workflow graph can be used to represent different types of workflows, such as scientific or business workflows.

Example of a Workflow Graph #

The following illustration shows an exemplary workflow represented as NESTWorkflow in ProCAKE. In this case, it is also a valid NESTSequentialWorkflow. However, the graph is created as an object of NESTWorkflow.

[Figure: nestworkflow_example_small (example NESTWorkflow with Workflow A, Task A, Data A, and Task B)]

The illustration shows a workflow that basically consists of two task nodes and one data node. In this graph, the first task node, named Task A, produces data that is referred to as Data A. Data A is then consumed by the task node Task B. Even though the workflow has only three components, the NESTWorkflow contains more objects and also several edge types. Besides the two task nodes, which are objects of type NESTTaskNodeObject, and the data node of type NESTDataNodeObject, a workflow node called Workflow A is part of the graph. Workflow nodes are represented by objects of type NESTWorkflowNodeObject, and they can contain basic information about the overall NESTWorkflow. Each NESTWorkflow must have exactly one workflow node. Looking at the edges, task nodes are connected to each other by controlflow edges of type NESTControlflowEdgeObject. Task and data nodes are connected by dataflow edges of type NESTDataflowEdgeObject. In a NESTWorkflow, all nodes (except for the workflow node itself) must be connected to the workflow node via part-of edges of type NESTPartOfEdgeObject.

The example above only presents the structural components of a workflow. To define a semantic workflow, semantic information should also be added to each object of the workflow. For example, we might want to give a name to each task node and also define how long a task takes to execute in real application scenarios. Semantic descriptions are used to add this additional information to the edges and nodes of the NESTWorkflow. They play a key role when defining NEST workflow graphs because the calculation of the similarity between two NESTWorkflows heavily depends on the semantic descriptions of nodes and edges, i.e., the metadata of each graph item.

Constructing Workflow Graphs #

NESTWorkflowBuilder #

Graphs can be constructed at run-time using the NESTWorkflowBuilder. Please note that a running ProCAKE instance is required. The NESTWorkflowBuilder offers methods for creating NESTWorkflows via XML or programmatically via modifier methods.

XML Definition #

A NESTWorkflow object can be created from a valid XML representation using the method createNESTWorkflowObject(String xmlString):

NESTWorkflowBuilder builder = new NESTWorkflowBuilderImpl();
NESTWorkflowObject graph = builder.createNESTWorkflowObject(xmlString);

The XML representation of the graph depicted above is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<nest:NESTWorkflow xmlns:cdol="http://cake.wi2.uni-trier.de/xml/cdol"
                xmlns:nest="http://cake.wi2.uni-trier.de/xml/nest"
                xmlns:cdop="http://cake.wi2.uni-trier.de/xml/cdop"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://cake.wi2.uni-trier.de/xml/cdop cdop.xsd http://cake.wi2.uni-trier.de/xml/cdol cdol.xsd http://cake.wi2.uni-trier.de/xml/nest nest.xsd"
                id="MyNESTWorkflow" c="Workflow">
    <nest:Nodes>
        <nest:Node id="1" c="NESTTaskNode">
            <cdol:A c="String" v="Task A"/>
        </nest:Node>
        <nest:Node id="2" c="NESTTaskNode">
            <cdol:A c="String" v="Task B"/>
        </nest:Node>
        <nest:Node id="DATA_3" c="NESTDataNode">
            <cdol:A c="String" v="Data A"/>
        </nest:Node>
        <nest:Node id="WORKFLOW_MyNESTWorkflow" c="NESTWorkflowNode">
            <cdol:Agg c="WorkflowSemantic"/>
        </nest:Node>
    </nest:Nodes>
    <nest:Edges>
        <nest:Edge id="e2" pre="2" post="WORKFLOW_MyNESTWorkflow" c="NESTPartOfEdge"/>
        <nest:Edge id="e4" pre="1" post="DATA_3" c="NESTDataflowEdge"/>
        <nest:Edge id="e3" pre="DATA_3" post="WORKFLOW_MyNESTWorkflow" c="NESTPartOfEdge"/>
        <nest:Edge id="e6" pre="1" post="2" c="NESTControlflowEdge"/>
        <nest:Edge id="e5" pre="DATA_3" post="2" c="NESTDataflowEdge"/>
        <nest:Edge id="e1" pre="1" post="WORKFLOW_MyNESTWorkflow" c="NESTPartOfEdge"/>
    </nest:Edges>
</nest:NESTWorkflow>
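
Independently of ProCAKE, the structure of such an XML file can be inspected with the DOM parser that ships with the JDK. The following sketch counts the nest:Edge elements per value of their c attribute. The class NestXmlInspector and its method are hypothetical helpers written for this illustration; only the namespace URI and the element and attribute names are taken from the XML above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper for this page; it does not use or replace the ProCAKE parser.
final class NestXmlInspector {
    static final String NEST_NS = "http://cake.wi2.uni-trier.de/xml/nest";

    /** Counts nest:Edge elements per value of their "c" attribute, sorted by class name. */
    static Map<String, Integer> countEdgesByClass(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList edges = doc.getElementsByTagNameNS(NEST_NS, "Edge");
            Map<String, Integer> counts = new TreeMap<>();
            for (int i = 0; i < edges.getLength(); i++) {
                Element edge = (Element) edges.item(i);
                counts.merge(edge.getAttribute("c"), 1, Integer::sum);
            }
            return counts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
    }
}
```

Applied to the XML above, the method would report three NESTPartOfEdge, two NESTDataflowEdge, and one NESTControlflowEdge entries, matching the six edges listed in the nest:Edges element.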

Programmatic Definition #

A workflow graph can also be created from scratch using the NESTWorkflowBuilder and the NESTWorkflowModifier. With the NESTWorkflowBuilder, an empty graph is created with createEmptyNESTWorkflowObject(String NESTWorkflowID), and a workflow graph that already contains a workflow node is created with createNESTWorkflowGraphObject(String NESTWorkflowID, DataObject semanticDescriptor). Afterwards, the workflow graph object can be extended with new nodes and edges using the NESTWorkflowModifier:

// Create a workflow graph that already contains its workflow node (no semantic descriptor given).
NESTWorkflowBuilder builder = new NESTWorkflowBuilderImpl();
NESTWorkflowObject nestWorkflow = builder.createNESTWorkflowGraphObject("MyNESTWorkflow", null);

NESTWorkflowModifier modifier = nestWorkflow.getModifier();

// Insert the nodes; each one gets a String object as its semantic description.
NESTTaskNodeObject taskA = modifier.insertNewTaskNode(createStringObject("Task A"));
NESTTaskNodeObject taskB = modifier.insertNewTaskNode(createStringObject("Task B"));
NESTDataNodeObject dataA = modifier.insertNewDataNode(createStringObject("Data A"));

// Connect the nodes: Task A produces Data A, Task B consumes it, and Task B follows Task A.
modifier.insertNewDataflowEdge(taskA, dataA);
modifier.insertNewDataflowEdge(dataA, taskB);
modifier.insertNewControlflowEdge(taskA, taskB);

The method createStringObject works similarly to the object creation that is described here. It returns a String object that contains the given value.

The modifier automatically adds the part-of edge to the workflow node if a workflow node exists in the workflow graph. All other edges, such as controlflow and dataflow edges, must be added manually after the corresponding nodes have been added.
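
The behavior described in the previous paragraph can be pictured with a toy model. The class PartOfSketch below is a hypothetical stand-in that stores nodes and edges as plain strings; it only illustrates the rule "part-of edges are added implicitly, all other edges explicitly" and says nothing about how the NESTWorkflowModifier is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT the ProCAKE modifier: nodes and edges are plain strings.
final class PartOfSketch {
    final List<String> nodes = new ArrayList<>();
    final List<String> edges = new ArrayList<>();
    private String workflowNode; // null while the graph has no workflow node

    void insertWorkflowNode(String id) {
        workflowNode = id;
        nodes.add(id);
    }

    /** Adds a node and, if a workflow node exists, the matching part-of edge. */
    void insertNode(String id) {
        nodes.add(id);
        if (workflowNode != null) {
            edges.add("partOf:" + id + "->" + workflowNode);
        }
    }

    /** All other edge types are never created implicitly. */
    void insertEdge(String type, String pre, String post) {
        edges.add(type + ":" + pre + "->" + post);
    }
}
```

A node inserted before any workflow node exists receives no part-of edge in this model, which corresponds to starting from an empty graph.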

NESTWorkflowEditor #

The NESTWorkflowEditor visualizes a NESTWorkflow, facilitates its modification, and can export the modified workflow graph to an XML file. It also provides an editor for modifying the DataObjects that are attached to NESTGraphItems as semantic descriptors. Further explanation of the NESTWorkflowEditor capabilities and usage can be found here.

Validating Workflow Graphs #

To check whether a NESTWorkflow is valid, the NESTWorkflowValidator can be used. The validator provides several methods for performing fine-grained validations. For the most common NESTWorkflow definitions, it provides three methods that build on each other:

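NESTWorkflowValidatorImpl validator = new NESTWorkflowValidatorImpl(nestWorkflow);
validator.isValidGraph();            // basic graph consistency
validator.isValidWorkflow();         // valid graph plus the workflow-specific rules
validator.isBlockOrientedWorkflow(); // valid workflow with a single start and end node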

The result of each method call is cached in the validator instance. Consequently, the methods can be invoked in any order and any number of times without causing unnecessary validations. Please note that if the workflow graph is modified, the cached values must be discarded by creating a new validator instance or by calling validator.reset().
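
Why a reset is needed after modifications can be seen in a minimal caching sketch. The class CachingCheck is hypothetical and wraps an arbitrary boolean check; it illustrates the caching pattern in general and is not the implementation of the NESTWorkflowValidator.

```java
import java.util.function.BooleanSupplier;

// Generic illustration of result caching; NOT the ProCAKE validator.
final class CachingCheck {
    private final BooleanSupplier check;
    private Boolean cached;  // null means "not computed yet"
    private int evaluations; // how often the underlying check actually ran

    CachingCheck(BooleanSupplier check) {
        this.check = check;
    }

    /** Runs the check on the first call only; later calls return the cached result. */
    boolean isValid() {
        if (cached == null) {
            evaluations++;
            cached = check.getAsBoolean();
        }
        return cached;
    }

    /** Discards the cached result so that the next call re-evaluates the check. */
    void reset() {
        cached = null;
    }

    int evaluations() {
        return evaluations;
    }
}
```

If the checked data changes after the first call, isValid keeps returning the stale result until reset is called, which is the situation the note above warns about.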

A valid graph must not contain nodes or edges with duplicate ids. Its edges have to be fully connected, and two nodes must not be connected redundantly.

A valid workflow graph is a valid graph that contains a unique workflow node. Every other node in the workflow graph has to be connected to a sub-workflow node or the workflow node via a part-of edge. Edges cannot connect a node to itself, i.e., they must not have the same pre and post node. Control-flow edges can only link sequence nodes, while data-flow edges can only link data nodes with sequence nodes. There must not be any unconnected data node. Control-flow blocks must be complete, must not be interleaved, and must not have more than one empty sequence.
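
Three of the edge rules above (no self-loops, control-flow edges only between sequence nodes, data-flow edges only between a data node and a sequence node) can be written down as a small self-contained check. The enums and the class EdgeRuleSketch are hypothetical simplifications for this illustration; treating task and control-flow nodes as the sequence nodes follows the item hierarchy shown earlier on this page. The remaining rules, such as block completeness, are not covered.

```java
import java.util.Optional;

// Simplified illustration of three edge rules; NOT the ProCAKE validator.
final class EdgeRuleSketch {
    enum NodeKind { WORKFLOW, TASK, CONTROLFLOW, DATA }
    enum EdgeKind { CONTROLFLOW, DATAFLOW }

    // Task and control-flow nodes are the sequence nodes in the item hierarchy.
    static boolean isSequenceNode(NodeKind kind) {
        return kind == NodeKind.TASK || kind == NodeKind.CONTROLFLOW;
    }

    /** Returns a description of the violated rule, or an empty Optional if the edge is allowed. */
    static Optional<String> violation(EdgeKind edge, String preId, NodeKind pre,
                                      String postId, NodeKind post) {
        if (preId.equals(postId)) {
            return Optional.of("edge connects a node to itself");
        }
        if (edge == EdgeKind.CONTROLFLOW) {
            if (isSequenceNode(pre) && isSequenceNode(post)) {
                return Optional.empty();
            }
            return Optional.of("control-flow edge must link two sequence nodes");
        }
        // Data-flow edge: exactly one side is a data node, the other a sequence node.
        boolean dataToSequence = pre == NodeKind.DATA && isSequenceNode(post);
        boolean sequenceToData = isSequenceNode(pre) && post == NodeKind.DATA;
        if (dataToSequence || sequenceToData) {
            return Optional.empty();
        }
        return Optional.of("data-flow edge must link a data node with a sequence node");
    }
}
```

Returning the first violated rule as text loosely mirrors the idea of a validator that reports a single error message rather than a plain boolean.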

A valid block-oriented workflow graph is a valid workflow graph that has only a single start and end node in the control-flow.
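
The single start and end node condition can be sketched as follows, under the assumption that a start node is a sequence node without an incoming control-flow edge and an end node is one without an outgoing control-flow edge. This reading, the class StartEndSketch, and the representation of edges as {pre, post} id pairs are assumptions made for this illustration; the sketch does not check that the graph is a valid workflow graph in the first place.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration of the start/end condition only; NOT the ProCAKE validator.
final class StartEndSketch {

    /** True if exactly one node has no incoming and exactly one has no outgoing control-flow edge. */
    static boolean hasSingleStartAndEnd(Set<String> sequenceNodes, List<String[]> controlflowEdges) {
        Set<String> starts = new HashSet<>(sequenceNodes);
        Set<String> ends = new HashSet<>(sequenceNodes);
        for (String[] edge : controlflowEdges) {
            ends.remove(edge[0]);   // the pre node has an outgoing edge, so it is not an end
            starts.remove(edge[1]); // the post node has an incoming edge, so it is not a start
        }
        return starts.size() == 1 && ends.size() == 1;
    }
}
```

Unlike a sequential workflow, a graph that branches and joins between its start and end node still satisfies this condition.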

Validation terminates after the first detected error. The error message can be retrieved with validator.getErrorMessage(). The message is also logged at log level TRACE.