public class Fragment
extends java.lang.Object
A graph fragment is a part of a graph. It consists of a list of embeddings that indicate where the fragment can be found in different graphs. In addition, a fragment contains information about the extension edge (and maybe extension node) by which it was constructed from a smaller fragment.
Modifier and Type | Field and Description |
---|---|
protected static int |
ADAPTED
flag for an adapted fragment; this flag is set when a fragment
gets adapted, so that the adaptation is not repeated
|
protected Fragment |
base
the base fragment (which was extended)
|
protected Embedding |
bemb
the current base embedding (which was extended)
|
protected static int |
CHAIN
flag for the result of an extension that is a chain start
|
protected int |
chcnt
the number of variable length chains
|
protected static int |
CLOSED
flag for a closed fragment
(no super-fragment has the same support)
|
protected int |
cnt
the number of embeddings/graphs for the current graph
|
static int |
COMPL
group identifier: complement
|
protected NamedGraph[] |
cover
the graph cover of the fragment
|
protected Embedding |
curr
the current embedding (cursor)
|
protected static int |
DEFAULT
default flags that are set when a fragment is created
|
protected int |
dst
the index of the destination node of the (first) new edge
|
protected int |
flags
the property flags of the embedding (e.g.
|
static int |
FOCUS
group identifier: focus
|
protected Graph |
graph
the fragment as a graph
|
static int |
GRAPHS
support type: number of graphs that contain an embedding
|
static int |
GREEDY
support type flag: whether to use the greedy MIS algorithm
|
protected int |
idx
the index of the (first) new edge
|
protected Embedding |
list
the list of embeddings of the fragment
|
protected int |
max
the maximal number of embeddings per graph
|
static int |
MIN_IMAGE
support type: minimum number of different nodes mapped to
|
static int |
MIS_HARM
support type: maximum independent set of harmful overlap graph
|
static int |
MIS_OLAP
support type: maximum independent set of normal overlap graph
|
protected static int |
ORBITS
flag for valid orbit identifiers in the fragment graph
|
protected static int |
PACKED
flag for a packed list of embeddings
|
protected static int |
PERFECT
flag for a perfect extension; set if the fragment was created
by a perfect extension of its base fragment
|
protected static int |
REVERTED
flag for reverted extension information
|
protected int[] |
ris
the indices of the nodes of new ring edges
|
protected static int |
SIBLINGS
flag for possible equivalent siblings
|
protected int |
size
the number of nodes in a ring or chain
(positive: ring, negative: chain)
|
protected int |
src
the index of the source node of the (first) new edge
|
protected int[] |
supp
the support and embedding counters for focus and complement
|
static int |
SUPPMASK
the support type mask
|
protected Embedding |
tail
the tail of the embedding list
|
protected static int |
VALID
flag for a valid fragment; invalid fragments can occur in ring
mining with canonical form pruning: they are non-canonical
fragments that have to processed nevertheless
|
Modifier | Constructor and Description |
---|---|
protected |
Fragment()
Create an empty fragment.
|
protected |
Fragment(CanonicalForm cnf)
Create a fragment from an extension.
|
protected |
Fragment(Fragment frag,
int src,
int dst,
int edge,
int node)
Create an extended fragment.
|
protected |
Fragment(Graph graph)
Create a fragment from a graph.
|
protected |
Fragment(Graph graph,
Graph sub)
Create a fragment from a graph and a subgraph.
|
protected |
Fragment(Graph graph,
Graph sub,
int max)
Create a fragment from a graph and a subgraph.
|
protected |
Fragment(Graph graph,
int max)
Create a fragment with an embedding limit from a graph.
|
protected |
Fragment(int max)
Create an empty fragment with an embedding limit.
|
Modifier and Type | Method and Description |
---|---|
protected boolean |
adapt(CanonicalForm cnf,
boolean check)
Adapt the fragment (limited reordering of the edges and nodes).
|
protected boolean |
addEmb(CanonicalForm cnf)
Add an extension (as an embedding) to the fragment.
|
protected void |
addEmb(Embedding emb)
Add an embedding (list) to the fragment.
|
void |
addGraph(NamedGraph graph)
Add a graph to the cover of the fragment.
|
protected boolean |
chainsValid()
Check whether all variable length chains are valid.
|
void |
computeSupport(int type)
Compute the support of the fragment.
|
boolean |
equals(java.lang.Object frag)
Check whether two fragments are equal.
|
boolean |
equalsCanonic(Fragment frag)
Compare the (canonical) code words of two fragments.
|
protected boolean |
equivSiblings()
Check whether a fragments can have equivalent siblings.
|
protected Embedding |
firstEmb()
Get the first embedding of the fragment.
|
Graph |
firstGraph()
Get the first graph containing the fragment.
|
int |
getComplEmbCount()
Get the number of embeddings in the complement.
|
int |
getComplSupport()
Get the complement support of a fragment.
|
int |
getEmbCount()
Get the number of embeddings (in both groups together).
|
int |
getEmbCount(int group)
Get the number of embeddings in the given group.
|
int |
getFocusEmbCount()
Get the number of embeddings in the focus.
|
int |
getFocusSupport()
Get the focus support of a fragment.
|
Graph |
getGraph()
Get the fragment as a (sub-)graph.
|
int |
getSupport()
Get the support of a fragment (in both groups together).
|
int |
getSupport(int group)
Get the support of a fragment in the given group.
|
int |
hashCode()
Compute the hash code of the fragment.
|
protected boolean |
hasOpenRings(int min,
int max)
Check whether the fragment has open rings.
|
protected boolean |
hasOrbits()
Check whether the fragment has node orbit identifiers.
|
protected boolean |
hasUnclosableRings(CanonicalForm cnf)
Check whether the fragment has unclosable rings.
|
protected boolean |
isCanonic(CanonicalForm cnf)
Check whether the fragment is in canonical form.
|
protected int |
isCanonic(CanonicalForm cnf,
boolean partial)
Check whether the fragment is in canonical form.
|
protected boolean |
isClosed(CanonicalForm cnf)
Check whether the fragment is closed.
|
protected boolean |
isEquivTo(Fragment frag)
Check whether two fragments are equivalent.
|
protected boolean |
isPerfect(boolean chain)
Check whether an extension is perfect.
|
protected boolean |
isRingEdgeExt()
Check whether the fragment is the result of an extension
by a ring edge.
|
protected boolean |
isRingExt()
Check whether the fragment is the result of a ring extension.
|
protected boolean |
isValid()
Check whether the valid flag is set.
|
static void |
main(java.lang.String[] args)
Main function for testing some basic functionality.
|
protected boolean |
makeCanonic(CanonicalForm cnf)
Make the fragment canonic (minimize the code word).
|
protected boolean |
makeCanonic(CanonicalForm cnf,
int keep)
Make the fragment canonic (minimize the code word).
|
protected void |
map(CanonicalForm cnf)
Reorganize the fragment with a map from a canonical form.
|
protected int |
mergeExts(Fragment[] exts,
int cnt)
Merge ring extensions that have the same initial edge.
|
protected Embedding |
nextEmb()
Get the next embedding of the fragment.
|
Graph |
nextGraph()
Get the next graph containing the fragment.
|
void |
pack()
Pack the list of embeddings.
|
void |
pack(int max)
Pack the list of embeddings.
|
protected void |
reembed()
Reembed a fragment, that is, recreate its embedding list.
|
protected void |
revert()
Revert the extension information of the fragment.
|
protected void |
setClosed(boolean closed)
Set the flag for a closed fragment.
|
protected void |
setValid(boolean valid)
Set or clear the valid flag.
|
protected int |
size()
Get the size of the fragment.
|
java.lang.String |
toString()
Create a string description of the fragment.
|
java.lang.String |
toString(CanonicalForm cnf)
Create the code word of the fragment.
|
java.lang.String |
toString(Notation ntn)
Create a string description of the fragment.
|
protected void |
unembed()
Unembed a fragment, that is, remove all embeddings.
|
void |
unpack()
Unpack the list of embeddings.
|
public static final int FOCUS
public static final int COMPL
public static final int GRAPHS
public static final int MIS_OLAP
public static final int MIS_HARM
public static final int MIN_IMAGE
public static final int SUPPMASK
public static final int GREEDY
protected static final int VALID
protected static final int CLOSED
protected static final int CHAIN
protected static final int SIBLINGS
protected static final int PERFECT
protected static final int REVERTED
protected static final int ADAPTED
protected static final int ORBITS
protected static final int PACKED
protected static final int DEFAULT
protected Graph graph
protected Fragment base
protected Embedding bemb
protected NamedGraph[] cover
protected Embedding list
protected Embedding tail
protected Embedding curr
protected int max
protected int cnt
protected int chcnt
protected int idx
protected int src
protected int dst
protected int size
protected int flags
PERFECT
)protected int[] supp
protected int[] ris
protected Fragment()
protected Fragment(Graph graph)
graph
- the subgraph representing the fragmentprotected Fragment(int max)
max
- the maximum number of embeddings per graphprotected Fragment(Graph graph, int max)
graph
- the subgraph representing the fragmentmax
- the maximum number of embeddings per graphprotected Fragment(Graph graph, Graph sub)
graph
- the graph into which to embed the subgraphsub
- the subgraph to embed into the graphprotected Fragment(Graph graph, Graph sub, int max)
graph
- the graph into which to embed the subgraphsub
- the subgraph to embed into the graphmax
- the maximum number of embeddingsprotected Fragment(Fragment frag, int src, int dst, int edge, int node)
A fragment is created from a base fragment and information about a single edge extension, which is the common first edge of the fragments to be merged into it.
frag
- the base fragment which was extended by ringssrc
- the source node index of the edge to adddst
- the destination node index of the edge to addedge
- the edge type of the edge to addnode
- the destination node type of the edge to addprotected Fragment(CanonicalForm cnf)
This function is called if there is no fragment that is equivalent to a created extension and thus a new fragment has to be created.
cnf
- the canonical form from which to create the fragmentpublic boolean equals(java.lang.Object frag)
This method is overridden only to avoid certain warnings.
equals
in class java.lang.Object
frag
- the fragment to compare topublic int hashCode()
hashCode
in class java.lang.Object
protected int size()
The size of the fragment is the number of nodes in an output (or a created subgraph), which is why the number of chains, each of which will be represented as a pseudo-node, has to be added to the number of nodes of an embedding.
public Graph getGraph()
For this function to work the fragment must be created from a subgraph or an extension or at least one embedding must have been added to it. Otherwise the (sub-)graph representing the fragment is undefined.
protected void addEmb(Embedding emb)
This function is meant for adding the embeddings of the seed structure into some graph to the seed fragment. It is assumed that, if multiple embeddings are added with one call (that is, if a list of embeddings is added), they all refer to the same graph. On the other hand, the function may be called several times with different lists of embbedings all of which refer to the same graph.
If the fragment is limited w.r.t. the number of embeddings that may be stored per graph, the embedding(s) may not actually be stored, but will be regenerated on demand.
emb
- the embedding to add to the fragmentprotected boolean addEmb(CanonicalForm cnf)
This function is called if a created extension is equivalent
to the fragment and thus only an embedding has to be added.
It assumes that the extension is equal to the fragment,
that is, that cnf.compareTo(this)
yields 0.
cnf
- the canonical form describing the embedding to addpublic void pack()
Pack the list of embeddings with the maximum number of embeddings per graph that was specified when this fragment was created.
public void pack(int max)
max
- the maximum number of embeddings per graphpublic void unpack()
protected Embedding firstEmb()
Together with the function next()
this function
allows to traverse the list of embeddings without having to pay
attention to the fact that due to a limit for the number of
embeddings that may be stored per graph, some embeddings
have been packed and thus are not available directly. Rather
these two functions regenerate the embeddings from a packed
embedding by reembedding the subgraph representing the
fragment into the corresponding graph.
Note that this function, just as the accompanying function
next()
, modifies the tail
pointer,
which is reused as a cursor for packed embeddings, and thus it
is impossible to add embeddings after this function has been
called.
null
if there is no embeddingnextEmb()
protected Embedding nextEmb()
null
if there is no next embeddingfirstEmb()
public void addGraph(NamedGraph graph)
graph
- the graph to add to the coverpublic Graph firstGraph()
Together with the function nextGraph()
this
function allows to traverse the list of graphs containing this
fragment without having to pay attention to the fact that the
fragment may have several embeddings into the same graph.
null
if there is no such graphnextGraph()
public Graph nextGraph()
null
if there is no next graphfirstGraph()
public void computeSupport(int type)
The support of the fragment in both graph groups, that is,
FOCUS
and COMPL
, is computed.
The computed values can then be accessed with the functions
getSupport(int)
, getFocusSupport()
, and
getComplSupport()
.
type
- the type of support to computepublic int getSupport()
computeSupport(int)
public int getSupport(int group)
group
- the group for which to get the supportcomputeSupport(int)
public int getFocusSupport()
computeSupport(int)
public int getComplSupport()
computeSupport(int)
public int getEmbCount()
public int getEmbCount(int group)
group
- the group for which to get the number of embeddingspublic int getFocusEmbCount()
public int getComplEmbCount()
protected void unembed()
The embeddings of fragments that correspond to nodes in the search tree that are siblings of nodes on the current path in the search tree can temporarily be removed. They can recreated from the embeddings of their base fragments when recursion returns and the fragment is actually processed.
protected void reembed()
As long as sibling fragments are processed, the embeddings of a fragment need not be avaiblable and thus can temporarily be removed. They are recreated when the fragment is actually processed. This is done by extending their base embeddings or, if this is impossible, by reembedding the fragment.
protected void setValid(boolean valid)
By default a fragment is valid. The valid flag is cleared for a fragment that is not canonical (and thus must not be reported), but nevertheless has to be kept and processed. Such fragments can occur when ring extensions are combined with canonical form pruning.
valid
- whether to set or clear the valid flagprotected boolean isValid()
If the valid flag is cleared, the fragment need not be reported, because it is non-canonical and thus a duplicate.
protected boolean isRingExt()
protected boolean isRingEdgeExt()
protected boolean isPerfect(boolean chain)
An extension is perfect if it can be applied to all base embeddings. If this is the case, all search tree branches to the right of the perfect extension can be pruned (partial perfect extension pruning), since no closed frequent fragment can be contained in these branches, or even all search tree branches other than the perfect extension branch can be pruned (full perfect extension pruning).
In order to avoid certain problems with rings, such pruning must be restricted to extension edges that are bridges in all graphs or that close a ring (in all graphs). Ring edges that do not close a ring can lead to problems if they are part of rings of different sizes in the underlying graphs and thus are not considered as candidates for perfect extensions. Furthermore, if chain extensions are considered, an edge that could be the start of a chain is also not taken into account as a candidate for a perfect extension.
If a fragment is found to have resulted from a perfect extension, it is marked as perfect, so that a second call to this function does not repeat the checks.
chain
- whether there are sibling chain extensionsprotected void revert()
If a perfect extension is followed as the only branch and there are search tree branches to the left of this branch (where "to the left" means "referring to fragments with lexicographically smaller code words"), the extension information must be reset to that of the base fragment, so that no fragments are lost.
protected boolean equivSiblings()
If extensions have been filtered with node orbits, a fragment can have equivalent siblings only if it is the result of a ring or chain extension or the result of an edge extension that closes a ring. If no orbit filtering was carried out, any fragment may possess equivalent siblings.
protected boolean isEquivTo(Fragment frag)
Two fragments are equivalent if they refer to the same nodes and edges, although maybe in a different way. To check this, it is tried to find an equivalent embedding of the two fragments for one graph. Technically this is done by marking one embedding of one fragment in the graph referred to and then checking whether any embedding of the other fragment into the same graph is fully marked.
frag
- the fragment to compare toprotected void setClosed(boolean closed)
In the search it may be possible to determine, without an
explicit test, that a fragment is not closed, namely if one of
the extensions of the fragment that are considered in the search
has the same support. In this case the fragments closed flag
is cleared, so that the test function isClosed()
returns immediately with a negative result. By default the
closed flag is set for a fragment.
protected boolean isClosed(CanonicalForm cnf)
A fragment is closed if no superstructure has the same support, that is, occurs in the same number of graphs in the database. Closed fragments are not reported.
The basic idea of the test is to form a list of all possible extensions of the embeddings into the first graph and then trying to do the same extensions on embeddings into the remaining graphs. Whenever an extension cannot be done on some embedding, the extension is removed from the list. That is, the list always contains the extensions that are possible in all already processed graphs. If this list gets empty during the process, there is no extension that is possible in all graphs and consequently the fragment is closed. If, on the other hand, the list contains at least one element after all graphs have been processed, the fragment is not closed, since each of the extensions that remain in the list can be done in all graphs and thus lead to a superstructure with the same support.
cnf
- the canonical form defining
which extensions must be consideredprotected boolean isCanonic(CanonicalForm cnf)
cnf
- the canonical formprotected int isCanonic(CanonicalForm cnf, boolean partial)
The test may be executed in a partial form, in which only the edges up to the last added edge are seen as fixed, while the rest can be reordered freely to achieve canonical form.
cnf
- the canonical formpartial
- whether to carry out a partial test-1, | if the fragment differs from the canonical form in the fixed edges, |
0, | if it is not canonical, but does not differ in the fixed edges, |
+1, | if the fragment is canonical. |
protected void map(CanonicalForm cnf)
The reorganization consists in reordering the nodes and
edges of the subgraph representing the fragment as well
as reordering the node and edge references of all embeddings.
This function is called in makeCanonic()
and in
adapt()
to carry out the changes determined by
the corresponding canonical form.
cnf
- the canonical form containing the mapprotected boolean makeCanonic(CanonicalForm cnf)
cnf
- the canonical formprotected boolean makeCanonic(CanonicalForm cnf, int keep)
In the process of trying to minimize the code word, the first
keep
edges of the fragment are left untouched. Hence
the result may or may not be the overall minimal code word.
cnf
- the canonical formkeep
- the number of edges to keepprotected boolean hasOrbits()
protected boolean hasOpenRings(int min, int max)
min
- minimum ring size (number of nodes/edges)max
- maximum ring size (number of nodes/edges)protected boolean hasUnclosableRings(CanonicalForm cnf)
If the output is restricted to fragments containing only closed rings, the restricted extensions (as they can be derived from a canonical form) render certain nodes unextendable. If such an node has only one incident ring edge, the ring of which this edge is part cannot be closed by future extensions. Hence neither this fragment nor any of its extensions can produce output and thus it can be pruned.
cnf
- the canonical form defining the unextendable nodesprotected int mergeExts(Fragment[] exts, int cnt)
This function does not work with packed embeddings lists due to ordering problems. However, the fact that the set of embeddings is reduced would also lead to a considerable loss of information if embeddings were discarded and recreated by reembedding.
exts
- the array of ring extension fragments to mergecnt
- the number of fragments to considerexts
after mergingprotected boolean adapt(CanonicalForm cnf, boolean check)
For full perfect extension pruning and for combining ring extensions with canonical form pruning it is necessary to allow for a (strictly limited) reorganization of a fragment (a certain reordering of the edges and nodes), so that certain edges, which may have be added in a wrong order (w.r.t. the canonical form), are brought into the proper order. The core idea is to split the list of edges into to parts: a fixed prefix part and a volatile suffix part. Only the volatile suffix part is adapted in this function.
cnf
- the canonical formcheck
- whether to check for a valid ring orderprotected boolean chainsValid()
A variable length chain is valid if it actually represents chains of different length in the underlying graphs. If, however, a chain has the same length in all underlying graphs, it is not valid, as it could be represented explicitly.
public boolean equalsCanonic(Fragment frag)
This function determines whether the code words of two fragments (the fragment the function is called on and the argument fragment), as they are implicitly represented by the order of their nodes and edges, are equal.
frag
- the fragment to compare topublic java.lang.String toString()
toString
in class java.lang.Object
public java.lang.String toString(Notation ntn)
ntn
- the notation to use for the descriptionpublic java.lang.String toString(CanonicalForm cnf)
cnf
- the canonical form defining the code word formpublic static void main(java.lang.String[] args)
It is tried to parse the first two command line arguments as SMILES, SLN, or LiNoG descriptions of graph (in this order). If parsing suceeds, the second graph is embedded into the first and the number of embeddings is printed..
args
- the command line arguments