public class Recoder
extends java.lang.Object
Processing the node types in increasing order of their frequency can speed up the mining of frequent substructures considerably, so it is advisable to recode the node types to reflect the frequency order.
A recoder is implemented as a hash table (for encoding types) and an accompanying array (for decoding type codes).
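The hash table plus array layout can be illustrated with a minimal sketch. This is not the actual Recoder source: the class name RecoderSketch, the use of java.util collections, the handling of a type that is added twice, and the -1 result for an unknown type are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a type/code table; not the actual Recoder class. */
class RecoderSketch {
    private final Map<Integer, Integer> codes = new HashMap<>(); // type -> code (encoding)
    private final List<Integer> types = new ArrayList<>();       // code -> type (decoding)

    /** Add a type and return its code; the next free code is the current size. */
    int add(int type) {
        Integer code = codes.get(type);
        if (code != null) return code;   // assumption: re-adding returns the old code
        int next = types.size();
        codes.put(type, next);
        types.add(type);
        return next;
    }

    /** Return the code of a type, or -1 if the type is unknown (assumption). */
    int encode(int type) {
        Integer code = codes.get(type);
        return (code == null) ? -1 : code;
    }

    /** Return the original type value for a code. */
    int decode(int code) { return types.get(code); }

    /** Number of stored type/code pairs. */
    int size() { return types.size(); }

    public static void main(String[] args) {
        RecoderSketch r = new RecoderSketch();
        int c6 = r.add(6);               // first type receives code 0
        int c8 = r.add(8);               // second type receives code 1
        System.out.println(c6 + " " + c8 + " " + r.decode(r.encode(8)));
    }
}
```

Because each new type receives the current size as its code, the codes are consecutive integers starting at 0, so decoding is a plain index lookup.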
Constructor and Description |
---|
Recoder(): Create a recoder of default size. |
Modifier and Type | Method and Description |
---|---|
int | add(int type): Add a type to the recoder. |
void | clear(int code): Clear the frequency and support of a type. |
void | commit(): Commit a type code counting. |
void | count(int code): Count a type code. |
int | decode(int code): Decode a type code, that is, retrieve the original type value. |
int | encode(int type): Encode a type, that is, retrieve its code. |
void | exclude(int code): Mark a type as excluded. |
int | getFreq(int type): Get the frequency of a type (number of occurrences). |
int | getSupp(int type): Get the support of a type (number of containing graphs). |
boolean | isExcluded(int code): Check whether a type is excluded. |
boolean | isMaximal(int code): Check whether a code has maximal frequency. |
void | maximize(int code): Set frequency and support to a maximal value. |
int | size(): Get the size of the recoder. |
void | sort(): Sort types w.r.t. their frequency. |
void | trim(boolean freq, int min): Trim the recoder with a minimum support or frequency. |
void | trim(int min): Trim the recoder with a minimum support. |
void | trimFreq(int min): Trim the recoder with a minimum frequency. |
void | trimSupp(int min): Trim the recoder with a minimum support. |
public Recoder()
public int size()
The size is the number of stored type/code pairs.
public int add(int type)
The added type is assigned the next code, which is the size of the recoder before the new type was added. This ensures that type codes are consecutive integers starting at 0.
type - the type to add to the recoder
public int encode(int type)
type - the type to encode
public int decode(int code)
code - the type code to decode
public void count(int code)
Increment the internal counters for the frequency (and possibly also for the support) of this type code.
code - the type code to count
public void commit()
This function must be called after each graph whose node types have been counted, so that the support of a type (the number of graphs that contain a node of that type) can be determined.
See also: count(int)
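The interplay of count() and commit() can be sketched as follows. This is only one plausible way to separate frequency from support, not the actual implementation: the class name CountCommitSketch, the per-graph seen flags, and the fixed-size arrays indexed by code are assumptions.

```java
import java.util.Arrays;

/** Hypothetical sketch of frequency/support counting; not the actual Recoder class. */
class CountCommitSketch {
    private final int[] freq;      // occurrences of each type code over all graphs
    private final int[] supp;      // number of graphs containing each type code
    private final boolean[] seen;  // codes already counted in the current graph (assumption)

    CountCommitSketch(int size) {
        freq = new int[size];
        supp = new int[size];
        seen = new boolean[size];
    }

    /** Count one occurrence; the support grows only once per graph. */
    void count(int code) {
        freq[code]++;
        if (!seen[code]) {
            seen[code] = true;
            supp[code]++;
        }
    }

    /** Close the current graph so the next graph can contribute to the support again. */
    void commit() { Arrays.fill(seen, false); }

    int getFreq(int code) { return freq[code]; }
    int getSupp(int code) { return supp[code]; }

    public static void main(String[] args) {
        CountCommitSketch c = new CountCommitSketch(2);
        for (int code : new int[] {0, 0, 1}) c.count(code);  // nodes of graph 1
        c.commit();
        c.count(0);                                          // single node of graph 2
        c.commit();
        System.out.println(c.getFreq(0) + " " + c.getSupp(0)); // prints "3 2"
    }
}
```

Without the commit() call between graphs, the second graph would not raise the support of code 0, which is why the call is required after every graph.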
public int getFreq(int type)
type - the type of which to get the frequency
public int getSupp(int type)
type - the type of which to get the support
public void trim(boolean freq, int min)
All types with a support or a frequency below the given minimum support or frequency are marked as excluded. Note that these types are only marked, not actually removed from the recoder. Hence it is possible to reactivate them, for example by calling the function clear() for such a type.
min - the minimum support or frequency of a type
See also: trimFreq(int), trimSupp(int)
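The marking step of trimming can be sketched as below. This is a simplified illustration, not the actual implementation: the class name TrimSketch, the parameter name useFreq, and the returned boolean mask are hypothetical, and the reading that the boolean argument selects frequency (true) or support (false) is an assumption based on the method summary.

```java
/** Hypothetical sketch of trimming; not the actual Recoder class. */
class TrimSketch {
    /**
     * Mark every code whose frequency (useFreq == true) or support
     * (useFreq == false) lies below min. Nothing is removed, so a
     * marked code can be reactivated later.
     */
    static boolean[] trim(int[] freq, int[] supp, boolean useFreq, int min) {
        boolean[] excluded = new boolean[freq.length];
        for (int code = 0; code < freq.length; code++) {
            int value = useFreq ? freq[code] : supp[code];
            excluded[code] = value < min;
        }
        return excluded;
    }

    public static void main(String[] args) {
        int[] freq = {5, 1, 3};
        int[] supp = {2, 1, 3};
        boolean[] bySupport = trim(freq, supp, false, 2);
        // Only code 1 has a support below 2.
        System.out.println(java.util.Arrays.toString(bySupport));
    }
}
```

Under this reading, trimSupp(min) and trimFreq(min) would correspond to calling the sketch with useFreq set to false and true, respectively.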
public void trim(int min)
min - the minimum support of a type
See also: trimSupp(int)
public void trimSupp(int min)
min - the minimum support of a type
See also: trim(boolean,int)
public void trimFreq(int min)
min - the minimum frequency of a type
See also: trim(boolean,int)
public void clear(int code)
Calling this function also removes a possible marking of the type as excluded.
code - the code of the type for which to clear the counters
public void exclude(int code)
Note that marking a type as excluded discards its frequency and support information. The function sort() moves excluded types to the front, that is, sorting assigns the lowest codes to excluded types.
code - the code of the type to mark as excluded
public boolean isExcluded(int code)
Types can be excluded by explicitly calling the function exclude() or by trimming the recoder with a minimum support or frequency (by calling the function trim()).
code - the code of the type to check
See also: exclude(int)
public void maximize(int code)
This function indirectly offers the possibility to move a type to the end of the recoder. Since types are sorted w.r.t. their frequency, a type with maximal frequency ends up at the end of the recoder. This is needed if certain types are to be treated in a special way, independent of their frequency in the graph database.
code - the code of the type for which to maximize the frequency
public boolean isMaximal(int code)
code - the code of the type to check
See also: maximize(int)
public void sort()
The types are sorted in ascending order of their frequency, so that the least frequent type receives the code 0, the second least frequent the code 1, and so on. Excluded types precede all non-excluded types; maximized types succeed all other types.
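The resulting order can be sketched with sentinel frequencies. This is an illustration under assumptions, not the actual implementation: the class name SortSketch, the sentinel values EXCLUDED and MAXIMAL, and the method sortTypes are hypothetical. Encoding the excluded state in the frequency itself is merely consistent with the note that excluding a type loses its frequency information.

```java
import java.util.Arrays;

/** Hypothetical sketch of frequency sorting; not the actual Recoder class. */
class SortSketch {
    static final int EXCLUDED = -1;                // assumed sentinel: sorts to the front
    static final int MAXIMAL = Integer.MAX_VALUE;  // assumed sentinel: sorts to the end

    /** Return the types reordered so that the array index is the new code. */
    static int[] sortTypes(int[] types, int[] freq) {
        Integer[] order = new Integer[types.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Ascending by frequency; the object sort is stable, so ties keep their old order.
        Arrays.sort(order, (a, b) -> Integer.compare(freq[a], freq[b]));
        int[] sorted = new int[types.length];
        for (int code = 0; code < order.length; code++) {
            sorted[code] = types[order[code]];
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] types = {10, 20, 30, 40};
        int[] freq = {5, MAXIMAL, EXCLUDED, 2};
        // Excluded type 30 comes first, maximized type 20 comes last.
        System.out.println(Arrays.toString(sortTypes(types, freq)));
    }
}
```

With these sentinels a single ascending sort yields all three groups in the documented order: excluded types, then regular types by frequency, then maximized types.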