org.peace_tools.data
Class ClusterFile

java.lang.Object
  extended by org.peace_tools.data.ClusterFile

public class ClusterFile
extends java.lang.Object

The top-level class that encapsulates all the pertinent information regarding a cluster data file. This class deserializes the information in a cluster data file generated by PEACE and stores it in memory. The core cluster information is held in memory as a hierarchically nested set of ClusterNode objects. In addition, this class maintains any generated information that PEACE places in the file.

Note that the in-memory format represented by this class has been designed primarily to provide convenient access to the related information and to support display in a GUI. However, this class does not directly perform any GUI-related task. Instead, the GUI display is organized using the MVC (Model-View-Controller) design pattern, and this class constitutes the "model" in MVC terminology.

Note: To create a valid ClusterFile, use the static loadCluster(File) method in this class.


Field Summary
private  java.lang.String fileName
          The name of the file from which the data was read.
private  java.util.ArrayList<Pair> metadata
          The set of meta data that was loaded from the cluster file.
private  java.lang.Object prevClusterList
          This reference is used to track the object used to hold the list of DBClassifiers in the work space.
private  ClusterNode root
          The root node of the cluster hierarchy.
 
Constructor Summary
private ClusterFile(java.lang.String fileName)
          The constructor creates an empty cluster object.
 
Method Summary
 void classify(ESTList estList, javax.swing.ProgressMonitor pm)
          Method to recompute (as needed) the classification of ESTs in clusters.
 java.lang.String getFileName()
          The absolute path to the file from which the cluster data was originally loaded.
 java.util.ArrayList<Pair> getMetadata()
          Obtain the meta data associated with this cluster file.
 ClusterNode getRoot()
          Obtain the top-level root cluster for this cluster file.
 boolean isClassified()
          Determine if the clusters are current with work space classifiers.
static ClusterFile loadCluster(java.io.File clusterFile)
          This method loads cluster data into an in-memory format.
static ClusterFile loadCluster(java.lang.String fileName, java.io.InputStream is)
          This method loads cluster data into an in-memory format.
protected static void makeClusterNode(java.util.ArrayList<ClusterNode> nodeList, java.lang.String line)
          Helper method to process a comma separated set of values representing a cluster node.
protected static Pair makeMetadataEntry(java.lang.String line)
          Helper method that parses a meta data line (a line starting with a '#' character) and returns it as a Pair containing the name and value.
 void print(java.io.PrintStream out)
          Method to print the cluster in a simple text-based format.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fileName

private java.lang.String fileName
The name of the file from which the data was read. This is typically the absolute path of that file.


root

private ClusterNode root
The root node of the cluster hierarchy. Typically there is at least one root node.


metadata

private java.util.ArrayList<Pair> metadata
The set of meta data that was loaded from the cluster file. The meta data is stored as a list of name, value pairs.


prevClusterList

private java.lang.Object prevClusterList
This reference is used to track the object used to hold the list of DBClassifiers in the work space. This information is used to decide if the classifications need to be recomputed. This works because the classifiers are modified as a single complete batch of entries, with a new list object being set each time. If the tracked object differs from the current one, the classifications need to be recomputed for this cluster file.
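
The identity-based check described above can be sketched as follows. This is a minimal illustration and not the PEACE implementation: the class StaleCheckSketch and its methods isCurrent and refreshIfChanged are hypothetical names, and the recomputation itself is left as a comment.

```java
// Hypothetical sketch: decides whether work must be redone by comparing
// object identity (==) rather than contents (equals).
final class StaleCheckSketch {
    // The list object seen when results were last computed (null initially).
    private Object prevList;

    // True if results were computed against exactly this list object.
    boolean isCurrent(Object currentList) {
        return prevList == currentList;
    }

    // Returns true if a recomputation was needed (and the tracker updated).
    boolean refreshIfChanged(Object currentList) {
        if (prevList == currentList) {
            return false; // same object as last time, nothing to do
        }
        // ... the expensive recomputation would happen here ...
        prevList = currentList;
        return true;
    }
}
```

Comparing references is cheap and is sufficient only because the list is replaced as a whole whenever it changes; a list that is modified in place would not be detected by this check.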

Constructor Detail

ClusterFile

private ClusterFile(java.lang.String fileName)
The constructor creates an empty cluster object. This constructor is called from the loadCluster method. The object is filled in later as data is read from the cluster file.

Parameters:
fileName - The absolute path to the file from where the cluster data was loaded. This file name is used as an identifier to locate the files.

Method Detail

getFileName

public java.lang.String getFileName()
The absolute path to the file from which the cluster data was originally loaded.

Returns:
The absolute path to the file that uniquely identifies the contents of the cluster.

print

public void print(java.io.PrintStream out)
Method to print the cluster in a simple text-based format. This method is primarily used for validating the cluster data to ensure that the data was loaded correctly.

Parameters:
out - The output stream to which the cluster data is to be written.
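
As an illustration of printing a nested hierarchy in a simple text-based format, the sketch below walks a tree recursively and indents each level. The Node class, the two-space indentation, and the one-label-per-line layout are assumptions made for this example; the actual output format of print is not specified here.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: prints a nested node hierarchy, one node per line,
// indenting each level by two spaces.
final class TreePrintSketch {
    // Stand-in for a cluster node; not the real ClusterNode class.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        // Adds a child and returns this node so calls can be chained.
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Prints the node at the given depth, then its children one level deeper.
    static void print(Node node, PrintStream out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.print("  ");
        }
        out.print(node.label);
        out.print('\n');
        for (Node child : node.children) {
            print(child, out, depth + 1);
        }
    }
}
```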

getRoot

public ClusterNode getRoot()
Obtain the top-level root cluster for this cluster file.

Returns:
The top-level root cluster for this cluster file.

getMetadata

public java.util.ArrayList<Pair> getMetadata()
Obtain the meta data associated with this cluster file.

Returns:
The meta data associated with this cluster file.

loadCluster

public static ClusterFile loadCluster(java.io.File clusterFile)
                               throws java.lang.Exception
This method loads cluster data into an in-memory format. This method must be used to load cluster data from a PEACE generated cluster data file and deserialize the information into the in-memory format. The in-memory format provides a hierarchical organization that is more streamlined and easier to display in the GUI.

Parameters:
clusterFile - The cluster file (generated by PEACE) from which the data is to be loaded into the in-memory format.
Returns:
On success this method returns a valid cluster data structure loaded from the file.
Throws:
java.lang.Exception - This method throws an exception on errors.

loadCluster

public static ClusterFile loadCluster(java.lang.String fileName,
                                      java.io.InputStream is)
                               throws java.lang.Exception
This method loads cluster data into an in-memory format. This method must be used to load cluster data from a PEACE generated cluster data file and deserialize the information into the in-memory format. The in-memory format provides a hierarchical organization that is more streamlined and easier to display in the GUI.

Parameters:
fileName - The absolute path to the file from which the data is being read.
is - The input stream from which the data is to be read.
Returns:
On success this method returns a valid cluster data structure loaded from the file.
Throws:
java.lang.Exception - This method throws an exception on errors.
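
The helper methods documented for this class suggest that a cluster file mixes meta data lines (starting with a '#' character) with comma separated node lines. A minimal sketch of reading an input stream and routing each line by its first character is shown below. The class LineRoutingSketch, the UTF-8 encoding, and the skipping of blank lines are assumptions made for illustration and are not taken from the PEACE implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: separates '#' meta data lines from node data lines.
final class LineRoutingSketch {
    final List<String> metadataLines = new ArrayList<>();
    final List<String> nodeLines = new ArrayList<>();

    // Reads the stream to its end and closes it when done.
    static LineRoutingSketch read(InputStream is) throws IOException {
        LineRoutingSketch result = new LineRoutingSketch();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                if (line.startsWith("#")) {
                    result.metadataLines.add(line);
                } else {
                    result.nodeLines.add(line);
                }
            }
        }
        return result;
    }
}
```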

isClassified

public boolean isClassified()
Determine if the clusters are current with work space classifiers. This method can be used to determine if the clusters have already been classified using the current set of classifiers configured for this work space.

Returns:
This method returns true if the clusters have already been classified using the current set of classifiers.

classify

public void classify(ESTList estList,
                     javax.swing.ProgressMonitor pm)
Method to recompute (as needed) the classification of ESTs in clusters. This method must be used to recompute the classification of ESTs in a cluster whenever the data base classifiers in the EST change. This method recomputes classifications only if the classifier list in the work space has actually changed. Consequently, repeatedly calling this method does not have side effects. However, if classification is performed, it may be a long-running process, particularly for large EST sets. It is therefore best to invoke this method from a separate thread so that the GUI does not appear to hang while classifications are computed.

Note: This method assumes that the classifications for the ESTs in the estList have already been computed.

Parameters:
estList - The list of ESTs associated with this cluster file to be used to classify the ESTs.
pm - An optional progress monitor to be updated to indicate progress. This parameter can be null.
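
One way to follow the advice about invoking a long-running call from a separate thread is sketched below. The helper class BackgroundTaskSketch and its runInBackground method are hypothetical; the Runnable stands in for the actual call to classify, and a single-threaded executor is only one of several reasonable choices.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: runs a long task off the calling thread.
final class BackgroundTaskSketch {
    // Starts the task on a worker thread and returns immediately.
    // The returned Future can be used to wait for completion if needed.
    static Future<?> runInBackground(Runnable longTask) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<?> done = worker.submit(longTask);
        // No further tasks are accepted; the submitted task still runs,
        // and the worker thread exits once it finishes.
        worker.shutdown();
        return done;
    }
}
```

Assuming variables clusterFile, estList, and pm already exist in the caller, a call might look like runInBackground(() -> clusterFile.classify(estList, pm)), which leaves the calling thread free to keep the GUI responsive.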

makeClusterNode

protected static void makeClusterNode(java.util.ArrayList<ClusterNode> nodeList,
                                      java.lang.String line)
                               throws java.io.IOException
Helper method to process a comma-separated set of values representing a cluster node. It parses a data line from the cluster file, validates the data, and converts it to a ClusterNode. The new ClusterNode is then added to its parent (if one is present) and to the nodeList.

Parameters:
nodeList - The list of nodes that have been read so far.
line - The line containing node data to be processed and converted to a ClusterNode.
Throws:
java.io.IOException - This method throws an exception if the data was invalid or not read.
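
The parse, validate, and attach-to-parent sequence described above can be sketched as follows. The line layout used here ("id,parentId", with -1 meaning no parent) is purely hypothetical and is not the PEACE cluster file format; the Node class is likewise a stand-in for ClusterNode. Only the overall flow mirrors the description.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: turns an "id,parentId" line into a node and links it.
final class NodeLineSketch {
    // Stand-in for a cluster node; not the real ClusterNode class.
    static final class Node {
        final int id;
        final List<Node> children = new ArrayList<>();

        Node(int id) {
            this.id = id;
        }
    }

    static void makeNode(ArrayList<Node> nodeList, String line) throws IOException {
        String[] fields = line.split(",");
        if (fields.length != 2) {
            throw new IOException("Expected 2 fields in line: " + line);
        }
        int id;
        int parentId;
        try {
            id = Integer.parseInt(fields[0].trim());
            parentId = Integer.parseInt(fields[1].trim());
        } catch (NumberFormatException e) {
            throw new IOException("Non-numeric field in line: " + line, e);
        }
        Node node = new Node(id);
        if (parentId != -1) { // assumption: -1 marks a node without a parent
            Node parent = null;
            for (Node candidate : nodeList) {
                if (candidate.id == parentId) {
                    parent = candidate;
                    break;
                }
            }
            if (parent == null) {
                throw new IOException("Unknown parent " + parentId + " in line: " + line);
            }
            parent.children.add(node);
        }
        nodeList.add(node);
    }
}
```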

makeMetadataEntry

protected static Pair makeMetadataEntry(java.lang.String line)
Helper method that parses a meta data line (a line starting with a '#' character) and returns it as a Pair containing the name and value. It assumes that the name and value are separated by ':'.

Parameters:
line - The line from the cluster file to be processed.
Returns:
A pair object containing the name, value pair.
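
A minimal sketch of this kind of parsing is shown below. It uses java.util.Map.Entry in place of the Pair class so that the example is self-contained. Splitting at the first ':' only, trimming surrounding whitespace, and returning an empty value when no ':' is present are assumptions made for this example rather than documented behavior.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical sketch: parses "#name: value" into a name, value entry.
final class MetadataLineSketch {
    static Map.Entry<String, String> parse(String line) {
        if (line == null || !line.startsWith("#")) {
            throw new IllegalArgumentException("Not a meta data line: " + line);
        }
        String body = line.substring(1); // drop the leading '#'
        int colon = body.indexOf(':');   // assumption: split at the first ':'
        if (colon < 0) {
            // assumption: a line without ':' is a name with an empty value
            return new SimpleEntry<>(body.trim(), "");
        }
        String name = body.substring(0, colon).trim();
        String value = body.substring(colon + 1).trim();
        return new SimpleEntry<>(name, value);
    }
}
```

Splitting at the first ':' keeps values that themselves contain a colon (for example, a time of day) intact.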