uk.ac.ebi.intact.application.dataConversion
Class FileGenerator

java.lang.Object
  |
  +--uk.ac.ebi.intact.application.dataConversion.FileGenerator

public class FileGenerator
extends java.lang.Object

This is the main application class for generating a flat file from the contents of a database. Currently the file format is PSI and the supported databases are PostgreSQL and Oracle, although the database details are hidden behind the IntactHelper/persistence layer as usual.

Version:
$id$
Author:
Chris Lewington

Field Summary
static int LARGESCALESIZE
           
static int SMALLSCALELIMIT
           
 
Constructor Summary
FileGenerator(IntactHelper helper, DataBuilder builder)
           
 
Method Summary
 void buildFileData(java.util.Collection searchResults)
          Creates the necessary file data from the DB information, using the specified DataBuilder.
static java.util.HashMap classifyExperiments(java.lang.String searchPattern)
          Classify experiments matching searchPattern into a data structure according to species and experiment size.
 void generateFile(java.lang.String fileName)
          Generates the file from the previously created data.
static void generatePsiData(java.lang.String searchPattern, java.lang.String fileName)
          Generates a PSI MI formatted file for a searchPattern.
 java.util.Collection getDbData(java.lang.String searchPattern)
          Obtains the data from the dataSource, in preparation for the flat file generation.
static void main(java.lang.String[] args)
          Main method for the PSI application.
static void processLargeExperiment(Experiment exp, java.lang.String fileName)
           
static void writeExperimentsClassification(java.util.HashMap allExp)
          Outputs the experiment classification in a form suitable for scripting.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SMALLSCALELIMIT

public static final int SMALLSCALELIMIT
See Also:
Constant Field Values

LARGESCALESIZE

public static final int LARGESCALESIZE
See Also:
Constant Field Values
Constructor Detail

FileGenerator

public FileGenerator(IntactHelper helper,
                     DataBuilder builder)
Method Detail

getDbData

public java.util.Collection getDbData(java.lang.String searchPattern)
                               throws IntactException
Obtains the data from the dataSource, in preparation for the flat file generation.

Parameters:
searchPattern - for search by shortLabel. May be a comma-separated list.
Throws:
IntactException - thrown if there was a search problem

buildFileData

public void buildFileData(java.util.Collection searchResults)
                   throws ElementNotParseableException
Creates the necessary file data from the DB information, using the specified DataBuilder.

Throws:
ElementNotParseableException

generateFile

public void generateFile(java.lang.String fileName)
                  throws DataConversionException
Generates the file from the previously created data.

Parameters:
fileName - the name to use for the generated file.
Throws:
DataConversionException - thrown if there were problems with writing the data.
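
The three instance methods above appear intended to be called in sequence: getDbData, then buildFileData, then generateFile. The following is a minimal sketch of that call order only. It does not use the real IntAct classes: the Generator interface and RecordingGenerator stub are hypothetical stand-ins that mirror the documented method names, and the checked exceptions are simplified to java.lang.Exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class PipelineSketch {

    /** Hypothetical stand-in mirroring the three documented methods; NOT the real FileGenerator. */
    interface Generator {
        Collection<?> getDbData(String searchPattern) throws Exception;
        void buildFileData(Collection<?> searchResults) throws Exception;
        void generateFile(String fileName) throws Exception;
    }

    /** Runs the three steps in the order the method descriptions imply. */
    static void export(Generator generator, String searchPattern, String fileName) throws Exception {
        Collection<?> results = generator.getDbData(searchPattern); // 1. query by shortLabel
        generator.buildFileData(results);                           // 2. build the file data
        generator.generateFile(fileName);                           // 3. write the file
    }

    /** Stub that only records which calls were made, so the order can be inspected. */
    static final class RecordingGenerator implements Generator {
        final List<String> calls = new ArrayList<>();

        public Collection<?> getDbData(String searchPattern) {
            calls.add("getDbData:" + searchPattern);
            return Arrays.asList("result-1", "result-2");
        }

        public void buildFileData(Collection<?> searchResults) {
            calls.add("buildFileData:" + searchResults.size());
        }

        public void generateFile(String fileName) {
            calls.add("generateFile:" + fileName);
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingGenerator stub = new RecordingGenerator();
        // The labels and file name are made-up example values.
        export(stub, "label-a,label-b", "out.xml");
        System.out.println(stub.calls);
    }
}
```

With the real class, the same sequence would be invoked on a FileGenerator built from an IntactHelper and a DataBuilder; how those two objects are obtained is not covered on this page.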

writeExperimentsClassification

public static void writeExperimentsClassification(java.util.HashMap allExp)
Outputs the experiment classification in a form suitable for scripting.

Parameters:
allExp - HashMap of HashMap of ArrayLists of Experiments: {species}{scale}[n]

classifyExperiments

public static java.util.HashMap classifyExperiments(java.lang.String searchPattern)
                                             throws IntactException
Classify experiments matching searchPattern into a data structure according to species and experiment size.

Parameters:
searchPattern -
Returns:
HashMap of HashMap of ArrayLists of Experiments: {species}{scale}[n]
Throws:
IntactException
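
The {species}{scale}[n] notation describes a map keyed by species, whose values are maps keyed by scale, whose values are lists of experiments. The sketch below builds such a structure using plain strings in place of Experiment objects. The threshold value, the "small"/"large" keys, and the use of an interaction count as the size measure are all assumptions made for illustration; this page does not state them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ClassificationSketch {

    // Assumed value; the actual FileGenerator.SMALLSCALELIMIT is not given on this page.
    static final int ASSUMED_SMALL_SCALE_LIMIT = 500;

    /** Files one experiment label under its species and its scale ("small" or "large"). */
    static void classify(Map<String, Map<String, List<String>>> all,
                         String species, String shortLabel, int interactionCount) {
        // Assumption: an experiment counts as large once it exceeds the limit.
        String scale = interactionCount > ASSUMED_SMALL_SCALE_LIMIT ? "large" : "small";
        all.computeIfAbsent(species, key -> new HashMap<>())
           .computeIfAbsent(scale, key -> new ArrayList<>())
           .add(shortLabel);
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> all = new HashMap<>();
        // Species names, labels and counts are made-up example data.
        classify(all, "human", "exp-a", 12);
        classify(all, "human", "exp-b", 40);
        classify(all, "yeast", "exp-c", 4000);
        System.out.println(all.get("human").get("small"));
        System.out.println(all.get("yeast").get("large"));
    }
}
```

Reading all.get(species).get(scale).get(n) then corresponds directly to {species}{scale}[n].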

generatePsiData

public static void generatePsiData(java.lang.String searchPattern,
                                   java.lang.String fileName)
                            throws java.lang.Exception
Generates a PSI MI formatted file for a searchPattern. Large-scale experiments will typically have a searchPattern consisting of a single shortLabel, and these are processed in chunks. Small-scale experiments will generally have a searchPattern consisting of multiple shortLabels, and these are placed into a single file.

Parameters:
searchPattern -
fileName -
Throws:
java.lang.Exception
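
Processing a large-scale experiment "as chunks" can be pictured as splitting its interactions into consecutive fixed-size groups and handling each group separately. The sketch below shows only that splitting step on a generic list. The chunk size used in main is an arbitrary example value, not the real LARGESCALESIZE, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkSketch {

    /** Splits items into consecutive sublists holding at most chunkSize elements each. */
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Seven made-up interaction identifiers, split into groups of three.
        List<String> interactions = Arrays.asList("i1", "i2", "i3", "i4", "i5", "i6", "i7");
        for (List<String> part : chunk(interactions, 3)) {
            System.out.println(part);
        }
    }
}
```

Only the last chunk can be shorter than the requested size, so memory use per chunk stays bounded regardless of how large the experiment is.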

processLargeExperiment

public static void processLargeExperiment(Experiment exp,
                                          java.lang.String fileName)
                                   throws java.lang.Exception
Throws:
java.lang.Exception

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Main method for the PSI application. The application is typically run twice: first with a wildcard ('%') argument to generate a file that classifies experiment labels by species, then again to use that file to generate the PSI XML data for each classification. This second step is handled by a Perl script that repeatedly calls this application to generate the files. The exceptions to the species classification are large-scale experiments (as defined by the SMALLSCALELIMIT constant): these cannot be put into XML files with other experiments because of size and memory constraints, so each one is generated as several 'chunks' of data, divided up by 'chunks' of interactions.

Parameters:
args -
Throws:
java.lang.Exception


IntAct Project - EMBL-EBI 2004 - intact-help@ebi.ac.uk