MMTSB
Tool Set Documentation

enscluster.pl

Usage

usage:   enscluster.pl [options] tag
options: [-jclust] [-kclust]
         [-maxnum value] [-minsize value] [-maxlevel value]
         [-radius value] [-[no]iterate]
         [-mode rmsd|contact|phi|psi|phipsi|mix]
         [-contmaxdist value] [-mixfactor value]
         [-l min:max[=min:max ...]] [-fit min:max[=min:max] | -fitxl]
         [-selmode ca|cb|cab|heavy|all]
         [-[no]lsqfit]
         [-dir workdir]
         [-opt file[:file]]

Description

This script applies a clustering algorithm to ensemble structures. Its options and functionality are very similar to those of cluster.pl. The differences are that an ensemble tag is expected instead of a list of files, and that the output is stored in a file tag.cluster in the ensemble data directory. The clustering options are also stored in, and read from, the options file associated with the ensemble tag.

In addition to the parameters from cluster.pl, the parameter -dir is used to specify the ensemble directory. With -opt, options files other than the default one can be read in.

For fragment/loop modeling, the residue range may be specified as in cluster.pl. If a residue range has previously been stored in the ensemble configuration file, clustering is based only on the corresponding residue subset, even if -l is not given explicitly. Fitting for RMSD-based clustering is always done on the protein template surrounding the selected residues.
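
The residue range syntax min:max[=min:max ...] shown in the usage can be illustrated with a small Python sketch. This is not part of MMTSB; the helper names parse_ranges and in_ranges and the example range are hypothetical and only show how such a specification might be read.

```python
def parse_ranges(spec):
    """Parse a 'min:max[=min:max ...]' string into a list of (min, max) tuples.

    Hypothetical helper for illustration; not taken from the MMTSB source.
    """
    ranges = []
    for part in spec.split("="):
        lo_text, hi_text = part.split(":")
        lo, hi = int(lo_text), int(hi_text)
        if lo > hi:
            raise ValueError(f"range start exceeds end: {part}")
        ranges.append((lo, hi))
    return ranges


def in_ranges(resnum, ranges):
    """Return True if the residue number falls inside any of the ranges."""
    return any(lo <= resnum <= hi for lo, hi in ranges)


# Example specification with two segments (made-up residue numbers)
selection = parse_ranges("12:20=45:52")
```

With this selection, residue 15 would be compared during clustering and residue 30 would not.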

The centroid output options are not supported in the ensemble clustering script.

Options

-help 
usage information
-jclust 
hierarchical clustering
-kclust 
K-means clustering
-maxnum value 
maximum number of clusters for hierarchical clustering
-minsize value 
minimum cluster size to generate subclusters in hierarchical clustering
-maxlevel value 
maximum levels for hierarchical clustering
-radius value 
define cluster radius for K-means clustering
-[no]iterate 
(do not) iterate during K-means clustering
-mode rmsd|contact|phi|psi|phipsi|mix 
measure for comparing structures during clustering
-contmaxdist value 
contact distance threshold if clustering based on contact map
-mixfactor value 
weight factor if clustering both on RMSD and contact map
-l min:max[=min:max ...] 
compare only specified residue range when clustering
-selmode ca|cb|cab|heavy|all 
atoms used for comparing structures during clustering
-[no]lsqfit 
(do not) superimpose structures before comparing
-dir workdir 
data directory
-opt file[:file] 
provide file with clustering options

Examples

enscluster.pl -maxnum 3 -minsize 10 -dir data sample
performs hierarchical clustering of the ensemble structures associated with the tag sample. The maximum number of clusters at each level is set to 3, and subclusters are clustered again recursively if they have 10 or more elements.
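
The way -maxnum, -minsize and -maxlevel interact can be sketched in Python. The code below is only an illustration of the recursion rule described above, under stated assumptions: it clusters plain numbers rather than structures, the splitter split_1d (cut at the largest gaps) is a hypothetical stand-in for the real clustering step, and the top-level set is assumed to be split regardless of its size.

```python
def split_1d(values, maxnum):
    """Hypothetical splitter: sort the values and cut at the largest gaps,
    producing at most maxnum groups. Stands in for the real clustering step."""
    ordered = sorted(values)
    if len(ordered) < 2 or maxnum < 2:
        return [ordered]
    # Positions between neighbours, ordered by gap size (largest first)
    gaps = sorted(range(1, len(ordered)),
                  key=lambda i: ordered[i] - ordered[i - 1],
                  reverse=True)
    cuts = sorted(gaps[:maxnum - 1])
    groups, start = [], 0
    for cut in cuts:
        groups.append(ordered[start:cut])
        start = cut
    groups.append(ordered[start:])
    return groups


def cluster_tree(values, maxnum, minsize, maxlevel, level=0):
    """Build a nested cluster tree following the maxnum/minsize/maxlevel rule."""
    node = {"level": level, "members": sorted(values), "children": []}
    # Below the top level, only groups with at least minsize members are
    # split again, and never deeper than maxlevel levels.
    if level >= maxlevel or (level > 0 and len(values) < minsize):
        return node
    groups = split_1d(values, maxnum)
    if len(groups) > 1:
        node["children"] = [
            cluster_tree(g, maxnum, minsize, maxlevel, level + 1)
            for g in groups
        ]
    return node


# Toy data: three well-separated groups of sizes 3, 2 and 1
tree = cluster_tree([10, 11, 12, 50, 51, 90], maxnum=3, minsize=3, maxlevel=2)
```

In this toy run only the first top-level cluster has at least minsize members, so it is the only one that receives subclusters.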


enscluster.pl -kclust -radius 5 -dir data sample
performs K-means clustering with a cluster radius of 5 Å of the ensemble structures associated with the tag sample.
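
One plausible reading of how a cluster radius and the -[no]iterate switch could work together is sketched below in Python. This is an assumption for illustration, not taken from the MMTSB source: the function radius_kmeans is hypothetical, it operates on plain numbers with absolute difference as the distance, and the actual distance measure in enscluster.pl depends on -mode.

```python
def radius_kmeans(values, radius, iterate=True, max_rounds=20):
    """Fixed-radius K-means-style clustering of numbers (illustrative only).

    A value joins the nearest centroid if it lies within `radius`;
    otherwise it seeds a new cluster. With iterate=True, centroids are
    recomputed and the assignment is repeated until it stops changing.
    """
    centroids = []
    clusters = []
    for _ in range(max_rounds if iterate else 1):
        clusters = [[] for _ in centroids]
        for v in values:
            # Index of the nearest current centroid, or None if there is none yet
            best = min(range(len(centroids)),
                       key=lambda i: abs(v - centroids[i]),
                       default=None)
            if best is None or abs(v - centroids[best]) > radius:
                centroids.append(v)
                clusters.append([v])
            else:
                clusters[best].append(v)
        # Drop clusters that lost all members, then recompute centroids
        clusters = [c for c in clusters if c]
        new_centroids = [sum(c) / len(c) for c in clusters]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters


# Toy data with a radius of 5 (arbitrary units)
result = radius_kmeans([1, 2, 3, 20, 21, 40], radius=5)
```

In this sketch the number of clusters is not fixed in advance; it follows from the radius, which is why a smaller radius yields more, tighter clusters.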