Difference between revisions of "bestcluster.pl"
Revision as of 12:56, 30 July 2009
Usage
usage:   bestcluster.pl [options] tag
options: [-dir datadir] [-level num] [-ctag tag]
         [-prop tag[+tag...]] [-size]
         [-crit avg|avglow|avgcent|best|best#|median]
         [-limstd multiple]
         [-lowest] [-xlowest tags]
Description
This script scores ensemble clusters previously generated with enscluster.pl.
As with the other utilities for ensembles, a tag is required to identify
the structure set, and the ensemble directory may be given with
-dir.
By default the final clusters at the bottom of the cluster
hierarchy are scored. Alternatively, clusters at a specific level
may be scored instead by specifying the level through the
-level option.
The total energy (etot) is used for scoring as the default
property. A different property may be chosen with -prop.
A number of different methods are available to obtain a single
score for each cluster from the property values of the cluster members.
The default method is to calculate a simple average for all members.
Other methods are selected with the -crit option followed
by the corresponding keyword. With avglow and avgcent,
cluster members whose property value lies too far from the mean, measured
in multiples of the standard deviation, are ignored in calculating the
average property value. With avglow only values on the high end of the
distribution are ignored; with avgcent extreme values on both sides of the
distribution are omitted from the calculation of the average. avglow
is particularly useful with energy values if a small number of erroneously
high energies occur due to structural distortions that are not resolved
during minimization. If avglow or avgcent is selected,
the option -limstd may be used to change the limit in multiples of
the standard deviation for excluding cluster members.
With best the best value is used as the score; with best<num>
(written best# in the option list, for example best2) the score is the
average over the best num structures. Finally,
with median the median value is taken as the cluster score.
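The scoring criteria described above can be sketched in a few lines of Python. This is not the code used by bestcluster.pl, only an illustration of the idea. The function name cluster_score is made up for this sketch, "best" is taken to mean the lowest value (as for energies), the standard deviation is the sample standard deviation computed once over all members, and the limstd default of 1.0 is a placeholder because the script's actual default is not stated here.

```python
import math
import statistics


def cluster_score(values, crit="avg", limstd=1.0):
    """Return (score, n_used) for the property values of one cluster.

    Hypothetical helper. Assumptions: lower values are better, and
    limstd=1.0 is a placeholder, not the script's documented default.
    """
    values = sorted(values)
    if crit == "avg":
        used = values
    elif crit in ("avglow", "avgcent"):
        mean = statistics.mean(values)
        # Assumption: sample standard deviation, single pass over all members.
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        upper = mean + limstd * std
        # avglow drops only high values; avgcent drops both tails.
        lower = mean - limstd * std if crit == "avgcent" else -math.inf
        used = [v for v in values if lower <= v <= upper]
    elif crit == "median":
        return statistics.median(values), len(values)
    elif crit.startswith("best"):
        # "best" alone means the single best value; "best3" the best three.
        n = int(crit[4:] or 1)
        used = values[:n]
    else:
        raise ValueError(f"unknown criterion: {crit}")
    if not used:
        raise ValueError("all cluster members were excluded")
    return statistics.mean(used), len(used)


energies = [-1900.0, -1890.0, -1880.0, -1870.0, -1500.0]
print(cluster_score(energies, "avg"))      # (-1808.0, 5)
print(cluster_score(energies, "avglow"))   # (-1885.0, 4)
print(cluster_score(energies, "best2"))    # (-1895.0, 2)
```

In this made-up data set the single value of -1500.0 plays the role of an erroneously high energy: it pulls the plain average up, while avglow drops it and averages the remaining four members.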
In the output the clusters are sorted according to the score (or according to their size if -size is given). For each cluster the output lists, after the cluster name, the total number of members, the number of members used in calculating the score, the score itself and, if applicable, the standard deviation and the statistical error of the score. The statistical error is derived from the standard deviation and the number of values used in calculating the score. If the option -lowest is given, additional fields contain the energy and filename of the lowest energy conformation for each cluster.
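The numbers in the Examples section are consistent with the statistical error being the standard deviation divided by the square root of the number of values used. The following sketch checks this against values from the example output. The function name statistical_error is made up for this illustration, and the formula is inferred from the printed numbers rather than taken from the script.

```python
import math


def statistical_error(std, n_used):
    """Standard error of the mean: std / sqrt(n_used) (inferred formula)."""
    return std / math.sqrt(n_used)


# Standard deviation and number of values used, taken from the examples.
print(round(statistical_error(44.8227, 5), 4))   # 20.0453
print(round(statistical_error(16.0680, 3), 4))   # 9.2769
```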
Options
- -help
- usage information
- -dir directory
- data directory
- -level num
- specify clustering level for hierarchical clusters
- -ctag alttag
- read clustering data from alttag.clusters
- -prop tag[+tag...]
- specify which properties to use for sorting clusters
- -size
- sort clusters by size
- -crit avg|avglow|avgcent|best|best#|median
- criteria for ranking clusters (average, best score etc.)
- -limstd multiple
- provide cutoff in terms of multiples of standard deviation for excluding data when averaging
- -lowest
- show energy and filename of the lowest energy conformation for each cluster
- -xlowest tags
- show additional properties for lowest scoring structure
Examples
bestcluster.pl -dir data vacmin
scores final clusters for the vacmin ensemble
structures according to the average total energy
t.2.1    5   5   -1870.1520    44.8227   20.0453
t.2.2    5   5   -1769.0720    88.6326   39.6377
t.1.1    5   5   -1741.1520    49.8687   22.3020
t.1.2    5   5   -1654.2400   152.9225   68.3890
bestcluster.pl -dir data -level 1 -crit best vacmin
scores first level clusters for the vacmin ensemble
structures according to the best total energy
t.2     10   1   -1916.5800     0.0000    0.0000
t.1     10   1   -1791.0300     0.0000    0.0000
bestcluster.pl -dir data -crit avglow -limstd 1.2 vacmin
scores final clusters for the vacmin ensemble structures according to the
average total energy, but excluding structures whose energy lies more than
1.2 standard deviations above the average.
t.2.1    5   3   -1898.0267    16.0680    9.2769
t.1.1    5   3   -1775.0900     7.3533    4.2454
t.2.2    5   3   -1707.0833    32.4968   18.7620
t.1.2    5   3   -1694.6000    41.5083   23.9648
bestcluster.pl -prop rgyr -crit best2 -dir data vacmin
scores final clusters for the vacmin ensemble
structures according to the average radius of gyration of
the two best structures (also with respect to radius of gyration).
t.1.2    5   2       8.9657     0.1587    0.1122
t.2.2    5   2       9.1108     0.0507    0.0359
t.2.1    5   2       9.2168     0.3081    0.2178
t.1.1    5   2       9.2837     0.0013    0.0009
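For further processing, output like the examples above can be read into records with a short Python sketch. It assumes one cluster per line with whitespace-separated fields in the order cluster name, total members, members used, score, standard deviation and statistical error. The names ClusterRow and parse_rows are made up for this illustration, and any extra fields (such as those added by -lowest) are ignored.

```python
from typing import NamedTuple


class ClusterRow(NamedTuple):
    tag: str        # cluster name, e.g. "t.2.1"
    members: int    # total number of members
    used: int       # members used in calculating the score
    score: float
    std: float      # standard deviation
    error: float    # statistical error


def parse_rows(text):
    """Parse bestcluster.pl-style output; lines with fewer than six fields are skipped."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        if len(f) < 6:
            continue
        rows.append(ClusterRow(f[0], int(f[1]), int(f[2]),
                               float(f[3]), float(f[4]), float(f[5])))
    return rows


output = """t.2 10 1 -1916.5800 0.0000 0.0000
t.1 10 1 -1791.0300 0.0000 0.0000
"""
for row in parse_rows(output):
    print(row.tag, row.score)
```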