bestcluster.pl
Revision as of 14:33, 12 August 2006
usage: bestcluster.pl [options] tag
options: [-dir datadir] [-level num] [-ctag tag]
         [-prop tag[+tag...]] [-size]
         [-crit avg|avglow|avgcent|best|best#|median]
         [-limstd multiple]
         [-lowest] [-xlowest tags]
This script scores ensemble clusters previously generated with
<docmark>enscluster.pl</docmark>. As with the other utilities
for ensembles, a tag is required for identifying
the structure set, and the ensemble directory may be given with -dir.
By default the final clusters at the bottom of the cluster hierarchy are scored. Alternatively, clusters at a specific level may be scored instead by specifying the level through the -level option.
The total energy (etot) is the default property used for scoring. A different property may be chosen with -prop. Several methods are available to reduce the property values of the cluster members to a single score for each cluster. The default method is a simple average over all members. Other methods are selected with the -crit option followed by the corresponding keyword.
With avglow and avgcent, cluster members whose property value lies too far from the mean, measured in multiples of the standard deviation, are ignored when calculating the average. With avglow, only values on the high end of the distribution are ignored; with avgcent, extreme values on both sides of the distribution are omitted. avglow is particularly useful with energy values if a small number of erroneously high energies occur due to structural distortions that are not resolved during minimization. If avglow or avgcent is selected, the option -limstd may be used to change the exclusion limit, given in multiples of the standard deviation.
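The filtering behind avglow and avgcent can be illustrated with a minimal Python sketch. It is not taken from bestcluster.pl: the function name filtered_average is hypothetical, and the sketch assumes a single filtering pass, the population standard deviation, and an explicitly passed limit (the script's actual default for -limstd is not stated here).

```python
import statistics


def filtered_average(values, crit, limstd):
    """Return (score, members_used) for avg, avglow, or avgcent.

    Assumptions (not confirmed by the script's source): one filtering
    pass, population standard deviation, limit = limstd * std.
    """
    mean = statistics.fmean(values)
    if crit == "avg" or len(values) < 2:
        return mean, len(values)

    cutoff = limstd * statistics.pstdev(values)
    if crit == "avglow":
        # Drop only values on the high end of the distribution.
        kept = [v for v in values if v - mean <= cutoff]
    elif crit == "avgcent":
        # Drop extreme values on both sides of the distribution.
        kept = [v for v in values if abs(v - mean) <= cutoff]
    else:
        raise ValueError(f"unsupported criterion: {crit}")

    if not kept:
        # Guard for very small limits: fall back to the plain average.
        kept = list(values)
    return statistics.fmean(kept), len(kept)


if __name__ == "__main__":
    energies = [0.0, 0.0, 0.0, 0.0, 100.0, -100.0]
    for crit in ("avg", "avglow", "avgcent"):
        print(crit, filtered_average(energies, crit, limstd=1.0))
```

In this made-up data set, avglow discards only the high outlier, while avgcent discards both outliers, which mirrors the difference between the two keywords described above.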
With best, the best value is used as the score; with best<num> (for example best2), the score is the average over the best num structures. Finally, with median, the median value is taken as the cluster score.
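The best, best<num>, and median criteria can be sketched in Python as follows. The function name cluster_score is hypothetical, and the sketch assumes that "best" means the lowest value, which matches the energy and radius of gyration examples on this page but is not confirmed for every property.

```python
import re
import statistics


def cluster_score(values, crit):
    """Score a cluster with 'best', 'best<num>', or 'median'.

    Assumption: lower property values are better.
    """
    ordered = sorted(values)
    if crit == "median":
        return statistics.median(ordered)

    match = re.fullmatch(r"best(\d*)", crit)
    if match is None:
        raise ValueError(f"unsupported criterion: {crit}")

    # Plain 'best' is treated as 'best1'.
    count = int(match.group(1)) if match.group(1) else 1
    if count < 1:
        raise ValueError("best<num> requires num >= 1")

    # Average over the 'count' lowest values (fewer if the cluster is small).
    return statistics.fmean(ordered[:count])


if __name__ == "__main__":
    rgyr = [9.4, 8.9, 9.1, 9.0, 9.3]
    for crit in ("best", "best2", "median"):
        print(crit, cluster_score(rgyr, crit))
```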
In the output, the clusters are sorted by score (or by size if -size is given). Each output line lists the cluster tag, the total number of members, the number of members used in calculating the score, the score itself and, if applicable, the standard deviation and the statistical error of the score, which is based on the standard deviation and the number of values used in calculating the score. If the option -lowest is given, additional fields contain the energy and filename of the lowest-energy conformation for each cluster.
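For post-processing, an output line can be split into its fields as in the Python sketch below. The names ClusterRow, parse_row, and standard_error are hypothetical. The formula stddev / sqrt(n) is an assumption; it is consistent with the example rows shown on this page, but the script's source is not quoted here.

```python
import math
from typing import NamedTuple


class ClusterRow(NamedTuple):
    tag: str        # cluster tag, e.g. "t.2.1"
    total: int      # total number of members
    used: int       # members used in calculating the score
    score: float
    stddev: float
    stderr: float   # statistical error of the score


def parse_row(line):
    """Parse the first six whitespace-separated fields of an output line.

    Extra fields (such as those added by -lowest) are ignored here.
    """
    f = line.split()
    return ClusterRow(f[0], int(f[1]), int(f[2]),
                      float(f[3]), float(f[4]), float(f[5]))


def standard_error(stddev, n):
    """Assumed relation: statistical error = stddev / sqrt(n)."""
    return stddev / math.sqrt(n)


if __name__ == "__main__":
    row = parse_row("t.2.1 5 5 -1870.1520 44.8227 20.0453")
    print(row)
    print(round(standard_error(row.stddev, row.used), 4))
```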
- usage information
bestcluster.pl -dir data vacmin
scores final clusters for the vacmin ensemble structures according to the average total energy
t.2.1 5 5 -1870.1520 44.8227 20.0453
t.2.2 5 5 -1769.0720 88.6326 39.6377
t.1.1 5 5 -1741.1520 49.8687 22.3020
t.1.2 5 5 -1654.2400 152.9225 68.3890
bestcluster.pl -dir data -level 1 -crit best vacmin
scores first level clusters for the vacmin ensemble structures according to the best total energy
t.2 10 1 -1916.5800 0.0000 0.0000
t.1 10 1 -1791.0300 0.0000 0.0000
bestcluster.pl -dir data -crit avglow -limstd 1.2 vacmin
scores final clusters for the vacmin ensemble structures according to the average total energy but excluding structures with energies beyond 1.2 times the standard deviation from the average.
t.2.1 5 3 -1898.0267 16.0680 9.2769
t.1.1 5 3 -1775.0900 7.3533 4.2454
t.2.2 5 3 -1707.0833 32.4968 18.7620
t.1.2 5 3 -1694.6000 41.5083 23.9648
bestcluster.pl -prop rgyr -crit best2 -dir data vacmin
scores final clusters for the vacmin ensemble structures according to the average radius of gyration of the two best structures in each cluster, where "best" is also judged by radius of gyration.
t.1.2 5 2 8.9657 0.1587 0.1122
t.2.2 5 2 9.1108 0.0507 0.0359
t.2.1 5 2 9.2168 0.3081 0.2178
t.1.1 5 2 9.2837 0.0013 0.0009