![]() |
ROCPLOT documentation |
____________________________________________________ | SINGLE HITS FILE | MULTIPLE HITS FILES | | | | | | | Do not combine | Combine | | | data | data | _____________________|___________________|________________|_____________| | | | | ROC curves / value | Single | Multiple (1) | Single | Bar chart | - | Yes | - | Classification plot | Single | Multiple | Single | Summary file | Yes | Yes | Yes | _____________________|___________________|________________|_____________| |
_classplot_dat0 Data file for classification plot _classplot_dat1 Data file for classification plot _classplot_dat2 Data file for classification plot _classplot_dat3 Data file for classification plot _classplot_dat4 Data file for classification plot _classplot Driver file for classification plot _rocplot_dat0 Data file for roc plot. _rocplot Driver file for roc plot. _summary Summary file. |
_classplot0_dat0 Data file for first classification plot _classplot0_dat1 "" _classplot0_dat2 "" _classplot0_dat3 "" _classplot0_dat4 "" _classplot0 Driver file for first classification plot _classplot1_dat0 Data file for second classification plot _classplot1_dat1 "" _classplot1_dat3 "" _classplot1_dat4 "" _classplot1 Driver file for second classification plot _rocplot_dat0 Data file for roc plot. _rocplot_dat1 "" _rocplot Driver file for roc plot. _summary Summary file. |
_barchart_dat Data file for bar chart. _barchart Driver file for bar chart. |
# GNUPLOT driver file for roc plot set title "ROC plots for data1.hits & data2.hits (combined - " set xlabel "1 - SPEC" set ylabel "SENS" set nokey set noautoscale set xrange [0:1] set yrange [0:1] set key top outside title "Data Series" box 3 set data style points set pointsize 0.45 plot "_rocplot_dat0" smooth bezier title "Combined dataset (0.185)" |
# GNUPLOT data file for rocplot, series 0 0.000 0.007 0.000 0.014 0.000 0.021 0.000 0.029 0.200 0.029 0.167 0.036 0.143 0.043 0.250 0.043 0.222 0.050 0.200 0.057 0.182 0.064 0.250 0.064 0.231 0.071 0.214 0.079 0.200 0.086 0.250 0.086 0.235 0.093 0.222 0.100 0.263 0.100 0.250 0.107 0.238 0.114 0.273 0.114 0.261 0.121 0.292 0.121 0.280 0.129 0.308 0.129 0.296 0.136 0.286 0.143 0.276 0.150 0.300 0.150 0.290 0.157 0.281 0.164 0.273 0.171 0.294 0.171 0.286 0.179 0.278 0.186 0.297 0.186 0.316 0.186 0.333 0.186 0.350 0.186 0.366 0.186 0.381 0.186 0.395 0.186 0.409 0.186 0.400 0.193 0.391 0.200 0.404 0.200 0.417 0.200 0.429 0.200 0.440 0.200 0.451 0.200 0.462 0.200 0.472 0.200 0.481 0.200 0.473 0.207 0.464 0.214 0.474 0.214 0.483 0.214 0.492 0.214 0.500 0.214 0.508 0.214 0.516 0.214 0.524 0.214 0.531 0.214 0.538 0.214 0.545 0.214 0.552 0.214 0.559 0.214 0.565 0.214 0.571 0.214 0.577 0.214 0.583 0.214 0.589 0.214 0.595 0.214 0.600 0.214 0.605 0.214 0.610 0.214 0.615 0.214 0.620 0.214 0.625 0.214 |
# GNUPLOT driver file for classification plot set title "Classification plot for data1.hits & data2.hits (c" set xlabel "Number of hits detected" set ylabel "Proportion of hits detected that are of a certain type" set nokey set key top outside title "Data Series" box 3 set data style points set pointsize 0.45 plot "_classplot_dat0" smooth bezier title "True hits", "_classplot_dat1" smooth bezier title "Cross hits", "_classplot_dat2" smooth bezier title "Uncertain hits", "_classplot_dat3" smooth bezier title "Unknown hits", "_classplot_dat4" smooth bezier title "False hits" |
# GNUPLOT data file for True hits, series 0 1.000 1.000 2.000 1.000 3.000 1.000 4.000 1.000 5.000 0.800 6.000 0.833 7.000 0.857 8.000 0.750 9.000 0.778 10.000 0.800 11.000 0.818 12.000 0.750 13.000 0.769 14.000 0.786 15.000 0.800 16.000 0.750 17.000 0.765 18.000 0.778 19.000 0.737 20.000 0.750 21.000 0.762 22.000 0.727 23.000 0.739 24.000 0.708 25.000 0.720 26.000 0.692 27.000 0.704 28.000 0.714 29.000 0.724 30.000 0.700 31.000 0.710 32.000 0.719 33.000 0.727 34.000 0.706 35.000 0.714 36.000 0.722 37.000 0.703 38.000 0.684 39.000 0.667 40.000 0.650 41.000 0.634 42.000 0.619 43.000 0.605 44.000 0.591 45.000 0.600 46.000 0.609 47.000 0.596 48.000 0.583 49.000 0.571 50.000 0.560 51.000 0.549 52.000 0.538 53.000 0.528 54.000 0.519 55.000 0.527 56.000 0.536 57.000 0.526 58.000 0.517 59.000 0.508 60.000 0.500 61.000 0.492 62.000 0.484 63.000 0.476 64.000 0.469 65.000 0.462 66.000 0.455 67.000 0.448 68.000 0.441 69.000 0.435 70.000 0.429 71.000 0.423 72.000 0.417 73.000 0.411 74.000 0.405 75.000 0.400 76.000 0.395 77.000 0.390 78.000 0.385 79.000 0.380 80.000 0.375 |
# GNUPLOT data file for Cross hits, series 1 1.000 0.000 2.000 0.000 3.000 0.000 4.000 0.000 5.000 0.200 6.000 0.167 7.000 0.143 8.000 0.250 9.000 0.222 10.000 0.200 11.000 0.182 12.000 0.250 13.000 0.231 14.000 0.214 15.000 0.200 16.000 0.250 17.000 0.235 18.000 0.222 19.000 0.263 20.000 0.250 21.000 0.238 22.000 0.273 23.000 0.261 24.000 0.250 25.000 0.240 26.000 0.231 27.000 0.222 28.000 0.214 29.000 0.207 30.000 0.233 31.000 0.226 32.000 0.219 33.000 0.212 34.000 0.235 35.000 0.229 36.000 0.222 37.000 0.216 38.000 0.211 39.000 0.205 40.000 0.200 41.000 0.195 42.000 0.190 43.000 0.186 44.000 0.182 45.000 0.178 46.000 0.174 47.000 0.170 48.000 0.167 49.000 0.163 50.000 0.160 51.000 0.157 52.000 0.154 53.000 0.151 54.000 0.148 55.000 0.145 56.000 0.143 57.000 0.140 58.000 0.138 59.000 0.136 60.000 0.133 61.000 0.131 62.000 0.129 63.000 0.127 64.000 0.125 65.000 0.123 66.000 0.121 67.000 0.119 68.000 0.118 69.000 0.116 70.000 0.114 71.000 0.113 72.000 0.111 73.000 0.110 74.000 0.108 75.000 0.107 76.000 0.105 77.000 0.104 78.000 0.103 79.000 0.101 80.000 0.100 |
# GNUPLOT data file for Uncertain hits, series 2 1.000 0.000 2.000 0.000 3.000 0.000 4.000 0.000 5.000 0.000 6.000 0.000 7.000 0.000 8.000 0.000 9.000 0.000 10.000 0.000 11.000 0.000 12.000 0.000 13.000 0.000 14.000 0.000 15.000 0.000 16.000 0.000 17.000 0.000 18.000 0.000 19.000 0.000 20.000 0.000 21.000 0.000 22.000 0.000 23.000 0.000 24.000 0.000 25.000 0.000 26.000 0.038 27.000 0.037 28.000 0.036 29.000 0.034 30.000 0.033 31.000 0.032 32.000 0.031 33.000 0.030 34.000 0.029 35.000 0.029 36.000 0.028 37.000 0.054 38.000 0.053 39.000 0.051 40.000 0.050 41.000 0.049 42.000 0.048 43.000 0.047 44.000 0.045 45.000 0.044 46.000 0.043 47.000 0.064 48.000 0.062 49.000 0.061 50.000 0.060 51.000 0.059 52.000 0.058 53.000 0.057 54.000 0.056 55.000 0.055 56.000 0.054 57.000 0.070 58.000 0.069 59.000 0.068 60.000 0.067 61.000 0.066 62.000 0.065 63.000 0.063 64.000 0.062 65.000 0.062 66.000 0.061 67.000 0.060 68.000 0.059 69.000 0.058 70.000 0.057 71.000 0.056 72.000 0.056 73.000 0.055 74.000 0.054 75.000 0.053 76.000 0.053 77.000 0.052 78.000 0.051 79.000 0.051 80.000 0.050 |
# GNUPLOT data file for Unknown hits, series 3 1.000 0.000 2.000 0.000 3.000 0.000 4.000 0.000 5.000 0.000 6.000 0.000 7.000 0.000 8.000 0.000 9.000 0.000 10.000 0.000 11.000 0.000 12.000 0.000 13.000 0.000 14.000 0.000 15.000 0.000 16.000 0.000 17.000 0.000 18.000 0.000 19.000 0.000 20.000 0.000 21.000 0.000 22.000 0.000 23.000 0.000 24.000 0.000 25.000 0.000 26.000 0.000 27.000 0.000 28.000 0.000 29.000 0.000 30.000 0.000 31.000 0.000 32.000 0.000 33.000 0.000 34.000 0.000 35.000 0.000 36.000 0.000 37.000 0.000 38.000 0.026 39.000 0.026 40.000 0.025 41.000 0.049 42.000 0.048 43.000 0.047 44.000 0.068 45.000 0.067 46.000 0.065 47.000 0.064 48.000 0.083 49.000 0.082 50.000 0.080 51.000 0.098 52.000 0.096 53.000 0.094 54.000 0.111 55.000 0.109 56.000 0.107 57.000 0.105 58.000 0.121 59.000 0.119 60.000 0.117 61.000 0.131 62.000 0.129 63.000 0.127 64.000 0.125 65.000 0.123 66.000 0.121 67.000 0.119 68.000 0.118 69.000 0.116 70.000 0.114 71.000 0.113 72.000 0.111 73.000 0.110 74.000 0.108 75.000 0.107 76.000 0.105 77.000 0.104 78.000 0.103 79.000 0.101 80.000 0.100 |
# GNUPLOT data file for False hits, series 4 1.000 0.000 2.000 0.000 3.000 0.000 4.000 0.000 5.000 0.000 6.000 0.000 7.000 0.000 8.000 0.000 9.000 0.000 10.000 0.000 11.000 0.000 12.000 0.000 13.000 0.000 14.000 0.000 15.000 0.000 16.000 0.000 17.000 0.000 18.000 0.000 19.000 0.000 20.000 0.000 21.000 0.000 22.000 0.000 23.000 0.000 24.000 0.042 25.000 0.040 26.000 0.038 27.000 0.037 28.000 0.036 29.000 0.034 30.000 0.033 31.000 0.032 32.000 0.031 33.000 0.030 34.000 0.029 35.000 0.029 36.000 0.028 37.000 0.027 38.000 0.026 39.000 0.051 40.000 0.075 41.000 0.073 42.000 0.095 43.000 0.116 44.000 0.114 45.000 0.111 46.000 0.109 47.000 0.106 48.000 0.104 49.000 0.122 50.000 0.140 51.000 0.137 52.000 0.154 53.000 0.170 54.000 0.167 55.000 0.164 56.000 0.161 57.000 0.158 58.000 0.155 59.000 0.169 60.000 0.183 61.000 0.180 62.000 0.194 63.000 0.206 64.000 0.219 65.000 0.231 66.000 0.242 67.000 0.254 68.000 0.265 69.000 0.275 70.000 0.286 71.000 0.296 72.000 0.306 73.000 0.315 74.000 0.324 75.000 0.333 76.000 0.342 77.000 0.351 78.000 0.359 79.000 0.367 80.000 0.375 |
rocplot summary file (15 Jul 2013) mode == 2 (Multiple input file mode) multimode == 2 (Combine data: single ROC plot, single classification plot.) datamode == 1 (Single list of known true relatives.) File Known data1.hits 140 data2.hits 140 ROC50 == 0.185 (combined) |
MODE INFO modei: 2 multimodei: 2 datamodei: 1 NUMBER OF INPUT FILES numfiles: 2 NAMES ONLY OF INPUT FILES hitsnames[0]: data1.hits hitsnames[1]: data2.hits ROC NUMBER roc: 50 ROC VALUES rocn[0]: 0.184714 COUNT OF HITS hitcnt[0]: 80 |
Perform ROC analysis on hits files Version: EMBOSS:6.6.0.0 Standard (Mandatory) qualifiers (* if not always prompted): [-hitsfilespath] dirlist [rocplot] This option specifies the directory of hits files (input). A 'hits file' contains a list of hits (e.g. from a prediction method) that are classified and rank-ordered on the basis of score, p-value, E-value etc. The files generated by using SIGSCAN and LIBSCAN will contain the results of a search of a discriminating element (e.g. hidden Markov model, profile or signature) against a sequence database. The ROCPLOT application is run on the files to perform Receiver Operator Characteristic (ROC) analysis on the hits. -mode menu [1] This option specifies the mode of ROCPLOT operation (main mode). In 'single input file mode', ROC analysis is performed on the individual hits file; a ROC plot containing a single ROC curve, and a single ROC value and classification plot are generated. In 'multiple input file mode' there are two sub-modes depending upon whether (1) ROC analysis is to performed separately for the individual input files or (2) the lists of hits in the hits files are combined and ROC analysis is performed on the whole (see the ACD option called 'multimode' for more information). If the input file does not contain at least as many 'FALSE' hits as are specified after the 'ROC' token in the input file, then an error will be generated and rocplot will terminate. Where multiple input files are given as input, each must contain the same value after the 'ROC' token, or an error will be generated and rocplot will terminate. The hits in the hits files *must* have been rank-ordered on the basis of score, p-value, E-value etc, with the highest scoring / most significant hit being given in the highest rank (1); i.e. on the second line of the file. Other hits should then be given in order of decreasing score / significance. (Values: 1 (Single input file mode); 2 (Multiple input file mode)) * -multimode menu [1] This option specifies the mode of ROCPLOT operation (multimode). In 'Do not combine data' mode, ROC analysis is performed separately for the individual input files. Multiple ROC curves will be given on the same ROC plot and a ROC value and a classification plot will be generated for each input file. A bar chart giving the distribution of ROCn values, and the mean and standard deviation of ROCn values are also generated. In 'Combine data' mode, the lists of hits in the hits files are combined and ROC analysis is performed on the whole. A single ROC curve will be given in the ROC plot and a single ROC value and classification plot will be generated. In this second mode there are two further sub-modes depending on whether there is (1) a single list of known true relatives for the different searches or (2) there is a different list of known true relatives for each different search (see the ACD option called 'datamode' for more information) (Values: 1 (Do not combine data (multiple ROC curves in single ROC plot - multiple classification plots)); 2 (Combine data (single ROC curve - single classification plot))) * -datamode menu [1] This option specifies the mode of ROCPLOT operation (datamode). This determine how the ROC number and value are calculated in cases where there are multiple input files (lists of hits) and the user has specified the data are to be combined. See rocplot.c for more information. (Values: 1 (Single list of known true relatives); 2 (Multiple lists of known true relatives)) * -thresh integer [10] This option specifies the overlap threshold for hits. In cases where the lists of hits are to be combined and there is a single set of relatives, the accession number (or other database identifier code) of the hit, and the start and end point respectively of the hit relative to full length sequence must be provided in the lists of hits (see 'Input file format' below). rocplot ensures that only unique hits are counted when calculating SENS and SPEC; two hits are 'unique' if they have (i) different accesssion numbers or (ii) the same accession numbers but which do not overlap by any more than a user-defined number of residues. The overlap is determined from the start and end points of the hit. For example two hits both with the same accession numbers and with the start and end points of 1-100 and 91 - 190 respectively are considered to be the same hit if the overlap threshold is 10 or less. (Any integer value) [-outdir] outdir [./] This option specifies the directory where output files are written. [-rocbasename] string [_rocplot] This option specifies the base name of ROC plot file(s) (output). A file of meta data that contains graphs that illustrate the diagnostic performance of the discriminator. rocplot generates Receiver Operating Characteristic (ROC) curves, that display graphically the sensitivity and specificity of discriminating elements, and accompanying ROC value(s), which are a convenient numerical measure of the sensitivity and specificity of a method. Classification plots, which are a valuable aid in interpreting the ROC plot and value, are also generated and, depending upon the mode rocplot is run in, a plot of the distribution of ROC values. (Any string) -outfile outfile [_summary] This option specifies the name of the summary file (output). A text file summarising the analysis. * -barbasename string [_barchart] This option specifies the base name of bar chart for ROC value distribution (output). A bar chart giving the distribution of ROCn values will be generated when multiple input files (lists of hits) are provided and the user has specified 'Do not combine data (multiple ROC curves). (Any string) -classbasename string [_classplot] This option specifies the base name of classification plot file(s) (output). Classification plots are a valuable aid in interpreting the ROC plot and value. A single plot will be generated where a single input file is provided or where multiple input files are provided and the user has specified 'Combine data (single ROC curve)' mode. Multiple plots will be generated where multiple input files are provided and the user has specified 'Do not combine data (multiple ROC curves)' mode. (Any string) Additional (Optional) qualifiers: (none) Advanced (Unprompted) qualifiers: -norange boolean [N] This option specifies whether to disregard range data when identifying unique hits. If set, the range data specified in the hits files are disregarded, two hits are classed as unique if they have different accession numbers (no requirement for overlapping ranges). -logfile outfile [rocplot.log] Domainatrix log output file Associated qualifiers: "-hitsfilespath" associated qualifiers -extension1 string Default file extension "-outdir" associated qualifiers -extension2 string Default file extension "-outfile" associated qualifiers -odirectory string Output directory "-logfile" associated qualifiers -odirectory string Output directory General qualifiers: -auto boolean Turn off prompts -stdout boolean Write first file to standard output -filter boolean Read first file from standard input, write first file to standard output -options boolean Prompt for standard and additional values -debug boolean Write debug output to program.dbg -verbose boolean Report some/full command line options -help boolean Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose -warning boolean Report warnings -error boolean Report errors -fatal boolean Report fatal errors -die boolean Report dying program messages -version boolean Report version number and exit
Qualifier | Type | Description | Allowed values | Default | ||||
---|---|---|---|---|---|---|---|---|
Standard (Mandatory) qualifiers | ||||||||
[-hitsfilespath] (Parameter 1) |
dirlist | This option specifies the directory of hits files (input). A 'hits file' contains a list of hits (e.g. from a prediction method) that are classified and rank-ordered on the basis of score, p-value, E-value etc. The files generated by using SIGSCAN and LIBSCAN will contain the results of a search of a discriminating element (e.g. hidden Markov model, profile or signature) against a sequence database. The ROCPLOT application is run on the files to perform Receiver Operator Characteristic (ROC) analysis on the hits. | Directory with files | rocplot | ||||
-mode | list | This option specifies the mode of ROCPLOT operation (main mode). In 'single input file mode', ROC analysis is performed on the individual hits file; a ROC plot containing a single ROC curve, and a single ROC value and classification plot are generated. In 'multiple input file mode' there are two sub-modes depending upon whether (1) ROC analysis is to performed separately for the individual input files or (2) the lists of hits in the hits files are combined and ROC analysis is performed on the whole (see the ACD option called 'multimode' for more information). If the input file does not contain at least as many 'FALSE' hits as are specified after the 'ROC' token in the input file, then an error will be generated and rocplot will terminate. Where multiple input files are given as input, each must contain the same value after the 'ROC' token, or an error will be generated and rocplot will terminate. The hits in the hits files *must* have been rank-ordered on the basis of score, p-value, E-value etc, with the highest scoring / most significant hit being given in the highest rank (1); i.e. on the second line of the file. Other hits should then be given in order of decreasing score / significance. |
|
1 | ||||
-multimode | list | This option specifies the mode of ROCPLOT operation (multimode). In 'Do not combine data' mode, ROC analysis is performed separately for the individual input files. Multiple ROC curves will be given on the same ROC plot and a ROC value and a classification plot will be generated for each input file. A bar chart giving the distribution of ROCn values, and the mean and standard deviation of ROCn values are also generated. In 'Combine data' mode, the lists of hits in the hits files are combined and ROC analysis is performed on the whole. A single ROC curve will be given in the ROC plot and a single ROC value and classification plot will be generated. In this second mode there are two further sub-modes depending on whether there is (1) a single list of known true relatives for the different searches or (2) there is a different list of known true relatives for each different search (see the ACD option called 'datamode' for more information) |
|
1 | ||||
-datamode | list | This option specifies the mode of ROCPLOT operation (datamode). This determine how the ROC number and value are calculated in cases where there are multiple input files (lists of hits) and the user has specified the data are to be combined. See rocplot.c for more information. |
|
1 | ||||
-thresh | integer | This option specifies the overlap threshold for hits. In cases where the lists of hits are to be combined and there is a single set of relatives, the accession number (or other database identifier code) of the hit, and the start and end point respectively of the hit relative to full length sequence must be provided in the lists of hits (see 'Input file format' below). rocplot ensures that only unique hits are counted when calculating SENS and SPEC; two hits are 'unique' if they have (i) different accesssion numbers or (ii) the same accession numbers but which do not overlap by any more than a user-defined number of residues. The overlap is determined from the start and end points of the hit. For example two hits both with the same accession numbers and with the start and end points of 1-100 and 91 - 190 respectively are considered to be the same hit if the overlap threshold is 10 or less. | Any integer value | 10 | ||||
[-outdir] (Parameter 2) |
outdir | This option specifies the directory where output files are written. | Output directory | ./ | ||||
[-rocbasename] (Parameter 3) |
string | This option specifies the base name of ROC plot file(s) (output). A file of meta data that contains graphs that illustrate the diagnostic performance of the discriminator. rocplot generates Receiver Operating Characteristic (ROC) curves, that display graphically the sensitivity and specificity of discriminating elements, and accompanying ROC value(s), which are a convenient numerical measure of the sensitivity and specificity of a method. Classification plots, which are a valuable aid in interpreting the ROC plot and value, are also generated and, depending upon the mode rocplot is run in, a plot of the distribution of ROC values. | Any string | _rocplot | ||||
-outfile | outfile | This option specifies the name of the summary file (output). A text file summarising the analysis. | Output file | _summary | ||||
-barbasename | string | This option specifies the base name of bar chart for ROC value distribution (output). A bar chart giving the distribution of ROCn values will be generated when multiple input files (lists of hits) are provided and the user has specified 'Do not combine data (multiple ROC curves). | Any string | _barchart | ||||
-classbasename | string | This option specifies the base name of classification plot file(s) (output). Classification plots are a valuable aid in interpreting the ROC plot and value. A single plot will be generated where a single input file is provided or where multiple input files are provided and the user has specified 'Combine data (single ROC curve)' mode. Multiple plots will be generated where multiple input files are provided and the user has specified 'Do not combine data (multiple ROC curves)' mode. | Any string | _classplot | ||||
Additional (Optional) qualifiers | ||||||||
(none) | ||||||||
Advanced (Unprompted) qualifiers | ||||||||
-norange | boolean | This option specifies whether to disregard range data when identifying unique hits. If set, the range data specified in the hits files are disregarded, two hits are classed as unique if they have different accession numbers (no requirement for overlapping ranges). | Boolean value Yes/No | No | ||||
-logfile | outfile | Domainatrix log output file | Output file | rocplot.log | ||||
Associated qualifiers | ||||||||
"-hitsfilespath" associated dirlist qualifiers | ||||||||
-extension1 -extension_hitsfilespath |
string | Default file extension | Any string | |||||
"-outdir" associated outdir qualifiers | ||||||||
-extension2 -extension_outdir |
string | Default file extension | Any string | |||||
"-outfile" associated outfile qualifiers | ||||||||
-odirectory | string | Output directory | Any string | |||||
"-logfile" associated outfile qualifiers | ||||||||
-odirectory | string | Output directory | Any string | |||||
General qualifiers | ||||||||
-auto | boolean | Turn off prompts | Boolean value Yes/No | N | ||||
-stdout | boolean | Write first file to standard output | Boolean value Yes/No | N | ||||
-filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N | ||||
-options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N | ||||
-debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N | ||||
-verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y | ||||
-help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N | ||||
-warning | boolean | Report warnings | Boolean value Yes/No | Y | ||||
-error | boolean | Report errors | Boolean value Yes/No | Y | ||||
-fatal | boolean | Report fatal errors | Boolean value Yes/No | Y | ||||
-die | boolean | Report dying program messages | Boolean value Yes/No | Y | ||||
-version | boolean | Report version number and exit | Boolean value Yes/No | N |
% rocplot Perform ROC analysis on hits files Hits directories [rocplot]: rocplot/hitsin Available modes 1 : Single input file mode 2 : Multiple input file mode Select mode of operation. [1]: 2 Available modes 1 : Do not combine data (multiple ROC curves in single ROC plot - multiple classification plots) 2 : Combine data (single ROC curve - single classification plot) Select mode of operation. [1]: 2 Available modes 1 : Single list of known true relatives 2 : Multiple lists of known true relatives Select mode of operation. [1]: 1 Overlap threshold for hits. [10]: General output file output directory [./]: Base name of ROC plot file(s) (output). [_rocplot]: Rocplot summary output file [_summary]: Base name of classification plot file(s) (output). [_classplot]: /homes/user/test/data/structure/rocplot/hitsin/data1.hits /homes/user/test/data/structure/rocplot/hitsin/data2.hits Processing data1.hits Processing data2.hits Please wait ... done! |
Go to the output files for this example
FILE TYPE | FORMAT | DESCRIPTION | CREATED BY | SEE ALSO |
Hits file | Text file of classified hits | A list of hits (e.g. from a prediction method) that are classified and rank-ordered on the basis of score, p-value, E-value etc. | ROCON and LIBSCAN (hits from searches of a discriminating element (hidden Markov model, profile or signature) against a sequence database). | ROCPLOT is run on the files to perform Receiver Operator Characteristic (ROC) analysis on the hits. |
From the gold standard | | | | Related | Unrelated| _______|__________|__________|_______ | | | S r (+ve) | TP | FP | P (=TP+FP) e e hits | | | a s _______|_ ________|__________|_______ r u | | | c l (-ve) | FN | TN | N (=FN+TN) h t misses | | | _______|__________|__________|_______ | | | | R | U | | (=TP+FN) | (=FP+TN) | |
| | * No. true | * positives | * detected | * | * | * | * | * | * |* |______________________________ No. false positives detected |
| | * SENS | * | * "rate of | * true | * positives" | * | * | * | * |* |______________________________ 1 - SPEC "rate of false positives" |
Proportion of 1.0| hits detected | that are of a | certain type | | * * TRUE | * . . CROSS | * . | * . | * . x x FALSE | *. x |*. x |______________________________ 0 1.0 Proportion of total number of hits detected. |
Frequency | | ___ | | | | ___| | | ___ | | | | | | | | | | ___ | | | | | | | |___| | | | | | | | | |___| | | |___| | | | | | | | | | | | | | | |___|___|___|___|___|___|___|__ Bins for different ranges of value of ROCn value |
Rank File1 File2 File3 ROC3 ROC3 ROC3 1 TRUE TRUE TRUE 2 TRUE TRUE TRUE 3 TRUE TRUE TRUE 4 FALSE TRUE TRUE 5 FALSE TRUE TRUE 6 FALSE FALSE TRUE 7 FALSE FALSE 8 TRUE FALSE 9 FALSE TRUE 10 TRUE 11 FALSE |
Program name | Description |
---|---|
cathparse | Generate DCF file from raw CATH files |
domainalign | Generate alignments (DAF file) for nodes in a DCF file |
domainnr | Remove redundant domains from a DCF file |
domainrep | Reorder DCF file to identify representative structures |
domainseqs | Add sequence records to a DCF file |
domainsse | Add secondary structure records to a DCF file |
helixturnhelix | Identify nucleic acid-binding motifs in protein sequences |
libgen | Generate discriminating elements from alignments |
matgen3d | Generate a 3D-1D scoring matrix from CCF files |
pepcoil | Predict coiled coil regions in protein sequences |
rocon | Generate a hits file from comparing two DHF files |
scopparse | Generate DCF file from raw SCOP files |
seqalign | Extend alignments (DAF file) with sequences (DHF file) |
seqfraggle | Remove fragment sequences from DHF files |
seqsort | Remove ambiguous classified sequences from DHF files |
seqwords | Generate DHF files from keyword search of UniProt |
ssematch | Search a DCF file for secondary structure matches |
See also http://emboss.sourceforge.net/