#If an old run file was found, prompt the user with choices to make.
if(-e "$run_dir") {
print color ("red"), "$cancer_type\_GEO-files directory exists...This run was not completed\n", color("reset");
my $text= "";
#Ask the user how to proceed; timed_response() gives up after 10 seconds and $text stays empty (default [n]).
my $ok= timed_response( sub{
print color ("red"), "Do you want to resume an interrupted execution [r], or start a new one [n]? (r/n)\nDefault selection will be [n] after 10 seconds...\n", color("reset");
$text= <STDIN>;
}, 10);
chomp($text);
if($text eq "r") {
print color ("green"), "Resuming analysis using input file: $restart_input_file\n", color("reset");
#Check for a GeoDatasets timeout error and abort run, if found.
if(!$data) {
print color ("red"), "\nThe download from GeoDatasets was not successful...\nA GeoDatasets timeout error was detected: current run aborted...\nPlease restart the run...\n", color("reset");
#count total diagnostic datasets found by diagnostic_signature_finder() filters
$filter2_count+= $i;
return($signature);
}
exit 0;
=pod

=encoding utf8

=head1 NAME

geoCancerDiagnosticDatasetsRetriever - GEO Cancer Diagnostic Datasets Retriever is a bioinformatics tool for cancer diagnostic dataset retrieval from the GEO website.

The input and output files of geoCancerDiagnosticDatasetsRetriever will be found in the ~/geoCancerDiagnosticDatasetsRetriever_files/data/ and ~/geoCancerDiagnosticDatasetsRetriever_files/results/ directories, respectively.

=head1 DESCRIPTION

Gene Expression Omnibus (GEO) Cancer Diagnostic Datasets Retriever is a bioinformatics tool for cancer diagnostic dataset retrieval from the GEO database. It requires a GEO DataSets input file listing all GSE dataset entries for a specific cancer (for example, myelodysplastic syndrome), obtained as a download from the GEO database. The tool works by applying keyword filters to the individual GSE dataset entries listed in this input file.

The first diagnostic text filter flags entries whose title or abstract contains diagnostic keywords (for example, “diagnosis” or “health”) used by clinical science researchers. A second diagnostic text filter then examines each flagged dataset for diagnostic keywords in the "Overall design" section of its GSE entry. If any are found, the tool outputs the GSE code of the likely diagnostic dataset. If none are found, a more intensive filtering stage is performed: the tool runs an R script (healthyControlsPresentInputParams.r) that searches the dataset's .SOFT file for the desired keywords to determine whether it is a likely diagnostic dataset.
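As a rough illustration of the filtering cascade described above, the following Perl sketch classifies a single GSE entry. It is not the tool's actual implementation: the keyword list, the has_keyword() and classify_entry() subroutines, the hash fields and the 'GSE_EXAMPLE' code are all hypothetical, and the sketch assumes that entries not flagged by the first filter are simply discarded.

    use strict;
    use warnings;

    # Hypothetical keyword list (the tool's real lists are not reproduced here).
    my @keywords = ('diagnosis', 'health');

    # Return 1 if the text contains any keyword (case-insensitive), else 0.
    sub has_keyword {
        my ($text, @words) = @_;
        for my $word (@words) {
            return 1 if $text =~ /\Q$word\E/i;
        }
        return 0;
    }

    # Classify one entry, passed as a hash reference with the assumed
    # fields title, abstract and design ("Overall design" text).
    sub classify_entry {
        my ($entry) = @_;
        my $summary = "$entry->{title} $entry->{abstract}";
        # Filter 1: title/abstract must contain a diagnostic keyword.
        return 'discarded' unless has_keyword($summary, @keywords);
        # Filter 2: a keyword in "Overall design" marks a likely diagnostic dataset.
        return 'diagnostic' if has_keyword($entry->{design}, @keywords);
        # Otherwise the .SOFT file would be inspected (the R script stage).
        return 'needs_soft_check';
    }

    my %entry = (
        code     => 'GSE_EXAMPLE',
        title    => 'Expression profiling for diagnosis of myelodysplastic syndrome',
        abstract => 'Bone marrow samples from patients.',
        design   => 'Patient samples were compared with healthy controls.',
    );
    print "$entry{code}: ", classify_entry(\%entry), "\n";

With the example entry above, the sketch prints "GSE_EXAMPLE: diagnostic", because the title matches the first filter and the design text matches the second. An entry that passes only the first filter would be reported as 'needs_soft_check', which corresponds to the stage handled by the R script.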
=head1 INSTALLATION

geoCancerDiagnosticDatasetsRetriever can be used on any Linux or macOS machine. To run the program, you need to have cURL (version 7.68.0 or later), Lynx (version 2.9.0dev.5 or later), and the R programming language (version 4 or later) installed on your computer.

By default, Perl is installed on all Linux and macOS operating systems, and cURL is installed on all macOS versions. However, cURL and R may not be installed on Linux, and R and Lynx may not be installed on macOS. Any missing program needs to be installed manually through your operating system's software centre. On Linux Ubuntu, geoCancerDiagnosticDatasetsRetriever installs cURL and Lynx automatically.

Manual install:

    $ perl Makefile.PL
    $ make
    $ make install

On Linux Ubuntu, you might need to run the last command as a superuser (sudo make install) and to manually install the libfile-homedir-perl package (sudo apt-get install -y libfile-homedir-perl), if it is not already installed in your Perl 5 configuration.

On Linux Ubuntu, you might also need to run the CPAN install and uninstall commands as a superuser (sudo cpanm App::geoCancerDiagnosticDatasetsRetriever and sudo cpanm --uninstall App::geoCancerDiagnosticDatasetsRetriever).
=head1 DATA FILE

The required input file is a GEO DataSets file obtainable as a download from GEO DataSets, upon querying for any particular cancer (for example, myelodysplastic syndrome) in geoCancerDiagnosticDatasetsRetriever.

=head1 HELP

Help information can be read by typing the following command:

    $ geoCancerDiagnosticDatasetsRetriever -h

This command will print the following instructions:

    Usage: geoCancerDiagnosticDatasetsRetriever -h

    Mandatory arguments:
      CANCER_TYPE           type of the cancer as query search term
      PLATFORM_CODES        list of GPL platform codes

    Optional arguments:
      -h                    show help message and exit
=head1 AUTHORS

Abbas Alameer (Kuwait University) and Davide Chicco (University of Toronto)

For information, please contact Abbas Alameer at abbas.alameer(AT)ku.edu.kw or Davide Chicco at davidechicco(AT)davidechicco.it

=head1 COPYRIGHT AND LICENSE

Copyright 2021 by Abbas Alameer (Kuwait University) and Davide Chicco (University of Toronto)

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License, version 2 (GPLv2).

=cut