tacco.tools.setup_goa_analysis
- setup_goa_analysis(gene_index, gene_info_file='https://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Mus_musculus.gene_info.gz', tax_id=10090, GO_obo_file='http://purl.obolibrary.org/obo/go/go-basic.obo', gene2GO_file='https://ftp.ncbi.nih.gov/gene/DATA/gene2go.gz', working_directory='.')
Set up a GO analysis. This is a convenience wrapper around the goatools package [Klopfenstein18]. Like goatools, it performs the enrichment analysis independently of the availability of web services, using databases that are downloaded once for reproducibility.
- Parameters:
gene_index – The list of all possible genes.
gene_info_file – File containing a mapping from NCBI GeneIDs to gene symbols, e.g. downloaded from https://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Mus_musculus.gene_info.gz. If this is not available as a local file, it is treated as a URL and downloaded to the working_directory if necessary (see working_directory below).
tax_id – The NCBI taxonomy ID used to filter the gene_info_file.
GO_obo_file – File containing the Gene Ontology data, e.g. downloaded from http://purl.obolibrary.org/obo/go/go-basic.obo. If this is not available as a local file, it is treated as a URL and downloaded to the working_directory if necessary (see working_directory below).
gene2GO_file – File containing a mapping from NCBI GeneIDs to Gene Ontology data, e.g. downloaded from https://ftp.ncbi.nih.gov/gene/DATA/gene2go.gz. If this is not available as a local file, it is treated as a URL and downloaded to the working_directory if necessary (see working_directory below).
working_directory – Directory in which downloaded files are buffered. If a file of the same name already exists in this directory, it is not downloaded again.
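The lookup order described for the three file parameters (local file first, otherwise a URL buffered in working_directory) can be sketched roughly as follows. This is a minimal illustration and not tacco's actual implementation: the helper name `resolve_input_file` is hypothetical, and buffering the download under the last path component of the URL is an assumption.

```python
import os
import urllib.request
from urllib.parse import urlparse


def resolve_input_file(path_or_url, working_directory="."):
    """Hypothetical helper: return a local path, downloading only when needed."""
    # An existing local file is used as-is.
    if os.path.isfile(path_or_url):
        return path_or_url
    # Otherwise treat the argument as a URL and buffer it under its file name
    # (assumption: the last path component of the URL is used as the name).
    file_name = os.path.basename(urlparse(path_or_url).path)
    buffered_path = os.path.join(working_directory, file_name)
    # A file of the same name in the working directory is reused, not re-downloaded.
    if not os.path.isfile(buffered_path):
        os.makedirs(working_directory, exist_ok=True)
        urllib.request.urlretrieve(path_or_url, buffered_path)
    return buffered_path
```

Under these assumptions, a second call with the same URL and working directory returns the buffered file without touching the network, which is what keeps repeated analyses reproducible.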
- Returns:
Returns a go_enrichment_ns:GOEnrichmentStudyNS and a Series mapping gene symbols to gene ids. Both are needed to run the enrichment analyses. For convenience, they are also buffered as global objects and used automatically.
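To make the symbol-to-gene-id mapping concrete, here is a rough, dependency-free sketch of how such a mapping could be derived from gene_info-style data filtered by tax_id. It is not tacco's code: the function name `symbol_to_gene_id` is hypothetical, the rows are made-up toy data, a plain dict stands in for the Series, restricting the result to gene_index is an assumption, and the column names `#tax_id`, `GeneID` and `Symbol` are assumed to match the header of NCBI gene_info files (which contain further columns).

```python
import csv
import io

# Toy stand-in for a gene_info file; gene symbols and IDs are invented for illustration.
GENE_INFO_TEXT = (
    "#tax_id\tGeneID\tSymbol\n"
    "10090\t101\tGeneA\n"
    "10090\t102\tGeneB\n"
    "9606\t201\tGeneA\n"
)


def symbol_to_gene_id(gene_info_text, tax_id, gene_index):
    """Hypothetical helper: map gene symbols to gene ids for one taxonomy ID."""
    wanted = set(gene_index)
    reader = csv.DictReader(io.StringIO(gene_info_text), delimiter="\t")
    mapping = {}
    for row in reader:
        # Keep only rows of the requested organism and genes listed in gene_index.
        if int(row["#tax_id"]) == tax_id and row["Symbol"] in wanted:
            mapping[row["Symbol"]] = int(row["GeneID"])
    return mapping


mapping = symbol_to_gene_id(GENE_INFO_TEXT, 10090, ["GeneA", "GeneC"])
print(mapping)  # {'GeneA': 101}
```

The tax_id filter matters because the same symbol can occur for several organisms; in the toy data, GeneA also appears under taxonomy ID 9606 with a different id.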