Entrezpy: NCBI Entrez databases at your fingertips
Synopsis
$ pip install entrezpy --user
>>> import entrezpy.conduit
>>> c = entrezpy.conduit.Conduit('myemail')
>>> fetch_influenza = c.new_pipeline()
>>> sid = fetch_influenza.add_search({'db' : 'nucleotide', 'term' : 'H3N2 [organism] AND HA', 'rettype':'count', 'sort' : 'Date Released', 'mindate': 2000, 'maxdate':2019, 'datetype' : 'pdat'})
>>> fid = fetch_influenza.add_fetch({'retmax' : 10, 'retmode' : 'text', 'rettype': 'fasta'}, dependency=sid)
>>> c.run(fetch_influenza)
Entrezpy is a dedicated Python library for interacting with NCBI Entrez
databases [Entrez2016] via the E-Utilities [Sayers2018]. Entrezpy simplifies
writing queries that search or download data from the Entrez databases,
e.g. searching for specific sequences or publications, or fetching your
favorite genome. For more complex queries, entrezpy offers the class
entrezpy.conduit.Conduit to run query pipelines or reuse previous queries.
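To make concrete what a query against the E-Utilities looks like at the HTTP level, the following minimal sketch encodes the search parameters from the synopsis into an Esearch request URL using only the Python standard library. It does not use entrezpy and does not show how entrezpy builds its requests; the helper name build_esearch_url and the tool value are hypothetical and chosen for illustration only. No request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public E-Utilities endpoint for Esearch requests.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(params, email, tool="my_tool"):
    """Hypothetical helper (not part of entrezpy): encode search
    parameters, plus the email and tool identifiers that NCBI asks
    clients to send, into an Esearch URL."""
    query = dict(params)      # copy so the caller's dict is not modified
    query["email"] = email
    query["tool"] = tool      # assumed tool name, illustration only
    return ESEARCH_URL + "?" + urlencode(query)


# Same search parameters as in the synopsis above.
search = {
    "db": "nucleotide",
    "term": "H3N2 [organism] AND HA",
    "rettype": "count",
    "sort": "Date Released",
    "mindate": 2000,
    "maxdate": 2019,
    "datetype": "pdat",
}

url = build_esearch_url(search, email="myemail")
print(url)
```

The printed URL only shows how the parameters are transmitted. Request limits, retries, result parsing, and linking the search to the follow-up fetch are the parts a library such as entrezpy takes care of.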
Supported E-Utility functions:
- Entrez pipeline design helper class: Conduit module
- NCBI Entrez utilities and associated parameters: https://dataguide.nlm.nih.gov/eutilities/utilities.html
entrezpy publication: [Buchmann2019]
Licence and Copyright
entrezpy is licensed under the GNU Lesser General Public License v3
(LGPLv3) or later.
Concerning the copyright of the material available through E-Utilities, please read their disclaimer and copyright statement at https://www.ncbi.nlm.nih.gov/home/about/policies/.
Contact
To report bugs and/or errors, please open an issue at https://gitlab.com/ncbipy/entrezpy or contact me at: jan.buchmann@sydney.edu.au
Of course, feel free to fork the code, improve it, and/or open a pull request.
NCBI API key
NCBI offers API keys to allow more requests per second. For more details and
the rationale, see [Sayers2018]. entrezpy checks for NCBI API keys as follows:
- The NCBI API key can be passed as a parameter to entrezpy classes
- Entrezpy checks for the environment variable $NCBI_API_KEY
- The name of the environment variable, e.g. NCBI_API_KEY, can be passed via the apikey_var parameter to any class derived from entrezpy.base.query.EutilsQuery.
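The lookup described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not entrezpy's implementation: the function name resolve_api_key and the environ argument are hypothetical, and treating the order of the list as the order of precedence is an assumption. Only the apikey_var parameter name and the NCBI_API_KEY variable come from the text above.

```python
import os


def resolve_api_key(apikey=None, apikey_var=None, environ=None):
    """Hypothetical helper (not entrezpy code): return an NCBI API key
    or None, checking the sources in the order listed above."""
    env = os.environ if environ is None else environ

    # 1. A key passed directly as a parameter.
    if apikey:
        return apikey

    # 2. The default environment variable NCBI_API_KEY.
    if env.get("NCBI_API_KEY"):
        return env["NCBI_API_KEY"]

    # 3. An environment variable whose name was given via apikey_var.
    if apikey_var and env.get(apikey_var):
        return env[apikey_var]

    # No key found; requests would run at the lower, keyless rate.
    return None


# Usage with an explicit mapping instead of the real environment.
print(resolve_api_key(apikey_var="MY_NCBI_KEY", environ={"MY_NCBI_KEY": "abc123"}))
```

Passing the environment as an argument keeps the sketch easy to test without modifying os.environ. The actual precedence in entrezpy may differ, so check the library source before relying on it.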
Work in progress
- Easier logging configuration via file
- Simplify Elink results
- Deploy cleaner testing
- Status indication of requests