Entrezpy: NCBI Entrez databases at your fingertips


Synopsis

$ pip install entrezpy --user
>>> import entrezpy.conduit
>>> c = entrezpy.conduit.Conduit('myemail')
>>> fetch_influenza = c.new_pipeline()
>>> sid = fetch_influenza.add_search({'db' : 'nucleotide', 'term' : 'H3N2 [organism] AND HA', 'rettype':'count', 'sort' : 'Date Released', 'mindate': 2000, 'maxdate':2019, 'datetype' : 'pdat'})
>>> fid = fetch_influenza.add_fetch({'retmax' : 10, 'retmode' : 'text', 'rettype': 'fasta'}, dependency=sid)
>>> c.run(fetch_influenza)

Entrezpy is a dedicated Python library for interacting with the NCBI Entrez databases [Entrez2016] via the E-Utilities [Sayers2018]. Entrezpy simplifies writing queries that search or download data from the Entrez databases, e.g. searching for specific sequences or publications, or fetching your favorite genome. For more complex queries, entrezpy offers the class entrezpy.conduit.Conduit to run query pipelines or reuse previous queries.
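To illustrate what an E-Utilities request looks like underneath such a query, the following sketch builds a plain ESearch URL from the same kind of parameter dictionary used in the synopsis. It uses only the Python standard library and does not show how entrezpy assembles its requests internally; the helper name build_esearch_url is hypothetical and exists only for this example.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(params):
    """Return an ESearch URL for a dict of E-Utility parameters.

    Hypothetical helper for illustration only; not part of entrezpy.
    """
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Same search parameters as in the synopsis, minus the sort option.
    url = build_esearch_url({
        "db": "nucleotide",
        "term": "H3N2 [organism] AND HA",
        "rettype": "count",
        "mindate": 2000,
        "maxdate": 2019,
        "datetype": "pdat",
    })
    print(url)
```

No request is sent here; the sketch only shows how the query parameters map onto the URL that the E-Utilities expect.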

Supported E-Utility functions:

Source code

git clone https://gitlab.com/ncbipy/entrezpy.git

Contact

To report bugs and/or errors, please open an issue at https://gitlab.com/ncbipy/entrezpy or contact me at: jan.buchmann@sydney.edu.au

Of course, feel free to fork the code, improve it, and/or open a pull request.

NCBI API key

NCBI offers API keys to allow more requests per second. For more details and the rationale, see [Sayers2018]. Entrezpy checks for NCBI API keys as follows:

  • The NCBI API key can be passed as a parameter to entrezpy classes
  • Entrezpy checks for the environment variable $NCBI_API_KEY
  • The name of the environment variable holding the key, e.g. NCBI_API_KEY, can be passed via the apikey_var parameter to any class derived from entrezpy.base.query.EutilsQuery.
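The lookup described in the list above can be sketched as a small standalone function. This is not entrezpy's implementation: the function name resolve_api_key and its environ argument are hypothetical, and the precedence used here (explicit key first, then a custom variable named by apikey_var, then $NCBI_API_KEY) is an assumption made for illustration.

```python
import os

# Default environment variable named in the documentation above.
DEFAULT_ENV_VAR = "NCBI_API_KEY"


def resolve_api_key(apikey=None, apikey_var=None, environ=None):
    """Return an NCBI API key, or None if no key can be found.

    Hypothetical helper; the precedence below is an assumption:
    1. a key passed explicitly,
    2. the environment variable named by apikey_var,
    3. the default environment variable NCBI_API_KEY.
    """
    env = os.environ if environ is None else environ
    if apikey:
        return apikey
    if apikey_var:
        value = env.get(apikey_var)
        if value:
            return value
    # Fall back to the default variable; empty strings count as unset.
    return env.get(DEFAULT_ENV_VAR) or None


if __name__ == "__main__":
    # A dict stands in for the process environment in this example.
    fake_env = {"NCBI_API_KEY": "default-key", "MY_KEY": "custom-key"}
    print(resolve_api_key(apikey="explicit-key", environ=fake_env))
    print(resolve_api_key(apikey_var="MY_KEY", environ=fake_env))
    print(resolve_api_key(environ=fake_env))
```

Passing the environment as an argument keeps the sketch easy to test without changing the real process environment.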

Work in progress

  • Easier logging configuration via file
  • Simplify Elink results
  • Deploy cleaner testing
  • Status indication for requests
