Efetch modules
Efetcher
class entrezpy.efetch.efetcher.Efetcher(tool, email, apikey=None, apikey_var=None, threads=None, qid=None)
Bases: entrezpy.base.query.EutilsQuery
Efetcher implements Efetch E-Utilities queries [0]. It implements entrezpy.base.query.EutilsQuery.inquire() to fetch data from NCBI Entrez servers.
[0]: https://www.ncbi.nlm.nih.gov/books/NBK25499/#chapter4.EFetch
[1]: https://www.ncbi.nlm.nih.gov/books/NBK25497/table/chapter2.T._entrez_unique_identifiers_ui/?report=objectonly
Variables: result – entrezpy.base.result.EutilsResult
inquire
(parameter, analyzer=<entrezpy.efetch.efetch_analyzer.EfetchAnalyzer object>)¶Implements entrezpy.base.query.EutilsQuery.inquire() and configures the fetch.
Note
Efetch prefers to know the number of UIDs to fetch, i.e. the number of UIDs passed or retmax. If this information is missing, the maximum number of UIDs for the specific retmode and rettype is fetched.
Parameters:
- parameter (dict) – EFetch parameter
- analyzer (entrezpy.base.analyzer.EutilsAnalyzer) – analyzer for Efetch results
Returns: analyzer instance or None if request errors have been encountered
Return type: entrezpy.base.analyzer.EutilsAnalyzer or None
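As a rough illustration of the call documented above, the following sketch builds a parameter dict and passes it to inquire(). The helper names build_fetch_parameter and fetch_records are hypothetical, the tool name, email address and UIDs are placeholders, and passing the UIDs as a list under the 'id' key is an assumption about how entrezpy expects E-Utilities parameters. Only the Efetcher constructor, inquire() and get_result() come from this page.

```python
def build_fetch_parameter(db, uids, retmode="xml", rettype=None):
    """Assemble an EFetch parameter dict (hypothetical helper, not part of entrezpy)."""
    parameter = {"db": db, "id": list(uids), "retmode": retmode}
    if rettype is not None:
        parameter["rettype"] = rettype
    return parameter


def fetch_records(tool, email, parameter):
    """Run one Efetch query; needs entrezpy installed and network access."""
    # Imported lazily so the helper above can be used without entrezpy.
    import entrezpy.efetch.efetcher

    fetcher = entrezpy.efetch.efetcher.Efetcher(tool, email)
    analyzer = fetcher.inquire(parameter)
    if analyzer is None:  # documented: None if request errors were encountered
        raise RuntimeError("Efetch request failed")
    return analyzer.get_result()


# Placeholder UIDs; the fetch itself is left commented out on purpose.
parameter = build_fetch_parameter("pubmed", [17284678, 9997])
# result = fetch_records("example-tool", "user@example.org", parameter)
```

Because the default analyzer is described as superficial, real code would usually pass its own analyzer as the second argument to inquire().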
EfetchParameter
entrezpy.efetch.efetch_parameter.DEF_RETMODE = 'xml'
Default retmode for fetch requests
class entrezpy.efetch.efetch_parameter.EfetchParameter(param)
Bases: entrezpy.base.parameter.EutilsParameter
EfetchParameter implements checks and configures an Efetch query. A fetch query knows its size from the id parameter or from an earlier result stored on the Entrez history server using WebEnv and query_key. The default retmode (fetch format) is set to XML because every E-Utility can return XML, whereas JSON is not available for all of them.
req_limits
= {'json': 500, 'text': 10000, 'xml': 10000}¶Max number of UIDs to fetch per request mode
valid_retmodes
= {'gene': {'text', 'xml'}, 'nuccore': {'text', 'xml'}, 'pmc': {'xml'}, 'poset': {'text', 'xml'}, 'protein': {'text', 'xml'}, 'pubmed': {'text', 'xml'}, 'sequences': {'text', 'xml'}}¶Valid retmodes for fetch requests, by database
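The two class attributes above can be read as lookup tables. The sketch below copies their values from this page and shows one way to validate a database/retmode pair and to look up the per-request UID limit. The function names are hypothetical, and how EfetchParameter itself combines these tables is not shown on this page.

```python
# Values copied from req_limits and valid_retmodes as documented above.
REQ_LIMITS = {"json": 500, "text": 10000, "xml": 10000}
VALID_RETMODES = {
    "gene": {"text", "xml"},
    "nuccore": {"text", "xml"},
    "pmc": {"xml"},
    "poset": {"text", "xml"},
    "protein": {"text", "xml"},
    "pubmed": {"text", "xml"},
    "sequences": {"text", "xml"},
}


def is_valid_retmode(db, retmode):
    """Return True if retmode is listed for db (hypothetical helper)."""
    return retmode in VALID_RETMODES.get(db, set())


def max_uids_per_request(retmode):
    """Return the per-request UID limit for retmode, or None if unknown."""
    return REQ_LIMITS.get(retmode)


print(is_valid_retmode("pmc", "text"))  # False
print(max_uids_per_request("json"))     # 500
```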
adjust_retmax
(retmax)¶Adjusts retmax parameter. Order of check is crucial.
Parameters: retmax (int) – retmax value Returns: adjusted retmax or None if all UIDs are fetched Return type: int or None
check_retmode
(retmode)¶Checks for valid retmode and retmode combination
Parameters: retmode (str) – retmode parameter Returns: retmode Return type: str
adjust_reqsize
(reqsize)¶Adjusts request size for query
Parameters: reqsize (str or None) – Request size parameter Returns: adjusted request size Return type: int
calculate_expected_requests
(qsize=None, reqsize=None)¶Calculates and sets the expected number of requests. Uses internal parameters if none are provided.
Parameters:
- qsize (int or None) – query size, i.e. expected number of data sets
- reqsize (int) – number of data sets to fetch in one request
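The expected number of requests follows from the query size and the request size. A minimal sketch of that arithmetic, assuming a plain ceiling division; the function name is hypothetical and the library's own bookkeeping may differ:

```python
import math


def expected_requests(qsize, reqsize):
    """Number of requests needed to fetch qsize data sets, reqsize at a time."""
    if qsize <= 0 or reqsize <= 0:
        return 0
    return math.ceil(qsize / reqsize)


print(expected_requests(25000, 10000))  # 3
print(expected_requests(500, 500))      # 1
```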
haveDb
()¶Check for required db parameter
Return type: bool
haveExpectedRequets
()¶Check for expected requests. Hints at an error if no requests are expected.
Return type: bool
haveQuerykey
()¶Check for required QueryKey parameter
Return type: bool
haveWebenv
()¶Check for required WebEnv parameter
Return type: bool
useHistory
()¶Check if history server should be used.
Return type: bool
check
()¶Implements
entrezpy.base.parameter.EutilsParameter.check
to check for the minimum required parameters. Aborts if any check fails.
dump
()¶Dump instance attributes
Return type: dict Raises: NotImplementedError – if not implemented
EfetchAnalyzer
class entrezpy.efetch.efetch_analyzer.EfetchAnalyzer
Bases: entrezpy.base.analyzer.EutilsAnalyzer
EfetchAnalyzer implements a basic analysis of Efetch E-Utils responses. It stores results in an entrezpy.efetch.efetch_result.EfetchResult instance.
Note
This is a very superficial analyzer for documentation and educational purposes. In almost all cases a more specific analyzer has to be implemented by inheriting from entrezpy.base.analyzer.EutilsAnalyzer and implementing the virtual functions entrezpy.base.analyzer.EutilsAnalyzer.analyze_result() and entrezpy.base.analyzer.EutilsAnalyzer.analyze_error().
Variables: result – entrezpy.efetch.efetch_result.EfetchResult
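A more specific analyzer, as suggested in the note above, might look roughly like the following sketch. The class name PmidAnalyzer is hypothetical, a plain list stands in for a proper result class, and the PMID tag is an assumption about the PubMed XML being fetched. The fallback base class exists only so the sketch runs without entrezpy installed; its two attributes are assumptions, not the library's definition.

```python
import io
import xml.etree.ElementTree as ET

try:
    from entrezpy.base.analyzer import EutilsAnalyzer
except ImportError:
    class EutilsAnalyzer:  # stand-in so the sketch runs without entrezpy
        def __init__(self):
            self.hasErrorResponse = False
            self.result = None


class PmidAnalyzer(EutilsAnalyzer):
    """Collects PMIDs from PubMed XML responses (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def init_result(self, response, request):
        # A plain list stands in for a proper EutilsResult subclass.
        if self.result is None:
            self.result = []

    def analyze_result(self, response, request):
        self.init_result(response, request)
        root = ET.parse(response).getroot()
        # Collects every PMID element found anywhere in the document.
        for pmid in root.iter("PMID"):
            self.result.append(pmid.text)

    def analyze_error(self, response, request):
        # Keep the converted error response for later inspection.
        self.errors.append(response)


sample = io.StringIO(
    "<PubmedArticleSet><PubmedArticle><MedlineCitation>"
    "<PMID>17284678</PMID></MedlineCitation></PubmedArticle></PubmedArticleSet>"
)
analyzer = PmidAnalyzer()
analyzer.analyze_result(sample, request=None)
print(analyzer.result)  # ['17284678']
```

An instance of such a class would be passed as the analyzer argument of Efetcher.inquire().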
init_result
(response, request)¶Should be implemented if used properly
analyze_result
(response, request)¶Virtual function to handle responses, i.e. parsing them and preparing them for
entrezpy.base.result.EutilsResult
Parameters: response (dict or io.StringIO) – converted response from convert_response()
Raises: NotImplementedError – if implementation is missing
analyze_error
(response, request)¶Virtual function to handle error responses
Parameters: response (dict or io.StringIO) – converted response from convert_response()
Raises: NotImplementedError – if implementation is missing
norm_response
(response, rettype=None)¶Normalizes response for printing
Parameters: response (dict or io.StringIO) – efetch response Returns: str or dict
isEmpty
()¶Test for empty result
Return type: bool
check_error_json
(response)¶Checks for errors in JSON responses. Not unified among Eutil functions.
Parameters: response (dict) – response Returns: status if JSON response has error message Return type: bool
check_error_xml
(response)¶Checks for errors in XML responses
Parameters: response (io.StringIO) – XML response Returns: whether the XML response has an error message Return type: bool
convert_response
(raw_response_decoded, request)¶Converts raw_response into the expected format, deduced from request and set via the retmode parameter.
Parameters:
- raw_response (urllib.request.Request) – response from entrezpy.requester.requester.Requester
- request (entrezpy.base.request.EutilsRequest) – query request
Returns: response in parseable format
Return type: dict or io.StringIO
Note
Using threads without locks randomly 'loses' the response, i.e. the raw response is emptied between requests. With locks it works, but threading is then not much faster than non-threading. JSON seems more prone to this than XML.
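The conversion described above can be pictured as a small dispatch on retmode. This is only a sketch under stated assumptions: the documented method receives the request and deduces the format from it, whereas here retmode is passed in directly, and the format check borrows the known_fmts set and the NotImplementedError that this page documents for parse().

```python
import io
import json

KNOWN_FMTS = {"json", "text", "xml"}  # copied from known_fmts on this page


def convert_response(raw_response_decoded, retmode):
    """Return a dict for JSON responses and an io.StringIO otherwise."""
    if retmode not in KNOWN_FMTS:
        raise NotImplementedError("unknown format: {}".format(retmode))
    if retmode == "json":
        return json.loads(raw_response_decoded)
    # XML and text stay as text, wrapped in a file-like object for parsers.
    return io.StringIO(raw_response_decoded)


print(convert_response('{"count": 2}', "json"))       # {'count': 2}
print(convert_response("<a/>", "xml").getvalue())     # <a/>
```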
follow_up
()¶Return follow-up parameters if available
Returns: Follow-up parameters Return type: dict
get_result
()¶Return result
Returns: result instance Return type: entrezpy.base.result.EutilsResult
isErrorResponse
(response, request)¶Checks for error messages in the response from Entrez servers and sets the flag hasErrorResponse.
Parameters:
- response (dict or io.StringIO) – parseable response from convert_response()
- request (entrezpy.base.request.EutilsRequest) – query request
Returns: error status
Return type: bool
isSuccess
()¶Test if response has errors
Return type: bool
known_fmts
= {'json', 'text', 'xml'}¶
parse
(raw_response, request)¶Checks for errors and calls the parser for the raw response.
Parameters:
- raw_response (urllib.request.Request) – response from entrezpy.requester.requester.Requester
- request (entrezpy.base.request.EutilsRequest) – query request
Raises: NotImplementedError – if request format is not in EutilsAnalyzer.known_fmts
EfetchRequest
class entrezpy.efetch.efetch_request.EfetchRequest(eutil, parameter, start, size)
Bases: entrezpy.base.request.EutilsRequest
The EfetchRequest class implements a single request as part of an Efetch query. It stores and prepares the parameters for a single request.
entrezpy.efetch.efetcher.Efetcher.inquire()
calculates start and size for a single request.
Parameters:
- parameter (entrezpy.efetch.efetch_parameter.EfetchParameter) – request parameter
- start (int) – number of first UID to fetch
- size (int) – request size
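Splitting a query into single requests amounts to walking over the query size in steps of the request size. The following sketch shows one way the start and size values could be derived. It assumes zero-based start offsets, and the generator name is hypothetical; it is not the library's own code.

```python
def request_windows(qsize, reqsize):
    """Yield (start, size) pairs covering qsize UIDs in chunks of reqsize."""
    if reqsize <= 0:
        raise ValueError("reqsize must be positive")
    start = 0
    while start < qsize:
        # The last window may be smaller than reqsize.
        size = min(reqsize, qsize - start)
        yield start, size
        start += size


print(list(request_windows(25, 10)))  # [(0, 10), (10, 10), (20, 5)]
```

Each pair would correspond to the start and size arguments of one EfetchRequest.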
get_post_parameter
()¶Virtual function returning the POST parameters for the request from required attributes.
Return type: dict Raises: NotImplementedError
dump
()¶Dumps instance attributes
calc_duration
()¶Calculates request duration
dump_internals
(extend=None)¶Dumps internal attributes for request.
Parameters: extend (dict) – extend dump with additional information
get_request_id
()¶
Returns: full request id Return type: str
prepare_base_qry
(extend=None)¶Returns instance attributes required for every POST request.
Parameters: extend (dict) – parameters extending basic parameters Returns: base parameters for POST request Return type: dict
report_status
(processed_requests=None, expected_requests=None)¶Reports request status when triggered
set_request_error
(error)¶Sets request error and HTTP/URL error message
Parameters: error (str) – HTTP/URL error
set_status_fail
()¶Set status if request failed
set_status_success
()¶Set status if request succeeded
start_stopwatch
()¶Starts the timer used to measure request duration.