Esearch modules¶
Esearcher¶
- class
entrezpy.esearch.esearcher.
Esearcher
(tool, email, apikey=None, apikey_var=None, threads=None, qid=None)¶Bases:
entrezpy.base.query.EutilsQuery
Esearcher implements ESearch queries to NCBI’s E-Utilities. Esearch queries return UIDs or WebEnv/QueryKey references to the Entrez History server. Esearcher implements
entrezpy.base.query.EutilsQuery.inquire()
which analyzes the first result and automatically configures subsequent requests to get all queried UIDs if required.
inquire
(parameter, analyzer=<entrezpy.esearch.esearch_analyzer.EsearchAnalyzer object>)¶Implements
entrezpy.base.query.EutilsQuery.inquire()
and configures follow-up requests if required.
Parameters:
- parameter (dict) – ESearch parameter
- analyzer (analyzer) – analyzer for ESearch results, default is
entrezpy.esearch.esearch_analyzer.EsearchAnalyzer
Returns: analyzer instance or None if request errors have been encountered
Return type:
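A minimal usage sketch for the query class above, assuming entrezpy is installed and NCBI is reachable. The helper names `build_parameter` and `search_uids` are hypothetical, the tool name, email address, and search term in the trailing comment are placeholders, and `db`, `term`, and `retmax` are the standard ESearch parameters.

```python
def build_parameter(db, term, retmax=None):
    """Assemble an ESearch parameter dict (hypothetical helper)."""
    parameter = {'db': db, 'term': term}
    if retmax is not None:
        parameter['retmax'] = retmax
    return parameter


def search_uids(tool, email, parameter):
    """Run one ESearch query; return the collected UIDs or None on request errors."""
    # Imported lazily so this sketch can be loaded without entrezpy installed.
    import entrezpy.esearch.esearcher

    searcher = entrezpy.esearch.esearcher.Esearcher(tool, email)
    analyzer = searcher.inquire(parameter)
    if analyzer is None:  # inquire() returns None if request errors were encountered
        return None
    return analyzer.get_result().uids


# Example call (needs network access; all values are placeholders):
# search_uids('esearch_example', 'user@example.org',
#             build_parameter('nucleotide', 'viruses[orgn]', retmax=10))
```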
initial_search
(parameter, analyzer)¶Does first request and triggers follow-up if required or possible.
Parameters:
- parameter (
entrezpy.esearch.esearch_parameter.EsearchParameter
) – Esearch parameter instance
- analyzer (
entrezpy.esearch.esearch_analyzer.EsearchAnalyzer
) – Esearch analyzer instance
Returns: follow-up parameter or None
Return type:
entrezpy.esearch.esearch_parameter.EsearchParameter
or None
isGoodQuery
()¶Tests for request errors
Return type: bool
entrezpy.esearch.esearcher.
configure_follow_up
(parameter, analyzer)¶Adjusts the EsearchParameter for follow-up requests based on the initial Esearch result. Remaining UIDs are fetched using the history server.
Parameters:
- analyzer (
entrezpy.esearch.esearch_analyzer.EsearchAnalyzer
) – Esearch analyzer instance
- parameter – Initial Esearch parameter
entrezpy.esearch.esearcher.
reachedLimit
(parameter, analyzer)¶Checks if the set limit has been reached
Return type: bool
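The follow-up decision described by the two module functions above can be pictured with a small sketch. This is an illustration under stated assumptions, not the library's implementation: `reached_limit` and `remaining_uids` are hypothetical names, and the limit is assumed to be an optional maximum number of UIDs requested by the user.

```python
def reached_limit(fetched, limit=None):
    """Return True once the number of fetched UIDs meets the (assumed) limit."""
    return limit is not None and fetched >= limit


def remaining_uids(query_size, fetched, limit=None):
    """Number of UIDs a follow-up still has to fetch; never negative."""
    # The target is the full query size unless a smaller limit was set.
    target = query_size if limit is None else min(query_size, limit)
    return max(target - fetched, 0)
```

With a query size of 250, 100 UIDs already fetched, and no limit, `remaining_uids` reports 150 UIDs left for follow-up requests.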
EsearchParameter¶
entrezpy.esearch.esearch_parameter.
MAX_REQUEST_SIZE
= 100000¶Maximum number of UIDs for one request
- class
entrezpy.esearch.esearch_parameter.
EsearchParameter
(parameter)¶Bases:
entrezpy.base.parameter.EutilsParameter
EsearchParameter checks query specific parameters and configures an Esearch query. If more than one request is required the instance is reconfigured by
entrezpy.esearch.esearcher.Esearcher.configure_follow_up()
.
Note
EsearchParameter works best when using the NCBI Entrez history server. If usehistory is not used, linking requests cannot be guaranteed.
goodDateparam
()¶
Return type: bool
useMinMaxDate
()¶
Return type: bool
set_uilist
(rettype)¶
Return type: bool
adjust_retmax
(retmax)¶Adjusts retmax parameter. Order of check is crucial.
Parameters: retmax (int) – retmax value
Returns: adjusted retmax
Return type: int
adjust_reqsize
(request_size)¶Adjusts request size for low retmax
Returns: adjusted request size
Return type: int
calculate_expected_requests
(qsize=None, reqsize=None)¶Calculates and sets the expected number of requests. Uses internal parameters if none are provided.
Parameters:
- qsize (int or None) – query size, i.e. expected number of data sets
- reqsize (int) – number of data sets to fetch in one request
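The arithmetic behind request sizing can be sketched as follows. This is an assumption about the general approach (clamp the request size, then use ceiling division), not a copy of the library code; `clamp_request_size` and `expected_requests` are hypothetical names, while `MAX_REQUEST_SIZE` repeats the value documented above.

```python
MAX_REQUEST_SIZE = 100000  # maximum number of UIDs for one request, as documented above


def clamp_request_size(request_size, retmax=None):
    """Keep the request size within the maximum and within a low retmax."""
    size = min(request_size, MAX_REQUEST_SIZE)
    if retmax is not None:
        size = min(size, retmax)
    return size


def expected_requests(qsize, reqsize):
    """Ceiling division: requests needed to fetch qsize data sets, reqsize at a time."""
    if qsize <= 0 or reqsize <= 0:
        return 0
    return (qsize + reqsize - 1) // reqsize
```

For example, a query size of 250000 with the maximum request size needs three requests: two full ones and a final one for the remaining 50000 UIDs.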
check
()¶Implements
entrezpy.base.parameter.EutilsParameter.check
to check for the minimum required parameters. Aborts if any check fails.
dump
()¶Dump instance attributes
Return type: dict
Raises: NotImplementedError – if not implemented
haveDb
()¶Check for required db parameter
Return type: bool
haveExpectedRequets
()¶Check for expected requests. Hints at an error if no requests are expected.
Return type: bool
haveQuerykey
()¶Check for required QueryKey parameter
Return type: bool
haveWebenv
()¶Check for required WebEnv parameter
Return type: bool
useHistory
()¶Check if history server should be used.
Return type: bool
EsearchAnalyzer¶
- class
entrezpy.esearch.esearch_analyzer.
EsearchAnalyzer
¶Bases:
entrezpy.base.analyzer.EutilsAnalyzer
EsearchAnalyzer implements the analysis of ESearch responses from E-Utils. JSON formatted data is enforced in responses. The result is stored as an
entrezpy.esearch.esearch_result.EsearchResult
instance.
Variables: result – entrezpy.esearch.esearch_result.EsearchResult
init_result
(response, request)¶Inits
entrezpy.esearch.esearch_result.EsearchResult
.
Returns: whether the result has been initialized
Return type: bool
analyze_result
(response, request)¶Implements
entrezpy.base.analyzer.EutilsAnalyzer.analyze_result()
.
Parameters:
- response (dict) – Esearch response
- request (
entrezpy.esearch.esearch_request.EsearchRequest
) – Esearch request
analyze_error
(response, request)¶Implements
entrezpy.base.analyzer.EutilsAnalyzer.analyze_error()
.
Parameters:
- response (dict) – Esearch response
- request (
entrezpy.esearch.esearch_request.EsearchRequest
) – Esearch request
size
()¶Returns number of analyzed UIDs in
result
Return type: int
query_size
()¶Returns number of expected UIDs in
result
Return type: int
reference
()¶Returns History Server references from
result
Returns: History Server references
Return type: entrezpy.base.referencer.EutilReferencer.Reference
adjust_followup
(parameter)¶Adjusts result attributes from follow-up.
Parameters:
- parameter (
entrezpy.esearch.esearch_parameter.EsearchParameter
) – Esearch parameter
check_error_json
(response)¶Checks for errors in JSON responses. Not unified among Eutil functions.
Parameters: response (dict) – response
Returns: status indicating whether the JSON response has an error message
Return type: bool
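A JSON error check of this kind can be sketched as below. The key names are assumptions about the ESearch JSON layout (a top-level `error` field, or an `ERROR` field inside `esearchresult`), and `has_json_error` is a hypothetical name rather than the method's actual body.

```python
def has_json_error(response):
    """Return True if a decoded ESearch JSON response carries an error message."""
    # Assumed layout 1: a top-level error field, e.g. for rejected requests.
    if 'error' in response:
        return True
    # Assumed layout 2: an ERROR field inside the esearchresult object.
    return 'ERROR' in response.get('esearchresult', {})
```

Because error reporting is not unified among the E-Utilities, a check like this would need its own key names for each Eutil function.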
check_error_xml
(response)¶Checks for errors in XML responses
Parameters: response (io.StringIO) – XML response
Returns: whether the XML response has an error message
Return type: bool
convert_response
(raw_response_decoded, request)¶Converts raw_response into the expected format, deduced from request and set via the retmode parameter.
Parameters:
- raw_response (
urllib.request.Request
) – response from entrezpy.requester.requester.Requester
- request (
entrezpy.base.request.EutilsRequest
) – query request
Returns: response in parseable format
Return type: dict or
io.StringIO
Note
Using threads without locks randomly ‘loses’ the response, i.e. the raw response is emptied between requests. With locks, it works, but threading is not much faster than non-threading. It seems JSON is more prone to this than XML.
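The conversion step can be illustrated with the standard library alone. This sketch assumes the decoded response is a string and that the format is passed in directly as `retmode`; the function below is a simplified stand-in with a different signature from the documented method, and `KNOWN_FMTS` mirrors the documented `known_fmts` set.

```python
import io
import json

KNOWN_FMTS = {'json', 'text', 'xml'}  # mirrors the documented known_fmts


def convert_decoded_response(raw_response_decoded, retmode):
    """Turn a decoded response string into a dict (JSON) or io.StringIO (XML, text)."""
    if retmode not in KNOWN_FMTS:
        raise NotImplementedError('unknown format: {}'.format(retmode))
    if retmode == 'json':
        return json.loads(raw_response_decoded)
    # XML and plain text are wrapped in a file-like object for later parsing.
    return io.StringIO(raw_response_decoded)
```

Returning a dict for JSON and a file-like object for everything else matches the documented return type of dict or io.StringIO.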
follow_up
()¶Return follow-up parameters if available
Returns: Follow-up parameters
Return type: dict
get_result
()¶Return result
Returns: result instance
Return type: entrezpy.base.result.EutilsResult
isEmpty
()¶Test for empty result
Return type: bool
isErrorResponse
(response, request)¶Checks for error messages in the response from the Entrez servers and sets the flag
hasErrorResponse
.
Parameters:
- response (dict or
io.StringIO
) – parseable response from convert_response()
- request (
entrezpy.base.request.EutilsRequest
) – query request
Returns: error status
Return type: bool
isSuccess
()¶Test if response has errors
Return type: bool
known_fmts
= {'json', 'text', 'xml'}¶
parse
(raw_response, request)¶Checks for errors and calls the parser for the raw response.
Parameters:
- raw_response (
urllib.request.Request
) – response from entrezpy.requester.requester.Requester
- request (
entrezpy.base.request.EutilsRequest
) – query request
Raises: NotImplementedError – if request format is not in
EutilsAnalyzer.known_fmts
EsearchRequest¶
- class
entrezpy.esearch.esearch_request.
EsearchRequest
(eutil, parameter, start, size)¶Bases:
entrezpy.base.request.EutilsRequest
The EsearchRequest class implements a single request as part of an Esearch query. It stores and prepares the parameters for a single request. See
entrezpy.elink.elink_parameter.ElinkParameter
for parameter description. Request sizes are configured by setting a start, i.e. the index of the first UID to fetch, and a size, i.e. how many UIDs to fetch. These are set by entrezpy.esearch.esearcher.Esearcher.inquire()
.
Parameters:
- parameter (
entrezpy.elink.elink_parameter.ElinkParameter
) – request parameter
- start (int) – number of first UID to fetch
- size (int) – request size
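How a query might be split into start and size pairs can be sketched as follows. `plan_requests` is a hypothetical helper, not part of the documented API; it only illustrates the start index and request size described above.

```python
def plan_requests(query_size, request_size):
    """Return (start, size) pairs that together cover query_size UIDs."""
    if request_size <= 0:
        raise ValueError('request_size must be positive')
    return [(start, min(request_size, query_size - start))
            for start in range(0, query_size, request_size)]
```

For a query of 250 UIDs fetched 100 at a time, this yields the pairs (0, 100), (100, 100), and (200, 50), so only the last request is smaller than the configured size.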
get_post_parameter
()¶Virtual function returning the POST parameters for the request from required attributes.
Return type: dict
Raises: NotImplementedError –
dump
()¶
Return type: dict
calc_duration
()¶Calculates request duration
dump_internals
(extend=None)¶Dumps internal attributes for request.
Parameters: extend (dict) – extend dump with additional information
get_request_id
()¶
Returns: full request id
Return type: str
prepare_base_qry
(extend=None)¶Returns instance attributes required for every POST request.
Parameters: extend (dict) – parameters extending basic parameters
Returns: base parameters for POST request
Return type: dict
report_status
(processed_requests=None, expected_requests=None)¶Reports request status when triggered
set_request_error
(error)¶Sets request error and HTTP/URL error message
Parameters: error (str) – HTTP/URL error
set_status_fail
()¶Set status if request failed
set_status_success
()¶Set status if request succeeded
start_stopwatch
()¶Starts time to measure request duration.
EsearchResult¶
- class
entrezpy.esearch.esearch_result.
EsearchResult
(response, request)¶Bases:
entrezpy.base.result.EutilsResult
EsearchResult stores fetched UIDs and/or WebEnv-QueryKeys and creates follow-up parameters. UIDs are stored as strings, even when they are numeric, since responses can also contain accessions when using the idtype option.
Parameters:
- response (dict) – Esearch response
- request (
entrezpy.esearch.esearch_request.EsearchRequest
) – Esearch request instance for this query
Variables: uids (list) – analyzed UIDs from response
dump
()¶
Return type: dict
get_link_parameter
(reqnum=0)¶Assemble follow-up parameters for linking. The first request returns all required information, so its querykey is used in such a case.
Return type: dict
isEmpty
()¶Empty search result has no webenv/querykey and/or no fetched UIDs.
size
()¶Returns number of analyzed UIDs.
Return type: int
query_size
()¶Returns number of all UIDs for search (count).
Return type: int
add_response
(response)¶Adds responses from individual requests.
Parameters: response (dict) – Esearch response
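The way responses from several requests accumulate into one result can be pictured with a small stand-in class. This is a sketch under assumptions, not the library's class: `UidCollector` is a hypothetical name, and the `esearchresult`, `count`, and `idlist` keys are assumed from the ESearch JSON format.

```python
class UidCollector:
    """Hypothetical stand-in that gathers UIDs from several ESearch JSON responses."""

    def __init__(self):
        self.uids = []   # UIDs kept as strings, since accessions are possible
        self.count = 0   # total number of UIDs matching the search

    def add_response(self, response):
        result = response['esearchresult']
        self.count = int(result['count'])
        self.uids.extend(str(uid) for uid in result.get('idlist', []))

    def size(self):
        return len(self.uids)

    def query_size(self):
        return self.count

    def is_empty(self):
        return not self.uids
```

After adding two responses with two and one UIDs respectively, `size()` reports 3 while `query_size()` still reports the total count announced by the server.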