Use Boto3 to Run Async Queries and Dump Query Results
The AWS SDK for Python (Boto3) provides a Python API for AWS infrastructure services. Relyt is compatible with Boto3, so you can use the Boto3 Athena client to interact with your Amazon S3 or AWS Glue resources through external tables.
Currently, two Boto3 API operations are supported:
- Athena.Client.start_query_execution(**kwargs)
- Athena.Client.get_query_execution(**kwargs)
Only the parameters shown in the request syntax for each API operation are supported. Any other parameters are ignored if specified.
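Because unsupported parameters are ignored rather than rejected, it can help to check a request before sending it. The following is a minimal sketch, not part of Relyt or Boto3: the helper name `split_supported` and the `SUPPORTED_PARAMETERS` table are hypothetical, and the parameter sets are copied from the request syntax shown on this page.

```python
# Top-level parameters listed in the request syntax on this page.
# This table is an illustration and is not provided by Relyt or Boto3.
SUPPORTED_PARAMETERS = {
    "start_query_execution": {
        "QueryString",
        "QueryExecutionContext",
        "ResultConfiguration",
        "WorkGroup",
    },
    "get_query_execution": {"QueryExecutionId"},
}


def split_supported(operation, kwargs):
    """Return (supported kwargs, sorted names of kwargs that would be ignored)."""
    allowed = SUPPORTED_PARAMETERS[operation]
    supported = {name: value for name, value in kwargs.items() if name in allowed}
    ignored = sorted(name for name in kwargs if name not in allowed)
    return supported, ignored
```

For example, passing `ClientRequestToken` (a standard Athena parameter that does not appear in the request syntax here) to `split_supported("start_query_execution", ...)` would report it in the second return value, so you can log or drop it before calling the client.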
Before you start
Ensure that you have connected Relyt to the target external data source. For details about how to connect Relyt to your external data source, see Configure Access to S3 Data Sources Through Integration with Lake Formation.
Create the following two files under the ~/.aws directory and configure the parameters required for each. To create the files in another directory, set the AWS_CONFIG_FILE environment variable.
- credentials: used to save your Open API credentials, including aws_access_key_id, aws_secret_access_key, region, and services.
- configs: used to save the endpoint information endpoint_url.
The following table describes the required parameters and how to obtain them.
Parameter | Description | How to Obtain |
---|---|---|
aws_access_key_id and aws_secret_access_key | The access key to Open API. | Log in to your DW service unit console and choose Access Control > Open API. If no access key/secret key pair is available, generate one. |
region | The region in which the DW service unit is deployed. | Log in to your DW service unit console, choose Home > Overview, and check the value of the Region field. |
services | The service you use, which is fixed to relyt. | N/A |
endpoint_url | The public endpoint to connect to your DW service unit. | Log in to your DW service unit console, choose Access Control > Open API, and check the value of the Public Endpoint field. |
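As a hedged alternative to the profile files, the same values can be mapped onto the keyword arguments that `boto3.client` accepts. This sketch assumes Relyt accepts connections configured this way, which this page does not confirm. The helper names `build_client_kwargs` and `make_client` are hypothetical, and the `services` setting has no keyword-argument counterpart, so it is not covered here.

```python
def build_client_kwargs(access_key_id, secret_access_key, region, endpoint_url):
    """Map the parameters from the table above onto boto3.client keyword arguments."""
    return {
        "service_name": "athena",
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": region,
        "endpoint_url": endpoint_url,
    }


def make_client(access_key_id, secret_access_key, region, endpoint_url):
    """Create an Athena client that points at the given endpoint."""
    # Imported lazily so build_client_kwargs can be used without boto3 installed.
    import boto3

    return boto3.client(
        **build_client_kwargs(access_key_id, secret_access_key, region, endpoint_url)
    )
```

Keeping the values in the profile files, as described above, remains the documented approach; passing them explicitly is mainly useful for short-lived scripts or tests.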
Athena.Client.start_query_execution(**kwargs)
Runs the SQL query statements contained in a query.
Request syntax
response = client.start_query_execution(
    QueryString='string',
    QueryExecutionContext={
        'Database': '<database_name>',
        'Catalog': '<catalog_name>'
    },
    ResultConfiguration={
        'OutputLocation': '<output_location>'
    },
    WorkGroup='<dps_cluster_name>'
)
Description
To call start_query_execution, you must have access permissions on the Extreme DPS cluster on which the query will run.
In Relyt, an Extreme DPS cluster is equivalent to a workgroup in Boto3.
Parameters
- <database_name>: the name of the Relyt database to which the catalog specified by <catalog_name> is mounted.
- <catalog_name>: the name of the catalog.
- <output_location>: the output location used to store the results of async executions.
- <dps_cluster_name>: the name of the DPS cluster on which the query runs.
Sample code
import boto3

# Use the 'relyt' profile configured in the files under ~/.aws.
session = boto3.Session(profile_name='relyt')
client = session.client('athena')

response = client.start_query_execution(
    QueryString='SELECT * FROM "test"."analytic".currency LIMIT 12',
    QueryExecutionContext={
        'Database': 'test_db',
        'Catalog': 'test.analytic'
    },
    ResultConfiguration={
        'OutputLocation': 's3://xxxx/test'
    },
    WorkGroup='dps-2'
)
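In the standard Boto3 Athena API, the response of start_query_execution is a dictionary whose `QueryExecutionId` key identifies the submitted query. The sketch below assumes Relyt returns the same response shape, which this page does not state explicitly. The wrapper name `start_query` is hypothetical.

```python
def start_query(client, sql, database, catalog, output_location, workgroup):
    """Submit an async query and return its execution ID.

    Assumes the response carries a 'QueryExecutionId' key, as in the
    standard Boto3 Athena API.
    """
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database, "Catalog": catalog},
        ResultConfiguration={"OutputLocation": output_location},
        WorkGroup=workgroup,
    )
    return response["QueryExecutionId"]
```

The returned ID is the value that get_query_execution expects in its `QueryExecutionId` parameter.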
Athena.Client.get_query_execution(**kwargs)
Obtains the execution information of a query.
Request syntax
response = client.get_query_execution(
QueryExecutionId='<query_execution_id>'
)
Description
To call get_query_execution, you must have access permissions on the Extreme DPS cluster on which the query ran.
In Relyt, an Extreme DPS cluster is equivalent to a workgroup in Boto3.
Parameters
- <query_execution_id>: the unique ID of the query execution.
Sample code
import boto3

# Use the 'relyt' profile configured in the files under ~/.aws.
session = boto3.Session(profile_name='relyt')
client = session.client('athena')

response = client.get_query_execution(
    QueryExecutionId='70646550539520_114720_20240625_090401_00001_46706_00146'
)
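Because queries run asynchronously, a caller typically polls get_query_execution until the query finishes. The following is a minimal sketch that assumes Relyt mirrors the standard Boto3 Athena response, where the state is found at `response['QueryExecution']['Status']['State']` and the terminal states are SUCCEEDED, FAILED, and CANCELLED. The helper name `wait_for_query` and its default intervals are hypothetical.

```python
import time

# Terminal states in the standard Athena API; assumed to apply to Relyt as well.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(client, query_execution_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll get_query_execution until the query reaches a terminal state.

    Returns the terminal state, or raises TimeoutError if the deadline passes
    while the query is still queued or running.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.get_query_execution(QueryExecutionId=query_execution_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Query {query_execution_id} still in state {state} "
                f"after {timeout_seconds} seconds"
            )
        time.sleep(poll_seconds)
```

A fixed polling interval keeps the sketch short. For long-running queries, a longer or gradually increasing interval reduces the number of API calls.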