Integrate Amazon S3 with Relyt
Relyt integrates with Amazon S3, so you can directly query and analyze structured data stored in S3, a common requirement in data lake scenarios. To improve query performance, you can also import data from S3 into Relyt with an 'INSERT INTO ... SELECT' statement, which keeps the analytics workflow inside Relyt. This combines Amazon S3 as the storage layer with Relyt as the analytics engine.
Create External Table for S3 files
In Relyt, first create an external table for the S3 files. Currently, CSV and text format data files are supported; support for Parquet and ORC is planned. You can create the table in an existing schema of a database, or create a new database and schema for the external table. The following sample statement shows how to use CREATE EXTERNAL TABLE in an existing database schema.
CREATE EXTERNAL TABLE lineitem_ext (
l_orderkey INTEGER,
l_partkey INTEGER,
l_suppkey INTEGER,
l_linenumber INTEGER,
l_quantity DECIMAL(15, 2),
l_extendedprice DECIMAL(15, 2),
l_discount DECIMAL(15, 2),
l_tax DECIMAL(15, 2),
l_returnflag VARCHAR,
l_linestatus VARCHAR,
l_shipdate DATE,
l_commitdate DATE,
l_receiptdate DATE,
l_shipinstruct VARCHAR,
l_shipmode VARCHAR,
l_comment VARCHAR,
l_dummy VARCHAR
)
LOCATION('s3://s3.ap-southeast-1.amazonaws.com/sample_data/tpch/100m/lineitem.tbl
accessid=<your_aws_access_id>
secret=<your_aws_access_secret>
region=ap-southeast-1')
FORMAT 'csv' (delimiter '|');
Replace the parameters in the above statement as described below, and change the table name and column definitions to match your file schema.
s3://s3.ap-southeast-1.amazonaws.com/sample_data/tpch/100m/lineitem.tbl: replace it with your S3 file path
<your_aws_access_id>: replace it with your AWS Access ID
<your_aws_access_secret>: replace it with your AWS Access Secret
region=ap-southeast-1: replace it with your region ID
delimiter '|': specify the column separator used in your CSV file
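As a further illustration, the same statement shape could be adapted to a comma-separated file. This is only a sketch that mirrors the syntax of the sample above: the table name orders_ext, its columns, and the angle-bracket placeholders in the path are hypothetical, and the options your Relyt version accepts should be checked before relying on it.

```sql
-- Hypothetical example: an external table over a comma-separated file.
-- Table name, columns, and <...> placeholders are assumptions for illustration.
CREATE EXTERNAL TABLE orders_ext (
    o_orderkey INTEGER,
    o_custkey INTEGER,
    o_totalprice DECIMAL(15, 2),
    o_orderdate DATE
)
LOCATION('s3://s3.<your_region>.amazonaws.com/<your_bucket>/<your_path>/orders.csv
accessid=<your_aws_access_id>
secret=<your_aws_access_secret>
region=<your_region>')
FORMAT 'csv' (delimiter ',');
```

Only the column list, the file path, and the delimiter differ from the lineitem_ext sample.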
Query data from S3 external table
Now you can query the data in the S3 external table directly, as shown below.
SELECT * FROM lineitem_ext LIMIT 10;
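External tables are not limited to simple row previews. The sketch below, loosely modeled on a TPC-H style pricing summary, assumes the lineitem_ext table defined earlier and uses only standard SQL aggregation. The date cutoff is an arbitrary illustrative value.

```sql
-- Summarize line items by return flag and line status,
-- reading directly from the S3 external table.
SELECT l_returnflag,
       l_linestatus,
       SUM(l_quantity) AS sum_qty,
       SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
       COUNT(*) AS line_count
FROM lineitem_ext
WHERE l_shipdate <= DATE '1998-09-02'  -- illustrative cutoff, adjust as needed
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
```

Because each such query reads the file from S3, repeated analytical queries are a natural candidate for importing the data into a Relyt table first.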
Import data from external table into Relyt
You can create a Relyt table and import data from the S3 external table into it to get higher query performance, and potentially a higher data compression ratio that reduces your storage cost.
CREATE TABLE lineitem (
l_orderkey INTEGER,
l_partkey INTEGER,
l_suppkey INTEGER,
l_linenumber INTEGER,
l_quantity DECIMAL(15, 2),
l_extendedprice DECIMAL(15, 2),
l_discount DECIMAL(15, 2),
l_tax DECIMAL(15, 2),
l_returnflag VARCHAR,
l_linestatus VARCHAR,
l_shipdate DATE,
l_commitdate DATE,
l_receiptdate DATE,
l_shipinstruct VARCHAR,
l_shipmode VARCHAR,
l_comment VARCHAR,
l_dummy VARCHAR
);
INSERT INTO lineitem SELECT * FROM lineitem_ext;
SELECT * FROM lineitem LIMIT 10;
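After the import, it can be useful to confirm that every row arrived. A minimal sketch, assuming both the lineitem_ext and lineitem tables from the statements above exist, compares their row counts in a single query:

```sql
-- Compare the row count of the external table with the imported table.
-- The two values should match if the import completed without loss.
SELECT
    (SELECT COUNT(*) FROM lineitem_ext) AS external_rows,
    (SELECT COUNT(*) FROM lineitem) AS imported_rows;
```

Note that counting rows in the external table scans the S3 file again, so on large files this check may take a while.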