CREATE EXTERNAL TABLE
Creates an external table.
Syntax
CREATE [READABLE | WRITABLE] EXTERNAL TABLE <table_name>
( <column_name> <data_type> [, ...] | LIKE <other_table >)
LOCATION ('s3://<s3_endpoint>/<bucket_name>/[<s3_prefix>]
accessid=<access_key_id>
secret=<access_secret>
[threadnum=<thread_number>]
[chunksize=<chunk_size>]')
FORMAT 'TEXT'
[( [HEADER]
[DELIMITER [AS] '<delimiter>' | 'OFF']
[NULL [AS] '<null string>']
[ESCAPE [AS] '<escape>' | 'OFF']
[FILL MISSING FIELDS] )]
| 'CSV'
[( [HEADER]
[QUOTE [AS] '<quote>']
[DELIMITER [AS] '<delimiter>']
[NULL [AS] '<null string>']
[FORCE NOT NULL <column> [, ...]]
[ESCAPE [AS] '<escape>']
[FILL MISSING FIELDS] )]
| 'PARQUET'
| 'ORC'
Description
CREATE EXTERNAL TABLE creates a new readable external table definition in Relyt. Readable external tables are typically used for fast, parallel data loading. Once an external table is defined, you can query its data directly (and in parallel) using SQL commands. For example, you can select, join, or sort external table data. You can also create views for external tables. DML operations (UPDATE, INSERT, DELETE, or TRUNCATE) are not allowed on readable external tables, and you cannot create indexes on readable external tables.
Parameters
-
[READABLE | WRITABLE]The type of the external table. If not specified, the default
READABLEis used.-
READABLE: Readable external tables are used to load data to Relyt. -
WRITABLE: Writable external tables are used to unload data.
-
-
<table_name>The name of the external table.
-
<column_name>The name of the column.
Do not specify column constraints and default values for columns in external tables, because external tables do not support column constraints and default values.
-
<data_type>The data type of the column.
-
LOCATIONThe location of the external data source. Currently, only data sources from AWS S3 are supported.
-
<s3_endpoint>The endpoint to connect to the external data source. Supported endpoints include:
-
Amazon S3 endpoints
For details, see Amazon Simple Storage Service endpoints and quotas in the Amazon S3 documentation.
-
Kinsoft Cloud KS3 endpoints
For details, see Regions and endpoints in the Kinsoft Cloud KS3 documentation.
-
Alibaba Cloud OSS endpoints
For details, see Regions and endpoints in the Alibaba Cloud OSS documentation.
-
Tencent Cloud COS endpoints
For details, see Regions and Access Endpoints in the Tencent Cloud COS documentation.
-
-
<bucket_name>The name of the S3 bucket.
-
[S3_prefix]
An optional path prefix within the S3 bucket. If specified, only objects under this path prefix will be read.
-
accessidThe AWS Access ID.
-
secretThe AWS Access Secret.
-
threadnumThe number of parallel threads to use when transferring data. The value ranges from 1 to 8. The default value for
TEXTandCSVis 4 and that forPARQUETis 1. -
chunksizeThe size of chunks to transfer at a time, in bytes. The value ranges from 1 MB to 128 MB, and the default value is 8 MB.
-
FORMATThe data format. Suppported options include TEXT, CSV, PARQUET, and ORC.
-
DELIMITERThe single ASCII character that separates columns within each row of data. The default for
TEXTis a tab character and that forCSVis a comma (,). ForTEXT,you can setDELIMITERtoOFFfor special use cases in which unstructured data is loaded in to a single-column table.For the
s3protocol, the delimiter cannot be a newline character (\n) or a carriage return character (\r). -
NULLThe string that represents a
NULLvalue. The default is\N(backslash-N) inTEXTmode, and an empty value with no quotations inCSVmode. You might prefer an empty string even inTEXTmode for cases where you do not want to distinguishNULLvalues from empty strings. When using external and web tables, any data item that matches this string will be considered aNULLvalue.As an example for the
textformat, thisFORMATclause can be used to specify that the string of two single quotes ('') is aNULLvalue.FORMAT 'text' (delimiter ',' null '\'\'\'\'' ) -
ESCAPEThe single character that is used for C espace sequences, such as
\n,\t, or\100and for escaping data characters that might otherwise be taken as row or column delimiters. Make sure to choose an escape character that is not used anywhere in your actual column data. The default escape character is a backslash (\) for text-formatted files and a double quote(") for csv-formatted files, however it is possible to specify another character to represent an escape. It is also possible to deactivate escaping in text-formatted files by specifying the value'OFF'as the escape value. This is very useful for data such as text-formatted web log data that has many embedded backslashes that are not intended to be escapes. -
HEADERFor readable external tables, specifies that the first line in the data files is a header row (contains the names of the table columns) and should not be included as data for the table. If using multiple data source files, all files must have a header row.
For the
s3protocol, the column names in the header row cannot contain a newline character (\n) or a carriage return (\r). -
QUOTEThe quotation character for
CSVmode. The default is double-quote ("). -
FORCE NOT NULLIn
CSVmode, processes each specified column as though it were quoted and hence not aNULLvalue. For the default null string inCSVmode (nothing between two delimiters), this causes missing values to be evaluated as zero-length strings. -
FILL MISSING FIELDSIn both
TEXTandCSVmode for readable external tables, specifyingFILL MISSING FIELDSwill set missing trailing field values toNULL(instead of reporting an error) when a row of data has missing data fields at the end of a line or row. Blank rows, fields with aNOT NULLconstraint, and trailing delimiters on a line will still report an error.
Examples
Create an external table named nation_ext using the s3 protocol:
CREATE WRITABLE EXTERNAL TABLE ks3_writable_ext (
id integer
)
LOCATION ('s3://ks3-cn-beijing.ksyuncs.com/******/ext-write/
accessid=************
secret=************')
FORMAT 'CSV';
SQL standard compatibility
CREATE EXTERNAL TABLE is a Relyt extension.