CREATE EXTERNAL TABLE
Creates an external table.
Syntax
CREATE [READABLE | WRITABLE] EXTERNAL TABLE <table_name>
( <column_name> <data_type> [, ...] | LIKE <other_table >)
LOCATION ('s3://<s3_endpoint>/<bucket_name>/[<s3_prefix>]
accessid=<access_key_id>
secret=<access_secret>
[threadnum=<thread_number>]
[chunksize=<chunk_size>]')
FORMAT 'TEXT'
[( [HEADER]
[DELIMITER [AS] '<delimiter>' | 'OFF']
[NULL [AS] '<null string>']
[ESCAPE [AS] '<escape>' | 'OFF']
[FILL MISSING FIELDS] )]
| 'CSV'
[( [HEADER]
[QUOTE [AS] '<quote>']
[DELIMITER [AS] '<delimiter>']
[NULL [AS] '<null string>']
[FORCE NOT NULL <column> [, ...]]
[ESCAPE [AS] '<escape>']
[FILL MISSING FIELDS] )]
| 'PARQUET'
| 'ORC'
Description
CREATE EXTERNAL TABLE
creates a new readable external table definition in Relyt. Readable external tables are typically used for fast, parallel data loading. Once an external table is defined, you can query its data directly (and in parallel) using SQL commands. For example, you can select, join, or sort external table data. You can also create views for external tables. DML operations (UPDATE
, INSERT
, DELETE
, or TRUNCATE
) are not allowed on readable external tables, and you cannot create indexes on readable external tables.
Parameters
-
[READABLE | WRITABLE]
The type of the external table. If not specified, the default
READABLE
is used.-
READABLE
: Readable external tables are used to load data to Relyt. -
WRITABLE
: Writable external tables are used to unload data.
-
-
<table_name>
The name of the external table.
-
<column_name>
The name of the column.
Do not specify column constraints and default values for columns in external tables, because external tables do not support column constraints and default values.
-
<data_type>
The data type of the column.
-
LOCATION
The location of the external data source. Currently, only data sources from AWS S3 are supported.
-
<s3_endpoint>
The endpoint to connect to the external data source. Supported endpoints include:
-
Amazon S3 endpoints
For details, see Amazon Simple Storage Service endpoints and quotas in the Amazon S3 documentation.
-
Kinsoft Cloud KS3 endpoints
For details, see Regions and endpoints in the Kinsoft Cloud KS3 documentation.
-
Alibaba Cloud OSS endpoints
For details, see Regions and endpoints in the Alibaba Cloud OSS documentation.
-
Tencent Cloud COS endpoints
For details, see Regions and Access Endpoints in the Tencent Cloud COS documentation.
-
-
<bucket_name>
The name of the S3 bucket.
-
[S3_prefix]
An optional path prefix within the S3 bucket. If specified, only objects under this path prefix will be read.
-
accessid
The AWS Access ID.
-
secret
The AWS Access Secret.
-
threadnum
The number of parallel threads to use when transferring data. The value ranges from 1 to 8. The default value for
TEXT
andCSV
is 4 and that forPARQUET
is 1. -
chunksize
The size of chunks to transfer at a time, in bytes. The value ranges from 1 MB to 128 MB, and the default value is 8 MB.
-
FORMAT
The data format. Suppported options include TEXT, CSV, PARQUET, and ORC.
-
DELIMITER
The single ASCII character that separates columns within each row of data. The default for
TEXT
is a tab character and that forCSV
is a comma (,). ForTEXT
,you can setDELIMITER
toOFF
for special use cases in which unstructured data is loaded in to a single-column table.For the
s3
protocol, the delimiter cannot be a newline character (\n
) or a carriage return character (\r
). -
NULL
The string that represents a
NULL
value. The default is\N
(backslash-N) inTEXT
mode, and an empty value with no quotations inCSV
mode. You might prefer an empty string even inTEXT
mode for cases where you do not want to distinguishNULL
values from empty strings. When using external and web tables, any data item that matches this string will be considered aNULL
value.As an example for the
text
format, thisFORMAT
clause can be used to specify that the string of two single quotes (''
) is aNULL
value.FORMAT 'text' (delimiter ',' null '\'\'\'\'' )
-
ESCAPE
The single character that is used for C espace sequences, such as
\n
,\t
, or\100
and for escaping data characters that might otherwise be taken as row or column delimiters. Make sure to choose an escape character that is not used anywhere in your actual column data. The default escape character is a backslash (\
) for text-formatted files and a double quote("
) for csv-formatted files, however it is possible to specify another character to represent an escape. It is also possible to deactivate escaping in text-formatted files by specifying the value'OFF'
as the escape value. This is very useful for data such as text-formatted web log data that has many embedded backslashes that are not intended to be escapes. -
HEADER
For readable external tables, specifies that the first line in the data files is a header row (contains the names of the table columns) and should not be included as data for the table. If using multiple data source files, all files must have a header row.
For the
s3
protocol, the column names in the header row cannot contain a newline character (\n
) or a carriage return (\r
). -
QUOTE
The quotation character for
CSV
mode. The default is double-quote ("
). -
FORCE NOT NULL
In
CSV
mode, processes each specified column as though it were quoted and hence not aNULL
value. For the default null string inCSV
mode (nothing between two delimiters), this causes missing values to be evaluated as zero-length strings. -
FILL MISSING FIELDS
In both
TEXT
andCSV
mode for readable external tables, specifyingFILL MISSING FIELDS
will set missing trailing field values toNULL
(instead of reporting an error) when a row of data has missing data fields at the end of a line or row. Blank rows, fields with aNOT NULL
constraint, and trailing delimiters on a line will still report an error.
Examples
Create an external table named nation_ext
using the s3
protocol:
CREATE WRITABLE EXTERNAL TABLE ks3_writable_ext (
id integer
)
LOCATION ('s3://ks3-cn-beijing.ksyuncs.com/******/ext-write/
accessid=************
secret=************')
FORMAT 'CSV';
SQL standard compatibility
CREATE EXTERNAL TABLE
is a Relyt extension.