Cloud SQL Shootout: TPC-H on Redshift & Athena (or Hive)

Today's business analyst demands SQL-like access to Big Data™. Your task today is to design a SQL-in-the-cloud data warehouse system. You will compare AWS Athena (or Apache Hive) and AWS Redshift, a hosted version of the parallel database system Actian Matrix (probably better known under its previous name, ParAccel). As a benchmark we are going to use TPC-H, an industry-standard benchmark for analytical SQL systems. Data sets with scale factors 1, 10 and 300 have already been created for you and uploaded to S3. TPC-H scale factor 300 means the largest table, "lineitem", has 1,799,989,091 records. (Scale factor 300 is omitted for budget reasons.)

Preparation for Redshift

Install SQL Workbench and the Redshift JDBC driver (see the linked instructions for where to download the driver). You will also need your AWS access credentials. Create a new access key:
- Go to "Security Credentials" in the console.
- Note down your access key id and secret access key somewhere where you will find them again.
- You can also simply download the key file, which contains this information.

Redshift Startup

I strongly request that you read the rest of this tutorial before starting the cluster. Redshift clusters are quite costly; please keep their runtime minimal.
- Go to the AWS Redshift console and click the "Launch Cluster" button.
- For now, we will create a single-node cluster for testing purposes.
- Make sure you note the username and password; you will need them later. The database name can remain empty.
- Select a single-node cluster for now.
- Additional configuration can all remain on defaults. Then launch.
- Go to the clusters dashboard; you will see your cluster launching. Wait until the cluster is available.
- Click the cluster name ("bads-test1" here) to show its details.
- It might show a warning due to the firewall not allowing access to the JDBC port. Click the warning sign and then "Edit security group". Edit the security group on the "Inbound" tab to allow Redshift connections from anywhere. Afterwards, the warning should be gone.
- If connecting with SQL Workbench (see below) does not work --- either after the above changes, or although you did not get the above "No Inbound Permissions" warning in the first place --- please try the following in the EC2 console to open up access after all.

Then, use the displayed JDBC URL to connect using SQL Workbench. You will be greeted by the query/query-result screen.
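The JDBC URL is shown on the cluster's detail page. As a sanity check before pasting it into SQL Workbench, it should have roughly the following shape (the endpoint, port, and database name below are placeholders; use the exact values from your own cluster):

jdbc:redshift://<cluster-name>.<random-id>.<region>.redshift.amazonaws.com:5439/<database>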
Run a simple query to check everything works:

SELECT 42;

Redshift Schema / Data Loading

Run the following queries in SQL Workbench to create and load the tables (a plain text file is available at):

CREATE TABLE region (r_regionkey INT NOT NULL, r_name VARCHAR(25) NOT NULL, r_comment VARCHAR(152) NOT NULL, PRIMARY KEY (r_regionkey));
CREATE TABLE nation (n_nationkey INT NOT NULL, n_name VARCHAR(25) NOT NULL, n_regionkey INT NOT NULL, n_comment VARCHAR(152) NOT NULL, PRIMARY KEY (n_nationkey));
CREATE TABLE supplier (s_suppkey INT NOT NULL, s_name VARCHAR(25) NOT NULL, s_address VARCHAR(40) NOT NULL, s_nationkey INT NOT NULL, s_phone VARCHAR(15) NOT NULL, s_acctbal DECIMAL(15,2) NOT NULL, s_comment VARCHAR(101) NOT NULL, PRIMARY KEY (s_suppkey));
CREATE TABLE customer (c_custkey INT NOT NULL, c_name VARCHAR(25) NOT NULL, c_address VARCHAR(40) NOT NULL, c_nationkey INT NOT NULL, c_phone VARCHAR(15) NOT NULL, c_acctbal DECIMAL(15,2) NOT NULL, c_mktsegment VARCHAR(10) NOT NULL, c_comment VARCHAR(117) NOT NULL, PRIMARY KEY (c_custkey));
CREATE TABLE part (p_partkey INT NOT NULL, p_name VARCHAR(55) NOT NULL, p_mfgr VARCHAR(25) NOT NULL, p_brand VARCHAR(10) NOT NULL, p_type VARCHAR(25) NOT NULL, p_size INT NOT NULL, p_container VARCHAR(10) NOT NULL, p_retailprice DECIMAL(15,2) NOT NULL, p_comment VARCHAR(23) NOT NULL, PRIMARY KEY (p_partkey));
CREATE TABLE partsupp (ps_partkey INT NOT NULL, ps_suppkey INT NOT NULL, ps_availqty INT NOT NULL, ps_supplycost DECIMAL(15,2) NOT NULL, ps_comment VARCHAR(199) NOT NULL, PRIMARY KEY (ps_partkey, ps_suppkey), FOREIGN KEY (ps_partkey) REFERENCES part (p_partkey), FOREIGN KEY (ps_suppkey) REFERENCES supplier (s_suppkey));
CREATE TABLE orders (o_orderkey INT NOT NULL, o_custkey INT NOT NULL, o_orderstatus VARCHAR(1) NOT NULL, o_totalprice DECIMAL(15,2) NOT NULL, o_orderdate DATE NOT NULL, o_orderpriority VARCHAR(15) NOT NULL, o_clerk VARCHAR(15) NOT NULL, o_shippriority INT NOT NULL, o_comment VARCHAR(79) NOT NULL, PRIMARY KEY (o_orderkey));
CREATE TABLE lineitem (l_orderkey INT NOT NULL, l_partkey INT NOT NULL, l_suppkey INT NOT NULL, l_linenumber INT NOT NULL, l_quantity INTEGER NOT NULL, l_extendedprice DECIMAL(15,2) NOT NULL, l_discount DECIMAL(15,2) NOT NULL, l_tax DECIMAL(15,2) NOT NULL, l_returnflag VARCHAR(1) NOT NULL, l_linestatus VARCHAR(1) NOT NULL, l_shipdate DATE NOT NULL, l_commitdate DATE NOT NULL, l_receiptdate DATE NOT NULL, l_shipinstruct VARCHAR(25) NOT NULL, l_shipmode VARCHAR(10) NOT NULL, l_comment VARCHAR(44) NOT NULL, PRIMARY KEY (l_orderkey, l_linenumber));
COMMIT;

-- In the remainder, replace XXXXX / YYYYY with your access key / secret access key!
copy region from 's3://tpch-bads-data/sf1/region/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
copy nation from 's3://tpch-bads-data/sf1/nation/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
copy customer from 's3://tpch-bads-data/sf1/customer/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
copy orders from 's3://tpch-bads-data/sf1/orders/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
copy lineitem from 's3://tpch-bads-data/sf1/lineitem/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
copy part from 's3://tpch-bads-data/sf1/part/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
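-- Aside (explanatory comments added here, not part of the original script): 'delimiter |'
-- matches the pipe-separated TPC-H files, 'gzip' tells COPY that the S3 objects are
-- compressed, and 'maxerror 210' lets a load continue despite up to 210 rejected records.
-- If a COPY fails or the row counts below look off, recent load errors can be inspected
-- via Redshift's stl_load_errors system table, e.g.:
--   SELECT starttime, filename, line_number, colname, err_reason
--   FROM stl_load_errors ORDER BY starttime DESC LIMIT 10;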
copy partsupp from 's3://tpch-bads-data/sf1/partsupp/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
copy supplier from 's3://tpch-bads-data/sf1/supplier/' delimiter '|' gzip maxerror 210 credentials 'aws_access_key_id=XXXXX;aws_secret_access_key=YYYYY';
COMMIT;

Again: replace XXXXX / YYYYY with your access key / secret access key! You can observe your cluster working by going to the "Queries" tab in the cluster details on the Redshift console.

Make sure the data is loaded by running a SELECT COUNT(*) for all the loaded tables after the COMMIT, i.e. (a plain text file is available at):

SELECT count(*) from region;
SELECT count(*) from nation;
SELECT count(*) from supplier;
SELECT count(*) from customer;
SELECT count(*) from part;
SELECT count(*) from partsupp;
SELECT count(*) from orders;
SELECT count(*) from lineitem;

Run the following queries and note their runtime.

Redshift TPC-H Query 1 (a plain text file is available at):

select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate <= date '1998-12-01' - interval '108' day
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus;

Redshift TPC-H Query 5 (a plain text file is available at):

select
    n_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue
from
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and l_suppkey = s_suppkey
    and c_nationkey = s_nationkey
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'MIDDLE EAST'
group by
    n_name
order by
    revenue desc;

Redshift Shutting Down

Once you are done with your queries, shut down your cluster.
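The tasks below ask you to include query plans in your report. In Redshift, as in PostgreSQL, prefixing a query with EXPLAIN returns its plan without executing it, and this works for the full TPC-H queries above. A minimal sketch:

EXPLAIN
SELECT l_returnflag, count(*)
FROM lineitem
GROUP BY l_returnflag;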
Athena Data "Loading"

Go to the Athena web interface. Create a database for you to use: enter the query

CREATE DATABASE bd_student_XX;

(replace the database name with your user name), click "Run Query", and then make sure it is selected on the left side. Create and load the tables (not really loading!): paste the following queries ONE BY ONE into the query window (what does each of them do?!) and execute them (a plain text file is available at):

CREATE EXTERNAL TABLE customer(C_CustKey int, C_Name varchar(64), C_Address varchar(64), C_NationKey int, C_Phone varchar(64), C_AcctBal decimal(13, 2), C_MktSegment varchar(64), C_Comment varchar(120), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/customer/';
CREATE EXTERNAL TABLE lineitem(L_OrderKey int, L_PartKey int, L_SuppKey int, L_LineNumber int, L_Quantity int, L_ExtendedPrice decimal(13, 2), L_Discount decimal(13, 2), L_Tax decimal(13, 2), L_ReturnFlag varchar(64), L_LineStatus varchar(64), L_ShipDate date, L_CommitDate date, L_ReceiptDate date, L_ShipInstruct varchar(64), L_ShipMode varchar(64), L_Comment varchar(64), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/lineitem/';
CREATE EXTERNAL TABLE nation(N_NationKey int, N_Name varchar(64), N_RegionKey int, N_Comment varchar(160), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/nation/';
CREATE EXTERNAL TABLE orders(O_OrderKey int, O_CustKey int, O_OrderStatus varchar(64), O_TotalPrice decimal(13, 2), O_OrderDate date, O_OrderPriority varchar(15), O_Clerk varchar(64), O_ShipPriority int, O_Comment varchar(80), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/orders/';
CREATE EXTERNAL TABLE part(P_PartKey int, P_Name varchar(64), P_Mfgr varchar(64), P_Brand varchar(64), P_Type varchar(64), P_Size int, P_Container varchar(64), P_RetailPrice decimal(13, 2), P_Comment varchar(64), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/part/';
CREATE EXTERNAL TABLE partsupp(PS_PartKey int, PS_SuppKey int, PS_AvailQty int, PS_SupplyCost decimal(13, 2), PS_Comment varchar(200), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/partsupp/';
CREATE EXTERNAL TABLE region(R_RegionKey int, R_Name varchar(64), R_Comment varchar(160), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/region/';
CREATE EXTERNAL TABLE supplier(S_SuppKey int, S_Name varchar(64), S_Address varchar(64), S_NationKey int, S_Phone varchar(18), S_AcctBal decimal(13, 2), S_Comment varchar(105), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3://tpch-bads-data/sf1/supplier/';

Make sure the tables have been loaded, i.e., they appear in the tables list and all contain data. You can check that they contain data by running the following queries ONE BY ONE (a plain text file is available at):

SELECT count(*) from region;
SELECT count(*) from nation;
SELECT count(*) from supplier;
SELECT count(*) from customer;
SELECT count(*) from part;
SELECT count(*) from partsupp;
SELECT count(*) from orders;
SELECT count(*) from lineitem;

Please note whether, and if so how, the execution times of the SELECT COUNT(*) queries differ between Redshift & Athena. If they do, any idea why?
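A hint for the "what does each of them do?!" question: CREATE EXTERNAL TABLE copies no data at all; it only registers a schema and an S3 location in the catalog, and every query then scans the files in place. You can see exactly what was registered by asking the catalog for the stored DDL (a statement supported by both Athena and Hive):

SHOW CREATE TABLE lineitem;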
Run the following queries.

Athena TPC-H Query 1 (a plain text file is available at):

SELECT L_RETURNFLAG, L_LINESTATUS, SUM(L_QUANTITY), SUM(L_EXTENDEDPRICE),
       SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)), SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT) * (1 + L_TAX)),
       AVG(L_QUANTITY), AVG(L_EXTENDEDPRICE), AVG(L_DISCOUNT), COUNT(1)
FROM lineitem
WHERE L_SHIPDATE <= CAST('1998-09-02' AS DATE)
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS;

Athena TPC-H Query 5 (a plain text file is available at):

select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
from customer c
join (
    select n_name, l_extendedprice, l_discount, s_nationkey, o_custkey
    from orders o
    join (
        select n_name, l_extendedprice, l_discount, l_orderkey, s_nationkey
        from lineitem l
        join (
            select n_name, s_suppkey, s_nationkey
            from supplier s
            join (
                select n_name, n_nationkey
                from nation n
                join region r on n.n_regionkey = r.r_regionkey and r.r_name = 'MIDDLE EAST'
            ) n1 on s.s_nationkey = n1.n_nationkey
        ) s1 on l.l_suppkey = s1.s_suppkey
    ) l1 on l1.l_orderkey = o.o_orderkey
) o1 on c.c_nationkey = o1.s_nationkey and c.c_custkey = o1.o_custkey
group by n_name
order by revenue desc;

Observe the query runtime and the amount of data read.

Hive Schema / Data Loading

- Start up an EMR cluster (all defaults, 2 nodes; remember the Hue security group).
- Access Hue's Hive query editor.
- Set the S3 credentials for Hive: set the first key to fs.s3n.awsAccessKeyId with your S3 access key as its value, and the second key to fs.s3n.awsSecretAccessKey with your S3 secret key as its value.
- Create and load the tables (not really loading…) (a plain text file is available at):

CREATE EXTERNAL TABLE customer(C_CustKey int, C_Name varchar(64), C_Address varchar(64), C_NationKey int, C_Phone varchar(64), C_AcctBal decimal(13, 2), C_MktSegment varchar(64), C_Comment varchar(120), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/customer/';
CREATE EXTERNAL TABLE lineitem(L_OrderKey int, L_PartKey int, L_SuppKey int, L_LineNumber int, L_Quantity int, L_ExtendedPrice decimal(13, 2), L_Discount decimal(13, 2), L_Tax decimal(13, 2), L_ReturnFlag varchar(64), L_LineStatus varchar(64), L_ShipDate date, L_CommitDate date, L_ReceiptDate date, L_ShipInstruct varchar(64), L_ShipMode varchar(64), L_Comment varchar(64), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/lineitem/';
CREATE EXTERNAL TABLE nation(N_NationKey int, N_Name varchar(64), N_RegionKey int, N_Comment varchar(160), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/nation/';
CREATE EXTERNAL TABLE orders(O_OrderKey int, O_CustKey int, O_OrderStatus varchar(64), O_TotalPrice decimal(13, 2), O_OrderDate date, O_OrderPriority varchar(15), O_Clerk varchar(64), O_ShipPriority int, O_Comment varchar(80), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/orders/';
CREATE EXTERNAL TABLE part(P_PartKey int, P_Name varchar(64), P_Mfgr varchar(64), P_Brand varchar(64), P_Type varchar(64), P_Size int, P_Container varchar(64), P_RetailPrice decimal(13, 2), P_Comment varchar(64), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/part/';
CREATE EXTERNAL TABLE partsupp(PS_PartKey int, PS_SuppKey int, PS_AvailQty int, PS_SupplyCost decimal(13, 2), PS_Comment varchar(200), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/partsupp/';
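-- Aside (explanatory comments added here, not part of the original script): the LOCATION
-- URIs in this script use the legacy s3n:// filesystem scheme, which matches the fs.s3n.*
-- credential keys set above and is what the Hadoop/Hive releases this tutorial targets
-- expect. On newer EMR releases, EMRFS serves plain s3:// URIs instead --- an assumption to
-- check against your EMR version if the s3n:// paths fail.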
CREATE EXTERNAL TABLE region(R_RegionKey int, R_Name varchar(64), R_Comment varchar(160), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/region/';
CREATE EXTERNAL TABLE supplier(S_SuppKey int, S_Name varchar(64), S_Address varchar(64), S_NationKey int, S_Phone varchar(18), S_AcctBal decimal(13, 2), S_Comment varchar(105), skip varchar(64)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION 's3n://tpch-bads-data/sf1/supplier/';

Again, you might want to make sure the data is loaded by running a SELECT COUNT(*) for all the loaded tables, i.e. (a plain text file is available at):

SELECT count(*) from region;
SELECT count(*) from nation;
SELECT count(*) from supplier;
SELECT count(*) from customer;
SELECT count(*) from part;
SELECT count(*) from partsupp;
SELECT count(*) from orders;
SELECT count(*) from lineitem;

Please note whether, and if so how, the execution times of the SELECT COUNT(*) queries differ between Redshift & Hive. If they do, any idea why?

Hive TPC-H Query 1 (a plain text file is available at):

SELECT L_RETURNFLAG, L_LINESTATUS, SUM(L_QUANTITY), SUM(L_EXTENDEDPRICE),
       SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)), SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT) * (1 + L_TAX)),
       AVG(L_QUANTITY), AVG(L_EXTENDEDPRICE), AVG(L_DISCOUNT), COUNT(1)
FROM lineitem
WHERE L_SHIPDATE <= '1998-09-02'
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS;

Hive TPC-H Query 5 (a plain text file is available at):

select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
from customer c
join (
    select n_name, l_extendedprice, l_discount, s_nationkey, o_custkey
    from orders o
    join (
        select n_name, l_extendedprice, l_discount, l_orderkey, s_nationkey
        from lineitem l
        join (
            select n_name, s_suppkey, s_nationkey
            from supplier s
            join (
                select n_name, n_nationkey
                from nation n
                join region r on n.n_regionkey = r.r_regionkey and r.r_name = 'MIDDLE EAST'
            ) n1 on s.s_nationkey = n1.n_nationkey
        ) s1 on l.l_suppkey = s1.s_suppkey
    ) l1 on l1.l_orderkey = o.o_orderkey
) o1 on c.c_nationkey = o1.s_nationkey and c.c_custkey = o1.o_custkey
group by n_name
order by revenue desc;

Hive/EMR Shutting Down

Once you are done with your queries, shut down your cluster.

(Non-cloud) alternative: MonetDB

In case one of the above cloud approaches does not work for you in due time (report why not!), or in case you are just curious, you can use MonetDB (on your own laptop/desktop/workstation!) as an alternative. See the MonetDB website for instructions on how to download and install the latest release of MonetDB (Jul2015-SP4) on your system, for a general tutorial on how to use it, and for documentation on how to connect SQL Workbench (which we used with Redshift above) to MonetDB. However, you do not necessarily need SQL Workbench or any other graphical client interface; the MonetDB-provided textual, console-based `mclient` is enough for this exercise.

You will find TPC-H data as compressed CSV files for scale factors 1, 3 and 10 (i.e., sizes 1 GB, 3 GB, 10 GB) at the linked location. Please download all files for the scale factor(s) you want to try into a directory on your machine (use a separate directory per scale factor, and note the entire absolute path to that directory). There is no need to unpack or decompress these files, as MonetDB can bulk-load data directly from compressed CSV files.
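For intuition, MonetDB's bulk loader is a single SQL statement per table; the provided load_data.sql (mentioned below) essentially contains statements of the following shape. This is an illustrative sketch only --- the file name and path are placeholders, and the provided script should be preferred:

COPY INTO lineitem
FROM '/absolute/path/to/sf1/lineitem.tbl.gz'  -- reads the compressed file directly
USING DELIMITERS '|', '\n';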
Then start the MonetDB server (mserver) as per the instructions on the MonetDB website (see above for links). There you will also find the SQL scripts for MonetDB to create the database schema (tables), load the data (NOTE: in load_data.sql you need to replace "_MyDataPath_" with the entire absolute path to the directory into which you downloaded the data files!), run TPC-H queries 1 & 5, create foreign keys, and drop the tables again. Run these scripts (or their content) via mclient, SQL Workbench, or your favorite SQL client.

Tasks:
- Perform the following tasks first using scale factor (SF) 1 (the 1 GB dataset); once everything works fine with SF-1, repeat with SF-10 (the 10 GB dataset). For the latter, replace /tpch-bads-data/sf1/ with /tpch-bdit-data/sf10/ in the above load scripts. Remember to drop and re-create your tables before loading another dataset (or create and use a separate database for each dataset).
- Load the TPC-H dataset into two systems (Redshift and Athena (or Hive)). Observe and record the time each system takes to load each table.
- Run each query (TPC-H queries 1 & 5) at least three times in each system. Observe and record the time it takes each system to complete each query. Hint: look at the loads/queries tabs in the Redshift console to monitor progress.
- Report both loading and querying times in your report --- at least as tables, but preferably using a graphical visualisation you consider suitable --- and discuss whether and why they do (not) differ between systems.
- Include query plans (EXPLAIN) in your report.
- Run the queries on a 10-node cluster with the scale factor 300 data in prefix s3://tpch-bads-data/sf300/, pending data availability. (Scale factor 300 is omitted for budget reasons.)

Bonus 1: Re-save the data as something that Athena (or Hive) loads faster, such as Parquet, and measure again (see the linked guide).

Bonus 2: Using the schema and the queries, design and explain a partitioning scheme for Redshift and Athena (or Hive) that optimizes the runtime of the two queries. Run the queries again, measure, and extend your report accordingly. Check the query plans to look for the impact of the partitioning changes. (Partitioning guides for Redshift and for Athena/Hive are linked.)
- Hint: Look at individual joins and the size of the tables involved.
- Hint: For Hive, consider moving the data from S3 into HDFS using the CREATE TABLE AS SELECT … method, as sketched below. Multiple schemas can help here.
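To make the CTAS hint concrete, here is a minimal sketch of what such a statement could look like in Hive. The target table name is made up for illustration, and the STORED AS PARQUET clause doubles as a starting point for Bonus 1 (leave it out to keep Hive's default text format):

-- Materializes the external S3 table into a Hive-managed table on the cluster's storage.
CREATE TABLE lineitem_local
STORED AS PARQUET
AS SELECT * FROM lineitem;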
Hint: Primary and foreign keys for TPC-H (a plain text file is available at):

ALTER TABLE region ADD CONSTRAINT regionkey PRIMARY KEY (r_regionkey);
ALTER TABLE nation ADD CONSTRAINT nationkey PRIMARY KEY (n_nationkey);
ALTER TABLE supplier ADD CONSTRAINT suppkey PRIMARY KEY (s_suppkey);
ALTER TABLE customer ADD CONSTRAINT custkey PRIMARY KEY (c_custkey);
ALTER TABLE part ADD CONSTRAINT partkey PRIMARY KEY (p_partkey);
ALTER TABLE partsupp ADD CONSTRAINT partsuppkey PRIMARY KEY (ps_partkey, ps_suppkey);
ALTER TABLE orders ADD CONSTRAINT orderkey PRIMARY KEY (o_orderkey);
ALTER TABLE lineitem ADD CONSTRAINT lineitemkey PRIMARY KEY (l_orderkey, l_linenumber);
ALTER TABLE nation ADD CONSTRAINT nation_regionkey FOREIGN KEY (n_regionkey) REFERENCES region (r_regionkey);
ALTER TABLE supplier ADD CONSTRAINT supplier_nationkey FOREIGN KEY (s_nationkey) REFERENCES nation (n_nationkey);
ALTER TABLE customer ADD CONSTRAINT customer_nationkey FOREIGN KEY (c_nationkey) REFERENCES nation (n_nationkey);
ALTER TABLE partsupp ADD CONSTRAINT partsupp_partkey FOREIGN KEY (ps_partkey) REFERENCES part (p_partkey);
ALTER TABLE partsupp ADD CONSTRAINT partsupp_suppkey FOREIGN KEY (ps_suppkey) REFERENCES supplier (s_suppkey);
ALTER TABLE orders ADD CONSTRAINT order_custkey FOREIGN KEY (o_custkey) REFERENCES customer (c_custkey);
ALTER TABLE lineitem ADD CONSTRAINT lineitem_orderkey FOREIGN KEY (l_orderkey) REFERENCES orders (o_orderkey);
ALTER TABLE lineitem ADD CONSTRAINT lineitem_partkey FOREIGN KEY (l_partkey) REFERENCES part (p_partkey);
ALTER TABLE lineitem ADD CONSTRAINT lineitem_suppkey FOREIGN KEY (l_suppkey) REFERENCES supplier (s_suppkey);
ALTER TABLE lineitem ADD CONSTRAINT lineitem_partsuppkey FOREIGN KEY (l_partkey, l_suppkey) REFERENCES partsupp (ps_partkey, ps_suppkey);
COMMIT; -- don't forget
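One more pointer for Bonus 2: Redshift has no declared partitions in the Hive sense; data placement is instead controlled by a distribution key and a sort key at table-creation time. A hedged sketch of the syntax on a cut-down orders table (the table and key choices here are illustrative only --- designing and justifying your own scheme is the point of the exercise):

-- Illustrative only: co-locate rows on the Query 5 join key, sort on the date column.
CREATE TABLE orders_tuned (
    o_orderkey  INT NOT NULL,
    o_custkey   INT NOT NULL,
    o_orderdate DATE NOT NULL
)
DISTSTYLE KEY
DISTKEY (o_orderkey)
SORTKEY (o_orderdate);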

