This article describes the software requirements, tenant specifications, and detailed test procedure for running the TPC-DS benchmark on OceanBase Database.
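What is TPC-DS
TPC-DS (Transaction Processing Performance Council Decision Support Benchmark) is a data warehouse performance benchmark defined by the TPC, an international organization. Its schema combines the snowflake and star models, with 7 fact tables and 17 dimension tables, and covers business scenarios such as retail and finance. The test process consists of generating scalable test data (scale factors from 1 TB to 100 TB) and running 99 complex SQL queries that cover statistics, reporting, data mining, and similar scenarios. The results are summarized in metrics such as QphDS (queries per hour), which evaluate how well a system handles high concurrency and large data volumes.
As the successor to TPC-H, TPC-DS has a more complex data model (TPC-H has only 8 tables), a richer set of query types, and support for OLAP operations. It is now the mainstream decision support benchmark.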
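Note
To improve user experience and ease of use, and to let every developer get good performance from the database, OceanBase Database has been extensively optimized since V4.0.0. The test method described here tunes only basic parameters, so developers can obtain good database performance with simple tuning.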
Environment preparation
Before testing, prepare the test environment according to the following requirements.
Note
This example uses a MySQL tenant.
Software requirements
- JDK: JDK 1.8u131 or later is recommended.
- make: run yum install make to install it.
- GCC: run yum install gcc to install it.
- mysql-devel: run yum install mysql-devel to install it.
- Python database connection driver: run sudo yum install MySQL-python to install it.
- prettytable: run pip install prettytable to install it.
- JDBC: mysql-connector-java-5.1.47 is recommended.
- TPC-DS tool: obtain it from the download URL. If you use the OBD one-click test, you can skip this tool.
- OBClient: for details, see the OBClient documentation.
- OceanBase Database: for details, see the OceanBase Database quick start.
- IOPS: disk IOPS of 10000 or higher is recommended.
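Before going further, it can help to confirm that the required tools are actually on the PATH. The following is a minimal sketch, not part of the original procedure. The function name missing_commands is hypothetical, and the command names passed to it (java, make, gcc, pip, obclient) are assumed to be the executables provided by the packages listed above.

```shell
# Report which of the given commands are not found on PATH.
# Prints one missing command name per line; prints nothing if all are present.
missing_commands() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
    fi
  done
}

# Assumed executable names for the software listed in the requirements.
missing=$(missing_commands java make gcc pip obclient)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All checked commands are available."
fi
```

The check only tests for presence. Version requirements such as JDK 1.8u131 still need to be verified separately.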
Tenant specifications
The tenant specifications are based on the hardware configuration given in the OceanBase Database TPC-DS test report. Adjust them to match the hardware configuration of your own database.
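Cluster deployment
This test uses four machines. TPC-DS and OBD are deployed on one machine, separate from the cluster, which serves as the client load machine. Deploying the OceanBase cluster with OBD requires three machines, and the cluster scale is 1:1:1.
Note
- For TPC-DS testing, a 4-core, 16 GB machine is sufficient for deploying TPC-DS and OBD.
- When deploying the cluster, the obd cluster autodeploy command is not recommended. To ensure stability, this command does not maximize resource utilization (for example, it does not use all available memory). Instead, tune the configuration file yourself to maximize resource utilization.
After the deployment succeeds, create a new tenant and user for running the TPC-DS test. The sys tenant is a built-in system tenant for managing the cluster, so do not run the test directly in the sys tenant. Set the tenant's primary_zone to RANDOM, which means that the leaders of newly created table partitions are randomly assigned across the three machines.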
Create the tenant
You can create the test tenant with the obd cluster tenant create command:
obd cluster tenant create <DEPLOY_NAME> -n <TENANT_NAME> --max-cpu=28 --memory-size=180G --zone-list=zone1,zone2,zone3 --primary-zone=RANDOM --locality=F@zone1,F@zone2,F@zone3 --charset=utf8 -s 'ob_tcp_invited_nodes="%"' --optimize=<optimize>
The parameters are described as follows:
- DEPLOY_NAME: the cluster name.
- TENANT_NAME: the tenant name.
- --zone-list: the zone list of the tenant.
- --primary-zone: the primary zone of the tenant.
- --locality: the distribution of replicas across zones.
- --charset: the character set of the tenant.
- -s: system variable values for the tenant.
- OPTIMIZE: the workload type of the tenant. Five workload types are available: express_oltp, complex_oltp, olap, htap, and kv. The default workload type is htap, which suits mixed OLAP and OLTP workloads.
For details on OBD deployment, see obd cluster tenant create.
Notice
In V4.3.x and later, when you deploy with OBD you can set the scenario configuration parameter to choose the appropriate cluster workload type. If it is not set, the default scenario is htap. For details, see Deploy OceanBase Database with OBD.
For example, create a tenant named tpcds_tenant in the cluster obperf, with 28 CPU cores and 180 GB of memory, and with the default tenant workload type so that it matches the cluster scenario:
obd cluster tenant create obperf -n tpcds_tenant --max-cpu=28 --memory-size=180G --zone-list=zone1,zone2,zone3 --primary-zone=RANDOM --locality=F@zone1,F@zone2,F@zone3 --charset=utf8 -s 'ob_tcp_invited_nodes="%"' --optimize=htap
Note
In this example, --optimize=htap is the default workload type. In a production environment, choose the workload type that matches your actual cluster type.
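Because the tenant creation command is long, it can be convenient to assemble it from variables and review it before running it. The sketch below only prints the command and never calls obd. The function name build_tenant_cmd and its argument order are hypothetical. The flags and fixed values are the ones shown in the example above.

```shell
# Print (do not run) an obd tenant-creation command built from arguments.
# Arguments: deploy name, tenant name, max CPU, memory size.
# Zone names, locality, charset, and workload type are copied from the
# example in this article and are hard-coded here for brevity.
build_tenant_cmd() {
  deploy_name=$1
  tenant_name=$2
  max_cpu=$3
  memory_size=$4
  printf '%s\n' "obd cluster tenant create ${deploy_name} -n ${tenant_name} --max-cpu=${max_cpu} --memory-size=${memory_size} --zone-list=zone1,zone2,zone3 --primary-zone=RANDOM --locality=F@zone1,F@zone2,F@zone3 --charset=utf8 -s 'ob_tcp_invited_nodes=\"%\"' --optimize=htap"
}

build_tenant_cmd obperf tpcds_tenant 28 180G
```

Once the printed command looks right, it can be copied into the terminal on the OBD machine.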
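Test method
Once the test environment is ready, you can run the TPC-DS performance test as follows.
Run the TPC-DS test manually with the TPC-DS tool
The manual test is run after you have chosen the cluster workload type and the tenant tuning scenario. Working through it gives you a deeper understanding of OceanBase Database, especially of how to optimize parameter settings.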
Step 1: Create the test tenant
Note
If the test tenant was already created during environment preparation, skip this step.
Run the following statements in the system tenant (sys tenant) to create the test tenant.
Note
In this test, the OceanBase cluster is deployed in 1:1:1 mode.
Create the resource unit mysql_box.
CREATE RESOURCE UNIT mysql_box MAX_CPU 28, MEMORY_SIZE '200G', MIN_IOPS 200000, MAX_IOPS 12800000, LOG_DISK_SIZE '300G';
Create the resource pool mysql_pool.
CREATE RESOURCE POOL mysql_pool UNIT = 'mysql_box', UNIT_NUM = 1, ZONE_LIST = ('z1','z2','z3');
Create the MySQL-mode tenant mysql_tenant.
CREATE TENANT mysql_tenant RESOURCE_POOL_LIST = ('mysql_pool'), PRIMARY_ZONE = RANDOM, LOCALITY = 'F@z1,F@z2,F@z3' SET VARIABLES ob_compatibility_mode='mysql', ob_tcp_invited_nodes='%', secure_file_priv = "/";
Step 2: Tune the environment
Tune OceanBase Database.
Run the following statements in the system tenant (sys tenant) to set the relevant parameters.
ALTER SYSTEM flush plan cache GLOBAL;
ALTER SYSTEM SET enable_sql_audit=false;
select sleep(5);
ALTER SYSTEM SET enable_perf_event=false;
ALTER SYSTEM SET syslog_level='PERF';
ALTER SYSTEM SET enable_record_trace_log=false;
ALTER SYSTEM SET data_storage_warning_tolerance_time = '300s';
ALTER SYSTEM SET _data_storage_io_timeout = '600s';
ALTER SYSTEM SET trace_log_slow_query_watermark = '7d';
ALTER SYSTEM SET large_query_threshold='0ms';
ALTER SYSTEM SET enable_syslog_recycle= 1;
ALTER SYSTEM SET max_syslog_file_count = 300;
set global ob_sql_work_area_percentage=50;
ALTER SYSTEM SET default_table_store_format = 'column' ;
ALTER SYSTEM SET ob_enable_batched_multi_statement='true';
ALTER SYSTEM SET _io_read_batch_size = '128k';
ALTER SYSTEM SET _io_read_redundant_limit_percentage = 50;
set global parallel_servers_target=10000;
Tune the tenant.
Run the following statements in the test tenant (user tenant) to set the relevant parameters.
SET global NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';
SET global NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI:SS.FF';
SET global NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI:SS.FF TZR TZD';
set global ob_query_timeout=10800000000;
set global ob_trx_timeout=10000000000;
set global ob_sql_work_area_percentage=50;
-- ALTER SYSTEM SET default_table_store_format = 'column' ;
ALTER SYSTEM SET ob_enable_batched_multi_statement='true';
ALTER SYSTEM SET _io_read_batch_size = '128k';
ALTER SYSTEM SET _io_read_redundant_limit_percentage = 50;
set global parallel_servers_target=10000;
set global collation_connection = utf8mb4_bin;
set global collation_database = utf8mb4_bin;
set global collation_server = utf8mb4_bin;
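The two timeout values above are large numbers that are easy to misread. As far as I know, ob_query_timeout and ob_trx_timeout are expressed in microseconds; treat that unit as an assumption and check it against the documentation for your version. Under that assumption, the following sketch converts the values to hours, minutes, and seconds. The function name us_to_hms is hypothetical.

```shell
# Convert a timeout in microseconds to whole hours, minutes, and seconds.
# Assumes the input is an integer number of microseconds.
us_to_hms() {
  total_s=$(( $1 / 1000000 ))
  printf '%dh %dm %ds\n' $(( total_s / 3600 )) $(( total_s % 3600 / 60 )) $(( total_s % 60 ))
}

us_to_hms 10800000000   # ob_query_timeout -> 3h 0m 0s
us_to_hms 10000000000   # ob_trx_timeout   -> 2h 46m 40s
```

With these settings a single query may run for up to three hours before timing out, which leaves room for the heaviest TPC-DS queries at large scale factors.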
Step 3: Install the TPC-DS tool
Download the TPC-DS tool. For details, see the TPC-DS tool download page.
After the download completes, unzip the file and change to the tools directory under the extracted TPC-DS directory.
[wieck@localhost ~] $ unzip 49FA3DBA-FE6C-463C-952B-62B8E9D43372-TPC-DS-Tool.zip
[wieck@localhost ~] $ cd DSGen-software-code-3.2.0rc1/tools
Compile the files.
[wieck@localhost tools] $ make
After the compilation succeeds, the binary executables dsdgen and dsqgen are generated in the tools folder.
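A quick way to confirm that the build produced both binaries is sketched below. This is an illustrative helper, not part of the TPC-DS kit. The function name check_tpcds_tools is hypothetical, and it assumes you pass the path of the tools directory (or run it from inside that directory).

```shell
# Return success only if both generator binaries exist and are executable
# in the given directory (defaults to the current directory).
check_tpcds_tools() {
  dir=${1:-.}
  for bin in dsdgen dsqgen; do
    if [ ! -x "$dir/$bin" ]; then
      echo "missing or not executable: $dir/$bin" >&2
      return 1
    fi
  done
  echo "dsdgen and dsqgen found in $dir"
}

# Example usage from inside the tools directory:
#   check_tpcds_tools .
```

If the check fails, rerun make and look for compiler errors, which usually point to a missing gcc or make installation.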
Step 4: Generate data
Depending on your environment, you can generate 10 GB, 100 GB, or 1 TB of TPC-DS data. This article generates 100 GB of data as an example.
Create a directory to store the data files.
[wieck@localhost tools] $ mkdir tpcds_100g
Generate the test data.
[wieck@localhost tools] $ ./dsdgen -scale 100G -dir tpcds_100g
To generate 1 TB of data in parallel, you can use the following script. OceanBase supports direct load, which can import data from multiple files into a table at the same time.
#!/bin/bash
# Set variables
SCALE=$1
PARALLEL="20"
DIR="/data/1/tpcds/tpcds_data/" # You can change this path as needed
for ((x=1; x<=PARALLEL; x++))
do
  # Build the dsdgen command
  CMD="./dsdgen -scale ${SCALE}GB -parallel ${PARALLEL} -child ${x} -dir ${DIR}"
  # Run the command in the background
  ${CMD} &
done
wait
echo "All data generation is complete."
Fix the test data.
Fix NULL values.
When the pipe character | is used as the field separator, exporting a,NULL,c,d,NULL to a text file produces a||c|d|. Importing such data with LOAD DATA fails with an import error, so the NULL values must be handled first.
[wieck@localhost tools] $ vim fix-null.sh
#!/bin/bash
# Replace a NULL in the first field with 0: ^| becomes 0|.
# Replace a NULL in a middle field with 0: || becomes |0|.
# Replace a NULL in the last field with 0: |$ becomes |0.
for s_f in `ls *dat`
do
  echo "$s_f"
  i=1
  # Repeat until no empty field remains; adjacent NULLs need more than one pass.
  while [ `egrep '\|\||^\||\|$' $s_f |wc -l` -gt 0 ]
  do
    echo $i
    sed 's/^|/0|/g;s/||/|0|/g;s/|$/|0/g' -i $s_f
    ((i++))
  done
done
[wieck@localhost tools] $ sh fix-null.sh
Fix the date fields.
[wieck@localhost tools] $ vim fix-date.sh
for s_f in item.dat store.dat web_page.dat web_site.dat call_center.dat
do
  # Case where both the first and second dates are NULL
  sed 's/^\([A-Za-z0-9]*|[A-Za-z0-9]*\)|0|0|\(.*\)/\1|0000-00-00|0000-00-00|\2/' -i $s_f
  # Case where the second date is NULL
  sed 's/^\([0-9A-Za-z]*|[A-Za-z0-9]*|[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)|0|\(.*\)/\1|0000-00-00|\2/' -i $s_f
  # Case where the first date is NULL
  sed 's/^\([0-9A-Za-z]*|[A-Za-z0-9]*\)|0|\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}|.*\)/\1|0000-00-00|\2/' -i $s_f
done
[wieck@localhost tools] $ sh fix-date.sh
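To see what the two fix scripts do, it helps to apply the same sed expressions to single lines. The sketch below reuses the expressions from fix-null.sh and fix-date.sh unchanged. The function names and the sample input lines are made up for illustration and are not real dsdgen output.

```shell
# Apply the three NULL substitutions from fix-null.sh to one line,
# repeating until no empty field remains (mirrors the script's while loop).
fix_null_line() {
  line=$1
  while printf '%s\n' "$line" | grep -Eq '\|\||^\||\|$'; do
    line=$(printf '%s\n' "$line" | sed 's/^|/0|/g;s/||/|0|/g;s/|$/|0/g')
  done
  printf '%s\n' "$line"
}

# Apply the "second date is NULL" substitution from fix-date.sh to one line.
fix_second_date() {
  printf '%s\n' "$1" | sed 's/^\([0-9A-Za-z]*|[A-Za-z0-9]*|[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)|0|\(.*\)/\1|0000-00-00|\2/'
}

fix_null_line 'a||c|d|'   # prints a|0|c|d|0 (one pass is enough)
fix_null_line 'a|||d'     # prints a|0|0|d   (two adjacent NULLs need two passes)
fix_second_date '1|AAAAAAAABAAAAAAA|1997-10-27|0|sample'
                          # prints 1|AAAAAAAABAAAAAAA|1997-10-27|0000-00-00|sample
```

The second call shows why fix-null.sh loops: after one pass, a|||d becomes a|0||d, because the global substitution does not rescan text it has already replaced.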
Step 5: Generate the query SQL
Create a directory to store the workload test SQL files.
[wieck@localhost tools] $ mkdir sql-ds
The DSGen-software-code-3.2.0rc1/query_templates folder contains the query template files, from which dsqgen generates the test SQL statements. Before running dsqgen, you must first modify the template files in the DSGen-software-code-3.2.0rc1/query_templates folder by appending define _END = ""; to each file. Otherwise, an error occurs. The steps are as follows.
Create a shell script with the following code.
[wieck@localhost tools] $ vim query.sh
for i in `ls query*tpl`
do
  echo $i;
  echo "define _END = \"\";" >> $i
done
[wieck@localhost tools] $ sh query.sh
Use dsqgen to generate the test SQL statements.
[wieck@localhost tools] $ mkdir -p sql-ds
[wieck@localhost tools] $ vim sql-ds.sh
#!/bin/sh
for i in `seq 1 99`
do
  ./dsqgen -directory ../query_templates/ -template "query${i}.tpl" -dialect netezza -output ./sql-ds/
  # dsqgen always writes query_0.sql, so copy it to a per-query file
  cp ./sql-ds/query_0.sql ./sql-ds/sql${i}.sql
done
[wieck@localhost tools] $ sh sql-ds.sh
The output is as follows:
qgen2 Query Generator (Version 3.2.0)
Copyright Transaction Processing Performance Council (TPC) 2001 - 2021
Warning: This scale factor is valid for QUALIFICATION ONLY
......
Copyright Transaction Processing Performance Council (TPC) 2001 - 2021
qgen2 Query Generator (Version 3.2.0)
Copyright Transaction Processing Performance Council (TPC) 2001 - 2021
Check the generated test files.
[wieck@localhost tools] $ ls sql-ds/
query_0.sql sql17.sql sql25.sql sql33.sql sql41.sql sql5.sql sql58.sql sql66.sql sql74.sql sql82.sql sql90.sql sql99.sql
sql1.sql sql18.sql sql26.sql sql34.sql sql42.sql sql50.sql sql59.sql sql67.sql sql75.sql sql83.sql sql91.sql
sql10.sql sql19.sql sql27.sql sql35.sql sql43.sql sql51.sql sql6.sql sql68.sql sql76.sql sql84.sql sql92.sql
sql11.sql sql2.sql sql28.sql sql36.sql sql44.sql sql52.sql sql60.sql sql69.sql sql77.sql sql85.sql sql93.sql
sql12.sql sql20.sql sql29.sql sql37.sql sql45.sql sql53.sql sql61.sql sql7.sql sql78.sql sql86.sql sql94.sql
sql13.sql sql21.sql sql3.sql sql38.sql sql46.sql sql54.sql sql62.sql sql70.sql sql79.sql sql87.sql sql95.sql
sql14.sql sql22.sql sql30.sql sql39.sql sql47.sql sql55.sql sql63.sql sql71.sql sql8.sql sql88.sql sql96.sql
sql15.sql sql23.sql sql31.sql sql4.sql sql48.sql sql56.sql sql64.sql sql72.sql sql80.sql sql89.sql sql97.sql
sql16.sql sql24.sql sql32.sql sql40.sql sql49.sql sql57.sql sql65.sql sql73.sql sql81.sql sql9.sql sql98.sql
Adjust the query SQL.
Increase the degree of parallelism.
In the sys tenant, run the following command to check the total number of CPUs available to tenants.
obclient> SELECT sum(cpu_capacity_max) FROM oceanbase.__all_virtual_server;
Taking sql1 as an example, the modified SQL is as follows:
with customer_total_return as
(select /*+ parallel(96) */ sr_customer_sk as ctr_customer_sk -- parallel hint added
,sr_store_sk as ctr_store_sk
,sum(SR_FEE) as ctr_total_return
from store_returns
,date_dim
where sr_returned_date_sk = d_date_sk
and d_year =2000
group by sr_customer_sk
,sr_store_sk)
select c_customer_id
from customer_total_return ctr1
,store
,customer
where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2
from customer_total_return ctr2
where ctr1.ctr_store_sk = ctr2.ctr_store_sk)
and s_store_sk = ctr1.ctr_store_sk
and s_state = 'TN'
and ctr1.ctr_customer_sk = c_customer_sk
order by c_customer_id
limit 100;
Adjust the date functions.
In the SQL statements, change (cast('2025-03-04' as date) + 30 days) to date_add(cast('2025-03-04' as date),interval 30 day) or (cast('2025-03-04' as date) + INTERVAL 30 day), and change (cast ('2025-04-18' as date) - 30 days) to date_sub(cast ('2025-04-18' as date),interval 30 day) or (cast('2025-04-18' as date) - INTERVAL 30 day). Taking one SQL statement that contains a date function as an example, the modified SQL is as follows:
select i_item_id
,i_item_desc
,i_current_price
from item, inventory, date_dim, catalog_sales
where i_current_price between 22 and 22 + 30
and inv_item_sk = i_item_sk
and d_date_sk=inv_date_sk
and d_date between cast('2025-04-02' as date) and (cast('2025-04-02' as date) + INTERVAL 60 day) -- date function modified
and i_manufact_id in (678,964,918,849)
and inv_quantity_on_hand between 100 and 500
and cs_item_sk = i_item_sk
group by i_item_id,i_item_desc,i_current_price
order by i_item_id
limit 100;
Adjust the use of the rollup function.
Change the rollup usage in the SQL statements to group by <col_name> with rollup. Taking one SQL statement that contains the rollup function as an example, the modified SQL is as follows:
select i_item_id,
ca_country,
ca_state,
ca_county,
avg( cast(cs_quantity as decimal(12,2))) agg1,
avg( cast(cs_list_price as decimal(12,2))) agg2,
avg( cast(cs_coupon_amt as decimal(12,2))) agg3,
avg( cast(cs_sales_price as decimal(12,2))) agg4,
avg( cast(cs_net_profit as decimal(12,2))) agg5,
avg( cast(c_birth_year as decimal(12,2))) agg6,
avg( cast(cd1.cd_dep_count as decimal(12,2))) agg7
from catalog_sales, customer_demographics cd1,
customer_demographics cd2, customer, customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'M' and
cd1.cd_education_status = 'College' and
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,5,12,4,1,10) and
d_year = 2001 and
ca_state in ('ND','WI','AL'
,'NC','OK','MS','TN')
group by i_item_id, ca_country, ca_state, ca_county with rollup
order by ca_country,
ca_state,
ca_county,
i_item_id
limit 100;
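Editing 99 generated files by hand is tedious, so the date and rollup rewrites described in this step can be scripted. The following is a minimal sketch assuming GNU sed (for the -E option) and assuming each expression to be rewritten sits on a single line in lowercase, as in the examples in this step. The function names rewrite_days and rewrite_rollup are hypothetical. Review the rewritten files before running them, since real generated queries may contain forms these patterns do not cover.

```shell
# Rewrite "+ N days" / "- N days" into MySQL-style INTERVAL arithmetic.
# Reads SQL text on stdin and writes the rewritten text to stdout.
rewrite_days() {
  sed -E 's/([+-]) *([0-9]+) +days/\1 INTERVAL \2 day/g'
}

# Rewrite "group by rollup (a, b)" into "group by a, b with rollup".
# Only handles a rollup clause that is written on a single line.
rewrite_rollup() {
  sed -E 's/group by rollup *\(([^)]*)\)/group by \1 with rollup/'
}

echo "(cast('2025-03-04' as date) + 30 days)" | rewrite_days
echo "group by rollup (i_item_id, ca_country)" | rewrite_rollup
```

To apply both rewrites to one generated file, the two functions can be chained, for example rewrite_days < sql-ds/sql18.sql | rewrite_rollup, with the output redirected to a new file.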
Step 6: Create tables
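The DDL in this step range-partitions several tables on a date surrogate key, with bounds such as 2450846 and 2453006. The source does not explain these numbers. They appear to be Julian day numbers for the first day of each month, from 1998-02-01 through 2004-01-01, which matches how TPC-DS numbers d_date_sk. The sketch below recomputes a bound from a calendar date so you can check or extend the list. It assumes GNU date, and the function name date_to_sk is hypothetical.

```shell
# Julian day number for a calendar date in YYYY-MM-DD form (GNU date assumed).
# 2440588 is the Julian day number of 1970-01-01, the Unix epoch.
date_to_sk() {
  echo $(( $(date -u -d "$1" +%s) / 86400 + 2440588 ))
}

date_to_sk 1998-02-01   # prints 2450846, the part_1 bound
date_to_sk 1998-03-01   # prints 2450874, the part_2 bound
date_to_sk 2004-01-01   # prints 2453006, the part_72 bound
```

If your data covers a different date range, the same calculation gives the bounds for additional monthly partitions.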
Create the table structure file create_tpcds_mysql_table_part.ddl.
[wieck@localhost tools] $ mkdir load
[wieck@localhost tools] $ cd load
[wieck@localhost load] $ vim create_tpcds_mysql_table_part.ddl
SET global NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';
SET global NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI:SS.FF';
SET global NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI:SS.FF TZR TZD';
set global ob_query_timeout=10800000000;
set global ob_trx_timeout=10000000000;
set global ob_sql_work_area_percentage=80;
set global optimizer_use_sql_plan_baselines = true;
set global optimizer_capture_sql_plan_baselines = true;
alter system set ob_enable_batched_multi_statement='true';
set global parallel_servers_target=10000;
create tablegroup if not exists tpcds_tg_catalog_group_range binding true partition by hash partitions 128;
create tablegroup if not exists tpcds_tg_store_group_range_72_hash binding true
partition by range columns 1 subpartition BY hash subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue));
create table dbgen_version
(
dv_version varchar(16) ,
dv_create_date date ,
dv_create_time varchar(20) ,
dv_cmdline_args varchar(200)
);
CREATE TABLE customer_address
(
ca_address_sk bigint NOT NULL,
ca_address_id varchar(16) NOT NULL,
ca_street_number varchar(10),
ca_street_name varchar(60),
ca_street_type varchar(15),
ca_suite_number varchar(10),
ca_city varchar(60),
ca_county varchar(30),
ca_state varchar(2),
ca_zip varchar(10),
ca_country varchar(20),
ca_gmt_offset decimal(5,2),
ca_location_type varchar(20),
primary key(ca_address_sk)
) partition by hash (ca_address_sk) partitions 128;
create table customer_demographics
(
cd_demo_sk bigint not null,
cd_gender char(1),
cd_marital_status char(1),
cd_education_status char(20),
cd_purchase_estimate int,
cd_credit_rating char(10),
cd_dep_count int,
cd_dep_employed_count int,
cd_dep_college_count int,
primary key(cd_demo_sk)
) partition by hash (cd_demo_sk) partitions 128;
create table date_dim
(
d_date_sk bigint not null,
d_date_id char(16) not null,
d_date date,
d_month_seq int,
d_week_seq int,
d_quarter_seq int,
d_year int,
d_dow int,
d_moy int,
d_dom int,
d_qoy int,
d_fy_year int,
d_fy_quarter_seq int,
d_fy_week_seq int,
d_day_name char(9),
d_quarter_name char(6),
d_holiday char(1),
d_weekend char(1),
d_following_holiday char(1),
d_first_dom int,
d_last_dom int,
d_same_day_ly int,
d_same_day_lq int,
d_current_day char(1),
d_current_week char(1),
d_current_month char(1),
d_current_quarter char(1),
d_current_year char(1),
primary key(d_date_sk)
);
create table warehouse
(
w_warehouse_sk bigint not null,
w_warehouse_id char(16) not null,
w_warehouse_name varchar(20),
w_warehouse_sq_ft int,
w_street_number char(10),
w_street_name varchar(60),
w_street_type char(15),
w_suite_number char(10),
w_city varchar(60),
w_county varchar(30),
w_state char(2),
w_zip char(10),
w_country varchar(20),
w_gmt_offset decimal(5,2),
primary key(w_warehouse_sk)
);
create table ship_mode
(
sm_ship_mode_sk bigint,
sm_ship_mode_id char(16) not null,
sm_type char(30),
sm_code char(10),
sm_carrier char(20),
sm_contract char(20),
primary key(sm_ship_mode_sk)
) ;
create table time_dim
(
t_time_sk bigint not null,
t_time_id char(16) not null,
t_time int,
t_hour int,
t_minute int,
t_second int,
t_am_pm char(2),
t_shift char(20),
t_sub_shift char(20),
t_meal_time char(20),
primary key(t_time_sk)
);
create table reason
(
r_reason_sk bigint not null,
r_reason_id char(16) not null,
r_reason_desc char(100),
PRIMARY key(r_reason_sk)
);
create table income_band
(
ib_income_band_sk bigint not null,
ib_lower_bound int,
ib_upper_bound int,
PRIMARY key(ib_income_band_sk)
);
create table item
(
i_item_sk bigint not null,
i_item_id char(16) not null,
i_rec_start_date date,
i_rec_end_date date,
i_item_desc varchar(200),
i_current_price decimal(7,2),
i_wholesale_cost decimal(7,2),
i_brand_id int,
i_brand char(50),
i_class_id int,
i_class char(50),
i_category_id int,
i_category char(50),
i_manufact_id int,
i_manufact char(50),
i_size char(20),
i_formulation char(20),
i_color char(20),
i_units char(10),
i_container char(10),
i_manager_id int,
i_product_name char(50),
PRIMARY key(i_item_sk)
) ;
create table store
(
s_store_sk bigint not null,
s_store_id char(16) not null,
s_rec_start_date date,
s_rec_end_date date,
s_closed_date_sk bigint,
s_store_name varchar(50),
s_number_employees int,
s_floor_space int,
s_hours char(20),
s_manager varchar(40),
s_market_id int,
s_geography_class varchar(100),
s_market_desc varchar(100),
s_market_manager varchar(40),
s_division_id int,
s_division_name varchar(50),
s_company_id int,
s_company_name varchar(50),
s_street_number varchar(10),
s_street_name varchar(60),
s_street_type char(15),
s_suite_number char(10),
s_city varchar(60),
s_county varchar(30),
s_state char(2),
s_zip char(10),
s_country varchar(20),
s_gmt_offset decimal(5,2),
s_tax_percentage decimal(5,2),
PRIMARY key(s_store_sk)
);
create table call_center
(
cc_call_center_sk bigint not null,
cc_call_center_id char(16) not null,
cc_rec_start_date date,
cc_rec_end_date date,
cc_closed_date_sk bigint,
cc_open_date_sk bigint,
cc_name varchar(50),
cc_class varchar(50),
cc_employees int,
cc_sq_ft int,
cc_hours char(20),
cc_manager varchar(40),
cc_mkt_id int,
cc_mkt_class char(50),
cc_mkt_desc varchar(100),
cc_market_manager varchar(40),
cc_division int,
cc_division_name varchar(50),
cc_company int,
cc_company_name char(50),
cc_street_number char(10),
cc_street_name varchar(60),
cc_street_type char(15),
cc_suite_number char(10),
cc_city varchar(60),
cc_county varchar(30),
cc_state char(2),
cc_zip char(10),
cc_country varchar(20),
cc_gmt_offset decimal(5,2),
cc_tax_percentage decimal(5,2),
PRIMARY key(cc_call_center_sk)
);
CREATE TABLE customer
(
c_customer_sk bigint NOT NULL,
c_customer_id char(16) NOT NULL,
c_current_cdemo_sk bigint,
c_current_hdemo_sk bigint,
c_current_addr_sk bigint,
c_first_shipto_date_sk bigint,
c_first_sales_date_sk bigint,
c_salutation char(10),
c_first_name char(20),
c_last_name char(30),
c_preferred_cust_flag char(1),
c_birth_day int,
c_birth_month int,
c_birth_year int,
c_birth_country varchar(20),
c_login char(13),
c_email_address char(50),
c_last_review_date_sk bigint,
PRIMARY key(c_customer_sk)
)partition by hash (c_customer_sk) partitions 128;
create table web_site
(
web_site_sk bigint not null,
web_site_id char(16) not null,
web_rec_start_date date,
web_rec_end_date date,
web_name varchar(50),
web_open_date_sk bigint,
web_close_date_sk bigint,
web_class varchar(50),
web_manager varchar(40),
web_mkt_id int,
web_mkt_class varchar(50),
web_mkt_desc varchar(100),
web_market_manager varchar(40),
web_company_id int,
web_company_name char(50),
web_street_number char(10),
web_street_name varchar(60),
web_street_type char(15),
web_suite_number char(10),
web_city varchar(60),
web_county varchar(30),
web_state char(2),
web_zip char(10),
web_country varchar(20),
web_gmt_offset decimal(5,2),
web_tax_percentage decimal(5,2),
PRIMARY key(web_site_sk)
);
create table store_returns
(
sr_returned_date_sk bigint,
sr_return_time_sk bigint,
sr_item_sk bigint not null,
sr_customer_sk bigint,
sr_cdemo_sk bigint,
sr_hdemo_sk bigint,
sr_addr_sk bigint,
sr_store_sk bigint,
sr_reason_sk bigint,
sr_ticket_number bigint not null,
sr_return_quantity int,
sr_return_amt decimal(7,2),
sr_return_tax decimal(7,2),
sr_return_amt_inc_tax decimal(7,2),
sr_fee decimal(7,2),
sr_return_ship_cost decimal(7,2),
sr_refunded_cash decimal(7,2),
sr_reversed_charge decimal(7,2),
sr_store_credit decimal(7,2),
sr_net_loss decimal(7,2)
)
partition by range (sr_returned_date_sk)
subpartition BY hash(sr_item_sk) subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue));
create table household_demographics
(
hd_demo_sk bigint not null,
hd_income_band_sk bigint,
hd_buy_potential char(15),
hd_dep_count int,
hd_vehicle_count int,
PRIMARY key(hd_demo_sk)
);
create table web_page
(
wp_web_page_sk bigint not null,
wp_web_page_id char(16) not null,
wp_rec_start_date date,
wp_rec_end_date date,
wp_creation_date_sk bigint,
wp_access_date_sk bigint,
wp_autogen_flag char(1),
wp_customer_sk bigint,
wp_url varchar(100),
wp_type char(50),
wp_char_count int,
wp_link_count int,
wp_image_count int,
wp_max_ad_count int,
PRIMARY key(wp_web_page_sk)
);
create table promotion
(
p_promo_sk bigint not null,
p_promo_id char(16) not null,
p_start_date_sk bigint,
p_end_date_sk bigint,
p_item_sk bigint,
p_cost decimal(15,2),
p_response_target int,
p_promo_name char(50),
p_channel_dmail char(1),
p_channel_email char(1),
p_channel_catalog char(1),
p_channel_tv char(1),
p_channel_radio char(1),
p_channel_press char(1),
p_channel_event char(1),
p_channel_demo char(1),
p_channel_details varchar(100),
p_purpose char(15),
p_discount_active char(1),
PRIMARY key(p_promo_sk)
);
create table catalog_page
(
cp_catalog_page_sk bigint not null,
cp_catalog_page_id varchar(16) not null,
cp_start_date_sk bigint,
cp_end_date_sk bigint,
cp_department varchar(50),
cp_catalog_number int,
cp_catalog_page_number int,
cp_description varchar(100),
cp_type varchar(100),
primary key(cp_catalog_page_sk)
);
create table inventory
(
inv_date_sk bigint not null,
inv_item_sk bigint not null,
inv_warehouse_sk bigint not null,
inv_quantity_on_hand int
)
partition by range (inv_date_sk)
subpartition BY hash(inv_item_sk) subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue));
create table catalog_returns
(
cr_returned_date_sk bigint,
cr_returned_time_sk bigint,
cr_item_sk bigint not null,
cr_refunded_customer_sk bigint,
cr_refunded_cdemo_sk bigint,
cr_refunded_hdemo_sk bigint,
cr_refunded_addr_sk bigint,
cr_returning_customer_sk bigint,
cr_returning_cdemo_sk bigint,
cr_returning_hdemo_sk bigint,
cr_returning_addr_sk bigint,
cr_call_center_sk bigint,
cr_catalog_page_sk bigint,
cr_ship_mode_sk bigint,
cr_warehouse_sk bigint,
cr_reason_sk bigint,
cr_order_number bigint not null,
cr_return_quantity int,
cr_return_amount decimal(7,2),
cr_return_tax decimal(7,2),
cr_return_amt_inc_tax decimal(7,2),
cr_fee decimal(7,2),
cr_return_ship_cost decimal(7,2),
cr_refunded_cash decimal(7,2),
cr_reversed_charge decimal(7,2),
cr_store_credit decimal(7,2),
cr_net_loss decimal(7,2)
)
tablegroup = tpcds_tg_catalog_group_range
partition by hash(cr_item_sk) partitions 128
;
create table web_returns
(
wr_returned_date_sk bigint,
wr_returned_time_sk bigint,
wr_item_sk bigint not null,
wr_refunded_customer_sk bigint,
wr_refunded_cdemo_sk bigint,
wr_refunded_hdemo_sk bigint,
wr_refunded_addr_sk bigint,
wr_returning_customer_sk bigint,
wr_returning_cdemo_sk bigint,
wr_returning_hdemo_sk bigint,
wr_returning_addr_sk bigint,
wr_web_page_sk bigint,
wr_reason_sk bigint,
wr_order_number bigint not null,
wr_return_quantity int,
wr_return_amt decimal(7,2),
wr_return_tax decimal(7,2),
wr_return_amt_inc_tax decimal(7,2),
wr_fee decimal(7,2),
wr_return_ship_cost decimal(7,2),
wr_refunded_cash decimal(7,2),
wr_reversed_charge decimal(7,2),
wr_account_credit decimal(7,2),
wr_net_loss decimal(7,2)
)
partition by range(wr_returned_date_sk)
subpartition BY hash(wr_item_sk) subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue))
;
create table web_sales
(
ws_sold_date_sk bigint,
ws_sold_time_sk bigint,
ws_ship_date_sk bigint,
ws_item_sk bigint not null,
ws_bill_customer_sk bigint,
ws_bill_cdemo_sk bigint,
ws_bill_hdemo_sk bigint,
ws_bill_addr_sk bigint,
ws_ship_customer_sk bigint,
ws_ship_cdemo_sk bigint,
ws_ship_hdemo_sk bigint,
ws_ship_addr_sk bigint,
ws_web_page_sk bigint,
ws_web_site_sk bigint,
ws_ship_mode_sk bigint,
ws_warehouse_sk bigint,
ws_promo_sk bigint,
ws_order_number bigint not null,
ws_quantity int,
ws_wholesale_cost decimal(7,2),
ws_list_price decimal(7,2),
ws_sales_price decimal(7,2),
ws_ext_discount_amt decimal(7,2),
ws_ext_sales_price decimal(7,2),
ws_ext_wholesale_cost decimal(7,2),
ws_ext_list_price decimal(7,2),
ws_ext_tax decimal(7,2),
ws_coupon_amt decimal(7,2),
ws_ext_ship_cost decimal(7,2),
ws_net_paid decimal(7,2),
ws_net_paid_inc_tax decimal(7,2),
ws_net_paid_inc_ship decimal(7,2),
ws_net_paid_inc_ship_tax decimal(7,2),
ws_net_profit decimal(7,2)
)
partition by range (ws_sold_date_sk)
subpartition BY hash(ws_item_sk) subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue))
;
create table catalog_sales
(
cs_sold_date_sk bigint,
cs_sold_time_sk bigint,
cs_ship_date_sk bigint,
cs_bill_customer_sk bigint,
cs_bill_cdemo_sk bigint,
cs_bill_hdemo_sk bigint,
cs_bill_addr_sk bigint,
cs_ship_customer_sk bigint,
cs_ship_cdemo_sk bigint,
cs_ship_hdemo_sk bigint,
cs_ship_addr_sk bigint,
cs_call_center_sk bigint,
cs_catalog_page_sk bigint,
cs_ship_mode_sk bigint,
cs_warehouse_sk bigint,
cs_item_sk bigint not null,
cs_promo_sk bigint,
cs_order_number bigint not null,
cs_quantity int,
cs_wholesale_cost decimal(7,2),
cs_list_price decimal(7,2),
cs_sales_price decimal(7,2),
cs_ext_discount_amt decimal(7,2),
cs_ext_sales_price decimal(7,2),
cs_ext_wholesale_cost decimal(7,2),
cs_ext_list_price decimal(7,2),
cs_ext_tax decimal(7,2),
cs_coupon_amt decimal(7,2),
cs_ext_ship_cost decimal(7,2),
cs_net_paid decimal(7,2),
cs_net_paid_inc_tax decimal(7,2),
cs_net_paid_inc_ship decimal(7,2),
cs_net_paid_inc_ship_tax decimal(7,2),
cs_net_profit decimal(7,2)
)
partition by range (cs_sold_date_sk)
subpartition BY hash(cs_item_sk) subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue));
create table store_sales
(
ss_sold_date_sk bigint,
ss_sold_time_sk bigint,
ss_item_sk bigint not null,
ss_customer_sk bigint,
ss_cdemo_sk bigint,
ss_hdemo_sk bigint,
ss_addr_sk bigint,
ss_store_sk bigint,
ss_promo_sk bigint,
ss_ticket_number bigint not null,
ss_quantity int,
ss_wholesale_cost decimal(7,2),
ss_list_price decimal(7,2),
ss_sales_price decimal(7,2),
ss_ext_discount_amt decimal(7,2),
ss_ext_sales_price decimal(7,2),
ss_ext_wholesale_cost decimal(7,2),
ss_ext_list_price decimal(7,2),
ss_ext_tax decimal(7,2),
ss_coupon_amt decimal(7,2),
ss_net_paid decimal(7,2),
ss_net_paid_inc_tax decimal(7,2),
ss_net_profit decimal(7,2)
)
partition by range(ss_sold_date_sk)
subpartition BY hash(ss_item_sk) subpartitions 64
(partition part_1 values less than (2450846),
partition part_2 values less than (2450874),
partition part_3 values less than (2450905),
partition part_4 values less than (2450935),
partition part_5 values less than (2450966),
partition part_6 values less than (2450996),
partition part_7 values less than (2451027),
partition part_8 values less than (2451058),
partition part_9 values less than (2451088),
partition part_10 values less than (2451119),
partition part_11 values less than (2451149),
partition part_12 values less than (2451180),
partition part_13 values less than (2451211),
partition part_14 values less than (2451239),
partition part_15 values less than (2451270),
partition part_16 values less than (2451300),
partition part_17 values less than (2451331),
partition part_18 values less than (2451361),
partition part_19 values less than (2451392),
partition part_20 values less than (2451423),
partition part_21 values less than (2451453),
partition part_22 values less than (2451484),
partition part_23 values less than (2451514),
partition part_24 values less than (2451545),
partition part_25 values less than (2451576),
partition part_26 values less than (2451605),
partition part_27 values less than (2451636),
partition part_28 values less than (2451666),
partition part_29 values less than (2451697),
partition part_30 values less than (2451727),
partition part_31 values less than (2451758),
partition part_32 values less than (2451789),
partition part_33 values less than (2451819),
partition part_34 values less than (2451850),
partition part_35 values less than (2451880),
partition part_36 values less than (2451911),
partition part_37 values less than (2451942),
partition part_38 values less than (2451970),
partition part_39 values less than (2452001),
partition part_40 values less than (2452031),
partition part_41 values less than (2452062),
partition part_42 values less than (2452092),
partition part_43 values less than (2452123),
partition part_44 values less than (2452154),
partition part_45 values less than (2452184),
partition part_46 values less than (2452215),
partition part_47 values less than (2452245),
partition part_48 values less than (2452276),
partition part_49 values less than (2452307),
partition part_50 values less than (2452335),
partition part_51 values less than (2452366),
partition part_52 values less than (2452396),
partition part_53 values less than (2452427),
partition part_54 values less than (2452457),
partition part_55 values less than (2452488),
partition part_56 values less than (2452519),
partition part_57 values less than (2452549),
partition part_58 values less than (2452580),
partition part_59 values less than (2452610),
partition part_60 values less than (2452641),
partition part_61 values less than (2452672),
partition part_62 values less than (2452700),
partition part_63 values less than (2452731),
partition part_64 values less than (2452761),
partition part_65 values less than (2452792),
partition part_66 values less than (2452822),
partition part_67 values less than (2452853),
partition part_68 values less than (2452884),
partition part_69 values less than (2452914),
partition part_70 values less than (2452945),
partition part_71 values less than (2452975),
partition part_72 values less than (2453006),
partition part_73 values less than (maxvalue));
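The 72 bounded range partitions used by the fact tables above appear to fall on first-of-month values: read as Julian day numbers, 2450846 is 1998-02-01 and 2453006 is 2004-01-01. The source does not say how the bounds were chosen, so treat this as an observation rather than a documented rule. Under that assumption, a minimal Python 3 sketch for regenerating the partition clauses might look like the following. The function names and the JULIAN_OFFSET constant are illustrative and are not part of any OceanBase or TPC-DS tool.

```python
from datetime import date

# Assumption: the *_date_sk surrogate keys are Julian day numbers.
# Python's proleptic Gregorian ordinal differs from the Julian day number
# by a fixed offset (date(1970, 1, 1).toordinal() == 719163, JDN 2440588).
JULIAN_OFFSET = 1721425


def month_starts(first_year, first_month, count):
    """Yield `count` consecutive first-of-month dates."""
    year, month = first_year, first_month
    for _ in range(count):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def range_partition_clauses(first_year=1998, first_month=2, bounded=72):
    """Build 'partition part_N values less than (...)' clauses.

    The defaults reproduce the 72 bounded partitions plus the final
    maxvalue partition used in the DDL above.
    """
    clauses = []
    for index, day in enumerate(month_starts(first_year, first_month, bounded), start=1):
        bound = day.toordinal() + JULIAN_OFFSET
        clauses.append("partition part_%d values less than (%d)" % (index, bound))
    clauses.append("partition part_%d values less than (maxvalue)" % (bounded + 1))
    return clauses


if __name__ == "__main__":
    print(",\n".join(range_partition_clauses()))
```

Running the script prints the clause list, which can be compared against the DDL before it is reused for a different date range.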
Step 7: Load the data
Write your own scripts based on the data and SQL generated in the preceding steps. A sample data-loading procedure is shown below:
Create the load.py script.
[wieck@localhost load] $ vim load.py
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# Python 2 script: the commands module does not exist in Python 3.
import os
import sys
import time
import commands

hostname = '$host_ip'    # Note: IP address of the server hosting any one observer, for example observer A
port = '$host_port'      # Port number of observer A
tenant = '$tenant_name'  # Tenant name
user = '$user'           # Username
password = '$password'   # Password
data_path = '$path'      # Note: path of the tbl directory on the server hosting that observer (for example observer A)
db_name = '$db_name'     # Database name

# Create the tables
cmd_str = 'obclient -h%s -P%s -u%s@%s -p%s -D%s < create_tpcds_mysql_table_part.ddl' % (hostname, port, user, tenant, password, db_name)
result = commands.getstatusoutput(cmd_str)
print(result)
cmd_str = 'obclient -h%s -P%s -u%s@%s -p%s -D%s -e "show tables;" ' % (hostname, port, user, tenant, password, db_name)
result = commands.getstatusoutput(cmd_str)
print(result)

# Load each table from <data_path>/<table>.dat, in the same order as before.
tables = [
    'dbgen_version', 'customer_address', 'customer_demographics', 'date_dim',
    'warehouse', 'ship_mode', 'time_dim', 'reason', 'income_band', 'item',
    'store', 'call_center', 'customer', 'web_site', 'store_returns',
    'household_demographics', 'web_page', 'promotion', 'catalog_page',
    'inventory', 'catalog_returns', 'web_returns', 'web_sales',
    'catalog_sales', 'store_sales',
]
for table in tables:
    # -c makes the client keep comments, so the /*+ parallel(80) */ hint reaches the server.
    cmd_str = """ obclient -h%s -P%s -u%s@%s -p%s -c -D%s -e "load data /*+ parallel(80) */ infile '%s/%s.dat' into table %s fields terminated by '|';" """ % (hostname, port, user, tenant, password, db_name, data_path, table, table)
    result = commands.getstatusoutput(cmd_str)
    print(result)
Load the data.
Note
The OBClient client must be installed before you can load the data.
[wieck@localhost load] $ python load.py
Run a major compaction.
Log in to the test tenant and run a major compaction.
obclient > ALTER SYSTEM SET undo_retention = 100;
obclient > ALTER SYSTEM MAJOR FREEZE;
Check whether the major compaction has finished.
SELECT * FROM oceanbase.DBA_OB_ZONE_MAJOR_COMPACTION\G
The result looks like this:
*************************** 1. row ***************************
ZONE: zone1
BROADCAST_SCN: 1716172011046213913
LAST_SCN: 1716172011046213913
LAST_FINISH_TIME: 2024-05-20 10:35:07.829496
START_TIME: 2024-05-20 10:26:51.579881
STATUS: IDLE
1 row in set
The major compaction is complete when STATUS is IDLE and BROADCAST_SCN equals LAST_SCN.
Manually collect statistics.
As the test user, run the following command to connect, then execute the statements below it:
obclient -h$host_ip -P$host_port -u$user@$tenant -p$password -A -D$database
call dbms_stats.gather_table_stats(NULL, 'date_dim', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'warehouse', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'ship_mode', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'time_dim', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'reason', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'income_band', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'item', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'store', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'call_center', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'web_site', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'household_demographics', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'web_page', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'promotion', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'catalog_page', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'inventory', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'web_sales', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'catalog_sales', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'store_sales', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'catalog_returns', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'web_returns', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'store_returns', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'customer_address', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'customer_demographics', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
call dbms_stats.gather_table_stats(NULL, 'customer', degree=>128, granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE 128');
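The completion rule given earlier in this step for the major compaction (STATUS is IDLE and BROADCAST_SCN equals LAST_SCN) can also be checked mechanically, which helps when the check has to be repeated until the compaction finishes. The following is a minimal Python 3 sketch that parses the vertical output of the DBA_OB_ZONE_MAJOR_COMPACTION query and applies the rule. The helper names are hypothetical, and requiring the condition to hold for every returned row is an assumption for clusters that report more than one zone.

```python
SAMPLE_OUTPUT = """\
*************************** 1. row ***************************
ZONE: zone1
BROADCAST_SCN: 1716172011046213913
LAST_SCN: 1716172011046213913
LAST_FINISH_TIME: 2024-05-20 10:35:07.829496
START_TIME: 2024-05-20 10:26:51.579881
STATUS: IDLE
1 row in set
"""


def parse_vertical_rows(text):
    """Parse vertical query output into a list of dicts, one per row."""
    rows = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("***") and "row" in line:
            # A '*** N. row ***' separator starts a new row.
            current = {}
            rows.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, so timestamps stay intact.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return rows


def compaction_finished(rows):
    """Return True if every row is IDLE with BROADCAST_SCN == LAST_SCN."""
    if not rows:
        return False
    return all(
        row.get("STATUS") == "IDLE"
        and row.get("BROADCAST_SCN") == row.get("LAST_SCN")
        for row in rows
    )


if __name__ == "__main__":
    print(compaction_finished(parse_vertical_rows(SAMPLE_OUTPUT)))
```

The sample text is the output shown in this step. In practice the text would come from running the query through obclient and capturing its standard output.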
Step 8: Run the test
Write your own scripts based on the data and SQL generated in the preceding steps. A sample test run is shown below:
Create the test script tpcds.sh in the sql-ds directory.
[wieck@localhost queries] $ vim tpcds.sh
#!/bin/bash
TPCDS_TEST="obclient -h $host_ip -P $host_port -utpcds_100g_part@tpcds_mysql -D tpcds_100g_part -ptest -c"
# Warm-up
for i in {1..99}
do
    sql1="source sql${i}.sql"
    echo $sql1| $TPCDS_TEST >db${i}.log || ret=1
done
# Actual run
for i in {1..99}
do
    starttime=`date +%s%N`
    echo `date '+[%Y-%m-%d %H:%M:%S]'` "BEGIN Q${i}"
    sql1="source sql${i}.sql"
    echo $sql1| $TPCDS_TEST >db${i}.log || ret=1
    stoptime=`date +%s%N`
    costtime=`echo $stoptime $starttime | awk '{printf "%0.2f\n", ($1 - $2) / 1000000000}'`
    echo `date '+[%Y-%m-%d %H:%M:%S]'` "END,COST ${costtime}s"
done
Run the test script.
bash tpcds.sh
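The test script prints one BEGIN line and one END,COST line per query. If that output is redirected to a file, the per-query times can be collected and totaled afterwards. The following Python 3 sketch shows one way to do this. The function name and the sample timings are invented for illustration and are not measured results.

```python
import re

# Patterns matching the lines echoed by tpcds.sh, for example:
#   [2024-05-20 11:00:00] BEGIN Q1
#   [2024-05-20 11:00:02] END,COST 2.15s
BEGIN_RE = re.compile(r"BEGIN Q(\d+)")
COST_RE = re.compile(r"END,COST ([0-9.]+)s")

# Illustrative sample only; real values come from your own run.
SAMPLE_LOG = """\
[2024-05-20 11:00:00] BEGIN Q1
[2024-05-20 11:00:02] END,COST 2.15s
[2024-05-20 11:00:02] BEGIN Q2
[2024-05-20 11:00:03] END,COST 0.85s
"""


def parse_timings(log_text):
    """Map query number to elapsed seconds from tpcds.sh output."""
    timings = {}
    current_query = None
    for line in log_text.splitlines():
        begin = BEGIN_RE.search(line)
        if begin:
            current_query = int(begin.group(1))
            continue
        cost = COST_RE.search(line)
        if cost and current_query is not None:
            timings[current_query] = float(cost.group(1))
            current_query = None
    return timings


if __name__ == "__main__":
    timings = parse_timings(SAMPLE_LOG)
    for query, seconds in sorted(timings.items()):
        print("Q%d: %.2fs" % (query, seconds))
    print("total: %.2fs" % sum(timings.values()))
```

To apply it to a real run, capture the script output in a file of your choosing and pass the file contents to parse_timings in place of SAMPLE_LOG.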
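Common errors
Data import fails with the following error message:
ERROR 1017 (HY000) at line 1: File not exist
The tbl files must be placed in a directory on the machine that hosts the OceanBase database you connect to, because the data can only be loaded from that server's local file system.
Displaying data fails with the following error message:
ERROR 4624 (HY000):No memory or reach tenant memory limit
The tenant has run out of memory. We recommend increasing the tenant's memory.
Data import fails with the following error message:
ERROR 1227 (42501) at line 1: Access denied
The user must be granted access privileges. Run the following command to grant them.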