Download the single-node edition. Here I use version 4.0 for Red Hat/CentOS: greenplum-db-4.0.0.4-build-1-RHEL5-x86_64.bin
Check the OS version:
[root@localhost Desktop]# cat /etc/redhat-release
If you run the installer on an unmatched version of CentOS, you will get an error: “Installer will only install on RedHat/CentOS x86_64”
Next, run uname -m. It reports the hardware architecture, for example i386/i686 or x86_64.
[root@localhost ~]# uname -m
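The two checks above can be combined into one small pre-flight test before launching the installer. This is only a sketch: the function name is_supported_platform is made up for illustration, and I am assuming the installer cares about a Red Hat/CentOS release string and an x86_64 architecture, as its error message suggests. What the installer actually inspects may differ.

```shell
# Hypothetical helper (not part of Greenplum): returns 0 when the release
# string and architecture look like what the RHEL5 x86_64 installer expects.
is_supported_platform() {
    release="$1"   # e.g. the contents of /etc/redhat-release
    arch="$2"      # e.g. the output of uname -m

    # Assumption: any CentOS or Red Hat release string is acceptable.
    case "$release" in
        *CentOS*|*"Red Hat"*) ;;
        *) return 1 ;;
    esac

    # The installer in this post is the x86_64 build.
    [ "$arch" = "x86_64" ]
}

# Usage on a live system:
#   is_supported_platform "$(cat /etc/redhat-release)" "$(uname -m)" \
#       && echo "platform looks OK" || echo "platform mismatch"
```

Passing the two values in as arguments keeps the check easy to try out with sample strings before running it on the real machine.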
Run the installer directly.
[root@localhost Desktop]# ./greenplum-db-4.0.0.4-build-1-RHEL5-x86_64.bin
Read and accept the license agreement.
By default the files are installed into /usr/local/greenplum-db-<version>. Enter yes to accept the default settings, such as the folder path. Once done, you will get a message like:
source /usr/local/greenplum-db-4.0.0.4/greenplum_path.sh
Create a dedicated user to run the processes, and folders to hold the segment and master data. Give the user access permission to the folders.
Here we will:
Create a user named gp
Create a master folder /data/master
Create 4 segment folders /data/seg1, /data/seg2, /data/seg3, /data/seg4
Change the ownership of /data/* to gp
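The steps above can be sketched in shell roughly as follows. The function make_gp_dirs is a name I made up for illustration. useradd, passwd and chown are the standard Linux commands, and they are shown only as comments because they must be run as root on the real machine.

```shell
# Hypothetical helper: create the master folder and the four segment
# folders under a base directory (default /data, as used in this post).
make_gp_dirs() {
    base="${1:-/data}"
    mkdir -p "$base/master" || return 1
    for n in 1 2 3 4; do
        mkdir -p "$base/seg$n" || return 1
    done
}

# As root on the real machine (not executed by this sketch):
#   useradd gp              # create the dedicated user
#   passwd gp               # set its password
#   make_gp_dirs /data      # create /data/master and /data/seg1..seg4
#   chown -R gp /data/*     # give gp ownership of the folders
```

Taking the base directory as a parameter makes it possible to try the layout in a scratch directory first.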
Copy one instance of the sample config and change settings such as the folder paths and the listen port.
cd /usr/local/greenplum-db-4.0.0.4/docs/cli_help
Switch to user gp, go to the home folder, then change the two files we just copied there.
Changes to gp_init_singlenode_example:
MACHINE_LIST_FILE=./single_hostlist_example
declare -a DATA_DIRECTORY=(/data/seg1 /data/seg2 /data/seg3 /data/seg4)
MASTER_DIRECTORY=/data/master
Initialize the system. (This creates the folders and starts several postgres processes.)
[gp@localhost ~]$ gpssh-exkeys -f single_hostlist_example
Once done, when you run ps aux | grep /data/, you will see 5 postgres processes: one master and four segment instances.
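If you prefer a number over eyeballing the ps output, the same check can be written as a small filter. This is a sketch under assumptions: count_gp_postgres is a name I made up, and it simply counts lines that mention both postgres and the data root, so the figure depends on how the processes show up in ps on your system.

```shell
# Hypothetical helper: read `ps` output on stdin and count the postgres
# processes whose command line mentions the data root (default /data/).
count_gp_postgres() {
    root="${1:-/data/}"
    # Keep postgres lines, drop the grep commands themselves, then count
    # the lines that reference the data root.
    grep postgres | grep -v grep | grep -c -- "$root"
}

# Usage on a live system:
#   ps aux | count_gp_postgres /data/
```

On the setup described here the expectation is one master plus four segments, so a lower count suggests an instance did not start.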
Now the single-node cluster is ready. We can use psql to connect to the instance and have some fun.
Create a test database:
createdb androidtestdb
Connect with psql and create a table:
psql (8.2.14)
androidtestdb=# create table public.test (id int primary key, state varchar(30) not null);
We just created a table, test, which is distributed by id. This means that as we add more data, the rows will be distributed and dispatched to different segments according to the id column. Let's run a simple SQL statement:
insert into public.test values
(1,'AL'), (2,'AK'), (3,'AS'), (4,'AZ'), (5,'AR'), (6,'CA'), (7,'CO'), (8,'CT'), (9,'DE'), (10,'DC'),
(11,'FM'), (12,'FL'), (13,'GA'), (14,'GU'), (15,'HI'), (16,'ID'), (17,'IL'), (18,'IN'), (19,'IA'), (20,'KS'),
(21,'KY'), (22,'LA'), (23,'ME'), (24,'MH'), (25,'MD'), (26,'MA'), (27,'MI'), (28,'MN'), (29,'MS'), (30,'MO'),
(31,'MT'), (32,'NE'), (33,'NV'), (34,'NH'), (35,'NJ'), (36,'NM'), (37,'NY'), (38,'NC'), (39,'ND'), (40,'MP'),
(41,'OH'), (42,'OK'), (43,'OR'), (44,'PW'), (45,'PA'), (46,'PR'), (47,'RI'), (48,'SC'), (49,'SD'), (50,'TN'),
(51,'TX'), (52,'UT'), (53,'VT'), (54,'VI'), (55,'VA'), (56,'WA'), (57,'WV'), (58,'WI'), (59,'WY');
How can we tell whether the data is distributed evenly across the segments? Every table has a system column called gp_segment_id:
select gp_segment_id, id, state from public.test;
gp_segment_id | id | state
You can run a basic group query:
androidtestdb=# select gp_segment_id, count(*) from public.test group by gp_segment_id;
You can also run EXPLAIN and EXPLAIN ANALYZE to see the execution plan, which also shows that the segment scans are handled in parallel:
androidtestdb=# EXPLAIN select * from public.test;
androidtestdb=# EXPLAIN analyze select * from public.test;
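To turn the group query into a quick evenness check, its output can be piped through a small awk filter. This is a minimal sketch: segment_skew is a name I made up, and it assumes the input is in psql's unaligned, tuples-only format (psql -At), which prints one segment_id|row_count pair per line.

```shell
# Hypothetical helper: read "segment_id|row_count" lines on stdin and print
# the number of segments, the smallest and largest per-segment row counts,
# and the difference between them.
segment_skew() {
    awk -F'|' '
        NF == 2 {
            v = $2 + 0                      # force numeric comparison
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            n++
        }
        END {
            if (n > 0)
                printf "segments=%d min=%d max=%d spread=%d\n", n, min, max, max - min
        }'
}

# Usage on a live system:
#   psql -At -d androidtestdb \
#        -c "select gp_segment_id, count(*) from public.test group by gp_segment_id" \
#     | segment_skew
```

A small spread relative to the row counts suggests the rows are spread evenly. A large one usually points to a poorly chosen distribution column.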