postgres-xc

avicom 2010. 12. 30. 18:37

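Overview

A write-scalable, synchronous multi-master cluster solution, built as a patch on top of pgsql 8.4.3.

Write-scalable means that a write issued through any node in the cluster is propagated to all nodes. (This is not possible in an ordinary replication setup.)

It is installed on one or more machines to form a cluster. Data can be either distributed or replicated.

Version 0.9.1 is the current release, and much of the functionality is still unimplemented: compound constructs such as subqueries, cursors, views, ORDER BY, and DISTINCT are not supported, nor is DDL synchronization.

Thread-safe mode is not supported (compilation fails).

Fail-over and disaster recovery are not supported.

All APIs previously supported by pgsql are supported.

Two downloads are provided: the full source (pgsql 8.4.3 + patch) and a patch file for pgsql 8.4.3.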

https://sourceforge.net/projects/postgres-xc/files/Version_0.9.1/pgxc_v0_9_1.tar.gz/download

https://sourceforge.net/projects/postgres-xc/files/Version_0.9.1/PGXC_v0_9_1-PG_REL8_4_3.patch.gz/download

Roadmap

Version 1.0 (Late in July, 2010)

ORDER BY
DISTINCT
Stored functions
subqueries
Views
Rules
DDLs
Regression tests

Version 1.1 (Late in September, 2010)

Cluster-wide installer
Cluster-wide operation utilities
Regression tests
Logical backup/restore (pg_dump, pg_restore)
Basic cross-node operation
TEMP Table
Extended Query Protocol (for JDBC)
Global timestamp
Driver support (ECPG, JDBC, PHP, etc.)
Forward Cursor (w/o ORDER BY)

Beyond Version 1.1

Physical backup/restore incl. PITR
Cross-node operation optimization
More variety of statements such as SELECT in INSERT
Prepared statements
General aggregate functions
Savepoint
Session Parameters
2PC from Apps
Forward cursor with ORDER BY
Backward cursor
Batch, statement pushdown
Catalog synchronization with DDLs
Trigger
Global constraints
Tuple relocation (distribution key update)
Performance improvement
Regression tests

Components

GTM (Global Transaction Manager)

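Provides consistent transaction management and tuple (record) visibility across the data nodes.

It assigns a unique XID (GXID) to every transaction that occurs on the data nodes and maintains snapshots. This information ends up in the xmin and xmax fields of each table's rows: when a row is inserted, the XID of the inserting transaction is stored in xmin, and when a row is updated, the updating XID is written to xmax, which is how outdated row versions are told apart.

This way, even when transactions arrive concurrently from many data nodes, XIDs are never duplicated and stay unique.

It typically runs on a server separate from the data nodes and supplies them with unique, ordered XIDs.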
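The xmin/xmax rule above can be sketched as a small shell function. This is a heavily simplified model, not how pgsql or pgxc actually implements visibility: it assumes a snapshot can be reduced to a single GXID boundary and that every transaction below that boundary has committed. The name row_visible and its arguments are made up for illustration.

```shell
#!/bin/sh
# Simplified model: a snapshot is a single GXID boundary, and every
# transaction with a smaller GXID is treated as committed. Real snapshots
# also track in-progress transactions, which this sketch ignores.

# row_visible XMIN XMAX SNAPSHOT_GXID
# Succeeds (exit 0) when the row version is visible to the snapshot.
row_visible() {
    xmin=$1 xmax=$2 snap=$3
    # The inserting transaction must precede the snapshot.
    [ "$xmin" -lt "$snap" ] || return 1
    # xmax = 0 stands for "never updated or deleted" in this model.
    [ "$xmax" -eq 0 ] && return 0
    # Still visible if the superseding transaction came at or after the snapshot.
    [ "$xmax" -ge "$snap" ]
}

# Row version inserted by GXID 100 and updated by GXID 105.
if row_visible 100 105 103; then echo "visible to snapshot 103"; fi
if ! row_visible 100 105 110; then echo "hidden from snapshot 110"; fi
```

Because every node compares against the same globally ordered GXIDs, the same row version gets the same answer no matter which data node evaluates it.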

Coordinator

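The coordinator is the interface that provides DB connections.

From the outside it behaves like an ordinary pgsql server, but it does not store any actual data. The actual data is stored on the datanodes.

On receiving a SQL statement, it obtains a GXID and a snapshot and decides which data node(s) to send the statement to. The GXID and snapshot are sent along with the SQL so that the data node does not confuse it with SQL sent by other coordinators.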

Data Node

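The component that actually stores data.

Tables are either distributed across the data nodes or replicated to all data nodes.

A data node has no global view of the whole database, so it manages only the data stored locally.

SQL arriving at a data node has already been examined by the coordinator and regenerated for execution on each data node. The GXID and snapshot are sent with it.

A data node receives SQL requests from multiple coordinators, but because each transaction is uniquely identified and tied to a snapshot, it does not confuse statements sent by different coordinators.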

Transaction

[Figure: transaction of pgxc]

Environment Setup

Layout

coord / datanode servers: 192.168.100.76, 192.168.100.77, 192.168.100.79
gtm: 192.168.100.79

Installation (common to coord/datanode and gtm servers)

Download and extract the source

[root@lustre-001 src]# wget -O pgxc_v0_9_1.tar.gz 'http://downloads.sourceforge.net/project/postgres-xc/Version_0.9.1/pgxc_v0_9_1.tar.gz?use_mirror=cdnetworks-kr-2'
[root@lustre-001 src]# tar xvfz pgxc_v0_9_1.tar.gz

configuration

[root@lustre-001 postgres-xc]# cat conf.sh
export CFLAGS='-O2'
./configure \
--prefix=/home/pgxc \
--with-perl \
--with-python \
--with-openssl
# the --enable-thread-safe option does not work

Compile and install

[root@lustre-001 postgres-xc]# sh -x conf.sh && make && make install

Configuration (coord/datanode servers)

Change the owner of /home/pgxc

chown -R postgres: /home/pgxc

All subsequent steps are performed as the postgres user.

Initialize the coord/datanode databases (coord/datanode servers)

/home/pgxc/bin/initdb -D /home/pgxc/co2dn2/coord
/home/pgxc/bin/initdb -D /home/pgxc/co2dn2/datanode

Edit the coordinator's postgresql.conf

listen_addresses = '*'
max_connections = 200
max_prepared_transactions = 200
num_data_nodes = 3
min_pool_size = 100
max_pool_size = 600

data_node_hosts = '192.168.100.76,192.168.100.77,192.168.100.79'
data_node_ports = '15432,15432,15432'
data_node_users = 'postgres'
gtm_coordinator_id = 1 # assign 1, 2, 3 in coord server order
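The three data-node settings above have to stay in step with each other (same count, same order). As a rough sketch, a helper like the following could generate them from one host list. The function name datanode_settings is made up, and it assumes every data node listens on the same port, as in this setup.

```shell
#!/bin/sh
# Hypothetical helper: print the data-node lines of the coordinator's
# postgresql.conf for a shared port followed by a list of hosts.
datanode_settings() {
    port=$1; shift
    hosts= ports= count=0
    for host in "$@"; do
        # Append with a comma only when the list is already non-empty.
        hosts="${hosts:+$hosts,}$host"
        ports="${ports:+$ports,}$port"
        count=$((count + 1))
    done
    echo "num_data_nodes = $count"
    echo "data_node_hosts = '$hosts'"
    echo "data_node_ports = '$ports'"
}

datanode_settings 15432 192.168.100.76 192.168.100.77 192.168.100.79
```

For the three servers used here, this prints the same num_data_nodes, data_node_hosts, and data_node_ports lines as the hand-written config.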

Edit the coordinator's pg_hba.conf

host all all 192.168.0.0/16 trust

Edit the datanode's postgresql.conf

listen_addresses = '*'
max_connections = 400
max_prepared_transactions = 400
max_pool_size = 600
persistent_datanode_connections = off
gtm_coordinator_id = 11 # assign 11, 22, 33 in datanode server order

Edit the datanode's pg_hba.conf

host all all 192.168.0.0/16 trust

Write a coord/datanode start/stop script

-bash-3.2$ cat pgxc.sh
#!/bin/sh
# Start or stop the local coordinator (port 5432) or data node (port 15432).
# Uses "=" and "&&" so the tests also work in a plain POSIX sh.

if [ "$1" = "coord" ] && [ "$2" = "start" ]; then
    /home/pgxc/bin/pg_ctl -S coordinator -D /home/pgxc/co2dn2/coord -l /home/pgxc/co2dn2/coord/coord.log -o "-C -i -p 5432" start
elif [ "$1" = "coord" ] && [ "$2" = "stop" ]; then
    /home/pgxc/bin/pg_ctl -S coordinator -D /home/pgxc/co2dn2/coord -l /home/pgxc/co2dn2/coord/coord.log stop
elif [ "$1" = "data" ] && [ "$2" = "start" ]; then
    /home/pgxc/bin/pg_ctl -S datanode -D /home/pgxc/co2dn2/datanode -l /home/pgxc/co2dn2/datanode/datanode.log -o "-X -i -p 15432" start
elif [ "$1" = "data" ] && [ "$2" = "stop" ]; then
    /home/pgxc/bin/pg_ctl -S datanode -D /home/pgxc/co2dn2/datanode -l /home/pgxc/co2dn2/datanode/datanode.log stop
else
    echo "usage: pgxc.sh [coord|data] [start|stop]"
fi

pgbench test

Test environment setup

Start the gtm daemon (gtm server)

[postgres@lustre-client ~]$ /home/pgxc/bin/gtm -x 1000 -l /home/pgxc/data/gtm/gtmlog -p 16680 -D /home/pgxc/data/gtm

Start gtm_proxy (coord/datanode servers)

-bash-3.2$ /home/pgxc/bin/gtm_proxy -D $HOME/co2dn2/ -h localhost -p 6666 -s 192.168.100.79 -t 16680 -n 2 -l /home/pgxc/co2dn2/log/gtm_proxy.log &

Start the datanodes (all datanode servers)

-bash-3.2$ /home/pgxc/pgxc.sh data start

Start the coordinator (coord server 1 only)
The current release does not support DDL replication, so DDL has to be run on the coord db of server 1 and the resulting data copied to the other coordinators.

-bash-3.2$ /home/pgxc/pgxc.sh coord start

Create and initialize the pgbench db

-bash-3.2$ /home/pgxc/bin/createdb pgbench
-bash-3.2$ /home/pgxc/bin/pgbench -i -s 100 -h localhost pgbench

Shut down the coordinator and copy the data

-bash-3.2$ /home/pgxc/pgxc.sh coord stop
-bash-3.2$ rsync -a /home/pgxc/co2dn2/coord/ 192.168.100.77:/home/pgxc/co2dn2/coord/
-bash-3.2$ rsync -a /home/pgxc/co2dn2/coord/ 192.168.100.79:/home/pgxc/co2dn2/coord/

After the rsync, gtm_coordinator_id in the copied postgresql.conf is still set to server 1's value, so change it to match each server.
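The per-server fix-up could be scripted roughly like this. The function name set_gtm_id is made up for illustration; the sketch assumes the setting sits on its own line starting with gtm_coordinator_id, and it replaces the whole line, including any trailing comment.

```shell
#!/bin/sh
# set_gtm_id CONF_FILE ID
# Rewrites the gtm_coordinator_id line of CONF_FILE. Goes through a temp
# file instead of relying on the non-portable `sed -i`.
set_gtm_id() {
    conf=$1 id=$2
    tmp="$conf.tmp.$$"
    sed "s/^gtm_coordinator_id *=.*/gtm_coordinator_id = $id/" "$conf" > "$tmp" &&
        mv "$tmp" "$conf"
}

# Example, to be run on the second coordinator server:
# set_gtm_id /home/pgxc/co2dn2/coord/postgresql.conf 2
```

Run it on each server that received the copy, with that server's own id (2 and 3 here), before starting the coordinators.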

Start the coordinators (all coordinator servers)

-bash-3.2$ /home/pgxc/pgxc.sh coord start

benchmark test

Server running the benchmark (192.168.10.52)

System	Concurrent clients	Duration (s)	Transactions processed	tps (including connection setup)	tps (excluding connection setup)
pgxc	10	120	26791	222.156437	233.817976
pgsql	10	120	19627	149.974910	150.047814

pgxc	50	120	24220	198.942199	296.510180
pgsql	50	120	20468	162.954511	163.255000

pgxc	100	120	19359	155.135624	262.712174
pgsql	100	120	18590	154.776800	155.390721

pgxc	150	120	14204	118.209952	330.276915
pgsql	150	120	20650	165.623020	166.576326

pgxc	200	120	4192	34.545902	320.780992
pgsql	200	120	21727	169.821198	171.049394
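To compare the two systems row by row, the relative gain can be computed from the last column (tps excluding connection setup). A minimal sketch, with the numbers copied from the table above; the function name gain is made up.

```shell
#!/bin/sh
# gain PGXC_TPS PGSQL_TPS -> percentage improvement of pgxc over pgsql
gain() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a / b - 1) * 100 }'
}

# Each row: clients, pgxc tps, pgsql tps (both excluding connection setup).
for row in "10 233.817976 150.047814" "50 296.510180 163.255000" \
           "100 262.712174 155.390721" "150 330.276915 166.576326" \
           "200 320.780992 171.049394"; do
    set -- $row   # split the row into $1 $2 $3
    echo "$1 clients: $(gain "$2" "$3")%"
done
```

For the 100-client row the ratio 262.712174 / 155.390721 is about 1.69, so the script reports a gain of roughly 69%.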

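Test results

Session connection cost is excessive, so pgxc's advantage largely disappears in the tps figures that include connection setup (at 150 and 200 clients pgxc falls behind pgsql).

Excluding connection setup, the 100-client test shows roughly 69% higher TPS than pure pgsql (262.7 vs. 155.4).

Results are strongly affected by the max_connections, max_prepared_transactions, and max_pool_size settings.

A large number of sessions are opened between gtm, coordinator, and datanode, so the pooler size and max_connections must be set generously to get the full performance gain.

GXIDs exchanged between coord/datanode servers occasionally fail to match.

Once an official version is released, it looks likely to pay off in production use, depending on how testing goes.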

Sizing max_connections, max_prepared_transactions, and max_pool_size

max_connections	In coord/postgresql.conf, max_connections must be larger than the pgbench -c value. In datanode/postgresql.conf, set it to twice the -c value.
max_prepared_transactions	Same value as max_connections
max_pool_size	Number of datanodes * max_connections
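These rules of thumb can be turned into a small calculator. This is only a sketch of the rules as stated in this post: the function name recommend and the headroom argument (how far the coordinator's max_connections should exceed -c, default 10) are assumptions, and max_pool_size is taken to be based on the coordinator's max_connections, which matches the 3 * 200 = 600 used in the config above.

```shell
#!/bin/sh
# recommend CLIENTS DATANODES [HEADROOM]
# CLIENTS is the pgbench -c value. HEADROOM (hypothetical, default 10) is
# the margin that keeps the coordinator's max_connections above CLIENTS.
recommend() {
    clients=$1 nodes=$2 headroom=${3:-10}
    coord_conn=$((clients + headroom))
    data_conn=$((clients * 2))
    echo "coordinator: max_connections = $coord_conn"
    echo "coordinator: max_prepared_transactions = $coord_conn"
    echo "coordinator: max_pool_size = $((nodes * coord_conn))"
    echo "datanode: max_connections = $data_conn"
    echo "datanode: max_prepared_transactions = $data_conn"
}

recommend 100 3
```

With 100 clients and 3 datanodes this suggests 110 / 110 / 330 for the coordinator and 200 / 200 for each datanode.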
