PACEMAKER

Current version as of 29 January 2013, 11:21

Installation

[ALL] Initial setup

Install required packages:

sudo apt-get install pacemaker cman resource-agents fence-agents gfs2-utils gfs2-cluster ocfs2-tools-cman openais drbd8-utils

Make sure each host can resolve all other hosts. The best way to achieve this is to add their IPs and hostnames to /etc/hosts on all nodes. In this example, that would be:

eth0
192.168.244.161 fix
192.168.244.162 foxy
eth1
10.168.244.161   fix-ha
10.168.244.162   foxy-ha
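
One quick way to verify that every node really resolves all four names (and not, say, a stale 127.0.1.1 entry) is glibc's resolver front end:

getent hosts fix foxy fix-ha foxy-ha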

Disable o2cb from starting:

update-rc.d -f o2cb remove
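
An earlier revision of this page disabled the drbd, o2cb and ocfs2 init scripts in one loop; if all three packages are installed, the same approach still applies:

for X in {drbd,o2cb,ocfs2}; do update-rc.d -f ${X} disable; done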

[ALL] Create /etc/cluster/cluster.conf

Paste this into /etc/cluster/cluster.conf (note the two_node="1" / expected_votes="1" attributes, which let cman treat this two-node cluster as quorate):

<?xml version="1.0"?>
<cluster config_version="1" name="pacemaker">
<cman two_node="1" expected_votes="1"> </cman>
   <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
   <clusternodes>
           <clusternode name="fix" nodeid="1" votes="1">
               <fence>
                       <method name="pcmk-redirect">
                               <device name="pcmk" port="fix"/>
                       </method>
               </fence>
           </clusternode>
           <clusternode name="foxy" nodeid="2" votes="1">
               <fence>
                       <method name="pcmk-redirect">
                               <device name="pcmk" port="foxy"/>
                       </method>
               </fence>
           </clusternode>
   </clusternodes>
 <fencedevices>
   <fencedevice name="pcmk" agent="fence_pcmk"/>
 </fencedevices>
</cluster>

Validate the config:

# ccs_config_validate 
Configuration validates
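
If you change cluster.conf later, increment config_version and activate the new file on the running cluster. On the cman release used here this should work as below, but check cman_tool(8), as the exact flags vary between releases:

# cman_tool version -r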

[ALL] Edit /etc/corosync/corosync.conf

Find the pacemaker service block in /etc/corosync/corosync.conf and set ver to 1, so that pacemaker's daemons are started by their own init script rather than spawned by corosync:

service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       1
        name:      pacemaker
}

Replace bindnetaddr with the network address of the interface corosync should use. For example:

                bindnetaddr: 10.168.244.0

'0' is not a typo.
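
Not sure which address to put there? Inspect the interface corosync binds to (assuming the dedicated HA interface eth1 from the /etc/hosts example above) and zero out the host part:

# ip -4 addr show eth1

For "inet 10.168.244.161/24" the bindnetaddr would be 10.168.244.0.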

[ALL] Enable pacemaker init scripts

update-rc.d -f pacemaker remove
update-rc.d pacemaker start 50 1 2 3 4 5 . stop 01 0 6 .

[ALL] Start cman service and then pacemaker service

Deactivate the quorum timeout, so starting cman on a single node does not block:

# echo CMAN_QUORUM_TIMEOUT=0 >> /etc/default/cman

Start cman:

# service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... [  OK  ]
   Waiting for quorum... [  OK  ]
   Starting fenced... [  OK  ]
   Starting dlm_controld... [  OK  ]
   Unfencing self... [  OK  ]
   Joining fence domain... [  OK  ]


Check that both nodes have joined:

# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M      4   2012-09-11 19:14:58  fix
   2   M      8   2012-09-11 19:18:35  foxy


Start pacemaker:

# service pacemaker start
Starting Pacemaker Cluster Manager: [  OK  ]

[ONE] Setup resources

Wait for a minute until pacemaker declares all nodes online:
# crm status
============
Last updated: Fri Sep  7 21:18:12 2012
Last change: Fri Sep  7 21:17:17 2012 via crmd on fix
Stack: cman
Current DC: fix - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, unknown expected votes
0 Resources configured.
============ 

Online: [ fix foxy ]
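
Note that the older revision of this page ran the cluster with STONITH disabled and quorum loss ignored, together with a strong warning: this is acceptable only for a basic test setup, never for shared resources in production (see the pacemaker documentation before copying it). A sketch of the same two properties via the crm shell:

crm configure property stonith-enabled="false"
crm configure property no-quorum-policy="ignore"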

The filesystem and resource setup continues on the subpages OCSF2 WAY, GFS2 WAY and CRM/CIB.

[ALL] Install services that will fail over between servers

In this example, I'm installing apache2:

sudo apt-get install apache2

Disable their init scripts:

update-rc.d -f apache2 remove
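
Debian/Ubuntu packages start apache2 right after installation, so stop the running instance as well; from now on only pacemaker should start it:

service apache2 stop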

In this example, I'll create a failover setup for apache2: an additional virtual IP resource, grouped with apache2 and ordered so that the IP is brought up first.

sudo crm configure edit
primitive resAPACHE ocf:heartbeat:apache \
        params configfile="/etc/apache2/apache2.conf" httpd="/usr/sbin/apache2" \
        op monitor interval="5s"
primitive resIP-APACHE ocf:heartbeat:IPaddr2 \
        params ip="192.168.244.171" cidr_netmask="21" nic="eth0"
group groupAPACHE resIP-APACHE resAPACHE
order apache_after_ip inf: resIP-APACHE:start resAPACHE:start
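
To exercise the failover (assuming the group currently runs on fix), put that node into standby, watch the group move, then bring the node back online:

crm node standby fix
crm_mon -1
crm node online fix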

Links

http://www.linux-magazin.de/Online-Artikel/GFS2-und-OCFS2-zwei-Cluster-Dateisysteme-im-Linux-Kernel
https://wiki.ubuntu.com/ClusterStack/Precise
https://wiki.ubuntu.com/ClusterStack/LucidTesting
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch
http://www.drbd.org/users-guide/
http://wiki.techstories.de/display/IT/Pacemaker+HA-Cluster+Schulung+GFU+Koeln
http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for
http://www.hastexo.com/resources/hints-and-kinks/fencing-libvirtkvm-virtualized-cluster-nodes
http://linux.die.net/man/5/cluster.conf
http://burning-midnight.blogspot.de/2012/05/cluster-building-ubuntu-1204.html
http://burning-midnight.blogspot.de/2012/07/cluster-building-ubuntu-1204-revised.html
http://www.hastexo.com/resources/hints-and-kinks/ocfs2-pacemaker-debianubuntu
http://www.gossamer-threads.com/lists/drbd/users/23267
http://blog.simon-meggle.de/tutorials/score-berechnung-im-pacemaker-cluster-teil-1/
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-cluster-options.html
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-ocf-return-codes.html
https://aseith.com/plugins/viewsource/viewpagesrc.action?pageId=9601113
http://www.youtube.com/watch?v=3GoT36cK6os&feature=youtu.be
http://www.debian-administration.org/articles/578
http://blog.datentraeger.li/?cat=31
http://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en