AIX HACMP EE Installation and configuration
id : emfc3i37bp
category : computer
blog : unix
created : 12/26/11 - 15:31:17

Naming convention

    • nimserver : NIM server used for the installations.
    • node1 : first node of the cluster.
    • node2 : second node of the cluster.

Installation

Prerequisites for HACMP EE installation
EMC Symmetrix ODM installation
  • install EMC Symmetrix ODM :
# mount nimserver:/app/list/nim/distrib /mnt
# cd /mnt/EMC/5.3.0.4
# installp -d . -agYX EMC.Symmetrix.aix.rte EMC.Symmetrix.fcp.rte
# lslpp -l | grep -i EMC.Symmetrix
 EMC.Symmetrix.aix.rte      5.3.0.4  COMMITTED  EMC Symmetrix AIX Support
 EMC.Symmetrix.fcp.rte      5.3.0.4  COMMITTED  EMC Symmetrix FCP Support
 EMC.Symmetrix.aix.rte      5.3.0.4  COMMITTED  EMC Symmetrix AIX Support
 EMC.Symmetrix.fcp.rte      5.3.0.4  COMMITTED  EMC Symmetrix FCP Support
# cd ; umount /mnt

Powerpath installation
  • PowerPath 5.5.0.1 installation :
# mount nimserver:/app/list/nim/distrib /mnt
# cd /mnt/EMC/5.5.0.1
# installp -d . -ag EMCpower.base
# lslpp -l | grep -i EMCpower
  EMCpower.base              5.5.0.1  COMMITTED  PowerPath Base Driver and
  EMCpower.encryption        5.5.0.1  COMMITTED  PowerPath Encryption with RSA
  EMCpower.migration_enabler
  EMCpower.mpx               5.5.0.1  COMMITTED  PowerPath Multi_Pathing
# cd ; umount /mnt

  • define the registration key for the new PowerPath installation :
# emcpreg -install
====== EMC PowerPath Registration ======
Do you have a new registration key or keys to enter?[n] y
                  Enter the registration keys(s) for your product(s),
                  one per line, pressing Enter after each key.
                  After typing all keys, press Enter again.

Key (Enter if done): ****-****-****-****-****-****
1 key(s) successfully added.
Key successfully installed.

Key (Enter if done):
1 key(s) successfully registered.
# powermt check_registration
Key ****-****-****-****-****-****
  Product: PowerPath
  Capabilities: All

  • reboot and check that the PowerPath devices are now available :
# shutdown -Fr
# powermt display dev=all

  • enable PowerPath on rootvg using the pprootdev command and reboot :
# pprootdev on 
bosboot: Boot image is 49547 512 byte blocks.
PowerPath boot is enabled for the next system boot.
# shutdown -Fr

  • if everything went fine, rootvg is now on hdiskpower0; list the paths of the PowerPath pseudo device to build the bootlist :
# powermt display dev=hdiskpower0
Pseudo name=hdiskpower0
Symmetrix ID=000292602667
Logical device ID=31F3
state=alive; policy=SymmOpt; priority=0; queued-IOs=0;
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
   0 fscsi0                   hdisk0    FA 10eB   active  alive       0      0
   1 fscsi1                   hdisk16   FA  7eB   active  alive       0      0
   1 fscsi1                   hdisk23   FA  5eB   active  alive       0      0
   2 fscsi2                   hdisk32   FA 10eB   active  alive       0      0
   2 fscsi2                   hdisk39   FA 12eB   active  alive       0      0
   3 fscsi3                   hdisk48   FA  7eB   active  alive       0      0
   3 fscsi3                   hdisk55   FA  5eB   active  alive       0      0
   0 fscsi0                   hdisk7    FA 12eB   active  alive       0      0

  • the bootlist can't be defined on a PowerPath pseudo device; it has to be set on hdisks. Choose one path on each fibre channel card to be as redundant as possible, here : hdisk0, hdisk16, hdisk32, hdisk48.
# bootlist -m normal hdisk0 hdisk16 hdisk32 hdisk48
# bootlist -m normal -o
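
  • to cross-check that each chosen hdisk sits behind a different fibre channel adapter, the parent adapter of each path can be listed with plain AIX lsdev (a sketch, disk/adapter names taken from the powermt output above) :
# lsdev -p fscsi0 | grep hdisk0
# lsdev -p fscsi1 | grep hdisk16
# lsdev -p fscsi2 | grep hdisk32
# lsdev -p fscsi3 | grep hdisk48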

  • run pprootdev fix to temporarily set rootvg back on hdisks, then run a bosboot :
# pprootdev fix
bosboot: Boot image is 49547 512 byte blocks.
You may now run bosboot.
PowerPath boot remains enabled for the next system boot.
# bosboot -ad /dev/ipldevice

  • reboot one last time, and check the bootlist once again :
# shutdown -Fr
# bootlist -m normal -o
hdisk0 blv=hd5
hdisk16 blv=hd5
hdisk32 blv=hd5
hdisk48 blv=hd5

Solutions Enabler installation
  • HACMP EE can't run without Solutions Enabler, which is needed to control the SRDF pairs between the two sites.
  • extend the /opt, /usr and /var filesystems :
# chfs -a size=600M /opt
# chfs -a size=3520M /usr
# chfs -a size=1G /var
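
  • the new sizes can then be checked with a plain df (sizes in MB) :
# df -m /opt /usr /var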

  • then install Solutions Enabler :
# mount nimserver:/app/list/nim/distrib /mnt
# cd /mnt/EMC/SolutionEnabler/7.3.0.1
# ./se7301_install.sh -install
 #----------------------------------------------------------------------------
 #                            EMC Installation Manager
 #----------------------------------------------------------------------------
 Copyright 2010, EMC Corporation
 All rights reserved.
 The terms of your use of this software are governed by the
 applicable contract.
 Solutions Enabler Native Installer [RT] Kit Location : /mnt/EMC/SolutionEnabler/7.3.0.1
 Checking for OS version compatibility......
 Checking for previous installation of Solutions Enabler......
 Following daemons can be set to run as a non-root user:
 storsrvd, storevntd, storgnsd, storwatchd
 Do you want to run these daemons as a non-root user? [N]:
 Checking for active processes.....
 Checking for active SYMCLI components...
 Install All EMC Solutions Enabler Shared Libraries and Run Time Environment ? [Y]:
 Install Symmetrix Command Line Interface SYMCLI ? [Y]:
 Install Option to Enable JNI Interface for EMC Solutions Enabler APIs ? [N]:
 Install EMC Solutions Enabler SRM Components ? [N]:
 Install EMC Solutions Enabler SYMRECOVER Components ? [Y]:
 Do you want to change default permission on /var/symapi directory from [755]? [N]:
 Installing SYMCLI.DATA.rte.....
 Installing SYMCLI.THINCORE.rte.....
 Installing SYMCLI.BASE.rte.....
 Installing SYMCLI.SYMCLI.rte.....
 Installing SYMCLI.SYMRECOVER.rte.....

 Installing SYMCLI.64BIT.rte.....
 Enabling stordaemon...
 Do not forget to run 'symcfg discover' after the installation
 completes and  whenever your configuration changes.
 You may need to manually rediscover remotely connected
 arrays. Please see the installation notes for further
 information.
 #-----------------------------------------------------------------------------
 # The following HAS BEEN INSTALLED in /opt/emc via the installp utility.
 #-----------------------------------------------------------------------------
  ITEM  PRODUCT                                         VERSION
  01    EMC Solutions Enabler                           V7.3.0.1
        RT KIT
 #-----------------------------------------------------------------------------

  • check that the filesets are installed and committed :
# lslpp -l | grep SYMCLI
  SYMCLI.64BIT.rte           7.3.0.1  COMMITTED  64-bit Shared Libraries
  SYMCLI.BASE.rte            7.3.0.1  COMMITTED  Shared Libraries and Runtime
  SYMCLI.DATA.rte            7.3.0.1  COMMITTED  Data Component -  Core Library
  SYMCLI.SYMCLI.rte          7.3.0.1  COMMITTED  Symmetrix Command Line
                                                 Interface (SYMCLI)
  SYMCLI.SYMRECOVER.rte      7.3.0.1  COMMITTED  EMC Solutions Enabler
  SYMCLI.THINCORE.rte        7.3.0.1  COMMITTED  Shared Libraries and Runtime

Solutions Enabler configuration
  • run symcfg discover to discover the attached Symmetrix arrays :
# symcfg discover
This operation may take up to a few minutes. Please be patient...

  • check that the server can see (at least) the two arrays, in this case 2667 (site1) and 2666 (site2) :
# symcfg list
                                S Y M M E T R I X

                                       Mcode    Cache      Num Phys  Num Symm
    SymmID       Attachment  Model     Version  Size (MB)  Devices   Devices

    000292602666 Local       VMAX-1    5875      229376         8      5885
    000290100772 Remote      DMX3-24   5773       49152         0      2106
    000290101885 Remote      DMX3-24   5773       98304         0      8520
    000292602153 Remote      VMAX-1    5875       81920         0      1923
    000292602160 Remote      VMAX-1    5875       81920         0      2330
    000292602667 Remote      VMAX-1    5875      229376         0     13943

  • configure the symapi daemon and options files :
# cp /var/symapi/config/daemon_options /var/symapi/config/daemon_options.orig
# echo "storstpd:logfile_type = dated" > /var/symapi/config/daemon_options
# echo "storstpd:sync_vp_data_interval  = 4" >> /var/symapi/config/daemon_options
# echo "storapid:internode_lock_recovery=enable" >> /var/symapi/config/daemon_options
# echo "storapid:internode_lock_recovery_heartbeat_interval=10" >> /var/symapi/config/daemon_options
# echo "storapid:internode_lock_information_export=enable" >> /var/symapi/config/daemon_options
# grep -v ^# /var/symapi/config/daemon_options
# cp /var/symapi/config/options /var/symapi/config/options.orig
# sed 's/#SYMAPI_USE_RDFD = DISABLE/SYMAPI_USE_RDFD = ENABLE/' /var/symapi/config/options > /var/symapi/config/options.tmp
# mv /var/symapi/config/options.tmp /var/symapi/config/options
# grep SYMAPI_USE_RDFD /var/symapi/config/options
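
  • the two grep commands above should show the five daemon_options lines just written and, in the options file, the now uncommented line :
SYMAPI_USE_RDFD = ENABLE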

  • then restart the daemons :
# /usr/symcli/storbin/stordaemon shutdown all
# /usr/symcli/storbin/stordaemon start storapid
# /usr/symcli/storbin/stordaemon start storrdfd

  • if new LUNs are masked to the server, rerun symcfg discover to pick them up :
# symcfg discover

Prerequisite filesets for HACMP EE installation
  • install bos.adt.libm and bos.adt.syscalls :
# mount nimserver:/app/list/nim/os /mnt
# cd /mnt/AIX610/installp/ppc
# installp -d . -agY bos.adt.libm bos.adt.syscalls
# cd / ; umount /mnt ; mount nimserver:/app/list/nim/tl /mnt ; cd /mnt/AIX610/TL06
# installp -d . -agY bos.adt.libm bos.adt.syscalls
# cd / ; umount /mnt

  • install rsct filesets :
# mount nimserver:/app/list/nim/tl /mnt
# cd /mnt/AIX610/TL06
# installp -d . -ag rsct.basic rsct.compat.basic rsct.compat.clients rsct.core
# cd / ; umount /mnt

  • install bos.clvm for the heartbeat resource group configuration :
# mount nimserver:/app/list/nim/os /mnt
# cd /mnt/AIX610/installp/ppc
# pprootdev fix
# installp -d . -ag bos.clvm.enh
# cd / ; umount /mnt
# mount nimserver:/app/list/nim/sp /mnt
# cd /mnt/AIX610/TL06SP05
# installp -d . -ag bos.clvm.enh
# cd / ; umount /mnt

  • the installation of bos.clvm runs a bosboot command at the end of the install.
  • HACMP EE runs on top of PowerPath : before every bosboot, run the pprootdev fix command, or the bosboot will fail.
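  • as a reminder, before any install that triggers a bosboot on the PowerPath-booted rootvg, the safe sequence is the same as earlier (a sketch, with bos.clvm.enh as example) :
# pprootdev fix
# installp -d . -ag bos.clvm.enh
# bosboot -ad /dev/ipldevice
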
  • extend /tmp filesystem for HACMP EE installation :
# chfs -a size=512M /tmp

  • HACMP EE filesets installation :
# mount nimserver:/app/list/nim/distrib /mnt
# cd /mnt/HACMP/PowerHAEE_v6.1/installp/ppc
# installp -d . -agY cluster.es.client cluster.es.cspoc cluster.es.server cluster.es.sr cluster.license cluster.xd.license
# cd /mnt/HACMP/PowerHA_v6.1/v6.1.0.6
# /usr/lib/instl/sm_inst installp_cmd -a -d '.' -f '_update_all'  '-c' '-N' '-g' '-X'    '-Y'
# cd /
# umount /mnt

  • system update to TL06SP05 :
# mount nimserver:/app/list/nim/sp /mnt
# cd /mnt/AIX610/TL06SP03
# /usr/lib/instl/sm_inst installp_cmd -a -d '.' -f '_update_all'  '-c' '-N' '-g' '-X'    '-Y'
# cd /mnt/AIX610/TL06SP05
# /usr/lib/instl/sm_inst installp_cmd -a -d '.' -f '_update_all'  '-c' '-N' '-g' '-X'    '-Y'

  • then reboot :
# shutdown -Fr

  • check TL levels with instfix command :
# oslevel -s
# instfix -i | egrep "ML|SP"
   All filesets for 6100-00_AIX_ML were found.
   All filesets for 6.1.0.0_AIX_ML were found.
   All filesets for 6100-01_AIX_ML were found.
   All filesets for 6100-02_AIX_ML were found.
   All filesets for 6100-03_AIX_ML were found.
   All filesets for 6100-04_AIX_ML were found.
   All filesets for 6100-05_AIX_ML were found.
   All filesets for 6100-06_AIX_ML were found.
# lppchk -vm3

  • check that the two cluster nodes are identical with the compare_report command :
# ssh node1 "lslpp -Lc" > /tmp/node1_report
# ssh node2 "lslpp -Lc" > /tmp/node2_report
# compare_report -b /tmp/node1_report -o /tmp/node2_report -l
# compare_report -b /tmp/node1_report -o /tmp/node2_report -h
# compare_report -b /tmp/node1_report -o /tmp/node2_report -m
# compare_report -b /tmp/node1_report -o /tmp/node2_report -n

  • fix errors if any exist.
Post configuration checks
  • Identify the data LUNs and the heartbeat LUNs :
    • heartbeat LUNs are (in this case) 1024 MB TDEVs; two heartbeat LUNs are needed, one per site, and they need to be masked to both servers without SRDF replication (a filter sketch follows the listings below) :
      • on 2666 SAN array :
# symdev -sid 2666 list pd

Symmetrix ID: 000292602666

        Device Name           Directors                  Device
--------------------------- ------------- -------------------------------------
                                                                           Cap
Sym  Physical               SA :P DA :IT  Config        Attribute    Sts   (MB)
--------------------------- ------------- -------------------------------------

0144 /dev/rhdisk14          12E:1 06A:D6  2-Way Mir     N/Grp'd ACLX RW       3
16EC /dev/rhdiskpower7      12E:1  NA:NA  TDEV          N/Grp'd      RW    1024

      • on 2667 SAN array :
# symdev -sid 2667 list pd

Symmetrix ID: 000292602667

        Device Name           Directors                  Device
--------------------------- ------------- -------------------------------------
                                                                           Cap
Sym  Physical               SA :P DA :IT  Config        Attribute    Sts   (MB)
--------------------------- ------------- -------------------------------------

31F3 /dev/rhdiskpower0      10E:1  NA:NA  TDEV          N/Grp'd      RW   32768
31F4 /dev/rhdiskpower1      10E:1  NA:NA  TDEV          N/Grp'd      RW   32768
35CA /dev/rhdiskpower2      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  262148
35D2 /dev/rhdiskpower3      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  131074
35D6 /dev/rhdiskpower4      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  131074
35DA /dev/rhdiskpower5      10E:1  NA:NA  RDF1+TDEV     N/Grp'd      RW   32768
35DB /dev/rhdiskpower6      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  262148
3666 /dev/rhdiskpower12     10E:1  NA:NA  TDEV          N/Grp'd      RW    1024
3769 /dev/rhdiskpower8      10E:1 15A:D3  2-Way Mir     N/Grp'd      RW       3
376A /dev/rhdiskpower9      10E:1 03A:D3  2-Way Mir     N/Grp'd      RW       3
376B /dev/rhdiskpower10     10E:1 07B:C5  2-Way Mir     N/Grp'd      RW       3
376C /dev/rhdiskpower11     10E:1 12A:C5  2-Way Mir     N/Grp'd      RW       3
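
  • the 1024 MB heartbeat candidates can be filtered out of the listings above with a simple awk on the capacity column (a sketch, the capacity is the last field) :
# symdev -sid 2666 list pd | awk '/ TDEV / && !/RDF/ && $NF == 1024'
# symdev -sid 2667 list pd | awk '/ TDEV / && !/RDF/ && $NF == 1024'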

  • two heartbeat LUNs are identified :
    • ID 3666 : on 2667 SAN array.
    • ID 16EC : on 2666 SAN array.
  • check these LUNs are masked on each cluster node :
    • node1 :
# /usr/lpp/EMC/Symmetrix/bin/inq.aix64_51 -f_powerpath
/dev/rhdiskpower0  :EMC     :SYMMETRIX       :5874  :67!vr000   :    33554880
/dev/rhdiskpower1  :EMC     :SYMMETRIX       :5874  :67!vs000   :    33554880
/dev/rhdiskpower2  :EMC     :SYMMETRIX       :5874  :67"&S000   :   268439040
/dev/rhdiskpower3  :EMC     :SYMMETRIX       :5874  :67"&[000   :   134219520
/dev/rhdiskpower4  :EMC     :SYMMETRIX       :5874  :67"&_000   :   134219520
/dev/rhdiskpower5  :EMC     :SYMMETRIX       :5874  :67"&i000   :    33554880
/dev/rhdiskpower6  :EMC     :SYMMETRIX       :5874  :67"&j000   :   268439040
/dev/rhdiskpower7  :EMC     :SYMMETRIX       :5874  :666EC008   :     1048320
/dev/rhdiskpower8  :EMC     :SYMMETRIX       :5874  :67",<000   :        2880
/dev/rhdiskpower9  :EMC     :SYMMETRIX       :5874  :67",=000   :        2880
/dev/rhdiskpower10 :EMC     :SYMMETRIX       :5874  :67",>000   :        2880
/dev/rhdiskpower11 :EMC     :SYMMETRIX       :5874  :67",?000   :        2880
/dev/rhdiskpower12 :EMC     :SYMMETRIX       :5874  :67"(_000   :     1048320
# powermt display dev=hdiskpower7 | grep "Logical device ID"
Logical device ID=16EC
# powermt display dev=hdiskpower12 | grep "Logical device ID"
Logical device ID=3666

    • node2 :
# /usr/lpp/EMC/Symmetrix/bin/inq.aix64_51 -f_powerpath
/dev/rhdiskpower0  :EMC     :SYMMETRIX       :5874  :662FF008   :    33554880
/dev/rhdiskpower1  :EMC     :SYMMETRIX       :5874  :66300008   :    33554880
/dev/rhdiskpower2  :EMC     :SYMMETRIX       :5874  :66301008   :      -----
/dev/rhdiskpower3  :EMC     :SYMMETRIX       :5874  :66309008   :      -----
/dev/rhdiskpower4  :EMC     :SYMMETRIX       :5874  :6630D008   :      -----
/dev/rhdiskpower5  :EMC     :SYMMETRIX       :5874  :6630E008   :      -----
/dev/rhdiskpower6  :EMC     :SYMMETRIX       :5874  :66312008   :      -----
/dev/rhdiskpower7  :EMC     :SYMMETRIX       :5874  :666EC008   :     1048320
/dev/rhdiskpower8  :EMC     :SYMMETRIX       :5874  :667A1008   :        2880
/dev/rhdiskpower9  :EMC     :SYMMETRIX       :5874  :667A2008   :        2880
/dev/rhdiskpower10 :EMC     :SYMMETRIX       :5874  :667A3008   :        2880
/dev/rhdiskpower11 :EMC     :SYMMETRIX       :5874  :667A4008   :        2880
/dev/rhdiskpower12 :EMC     :SYMMETRIX       :5874  :67"(_000   :     1048320
# powermt display dev=hdiskpower7 | grep "Logical device ID"
Logical device ID=16EC
# powermt display dev=hdiskpower12 | grep "Logical device ID"
Logical device ID=3666

  • check the data LUNs. In this case there are five data LUNs; check that the data LUNs on node1 are RDF1 and the data LUNs on node2 are RDF2 :
    • 2 x 256 GB LUNs (meta).
    • 2 x 128 GB LUNs (meta).
    • 1 x 32 GB LUN.
  • node1 :
# symdev list pd -sid 2667
31F3 /dev/rhdiskpower0      10E:1  NA:NA  TDEV          N/Grp'd      RW   32768
31F4 /dev/rhdiskpower1      10E:1  NA:NA  TDEV          N/Grp'd      RW   32768
35CA /dev/rhdiskpower2      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  262148
35D2 /dev/rhdiskpower3      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  131074
35D6 /dev/rhdiskpower4      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  131074
35DA /dev/rhdiskpower5      10E:1  NA:NA  RDF1+TDEV     N/Grp'd      RW   32768
35DB /dev/rhdiskpower6      10E:1  NA:NA  RDF1+TDEV     N/Grp'd  (M) RW  262148
3666 /dev/rhdiskpower12     10E:1  NA:NA  TDEV          N/Grp'd      RW    1024
3769 /dev/rhdiskpower8      10E:1 15A:D3  2-Way Mir     N/Grp'd      RW       3
376A /dev/rhdiskpower9      10E:1 03A:D3  2-Way Mir     N/Grp'd      RW       3
376B /dev/rhdiskpower10     10E:1 07B:C5  2-Way Mir     N/Grp'd      RW       3
376C /dev/rhdiskpower11     10E:1 12A:C5  2-Way Mir     N/Grp'd      RW       3

  • node2 :
# symdev list pd -sid 2666
0144 /dev/rhdisk0           05E:1 06A:D6  2-Way Mir     N/Grp'd ACLX RW       3
12FF /dev/rhdiskpower0      05E:1  NA:NA  TDEV          N/Grp'd      RW   32768
1300 /dev/rhdiskpower1      05E:1  NA:NA  TDEV          N/Grp'd      RW   32768
1301 /dev/rhdiskpower2      05E:1  NA:NA  RDF2+TDEV     N/Grp'd  (M) WD  262148
1309 /dev/rhdiskpower3      05E:1  NA:NA  RDF2+TDEV     N/Grp'd  (M) WD  131074
130D /dev/rhdiskpower4      05E:1  NA:NA  RDF2+TDEV     N/Grp'd      WD   32768
130E /dev/rhdiskpower5      05E:1  NA:NA  RDF2+TDEV     N/Grp'd  (M) WD  131074
1312 /dev/rhdiskpower6      05E:1  NA:NA  RDF2+TDEV     N/Grp'd  (M) WD  262148
16EC /dev/rhdiskpower7      05E:1  NA:NA  TDEV          N/Grp'd      RW    1024
17A1 /dev/rhdiskpower8      05E:1 15A:D3  2-Way Mir     N/Grp'd      RW       3
17A2 /dev/rhdiskpower9      05E:1 13A:D3  2-Way Mir     N/Grp'd      RW       3
17A3 /dev/rhdiskpower10     05E:1 03A:D3  2-Way Mir     N/Grp'd      RW       3
17A4 /dev/rhdiskpower11     05E:1 07C:C6  2-Way Mir     N/Grp'd      RW       3

Cluster configuration

SNMP configuration
  • disable SNMP v3 for HACMP, and switch to v1 on each cluster node :
# /usr/sbin/snmpv3_ssw -1

IP configuration
/etc/hosts configuration
  • modify the /etc/hosts file, adding all the necessary IPs on each cluster node :
# hostent -d 10.246.58.31
# hostent -d 10.246.58.32
# hostent -a 7.0.0.9 -h infsav5c1l
# hostent -a 7.0.0.10 -h infsav5c2l
# hostent -a 10.245.214.53 -h infsav5c1
# hostent -a 10.245.214.54 -h infsav5c2
# hostent -a 10.245.214.55 -h infsav5c1a
# hostent -a 10.246.58.31 -h infsav5c1adm
# hostent -a 10.246.58.32 -h infsav5c2adm
# hostent -a 10.246.58.35 -h infsav5adm

system IP configuration
  • configure the primary and secondary IP addresses on each cluster node (a quick check is sketched after the node2 commands) :
    • node1 :
# mktcpip -h infsav5c1 -a 10.245.214.53 -i en2 -m 255.255.255.240 -g 10.245.214.62 -t 'N/A' -A yes -s
# route add -net 10.246.70.0 10.246.59.254
# chdev -l 'en0' -a netaddr='10.246.58.31' -a netmask='255.255.254.0' -a state='up'
# chdev -l 'en1' -a netaddr='7.0.0.9' -a netmask='255.255.255.240' -a state='up'

    • node2 :
# mktcpip -h infsav5c2 -a 10.245.214.54 -i en2 -m 255.255.255.240 -g 10.245.214.62 -t 'N/A' -A yes -s
# route add -net 10.246.70.0 10.246.59.254
# chdev -l 'en0' -a netaddr='10.246.58.32' -a netmask='255.255.254.0' -a state='up'
# chdev -l 'en1' -a netaddr='7.0.0.10' -a netmask='255.255.255.240' -a state='up'
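
  • the resulting addresses and routes can be checked on each node with the standard netstat commands :
# netstat -in
# netstat -rn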

/usr/es/sbin/cluster/etc/rhosts file
  • fill /usr/es/sbin/cluster/etc/rhosts with all the IP addresses used by the cluster :
# cat > /usr/es/sbin/cluster/etc/rhosts << EOF
10.246.58.31
10.246.58.32
10.246.58.35
7.0.0.9
7.0.0.10
10.245.214.53
10.245.214.54
10.245.214.55
EOF

cluster naming
cluster name
  • name the cluster :
# smit cm_config_nodes.add_dmn

* Cluster Name                                       [node]
  New Nodes (via selected communication paths)       []
  Currently Configured Node(s)                       node1

node names
  • node definitions :
    • node1 definition :
# smit cm_config_hacmp_nodes_menu_dmn
--> Change/Show a Node in the HACMP Cluster
* Node Name				node1
  New Node Name				[node1]
  Communication Path to Node		[node1]
  Persistent Node IP Label/Address

    • node2 definition :
# smit cm_config_hacmp_nodes_menu_dmn
--> Add a Node to the HACMP Cluster
* Node Name			[node2]
  Communication Path to Node	[node2]  

  • configuration check :
# cllsnode 
Node node1
Node node2

site names
  • site definitions :
    • site1 definition :
# smit cm_add_site
* Site Name		[site1]
* Site Nodes		node1

    • site2 definition :
# smit cm_add_site
* Site Name		[site2]
* Site Nodes		node2

    • configuration check :
# cllssite
site1 node1 NONE
site2 node2 NONE

cluster network configuration
cluster network creation
  • network definition :
  • There will be three different networks to define : one for administration, one for inter-cluster communication, and one for production :
    • define ether_nodeadm network :
# smit cm_add_a_network_to_the_hacmp_cluster_select
* Network Name                              [ether_nodeadm]
* Network Type                              ether
* Netmask(IPv4)/Prefix Length(IPv6)         [255.255.254.0]
* Enable IP Address Takeover via IP Aliases [Yes]
  IP Address Offset for Heartbeating over IP Aliases []

    • define ether_nodelm network :
# smit cm_add_a_network_to_the_hacmp_cluster_select
* Network Name                              [ether_nodelm]
* Network Type                              ether
* Netmask(IPv4)/Prefix Length(IPv6)         [255.255.255.240]
* Enable IP Address Takeover via IP Aliases [Yes]
  IP Address Offset for Heartbeating over IP Aliases []

    • define ether_node network :
# smit cm_add_a_network_to_the_hacmp_cluster_select
* Network Name                              [ether_node]
* Network Type                              ether
* Netmask(IPv4)/Prefix Length(IPv6)         [255.255.255.240]
* Enable IP Address Takeover via IP Aliases [Yes]
  IP Address Offset for Heartbeating over IP Aliases []

  • then run HACMP discover :
# smit cm_extended_config_menu_dmn
--> Discover HACMP-related Information from Configured Nodes

cluster interfaces configuration
  • interface definitions :
  • for each network, define the interfaces for each node :
    • ether_nodeadm :
# smit cm_config_hacmp_communication_interfaces_devices_menu_dmn
--> Add Communication Interfaces/Devices
  --> Add Pre-defined Communication Interfaces and Devices
    --> Communication Interfaces
       --> ether_nodeadm
* IP Label/Address                              [node1adm]
* Network Type                                  ether
* Network Name                                  ether_nodeadm
* Node Name                                     [node1]
  Network Interface                             [en0]

# smit cm_config_hacmp_communication_interfaces_devices_menu_dmn
--> Add Communication Interfaces/Devices
  --> Add Pre-defined Communication Interfaces and Devices
    --> Communication Interfaces
       --> ether_nodeadm
* IP Label/Address                              [node2adm]
* Network Type                                  ether
* Network Name                                  ether_nodeadm
* Node Name                                     [node2]
  Network Interface                             [en0]

  • ether_nodelm :
# smit cm_config_hacmp_communication_interfaces_devices_menu_dmn
--> Add Communication Interfaces/Devices
  --> Add Pre-defined Communication Interfaces and Devices
    --> Communication Interfaces
       --> ether_nodelm
* IP Label/Address                               [node1l]
* Network Type                                   ether
* Network Name                                   ether_nodelm
* Node Name                                      [node1]
  Network Interface                              [en1]

# smit cm_config_hacmp_communication_interfaces_devices_menu_dmn
--> Add Communication Interfaces/Devices
  --> Add Pre-defined Communication Interfaces and Devices
    --> Communication Interfaces
       --> ether_nodelm
* IP Label/Address                               [node2l]
* Network Type                                   ether
* Network Name                                   ether_nodelm
* Node Name                                      [node2]
  Network Interface                              [en1]

  • ether_node :
# smit cm_config_hacmp_communication_interfaces_devices_menu_dmn
--> Add Communication Interfaces/Devices
  --> Add Pre-defined Communication Interfaces and Devices
    --> Communication Interfaces
       --> ether_node
* IP Label/Address                                 [node1]
* Network Type                                     ether
* Network Name                                     ether_node
* Node Name                                        [node1]
  Network Interface                                [en2]

# smit cm_config_hacmp_communication_interfaces_devices_menu_dmn
--> Add Communication Interfaces/Devices
  --> Add Pre-defined Communication Interfaces and Devices
    --> Communication Interfaces
       --> ether_node
* IP Label/Address                                 [node2]
* Network Type                                     ether
* Network Name                                     ether_node
* Node Name                                        [node2]
  Network Interface                                [en2]

  • configuration check :
# cllsif
node1            boot ether_node 	ether public     node1  10.245.214.53	en2 255.255.255.240             28
node1adm         boot ether_nodeadm 	ether public     node1  10.246.58.31	en0 255.255.254.0               23
node1l           boot ether_nodelm 	ether public     node1  7.0.0.9         en1 255.255.255.240             28
node2            boot ether_node 	ether public     node2  10.245.214.54   en2 255.255.255.240             28
node2adm         boot ether_nodeadm 	ether public     node2  10.246.58.32    en0 255.255.254.0               23
node2l           boot ether_nodelm 	ether public     node2  7.0.0.10        en1 255.255.255.240             28

  • first synchronisation :
# smit clsync
* Verify, Synchronize or Both                        [Both]
* Automatically correct errors found during          [No]
  verification?
* Force synchronization if verification fails?       [No]
* Verify changes only?                               [No]
* Logging                                            [Standard]

service addresses configuration
  • service address definitions :
  • there will be three service IPs, one per network :
    • adm service IP :
# smit cm_config_hacmp_service_ip_labels_addresses_menu_dmn
--> Add a Service IP Label/Address
  --> Configurable on Multiple Nodes
    --> ether_nodeadm
* IP Label/Address                           nodeadm
  Netmask(IPv4)/Prefix Length(IPv6)          [255.255.254.0]
* Network Name                               ether_nodeadm
  Alternate Hardware Address to accompany IP Label/A []
  ddress
  Associated Site                                     ignore

  • production service IP :
# smit cm_config_hacmp_service_ip_labels_addresses_menu_dmn
--> Add a Service IP Label/Address
  --> Configurable on Multiple Nodes
    --> ether_node
* IP Label/Address                          node1a
  Netmask(IPv4)/Prefix Length(IPv6)         [255.255.255.240]
* Network Name                              ether_node
  Alternate Hardware Address to accompany IP Label/A []
  ddress
  Associated Site                                     ignore

resource group configuration
application server configuration
  • application server definition :
    • on each node, create the directory used for the HACMP EE application scripts (a placeholder skeleton is sketched after the smit screen below) :
# mkdir -p /usr/HACMP_TOOLS/scripts

    • then define the application server for the future resource group :
# smit claddserv.extended.dialog
* Server Name        [clnodeas]
* Start Script       [/usr/HACMP_TOOLS/scripts/start_node.sh]
* Stop Script        [/usr/HACMP_TOOLS/scripts/stop_node.sh]
Application Monitor Name(s)
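
    • the start and stop scripts are not delivered by any fileset and must contain the application start/stop logic on each node; a minimal placeholder (example content only, to be replaced by the real application commands) can be put in place so that verification passes :
# cat > /usr/HACMP_TOOLS/scripts/start_node.sh << 'EOF'
#!/bin/ksh
# placeholder : replace with the real application start commands
print "$(date) : start_node.sh called on $(hostname)" >> /tmp/start_node.log
exit 0
EOF
# cat > /usr/HACMP_TOOLS/scripts/stop_node.sh << 'EOF'
#!/bin/ksh
# placeholder : replace with the real application stop commands
print "$(date) : stop_node.sh called on $(hostname)" >> /tmp/stop_node.log
exit 0
EOF
# chmod +x /usr/HACMP_TOOLS/scripts/start_node.sh /usr/HACMP_TOOLS/scripts/stop_node.sh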

resource group configuration
  • resource group definition :
# smit config_resource_group.dialog.custom
* Resource Group Name                              [clnoderg]
* Participating Nodes (Default Node Priority)      [node1 node2]
  Startup Policy  Online On Home Node Only
  Fallover Policy Fallover To Next Priority Node In The List
  Fallback Policy Never Fallback  

heartbeat resource group configuration
  • heartbeat disk configuration :
    • on node1, define the heartbeat disks as PVs, then run cfgmgr on node2 :
      • node1 :
# chdev -l hdiskpower7 -a pv=yes
# chdev -l hdiskpower12 -a pv=yes

      • node2 :
# cfgmgr

  • an HACMP bug prevents heartbeat creation; modify the /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv script (the edit can also be scripted, see the sketch after the two changes below) :
  • at line 3731 :
before :
print $MKVG_OUT | read out_node DVG out_rest
after :
print $MKVG_OUT | read DVG out_node out_rest

  • at line 3816 :
before :
print $MKLV_OUT | read node_out DLV_NAME rest
after :
print $MKLV_OUT | read DLV_NAME node_out rest
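
  • assuming those two lines appear exactly as shown, the same edit can be scripted (keeping a backup of the original) :
# cp /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv.orig
# sed -e 's/read out_node DVG out_rest/read DVG out_node out_rest/' -e 's/read node_out DLV_NAME rest/read DLV_NAME node_out rest/' /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv > /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv.tmp
# mv /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv.tmp /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv
# chmod +x /usr/es/sbin/cluster/sbin/cl_mk_mndhb_lv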

  • then define the heartbeat resource groups :
    • site1 heartbeat :
# smit cl_manage_mndhb
--> Create a new Volume Group and Logical Volume for Multi-Node Disk Heartbeat
  --> 00f663982ddd76c5 ( hdiskpower12 on node1 )
  Volume Group Name                     [site1vg]
  Volume group MAJOR NUMBER             [37]
  PVID for Logical Volume               00f663982ddd76c5
  Logical Volume Name                   [site1lv]
  Physical partition SIZE in megabytes  4
  Node Names                            node1,node2
  Resource Group Name                   [site1rg]
  Network Name                          [site1net]

    • site2 heartbeat :
# smit cl_manage_mndhb
--> Create a new Volume Group and Logical Volume for Multi-Node Disk Heartbeat
  -->  00f663982ddea027 ( hdiskpower7 on node1 )
  Volume Group Name                      [site2vg]
  Volume group MAJOR NUMBER              [37]
  PVID for Logical Volume                00f663982ddea027
  Logical Volume Name                    [site2lv]
  Physical partition SIZE in megabytes   4
  Node Names                             node1,node2
  Resource Group Name                    [site2rg]
  Network Name                           [site2net]

inter-site policy
  • change the inter-site management policy to keep the resource group on site1 :
# smit cm_hacmp_extended_resource_group_config_menu_dmn
--> Change/Show a Resource Group
  --> clnoderg
  Resource Group Name                       clnoderg
  New Resource Group Name                   []
  Inter-site Management Policy          Prefer Primary Site 
* Participating Nodes from Primary Site     [node1] 
  Participating Nodes from Secondary Site   [node2]

  Startup Policy                       Online On Home Node O> 
  Fallover Policy                      Fallover To Next Prio>
  Fallback Policy                      Never Fallback   

volume groups
  • volume group creation :
  • HACMP needs to manage the SRDF replication when a resource group moves to the other site; a device group containing all the data disk pairs is created to manage the replication.
  • all resource group disks have to be in a device group.
    • list SRDF pairs :
# symrdf list pd
35CA 1301   R1:1    RW RW RW   S..1.       0        0 RW  WD   Synchronized
35D2 1309   R1:1    RW RW RW   S..1.       0        0 RW  WD   Synchronized
35D6 130E   R1:1    RW RW RW   S..1.       0        0 RW  WD   Synchronized
35DA 130D   R1:1    RW RW RW   S..1.       0        0 RW  WD   Synchronized
35DB 1312   R1:1    RW RW RW   S..1.       0        0 RW  WD   Synchronized 

    • create device group NODE1_RG1 and add R1 devices :
# symdg create NODE1_RG1 -type RDF1 
# symld -sid 2667 -g NODE1_RG1 add dev 35CA
# symld -sid 2667 -g NODE1_RG1 add dev 35D2
# symld -sid 2667 -g NODE1_RG1 add dev 35D6
# symld -sid 2667 -g NODE1_RG1 add dev 35DA
# symld -sid 2667 -g NODE1_RG1 add dev 35DB

    • check device group configuration :
# symdg show NODE1_RG1
[..]
DEV001  /dev/rhdiskpower2   35CA RDF1+TDEV (M)  RW    262148
DEV002  /dev/rhdiskpower3   35D2 RDF1+TDEV (M)  RW    131074
DEV003  /dev/rhdiskpower4   35D6 RDF1+TDEV (M)  RW    131074
DEV004  /dev/rhdiskpower5   35DA RDF1+TDEV      RW     32768
DEV005  /dev/rhdiskpower6   35DB RDF1+TDEV (M)  RW    262148
[..]

    • export the device group configuration to a file, and copy it to the second cluster node :
# symdg exportall -f /tmp/node1_rg1.cfg -rdf
# scp /tmp/node1_rg1.cfg node2adm:/tmp

    • import device group configuration on node2 :
# symdg importall -f /tmp/node1_rg1.cfg 
Creating device group 'NODE1_RG1'
Adding STD device 1301 as DEV001...
Adding STD device 1309 as DEV002...
Adding STD device 130E as DEV003...
Adding STD device 130D as DEV004...
Adding STD device 1312 as DEV005...
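
    • the imported group can then be checked on node2, for example :
# symdg show NODE1_RG1 | grep DEV0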

    • create the VGs, LVs and filesystems on node1, then split the SRDF pair :
# mkvg -f -S -s256M -y tsmmut1dbvg hdiskpower2
# mkvg -f -S -s256M -y tsmmut1aclogvg hdiskpower3
# mkvg -f -S -s256M -y tsmmut1arlogvg hdiskpower4 
# mkvg -f -S -s256M -y tsmstgp1vg hdiskpower6
# mkvg -f -S -s256M -y tsmmut1vg hdiskpower5
# chvg -an tsmmut1dbvg
# chvg -an tsmmut1aclogvg
# chvg -an tsmmut1arlogvg
# chvg -an tsmstgp1vg
# chvg -an tsmmut1vg
# mklv -t jfs2 -y tsmmut1lv tsmmut1vg 60
# mklv -t jfs2 -y tsmclient01lv tsmmut1vg 1
# mklv -t jfs2 -y tsmmut1db1lv tsmmut1dbvg 255
# mklv -t jfs2 -y tsmmut1db2lv tsmmut1dbvg 255
# mklv -t jfs2 -y tsmmut1db3lv tsmmut1dbvg 255
# mklv -t jfs2 -y tsmmut1db4lv tsmmut1dbvg 255
# mklv -t jfs2 -y tsmmut1acloglv tsmmut1aclogvg 255
# mklv -t jfs2 -y tsmmut1arloglv tsmmut1arlogvg 255
# mklv -t jfs2 -y tsmmut1poollv tsmstgp1vg 255
# crfs -d tsmmut1lv -v jfs2 -m /app/list/tsm/mut1/cfg -A no -a log="INLINE"
# crfs -d tsmmut1db1lv -v jfs2 -m /app/list/tsm/mut1/db1 -A no -a log="INLINE"
# crfs -d tsmmut1db2lv -v jfs2 -m /app/list/tsm/mut1/db2 -A no -a log="INLINE"
# crfs -d tsmmut1db3lv -v jfs2 -m /app/list/tsm/mut1/db3 -A no -a log="INLINE"
# crfs -d tsmmut1db4lv -v jfs2 -m /app/list/tsm/mut1/db4 -A no -a log="INLINE"
# crfs -d tsmclient01lv -v jfs2 -m /app/list/tsm/client/mut1 -A no -a log="INLINE"
# crfs -d tsmmut1acloglv  -v jfs2 -m /app/list/tsm/mut1/aclog -A no -a log="INLINE"
# crfs -d tsmmut1arloglv  -v jfs2 -m /app/list/tsm/mut1/arlog -A no -a log="INLINE"
# crfs -d tsmmut1poollv -v jfs2 -m /app/list/tsm/mut1/pool -A no -a log="INLINE"
# varyoffvg tsmmut1dbvg
# varyoffvg tsmmut1aclogvg
# varyoffvg tsmmut1arlogvg
# varyoffvg tsmstgp1vg
# varyoffvg tsmmut1vg
# symrdf -cg NODE1_RG1 split -force
Execute an RDF 'Split' operation for composite
group 'NODE1_RG1' (y/[n]) ? y
An RDF 'Split' operation execution is
in progress for composite group 'NODE1_RG1'. Please wait...

    Pend I/O on RDF link(s) for device(s) in (2666,001).............Done.
    Suspend RDF link(s) for device(s) in (2666,001)..................Done.
    Read/Write Enable device(s) in (2666,001) on SA at target (R2)...Done.
    Read/Write Enable device(s) in (2666,001) on RA at target (R2)...Done.

The RDF 'Split' operation successfully executed for
composite group 'NODE1_RG1'.

  • composite group creation :
    • edit /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr to correct an HACMP EE bug :
# cp /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr.orig
# sed 's/timestamp=`date`/timestamp=`date +%Y%m%d`/g' /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr > /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr.tmp
# mv /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr.tmp /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr
# chmod +x /usr/es/sbin/cluster/sr/cmds/cl_confirm_sr

    • then create the SRDF composite group :
# smit claddsr.cmdhdr
* EMC SRDF(R) Composite Group Name         [NODE1_RG1CG]
* EMC SRDF(R) Mode                         SYNC
  Device Groups                            NODE1_RG1
* Recovery Action                          AUTO
* Consistency Enabled                      YES

  • synchronise cluster :
# smit cl_sync
Verify changes only?                      [No]
* Logging                                 [Standard]

  • the SRDF pair has to be split beforehand (done above).
  • import the volume groups on node2 :
# cfgmgr
# importvg -y tsmmut1stg1vg 00f663982e1002cb
# importvg -y tsmmut1logvg 00f663982e0af9ae
# importvg -y tsmmut1vg 00f663982e0a63f7
# importvg -y tsmmut1dbvg 00f663982e09b9f0
# varyoffvg tsmmut1stg1vg
# varyoffvg tsmmut1logvg
# varyoffvg tsmmut1vg
# varyoffvg tsmmut1dbvg
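
  • check on node2 that the imported volume groups are known (first command) but not varied on (second command should return nothing) :
# lsvg | grep tsm
# lsvg -o | grep tsm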

  • re-establish SRDF pair :
# symrdf -cg NODE1_RG1CG establish
Execute an RDF 'Incremental Establish' operation for composite
group 'NODE1_RG1CG' (y/[n]) ? y

An RDF 'Incremental Establish' operation execution is
in progress for composite group 'NODE1_RG1CG'. Please wait...

    Write Disable device(s) in (2667,001) on RA at target (R2).......Done.
    Suspend RDF link(s) for device(s) in (2667,001)..................Done.
    Resume RDF link(s) for device(s) in (2667,001)...................Started.
    Merge track tables between source and target in (2667,001).......Started.
    Devices: 35CA-35E2 in (2667,001)................................ Merged.
    Merge track tables between source and target in (2667,001).......Done.
    Resume RDF link(s) for device(s) in (2667,001)...................Done.

The RDF 'Incremental Establish' operation successfully initiated for
composite group 'NODE1_RG1CG'.

additional resource group configuration
  • additional configuration on the resource group :
    • define the VGs and service IPs on the resource group :
# smit cm_hacmp_extended_resource_group_config_menu_dmn
--> Change/Show Resources and Attributes for a Resource Group
 --> clnoderg
Resource Group Name                        clnoderg
Inter-site Management Policy               Prefer Primary Site
Participating Nodes from Primary Site      node1
Participating Nodes from Secondary Site    node2

Startup Policy      Online On Home Node Only
Fallover Policy     Fallover To Next Priority Node In The List
Fallback Policy     Never Fallback

Service IP Labels/Addresses          [nodeadm node1a]
Application Servers                  [clnodeas]
Volume Groups                        [tsmmut1dbvg tsmmut1arlogvg tsmmut1vg tsmstgp1vg tsmmut1aclogvg]
Use forced varyon of volume groups, if necessary    false
Automatically Import Volume Groups                  false
Filesystems (empty is ALL for VGs specified)       [ ]
Filesystems Consistency Check                       fsck
Filesystems Recovery Method                         parallel
Filesystems mounted before IP configured            false
Filesystems/Directories to Export (NFSv2/3)        []
Filesystems/Directories to NFS Mount               []
Network For NFS Mount                              []
Tape Resources                                     []
Raw Disk PVIDs                                     []
Fast Connect Services                              []
Communication Links                                []
Primary Workload Manager Class                     []
Secondary Workload Manager Class                   []
Miscellaneous Data                                 []
WPAR Name                                          []
EMC SRDF(R) Replicated Resources              [NODE1_RG1CG]

  • finish the cluster configuration and synchronize it one last time :
# smit cl_sync
Verify changes only?                      [No]
* Logging                                 [Standard]

additional configuration

  • disable emcp_mond on each node :
# rmitab rcemcp_mond

  • heartbeat link configuration :
# smit cm_config_network
--> Change a Network Module using Pre-defined Values
  --> diskhb
* Network Module Name     diskhb
Description               Disk Heartbeating Protocol
Failure Detection Rate    Normal  

  • syncd synchronisation configuration :
# /usr/es/sbin/cluster/utilities/clchsyncd '10'

  • I/O pacing configuration, on each cluster node :
# chdev -l sys0 -a maxpout='00' -a minpout='00'

  • config_too_long event timer configuration :
# /usr/es/sbin/cluster/utilities/clchmsgtimer -e'300' -g'300'

  • last verification :
# smit cm_initialization_and_standard_config_menu_dmn
-->  Verify and Synchronize HACMP Cluster Configuration