corosync-setup.tex

% Corosync setup


\chapter{Corosync setup}
\label{chap:corosync-setup}
This chapter explains the Corosync setup.

\section {About Corosync}
The Corosync Cluster Engine is a group communication system for implementing HA within applications. Among its features we can find:
\begin{itemize}
  \item Create replicated state machines thanks to a closed process group communication model
  \item Restart application process if it fails thanks to simple availability manager
  \item A configuration and statistics in-memory database
  \item A quorum system that notifies applications when quorum is achieved or lost.
\end{itemize}


The software is designed to operate on UDP/IP and InfiniBand networks natively.

We can find more information about Corosync. Either at its Wikipedia article (\cite{WikipediaCorosync}) or at its web page (\cite{CorosyncWebpage}).

Corosync enables a group communication system so that Pacemaker can talk to all the cluster nodes without having to implement communication capabilities.

\section {Corosync installation}
\textbf{In both nodes} we just need to install Corosync packages for Ubuntu 12.04.

We need to issue this command:
\begin{verbatim}
apt-get install corosync
\end{verbatim}

% COROSYNC Setup

\section {Corosync.conf}

The corosync.conf file on both computers it will be modified to use upnp so that we can use in non multicast environments. In order to use upnp we need to use a corosync version higher than 1.3 but that is not a problem because current version is higher than that.

\subsection {\label{subsec:primary-server-corosync-conf}Primary server Corosync.conf}

We create the file:
\begin{verbatim}
/etc/corosync/corosync.conf
\end{verbatim}
which its contents will be:

\begin{verbatim}
# Please read the openais.conf.5 manual page

totem {
	version: 2

	# How long before declaring a token lost (ms)
	token: 5000

	# How many token retransmits before
	# forming a new configuration
	token_retransmits_before_loss_const: 20

	# How long to wait for join messages
	# in the membership protocol (ms)
	join: 1000

	# How long to wait for consensus to be achieved 
	# before starting a new round of
	# membership configuration (ms)
	consensus: 7500

	# Turn off the virtual synchrony filter
	vsftype: none

	# Number of messages that may be sent by one
	# processor on receipt of the token
	max_messages: 20

	# Limit generated nodeids to 31-bits
	# (positive signed integers)
	clear_node_high_bit: yes

	# Disable encryption
 	secauth: off

	# How many threads to use for encryption/decryption
 	threads: 0

	# Optionally assign a fixed node id (integer)
	 #nodeid: 1234


        rrp_mode: passive


interface {
		member {
			memberaddr: 10.0.66.201
		}
		member {
			memberaddr: 10.0.66.202
		}
        ringnumber: 0
        bindnetaddr: 10.0.66.201
        mcastport: 5405
        }
		transport: udpu
}
amf {
	mode: disabled
}

service {
 	# Load the Pacemaker Cluster Resource Manager
 	ver:       0
 	name:      pacemaker
}

aisexec {
        user:   root
        group:  root
}

logging {
        fileline: off
        to_stderr: yes
        to_logfile: no
        to_syslog: yes
	syslog_facility: daemon
        debug: off
        timestamp: on
   logger_subsys {
           subsys: AMF
           debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
   }
}
\end{verbatim}

Among other settings the configuration defines that there are two node members which their ips are: 10.0.66.201 and 10.0.66.202. The transport must use unicast protocol (udpu) through 5405 port (and 5404 too). Logging is set to output into syslog and stderr but without debug output. As we are using unicast we need to define the current node ip as the multicast one: 10.0.66.202.

\subsection {Secondary server Corosync.conf}

We create the file:
\begin{verbatim}
/etc/corosync/corosync.conf
\end{verbatim}
which its contents will be:

\begin{verbatim}
# Please read the openais.conf.5 manual page

totem {
	version: 2

	# How long before declaring a token lost (ms)
	token: 5000

	# How many token retransmits before
	# forming a new configuration
	token_retransmits_before_loss_const: 20

	# How long to wait for join messages
	# in the membership protocol (ms)
	join: 1000

	# How long to wait for consensus to be achieved 
	# before starting a new round of
	# membership configuration (ms)
	consensus: 7500

	# Turn off the virtual synchrony filter
	vsftype: none

	# Number of messages that may be sent by one
	# processor on receipt of the token
	max_messages: 20

	# Limit generated nodeids to 31-bits
	# (positive signed integers)
	clear_node_high_bit: yes

	# Disable encryption
 	secauth: off

	# How many threads to use for encryption/decryption
 	threads: 0

	# Optionally assign a fixed node id (integer)
	 #nodeid: 1234


        rrp_mode: passive


interface {
		member {
			memberaddr: 10.0.66.201
		}
		member {
			memberaddr: 10.0.66.202
		}
        ringnumber: 0
        bindnetaddr: 10.0.66.202
        mcastport: 5405
        }
		transport: udpu
}
amf {
	mode: disabled
}

service {
 	# Load the Pacemaker Cluster Resource Manager
 	ver:       0
 	name:      pacemaker
}

aisexec {
        user:   root
        group:  root
}

logging {
        fileline: off
        to_stderr: yes
        to_logfile: no
        to_syslog: yes
	syslog_facility: daemon
        debug: off
        timestamp: on
   logger_subsys {
           subsys: AMF
           debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
   }
}
\end{verbatim}

You can check subsection \textbf{\ref{subsec:primary-server-corosync-conf} Primary server Corosync.conf} for configuration file settings explanation.

\section {Corosync's Authkey}
In \textbf{primary node} we will create the file:
\begin{verbatim}
/etc/corosync/authkey
\end{verbatim}
thanks to running:
\begin{verbatim}
corosync-keygen
\end{verbatim}
We might be requested to press keys on our keyboard to generate entropy.

Once the file created we will copy the very same file to the \textbf{secondary node} in the same path as in primary server.

In order to secure it in secondary node we will need to run:
\begin{verbatim}
chown root:root /etc/corosync/authkey
chmod 400 /etc/corosync/authkey
\end{verbatim}

\section {Cfgtool}

In \textbf{both nodes} we need to edit:
\begin{verbatim}
/etc/rc.local
\end{verbatim}
in order to add the line:
\begin{verbatim}
/usr/sbin/corosync-cfgtool -r
\end{verbatim}
just before the:
\begin{verbatim}
exit 0
\end{verbatim}
line.

This way we make sure that redudant ring state is reset cluster wide after a fault to re-enable redundant ring.

\section {Corosync startup enabling}
In \textbf{both nodes} in order to enable Corosync at boot we need to edit:
\begin{verbatim}
/etc/default/corosync
\end{verbatim}
file so that:
\begin{verbatim}
START=no
\end{verbatim}
line becomes:
\begin{verbatim}
START=yes
\end{verbatim}
.