IGI (based on EMI) Installation and Configuration

Common Installation

OS installation

Install SL5 using SL5.X repository (CERN mirror) or one of the supported OS (RHEL5 clones).

You may find information on official OS repositories at Repositories for APT and YUM.
If you want to set up a local installation server please refer to the Mrepo Quick Guide.

NOTE: Please check that NTP, cron and logrotate are installed; otherwise install them!

Check the FQDN hostname

Ensure that the hostnames of your machines are correctly set. Run the command:

hostname -f

It should print the fully qualified domain name (e.g. prod-ce.mydomain.it). Correct your network configuration if it prints only the hostname without the domain. If you are installing WNs on a private network, the command must return the external FQDN for the CE and the SE (e.g. prod-ce.mydomain.it) and the internal FQDN for the WNs (e.g. node001.myintdomain).
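
The check above can be scripted so that a misconfigured node is flagged early. This is only a minimal sketch: the function name is_fqdn is an illustrative assumption, and it only tests that the name has a domain part, not that the name resolves in DNS.

```shell
#!/bin/sh
# is_fqdn NAME: succeed if NAME contains a dot with text on both sides of it.
# (Hypothetical helper, not part of YAIM or of the middleware.)
is_fqdn() {
    case "$1" in
        ?*.?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Check the local host; "hostname -f" can fail when the resolver is misconfigured,
# in which case the name is empty and the warning branch is taken.
name=$(hostname -f 2>/dev/null || true)
if is_fqdn "$name"; then
    echo "OK: $name"
else
    echo "WARNING: '$name' is not a fully qualified domain name"
fi
```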

Disabling SELinux

Please remember to fully disable SELinux. Disabling it completely turns off all SELinux functions, including file and process labelling. In RedHat Enterprise, edit /etc/selinux/config and change the SELINUX line to SELINUX=disabled:
# disabled - No SELinux policy is loaded.
SELINUX=disabled

... and then reboot the system.
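
To confirm that the edit took effect, the configured value can be read back from the file. The sketch below assumes the standard layout of /etc/selinux/config; the function name selinux_config_state is hypothetical.

```shell
#!/bin/sh
# selinux_config_state FILE: print the value of the last active SELINUX= line.
# Commented lines and the SELINUXTYPE= line are ignored by the pattern.
selinux_config_state() {
    sed -n 's/^SELINUX=\([A-Za-z]*\).*$/\1/p' "$1" | tail -n 1
}

# After the reboot, the running state can be cross-checked with getenforce,
# when that command is available on the host.
if command -v getenforce >/dev/null 2>&1; then
    getenforce || true
fi
```

On a correctly configured node, calling selinux_config_state /etc/selinux/config should print "disabled".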

EMI release useful link

If you don't find the information you need in this documentation, please also have a look at the official EMI documentation:

Repository Settings

For more details on the repositories, have a look at Repository Specifications.

If not present by default on your SL5/x86_64 nodes, you should enable the EPEL repository (https://fedoraproject.org/wiki/EPEL)

If not present by default on your SL6/x86_64 nodes, you should enable the EPEL repository (https://fedoraproject.org/wiki/EPEL)

EPEL has an epel-release package that includes the gpg key used for package signing and other repository information, like the .repo files.
Installing this package allows you to use normal tools such as yum to install packages and their dependencies.

By default the stable EPEL repo is enabled. Example of epel-5.repo file:

[extras]
name=epel
mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=epel-5&arch=$basearch
protect=0

IMPORTANT NOTE:

  • If dag.repo is present and enabled, remember to disable it.

You need to have enabled only the following repositories: Operating System, EPEL, Certification Authority, EMI, IGI. Please see the details below:

SL5 x86_64, EMI 1:
  • EPEL 5 repo - please use epel-release-5-4.noarch.rpm
  • EMI 1 repos - please use emi-release-1.0.1-1.sl5.noarch.rpm
  • EGI trust anchors repo - egi-trustanchors.repo
  • IGI (1) repo - igi-emi.repo

SL5 x86_64, EMI 2:
  • EPEL 5 repo - please use epel-release-5-4.noarch.rpm
  • EMI 2 repos - please use emi-release-2.0.0-1.sl5.noarch.rpm
  • EGI trust anchors repo - egi-trustanchors.repo
  • IGI (2) repo - igi-emi2.repo

SL6 x86_64, EMI 2:
  • EPEL 6 repo - please use epel-release-6-8.noarch.rpm
  • EMI 2 repos - please install emi-release-2.0.0-1.sl6.noarch.rpm
  • EGI trust anchors repo - egi-trustanchors.repo
  • IGI (2) repo - igi-emi2.repo
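
To see which repositories are actually enabled on a node (for example, to spot a forgotten dag.repo), the .repo files can be scanned directly. The following is a rough sketch, assuming the usual yum convention that a section without an "enabled" line counts as enabled; the function name enabled_repos is hypothetical.

```shell
#!/bin/sh
# enabled_repos FILE...: print the name of every repository section that is
# not switched off with "enabled=0" in the given .repo files.
enabled_repos() {
    awk '
        function emit() { if (name != "" && enabled) print name }
        /^\[.*\]/ {
            emit()                                   # finish the previous section
            name = substr($0, 2, index($0, "]") - 2) # text between [ and ]
            enabled = 1                              # yum default when not stated
            next
        }
        /^[ \t]*enabled[ \t]*=[ \t]*0/ { enabled = 0 }
        END { emit() }
    ' "$@"
}
```

A typical call would be enabled_repos /etc/yum.repos.d/*.repo; the output can then be compared against the list of repositories given above.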

It is strongly recommended to use the latest version of the emi-release packages mentioned above, as they contain the EMI public key and the yum .repo files that ensure the precedence of EMI repositories over EPEL.

Example for the EMI 1 case:

# rpm --import http://emisoft.web.cern.ch/emisoft/dist/EMI/1/RPM-GPG-KEY-emi
# wget http://repo-pd.italiangrid.it/mrepo/EMI/1/sl5/x86_64/updates/emi-release-1.0.1-1.sl5.noarch.rpm
# yum localinstall emi-release-1.0.1-1.sl5.noarch.rpm

Updating from EMI 1 to EMI 2

For the update to EMI 2 you have to uninstall the old emi-release package and install the new one:

# rpm -e emi-release
# rpm -ivh http://emisoft.web.cern.ch/emisoft/dist/EMI/2/sl5/x86_64/base/emi-release-2.0.0-1.sl5.noarch.rpm

Important note on automatic updates

Several sites use automatic update mechanisms. Sometimes middleware updates require non-trivial configuration changes or a reconfiguration of the service. This could involve service restarts, new configuration files, etc., which makes it difficult to ensure that automatic updates will not break a service. Thus

WE STRONGLY RECOMMEND NOT TO USE AUTOMATIC UPDATE PROCEDURE OF ANY KIND

on the IGI/EMI middleware repositories (you can keep it turned on for the OS). You should read the update information provided by each service and do the upgrade manually when an update has been released!

Tips & Tricks:

Generic Configuration

Configuration files

IGI YAIM configuration files

YAIM configuration files should be stored in a directory structure. All the involved files HAVE to be under the same folder <confdir>, in a safe place, which is not world readable. This directory should contain:

File or Directory Scope Example Details
<your-site-info.def> whole-site ig-site-info.def List of configuration variables in the format of key-value pairs.
It's a mandatory file.
It's a parameter passed to the ig_yaim command.
IMPORTANT: You should always check that your <your-site-info.def> is up to date by comparing it with the latest /opt/glite/yaim/examples/siteinfo/ig-site-info.def template deployed with ig-yaim, and merge in any differences you find.
For example you may use vimdiff:
vimdiff /opt/glite/yaim/examples/siteinfo/ig-site-info.def <confdir>/<your-site-info.def>
<your-wn-list.conf> whole-site - Worker nodes list in the format of hostname.domainname per row.
It's a mandatory file.
It's defined by WN_LIST variable in <your-site-info.def>.
<your-users.conf> whole-site ig-users.conf Pool account user mapping.
It's a mandatory file.
It's defined by USERS_CONF variable in <your-site-info.def>.
IMPORTANT: You may create <your-users.conf> starting from the /opt/glite/yaim/examples/ig-users.conf template deployed with ig-yaim, but you will probably have to fill it in according to your site policy on uids/gids. We suggest proceeding as explained in "Whole site: How to create local users.conf and configure users" (http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:use_cases:users).
<your-groups.conf> whole-site ig-groups.conf VOMS group mapping.
It's a mandatory file.
It's defined by GROUPS_CONF variable in <your-site-info.def>.
IMPORTANT: You may create <your-groups.conf> starting from the /opt/glite/yaim/examples/ig-groups.conf template deployed with ig-yaim.
<vo.d> whole-site vo.d This directory should contain the VO name files holding the VO configuration.
It's not a mandatory directory, but it can be useful.
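
As a complement to an interactive comparison of the template and your own file, the variable names can be compared mechanically. The sketch below is an illustration under simple assumptions: it only looks at uncommented NAME=value lines, and the function names list_vars and missing_vars are hypothetical.

```shell
#!/bin/sh
# list_vars FILE: print the sorted names of all uncommented NAME=value lines.
list_vars() {
    sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | sort -u
}

# missing_vars TEMPLATE SITEFILE: print variables set in TEMPLATE but not in SITEFILE.
missing_vars() {
    have=$(mktemp) || return 1
    list_vars "$2" > "$have"
    # -x whole-line match, -F fixed strings, -v invert: keep names not in SITEFILE
    list_vars "$1" | grep -vxF -f "$have" || true
    rm -f "$have"
}
```

For example, missing_vars /opt/glite/yaim/examples/siteinfo/ig-site-info.def <confdir>/<your-site-info.def> lists template variables that your file does not set yet. Variables that are commented out in the template are not reported.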

Known issues

  • BDII_DELETE_DELAY default value is missing for services other than BDII site & top.
    • Workaround - add to your site-info.def BDII_DELETE_DELAY=0
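
The workaround can be applied idempotently, so that running it twice does not duplicate the line. This is a small sketch; the function name ensure_var is hypothetical, and it only detects assignments that start at the beginning of a line.

```shell
#!/bin/sh
# ensure_var FILE NAME VALUE: append NAME=VALUE to FILE unless NAME is already set there.
ensure_var() {
    if ! grep -q "^$2=" "$1" 2>/dev/null; then
        printf '%s=%s\n' "$2" "$3" >> "$1"
    fi
}
```

For the issue above the call would be ensure_var <confdir>/<your-site-info.def> BDII_DELETE_DELAY 0.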

Additional files

Furthermore the configuration folder can contain:

Directory Scope Details
services/ service-specific It contains a file per nodetype with the name format: ig-node-type.
The file contains a list of configuration variables specific to that nodetype.
Each yaim module distributes a configuration file in /opt/glite/yaim/examples/siteinfo/services/[ig or glite]-node-type.
It's a mandatory directory if required by the profile and you should copy it under the same directory where <your-site-info.def> is.
nodes/ host-specific It contains a file per host with the name format: hostname.domainname.
The file contains host specific variables that are different from one host to another in a certain site.
It's an optional directory.
vo.d/ VO-specific It contains a file per VO with the name format: vo_name, but most of VO settings are still placed in ig-site-info.def template. For example, for ”lights.infn.it”:
# cat vo.d/lights.infn.it
SW_DIR=$VO_SW_DIR/lights
DEFAULT_SE=$SE_HOST
VOMS_SERVERS="vomss://voms2.cnaf.infn.it:8443/voms/lights.infn.it?/lights.infn.it"
VOMSES="lights.infn.it voms2.cnaf.infn.it 15013 /C=IT/O=INFN/OU=Host/L=CNAF/CN=voms2.cnaf.infn.it lights.infn.it"

It's an optional directory for “normal” VOs (like atlas, alice, babar), mandatory only for “fqdn-like” VOs. In case you support such VOs you should copy the structure vo.d/<vo.specific.file> under the same directory where <your-site-info.def> is.

group.d/ VO-specific It contains a file per VO with the name format: groups-<vo_name>.conf.
The file contains VO specific groups and it replaces the former <your-groups.conf> file where all the VO groups were specified all together.
It's an optional directory.

The optional folders are created to allow system administrators to organise their configurations in a more structured way.
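
The directory layout described in this section can be prepared in one step. The sketch below creates the sub-directories and removes all group and world permissions, since the configuration folder must not be world readable. The function name init_confdir is hypothetical, and the configuration files themselves still have to be copied in from the templates.

```shell
#!/bin/sh
# init_confdir DIR: create the <confdir> skeleton and restrict it to its owner.
init_confdir() {
    confdir=$1
    mkdir -p "$confdir/services" "$confdir/nodes" \
             "$confdir/vo.d" "$confdir/group.d" || return 1
    # Drop every group/other permission on the whole tree.
    chmod -R go-rwx "$confdir"
}
```

For example, init_confdir /root/siteinfo (the path is only an example) yields a folder that only its owner can read.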

NTP Configuration

Check that NTP is installed on your OS.
If you have installed the middleware and the package glite-yaim-core is present on your host, get the igi-emi.repo repository file; it provides an rpm called yaim-addons. Please install it:

wget -O /etc/yum.repos.d/igi-emi.repo http://repo-pd.italiangrid.it/mrepo/repos/igi/sl5/x86_64/igi-emi.repo
yum install yaim-addons

After that you can use yaim to configure NTP:

/opt/glite/yaim/bin/yaim -r -d 6 -s <site-info.def> -n <node_type> -f config_ntp

BDII Site installation and Configuration

Have a look at the section Repository Settings of this documentation and ensure that you have the common .repo files.
Before starting the installation procedure remember to clean all yum cache and headers:

yum clean all

CAs installation:

  • Install CAs on ALL profiles:
yum install ca-policy-egi-core

Service installation

  • Install the BDII_site metapackage, containing all packages needed by this service:
# yum install emi-bdii-site 

Service Configuration

To properly configure the BDII site profile you have to customize this file with your site parameters:

If you would like to customize the BDII_site service you can modify the variables in the service-specific file in the services/ directory. You will find an example in:

/opt/glite/yaim/examples/siteinfo/services/glite-bdii_site

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables and that all configuration files contain all the site-specific values needed:
 /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n BDII_site 

The mandatory variables are:

SITE_DESC
SITE_EMAIL
SITE_NAME
SITE_LOC
SITE_LAT
SITE_LONG
SITE_WEB
SITE_SECURITY_EMAIL
SITE_SUPPORT_EMAIL
SITE_OTHER_GRID
SITE_BDII_HOST
BDII_REGIONS

Most of these are in the file ig-bdii_site in the services directory (the best approach is to modify that file). Remember in particular to set:

SITE_OTHER_GRID="WLCG|EGI"
SITE_OTHER_EGI_NGI="NGI_IT"

If no errors are reported you can proceed to the configuration, otherwise correct them before continuing.
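
Before running the YAIM verification, a quick textual pre-check can show which of the mandatory variables listed above have no assignment in your files. This is only a sketch and does not replace yaim -v: it looks for plain NAME= lines, and the names BDII_SITE_MANDATORY and undefined_vars are hypothetical.

```shell
#!/bin/sh
# The mandatory BDII_site variables, as listed in this section.
BDII_SITE_MANDATORY="SITE_DESC SITE_EMAIL SITE_NAME SITE_LOC SITE_LAT SITE_LONG
SITE_WEB SITE_SECURITY_EMAIL SITE_SUPPORT_EMAIL SITE_OTHER_GRID SITE_BDII_HOST BDII_REGIONS"

# undefined_vars "VAR1 VAR2 ..." FILE...: print each variable that has no
# NAME= assignment in any of the given files.
undefined_vars() {
    vars=$1
    shift
    for v in $vars; do
        grep -q "^[[:space:]]*$v=" "$@" 2>/dev/null || echo "$v"
    done
}
```

A possible call is undefined_vars "$BDII_SITE_MANDATORY" <confdir>/<your-site-info.def> <confdir>/services/ig-bdii_site; an empty output means every variable is assigned somewhere.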

YAIM Configuration

Please use the debug flag ( "-d 6") to configure the services in order to have detailed information. For your convenience you can save all the configuration information in a log file you can look at any time, separated from the yaimlog default one.

/opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def> -n BDII_site 2>&1 | tee /root/conf_BDII.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log
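
One detail of the command above is that, because of the pipe, the shell reports the exit status of tee rather than that of yaim. If you drive the configuration from a script and want to react to failures, a small wrapper can keep both the log and the status. This is a sketch in plain POSIX sh; the function name run_logged is hypothetical.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]: run COMMAND, copy its stdout and stderr
# to LOGFILE while still showing them, and return COMMAND's own exit status.
run_logged() {
    log=$1
    shift
    status_file=$(mktemp) || return 1
    # The status is saved to a file because the left side of a pipe runs in a subshell.
    { rc=0; "$@" 2>&1 || rc=$?; echo "$rc" > "$status_file"; } | tee "$log"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}
```

It could be used as run_logged "/root/conf_BDII.$(hostname -s).$(date +%Y-%m-%d-%H-%M-%S).log" /opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def> -n BDII_site, followed by a test of the return value.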

Service Testing - Reference Card

After the service installation, to check that everything was installed properly, have a look at the Service BDII_site Reference Card. There you can find where all the log files are written, which daemons are running after installation and other useful service information.

Documentation References:

BDII Top installation and Configuration

Have a look at the section Repository Settings of this documentation and ensure that you have the common repo files.
Before starting the installation procedure remember to clean all yum cache and headers:

yum clean all

CAs installation:

  • Install CAs on ALL profiles:
yum install ca-policy-egi-core

Service installation

  • Install the BDII_top metapackage, containing all packages needed by this service:
yum install emi-bdii-top 

Service Configuration

To properly configure the BDII top profile you have to customize this file with your site parameters:

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables and that all configuration files contain all the site-specific values needed:
 /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n BDII_top 

The mandatory variable is:

BDII_HOST

If no errors are reported you can proceed to the configuration, otherwise correct them before continuing with the configuration.

YAIM Configuration

Please use the debug flag ( "-d 6") to configure the services in order to have detailed information. For your convenience you can save all the configuration information in a log file you can look at any time, separated from the default yaimlog one.

/opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def> -n BDII_top 2>&1 | tee /root/conf_BDII.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

Known Issue and Workaround

Missing /etc/bdii/gip/glite-info-site-defaults.conf : https://ggus.eu/tech/ticket_show.php?ticket=72561

Workaround: Check that the file exists and check its contents. If it is missing, do:

echo "SITE_NAME=" > /etc/bdii/gip/glite-info-site-defaults.conf

Check also the ownership of the directory /opt/glite/var/cache/gip; if it is not ldap:ldap, change it:

chown -R ldap:ldap /opt/glite/var/cache/gip 
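
The file part of the workaround can be written so that it never overwrites an existing, non-empty file. The sketch below is an illustration only: the function name ensure_site_defaults is hypothetical, and treating an empty file like a missing one is an assumption of this sketch.

```shell
#!/bin/sh
# ensure_site_defaults FILE: create FILE with an empty SITE_NAME entry
# when it is missing or empty; leave any existing content untouched.
ensure_site_defaults() {
    conf=$1
    if [ ! -s "$conf" ]; then
        echo "SITE_NAME=" > "$conf"
        echo "created $conf"
    else
        echo "$conf already present, left unchanged"
    fi
}
```

On the BDII host the call would be ensure_site_defaults /etc/bdii/gip/glite-info-site-defaults.conf.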

Service Testing - Reference Card

After the service installation, to check that everything was installed properly, have a look at the Service BDII_top Reference Card. There you can find where all the log files are written, which daemons are running after installation and other useful service information.

Documentation References:

StoRM installation and Configuration

Have a look at the section Repository Settings and ensure that you have the common repo files.
NOTE: Globus has been updated to a higher version, and some profiles such as StoRM and DPM have problems with this version, so please use the UMD repositories. Have a look at Repository Settings.
Before starting the installation procedure remember to clean all yum cache and headers:

yum clean all

IMPORTANT NOTES:

  • StoRM Backend v. 1.8.2.2 (EMI 1) - If the Storage Area root of a Storage Area residing on a GPFS filesystem is specified as a link pointing to the real directory, and the link resides on a non-GPFS filesystem, a sanity check at bootstrap will fail, preventing the service from starting.
    • As a workaround the user can change the definition of the Storage Area root to the real directory path rather than the link.
  • StoRM Backend v. 1.9.0 (EMI 2) - Due to a known issue in the StoRM backend this release is not suited for GPFS installations.
  • YAIM-STORM v. 4.2.1-3 - Starting from this release the XFS file system is managed by StoRM as a standard POSIX file system. All StoRM installations on XFS file systems from this release on must specify "posixfs" as file system type at YAIM configuration time. See StoRM System Administration Guide 1.3.3 for more details.
    • Description: Specifying "xfs" as file system type at YAIM configuration time will produce an invalid namespace.xml, preventing the BackEnd service from booting.
      • As a workaround XFS users have to specify "posixfs" as file system type.

StoRM Prerequisites

Host certificate installation:

Hosts participating in the StoRM-SE (FE, BE and GridFTP hosts) must be configured with X.509 certificates signed by a trusted Certification Authority (CA). Usually the hostcert.pem and hostkey.pem files are located in the /etc/grid-security/ directory, and they must have permissions 0644 and 0400 respectively:

Check existence

[~]# ls -l /etc/grid-security/hostkey.pem
-r-------- 1 root root 887 Mar 1 17:08 /etc/grid-security/hostkey.pem
[~]# ls -l /etc/grid-security/hostcert.pem
-rw-r--r-- 1 root root 1440 Mar 1 17:08 /etc/grid-security/hostcert.pem

Check expiration

[~]# openssl x509 -in hostcert.pem -noout -dates

Change permission: (if needed)

[~]# chmod 0400 hostkey.pem
[~]# chmod 0644 hostcert.pem
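
The permission check can also be done without reading the ls output by eye. The sketch below assumes GNU stat (as shipped with coreutils on Linux); the function names file_mode and check_mode are hypothetical.

```shell
#!/bin/sh
# file_mode FILE: print the octal permission bits of FILE (GNU stat assumed).
file_mode() {
    stat -c '%a' "$1"
}

# check_mode FILE EXPECTED: report whether FILE has exactly the mode EXPECTED.
check_mode() {
    actual=$(file_mode "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 has mode $2"
    else
        echo "WRONG: $1 has mode $actual, expected $2"
        return 1
    fi
}
```

With the values given above, the calls would be check_mode /etc/grid-security/hostkey.pem 400 and check_mode /etc/grid-security/hostcert.pem 644.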

ACL SUPPORT

This check must be done if you are installing a new StoRM; if you are updating your installation, or your storage already has ACLs enabled, you can skip it. StoRM uses ACLs on files and directories to implement its security model, and in doing so it uses native access to the file system. Therefore, to run properly, ACLs need to be enabled on the underlying file system (sometimes they are enabled by default) and must work properly.

Check ACL:

[~]# touch test
[~]# setfacl -m u:storm:rw test
Note: the storm user used to set the ACL entry must exist.
[~]# getfacl test
  # file: test
  # owner: root
  # group: root
  user::rw-
  user:storm:rw-
  group::r--
  mask::rw-
  other::r--

[~]# rm -f test

Install ACL (if needed):
If the getfacl and setfacl commands are not available on your host:

[~]# yum install acl

Enable ACL (if needed):
To enable ACL, you must add the acl property to the relevant file system in your /etc/fstab file. For example:

[~]# vi /etc/fstab
  ...
  /dev/hda3             /storage         ext3         defaults,acl            1 2
  ...

Then you need to remount the affected partitions as follows:

 [~]# mount -o remount /storage
This is valid for different file system types (e.g. ext3, xfs, gpfs and others).

EXTENDED ATTRIBUTE SUPPORT
StoRM uses the Extended Attributes (EA) on files to store some metadata related to the file (e.g. the checksum value); therefore, to run properly, EA support needs to be enabled on the underlying file system and must work properly.

Note: Depending on the OS kernel distribution, for Reiser3, ext2 and ext3 file systems the default kernel configuration might not enable the EA.

Check Extended Attribute Support:
 
[~]# touch testfile
[~]# setfattr -n user.testea -v test testfile
[~]# getfattr -d testfile
  # file: testfile
  user.testea="test"
[~]# rm -f testfile

Install attr (if needed):
If the getfattr and setfattr commands are not available on your host:

[~]# yum install attr

Enable EA (if needed):
To set extended attributes, you must add the user_xattr property to the relevant file systems in your /etc/fstab file. For example:

[~]# vi /etc/fstab
   ...
   /dev/hda3         /storage       ext3        defaults,acl,user_xattr     1 2
   ...

Then you need to remount the affected partitions as follows:

[~]# mount -o remount /storage
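
Whether the acl and user_xattr options are really listed for a mount point can be checked with a short script. This is a sketch that parses a file in fstab format (mount point in the second field, comma-separated options in the fourth); the function name fstab_has_option is hypothetical.

```shell
#!/bin/sh
# fstab_has_option FILE MOUNTPOINT OPTION: succeed if the entry for MOUNTPOINT
# in FILE lists OPTION among its comma-separated mount options.
fstab_has_option() {
    awk -v mp="$2" -v opt="$3" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == opt) found = 1
        }
        END { exit !found }
    ' "$1"
}
```

For example, fstab_has_option /etc/fstab /storage user_xattr checks the configured options. Since /proc/mounts uses the same field layout, the same function can be pointed at it to look at the options of the currently mounted file system.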

CAs installation:

  • Install CAs on ALL profiles:
yum install ca-policy-egi-core

Service installation

  • Install the StoRM metapackages, containing all packages needed by these four services. You can install StoRM on one host or on several hosts. The mandatory profiles to install are emi-storm-backend-mp, emi-storm-frontend-mp and emi-storm-globus-gridftp-mp. The other profiles are optional; have a look at the StoRM System Administrator Guide to determine whether you also need emi-storm-gridhttps-mp or checksum.

The most common installation using one host:

yum install emi-storm-backend-mp
yum install emi-storm-frontend-mp
yum install emi-storm-globus-gridftp-mp

Service Configuration

To properly configure the StoRM BackEnd and FrontEnd profiles you have to customize the ig-site-info.def file with your site parameters:

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables for all the StoRM profiles.

 /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n  se_storm_backend -n se_storm_frontend -n se_storm_gridftp
 

You can find all the mandatory variables in the System Administrator Guide, in the section GENERAL YAIM VARIABLES.

If no errors are reported with the verification you can proceed to the configuration, otherwise correct them before continuing with the configuration.

YAIM Configuration

Before configuring please pay attention:

  • if you are installing a new StoRM in a new host you can continue
  • if you are updating StoRM to a new release please follow this documentation containing useful information for the service upgrade and for the stored data files:

Please use the debug flag ( "-d 6") to configure the services in order to have detailed information. For your convenience you can save all the configuration information in a log file you can look at any time, separated from the default yaimlog one.

# /opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def> -n  se_storm_backend -n se_storm_frontend -n se_storm_gridftp 2>&1 | tee /root/conf_StoRM_BE_FE_Gftp.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

IMPORTANT NOTE: The order of the profiles is important and must be: -n se_storm_backend -n se_storm_frontend

Service Testing - Reference Card

After the service installation, to check that everything was installed properly, have a look at the Service StoRM Reference Card. There you can find where all the log files are written, which daemons are running after installation and other useful service information.

Documentation References:

CREAM CE using DGAS accounting - Installation and Configuration

Have a look at the section Repository Settings and ensure that you have the common repo files.
Before starting the installation procedure remember to clean all yum cache and headers:

# yum clean all

The CREAM CE Services were tested with gLite WN SL5/SL6 x86_64 and also with ig_WN SL5/SL6 x86_64.

CREAM CE Prerequisites

Host certificate installation:

  • All nodes except UI, WN and BDII require the host certificate/key files to be installed.
  • Contact your national Certification Authority (CA) to understand how to obtain a host certificate if you do not have one already.

Once you have obtained a valid certificate:

  • hostcert.pem - containing the machine public key
  • hostkey.pem - containing the machine private key

make sure to place the two files on the target node in the /etc/grid-security directory, and check that hostkey.pem is readable only by root (0400) and that the public key, hostcert.pem, is readable by everybody (0644).

Check existence

[~]# ls -l /etc/grid-security/hostkey.pem
-r-------- 1 root root 887 Mar 1 17:08 /etc/grid-security/hostkey.pem
[~]# ls -l /etc/grid-security/hostcert.pem
-rw-r--r-- 1 root root 1440 Mar 1 17:08 /etc/grid-security/hostcert.pem

Check expiration

[~]# openssl x509 -in hostcert.pem -noout -dates

Change permission: (if needed)

[~]# chmod 0400 hostkey.pem
[~]# chmod 0644 hostcert.pem

Batch System:

  • If you use LSF (licences are needed), the server/client installation must be done manually. Have a look at the Platform LSF documentation.
  • TORQUE server/client installation is done through the use of the -torque- metapackages, see below.

Access to batch system log files
Whatever kind of deployment you have, with the batch-system master (TORQUE or LSF) on a different machine than the CE or on the same one, you have to provide access to the batch system log files. You must set up a mechanism to transfer accounting logs to the CE:

  • through NFS (don't forget to set $BATCH_LOG_DIR and $DGAS_ACCT_DIR in <your-site-info.def> configuration file)
  • through a daily cron job to the directory defined in $BATCH_LOG_DIR and $DGAS_ACCT_DIR in <your-site-info.def> configuration file

CAs installation:

  • Install CAs on ALL profiles:
yum install ca-policy-egi-core

Middleware installation

  • Install the CREAM CE metapackages, containing all packages needed. Have a look at the CREAM CE System Administrator Guide before starting the installation.
# yum install xml-commons-apis 
# yum install emi-cream-ce

Batch System Utilities installation

After the installation of the CREAM CE metapackage it is necessary to install the batch system specific metapackage(s):

  • If you are running Torque, and your CREAM CE node is the torque master, install the emi-torque-server and emi-torque-utils metapackages:

# yum install emi-torque-server
# yum install emi-torque-utils

  • If you are running Torque, and your CREAM CE node is NOT the torque master, install the emi-torque-utils metapackage:

# yum install emi-torque-utils

IMPORTANT NOTE FOR TORQUE:

After the Torque installation you should have the version 2.5.7-7. Please check that munge is installed and enabled.

# rpm -qa | grep munge
munge-libs-0.5.8-8.el5
munge-0.5.8-8.el5

To enable munge on your torque cluster:

  • Install the munge package (if it is not installed) on your pbs_server, submission hosts and all worker node hosts in your cluster.
  • On one host generate a key with /usr/sbin/create-munge-key
  • Copy the key, /etc/munge/munge.key to your pbs_server, submission hosts and all worker node hosts on your cluster.
    Pay attention: the ownership and permissions of that file must be:
    -r-------- 1 munge munge 1024 Jan 03 09:57 munge.key
  • Start the munge daemon on these nodes and enable it at boot: service munge start && chkconfig munge on

  • If you are running LSF, install the emi-lsf-utils metapackage:
       # yum install emi-lsf-utils
       

DGAS_sensors installation & upgrade

Extended Release Notes:

IGI Release supports the use of DGAS as accounting system - please install also DGAS sensors on the CREAM node:

# yum install igi-dgas_sensors

Upgrade:

  • in case of an upgrade from a version < 3 - please follow section 3.1.1.2 Upgrade from a previous release from DGAS Sensors 4.0 guide
  • in case of an upgrade from 4.0.x to 4.0.13, after the updated packages are installed just restart the services:
       # service dgas-urcollector restart
       # service dgas-pushd restart
       

You can find more documentation in the DGAS Sensors 4.0 guide and on the DGAS Sensors Service Reference Card.

Middleware Configuration

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables for all the CREAM CE profiles.

For Torque:

 /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n  creamCE -n TORQUE_server -n TORQUE_utils -n DGAS_sensors
 

For LSF:

 /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n  creamCE -n LSF_utils -n DGAS_sensors
 

You can find all the mandatory variables in this documentation: YAIM CREAM CE Variables.

If no errors are reported with the verification you can proceed to the configuration, otherwise correct them before continuing with the configuration.

Configuration Suggestions:

Blparser:
We suggest using the new BLAH Blparser, which runs on the CREAM CE machine and is automatically installed when installing the CREAM CE. The configuration of the new BLAH Blparser is done when configuring the CREAM CE (i.e. it is not necessary to configure the Blparser separately from the CREAM CE).

To use the new BLAH blparser, it is just necessary to set:

BLPARSER_WITH_UPDATER_NOTIFIER=true

ARGUS:
If you have an ARGUS server installed at your site or at a central site, we suggest using it. Please set the proper variables:

USE_ARGUS=yes

In this case it is also necessary to set the following yaim variables:

  • ARGUS_PEPD_ENDPOINTS: the endpoint of the ARGUS box (e.g. "https://cream-43.pd.infn.it:8154/authz")
  • CREAM_PEPC_RESOURCEID: the id of the CREAM CE in the ARGUS box (e.g. "http://pd.infn.it/cream-18")

If instead gJAF should be used as authorization system, yaim variable USE_ARGUS must be set in the following way:

USE_ARGUS=no
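
Putting the ARGUS settings above together, the corresponding fragment of the YAIM configuration might look like the sketch below. The endpoint and resource id are simply the example values quoted in this section and must be replaced with those of your own ARGUS server and CREAM CE.

```shell
# Authorization through ARGUS (example values taken from the text above;
# replace them with the values of your own site).
USE_ARGUS=yes
ARGUS_PEPD_ENDPOINTS="https://cream-43.pd.infn.it:8154/authz"
CREAM_PEPC_RESOURCEID="http://pd.infn.it/cream-18"
```

When gJAF is used instead, only USE_ARGUS=no is needed and the two ARGUS-specific variables can be left out.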

DGAS_sensors: For DGAS_sensors you should customize the services file. You can find in this path an example:

 /opt/glite/yaim/examples/siteinfo/services/dgas_sensors 

YAIM Configuration

Please use the debug flag ( "-d 6") to configure the services in order to have detailed information. For your convenience you can save all the configuration information in a log file you can look at any time, separated from the default yaimlog one.

IMPORTANT NOTE:
An error was found when starting pbs_server with Torque, so if you are configuring the PBS server node, remember to start pbs_server before launching yaim:

/etc/init.d/pbs_server start

For Torque:

# /opt/glite/yaim/bin/yaim -c -d 6 -s  <site-info.def>  -n  creamCE -n TORQUE_server -n TORQUE_utils -n DGAS_sensors 2>&1 | tee /root/conf_CREAM_Torque_DGAS.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

For LSF:

# /opt/glite/yaim/bin/yaim -c -d 6 -s  <site-info.def>  -n  creamCE -n LSF_utils -n DGAS_sensors 2>&1 | tee /root/conf_CREAM_LSF_DGAS.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

Service Testing - Reference Card

After the service installation, to check that everything was installed properly, have a look at the Service CREAM Reference Card and also at the Service Troubleshooting Guide. There you can find where all the log files are written, which daemons are running after installation and other useful service information.

CREAM Documentation References:

* Functional Description
* Software Design Description
* User Guide
* Client Installation and Configuration
* Client Configuration Template
* Man Pages/Online Help
* User Troubleshooting Guide
* API Documentation
* Error Code Documentation
* System Administrator Guides
* Service Reference Card
* Service Troubleshooting Guide
* Other Documentation available here

DGAS Sensors Documentation References:

WN Installation and Configuration

  • Supported platforms: SL5/x86_64 & SL6/x86_64

  • Have a look at the section Repository Settings and ensure that you have the common repo files.
  • Before starting the installation procedure remember to clean all yum cache and headers:
    # yum clean all

WN Prerequisites

Batch System:

  • If you use LSF (licences are needed), the server/client installation must be done manually. Have a look at the Platform LSF documentation.
  • TORQUE server/client installation is done through the use of the -torque- metapackages, see below.

CAs installation:

  • Install CAs on ALL profiles:
    # yum install ca-policy-egi-core

Service installation

  • Install the WN metapackage:
    • EMI 1:
      # yum install <metapackage> emi-version openldap-clients python-ldap 
    • EMI 2:
      # yum install <metapackage> 

  • IGI provides 5 custom profiles (metapackages), one for each specific batch system used in your cluster (see the table below)
    • IGI customizations:
      • unique metapackages & configurations for WNs with support for the batch systems Torque & LSF
      • contains (where available) the "LCG Applications Dependency Metapackage" - HEP_OSLib, compat-gcc-34-g77, compat-libgcc-296, compat-libstdc++-296, gcc-c++, ghostscript, lapack, ncurses, openafs, openafs-client, openldap-clients
      • contains yaim-addons - new configuration package with IGI-custom configuration functions and files (replaces ig-yaim), like - "ssh passwordless"

Profile: WN

Metapackage (INSTALLATION)    Nodetype (CONFIGURATION)
igi-wn                        WN
igi-wn_lsf                    WN_LSF
igi-wn_lsf_noafs              WN_LSF_noafs
igi-wn_torque                 WN_torque
igi-wn_torque_noafs           WN_torque_noafs

Note: the igi-wn metapackage is useful for SGE or other Batch System

IMPORTANT NOTE:
Metapackage and nodetype names are case sensitive, both for metapackage installation and nodetype configuration. PLEASE use the ones in the table above.
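
Because the names are case sensitive, a wrapper script can reject a mistyped nodetype before it is handed to yaim. This is a small sketch; the function name valid_wn_nodetype is hypothetical, and the accepted names are the five WN nodetypes listed in the table above.

```shell
#!/bin/sh
# valid_wn_nodetype NAME: succeed only for the exact (case-sensitive) WN nodetype names.
valid_wn_nodetype() {
    case "$1" in
        WN|WN_LSF|WN_LSF_noafs|WN_torque|WN_torque_noafs) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, valid_wn_nodetype WN_torque succeeds, while valid_wn_nodetype wn_torque or valid_wn_nodetype WN_Torque fails.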

IMPORTANT NOTE FOR TORQUE:
After the Torque installation you should have the version 2.5.7-7 (SL5) or 2.5.7-9 (SL6)
Please remember to copy the munge key from the batch master to the WN just installed
# scp <batch master host>:/etc/munge/munge.key /etc/munge 

For more details please read Deployment Notes of TORQUE WN config - it applies to all versions >= 2.5.7-1

Service Configuration

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables for all the WN profiles (WN_torque, WN_torque_noafs, WN_LSF, WN_LSF_noafs)

  • For Torque:
    # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  WN_torque 
    or
    # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  WN_torque_noafs 

  • For LSF:
    # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  WN_LSF
    or
    # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  WN_LSF_noafs

  • For Other Batch System:
    # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  WN

You can find all mandatory variables in this documentation: YAIM WN Variables

If no errors are reported during verification you can proceed with the configuration, otherwise correct them before continuing.

YAIM Configuration

  • Please use the debug flag ( "-d 6") to configure the services in order to have detailed information.
  • For your convenience you can save the output of each configuration run in a separate log file, distinct from the default yaimlog, which contains the history of all configurations.

IMPORTANT NOTE:
All nodetype names are case sensitive; write them exactly as described above.

  • For Torque:
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <your-site-info.def>  -n  WN_torque 2>&1 | tee /root/conf_WN_torque.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log
    or
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <your-site-info.def>  -n  WN_torque_noafs 2>&1 | tee /root/conf_WN_torque_noafs.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

IMPORTANT NOTE:
On SL6/x86_64 the first configuration with yaim prints the following warnings:
 WARNING: /var/lib/torque/mom_priv/config already exists, YAIM will not touch it
 WARNING: Batch server defined in BATCH_SERVER variable is different 
 WARNING: from the batch server defined under /var/lib/torque/mom_priv/config
 WARNING: Remove /var/lib/torque/mom_priv/config and reconfigure again to use the new value! 

This is due to the file /var/lib/torque/mom_priv/config, which is provided by the SL6 torque-mom package (torque-mom-2.5.7-9.el6.x86_64). Remove the file as recommended and reconfigure.
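
Before removing the packaged file it can help to see which server it names, and to keep a copy. The sketch below assumes the file uses the usual pbs_mom "$pbsserver <host>" line; both function names and the ".rpmorig" suffix are hypothetical, and moving the file aside is a variant of the removal described above, not a different procedure.

```shell
# Hypothetical helper: print the host named on the "$pbsserver" line of a
# pbs_mom config file (prints nothing if there is no such line).
mom_config_server() {
    awk '$1 == "$pbsserver" { print $2; exit }' "$1"
}

# Hypothetical helper: move the packaged file aside instead of deleting it,
# so that it can still be inspected after reconfiguring.
set_aside_mom_config() {
    cfg=${1:-/var/lib/torque/mom_priv/config}
    if [ -f "$cfg" ]; then
        mv "$cfg" "$cfg.rpmorig"
    fi
}

# Example (as root on the WN):
#   mom_config_server /var/lib/torque/mom_priv/config
#   set_aside_mom_config
#   ... then run the yaim configuration command again
```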

  • For LSF:
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <your-site-info.def> -n  WN_LSF 2>&1 | tee /root/conf_WN_LSF.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log
    or
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <your-site-info.def>  -n  WN_LSF_noafs 2>&1 | tee /root/conf_WN_LSF_noafs.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

  • For Other Batch System:
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <your-site-info.def>  -n  WN 2>&1 | tee /root/conf_WN.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log
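
The log file names in the commands above all follow the same pattern, so they can be built in one place. A minimal sketch, assuming a hypothetical helper name yaim_log_name; it uses $(...) instead of backticks, which is easier to read and harder to mistype.

```shell
# Hypothetical helper: build a per-run log file name following the pattern
# used above: /root/conf_<nodetype>.<short hostname>.<timestamp>.log
yaim_log_name() {
    printf '/root/conf_%s.%s.%s.log\n' \
        "$1" "$(hostname -s)" "$(date +%Y-%m-%d-%H-%M-%S)"
}

# Example:
#   log=$(yaim_log_name WN_torque)
#   /opt/glite/yaim/bin/yaim -c -d 6 -s <your-site-info.def> -n WN_torque 2>&1 | tee "$log"
```

Note that with "| tee" the exit status of the pipeline is the one of tee, not of yaim, so the log itself has to be checked for errors.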

Known Issues

  • ig_WN SL6 version:
    • misses the dependency on HEP_OSLib, available from linuxsoft/wlcg. One has to install the latest version available there.
    • in SL6 there is no portmap daemon. It should be replaced with rpcbind in the yaim configuration function config_nfs_sw_dir_client. It should be fixed in a future version of the yaim-addons package.
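
Until the fix is released, a script that has to work on both SL5 and SL6 can pick the service name from the OS release. This is a sketch with hypothetical helper names; it is not the code of config_nfs_sw_dir_client.

```shell
# Hypothetical helper: map an SL/RHEL major release number to the name of
# the RPC port mapper service on that release.
rpc_service_for_release() {
    case "$1" in
        5) echo portmap ;;
        6) echo rpcbind ;;
        *) echo "unsupported release: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: extract the major release number from a release file
# such as /etc/redhat-release ("Scientific Linux release 6.4 (Carbon)").
major_release() {
    sed -n 's/.*release \([0-9][0-9]*\).*/\1/p' "${1:-/etc/redhat-release}"
}

# Example (as root):
#   service "$(rpc_service_for_release "$(major_release)")" status
```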

Service Testing - Reference Card

After service installation you could have a look at WN - Service Reference Card. In this page you can find information on what daemons are running, log files, cron jobs, etc.

WN Documentation References:

UI Installation and Configuration

  • Supported platforms: SL5/x86_64 & SL6/x86_64

  • Have a look at the section Repository Settings and ensure that you have the common repo files
  • Before starting the installation procedure remember to clean all yum cache and headers:
    # yum clean all

CAs installation:

  • Install CAs on ALL profiles:
    # yum install ca-policy-egi-core

Service installation

  • Have a look at the UI documentation before starting to install: UI Guides.
  • Install the UI metapackage, which contains all the available clients:
    # yum install emi-ui

Service Configuration

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables for the UI nodetype:
     # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  UI 

You can find all mandatory variables in this documentation: YAIM UI Variables

If no errors are reported during verification you can proceed with the configuration, otherwise correct them before continuing.

YAIM Configuration

  • Please use the debug flag ( "-d 6") to configure the services in order to have detailed information.
  • For your convenience you can save the output of each configuration run in a separate log file, distinct from the default yaimlog, which contains the history of all configurations:
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <your-site-info.def>  -n  UI 2>&1 | tee /root/conf_UI.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log 

Service Testing - Reference Card

After service installation you could have a look at the UI Reference Card or the User Troubleshooting Guide. These pages list some common errors and other useful service information.

UI Documentation References:

MPI Installation and Configuration

  • Have a look at the section Repository Settings and ensure that you have the common repo files
  • Before starting the installation procedure remember to clean all yum cache and headers:
# yum clean all 

MPI Service installation on CE-CREAM

  • Follow the CE-CREAM installation guide to install a creamCE
  • Install the MPI metapackage. It provides mpi_start and glite-yaim-mpi for configuration

# yum install glite-mpi 

IMPORTANT NOTE:
Please pay attention to the metapackage name! It is glite-mpi, NOT emi-mpi or the previous one glite-MPI_utils.
Please install glite-mpi both in the creamCE and WNs nodes (IGI/EMI flavour)

MPI Service installation on WN

  • Follow the WN installation guide
  • Install the MPI metapackage. It provides mpi_start and glite-yaim-mpi for configuration

# yum install glite-mpi 
  • WNs also require a working MPI implementation; Open MPI and MPICH-2 are recommended. The devel packages should also be installed so that users can compile their applications. Refer to your OS repositories for the exact packages.
    • If you would like to use the Open MPI flavour at your site please install:
      # yum install openmpi openmpi-devel 
    • If you would like to use the MPICH2 flavour in your site please install:
      # yum install mpich2  mpich2-devel

IMPORTANT NOTE:
If you are using Torque remember to create the munge key, and copy it to all cluster hosts (CE, Batch Master, WNs): Munge configuration

Service Configuration

Useful Variables

  • Remember to copy these three files in your services/ directory:
    • /opt/glite/yaim/examples/siteinfo/services/glite-mpi
    • /opt/glite/yaim/examples/siteinfo/services/glite-mpi_ce
    • /opt/glite/yaim/examples/siteinfo/services/glite-mpi_wn
  • Set the variables in the above files in the services/ directory properly. In particular customize these important values:

File name | Variable | Common value | Description
glite-mpi | MPI_MPICH_ENABLE | MPI_MPICH_ENABLE="no" | Support for MPICH Flavour
glite-mpi | MPI_MPICH2_ENABLE | MPI_MPICH2_ENABLE="yes" | Support for MPICH2 Flavour
glite-mpi | MPI_OPENMPI_ENABLE | MPI_OPENMPI_ENABLE="yes" | Support for OPENMPI Flavour
glite-mpi | MPI_MPICH2_PATH | MPI_MPICH2_PATH="/usr/lib64/mpich2/" | MPICH2 path
glite-mpi | MPI_MPICH2_VERSION | MPI_MPICH2_VERSION="1.2.1p1" | MPICH2 version
glite-mpi | MPI_OPENMPI_PATH | MPI_OPENMPI_PATH="/usr/lib64/openmpi/1.4-gcc/" | OPENMPI path
glite-mpi | MPI_OPENMPI_VERSION | MPI_OPENMPI_VERSION="1.4-4" | OPENMPI version
glite-mpi | MPI_MPICH_MPIEXEC | MPI_MPICH_MPIEXEC="/usr/bin/mpiexec" | MPICH MPIEXEC path
glite-mpi | MPI_MPICH2_MPIEXEC | MPI_MPICH2_MPIEXEC="/usr/bin/mpiexec" | MPICH2 MPIEXEC path
glite-mpi | MPI_OPENMPI_MPIEXEC | MPI_OPENMPI_MPIEXEC="/usr/lib64/openmpi/1.4-gcc/bin/mpiexec" | OPENMPI MPIEXEC path
glite-mpi | MPI_SSH_HOST_BASED_AUTH | MPI_SSH_HOST_BASED_AUTH=${MPI_SSH_HOST_BASED_AUTH:-"yes"} | Use the SSH Hostbased Authentication between your WNs
glite-mpi_ce | MPI_SUBMIT_FILTER | MPI_SUBMIT_FILTER=${MPI_SUBMIT_FILTER:-"yes"} | For Torque ensure that CPU allocation is performed correctly
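
The path and mpiexec values above depend on the packages actually installed on the WN, so it may be worth checking them before running YAIM. The sketch below sources a services/glite-mpi file in a subshell and lists the variables of enabled flavours whose targets do not exist; missing_mpi_paths is a hypothetical helper, not part of glite-yaim-mpi.

```shell
# Hypothetical helper: source a services/glite-mpi file in a subshell and
# print the name of every path variable of an enabled flavour whose target
# does not exist on this host. $1 should be a path containing a slash.
missing_mpi_paths() (
    . "$1" || exit 2
    if [ "$MPI_OPENMPI_ENABLE" = "yes" ]; then
        [ -d "$MPI_OPENMPI_PATH" ] || echo MPI_OPENMPI_PATH
        [ -x "$MPI_OPENMPI_MPIEXEC" ] || echo MPI_OPENMPI_MPIEXEC
    fi
    if [ "$MPI_MPICH2_ENABLE" = "yes" ]; then
        [ -d "$MPI_MPICH2_PATH" ] || echo MPI_MPICH2_PATH
        [ -x "$MPI_MPICH2_MPIEXEC" ] || echo MPI_MPICH2_MPIEXEC
    fi
    exit 0
)

# Example (run on a WN; the services/ location is site specific):
#   missing_mpi_paths ./services/glite-mpi
```

An empty output means every configured directory and mpiexec binary was found.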

YAIM Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables for the MPI profile
    • On CE-CREAM MPI Verification:
      # /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n MPI_CE -n creamCE -n TORQUE_server -n TORQUE_utils 
    • On WN MPI Verification:
      # /opt/glite/yaim/bin/yaim -v -s <site-info.def> -n MPI_WN -n WN_torque_noafs 
  • You can find in the "YAIM MPI Variables" documentation information about all mandatory variables.
    • If no errors are reported during the verification you can proceed to the configuration, otherwise correct them.

YAIM Configuration

  • Please use the debug flag ( "-d 6") to configure the services in order to have detailed information.
  • For your convenience you can save the output of each configuration run in a separate log file, distinct from the default yaimlog, which contains the history of all configurations.

IMPORTANT NOTE:
When configuring with yaim remember to put first the nodetype MPI_CE or MPI_WN

  • On CE-CREAM
# /opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def>  -n MPI_CE -n creamCE -n TORQUE_server -n TORQUE_utils 2>&1 | tee /root/conf_EMI_CREAM_Torque_MPI.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log 
  • On WN
#  /opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def> -n MPI_WN -n WN_torque_noafs  2>&1 | tee /root/conf_WN_Torque_MPI.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log 
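
Since the MPI nodetype has to come first, the "-n" arguments can be generated by a helper that refuses any other order. A sketch only; mpi_nodetype_args is a hypothetical name.

```shell
# Hypothetical helper: turn a list of nodetypes into "-n <nodetype>" arguments,
# refusing the list unless the MPI nodetype (MPI_CE or MPI_WN) comes first.
mpi_nodetype_args() {
    case "$1" in
        MPI_CE|MPI_WN) ;;
        *) echo "first nodetype must be MPI_CE or MPI_WN, got: $1" >&2
           return 1 ;;
    esac
    args=""
    for nt in "$@"; do
        args="$args -n $nt"
    done
    echo "${args# }"
}

# Example (the command substitution is left unquoted on purpose, so that the
# shell splits it into separate arguments):
#   /opt/glite/yaim/bin/yaim -c -d 6 -s <site-info.def> \
#       $(mpi_nodetype_args MPI_CE creamCE TORQUE_server TORQUE_utils)
```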

Check files in WN

  • If in YAIM Configuration step you choose to use MPI_SSH_HOST_BASED_AUTH (recommended) check these files:
    • /etc/ssh/sshd_config
    • /etc/ssh/shosts.equiv

  • The first one, sshd_config, should have the following variables set as below:
HostbasedAuthentication yes 
IgnoreUserKnownHosts yes
IgnoreRhosts yes

  • The second file, shosts.equiv, should contain the CE hostname, the default SE hostname and the hostnames of all the WNs. It may already have been created on the CE during the configuration process; in that case copy it from there to all WNs.
  • Restart the sshd service after modifying the files:
# service sshd restart
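
The three sshd_config settings can be checked with a short script before restarting sshd. This is a sketch with hypothetical function names; it looks at the first uncommented occurrence of each keyword and ignores Match blocks.

```shell
# Hypothetical helper: succeed if the first uncommented occurrence of OPTION
# in the given sshd_config file is set to "yes" (keywords are compared
# case-insensitively).
sshd_option_is_yes() {
    awk -v opt="$2" '
        tolower($1) == tolower(opt) { seen = 1; ok = (tolower($2) == "yes"); exit }
        END { exit (seen && ok) ? 0 : 1 }
    ' "$1"
}

# Hypothetical helper: report each required option that is not set to "yes".
check_hostbased_sshd() {
    rc=0
    for opt in HostbasedAuthentication IgnoreUserKnownHosts IgnoreRhosts; do
        sshd_option_is_yes "$1" "$opt" || { echo "not set: $opt"; rc=1; }
    done
    return $rc
}

# Example (as root on a WN):
#   check_hostbased_sshd /etc/ssh/sshd_config && sshd -t && service sshd restart
```

In the example, "sshd -t" only tests the configuration file for validity, so a syntax error is caught before the service is restarted.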

Service Testing - Reference Card

MPI Documentation References:

HLR Server Installation and Configuration

Extended Release Notes:

HLR Prerequisites

Hardware Requirements

  • The HLR Server host should be a real or virtual node with fast disk access.
  • The suggested requirements are:
    • CPU: 4/8 cores
    • Memory: 8/16GB RAM
    • Disk: minimum 200 GB of space for a first level HLR. (If you have an old HLR server please check the current size of your database and double the partition size.)
    • Network: open port TCP 56568 for inbound connectivity

Co-Hosting

  • Due to its critical nature the HLR Server should be installed as a stand-alone service.

Virtual vs. Physical

  • If you use a virtual host, ensure you are not using Virtio to access the storage holding the MySQL database.
  • Please use a physical disk partition for the filesystem hosting the DB

Operating System

  • HLR Server 4.0 is supported on Scientific Linux 5, x86_64 and Scientific Linux 6, x86_64

Host certificate installation:

  • The HLR host must be configured with X.509 certificates signed by a trusted Certification Authority (CA).
  • The hostcert.pem and hostkey.pem certificates are located in the /etc/grid-security/ directory, and they must have permission 0644 and 0400 respectively:
  • Check existence
       # ls -l /etc/grid-security/hostkey.pem
       -r-------- 1 root root 887 Mar 1 17:08 /etc/grid-security/hostkey.pem
       # ls -l /etc/grid-security/hostcert.pem
       -rw-r--r-- 1 root root 1440 Mar 1 17:08 /etc/grid-security/hostcert.pem
       
  • Check expiration
       # openssl x509 -in /etc/grid-security/hostcert.pem -noout -dates
  • Change permission: (if needed)
       # chmod 0400 /etc/grid-security/hostkey.pem
       # chmod 0644 /etc/grid-security/hostcert.pem
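
The permission and expiration checks can be combined in one script. The sketch below uses hypothetical function names, GNU stat, and the "-checkend" option of "openssl x509"; the 30-day default is an arbitrary choice, not a requirement of the HLR.

```shell
# Hypothetical helper: succeed if FILE has exactly the octal MODE (GNU stat).
has_mode() {
    [ "$(stat -c %a "$1" 2>/dev/null)" = "$2" ]
}

# Hypothetical helper: check both host credential files in DIR and report the
# first problem found. DAYS (default 30, an arbitrary choice) is how far
# ahead to look for certificate expiration.
check_host_credentials() {
    dir=${1:-/etc/grid-security}
    days=${2:-30}
    has_mode "$dir/hostcert.pem" 644 ||
        { echo "hostcert.pem: mode is not 0644"; return 1; }
    has_mode "$dir/hostkey.pem" 400 ||
        { echo "hostkey.pem: mode is not 0400"; return 1; }
    openssl x509 -in "$dir/hostcert.pem" -noout \
        -checkend $((days * 86400)) >/dev/null ||
        { echo "hostcert.pem: expires within $days days"; return 1; }
}

# Example (as root):
#   check_host_credentials /etc/grid-security 30 && echo "host credentials OK"
```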
       

Middleware Installation

  • Have a look at the section Repository Settings and ensure that you have the common repo files.
  • You can use any of the groups of repositories recommended in the SL5/x86_64 (EMI 1 or EMI 2) section or in the SL6/x86_64 (EMI 2) section.
  • Before starting the installation procedure remember to clean all yum cache and headers:
    # yum clean all

CAs installation:

  • Install CAs on ALL profiles:
    # yum install ca-policy-egi-core

HLR Server installation

  • Have a look at the HLR documentation before starting the installation: HLR Server 4.0 Guide.
  • Install the HLR metapackage, which contains all the packages needed:
    # yum install igi-hlr

Middleware Configuration

Verification

  • Before starting the configuration PLEASE TEST that you have defined all the mandatory variables for the HLR profile
    # /opt/glite/yaim/bin/yaim -v -s <your-site-info.def> -n  HLR 
  • If no errors are reported during verification you can proceed to the configuration, otherwise correct them before continuing.

HLR Server Configuration

  • Please use the debug flag ( "-d 6") to configure the services in order to have detailed information.
  • For your convenience you can save the output of each configuration run in a separate log file, distinct from the default yaimlog, which contains the history of all configurations.
     # /opt/glite/yaim/bin/yaim -c -d 6 -s  <site-info.def>  -n  HLR 2>&1 | tee /root/conf_igi-HLR.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log

Known Issue - IMPORTANT NOTE

After configuration, edit the file /etc/cron.d/dgas and change the command from /usr/sbin/dgas-hlr-translatedb to /usr/sbin/dgas-hlr-populateJobTransSummary, as in the example below:

cat /etc/cron.d/dgas 
# Cron file created by YAIM - don't modify it!
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# Update hlr database
*/4 * * * * root /usr/sbin/dgas-hlr-populateJobTransSummary > /dev/null 2>&1

This fix will be provided with the new yaim in the next release.
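
The edit can also be scripted. The sketch below uses a hypothetical function name and GNU sed; it keeps the backup copy outside /etc/cron.d so that cron cannot pick up the old file as a second job definition.

```shell
# Hypothetical helper: replace the translatedb command with
# populateJobTransSummary in the dgas cron file (GNU sed -i).
# The backup is written outside /etc/cron.d on purpose.
fix_dgas_cron() {
    file=${1:-/etc/cron.d/dgas}
    backup=${2:-/root/dgas.cron.orig}
    cp -p "$file" "$backup" || return 1
    sed -i 's|/usr/sbin/dgas-hlr-translatedb|/usr/sbin/dgas-hlr-populateJobTransSummary|' "$file"
}

# Example (as root):
#   fix_dgas_cron && grep dgas-hlr /etc/cron.d/dgas
```

Since the file is generated by YAIM, it is worth checking it again after any later reconfiguration.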

Post installation and configuration

  • After the installation and configuration of the HLR Server services, you should register all the resources of the CEs attached to your HLR Server. Please use the command below for each CE:
    # dgas-hlr-addadmin -Sa  <CE_DN>
  • Where <CE_DN> is something like "/C=IT/O=INFN/OU=Host/L=Padova/CN=prod-ce-01.pd.infn.it"
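
With many CEs the registration can be driven from a list. A minimal sketch, assuming a hypothetical text file with one CE DN per line; register_ces and the DRY_RUN switch are illustrative names, and only the dgas-hlr-addadmin -Sa call comes from the instructions above.

```shell
# Hypothetical helper: read one CE DN per line from a file (blank lines and
# lines starting with '#' are skipped) and register each one.
# Set DRY_RUN=1 to print the commands instead of running them.
register_ces() {
    while IFS= read -r dn; do
        case "$dn" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "dgas-hlr-addadmin -Sa \"$dn\""
        else
            # stdin is redirected so the command cannot consume the DN list
            dgas-hlr-addadmin -Sa "$dn" < /dev/null || echo "failed: $dn" >&2
        fi
    done < "$1"
}

# Example (placeholder file name):
#   DRY_RUN=1; register_ces /root/ce-dns.txt    # review the commands first
#   DRY_RUN=0; register_ces /root/ce-dns.txt
```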

Update From a previous release < 4.0:

  • IMPORTANT NOTE: some of the operations described below (like "translate DB", the start of hlrd or "populateJobTransSummary") can take a lot of time, from minutes to hours or days (!) depending on how big the DB is. Please plan an upgrade carefully!

  • Back up the old databases :
    • on HLR Server stop the dgas services and make a dump of the databases:
      • Stop the services :
        # /etc/init.d/glite-dgas-hlrd stop 
      • Check disk space
        • Ensure you have enough space in the disk:
          # df -h
        • You should have the space to contain the hlr and hlr_tmp database dump. (Mount an external partition or NFS partition if you don't have enough free space)
      • Make the dump :
                 # mysqldump --user=root --password hlr > /data/hlr.sql
                 # mysqldump --user=root --password hlr_tmp > /data/hlr_tmp.sql
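
The disk space check can be turned into a rough comparison between the size of the MySQL data directory and the free space on the dump target. This is a sketch with hypothetical function names; /var/lib/mysql is assumed to be the data directory, and the on-disk size is only an estimate of the dump size, not a guarantee.

```shell
# Hypothetical helper: print the free space, in KB, of the filesystem
# holding the given directory.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: print the space, in KB, used by a directory tree.
used_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Hypothetical helper: succeed if TARGET has at least as much free space as
# DATADIR currently occupies (a rough estimate of the dump size).
enough_space_for_dump() {
    [ "$(free_kb "$2")" -ge "$(used_kb "$1")" ]
}

# Example (as root; /var/lib/mysql is an assumed data directory):
#   enough_space_for_dump /var/lib/mysql /data || echo "not enough space in /data"
```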
                 
  • Installation :
    • Reinstall the host with the SL5 or SL6 x86_64 distribution, or install a new host where the HLR Server will run.
    • Following the instructions in the previous section install the HLR Server and configure it using yaim.
  • Stop the HLR server process:
    # /etc/init.d/dgas-hlrd stop
  • Restore the dump :
       # mysql -u root -p hlr_tmp < /data/hlr_tmp.sql
       # mysql -u root -p hlr < /data/hlr.sql
       
  • Execute translate DB:
    # nohup /usr/sbin/dgas-hlr-translatedb -D & 
  • Start dgas services :
    # /etc/init.d/dgas-hlrd start 
  • Execute populateJobTransSummary:
    • The new HLR version needs to populate the JobTransSummary
      # /usr/sbin/dgas-hlr-populateJobTransSummary 
  • Restart dgas services :
    # /etc/init.d/dgas-hlrd restart 
  • In the file /etc/cron.d/dgas change the command from /usr/sbin/dgas-hlr-translatedb to /usr/sbin/dgas-hlr-populateJobTransSummary

Post Update Procedure

After the update of the HLR Server, you should register again all the resources of the CEs attached to your HLR Server. Please use the command below for each CE:

# dgas-hlr-addadmin -Sa  <CE_DN>

Where <CE_DN> is something like "/C=IT/O=INFN/OU=Host/L=Padova/CN=prod-ce-01.pd.infn.it"

Update From a previous release >= 4.0.13:

  • Just update the packages:
    # yum update

HLR Documentation References:

Topic revision: r77 - 2014-02-14 - CristinaAiftimiei
 