APEL Deployment
We are changing the accounting system used in our infrastructure from DGAS to APEL. The procedure is quite simple.
Each resource centre needs to install a new node (the APEL Publisher), which receives the accounting information sent from the CE(s) by the APEL parser. The APEL Publisher then sends the data to the EGI central database using the messaging infrastructure.
In the past months we tested two scenarios:
- the accounting data are sent directly to the central database (canonical installation)
- the accounting data are sent to FAUST and to the EGI central database
We chose the second one.
Registration
You need to register the APEL Publisher in the GOCDB: the service endpoint to add is glite-APEL, and you also have to fill in the certificate subject information.
This registration is necessary to authorize the publisher host to use the broker network. Changes in GOCDB can take up to 4 hours to reach the message brokers.
Leave the APEL service endpoint untouched, otherwise Nagios will not monitor the accounting data publication.
(For reference: https://wiki.egi.eu/wiki/MAN09_Accounting_data_publishing)
APEL Publisher Installation and Configuration
Follow the EMI3 generic installation guide and the APEL Publisher System Administrator Guide:
https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Publisher_System_Administrator_Guide.pdf
Use the production queue of the broker network:
# Queue to which SSM will send messages (use this)
destination: /queue/global.accounting.cpu.central
and comment out or delete the testing queue.
Sending the data to FAUST
To send the accounting data to FAUST as well, first install and configure the APEL Publisher as explained in the section above, then follow the instructions at
https://github.com/andreaguarise/ssm-dupl-send
This means that you have to use the ssm-dupl-send.sh script instead of the apelclient one.
The important parameters to set in faust-sender.cfg include the following:
host: dgas-broker.to.infn.it
port: 61613
use_ssl: false
destination: apel.input
Create the following directory with mkdir:
/var/spool/faust/outgoing
IMPORTANT: Tier1 and Tier2 sites use a dedicated queue:
destination: apel.<SITE-NAME>.input
For instance, for INFN-PISA, set the following in faust-sender.cfg:
destination: apel.INFN-PISA.input
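The queue naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name faust_destination is hypothetical, and only the two destination patterns (apel.input and apel.<SITE-NAME>.input) come from this page.

```python
def faust_destination(site_name=None):
    """Return the value for the 'destination' parameter of faust-sender.cfg.

    Without a site name the shared queue is returned; Tier1 and Tier2
    sites pass their site name to get the dedicated queue.
    (Hypothetical helper, not part of ssm-dupl-send.)
    """
    if site_name is None:
        return "apel.input"
    site_name = site_name.strip()
    if not site_name:
        raise ValueError("site name must not be empty")
    return "apel.%s.input" % site_name
```

For example, faust_destination("INFN-PISA") gives the value shown above for INFN-PISA.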
IMPORTANT: Run the FAUST script only after the parser script has been launched on the CEs. An example of the cron:
cat /etc/cron.d/ssm-dupl-send
# Run APEL client once daily
05 01 * * * root /root/bin/ssm-dupl-send.sh
APEL Parser Installation and Configuration
Install and configure the APEL parser on each computing element of your resource centre.
Follow the EMI3 generic installation guide and the APEL Parsers System Administrator Guide:
https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Parsers_System_Administrator_Guide.pdf
IMPORTANT: Send only the accounting data from September onwards. The earlier records have already been sent by DGAS to APEL and would otherwise be overwritten, causing inconsistencies. Configure the parser so that it processes only the proper files (or move the old logs to another directory).
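Moving the old logs away could be scripted roughly as in the Python 3 sketch below. It is not part of APEL. It assumes that the rotated logs carry a date suffix like blahp.log-20140811 (the name seen in the parser log on this page) and that the cut-off is 1 September 2014. The function names, the cut-off constant and the directories you pass in are all hypothetical and must be adapted to your site.

```python
import os
import re
import shutil
from datetime import date, datetime

# Assumption: "September" on this page means September 2014.
CUTOFF = date(2014, 9, 1)

# Matches rotated log names such as "blahp.log-20140811" (optionally ".gz").
DATED_LOG = re.compile(r"-(\d{8})(?:\.gz)?$")


def logs_before(log_dir, cutoff=CUTOFF):
    """Return the files in log_dir whose date suffix is earlier than cutoff."""
    old = []
    for name in sorted(os.listdir(log_dir)):
        match = DATED_LOG.search(name)
        if not match:
            continue  # no date suffix: leave the file alone
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a valid date
        if stamp < cutoff:
            old.append(os.path.join(log_dir, name))
    return old


def move_old_logs(log_dir, archive_dir, cutoff=CUTOFF, dry_run=True):
    """Move pre-cutoff logs to archive_dir; with dry_run only report them."""
    old = logs_before(log_dir, cutoff)
    if not dry_run:
        os.makedirs(archive_dir, exist_ok=True)
        for path in old:
            shutil.move(path, archive_dir)
    return old
```

Calling move_old_logs with the default dry_run=True only returns the list of files that would be moved, so the selection can be reviewed before anything is touched.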
You can launch the apelparser script once the apelclient database has been set up.
An example of the cron:
cat /etc/cron.d/apelparser
# Run APEL parser once daily
04 22 * * * root /usr/bin/apelparser
NOTE: empty log files cause a CRITICAL error during parsing:
2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Unhandled exception raised!
2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Please send a bug report with following information:
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - UnboundLocalError: local variable 'line_number' referenced before assignment
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - parse_file [/usr/bin/apelparser 139]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - scan_dir in /usr/bin/apelparser [187]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - handle_parsing in /usr/bin/apelparser [296]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - main in /usr/bin/apelparser [380]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - ? in /usr/bin/apelparser [392]
Check for these empty files and delete them.
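A minimal Python sketch of this check is given below. It is not part of APEL, the function names are hypothetical, and the directory to scan is whatever log directory your parser is configured to read. By default it only lists the zero-byte files, and it deletes them only when dry_run is set to False.

```python
import os


def find_empty_files(log_dir):
    """Return the paths of zero-byte regular files directly inside log_dir."""
    empty = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(path)
    return empty


def delete_empty_files(log_dir, dry_run=True):
    """Delete the empty files in log_dir, or only list them when dry_run is set."""
    empty = find_empty_files(log_dir)
    if not dry_run:
        for path in empty:
            os.remove(path)
    return empty
```

Subdirectories are not scanned, so the sketch has to be run once per log directory.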
Bug in APEL 1.2.1 - Apply the EMI-3 update 20
EMI-3 update 20 includes a fix for a bug in the APEL parser that prevents it from working in the most common case: the parser is unable to open uncompressed accounting logs.
Sites with this problem have version 1.2.1 installed and see many messages like the following in their parser log file (usually /var/log/apel/parser.log):
2014-08-11 12:54:11,819 - parser - ERROR - Cannot open file blahp.log-20140811: Not a gzipped file
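To check quickly whether a site is affected, the parser log can be scanned for that message. The Python 3 sketch below is only an illustration: the function name is hypothetical, and it matches on the "Not a gzipped file" fragment alone, so it does not depend on how the rest of the line is laid out.

```python
def gzip_open_errors(log_path):
    """Return the parser log lines that report the 'Not a gzipped file' failure."""
    hits = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, errors="replace") as handle:
        for line in handle:
            if "Not a gzipped file" in line:
                hits.append(line.rstrip("\n"))
    return hits
```

A non-empty result from gzip_open_errors("/var/log/apel/parser.log") suggests that the parser is still at version 1.2.1.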
Sites that have installed version 1.2.1 should upgrade to 1.2.2 immediately:
http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-20-12-09-2014-v-3-11-0-1
What about DGAS sensors?
Because of the DGAS server problems that occurred at the beginning of September, stop the DGAS sensors on your computing element(s).
Fast checks
When you launch the apelparser script for the first time, if there are no errors, it fills in the tables BlahdRecords and EventRecords (database "apelclient"), so check that they really contain data.
Then, when the ssm-dupl-send.sh script runs on the publisher host, those tables are joined (filling in the JobRecords and VJobRecords tables) and the data are sent to FAUST and to EGI, so you can perform the following check:
mysql> use apelclient
mysql> SELECT year(EndTime),Month(EndTime),InfrastructureDescription,count(*) FROM VJobRecords GROUP BY 1,2,3;
+---------------+----------------+---------------------------+----------+
| year(EndTime) | Month(EndTime) | InfrastructureDescription | count(*) |
+---------------+----------------+---------------------------+----------+
|          2014 |              5 | APEL-CREAM-PBS            |     3747 |
|          2014 |              6 | APEL-CREAM-PBS            |     7243 |
|          2014 |              7 | APEL-CREAM-PBS            |     4852 |
|          2014 |              8 | APEL-CREAM-PBS            |     4770 |
|          2014 |              9 | APEL-CREAM-PBS            |     3882 |
+---------------+----------------+---------------------------+----------+
5 rows in set (0.13 sec)
STATUS
| SITE NAME | TICKET | STATUS | INFO |
| BIOCOMP | 17459 | SOLVED | |
| CIRMMP | 12 | SOLVED | |
| CNR-ILC-PISA | 17461 | SOLVED | |
| CRS4 | 13 | SOLVED | |
| FBF-Brescia-IT | 14 | SOLVED | |
| GARR-01-DIR | 17464 | SOLVED | |
| GILDA-INFN-CATANIA | 27 | SOLVED | |
| GILDA-SIRIUS | 29 | SOLVED | |
| GRISU-COMETA-INFN-CT | 17465 | SOLVED | |
| GRISU-UNINA | 17466 | SOLVED | |
| ICEAGE-CATANIA | 28 | SOLVED | |
| INAF-TS | 17467 | SOLVED | |
| INFN-BARI | 15 | IN PROGRESS | waiting for the first run |
| INFN-BOLOGNA | 17468 | SOLVED | |
| INFN-BOLOGNA-T3 | 17469 | SOLVED | |
| INFN-CATANIA | 17449 | SOLVED | |
| INFN-CNAF-LHCB | 17470 | SOLVED | |
| INFN-COSENZA | 17471 | SOLVED | |
| INFN-FERRARA | 17472 | SOLVED | |
| INFN-FRASCATI | 17452 | SOLVED | |
| INFN-GENOVA | 17473 | SOLVED | |
| INFN-LECCE | 17474 | SOLVED | |
| INFN-LNL-2 | 17450 | SOLVED | |
| INFN-MILANO-ATLASC | 16 | IN PROGRESS | waiting for the first publication |
| INFN-NAPOLI-ARGO | 17477 | SOLVED | |
| INFN-NAPOLI-ATLAS | 17453 | SOLVED | |
| INFN-NAPOLI-CMS | 17478 | SOLVED | |
| INFN-NAPOLI-PAMELA | 17479 | SOLVED | |
| INFN-PADOVA | 17480 | SOLVED | |
| INFN-PAVIA | 17 | OPEN | |
| INFN-PERUGIA | 17483 | SOLVED | |
| INFN-PISA | 17454 | SOLVED | |
| INFN-ROMA1 | 26 | IN PROGRESS | will send the data together with INFN-ROMA1-VIRGO and INFN-ROMA1-CMS |
| INFN-ROMA1-CMS | 17456 | SOLVED | |
| INFN-ROMA1-VIRGO | 17484 | OPEN | |
| INFN-ROMA2 | 17485 | SOLVED | |
| INFN-ROMA3 | 17486 | SOLVED | |
| INFN-T1 | 17457 | SOLVED | |
| INFN-TORINO | 17458 | SOLVED | |
| INFN-TRIESTE | 17487 | SOLVED | |
| RECAS-NAPOLI | 17488 | SOLVED | |
| SNS-PISA | 17489 | SOLVED | site suspended |
| TRIGRID-INFN-CATANIA | 17490 | SOLVED | |
| UNI-PERUGIA | 38 | SOLVED | |
| UNINA-EGEE | 17492 | SOLVED | |
--
AlessandroPaolini - 2014-06-13