These pages describe some general details concerning the FATMEN service at CERN.
For information regarding the use of the FATMEN package itself, please see the FATMEN documentation.
See also the old (no longer maintained) web pages here
The FATMEN service is run on an IBM machine, with nodename fatcat.cern.ch.
The various FATMEN catalogs are kept on the /fatmen filesystem, which is NFS-exported and should be NFS-mounted on any system requiring access to the FATMEN catalog.
For each experiment (e.g. DELPHI, L3, OPAL etc.) there is a subdirectory which contains the catalog for that experiment and associated configuration files, e.g.
fatcat:/fatmen/fmdelphi (1) ls
bad/        fatlogs/         fatmen.loccodes  fatserv.log  fmdelphi.names
cern.fatrz  fatmen.accounts  fatmen.medtypes  fatserv.sh   toccsrs.in2p3.fr/
done/       fatmen.acl       fatserv          fatsrv       todo/
fatcat:/fatmen/fmdelphi (2)
The purpose of each of these files is described below.
Currently, FATMEN runs on a RS6000 machine, shared with HEPDB.
The node name is shd15 with aliases fatcat and hepdb.
Any user may request a restart of a server by creating a signal.start file in
the todo directory for their experiment.
e.g. to restart the server for DELPHI, type:
There are the following known problems:
The server will be automatically restarted by a cron job.
Free or buy some disk space.
Increase the RZ quota or change the RZ record size. Best left to an expert.
An example is shown below.
Restore the most recent online backup from /fatback/fm{experiment}/good.fatfx as follows:
An example is shown below:
Then 'mv' the necessary update files from /fatmen/fm{experiment}/done to
/fatmen/fm{experiment}/todo and restart the server.
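The requeue step above might be scripted along the following lines. This is only a sketch: the function name requeue_updates is hypothetical, and it assumes that the "necessary" update files are those in done that were modified after the backup file was written, which the text does not state. Restarting the server remains a separate step.

```shell
#!/bin/sh
# Sketch: after restoring a catalog from backup, move back to todo/ the
# update files that were processed after the backup was taken.
# ASSUMPTION: "necessary" means "modified more recently than the backup".
requeue_updates() {
    dir=$1        # experiment directory, e.g. /fatmen/fmchorus
    backup=$2     # backup file, e.g. /fatback/fmchorus/good.fatfx
    [ -d "$dir/done" ] && [ -d "$dir/todo" ] && [ -f "$backup" ] || return 1
    # -L follows the link when done/ is a symbolic link to another filesystem.
    # find descends into any subdirectories of done/ as well.
    find -L "$dir/done" -type f -newer "$backup" -exec mv {} "$dir/todo/" \;
}

# Example: requeue_updates /fatmen/fmchorus /fatback/fmchorus/good.fatfx
```

Checking the contents of todo by hand before restarting the server is advisable, since the modification-time criterion is only a guess at which updates are needed.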
The case of a simple repair to fmopal is presented.
If surgery is required on the RZ file, best wait for an expert.
This strips off the Fortran control words around each record and results
in a "true" exchange format file.
Best wait for an expert.
bad/: This directory is used by the server to store update files that are not valid FATMEN
update files. This usually happens when a user application writes to the unit reserved
for FATMEN updates.
cern.fatrz: A ZEBRA-RZ file containing the FATMEN catalog for the corresponding experiment.
done/: After processing, updates are 'mv-ed' to this directory (which is often/normally a link
to another filesystem). Update files are kept to enable recovery from corrupted catalogs.
They are automatically deleted by a cleanup script.
fatlogs/: The directory where log files are kept. Again, often a link.
fatmen.accounts: A file containing account aliases. See the user documentation for more information.
fatmen.acl: A file containing access control specifications. See the user documentation for more information.
fatmen.loccodes: A file containing location code definitions. See the user documentation for more information.
fatmen.medtypes: A file containing media type definitions. See the user documentation for more information.
fatserv.log: The server log file.
fatserv.sh: The shell script that runs the server.
fatsrv: A link to the FATMEN server.
fmdelphi.names: A configuration file. See the documentation for more information.
toccsrs.in2p3.fr/: A directory where updates are queued to a remote site. See the documentation for more information.
todo/: The directory where user updates are queued. Processed by the FATSRV process.
These are kept in the directories
/afs/cern.ch/project/fatmen/@sys/bin and
/afs/cern.ch/project/fatmen/scripts.
fatback: A program run every night to back up the FATMEN catalogs.
fathead: A program to print the header of a FATMEN update file.
fatsend: The program responsible for transferring updates between different FATMEN servers
(e.g. at CERN and IN2P3).
fatserv: The FATMEN server itself.
The scripts contain a brief comment explaining their function and are described in the user
documentation.
Many are run from cron as user jamie.
A few are also run from root:
Like many CERNLIB programs, FATMEN relies on the DATIME routine to handle dates and times.
As is described here, this routine returns the date in YYMMDD format through the argument list.
FATMEN stores dates and times in 3 fields for each entry - the date and time
that an entry was catalogued, the corresponding file created, and when it was
last accessed. All of these fields are optional.
Two of the commands in the FATMEN shell, SCAN and SEARCH,
permit date and time ranges for the above three fields to be specified.
It is not possible to specify a range that crosses a century boundary. Hence,
if a listing of all entries catalogued from e.g. 31-DEC-1999 until 01-JAN-2000
inclusive is required, two searches will have to be issued.
As this feature is very rarely used, no changes to FATMEN are foreseen.
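The need for two searches can be illustrated with a small helper that splits a range at the century boundary. This is a sketch only: split_range is a hypothetical name, the YYYYMMDD input convention is an assumption made for this example, and the output is simply a pair of YYMMDD dates per line rather than actual SCAN or SEARCH syntax.

```shell
#!/bin/sh
# Sketch: split a date range at a century boundary so that each part can be
# searched separately. Input dates are YYYYMMDD (assumed for this helper);
# output is one "YYMMDD YYMMDD" pair per line.
# Handles at most one boundary, i.e. two adjacent centuries.
split_range() {
    from=$1
    to=$2
    c_from=$((from / 1000000))      # century part: 19 for 19991231
    c_to=$((to / 1000000))
    if [ "$c_from" -eq "$c_to" ]; then
        printf '%06d %06d\n' $((from % 1000000)) $((to % 1000000))
    else
        # up to the last day of the first century, then from the first
        # day of the next century
        printf '%06d 991231\n' $((from % 1000000))
        printf '000101 %06d\n' $((to % 1000000))
    fi
}

# Example: split_range 19991231 20000101
# prints "991231 991231" and then "000101 000101"
```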
The basic functionality has been tested on a Y2K machine, as follows:
These tests were successful.
Trouble-shooting
touch /fatmen/fmdelphi/todo/signal.start
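For other experiments the same request can be wrapped in a small helper. The following is a minimal sketch; request_restart and FATMEN_ROOT are illustrative names and not part of the FATMEN scripts, and the fm{experiment}/todo layout is taken from the directory listing for DELPHI.

```shell
#!/bin/sh
# Sketch: request a restart of one experiment's FATMEN server by creating
# signal.start in its todo directory.
# FATMEN_ROOT and request_restart are illustrative names, not FATMEN tools.
FATMEN_ROOT=${FATMEN_ROOT:-/fatmen}

request_restart() {
    exp=$1                                   # experiment name, e.g. delphi
    todo="$FATMEN_ROOT/fm$exp/todo"
    if [ ! -d "$todo" ]; then
        echo "no todo directory for experiment '$exp': $todo" >&2
        return 1
    fi
    touch "$todo/signal.start" && echo "restart requested: $todo/signal.start"
}

# Example: request_restart delphi
```

The helper only creates the signal file; the restart itself is still performed by the cron job.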
The following information is for service managers only.
rsplus12:/afs/cern.ch/user/j/jamie (46) fatmen
>>> macro FATSYS not found
>>> macro FATGRP not found
>>> macro FATUSR not found
>>> macro FATLOGON not found
Type INIT to initialise FATMEN> init l3
FMINIT. Initialisation of FATMEN package
FATMEN 1.92/02 970107 09:30 CERN PROGRAM LIBRARY FATMEN=Q123
This version created on 960801 at 1200
Current Working Directory = //CERN/L3
FM> cd //cern/l3 -a
Current Working Directory = //CERN/L3
Quota = 64995
Number of subdirectories = 8
Created on 900711 at 1601 Modified on 970917 at 1818
Number of records used = 6
0 megawords + 5751 words
FM>
sp020:/fatmen/fmchorus (78) touch todo/signal.stop
sp020:/fatmen/fmchorus (79) fatps
Elapsed CPU time %CPU ** FATMEN server **
========================================================
08:05:05 00:00:40 0.1 /fatmen/fmopal/fatsrv
08:28:44 00:00:16 0.1 /fatmen/fmdelphi/fatsrv
08:34:18 00:00:11 0.0 /fatmen/fmcplear/fatsrv
08:45:23 00:00:08 0.0 /fatmen/fmatlas/fatsrv
08:24:02 00:00:07 0.0 /fatmen/fml3/fatsrv
08:41:08 00:00:02 0.0 /fatmen/fmcndiv/fatsrv
08:17:49 00:00:01 0.0 /fatmen/fmna49/fatsrv
08:43:13 00:00:01 0.0 /fatmen/fmchorus/fatsrv # CHORUS server still running
00:03 00:00:00 0.0 grep /fatsrv
07:55:50 00:00:00 0.0 /fatmen/fmwa93/fatsrv
07:56:53 00:00:00 0.0 /fatmen/fmsmc/fatsrv
07:59:23 00:00:00 0.0 /fatmen/fmrd6/fatsrv
08:01:25 00:00:00 0.0 /fatmen/fmrd5/fatsrv
08:02:28 00:00:00 0.0 /fatmen/fmrd3/fatsrv
08:13:43 00:00:00 0.0 /fatmen/fmnomad/fatsrv
08:15:44 00:00:00 0.0 /fatmen/fmna52/fatsrv
08:19:39 00:00:00 0.0 /fatmen/fmna48/fatsrv
08:20:40 00:00:00 0.0 /fatmen/fmna44/fatsrv
08:22:54 00:00:00 0.0 /fatmen/fmna31/fatsrv
08:42:11 00:00:00 0.0 /fatmen/fmcms/fatsrv
sp020:/fatmen/fmchorus (80) fatps
Elapsed CPU time %CPU ** FATMEN server **
========================================================
08:06:00 00:00:40 0.1 /fatmen/fmopal/fatsrv
08:29:38 00:00:16 0.1 /fatmen/fmdelphi/fatsrv
08:35:12 00:00:11 0.0 /fatmen/fmcplear/fatsrv
08:46:17 00:00:08 0.0 /fatmen/fmatlas/fatsrv
08:24:56 00:00:07 0.0 /fatmen/fml3/fatsrv
08:42:02 00:00:02 0.0 /fatmen/fmcndiv/fatsrv
08:18:43 00:00:01 0.0 /fatmen/fmna49/fatsrv
00:00 00:00:00 0.0 grep /fatsrv
07:56:44 00:00:00 0.0 /fatmen/fmwa93/fatsrv
07:57:47 00:00:00 0.0 /fatmen/fmsmc/fatsrv
08:00:17 00:00:00 0.0 /fatmen/fmrd6/fatsrv
08:02:19 00:00:00 0.0 /fatmen/fmrd5/fatsrv
08:03:22 00:00:00 0.0 /fatmen/fmrd3/fatsrv
08:14:37 00:00:00 0.0 /fatmen/fmnomad/fatsrv
08:16:39 00:00:00 0.0 /fatmen/fmna52/fatsrv
08:20:33 00:00:00 0.0 /fatmen/fmna48/fatsrv
08:21:35 00:00:00 0.0 /fatmen/fmna44/fatsrv
08:23:49 00:00:00 0.0 /fatmen/fmna31/fatsrv
08:43:05 00:00:00 0.0 /fatmen/fmcms/fatsrv
sp020:/fatmen/fmchorus (81) zftp
ZFTP> rtof cern.fatrz cern.fx
FZLOGL. File at LUN= 2, Diagnostic log level= 0
FZENDO. For output file at LUN= 2, Last activity=12, OPT= TE
FZOUT. LUN= 2 Write End-of-Run 0
FZOUT. LUN= 2 Write Zebra EoF
FZOUT. LUN= 2 End-of-Data
Number of objects written :
0 System EOF
1 Zebra EOF
1 End-of-Run
0 Start-of-Run
3787 Pilot records
3231 Non-empty d/s
556 Empty d/s
0 Number of errors
0 Mega-words +
686700 words
3789 Logical records
763 Physical records
763 Steering blocks
0 Words with conversion problems
ZFTP> hel rfrf
Command "/ZFTP/RFRF" :
* ZFTP/RFRF FZFILE RZFILE [ LRECL QUOTA CHOPT ]
FZFILE C 'FZ file name' D=' '
RZFILE C 'RZ file name' D=' '
LRECL I 'RZ file record length' D=0
QUOTA I 'Quota for output file' D=0
CHOPT C 'CHOPT' D=' '
Possible CHOPT values are:
A 'the input file is in FZ alpha format'
S 'display statistics on the RZ file'
X 'the RZ file will be created in eXchange mode'
C 'respect case of input/output file names'
R 'replace output file if it already exists'
This command converts an FZ exchange format file to an RZ file on the LOCAL
machine. No network transfer is performed. The FZFILE must be the output of
a previous RTOF command, or have been created using the RTOX or RTOA
programs. On Unix systems, this file will be read with FORTRAN
direct-access and will hence be transferable and readable on other systems.
By default, the output RZ file will have the same record length as the
original RZ file. However, if LRECL is specified then this value will be
used instead.
ZFTP> rfrf cern.fx cern.rz lrecl=16384
FZLOGL. File at LUN= 1, Diagnostic log level= 0
FZENDI. For input file at LUN= 1, Last activity= 2, OPT= TE
Number of objects read :
0 System EOF
0 Zebra EOF
0 End-of-Run
0 Start-of-Run
3787 Pilot records
3231 Non-empty d/s selected
556 Empty d/s selected
0 Read or Data errors
0 Mega-words +
684900 words
3787 Good logical records
761 Good physical records
761 Steering blocks
0 Words with conversion problems
ZFTP> q
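Earlier in this session, fatps was run twice to confirm that the CHORUS server had gone after signal.stop was created. That check can be automated with a small filter over the listing. This is a sketch under assumptions: server_listed is a hypothetical name, and it relies on the server path being the fourth column, as in the fatps output shown.

```shell
#!/bin/sh
# Sketch: report whether an experiment's server appears in a process listing
# read from standard input (for example, the output of fatps).
# ASSUMPTION: the server path is the fourth whitespace-separated column.
# server_listed is an illustrative name, not a FATMEN command.
server_listed() {
    awk -v path="/fatmen/fm$1/fatsrv" '
        $4 == path { found = 1 }            # exact match on the server path
        END { exit (found ? 0 : 1) }        # status 0 if listed, 1 otherwise
    '
}

# Example: fatps | server_listed chorus && echo "CHORUS server still running"
```

Matching the whole fourth column, rather than grepping for the experiment name, avoids a false match on the "grep /fatsrv" line that fatps itself produces.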
sp020:/afs/cern.ch/user/j/jamie (75) cd $FMCHORUS
sp020:/fatmen/fmchorus (76) zftp
ZFTP> rfrf /fatback/fmchorus/good.fatfx chorus.fatrz
FZLOGL. File at LUN= 1, Diagnostic log level= 0
FZENDI. For input file at LUN= 1, Last activity= 2, OPT= TE
Number of objects read :
0 System EOF
0 Zebra EOF
0 End-of-Run
0 Start-of-Run
3787 Pilot records
3231 Non-empty d/s selected
556 Empty d/s selected
0 Read or Data errors
0 Mega-words +
684900 words
3787 Good logical records
761 Good physical records
761 Steering blocks
0 Words with conversion problems
ZFTP> q
sp020:/fatmen/fmchorus (76) (mv cern.fatrz cern.badrz; mv chorus.fatrz cern.fatrz)
sp020:/fatmen/fmchorus (77)
*
*# Add here the list of directories to be skipped
*# Note that the top directory is //RZ and not //CERN!
*#
* if(chl(1:lchl).eq.
* + '//RZ/OPAL/SIMD/DDST/L20011/P183QQ/R5050')
* + goto 20
if(chl(1:lchl).eq.
+ '//RZ/OPAL/SIMD/DDST/L20011/DIVERT/R5050')
+ goto 20
Explanation of FATMEN files
FATMEN programs and scripts
Programs
ls /afs/cern.ch/project/fatmen/@sys/bin
fatback fathead fatsend fatserv
Scripts
sp020:/fatmen/fmdelphi (11) crontab -l
#
# Start FATMEN backups at 02:00
#
0 2 * * * /afs/cern.ch/project/fatmen/scripts/fatback.sh >> /fatmen/fatback/logs 2>&1
#
# Check for servers that have been stopped
#
0 6 * * * /afs/cern.ch/project/fatmen/scripts/fatchk > /dev/null 2>&1
#
# Restart any servers with a signal.restart file
#
3,18,33,48 4-23 * * * /afs/cern.ch/project/fatmen/scripts/fatrestart > /dev/null 2>&1
#
# Remove old ZZ files
#
0 5,15 * * * /afs/cern.ch/project/fatmen/scripts/fatzz > /dev/null 2>&1
#
# Check for backlogs
#
0 6,16 * * * /afs/cern.ch/project/fatmen/scripts/fatqueue > /dev/null 2>&1
#
# Check that filesystems are not too full
#
0 0 * * * /afs/cern.ch/project/fatmen/scripts/fatdf > /dev/null 2>&1
#
# Check that none of the servers have died
#
0 * * * * /afs/cern.ch/project/fatmen/scripts/fatok > /dev/null 2>&1
#
# Special check for fmsend
#
30 * * * * /afs/cern.ch/project/fatmen/scripts/fatsendchk > /dev/null 2>&1
#
# Clean /fatlogs0 and /fatlogs1 filesystems
#
0 5 * * 3 /afs/cern.ch/project/fatmen/scripts/fatlog_clean > /fatlogs/clean.log 2>&1
#
# Look for bad DELPHI files
#
30 7 * * * /afs/cern.ch/project/fatmen/scripts/fatbad > /dev/null 2>&1
1,11,21,31,41,51 * * * * chmod o+rw /fatmen/fmdelphi/todo/* >/dev/null 2>/dev/null
1,11,21,31,41,51 * * * * chmod o+rw /fatmen/fmopal/todo/* >/dev/null 2>/dev/null
Y2K Information
(Some cosmetic changes, e.g. to print dates in I6.6 format in FMSHOW, have been made).