SYNOPSIS
       sge_shepherd

DESCRIPTION
       sge_shepherd  provides  the  parent  process functionality for a single
       Univa Grid Engine job.  The parent functionality is necessary  on  UNIX
       systems to retrieve resource usage information (see getrusage(2)) after
       a job has finished. In addition, the sge_shepherd forwards  signals  to
       the  job, such as the signals for suspension, enabling, termination and
       the  Univa  Grid  Engine  checkpointing  signal  (see  sge_ckpt(1)  for
       details).
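
       The parent-process role described above can be illustrated with a
       minimal Python sketch. It is not the sge_shepherd implementation;
       the function name run_and_measure is hypothetical. It only shows why
       a parent is needed: the resource usage of a child becomes available
       to the process that waits for it.

```python
import os
import subprocess
import sys


def run_and_measure(argv):
    """Run argv as a child process and return (exit_status, rusage).

    os.wait4() reaps the child and returns its resource usage in the same
    call; this is the per-child counterpart of what getrusage(2) reports.
    """
    child = subprocess.Popen(argv)
    _, wait_status, usage = os.wait4(child.pid, 0)
    if os.WIFEXITED(wait_status):
        exit_status = os.WEXITSTATUS(wait_status)
    else:
        # Child was killed by a signal; report it as a negative number.
        exit_status = -os.WTERMSIG(wait_status)
    # Record the status so the Popen object does not try to wait again.
    child.returncode = exit_status
    return exit_status, usage


if __name__ == "__main__":
    status, usage = run_and_measure([sys.executable, "-c", "pass"])
    print("exit status:", status)
    print("user CPU seconds:", usage.ru_utime)
    print("system CPU seconds:", usage.ru_stime)
```

       Signal forwarding is left out of the sketch; a parent written this
       way could pass a received signal on with child.send_signal().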

       The  sge_shepherd receives information about the job to be started from
       the sge_execd(8).  During the execution of the job it  actually  starts
       up  to  5 child processes. First a prolog script is run if this feature
       is enabled by the prolog parameter in the cluster  configuration.  (See
       sge_conf(5).)   Next a parallel environment startup procedure is run if
       the job is a parallel job. (See sge_pe(5) for more information.)  After
       that,  the  job itself is run, followed by a parallel environment shut-
       down procedure for parallel jobs,  and  finally  an  epilog  script  if
       requested  by  the  epilog  parameter in the cluster configuration. The
       prolog and epilog scripts as well as the parallel  environment  startup
       and  shutdown  procedures  are  to be provided by the Univa Grid Engine
       administrator and are intended for site-specific actions  to  be  taken
       before and after execution of the actual user job.
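
       The order of the child processes described above can be sketched as
       follows. The function planned_children and its three parameters are
       hypothetical names chosen for illustration; the sketch only encodes
       the sequence given in this section, not how sge_shepherd reads its
       configuration.

```python
from typing import List


def planned_children(prolog_enabled: bool,
                     parallel_job: bool,
                     epilog_enabled: bool) -> List[str]:
    """Return the methods a single job would run, in execution order."""
    steps = []
    if prolog_enabled:
        steps.append("prolog")
    if parallel_job:
        steps.append("parallel start")
    steps.append("job")  # the job itself is always run
    if parallel_job:
        steps.append("parallel stop")
    if epilog_enabled:
        steps.append("epilog")
    return steps


if __name__ == "__main__":
    for name in planned_children(True, True, True):
        print(name)
```

       With all three options enabled the list has five entries, which
       matches the upper bound of 5 child processes.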

       After  the  job  has  finished  and  the  epilog  script  is processed,
       sge_shepherd retrieves resource usage statistics about the job,  places
       them in a job specific subdirectory of the sge_execd(8) spool directory
       for reporting through sge_execd(8) and finishes.

       sge_shepherd also places an exit status file in  the  spool  directory.
       This  exit  status can be viewed with qacct -j JobId (see qacct(1)); it
       is not the exit status of sge_shepherd itself but of one of the methods
       executed by sge_shepherd.  The meaning of this exit status depends on
       the method in which an error occurred (if any).  The possible methods
       are: prolog, parallel start, job, parallel stop, epilog, suspend,
       restart, terminate, clean, migrate, and checkpoint.
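
       As an illustration of viewing the exit status, the following Python
       sketch parses text in the style printed by qacct -j. It assumes that
       each record line is a field name followed by whitespace and a value,
       and that the field is called exit_status; both are assumptions about
       the output format, and parse_qacct_record is a hypothetical helper.

```python
def parse_qacct_record(text):
    """Turn 'name   value' lines into a dict, skipping separator lines."""
    record = {}
    for line in text.splitlines():
        if line.startswith("="):
            continue  # assumed record separator
        parts = line.split(None, 1)
        if len(parts) == 2:
            record[parts[0]] = parts[1].strip()
    return record


if __name__ == "__main__":
    # Made-up sample text, not captured from a real cluster.
    sample = "==========\njobnumber    42\nexit_status  0\n"
    print(parse_qacct_record(sample).get("exit_status"))
```

       On a real cluster the text could be obtained by running qacct -j
       JobId through subprocess.run() with capture_output=True.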

       The following exit values are returned:

       0      All methods: Operation was executed successfully.

       99     Job script, prolog and epilog: When FORBID_RESCHEDULE is not set
              in  the configuration (see sge_conf(5)), the job gets re-queued.
              Otherwise see "Other".

       100    Job script, prolog and epilog: When FORBID_APPERROR is  not  set
              in  the  configuration (see sge_conf(5)), the job ends in error
              state.  Otherwise see "Other".

       Other  Job script: This is the exit status of the job itself. No action

RESTRICTIONS
       sge_shepherd should not be invoked manually, but only by  sge_execd(8).
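
       The selection among the exit values 0, 99 and 100 listed in
       DESCRIPTION can be sketched in Python as follows. The function
       classify_exit and its return labels are hypothetical. The sketch
       only decides which entry of the list applies; it does not model the
       action taken afterwards.

```python
# Methods for which the special exit values 99 and 100 are documented.
SPECIAL_METHODS = {"job", "prolog", "epilog"}


def classify_exit(method, status,
                  forbid_reschedule=False, forbid_apperror=False):
    """Return which documented exit value entry applies."""
    if status == 0:
        return "success"
    if method in SPECIAL_METHODS:
        if status == 99 and not forbid_reschedule:
            return "exit-99"
        if status == 100 and not forbid_apperror:
            return "exit-100"
    # Everything else falls under the "Other" entry.
    return "other"


if __name__ == "__main__":
    print(classify_exit("job", 99))
    print(classify_exit("job", 99, forbid_reschedule=True))
```

       The two keyword arguments stand for the FORBID_RESCHEDULE and
       FORBID_APPERROR settings described in sge_conf(5).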

FILES
       sgepasswd contains a list of user names and their corresponding
       encrypted passwords.  If available, the password file will be used by
       sge_shepherd.  To change the contents of this file please use the
       sgepasswd command.  It is not advised to change that file manually.

       <execd_spool>/job_dir/<job_id>     job specific directory

SEE ALSO
       sge_intro(1), sge_conf(5), sge_execd(8).

COPYRIGHT
       See sge_intro(1) for a full statement of rights and permissions.



UGE 8.0.0                $Date: 2007/07/19 09:04:33 $          SGE_SHEPHERD(8)
