Differences

Below are the differences between two revisions of the page.

--- varenv [2017/08/28 07:20] – toussain
+++ varenv [2018/10/11 20:49] (current version) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
 ====== SLURM environment variables ======
-===== Job array information =====
+Only a few of the environment variables available in SLURM are listed here. For an exhaustive list, see [[https://slurm.schedmd.com/sbatch.html#lbAG|https://slurm.schedmd.com/sbatch.html#lbAG]].
-A job array consists of a main job (the array) and tasks (the jobs contained in the array). SLURM assigns each job in the array a number made up of the main job's number and the job's index: <**SLURM_ARRAY_JOB_ID**>_<**SLURM_ARRAY_TASK_ID**>.
+===== Job information =====
+  * **SLURM_JOB_ID**: job number
+  * **SLURM_JOB_NAME**: job name
+  * **SLURM_JOB_NUM_NODES**: total number of nodes allocated to the job
+  * **SLURM_SUBMIT_DIR**: directory from which the job was submitted
+  * **SLURMD_NODENAME**: name of the node on which the job runs
+  * **SLURM_JOB_PARTITION**: name of the partition in which the job runs
+===== Job array information =====
+A job array consists of a main job (the array) and tasks (the jobs contained in the array). SLURM assigns each job in the array a number made up of the main job's number and the job's index: <**SLURM_ARRAY_JOB_ID**>_<**SLURM_ARRAY_TASK_ID**>.
+The following environment variables are available:
   * **SLURM_ARRAY_JOB_ID**: main job number
@@ Line 20: / Line 32: @@
   * **SLURM_ARRAY_TASK_MIN**: minimum index of the job array
+===== Usage examples =====
+<code bash>
+#!/bin/bash
+#SBATCH --job-name=test1
+#SBATCH --time=5:00
+#SBATCH --partition=court
+# Report the job name, job ID, partition, and node
+echo "my job $SLURM_JOB_NAME, number $SLURM_JOB_ID,"
+echo "is running in partition $SLURM_JOB_PARTITION on node $SLURMD_NODENAME"
+./exe 10
+</code>
+<code bash>
+#!/bin/bash
+#SBATCH --time=1:00
+#SBATCH --array=0-9
+echo "job array number $SLURM_ARRAY_JOB_ID, indices from $SLURM_ARRAY_TASK_MIN to $SLURM_ARRAY_TASK_MAX"
+# jobs with indices 0 to 4 run the first program
+if [ "$SLURM_ARRAY_TASK_ID" -le 4 ]
+then
+    echo "first program"
+    ./exe 1
+else  # the others run the second program
+    echo "second program"
+    ./exe 10
+fi
+</code>
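A common use of the array index is to give each task its own input. The sketch below assumes a hypothetical file naming scheme (input_0.dat to input_9.dat) and a hypothetical helper `input_for_task`; neither comes from the original page, and the fallback value of 0 is only there so the script can be tried outside of SLURM.

```bash
#!/bin/bash
#SBATCH --time=1:00
#SBATCH --array=0-9

# Hypothetical helper: map an array index to an input file name.
# The input_<index>.dat scheme is an assumption made for illustration.
input_for_task() {
    echo "input_$1.dat"
}

# Outside of a SLURM array job the variable is unset, so default to index 0.
task_id=${SLURM_ARRAY_TASK_ID:-0}
input_file=$(input_for_task "$task_id")

echo "task $task_id would process $input_file"
```

Each task of the array then works on a different file without any change to the script itself.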
-SLURM_CHECKPOINT_IMAGE_DIR
-    Directory into which checkpoint images should be written if specified on the execute line.
-SLURM_CLUSTER_NAME
-    Name of the cluster on which the job is executing.
-SLURM_CPUS_ON_NODE
-    Number of CPUs on the allocated node.
-SLURM_CPUS_PER_TASK
-    Number of CPUs requested per task. Only set if the --cpus-per-task option is specified.
-SLURM_DISTRIBUTION
-    Same as -m, --distribution
-SLURM_GTIDS
-    Global task IDs running on this node. Zero origin and comma separated.
-SLURM_JOB_ACCOUNT
-    Account name associated with the job allocation.
-SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
-    The ID of the job allocation.
-SLURM_JOB_CPUS_PER_NODE
-    Count of processors available to the job on this node. Note the select/linear plugin allocates entire nodes to jobs, so the value indicates the total count of CPUs on the node. The select/cons_res plugin allocates individual processors to jobs, so this number indicates the number of processors on this node allocated to the job.
-SLURM_JOB_DEPENDENCY
-    Set to value of the --dependency option.
-SLURM_JOB_NAME
-    Name of the job.
-SLURM_JOB_NODELIST (and SLURM_NODELIST for backwards compatibility)
-    List of nodes allocated to the job.
-SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
-    Total number of nodes in the job's resource allocation.
-SLURM_JOB_PARTITION
-    Name of the partition in which the job is running.
-SLURM_JOB_QOS
-    Quality Of Service (QOS) of the job allocation.
-SLURM_JOB_RESERVATION
-    Advanced reservation containing the job allocation, if any.
-SLURM_LOCALID
-    Node local task ID for the process within a job.
-SLURM_MEM_PER_CPU
-    Same as --mem-per-cpu
-SLURM_MEM_PER_NODE
-    Same as --mem
-SLURM_NODE_ALIASES
-    Sets of node name, communication address and hostname for nodes allocated to the job from the cloud. Each element in the set is colon separated and each set is comma separated. For example: SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
-SLURM_NODEID
-    ID of the nodes allocated.
-SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
-    Same as -n, --ntasks
-SLURM_NTASKS_PER_CORE
-    Number of tasks requested per core. Only set if the --ntasks-per-core option is specified.
-SLURM_NTASKS_PER_NODE
-    Number of tasks requested per node. Only set if the --ntasks-per-node option is specified.
-SLURM_NTASKS_PER_SOCKET
-    Number of tasks requested per socket. Only set if the --ntasks-per-socket option is specified.
-SLURM_PRIO_PROCESS
-    The scheduling priority (nice value) at the time of job submission. This value is propagated to the spawned processes.
-SLURM_PROCID
-    The MPI rank (or relative process ID) of the current process
-SLURM_PROFILE
-    Same as --profile
-SLURM_RESTART_COUNT
-    If the job has been restarted due to system failure or has been explicitly requeued, this will be set to the number of times the job has been restarted.
-SLURM_SUBMIT_DIR
-    The directory from which sbatch was invoked.
-SLURM_SUBMIT_HOST
-    The hostname of the computer from which sbatch was invoked.
-SLURM_TASKS_PER_NODE
-    Number of tasks to be initiated on each node. Values are comma separated and in the same order as SLURM_NODELIST. If two or more consecutive nodes are to have the same task count, that count is followed by "(x#)" where "#" is the repetition count. For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the first three nodes will each execute two tasks and the fourth node will execute one task.
-SLURM_TASK_PID
-    The process ID of the task being started.
-SLURM_TOPOLOGY_ADDR
-    This is set only if the system has the topology/tree plugin configured. The value will be set to the names of the network switches which may be involved in the job's communications, from the system's top level switch down to the leaf switch and ending with the node name. A period is used to separate each hardware component name.
-SLURM_TOPOLOGY_ADDR_PATTERN
-    This is set only if the system has the topology/tree plugin configured. The value will be set to the component types listed in SLURM_TOPOLOGY_ADDR. Each component will be identified as either "switch" or "node". A period is used to separate each hardware component type.
-SLURMD_NODENAME
-    Name of the node running the job script.
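The compressed "count(x#)" notation used by SLURM_TASKS_PER_NODE can be awkward to consume directly in a script. As a minimal sketch, the hypothetical helper `expand_tasks_per_node` below (not part of SLURM) expands such a value into one task count per node; it assumes bash and falls back to the example value from the description when run outside a job.

```bash
#!/bin/bash

# Hypothetical helper, not provided by SLURM: expand a value such as
# "2(x3),1" into one task count per node, separated by spaces.
expand_tasks_per_node() {
    local spec=$1 entry count reps i
    local re='^([0-9]+)\(x([0-9]+)\)$'
    local -a entries out=()

    # Split the comma-separated list into its entries.
    IFS=',' read -r -a entries <<< "$spec"

    for entry in "${entries[@]}"; do
        if [[ $entry =~ $re ]]; then
            # "count(x#)": the same count repeated on # consecutive nodes.
            count=${BASH_REMATCH[1]}
            reps=${BASH_REMATCH[2]}
        else
            # A bare count applies to a single node.
            count=$entry
            reps=1
        fi
        for ((i = 0; i < reps; i++)); do
            out+=("$count")
        done
    done

    echo "${out[*]}"
}

# Outside of a job the variable is unset, so use the example value instead.
expand_tasks_per_node "${SLURM_TASKS_PER_NODE:-2(x3),1}"
```

For the value "2(x3),1" the helper prints `2 2 2 1`: two tasks on each of the first three nodes and one task on the fourth.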