LaPalma3 (5): FAQs

Please note that all SIEpedia articles address specific issues or questions raised by IAC users, so they do not attempt to be rigorous or exhaustive, and may or may not be useful or applicable in different or more general contexts.

FAQs about LaPalma

General topics

  1. How can I get an account on LaPalma?
  2. When connecting I'm required to enter a password that I don't have...
  3. Should I add an acknowledgement text?
  4. How can I get some help?

Preparing your executions

  1. How can I see how much free disk space I have?
  2. Where should I install programs common to all the members of my project group? And temporary data?
  3. Which compilers are available on LaPalma?
  4. What version of MPI is currently available at LaPalma?
  5. My application needs a special software (library, tool, ...) to run, is it available?
  6. I got some warnings/errors when loading software modules
  7. Should I be careful with the Input / Output over the parallel filesystem (Lustre)?
  8. Should I be concerned about the big/little endian problem like when executing on LaPalma2?

Running your jobs

  1. How can I know the number of idle nodes?
  2. I get an error when submitting my jobs...
  3. Can I execute my programs interactively?
  4. How can I see the status of my jobs in queue?
  5. I have submitted some jobs, but they never run...
  6. I get an error when running my jobs...
  7. How do I know the position of my jobs in queue?
  8. Is there any way to make my jobs wait less in queue before running?
  9. I want to stop a running job, how can I do that?
  10. My job is finished but I see no output or it is not complete...
  11. My jobs have some special needs (dependencies, should start at a specific time, ...)
  12. Can I automatically run on LaPalma many instances of my parallel (or serial) program with different input data?
  13. How can I know how much CPU time (or other resources) have been consumed by my jobs?



Responses:

Q1: How can I get an account on LaPalma? ^ Top

A: LaPalma is one of the thirteen supercomputers that belong to the Spanish Supercomputing Network (RES). To get an account on this machine:



Q2: When connecting I'm required to enter a password that I don't have... ^ Top

A: LaPalma uses a key-based authentication mechanism (see more info in connecting to LaPalma), so no password will be required by this machine.

  • If you get a message like Enter passphrase for key..., this is not a password to connect to LaPalma, but the passphrase needed to access your private key on your local machine. Enter the same passphrase you used when creating the ssh key pair.
  • If you are asked for a password when connecting to LaPalma, something is failing with the key. Most probably you are connecting from a machine whose public key is not stored on LaPalma. Before you can connect to LaPalma for the first time, we will ask you to send us the public key of your machine and we will store it, so that you can connect from that computer whenever you want. If you later want to connect from other machines, you can either send us the public keys of those machines or store them on LaPalma yourself. To do so, connect from an already authorized computer and add the new key(s) to the file ~/.ssh/authorized_keys (do not delete old keys if you still want to connect from those machines).
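As a minimal sketch of adding a key yourself, assuming the new public key has already been copied to LaPalma as a file (the helper name add_key and the file name new_machine.pub are hypothetical), the key can be appended without touching the existing ones:

```shell
# add_key FILE: append a public key to ~/.ssh/authorized_keys.
# ">>" appends, so the keys of already authorized machines are preserved.
add_key() {
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh" || return 1
    cat "$1" >> "$HOME/.ssh/authorized_keys" || return 1
    chmod 600 "$HOME/.ssh/authorized_keys"
}

# Example (hypothetical file name):
#   add_key new_machine.pub
```

The chmod calls only tighten permissions, since ssh may ignore an authorized_keys file that other users can write to.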



Q3: Should I add an acknowledgement text? ^ Top

A: Yes. Please add the following text to any publication based on results obtained on LaPalma:

  "The author thankfully acknowledges the technical expertise and assistance provided by the Spanish
  Supercomputing Network (Red Española de Supercomputación), as well as the computer resources used: the
  LaPalma Supercomputer, located at the Instituto de Astrofísica de Canarias."



Q4: How can I get some help? ^ Top

A: We continuously update this website according to the most common issues our users run into. Please read the other sections, such as Introduction, Connecting, Useful Commands (preparations), Useful Commands (executions) and the examples of Script files.

If your question is not answered on this website, do not hesitate to send us an email. Also contact us with any issue, doubt or suggestion you may have: res_support@iac.es

Q5: How can I see how much free disk space I have? ^ Top

A: All members of your group share the same quota, for both total disk space and maximum number of files. To check it, run the following command (your_group is your username without its last three digits):

   [lapalma1]$ lfs quota -hg your_group /storage
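The group name can also be derived from the username in the shell. This is only a sketch of the "username without the last three digits" rule stated above; the helper name group_of is hypothetical:

```shell
# group_of USERNAME: strip the last three characters of a username,
# following the rule stated above (username minus its last three digits).
group_of() {
    printf '%s\n' "${1%???}"
}

# On LaPalma this could be combined with the quota command, e.g.:
#   lfs quota -hg "$(group_of "$USER")" /storage
```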



Q6: Where should I install programs common to all the members of my project group? And temporary data? ^ Top

A: You should install programs accessible to all your group members in the directory /storage/projects/your_group.

For temporary data, you can use the directory /storage/scratch/your_group/your_username/. While your jobs are running, they can also use the local hard disk of each node to store temporary data (access there is faster, but that space is only available to your jobs while they are running).

Q7: Which compilers are available on LaPalma? ^ Top

A: On LaPalma you can use GNU and Intel compilers (*) for sequential/OpenMP codes, and OpenMPI compilers for parallel MPI codes. See the compiling section for more details and the optimization options for each compiler.

(*): Please contact us if you are using Intel compilers and you have any issue with the license. There is no license for Intel Parallel Studio. However, the Intel MPI Libraries and MKL are available, so most applications compiled with Intel compilers on other compatible systems should run without problems on LaPalma.

Q8: What version of MPI is currently available at LaPalma? ^ Top

A: At least OpenMPI 3.0.1 is installed on LaPalma and it fully conforms to the MPI-3.1 standard, so you should be able to compile your MPI-1, MPI-2 and MPI-3 applications without problems on LaPalma.

Q9: My application needs a special software (library, tool, ...) to run, is it available? ^ Top

A: To see the up-to-date list of installed compilers, programs, libraries, tools, etc. and their versions, use the following command:

  [lapalma1]$ module avail

If the required software is in that list, you only need to load it with the module load <module_name> command (check also the useful commands). If the software you need is not in that list (or you need another version) and you cannot install it locally, please contact us.

Q10: I got some warnings/errors when loading software modules ^ Top

A: Some modules have prerequisites and/or conflicts. For instance, to load OpenMPI/gnu you first need to load gnu. If an OpenMPI module is loaded, you cannot load any other version of OpenMPI until you unload the current module or switch it. You will receive warnings and hints when loading is not possible due to prerequisites or conflicts. Examples:

    WARNING: openmpi/gnu/3.0.1 cannot be loaded due to missing prereq.
    HINT: at least one of the following modules must be loaded first: gnu

    WARNING: openmpi/intel/3.0.1 cannot be loaded due to a conflict.
    HINT: Might try "module unload openmpi" first.

Although it is very uncommon, after several failed attempts to load or unload modules you may receive a warning about a corrupt environment. At that point it is much safer to close your current shell and open a new one, so that you work with a clean session.



Q11: Should I be careful with the Input / Output over the parallel filesystem (Lustre)? ^ Top

A: The parallel filesystem can become a bottleneck when several processes of one job write to Lustre throughout the execution. For this kind of job, one way to improve performance is to copy the data each job needs to the local scratch at the beginning and copy the results back to Lustre at the end, so that most of the I/O is performed locally. This scheme is also recommended for massive sets of sequential jobs.
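The stage-in / run / stage-out scheme can be sketched as a small shell helper. This is an illustration only: the function name stage_run is hypothetical, and the location of the node-local scratch is not given here, so it is passed in as an argument rather than assumed.

```shell
# stage_run SRC LOCAL DEST COMMAND [ARGS...]
# Copy inputs from SRC (e.g. on Lustre) to LOCAL (node-local scratch),
# run COMMAND there, then copy everything back to DEST.
stage_run() {
    src=$1; work=$2; dest=$3
    shift 3
    mkdir -p "$work" "$dest" || return 1
    cp -r "$src"/. "$work"/ || return 1     # stage in
    ( cd "$work" && "$@" )                  # all I/O happens locally
    status=$?
    cp -r "$work"/. "$dest"/ || return 1    # stage out (even if the command failed)
    return "$status"
}
```

Inside a job script this would be called once per job, with the command's exit status preserved so that the batch system still sees failures.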

Also bear in mind that Lustre is an open-source, parallel file system that offers much better performance than NFS when working with large files and parallel access, since your files can be split into smaller pieces and stored on different Object Storage Targets (OSTs). You can check and set the striping options of your files and/or directories inside Lustre with the lfs command (by default no striping is done, so you may want to set your own options). Some basic examples follow (use man lfs or check the documentation for further details and more options):

  [lapalma1]$ lfs osts /storage                  # List all available OSTs
  [lapalma1]$ lfs getstripe <file or dir>        # Get striping options of a file or dir
  [lapalma1]$ lfs setstripe -c -1 <file or dir>  # Stripe file or dir across all OSTs (if you specify a directory,
                                                 # striping will be applied to new files, but not the existing ones)
                                                 # -c specifies the stripe count (how many OSTs will be used;
                                                 # -1 means all of them). You can also change the stripe size (-S)
                                                 # or the stripe offset (-o), but it is not recommended

For further advice and tips on achieving better performance, check the official documentation (Tutorials Manual, Wiki, etc.). Other institutions also publish valuable information about using Lustre file systems, like the Lustre Basics and Best Practices available at NAS (NASA Advanced Supercomputing).

Q12: Should I be concerned about the big/little endian problem like when executing on LaPalma2? ^ Top

A: No, you do not need to worry about endianness when working with LaPalma3. This machine is built with Intel processors (little-endian architecture), so it almost certainly has the same endianness as the processors of your laptop or desktop PC (usually Intel or AMD). You may only have endianness problems if you transfer binary files from computers with big-endian architectures (like PowerPC processors); contact us to solve that.

Q13: How can I know the number of idle nodes? ^ Top

A: It could be useful to know the number of idle nodes before submitting your scripts, in order to reduce the waiting time in the queue. There are a couple of scripts that show information about how many nodes (and cores) are currently idle:

  # Show number of idle nodes:
  [lapalma1]$ idlenodes

  # Show number of idle cores (basically 16 times the number of nodes):
  [lapalma1]$ idlecores

Note: The information shown by these commands may lag a few seconds behind the real current status of the queue.

Q14: Can I execute my programs interactively? ^ Top

A: When you open an SSH session on LaPalma, you are connected to the login node, which is shared by all users. To keep its load at a level where everyone can work, it is forbidden to run any heavy/parallel process on the login node. The login node should only be used to prepare the executions (compile your code, decompress the data, etc., as long as those tasks take less than 10 minutes), while all the executions and long tasks must be carried out on the computing nodes through the queue system. Therefore you will need to create a job to run your applications, and it will be executed according to your priority when all the resources you need are available. See the executing your applications section on the Useful commands page and also the examples of script files to learn how to submit your jobs.

If for any reason you need to work interactively on the login node for a long while (for instance, to work on a visualization), you must submit a job to the interactive queue, which grants you 1 hour of working time on a single CPU. Use the following command for this purpose, and remember to exit the interactive session once you are done:

  [lapalma1]$ salloc -p interactive

Example:

  # 1) Submit a job to the interactive queue and wait until it is allocated
  [lapalma1]$  salloc -p interactive
  salloc: Pending job allocation 1234 
  salloc: job 1234 queued and waiting for resources
  salloc: job 1234 has been allocated resources
  salloc: Granted job allocation 1234
  salloc: Waiting for resource configuration
  salloc: Nodes login1 are ready for job

  # 2) Now you can work up to 1 hour on login1 (only for sequential tasks!)
  [login1]$  ...

  # 3) Once you are done, exit the job
  [login1]$  exit
  exit
  salloc: Relinquishing job allocation 1234

  # 4) You are again using normal mode, executions longer than 10min are not allowed
  [lapalma1]$



Q15: I get an error when submitting my jobs... ^ Top

A: When submitting your jobs you have to specify some mandatory parameters (see these examples). If any of these parameters is missing, you will receive an error when submitting your job.

If you receive a message like this...

  sbatch: error: This does not look like a batch script.  The first
  sbatch: error: line must start with #! followed by the path to an interpreter.
  sbatch: error: For instance: #!/bin/sh

... check that your script begins with #!/bin/sh and that there is no leading whitespace before the # symbol (remove it if there is any).
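A quick way to test a script before submitting it is sketched below. The helper name has_shebang is hypothetical; it only checks that the first line starts with #! with nothing in front of it:

```shell
# has_shebang FILE: succeed only if the first line of FILE starts with "#!"
# (any space or tab before the "#" makes the check fail).
has_shebang() {
    head -n 1 "$1" | grep -q '^#!'
}

# Example (script.sub is a placeholder for your submit script):
#   has_shebang script.sub || echo "first line must start with #!"
```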

If you receive a message like this:

  sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

... you may be asking for resources that exceed the present limits. Remember there is a maximum number of cores you can use, a run-time limit, etc.

Q16: How can I see the status of my jobs in queue? ^ Top

A: The following command shows information about your jobs in the queues:

  [lapalma1]$ squeue

To obtain detailed information about a specific job:

  [lapalma1]$ scontrol show job <job_ID>

If you want to get more info about your job times (estimation of starting and ending times, used and remaining time, etc):

  [lapalma1]$ jobtimes



Q17: I have submitted some jobs, but they never run... ^ Top

A: If you submitted some jobs and they have been waiting in the queue for a long while, try the next steps:

  1. Check that there are no problems with that job: Use command squeue to see your queue status and check that waiting jobs show PD in the STATE column. If some of them show F, there are some problems that you need to fix (you can check job state codes here).
  2. Run command squeue and check column NODELIST(REASON), there you will find information about why your job is not running... If you want further details, run command "scontrol show job <job id>" and search for string "JobState=PENDING Reason=XXX Dependency=...".
    Reason field should tell you why your job is still waiting. You can check the complete list of job reasons codes, but the most common are the following ones:
    • PartitionTimeLimit or PartitionNodeLimit: you are asking for more time or nodes than are available in the partition (queue). Your job will likely never run unless you change the walltime or the number of nodes, respectively. For instance, if you are trying to run an 80-hour job when the maximum is 72 hours, you need to change either the walltime or the queue to a valid one (remember that you can edit your job after submission; there is no need to cancel and re-submit it).
    • Resources (or None): at this moment there are not enough free resources (nodes) to satisfy your job, so it will wait until the needed resources become available.
    • Dependency: this job depends on other job(s) that has/have not finished yet.
    • Priority: there are jobs with higher priority than yours.
    • AssociationJobLimit: if your group has a limitation over the executions, that limit has been reached.
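To pull just the reason out of the scontrol output, a small filter like the following could be used. It is a sketch: the helper name job_reason and the job ID 1234 are hypothetical, and it assumes the "JobState=... Reason=..." line format quoted above.

```shell
# job_reason: read "scontrol show job" output on stdin and print the value
# of the first Reason= field found.
job_reason() {
    sed -n 's/.*Reason=\([^ ]*\).*/\1/p' | head -n 1
}

# On LaPalma (job ID 1234 is hypothetical):
#   scontrol show job 1234 | job_reason
# A pending job can also be edited in place instead of being resubmitted, e.g.:
#   scontrol update JobId=1234 TimeLimit=72:00:00
```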



Q18: I get an error when running my jobs... ^ Top

A: If your job is submitted, but there are errors when it is executed, you should find some information in the error file (the one specified with parameter -e). Please, check that file to find out where the problems seem to be located, most times they are related to:

  • your submit files: check that all the SLURM options are correct. If you receive errors about an executable not being found (like mpirun: command not found or error: execve(): vasp: No such file or directory), missing dynamic libraries (*.so), etc., make sure you are loading the proper modules (we recommend cleaning the environment with module purge and then loading only the needed modules). Also check that you are specifying the right paths and/or commands to execute your program and to locate the input/output files.
  • your code: there are some bugs, invalid operations, conditions that have not been considered, incorrect paths, etc.
  • the problem you are solving: its size is too big and you are asking for more memory / disk than is available, etc.
  • the system: there is a one-time problem with the file system or the network, etc.

If you see errors like Fatal error in MPI_Init, or your MPI programs are not running in parallel, you may not be launching them with the right command. Depending on how the MPI program was compiled, it should work with one of the following commands:

  mpirun program
  or
  srun program
  or
  srun --mpi=pmi2 program



Q19: How do I know the position of my jobs in queue? ^ Top

A: The following command shows the estimated time at which the specified job will start (check the value of the StartTime field):

  [lapalma1]$ scontrol show job <job_ID>

You can also try the following script, which gives you information about the times related to your jobs:

  [lapalma1]$ jobtimes



Q20: Is there any way to make my jobs wait less in queue before running? ^ Top

A: You must tune the directive #SBATCH -t wall_clock_limit to the expected job duration. This value is used by the scheduler to decide when to run your job, so shorter values are likely to reduce the waiting time. However, note that a job that exceeds its wall_clock_limit will be cancelled, so it is recommended to work with a small safety margin.
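As an illustration of choosing a limit with a margin, the sketch below turns an expected duration in minutes plus a percentage margin into the HH:MM:SS form accepted by -t. The helper name walltime and the 10% figure are only examples, not a site recommendation:

```shell
# walltime EXPECTED_MINUTES MARGIN_PERCENT: print HH:MM:SS with the margin added
# (integer arithmetic, so fractions of a minute are dropped).
walltime() {
    total=$(( $1 + $1 * $2 / 100 ))
    printf '%02d:%02d:00\n' $(( total / 60 )) $(( total % 60 ))
}

# Example: a job expected to take 100 minutes, with a 10% margin:
#   sbatch -t "$(walltime 100 10)" script.sub
```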

Q21: I want to stop a running job, how can I do that? ^ Top

A: If you want to stop a job that is already running, or remove a job that is still waiting in the queue, simply cancel it with the following command:

   [lapalma1]$ scancel <job id>

Check also other useful commands when executing your jobs.

Q22: My job is finished but I see no output or it is not complete... ^ Top

A: Output that is normally printed on screen (stdout and stderr) will be saved to files by the batch queuing system once your jobs are done. The names and locations of those output files have to be specified in the script file. If you do not see these files or they are truncated, please go through the following steps:

  1. Use squeue to make sure that the job has already finished (it should not appear in the list). Once the job is finished, the system will begin to copy output files from the nodes where your jobs were executed on. That process could take a while, so it could happen that you may need to wait some seconds or a few minutes till all your files are copied, that will depend on the number and size of your output files.
  2. If your job is finished, but you do not see output files, check the parameters in the script to see where the files should have been created. Make sure that you have used the -o parameter for standard output and -e for errors. You can use absolute or relative paths (relative to the working directory, the one you specified using parameter -D). Check that all paths exist and they are the expected ones, maybe there is an error in the paths and your files were created in a different location.
  3. Check that your jobs create files with different names, so that one job cannot overwrite the output files of other job(s). The easiest way to avoid this is to add %j somewhere in your filenames when using the -o and -e parameters, so your files will include the job number, which is different in every submission. If you are running job arrays, you will also want to add other values like %a and %A.
  4. Check the run-time limit (set with the -t parameter). If your application exceeds that limit, your program will be terminated by the system and your output will probably be truncated.
  5. Check that your disk quota is not exceeded.
  6. Check whether your application crashed (internal error, not enough memory, etc.). For instance, commands like sstat or sacct report values such as the maximum memory used; check that this value is not close to the available memory.
  7. Contact us if none of these steps solved your problem.
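The output-related options from steps 2 and 3 might look like this in a submit script. This is a fragment only (the other mandatory parameters are omitted), and the directory and file names are hypothetical:

```shell
#!/bin/sh
# Working directory: relative -o / -e paths are resolved from here (hypothetical path).
#SBATCH -D /storage/scratch/your_group/your_username/run1
# %j is replaced by the job ID, so each submission writes its own files.
#SBATCH -o myjob_%j.out
#SBATCH -e myjob_%j.err
# For job arrays, %A (array job ID) and %a (array task index) can be used instead:
#   #SBATCH -o myjob_%A_%a.out
#   #SBATCH -e myjob_%A_%a.err
```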



Q23: My jobs have some special needs (dependencies, should start at a specific time, ...) ^ Top

Dependencies

If some of your jobs depend on other one(s), you can specify the dependencies in the script file that you submit to the queue. For example, suppose you submit two jobs with IDs 12301 and 12302, and you want to submit a third job that should only begin if the two previous jobs were successfully executed. Then you need to add the following line to the script file of the third job:

 #SBATCH -d afterok:12301:12302
 or
 #SBATCH --dependency=afterok:12301:12302

You can also specify the dependencies in command line when submitting the job, then it will be easier since you do not need to change the script file to specify the IDs (see also this example):

  [lapalma1]$ sbatch -d afterok:12301:12302 script.sub

Other events can be used when specifying dependencies, like jobs that will begin only if other jobs fail (afternotok), or if other jobs finish successfully or not (afterany), or after other jobs begin (after), etc. You can also specify a list of jobs for the dependencies, specifying if all of them have to satisfy the dependencies or just any (see more options). Also bear in mind that dependencies can be modified after submission with scontrol command.
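When the job IDs are only known at submission time, the dependency string can be built in the shell. The sketch below assumes nothing beyond the afterok:ID:ID syntax shown above; the helper name afterok and the script names are hypothetical. The --parsable option of sbatch prints just the job ID, which makes it easy to capture.

```shell
# afterok ID [ID...]: build an "afterok:ID1:ID2:..." dependency string.
afterok() {
    dep=afterok
    for id in "$@"; do
        dep="$dep:$id"
    done
    printf '%s\n' "$dep"
}

# On LaPalma (script names are hypothetical):
#   id1=$(sbatch --parsable first.sub)
#   id2=$(sbatch --parsable second.sub)
#   sbatch -d "$(afterok "$id1" "$id2")" third.sub
```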

Dependencies can also be used for very long jobs that exceed the time limit of all available queues. If the application you are running is able to generate checkpoints and resume the execution from them, then several jobs can be submitted so that each one waits until the previous ones have finished (use -d singleton for this: a job will only begin once all previous jobs with the same job name and user have finished). You also need to prepare your program to generate checkpoints before the walltime of the queue is reached and the job is killed, and to make them available so that the next job can resume the execution from the last checkpoint.

Deferral time

If your jobs need to begin at a certain time (maybe after the data is generated and automatically copied to LaPalma3), then you can use --begin option and specify the time that the job should start (if there are enough resources). Time could be absolute (--begin=21:30, --begin=2016-10-02T17:23:30) or relative (--begin=now+2hour or --begin=now+7200 to begin in two hours after submission). Time could be changed after the job is submitted using scontrol command (see more info and examples).

Q24: Can I automatically run on LaPalma many instances of my parallel (or serial) program with different input data? ^ Top

A: If you have a parallel (or serial*) program that has to be run over a large set of different data, it is possible to automate the executions (as with GREASY on LaPalma2). This can be done with the job array feature available in SLURM (the batch queuing system); you can check some examples in the script file section. Please contact us so we can study your problem and help you with these executions.

(*) IMPORTANT: If you are using job arrays to execute sequential programs, make sure you are doing things properly and that all cores of each node are being used. If your submit script is not correct, it is relatively easy to end up executing just one sequential program on each node, so 15 cores will be wasted per node. If you run this on many nodes, you could very quickly consume your assigned hours, wasting 94% of the resources that have been granted to you. So please test and double check your submit script before submitting it to the queue, and contact us if you have any doubt (you can use the sequential job array example listed in the script file section as a template).
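One way to keep all 16 cores of a node busy is to let each array task launch 16 sequential runs. The sketch below is not the site's template (use the example in the script file section for that); it only illustrates the index arithmetic, and the helper name task_inputs, the program name and the file names are hypothetical.

```shell
# task_inputs TASK_ID CORES: print the input indices handled by one array task,
# so that every task keeps all CORES busy with sequential runs.
task_inputs() {
    first=$(( $1 * $2 ))
    i=0
    while [ "$i" -lt "$2" ]; do
        echo $(( first + i ))
        i=$(( i + 1 ))
    done
}

# Inside a job-array script this might be used as follows:
#   for n in $(task_inputs "$SLURM_ARRAY_TASK_ID" 16); do
#       ./my_program "input_$n.dat" > "output_$n.txt" &
#   done
#   wait    # keep the job alive until all 16 background runs finish
```

With this layout array task 0 processes inputs 0 to 15, task 1 processes 16 to 31, and so on.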

Q25: How can I know how much CPU time (or other resources) have been consumed by my jobs? ^ Top

A: You can use commands like sreport, sacct and/or sstat to find out the CPU time that your jobs have consumed over a given period, either as a total or broken down by job. Please visit the useful commands section, which has several examples of these commands and their options.
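As a rough sketch of totalling CPU time from sacct, the filter below sums a column of CPU seconds (the CPUTimeRAW field of sacct) and prints hours. The helper name cpu_hours and the start date are hypothetical; the examples in the useful commands section remain the reference.

```shell
# cpu_hours: sum a column of CPU seconds read from stdin and print the total in hours.
cpu_hours() {
    awk '{ s += $1 } END { printf "%.2f\n", s / 3600 }'
}

# On LaPalma (start date is hypothetical):
#   -X: one line per job, -n: no header, -P: parsable output, -o: fields to print
#   sacct -u "$USER" -S 2024-01-01 -X -n -P -o CPUTimeRAW | cpu_hours
```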