BCM81 to BCM90 Changes

From HPC Docs
After the upgrade from Bright Cluster Manager 8.1 to 9.0, many software packages were updated. Some of these changes require modifications to your job submission scripts or other environment settings. The issues discovered so far are listed below, each with a fix.
== LMOD Errors When Logging In ==
You may receive error messages like those below when logging in for the first time after the upgrade.
<pre>
Lmod Warning:  The following modules were not loaded: gcc
Lmod Warning:  One or more modules in your default collection have changed: "tcnjhpc".
To see the contents of this collection execute:
  $ module describe default
To rebuild the collection, do a module reset, then load the modules you wish, then execute:
  $ module save default
If you no longer want this module collection execute:
  $ rm ~/.lmod.d/default
For more information execute 'module help' or see http://lmod.readthedocs.org/
No change in modules loaded.
</pre>
and/or
<pre>
Lmod has detected the following error:  The following module(s) are unknown: "slurm"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore-cache load "slurm"
Also make sure that all modulefiles written in TCL start with the string #%Module
Lmod has detected the following error:  The following module(s) are unknown: "slurm"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore-cache load "slurm"
Also make sure that all modulefiles written in TCL start with the string #%Module
</pre>
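These errors can often be resolved by logging out and back in again. If they persist, try the commands suggested in the warning (shown below): the first lists the contents of your default collection, and the second saves the currently loaded modules as the new default.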
<pre>
module describe default
module save default
</pre>
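The Lmod warning quoted earlier describes the full rebuild as a reset, then loading the modules you want, then a save. A minimal sketch of that sequence as a shell function follows. The function name <code>rebuild_default_collection</code> is hypothetical and not part of Lmod, and the module names you pass must exist on the upgraded cluster (check with <code>module avail</code>).

```shell
# Sketch: rebuild the Lmod "default" collection from a clean state.
# rebuild_default_collection is a hypothetical helper name, not an Lmod command.
rebuild_default_collection() {
    if [ "$#" -eq 0 ]; then
        echo "usage: rebuild_default_collection MODULE [MODULE...]" >&2
        return 1
    fi
    module reset || return 1        # return to the system default modules
    module load "$@" || return 1    # load the modules you want saved
    module save default             # overwrite the saved default collection
}
```

For example, <code>rebuild_default_collection gcc</code> would reset, load <code>gcc</code>, and save the result. The module name here is only a placeholder.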
Finally, if that still doesn't work, delete the default collection with the command below. Type it very carefully, or copy and paste it if possible: the command contains exactly one space, immediately after <code>rm</code>.
<pre>
rm ~/.lmod.d/default
</pre>
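If you would rather not type the <code>rm</code> command by hand, a guarded version can reduce the risk of deleting the wrong file. The following is a minimal sketch, not a site-provided tool. The function name <code>remove_default_collection</code> and its optional directory argument are assumptions made for illustration.

```shell
# Sketch: delete only the Lmod "default" collection file, and only if it exists.
# remove_default_collection is a hypothetical helper name. The optional
# argument (the collection directory) lets you try it on a scratch directory.
remove_default_collection() {
    local dir="${1:-$HOME/.lmod.d}"
    local target="$dir/default"
    if [ ! -f "$target" ]; then
        echo "nothing to remove: $target not found" >&2
        return 1
    fi
    rm -- "$target"
}
```

Called with no argument, it acts on <code>~/.lmod.d/default</code>, the same file as the command above, and leaves any other saved collections untouched.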
== Errors Submitting Jobs using Sbatch ==
The <code>--workdir</code> option in <code>sbatch</code> submission scripts has been replaced by <code>--chdir</code>. If you get the following error when submitting a job, your submit script still uses the old option and needs to be updated.
<pre>
$ sbatch submit.sh
sbatch: unrecognized option '--workdir=./'
Try "sbatch --help" for more information
</pre>
Simply edit your submit script and change
<pre>
#SBATCH --workdir=./
</pre>
to
<pre>
#SBATCH --chdir=./
</pre>
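If you have many submit scripts, the same change can be applied in bulk. The sketch below assumes the directive sits at the start of a <code>#SBATCH</code> line, as in the example above. It does not handle <code>--workdir</code> given later on a line or on the <code>sbatch</code> command line. The function name <code>fix_workdir</code> is hypothetical.

```shell
# Sketch: rewrite "#SBATCH --workdir" directives to "#SBATCH --chdir" in place.
# fix_workdir is a hypothetical helper name. A backup copy with a .bak suffix
# is kept next to each file that gets edited.
fix_workdir() {
    for script in "$@"; do
        # Only touch files that actually contain the old directive.
        if grep -q '^#SBATCH[[:space:]][[:space:]]*--workdir' "$script"; then
            sed -i.bak 's/^\(#SBATCH[[:space:]][[:space:]]*\)--workdir/\1--chdir/' "$script"
            echo "updated: $script"
        fi
    done
}
```

For example, <code>fix_workdir submit.sh</code> edits <code>submit.sh</code> and leaves the original in <code>submit.sh.bak</code>. Review the result before resubmitting.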

Latest revision as of 13:40, 24 May 2021
