Compiling on Maxwell

The PathScale compilers have a wide range of compiler options. On the HPCF, the relevant commands have been surrounded by wrapper scripts that provide a reasonable set of defaults for the HPCF systems and some detection of potential pitfalls. This is optional, and can be disabled entirely, but you are advised not to do that unless you know what you are doing. In particular, you are strongly advised not to do that just to make some horrible autoconfigure script or Makefile `work'. It is very likely that such scripts have not been updated in years and are totally inappropriate for the HPCF systems.

The following description serves two purposes. It is the specification of the wrapper script system, how to control it and why it does what it does. And it is a summary of the most important compiler and linker options, some suggestions of which ones are worth looking at further, and some indication of which ones not to use. Case is significant, as conventional under Unix.

Unsupported Facilities

While the gcc and g77 compilers are available, the PathScale compilers in general give significantly better performance. The commands that should be used are pathf90, pathcc, pathCC, mppathf90, mppathcc and mppathCC. The mp... commands are for building MPI programs.

General Environment Variables etc.

If the environment variable HPCF_MODE is set to yes, the wrapper scripts will operate in HPCF mode. If it is unset or set to no, they will do nothing and the commands will behave as supplied by PathScale. Other values will cause an error. It is set to yes by default in the system login scripts. If it is unset or set to no, none of the other environment variables will be inspected, and the rest of this document is relevant only as a description of PathScale's compiler options. The operation of HPCF_MODE relies on having /usr/local/bin first on your search path; this is also set up by default.

If the environment variable HPCF_ARCH is unset or set to 64bit, the wrapper scripts will operate in 64 bit mode. If it is 32bit, they will operate in 32 bit mode. Other values will cause an error. It is set to 64bit by default. You are strongly advised to work in 64 bit mode, even if integers in your program are kept as 32 bits, for a large number of reasons. You cannot link mixtures of 32 and 64 bit code, but this is not a major problem, as mistakes are diagnosed by the linker.

The environment variable HPCF_VERBOSE can be set to no, yes or all, and controls how many static diagnostic options are set. It does not set any options that will slow down execution significantly. An unset value is equivalent to yes, and other values will cause an error. The default is yes.
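
The accepted values of the three variables described so far can be summarised as a small pre-flight check. This is only a sketch of the rules as stated in this document, not the wrapper scripts themselves. The function name hpcf_check_env is hypothetical, and treating an empty value as unset is an assumption.

```shell
# Sketch only: mirrors the rules described in this document.
# hpcf_check_env is a hypothetical helper, not part of the HPCF system.
# Assumption: a variable set to the empty string is treated as unset.
hpcf_check_env() {
    case "${HPCF_MODE:-}" in
        yes) ;;
        ""|no)
            echo "HPCF_MODE off: commands behave as supplied by PathScale"
            return 0 ;;
        *)
            echo "error: HPCF_MODE must be yes or no" >&2
            return 1 ;;
    esac
    case "${HPCF_ARCH:-64bit}" in
        64bit|32bit) ;;
        *)
            echo "error: HPCF_ARCH must be 64bit or 32bit" >&2
            return 1 ;;
    esac
    case "${HPCF_VERBOSE:-yes}" in
        no|yes|all) ;;
        *)
            echo "error: HPCF_VERBOSE must be no, yes or all" >&2
            return 1 ;;
    esac
    # HPCF mode relies on /usr/local/bin being first on the search path.
    case "$PATH" in
        /usr/local/bin|/usr/local/bin:*) ;;
        *) echo "warning: /usr/local/bin is not first on PATH" >&2 ;;
    esac
    echo "HPCF mode: arch=${HPCF_ARCH:-64bit} verbose=${HPCF_VERBOSE:-yes}"
}
```

Running hpcf_check_env before a build reports which mode would be in effect, and returns a non-zero status for any value that the wrappers would reject.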

If the compilation command option -hpcf_dryrun is specified and HPCF_MODE is set, the scripts will print out the expanded commands that they would have called, and do nothing. If HPCF_MODE is not set, the option will cause a compilation error. This may help with debugging, and in building scripts and make files for use elsewhere, but be careful that you do not set an option that relies on the configuration of the HPCF.
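
As a minimal sketch, a dry run can be wrapped in a helper that refuses to proceed when HPCF mode is off, since the option would then be passed to the compiler and rejected. The helper name hpcf_show_expanded and the file name solver.f90 are hypothetical, and placing -hpcf_dryrun first on the command line is an assumption.

```shell
# Sketch only: hpcf_show_expanded is a hypothetical helper.
# Usage: hpcf_show_expanded COMPILER [options and files...]
hpcf_show_expanded() {
    if [ "${HPCF_MODE:-no}" != "yes" ]; then
        echo "HPCF_MODE is not yes; -hpcf_dryrun would be a compilation error" >&2
        return 1
    fi
    hpcf_compiler=$1
    shift
    # Ask the wrapper to print the expanded command instead of running it.
    "$hpcf_compiler" -hpcf_dryrun "$@"
}

# Example (on the HPCF):
#   hpcf_show_expanded pathf90 -c solver.f90
```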

General Optimisation Features

If HPCF_MODE is yes, the compilation commands will set the following options:
-O2 sets the optimisation level to keep compilation speed up and the size of the compiled code down; generally, levels higher than 2 should be used only after doing some analysis of where the program is spending its time -- see below for some pointers as to how to proceed.
-m64 selects 64bit.
-m32 selects 32bit.
-mcmodel=medium allows the data segment to be larger than 2GB. 
-WOPT:warn_uninit=on gives a warning of uninitialised variables. If you find the output unhelpful you can turn it off with -WOPT:warn_uninit=off.
-lacml selects AMD's mathematical library.
-I/opt/acml2.5.1/include includes the ACML headers for C and C++.
-lpathfortran is the required PathScale compiler run-time library.
-lm is purely for convenience so users do not need to add it.
-lstdc++ is a library used by C++. It is included for all languages to allow cross compiling.

For really serious optimisation, you are recommended to try -O3, -O3 -OPT:Ofast or -Ofast, in order of aggressiveness. But you may well want to do this only on the most critical parts of your code, and you will have to put a fair amount of effort in to do this properly. There are also dangers of a loss of numerical accuracy for some codes at these higher levels of optimisation.
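
One way to apply the higher levels only to critical files is sketched below. The helper names and the numeric steps 1 to 3 are hypothetical. The sketch also assumes that an explicit option on the command line takes effect over the wrapper's default -O2, which is worth confirming with -hpcf_dryrun.

```shell
# Sketch only: the numeric steps 1..3 and both helper names are hypothetical.
# They follow the order of aggressiveness given in this document.
hpcf_opt_flags() {
    case "$1" in
        1) echo "-O3" ;;
        2) echo "-O3 -OPT:Ofast" ;;
        3) echo "-Ofast" ;;
        *) echo "usage: hpcf_opt_flags 1|2|3" >&2; return 1 ;;
    esac
}

# Compile only the named (critical) Fortran files at the chosen step.
# Usage: compile_hot STEP file.f90 [file.f90 ...]
compile_hot() {
    hot_flags=$(hpcf_opt_flags "$1") || return 1
    shift
    # $hot_flags is deliberately unquoted so that it splits into options.
    pathf90 $hot_flags -c "$@"
}

# Example (on the HPCF):
#   compile_hot 2 kernel.f90
```

Note that step 3 uses -Ofast, which includes -ipa, so the rules for -ipa given later in this section then apply to the whole build.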

-OPT:Ofast is equivalent to -OPT:roundoff=2:Olimit=0:div_split=on:alias=typed

-OPT:roundoff=2 allows for fairly extensive code transformations that may result in floating point round-off or overflow differences in computations.

-OPT:Olimit=0 is a generally safe option but may result in the compilation taking a long time or consuming large quantities of memory. This option tells the compiler to optimize the files being compiled at the specified levels no matter how large they are.

-OPT:div_split=on allows the conversion of x/y into x*(recip(y)) which may result in less accurate floating point computations.
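
To see why this can matter, the following sketch evaluates both forms in double precision using awk. It illustrates the arithmetic only: it does not invoke the compiler, the code the compiler actually generates may differ, and the values 49 and 49 are chosen purely for illustration.

```shell
# Illustration only: compare x/y with x*(1/y) in double precision.
# Assumes an awk that uses IEEE double arithmetic (the usual case).
direct=$(awk 'BEGIN { x = 49; y = 49; printf "%.17g", x / y }')
via_recip=$(awk 'BEGIN { x = 49; y = 49; r = 1 / y; printf "%.17g", x * r }')

echo "x/y          = $direct"
echo "x*(recip(y)) = $via_recip"
```

On a typical system the direct division gives exactly 1, while the reciprocal form gives a value very slightly below 1, because 1/49 is rounded before the multiplication.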

-OPT:alias=typed assumes that the program has been coded in adherence with the ANSI/ISO C standard which states that two pointers of different types cannot point to the same location in memory.

-Ofast is equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno

-fno-math-errno bypasses the setting of errno in math functions. This can result in a performance improvement if the program does not rely on IEEE exception handling to detect runtime floating point errors.

Inter-Procedural Analysis (IPA) can also give significant performance improvements and can be activated with -ipa. When you are using -ipa, all the .o files have to have been compiled with -ipa, all libraries have to have been compiled without -ipa, and linking must be done with -ipa for your compilation to be successful. The total compile time can be considerably longer with IPA than without.
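
The rule that every object file and the link step must all use -ipa can be sketched as a single helper. The name build_with_ipa and the file names are hypothetical, and the sketch assumes C sources whose names contain no spaces.

```shell
# Sketch only: build_with_ipa is a hypothetical helper.
# Usage: build_with_ipa OUTPUT file.c [file.c ...]
build_with_ipa() {
    ipa_out=$1
    shift
    ipa_objs=""
    for ipa_src in "$@"; do
        # Every object file must be compiled with -ipa.
        pathcc -ipa -c "$ipa_src" || return 1
        ipa_base=${ipa_src##*/}              # strip any directory part
        ipa_objs="$ipa_objs ${ipa_base%.c}.o"
    done
    # The link step must also use -ipa; libraries are linked as usual.
    # $ipa_objs is deliberately unquoted so that it splits into file names.
    pathcc -ipa -o "$ipa_out" $ipa_objs
}

# Example (on the HPCF):
#   build_with_ipa prog main.c util.c
```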

Understanding your code is important in choosing the best compiler options.

General Debugging Features

If HPCF_VERBOSE=all and HPCF_MODE=yes, the compilation commands will also set, for all compilers, -v, which makes the compilers more verbose, and -Wall -fullwarn, which turn on all warnings.

Other options that can help with debugging include

-trapuv Trap uninitialized variables when compiled at -O0

-g to turn on debugging for use with gdb

For Fortran you can also use

-ffortran-bounds-check Check bounds.

-C Perform runtime subscript range checking.  Subscripts that are out of range cause fatal run time errors.
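
As a minimal sketch, the options above can be combined into one debugging build for a Fortran program. The helper name debug_build_f90 and the file names are hypothetical. The sketch assumes that these options can be combined on one command line and that an explicit -O0 takes effect over the wrapper's default -O2.

```shell
# Sketch only: debug_build_f90 is a hypothetical helper.
# Usage: debug_build_f90 OUTPUT file.f90 [file.f90 ...]
debug_build_f90() {
    dbg_out=$1
    shift
    # -O0     : no optimisation, as needed for -trapuv
    # -g      : debugging information for gdb
    # -trapuv : trap uninitialized variables
    # -C      : runtime subscript range checking
    pathf90 -O0 -g -trapuv -C -o "$dbg_out" "$@"
}

# Example (on the HPCF):
#   debug_build_f90 prog_debug main.f90 solver.f90
#   gdb ./prog_debug
```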

Building MPI

You should set the environment variable HPCF_MPI to yes if you are building an MPI program, though all it does is to trap the mistake of using the serial commands to compile or link code. An unset value is equivalent to no, and other values will cause an error.

The reason that you should not use the serial commands directly if you are building an MPI program on any modern system is that the mp... commands set up the paths and libraries correctly, and these are quite hard to get right. We have no idea why some of them need to be specified, nor what will happen if you get them wrong, but at least some will cause failure to compile or link, and others will cause serious inefficiency. Note that this applies to autoconfigure scripts and make files as much as your own commands. This is a serious `gotcha'.
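
A minimal sketch of an MPI build that follows this advice is shown below. The helper name build_mpi_f90 and the file names are hypothetical. It sets HPCF_MPI to yes and uses the mp... command for both compiling and linking.

```shell
# Sketch only: build_mpi_f90 is a hypothetical helper.
# Usage: build_mpi_f90 OUTPUT file.f90 [file.f90 ...]
build_mpi_f90() {
    mpi_out=$1
    shift
    # Lets the wrappers trap accidental use of the serial commands.
    HPCF_MPI=yes
    export HPCF_MPI
    # Use the mp... command so that paths and libraries are set up correctly.
    mppathf90 -o "$mpi_out" "$@"
}

# Example (on the HPCF):
#   build_mpi_f90 wave_mpi wave.f90
```

In a Makefile or autoconfigure setup, the same idea applies: point the compiler and linker variables at the mp... commands rather than the serial ones.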

To run your MPI program, you must use the mpirun command; see the local Web page on that for more details.