The Solaris compilation and linking system is one of the most complicated ones in existence. The good news is that you can do almost anything you want to, and that it is very consistent between the main programming languages. The bad news is that it is very hard to find out what you should be doing, and it is easy to make a mistake without getting a diagnostic. Sun are improving the diagnostics to help you detect serious mistakes, but there is a long way to go.

On the HPCF, the relevant commands have been surrounded by wrapper scripts, so that the default is to provide a reasonable set of defaults for the HPCF systems and some detection of potential pitfalls. This is optional, and can be disabled entirely, but you are advised not to do that unless you know what you are doing. In particular, you are strongly advised not to do that just to make some horrible autoconfigure script or Makefile `work'. It is very likely that such scripts have not been updated in years and are totally inappropriate for the HPCF systems.

The following description serves two purposes. It is the specification of the wrapper script system, how to control it and why it does what it does. And it is a summary of the most important compiler and linker options, some suggestions of which ones are worth looking at further, and some indication of which ones not to use. Case is significant, as conventional under Unix.

Unsupported Facilities

The old f77, ucbcc and c89 commands should not be used, though the first has been aliased to the f90 command for convenience. Any program so horrible that it will work only with the first two should be fixed, and the last is purely to satisfy the letter of the POSIX standard and does not support optimisation. The commands that should be used are f95 (or f90), cc, CC, mpf95 (or mpf90), mpcc and mpCC. The mp... commands are for building MPI and S3L programs.

The -cg89 and -cg92 options, and the -xarch= option with any value for an architecture older than SPARC9, are not supported on the HPCF: the first two are obsolete, and all of them cause much less efficient code generation. The recommended values of -xarch= are v9b and v8plusb, but if HPCF_MODE is set the recommended way of controlling the addressing is the HPCF_ARCH environment variable.

The -xsafe=mem option should not be used, because it is likely to lead to obscure program failure without warning. Despite some of the documentation, there is no speculative load instruction in the SPARC9 architecture, only a non-faulting one, and any error will have the effect of turning a SIGSEGV into a wrong answer. The compilers do not seem to take much notice of this option, anyway.

The C options -xrestrict and -xalias= for anything other than any or basic should not be used, because they are likely to lead to obscure program failure without warning. This is because they are using features of C99 that are complex beyond belief and, in the latter case, ambiguous beyond belief, too. A much better solution is to use the C99 restrict qualifier carefully - a good guideline is to use it only for simple scalar, vector and matrix arguments. Please ask HPCF Support about C99 if you want to know more.

The Fortran option -xalias=actual must be set for optimised MPI code, because its absence is likely to lead to obscure program failure without warning. This is because the semantic models of Fortran and MPI are incompatible (this is addressed in the next Fortran standard, currently in draft). Because so many HPCF codes use MPI, actual is forced into the value of -xalias by the wrapper scripts. Please ask HPCF Support about this if you want to know more.

The Fortran options -onetrip and -r8const should not be used, because they are obsolete and deceptive at best. The former is for compiling Fortran 66 programs, which should have been rewritten years ago, and the latter has been superseded by the better -xtypemap option. See also the HPCF_AUTODBLE environment variable for the latter case.

Some other unsupported facilities are described later. In particular, see the warnings about environment variables beginning LD_, MPSS, MPI_, OMP and SUNW, and about the -L, -R and -M options.

In most cases, `unsupported' does not mean that these facilities are locked out (though some attempt is made to prevent their accidental use if HPCF_MODE is set); it means that problems with them will not even be investigated, and that libraries will not be built to support them. It is just possible that they could even cause system problems, in which case their use would have to be forbidden. We know of no circumstances in which any unsupported facility is likely to be useful on the HPCF.

General Environment Variables etc.

If the environment variable HPCF_MODE is set to yes, the wrapper scripts will operate in HPCF mode. If it is unset or set to no, they will do nothing and the commands will behave as supplied by Sun. Other values will cause an error. It is set to yes by default in the system login scripts. If it is unset or set to no, none of the other environment variables will be inspected, and the rest of this document is relevant only as a description of Sun's compiler options. The operation of HPCF_MODE relies on having /usr/local/bin first on your search path; this is also set up by default.

If the environment variable HPCF_ARCH is unset or set to 64bit, the wrapper scripts will operate in 64 bit mode. If it is 32bit, they will operate in 32 bit mode. Other values will cause an error. It is set to 64bit by default. You are strongly advised to work in 64 bit mode, even if integers in your program are kept as 32 bits, for a large number of reasons. You cannot link mixtures of 32 and 64 bit code, but this is not a major problem, as mistakes are diagnosed by the linker.
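The rules for HPCF_MODE and HPCF_ARCH can be pictured as a small decision function. The Python sketch below is purely illustrative: the real wrapper scripts are not reproduced in this document, the name resolve_arch_option is hypothetical, and the pairing of 64bit with -xarch=v9b and 32bit with -xarch=v8plusb is taken from the description of those options under General Optimisation Features.

```python
def resolve_arch_option(env):
    """Return the -xarch option implied by env, or None outside HPCF mode.

    Hypothetical model of the documented rules, not the real wrapper code.
    """
    mode = env.get("HPCF_MODE", "no")          # unset is treated like "no"
    if mode == "no":
        return None                            # nothing else is inspected
    if mode != "yes":
        raise ValueError("HPCF_MODE must be yes or no, got %r" % mode)

    arch = env.get("HPCF_ARCH", "64bit")       # unset is treated like "64bit"
    xarch = {"64bit": "-xarch=v9b", "32bit": "-xarch=v8plusb"}
    if arch not in xarch:
        raise ValueError("HPCF_ARCH must be 64bit or 32bit, got %r" % arch)
    return xarch[arch]
```

Calling resolve_arch_option(os.environ) would report the option for the current login environment. Note that an invalid HPCF_ARCH is not an error in this model when HPCF_MODE is unset or no, because in that case the other variables are never examined.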

The environment variable HPCF_VERBOSE can be set to no, yes or all, and controls how many static diagnostic options are set. It does not set any options that will slow down execution significantly. An unset value is equivalent to yes, and other values will cause an error. The default is yes.

The environment variable HPCF_AUTODBLE can be set to no, single or all, and controls whether Fortran default INTEGER, REAL and DOUBLE PRECISION are automatically doubled. An unset value is equivalent to no, and other values will cause an error. The default is no. It will add the following option for Fortran only:

If the compilation command option -hpcf_dryrun is specified and HPCF_MODE is set, the scripts will print out the expanded commands that they would have called, and do nothing. If HPCF_MODE is not set, it will cause a compilation error. This may help with debugging, and in building scripts and make files for use elsewhere, but be careful that you do not set an option that relies on the configuration of the HPCF.

The wrapper scripts do some truly horrible things to enable the use of large pages. This can give very significant performance improvements, but it is an area in which the behaviour is likely to vary quite radically with time, as Solaris versions and the system configuration change. You are very strongly requested not to use the -M option, the ppgsz command, the memcntl function, or the environment variables LD_PRELOAD and anything beginning with MPSS without consulting HPCF support staff first.

There are some extremely nasty `gotchas' with several options, in that they must be set consistently across the whole program, but you are unlikely to get a warning if they are not. The C++ User's Guide, section 2.4.3, contains a list which mostly applies to the other languages as well. There is also the problem that ld does not support -fast or several other options, yet it is essential to specify them at link time if they are to take effect. You should therefore not use ld to link optimised programs on any Solaris system, unless you really know what you are doing.

Related to the above `gotcha' is Sun's warning in several places that you should not use the LD_LIBRARY_PATH and similar environment variables for optimised code, but there are also less clearly documented traps with using any of the LD_... environment variables, using the -L option to set system search paths, or using the -R option. Please avoid using any of these, though it is safe to use -L to add your own directories to the search path. Please contact HPCF support if you have any problem with this area, so that we can fix the system configuration if necessary. You can disable most of the checks by setting the environment variable HPCF_LD_PATHS to yes, but are advised not to.

Sun's incremental linker (ild) needs 5 GB of data space to run (as set by ulimit -d 5120000). This is vastly more than is available interactively, and increasing the limit would mean that runaway programs would prevent other users working. So -xildoff is set by default. Sun are investigating why.
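To put the quoted limit in perspective, the following Python sketch converts the ulimit -d figure into GiB and compares it with a data-segment limit. It assumes that ulimit -d counts in units of 1024 bytes, as is usual for the common shells, and the helper names kb_to_gib and data_limit_allows are hypothetical.

```python
import resource

ILD_DATA_KB = 5120000   # the ulimit -d value quoted above (assumed 1024-byte units)


def kb_to_gib(kilobytes):
    """Convert a ulimit-style kilobyte count to GiB."""
    return kilobytes * 1024 / 2**30


def data_limit_allows(required_kb, soft_limit_bytes):
    """True if a soft RLIMIT_DATA value (in bytes) covers required_kb."""
    if soft_limit_bytes == resource.RLIM_INFINITY:
        return True                      # no limit is set at all
    return soft_limit_bytes >= required_kb * 1024
```

On a Unix system, resource.getrlimit(resource.RLIMIT_DATA) returns the current soft and hard limits in bytes, and the soft value can be passed to data_limit_allows to see whether ild would fit in the current session.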

General Optimisation Features

If HPCF_MODE is yes, the compilation commands will set the options: -fast expands into a large number of other, generally safe, optimisation options; you are advised to look at Sun's documentation to see what it does, as they may cause a few of the uncleaner programs to be rejected or fail. The options fall roughly into three categories: -O3 reduces the optimisation level to keep compilation speed up and the size of the compiled code down; levels higher than 3 should generally be used only after doing some analysis of where the program is spending its time; see below for some pointers as to how to proceed. -stackvar uses the stack for local variables, and is needed for some other options; as it seems efficient, it is set by default. -xvector uses the vectorised forms of some mathematical functions, and -prefetch enables the SPARC9 prefetch instructions. -xarch=v9b and -xarch=v8plusb select 64-bit and 32-bit mode for the SPARC9 architecture, which is very important for efficiency. -xlic_lib=sunperf and -library=sunperf select Sun's optimised LAPACK and BLAS library. -lm is purely for convenience.

Two useful Fortran optimisation options that are not currently set by default because mistakes in using them are poorly diagnosed are -xalias=no%craypointer and -xknown_lib=blas,intrinsics. If your code does not use Cray pointers (and it shouldn't, if you are interested in performance), you should add the first. If your code does not include routines with the same name as any of the BLAS or Fortran intrinsics (and that is not a good idea, either), you should set the second. If Sun improve the diagnostics, these will be added to the defaults, as they are otherwise quite safe.

For really serious optimisation, you are recommended to look at the -xipo and -xprofile options and to restore the optimisation level to -O5. But you may well want to do this only on the most critical parts of your code, and you will have to put a fair amount of effort in to do this properly. The -xcrossfile option is not recommended, in general, as the -xipo one is more powerful. Please do not use -xF without consulting HPCF Support first, as it needs the use of the -M option. Sun provides a lot of relevant facilities, but you should be able to follow them starting from the pointers in the documentation of -xipo and -xprofile.

General Debugging Features

If HPCF_MODE is yes, the compilation commands will set the options: -fpover detects overflow in formatted input. -u warns about undeclared variables and -ansi warns about the use of extensions to the standard; as these can be very verbose, they are not set by default. -U__MATHERR_ERRNO_DONTCARE restores the detection of errors in some mathematical functions that is disabled by -fast. -xstrconst puts strings into read-only storage; any program that has trouble with this should be fixed. The others increase the number of miscellaneous warnings.

-xcheck=stkovf improves the detection of stack overflow, and is quite important for OpenMP or other threaded programs; it is not set by default, as it increases the time needed to extend the stack from about 2.5 seconds/GB to just over 20 seconds/GB.

The environment variable HPCF_FPETRAP can be set to no or yes, and controls whether SIGFPE is trapped in C and C++ by adding the option -ftrap=common. An unset value is equivalent to yes, and other values will cause an error. It is set to yes by default. This may cause some trouble, but the common default on most modern systems of no floating-point trapping in C causes a great many programs to give wrong answers with no indication of the fact; you are strongly advised to locate the cause of a problem rather than just disabling this.
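The danger of running without floating-point trapping can be shown in a few lines. The sketch below is in Python rather than C, and is only an illustration of the principle, not of Sun's compilers: Python's float multiplication and subtraction also proceed silently on overflow and invalid operations. The function name checked_residual and the tolerance are hypothetical.

```python
import math


def checked_residual(scale):
    """Compute a 'residual' and test it against a tolerance.

    With scale near the top of the double range the product overflows
    to infinity, and infinity minus infinity gives NaN. No exception
    is raised at either step.
    """
    overflowed = scale * 10.0            # silently becomes inf for huge scale
    residual = overflowed - overflowed   # inf - inf silently becomes NaN
    too_large = residual > 1e-6          # every comparison with NaN is False
    return residual, too_large


residual, too_large = checked_residual(1e308)
print(residual, too_large)
```

The tolerance check reports no problem even though the residual is meaningless, which is exactly the kind of wrong answer with no indication that trapping is intended to expose.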

Other useful debugging options include -g and (for Fortran) -Xlist; you should look at the documentation, as their use and behaviour are not obvious. Some other options, such as the -C option for Fortran array bound checking, are less thorough than you might expect; for such testing, the NAG Fortran compiler is better. Don't be fooled into thinking that enabling IEEE arithmetic will help with discovering floating-point errors; it won't, and the reasons have nothing to do with Solaris or SPARC.

There are a large number of other facilities and several debuggers, but they have not yet been investigated.

Building MPI or S3L Programs

You should set the environment variable HPCF_MPI to yes if you are building an MPI or S3L program, though all it does is to trap the mistake of using the serial commands to compile or link code. An unset value is equivalent to no, and other values will cause an error. S3L is Sun's parallel mathematical library.

The reason that you should not use the serial commands directly if you are building an MPI or S3L program on any modern Solaris system is that the mp... commands set up the paths and libraries correctly, and these are quite hard to get right. We have no idea why some of them need to be specified, nor what will happen if you get them wrong, but at least some will cause failure to compile or link, and others will cause serious inefficiency. Note that this applies to autoconfigure scripts and make files as much as your own commands. This is a serious `gotcha'.
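The check that HPCF_MPI enables can be pictured as follows. This Python sketch is a hypothetical model, not the wrapper code itself: the function name check_compiler_command and the error messages are invented for illustration, while the command names and their mp... counterparts are those listed under Unsupported Facilities.

```python
# Serial commands and their MPI/S3L counterparts; case is significant.
SERIAL_TO_MPI = {"f95": "mpf95", "f90": "mpf90", "cc": "mpcc", "CC": "mpCC"}


def check_compiler_command(command, env):
    """Reject a serial compiler command when HPCF_MPI is yes."""
    setting = env.get("HPCF_MPI", "no")        # unset is treated like "no"
    if setting not in ("yes", "no"):
        raise ValueError("HPCF_MPI must be yes or no, got %r" % setting)
    if setting == "yes" and command in SERIAL_TO_MPI:
        raise RuntimeError("%s is a serial command; use %s for MPI or S3L code"
                           % (command, SERIAL_TO_MPI[command]))
    return command
```

A make file that hard-codes cc would be caught by such a check as soon as HPCF_MPI is set to yes, which is the point of setting it.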

To run your MPI or S3L program, you must use the mprun command (mpirun is an alias on the HPCF, but not generally in Solaris); see the local Web page on that for more details.

Building OpenMP Programs

You should set the environment variable HPCF_OPENMP to yes if you are building an OpenMP program, and it will set the options `-xopenmp -xautopar', plus -xloopinfo if HPCF_VERBOSE is yes or `-xloopinfo -vpara' if HPCF_VERBOSE is all. You can also specify the -openmp or -xopenmp option explicitly, and the HPCF_OPENMP variable will be ignored. A value of no is equivalent to unset, and other values will cause an error.

-xopenmp selects parallelisation based on explicit OpenMP directives, -xautopar selects automatic parallelisation, -xloopinfo prints a description of whether each loop was or was not parallelised and -vpara gives more verbose explanations. You can get further parallelisation information by setting the -g option and using the er_src command on the compiled object file - yes, really!
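The way HPCF_OPENMP and HPCF_VERBOSE combine can be summarised in code. The Python sketch below is a hypothetical model of the rules described above and assumes that HPCF_MODE is yes; the function name openmp_options is invented, and what the real scripts add when -openmp or -xopenmp is given explicitly is not stated here, so the model simply adds nothing in that case.

```python
def openmp_options(env, user_args):
    """Return the options added on account of HPCF_OPENMP (may be empty)."""
    if "-openmp" in user_args or "-xopenmp" in user_args:
        return []                          # explicit option: variable ignored

    setting = env.get("HPCF_OPENMP", "no")     # unset is treated like "no"
    if setting == "no":
        return []
    if setting != "yes":
        raise ValueError("HPCF_OPENMP must be yes or no, got %r" % setting)

    options = ["-xopenmp", "-xautopar"]
    verbose = env.get("HPCF_VERBOSE", "yes")   # unset is treated like "yes"
    if verbose == "yes":
        options.append("-xloopinfo")
    elif verbose == "all":
        options.extend(["-xloopinfo", "-vpara"])
    elif verbose != "no":
        raise ValueError("HPCF_VERBOSE must be no, yes or all, got %r" % verbose)
    return options
```

For example, with HPCF_OPENMP set to yes and HPCF_VERBOSE unset, the model yields -xopenmp, -xautopar and -xloopinfo.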

You are strongly advised not to specify a scheduling policy on your OpenMP directives, and to use only static scheduling if you do. Experience with running benchmarking codes on many modern large SMP systems is that the advice to use only static scheduling applies to all of them. Both dynamic and guided scheduling are vastly less efficient, so much so that it is sometimes difficult to believe the figures.
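One reason often given for the low cost of static scheduling is that each thread's share of the loop is fixed by simple arithmetic before the loop starts, whereas dynamic and guided scheduling require threads to fetch work from shared state repeatedly while the loop runs. The Python sketch below shows one plausible even split of iterations into contiguous blocks; the exact division used by any particular OpenMP implementation may differ, and the function name static_blocks is hypothetical.

```python
def static_blocks(n_iterations, n_threads):
    """Split range(n_iterations) into one contiguous block per thread.

    The first (n_iterations % n_threads) threads get one extra iteration,
    so block sizes differ by at most one.
    """
    base, extra = divmod(n_iterations, n_threads)
    blocks = []
    start = 0
    for thread in range(n_threads):
        size = base + (1 if thread < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


for thread, block in enumerate(static_blocks(10, 4)):
    print(thread, list(block))
```

Nothing in this assignment depends on what the other threads are doing, so no coordination is needed once the loop is under way.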

To run your OpenMP program, you should set the environment variable OMP_NUM_THREADS to the number of CPUs in the GridEngine queue, though this may be done automatically if we can manage it.

You are strongly advised not to use Sun's older parallel directives (including the Cray ones) by setting either of the -parallel or -mp=... options directly, nor to set the PARALLEL environment variable. They are obsolete and are effectively unsupported even by Sun.

The default user login and job submission will set some environment variables for running OpenMP efficiently. You are strongly advised not to set, unset or change any environment variable beginning SUNW or OMP (except possibly SUNW_MP_WARN and of course OMP_NUM_THREADS) without contacting HPCF support first. We don't know what many of these do, and they may have weird effects or even cause system trouble.