Cambridge-Cranfield HPCF: Programming IO

I/O is one of the weaker areas of the HPCF machines. Although they are both faster than a departmental workstation when used correctly, they are not as fast as might be expected for machines of their class. Hence it is important to ensure that the best possible performance is extracted from them if a significant amount of I/O is being done.

The rest of this discussion is divided into the following sections:

Hardware,
FORTRAN and
C.

Finally, there is a section on data transfer.

Hardware

On hartree and hodgkin the fastest local disk is a RAID array called hartree-tmp or hodgkin-tmp respectively.

FORTRAN

In FORTRAN the amount of control that the programmer has over I/O is limited. Hence there is little to say except the standard comments about avoiding formatted I/O and using reasonably large records.

The point that implicit do loops are far more efficient than explicit ones is well known and applies to almost all architectures. On babbage

      PARAMETER(ISIZE=64)
      DOUBLE PRECISION A(ISIZE*128*1024)

      WRITE(10)(A(I),I=1,ISIZE*128*1024)

produces a 64Mb file which can be read at over 6Mb/s, whereas writing it with

      DO I=1,ISIZE*128*1024
       WRITE(10)A(I)
      ENDDO

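produces a 128Mb file which reads at under 0.5Mb/s (useful data transfer rate). The file doubles in size because each WRITE creates a separate record, and the figures above imply that every 8 byte record carries a further 8 bytes of record overhead.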

C

In C the programmer has rather more control over how I/O is performed, and the defaults are less reasonable than for FORTRAN. In particular, with fread it is possible to set the buffer size via a call to setvbuf. This can dramatically improve the I/O performance on babbage:


Read performance on babbage from /Pscratch
buffer size Throughput (Mb/s)
8K          2.2
16K         3.5
32K         4.8
64K         6.8
128K        8.8
512K        10.4

The default buffer size on babbage is 8K, but for /Pscratch, which is a striped filing system, this is far below the optimal value. It is therefore recommended to increase it to at least 64K if a significant amount of I/O is being done in this fashion. A buffer must be malloc'ed and assigned individually for each open file, so very large buffers may cause memory problems. Using read rather than fread avoids this buffering problem and can achieve 8Mb/s.
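As a minimal sketch of the setvbuf approach described above, the following opens a file for reading with a caller-chosen buffer size. The helper names open_buffered and close_buffered are hypothetical, invented for this illustration; only fopen, malloc, setvbuf, fclose and free are standard C. The 64K figure is the minimum suggested above, not a measured optimum.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: open `path` for binary reading with a fully
 * buffered stream of `bufsize` bytes. The buffer is malloc'ed here and
 * returned through `bufp`; it must stay allocated until the stream is
 * closed. Returns NULL on any failure. */
FILE *open_buffered(const char *path, size_t bufsize, char **bufp)
{
    assert(path != NULL && bufp != NULL && bufsize > 0);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buf = malloc(bufsize);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
        fclose(fp);
        free(buf);
        return NULL;
    }

    *bufp = buf;
    return fp;
}

/* Hypothetical helper: close the stream first, then release its buffer. */
void close_buffered(FILE *fp, char *buf)
{
    fclose(fp);
    free(buf);
}
```

A call such as open_buffered("data.bin", 64 * 1024, &buf) followed by ordinary fread calls would then use the larger buffer. Since each open file needs its own buffer, the memory cost is the buffer size multiplied by the number of files open at once.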

Data Transfer

Data transfer between different machines is always a frustrating task, for there are few standards about how numeric data should be represented on disk.

For small quantities of data, the simplest solution is to write out the data in a formatted fashion, and read it in on the other machine in the same way. Do remember not to truncate the precision of the data too hard when writing it out!
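To illustrate the point about precision, here is a minimal C sketch of a formatted write and read. It assumes 64 bit IEEE doubles, for which 17 significant decimal digits are enough to recover the original value exactly; the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Write n doubles as text, one per line, with 17 significant digits.
 * Assumption: doubles are 64 bit IEEE, so 17 digits round-trip exactly.
 * Returns 0 on success, -1 on a write error. */
int write_doubles_text(FILE *fp, const double *a, size_t n)
{
    assert(fp != NULL && a != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fprintf(fp, "%.17g\n", a[i]) < 0)
            return -1;
    }
    return 0;
}

/* Read up to max doubles written by write_doubles_text.
 * Returns the number of values actually read. */
size_t read_doubles_text(FILE *fp, double *a, size_t max)
{
    assert(fp != NULL && a != NULL);
    size_t n = 0;
    while (n < max && fscanf(fp, "%lf", &a[n]) == 1)
        n++;
    return n;
}
```

A shorter format such as "%.6g" would produce a smaller file, but a value like 1.0/3.0 would not survive the round trip unchanged.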

For larger quantities of data (tens of Mb), it may be necessary to consider something more efficient. In order for machine A to read data from machine B, the two machines must agree on three points:

the underlying encoding of the data
whether they are big- or little-endian
what headers, trailers, and record separators to use

Most machines can agree on point one, considering integers to be 32 bit two's complement, and floating point to be 32 bit IEEE (single) or 64 bit IEEE (double) precision. Any machine you are likely to meet (babbage, the 3050 workstations, RS/6000s, Suns, Alphas, SGIs, HPs, Intel PCs, Cray T3D, et al.) is IEEE, and those few which are not (Hitachi 3600, Cray Y/MP) can almost certainly be persuaded to read and write IEEE data. Of course I/O in a non-native format, that is in IEEE on a non-IEEE machine, is significantly slower than I/O in the native format, so it should be used only when necessary.

The HPCF machines are all big-endian. This is perhaps now the dominant byte ordering, but two important chip lines are little-endian: DEC's Alpha line (Alphas and Cray T3D), and Intel's 80x86 line (all PCs including Pentiums). DEC provides a library call for endian reversal (cvt_ftof in libm); Intel provides an instruction (486+)...
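Where no vendor routine is to hand, endian reversal is easy to do portably. The following is a sketch only, assuming 64 bit doubles; the function names are hypothetical and it is not the DEC or Intel facility mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the machine running this code is big-endian, else 0. */
int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Reverse the order of the eight bytes in x. */
uint64_t swap64(uint64_t x)
{
    x = ((x & 0x00000000FFFFFFFFULL) << 32) | (x >> 32);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) |
        ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = ((x & 0x00FF00FF00FF00FFULL) << 8) |
        ((x >> 8) & 0x00FF00FF00FF00FFULL);
    return x;
}

/* Byte-reverse each element of an array of doubles in place.
 * memcpy is used so the swapped bit patterns are never treated as
 * floating point values on the way through. */
void swap_doubles(double *a, size_t n)
{
    assert(sizeof(double) == sizeof(uint64_t));
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t bits;
        memcpy(&bits, &a[i], sizeof bits);
        bits = swap64(bits);
        memcpy(&a[i], &bits, sizeof bits);
    }
}
```

Data written on a big-endian machine would be passed through swap_doubles after reading only if host_is_big_endian() returns 0 on the reading machine. Applying the swap twice restores the original values.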

The matter of headers and trailers is one which can be ignored in C - what you write is what you get. In FORTRAN, however the file is opened, the HPCF machines put some sort of header and trailer on the file. The header even includes a time stamp, so files with identical data can fail a comparison using UNIX utilities. Nick recommends forging headers, I (MJR) recommend calling C from FORTRAN to do the I/O. Either way, you may wish to contact us (support@hpcf.cam.ac.uk) for further advice if you need to transfer large amounts of floating point data from the HPCF to other systems.
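The claim that in C "what you write is what you get" can be checked with a short sketch such as the one below. The function names are hypothetical. Making such a routine callable from FORTRAN is not shown, since the linkage conventions depend on the compiler.

```c
#include <assert.h>
#include <stdio.h>

/* Write n doubles to `path` as raw bytes: no header, trailer or record
 * separators. Returns 0 on success, -1 on failure. */
int write_raw_doubles(const char *path, const double *a, size_t n)
{
    assert(path != NULL && a != NULL);

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    size_t written = fwrite(a, sizeof a[0], n, fp);
    int close_failed = fclose(fp);

    return (written == n && close_failed == 0) ? 0 : -1;
}

/* Return the size of `path` in bytes, or -1 if it cannot be determined. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long size = -1;
    if (fseek(fp, 0L, SEEK_END) == 0)
        size = ftell(fp);

    fclose(fp);
    return size;
}
```

After writing n doubles, file_size should report exactly n * sizeof(double) bytes. Two files written from identical data are then byte-for-byte identical, so a comparison with UNIX utilities behaves as expected.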