6.5 Running large data sets
6.5.1 Performance scalability
I've analysed the performance scalability of the AUSRIVAS Macroinvertebrate Predictive Modelling software in relation to large data sets, and found the following:
Memory management in Windows 95/98 does not appear to be as efficient as in Windows NT. I tested Win98 with 2500 sites and 256Mb of memory, and the program ran out of memory after processing only 380 sites. I then tested Win NT with 2500 sites and 128Mb of memory (plus 180Mb of virtual memory), and the run succeeded. The AUSRIVAS program used about 4Mb of memory for each 100 sites that it processed. By the end of processing all 2500 sites, total memory in use was 181Mb (this includes the operating system and some initial memory used by the AUSRIVAS program, as well as the 4Mb per 100 sites). The run took over 30 minutes on a 550MHz machine.
I then tested Win NT with 4500 sites. I was not able to complete the test because my machine does not have enough memory. However, I could see that in this case memory usage grew at 7Mb per 100 sites, with an initial overhead of 120Mb. The projected amount of memory required to process 4500 sites is therefore at least 435Mb (120Mb + 45 × 7Mb), or 512Mb in practice. I estimate that this would take several hours to run on a 550MHz machine.
I recommend that this memory be actual physical memory rather than a mix of physical and virtual memory (memory temporarily stored to disk); otherwise processing will be slowed greatly.
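The projection above is a simple linear estimate: a fixed initial overhead plus a per-100-sites cost. The sketch below is only an illustration of that arithmetic, not part of the AUSRIVAS software; the function name and parameters are hypothetical, and the 81Mb overhead used for the 2500-site run is inferred here from the reported 181Mb total minus 25 × 4Mb.

```python
def estimate_memory_mb(sites, overhead_mb, mb_per_100_sites):
    """Linear projection: fixed overhead plus a cost per 100 sites.

    Hypothetical helper for illustration; the rates come from the
    observations reported in this section.
    """
    return overhead_mb + mb_per_100_sites * sites / 100


# 4500-site Windows NT test: 120Mb overhead, 7Mb per 100 sites.
print(estimate_memory_mb(4500, overhead_mb=120, mb_per_100_sites=7))

# 2500-site Windows NT test: 4Mb per 100 sites; the 81Mb overhead is
# an inferred value (181Mb total minus 100Mb for the sites).
print(estimate_memory_mb(2500, overhead_mb=81, mb_per_100_sites=4))
```

Note that the observed rate differed between the two runs (4Mb versus 7Mb per 100 sites), so the rate measured for one data set size should not be assumed to hold for another.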
6.5.2 Summary
In summary (by Operating System):
Windows 95/98
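In my test, Win98 ran out of memory after 380 sites even with 256Mb of physical memory, so these operating systems appear to be limited to about 380 sites, regardless of the machine's physical memory and processor speed. (NB: some users report a different number of sites for Win 98.)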
Windows NT
NT can run large data sets. In this case the machine's memory (both physical and virtual) restricts how many sites can be run, while the processor speed and the proportion of physical to virtual memory determine how long the run will take.
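As a rough guide to how memory limits the number of sites on NT, the linear figures from section 6.5.1 can be inverted. This is a sketch under the assumption that memory use stays linear at the rates observed in the 4500-site test (120Mb overhead, 7Mb per 100 sites); the function name is hypothetical and the result is an approximation only.

```python
import math


def max_sites(total_memory_mb, overhead_mb, mb_per_100_sites):
    """Largest whole number of sites whose estimated memory use fits.

    Hypothetical helper; assumes memory grows linearly with site count.
    """
    usable_mb = total_memory_mb - overhead_mb
    if usable_mb <= 0:
        return 0  # the overhead alone exceeds the available memory
    return math.floor(usable_mb * 100 / mb_per_100_sites)


# Test machine: 128Mb physical plus 180Mb virtual memory.
print(max_sites(128 + 180, overhead_mb=120, mb_per_100_sites=7))
```

With these inputs the estimate falls well short of 4500 sites, which is consistent with the 4500-site test failing to complete on that machine.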
6.5.3 Recommendations
The minimum recommended system to run 4500 sites is:
Operating system: Windows NT
Physical memory: 512Mb
Depending on processor speed, it could take several hours to run 4500 sites. I’ll update this when I get feedback from users who have run large data sets.