Field Journal from Tom Comeau - 2/24/96


It's Saturday, and I'm working.

I'm actually making up a couple of hours I missed earlier in the week when Mary (my wife) had to see a doctor about our baby. (She's due around the first of May.) Since I went with Mary to the doctor, she came to work with me today, and is doing some paperwork (she's a lawyer) while I work on my computer programs.

I'm working on getting six very large programs ready for the next Servicing Mission, which is scheduled for about this time next year. The programs are all part of the Data Archive and Distribution System, or DADS. DADS is where we keep the Hubble Data Archive, which has all the pictures and spectra taken by the Hubble Space Telescope (HST), plus information used to get those pictures and spectra.

For example, we have data about the "health" of the telescope, such as the telescope's temperature, where it was pointed, how much power the solar panels were generating, and lots of other things. We also have the schedules used to plan the observations, and comments made by the Operations Astronomers about the quality of the data.

We keep all this data on big optical disks, which are sort of like oversize CD-ROMs, except we can write on them. Once. Each optical disk holds about ten times as much data as a CD-ROM. All the optical disks are stored in big jukeboxes. Each jukebox has 131 disks, and we currently have three jukeboxes, with a fourth about to be installed. That's more storage capacity than 5000 CD-ROMs, and it is already more than three-quarters full!
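
The capacity figure above can be checked with a little arithmetic. This is only a rough sketch: it uses the approximate "ten CD-ROMs per optical disk" ratio from the paragraph above, and the names in it are made up for illustration.

```python
# Rough capacity check, measured in "CD-ROM equivalents".
# The numbers come from the paragraph above; the ratio of ten is approximate.
DISKS_PER_JUKEBOX = 131
CDROMS_PER_DISK = 10  # "about ten times as much data as a CD-ROM"


def capacity_in_cdroms(jukeboxes):
    """Return the approximate capacity of the given number of jukeboxes."""
    return jukeboxes * DISKS_PER_JUKEBOX * CDROMS_PER_DISK


current = capacity_in_cdroms(3)      # the three jukeboxes we have now
with_fourth = capacity_in_cdroms(4)  # once the fourth one is installed

print(current)      # 3930
print(with_fourth)  # 5240
```

By this estimate, the total passes 5000 CD-ROMs once the fourth jukebox is counted.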

During the Servicing Mission the astronauts will install a new solid state data recorder and two new instruments: STIS and NICMOS. Currently the telescope uses mechanical tape recorders. The new recorder will store about ten times the amount of data, and we won't have to wait for the tape to rewind. The new instruments can take many more, shorter exposures, and can give us much more data. The instrument scientists are telling us to expect seven to ten times as much data as we get today!

The result is that we expect to get a lot more information, and the new instruments will give us data in new formats. So we have to make our programs work much faster, and handle the new data formats.

There are only a few ways to improve the performance of a computer system. One is to use a faster processor. For example, the newer Pentium systems are faster than the old 386 systems. Another way is to use parallel processing, so that the computer can do more than one thing at a time, or several computers can work on the same problem.

For example, if you are making 100 paper airplanes all by yourself, you might only turn out two planes a minute. You would need nearly an hour to make all 100. If you get four friends to make planes with you, the five of you can turn out ten planes a minute, and be done in ten minutes. That's parallel processing!
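
The paper airplane example can be written as a small program. This is just a sketch, not how DADS does it: the function names are invented, everyone is assumed to fold at the same steady rate, and the worker threads stand in for the five friends only to show the work being shared out.

```python
from concurrent.futures import ThreadPoolExecutor

PLANES = 100
RATE_PER_PERSON = 2  # planes per minute, from the example above


def minutes_needed(planes, people):
    """Time to finish if everyone folds at the same steady rate."""
    return planes / (people * RATE_PER_PERSON)


def fold_plane(number):
    """Stand-in for the real work of folding one plane."""
    return f"plane {number}"


print(minutes_needed(PLANES, 1))  # 50.0
print(minutes_needed(PLANES, 5))  # 10.0

# Five "friends" (worker threads) share the pile of 100 planes.
with ThreadPoolExecutor(max_workers=5) as friends:
    finished = list(friends.map(fold_plane, range(1, PLANES + 1)))

print(len(finished))  # 100
```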

We're doing both. We're going from a computer that dates from 1990 (old by computer standards) to a brand new system that is much faster. We're also setting things up so we can work on many things in parallel.

Each program I'm working on contains several "modules" -- smaller programs that are invoked from the main program. For example, the program that puts information about observations in a database has about thirty modules. Each module has two or three hundred lines of instructions! So this one program has over six thousand lines!

We wouldn't be able to figure out what a single six-thousand-line program did (or why it didn't work), so we break it up into these smaller modules, which are like building blocks, and let each module do just one thing. For example, one module just figures out the day on which an observation was taken. Another checks to see if we already have information about an observation in the database. Another figures out which instrument (camera or spectrograph) was used to take the observation. The main program just calls each smaller module to do a little piece of the work. That makes it easier to get each part of the program right, and makes it easier to add new instruments, or change the information we put in the database.
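
Here is a tiny sketch of that idea. It is not the real DADS code: the module names, the observation record, the instrument names, and the pretend database are all made up for illustration. The point is only that each small function does one job and the main function calls them in turn.

```python
from datetime import datetime

# Hypothetical stand-in for the real database: a set of observation names.
database = {"obs-0001"}

# Made-up instrument names, mapped to the kind of instrument they are.
INSTRUMENT_KINDS = {
    "EXAMPLE-CAMERA": "camera",
    "EXAMPLE-SPECTROGRAPH": "spectrograph",
}


def observation_day(observation):
    """Figure out the day on which the observation was taken."""
    return datetime.strptime(observation["start"], "%Y-%m-%d %H:%M").date()


def already_archived(observation):
    """Check whether the database already has this observation."""
    return observation["name"] in database


def instrument_kind(observation):
    """Figure out whether a camera or a spectrograph took the observation."""
    return INSTRUMENT_KINDS[observation["instrument"]]


def ingest(observation):
    """Main program: call each small module to do one piece of the work."""
    if already_archived(observation):
        return None
    database.add(observation["name"])
    return (observation["name"], observation_day(observation),
            instrument_kind(observation))


new_observation = {"name": "obs-0002", "start": "1996-02-24 10:30",
                   "instrument": "EXAMPLE-CAMERA"}
print(ingest(new_observation))
print(ingest(new_observation))  # None, because it is now in the database
```

Adding a new instrument in this sketch means adding one entry to the table of instrument kinds; none of the other functions has to change.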

The program I'm working on today doesn't need a lot of changes. Of the thirty or so modules, only three or four need to be changed, and I'll probably have to add a couple of new ones. Then I'll test the modules I changed and the new ones by themselves, and then test the program as a whole, and finally test it with the rest of the system.

Even these six big programs are a fairly small part of DADS. In addition to Ingest (the part I'm working on), there's an even bigger set of programs to ship the data to astronomers around the world. And the biggest programs control the jukeboxes and optical disk drives. Overall, there are between two and three million lines of instructions in DADS, so the 6000-line program I'm working on today is less than one percent of the system.
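
For anyone who wants to check the "less than one percent" claim, here is the arithmetic as a short sketch. The function name is made up; the line counts are the rough figures from the paragraph above.

```python
PROGRAM_LINES = 6000


def share_percent(program_lines, system_lines):
    """Return what percent of the whole system one program makes up."""
    return 100 * program_lines / system_lines


# DADS has somewhere between two and three million lines.
for system_lines in (2_000_000, 3_000_000):
    percent = share_percent(PROGRAM_LINES, system_lines)
    print(f"{percent:.1f}% of {system_lines} lines")
```

Either way, the program comes out to a few tenths of one percent of the whole system.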

But in the next couple of hours (before Mary decides it's time for a late lunch) I'm going to try to get my one percent working.