Safeguarding IS Systems
Keeping the Show on the Road
It is often said that there are only two certain things in life, death and taxes.
However, it is also fairly certain that if the computer systems go down, it will cause a great deal of disruption for everyone. Information technology plays an increasingly important role in our working lives, and we have long since passed the point where it was an optional extra. We now expect it to be there and to work, and if it does not, it can seriously affect our ability to do our jobs.
At the same time, the systems we depend on have become more complex and subject to greater levels of external threats, such as viruses, hacking and floods of junk e-mail.
This article outlines some of the steps that the Information Systems Unit has taken or is planning to take to safeguard our systems.
Network Links
All the university sites are connected together by high-speed fibre optic links. These mostly run in ducts provided by telecommunications companies, and are not immune from being dug up. This is not a common occurrence, but as the recent fire below a Manchester telephone exchange showed, these things can happen. To guard against this remote possibility, the university sites are also connected by lower-speed "Megastream" links, running through different ducts. These would probably not be sufficient to carry all the traffic that normally runs over the fibre optic links, but would allow essential network traffic to continue.
Power Supply
We normally expect the power supply to be there 24 hours a day, seven days a week, and usually it is. However, over the past two years there have been more interruptions than normal, some due to the building work at John Dalton and others due to problems outside the university. These have affected systems throughout the institution. While we have no reason to believe that power interruptions will be as frequent in future as they have been recently, the university has decided that it should take steps to guard against this possibility. A standby electricity generator has been installed to supply the computer room in the John Dalton extension. In the event of a power failure, this will start up automatically and allow essential services to continue until the mains power is restored.
Firewall
The Information Systems Unit has implemented a firewall to protect the university's network from external attack. Because the link to the Internet is so critical, the firewall is actually implemented on two separate boxes so that if either fails, the other unit will continue to provide the connection.
Particularly critical services, such as the finance and student records systems, are further protected by individual firewalls, and the university network is separated from the halls of residence networks to provide protection against viruses on students' own machines.
Clustered Servers
Some computers host particularly critical systems. At present, if such a computer suffers a hardware or software problem, the system it hosts will be unavailable. The Information Systems Unit is looking at "cluster" technology that allows two or more servers to share the load of running any given application. In the event of the failure of any one server, the application itself would still be available.
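To illustrate the idea, here is a minimal sketch of how requests might be shared between clustered servers and steered away from one that has failed. The class name, method names and server names are all hypothetical, and the round-robin approach is only one of several ways a real cluster product might share load; this is not a description of the technology under evaluation.

```python
class Cluster:
    """Toy model of a cluster: shares requests round-robin, skipping failed servers."""

    def __init__(self, servers):
        self.servers = list(servers)   # hypothetical server names
        self.failed = set()            # servers currently known to be down
        self._next = 0                 # index of the next server to try

    def mark_failed(self, server):
        self.failed.add(server)

    def mark_recovered(self, server):
        self.failed.discard(server)

    def pick(self):
        """Return the next available server, or raise if none are left."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server not in self.failed:
                return server
        raise RuntimeError("no servers available")


cluster = Cluster(["server-a", "server-b"])
first, second = cluster.pick(), cluster.pick()   # load is shared between both
cluster.mark_failed("server-a")
after_failure = cluster.pick()                   # only server-b is offered now
```

While both servers are healthy the requests alternate between them; once one is marked as failed, every request goes to the survivor, so the application stays available.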
Network Storage
Like any other computer, a server stores its information on hard disks. If these fail, the server and any applications it is running are unavailable. It is possible to guard against single disk failures by using RAID technology, which spreads information across several disks and can reconstruct missing information even if one of the disks fails. However, the server is still vulnerable to a failure of the RAID controller, and if servers are to be clustered they may well need to share the information on their disks. The Information Systems Unit is investigating a technology called Network Attached Storage that allows storage to be separated from servers. Thus, several servers can share the same disks, and redundancy can be provided even to the extent of having the same data stored on separate disks in separate locations.
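For readers curious how missing information can be reconstructed, the sketch below shows the parity idea used by some RAID levels (RAID 5, for example): an extra block holds the XOR of the data blocks, so any one lost block can be recomputed from the others. The function names and sample data are invented for illustration, and the article does not say which RAID level the university's servers use.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recompute the one lost block from the surviving blocks and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical data spread across three disks, with parity on a fourth.
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)

# Suppose the second disk fails: rebuild its contents from the rest.
recovered = rebuild_missing([data[0], data[2]], parity)
assert recovered == data[1]
```

This works because XOR-ing a value in twice cancels it out, so combining the parity with the surviving blocks leaves only the missing one. It protects against a single disk failing, not against two at once.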
Monitoring
Although it doesn't prevent problems, it is important to monitor the availability of services so that when problems arise they can be corrected as soon as possible. This is particularly important for services where redundant hardware has been provided, because if one of the hosts fails the problem might otherwise go unnoticed.
The Information Systems Unit monitors its services in two ways. Firstly, the helpdesk has a daily checklist of around 20 items covering things such as whether overnight backups completed successfully. Secondly, an automated system continuously tests over 200 items such as end-to-end delivery of e-mail, availability of key university web sites, and the integrity of our connection to the Internet.
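As a rough picture of what an automated monitor does, the sketch below runs a list of named checks and reports the ones that fail. It is not the system the Information Systems Unit runs; the `Check` class, the helper names and the host name in the usage note are assumptions made for the example, and a real monitor would also schedule the checks and raise alerts.

```python
import socket
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    name: str                    # human-readable label for the report
    probe: Callable[[], bool]    # returns True when the service looks healthy


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> Callable[[], bool]:
    """Build a probe that succeeds if a TCP connection to host:port can be opened."""
    def probe() -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return probe


def run_checks(checks: List[Check]) -> List[str]:
    """Run every check once and return the names of those that failed."""
    failures = []
    for check in checks:
        try:
            healthy = check.probe()
        except Exception:
            healthy = False      # a probe that crashes counts as a failure
        if not healthy:
            failures.append(check.name)
    return failures
```

A check for a web server might then be declared as `Check("web site", tcp_port_open("www.example.ac.uk", 80))`, where the host name is a placeholder. Testing each redundant host separately, rather than only the service as a whole, is what stops a single failed unit from going unnoticed.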
Business Continuity
A working group has been set up to consider how our information systems would bear up in the event of incidents such as a major fire in a critical location. To date, this group has commissioned an external review of our backup and recovery arrangements, and begun an audit to identify all systems which are critical to the university. It is currently identifying a number of "disaster" scenarios, and will systematically work through these to identify the impacts on the university. It will then recommend changes to our systems and procedures, both to mitigate those impacts and to reduce the likelihood of the disaster happening in the first place.




