IT Outages: Is human error to blame?

Author: Gary Hall

Operations Director, Critical Facilities Solutions LTD

Downtime in a critical space is something that all Data Centre owners and operators fear. An outage can be a catastrophic event that affects tens of thousands of users or costs tens of thousands of pounds (£/$), and while the service remains out of action that figure can escalate very quickly.

With the reliability of Data Centres paramount in this age, understanding what causes outages is extremely valuable, and that information must be shared throughout the industry so that lessons can be learnt.

The root cause statistics published by the Uptime Institute on Data Centre outages make for interesting reading, with 26% attributed to installation issues (builder or subcontractor deficiencies). How does the industry reduce this figure? What more can be done? Everyone entering a Data Centre should have a basic understanding of what a Data Centre does, how it works, how critical the building is, and the risks to the infrastructure if works are not controlled and performed by competent people.

Many Data Centre outages could be prevented by improving subcontractor management processes and training staff to follow them correctly.

According to the Uptime Institute’s most recent (2021) Data Centre resiliency survey, 42% of respondents said they had experienced an outage in the last three years due to human error. Clearly, human error in the Data Centre and in IT accounts for a lot of outages. The question this raises for our service offering is: ‘How skilled should your Data Centre cleaners be?’

Before Data Centre cleaners are let into the heart of any client’s business, they must be fully trained and have sufficient knowledge of the space they are operating in. Data Centre cleaners must exercise a great deal of care and follow rigorous procedures, as they will work in close proximity to sensitive electronic equipment. In addition, cleaning operatives must use specialised vacuums with high-efficiency particulate air (HEPA) filters, so that dust and particles do not become airborne and get recirculated into the supply air, which is a risk in itself.

To reduce the risk of human error, all Data Centre cleaning operatives should be fully trained in risk management. For instance, they should have knowledge of fire suppression isolations, fire alarm isolations, fibre identification and electrical safety. They should also have a basic understanding of air flow management and be aware that removing large numbers of floor panels affects the cooling process.

Cleaning operatives should also be trained to identify zinc whiskers, which usually form on pedestals, cable trays and the underside of floor panels: in other words, all the areas that cleaning operatives come into contact with on a routine deep clean.

All reputable cleaning businesses will have a set of robust ‘prestart’ procedures to ensure the client’s infrastructure is safe before any cleaning is undertaken. All Data Centre owners and operators should perform due diligence on the company before work orders are placed.

Knowledge and training are power!