Power transformer fleet condition assessment and risk management

December 10th, 2014, Published in Articles: Energize


One of the most difficult aspects of fleet management is how to deal with the unknown. When dealing with hundreds, if not thousands, of assets, the unknown must be managed, and when possible, assessed such that it becomes known. Being proactive, and planning not just for tomorrow, but for the day after tomorrow, is critical to successfully managing a fleet of assets.

Listening to experts, and trusting the instruments at hand to help make sound decisions, are the foundations of a successful asset management programme. At the heart of PAS 55 and ISO 55 000 there are a small number of key elements without which the application of the standards is meaningless. One of these is the need to understand, quantify and manage risk. Risk is a combination of event likelihood (probability) and event impact (consequence) [1, 2]. A high-impact, low-probability event may pose the same risk as a low-impact, high-probability event, but the perception of the event may govern how we respond. In either case, we need to identify, quantify and manage the risk.

A means to quantify probability is required: a scale of 1 to 5, where 1 is unlikely and 5 is almost certain, is a simple way to start. Likewise, a means to quantify impact is required: rating 1 as low impact and 5 as high impact is easy to state, but the impact also needs to be expressed in a common value.

Impact may be in terms of:

  • Asset value at purchase
  • Asset value to replace
  • Impact on the system in terms of customers interrupted and of interruption duration
  • Environmental effects
  • Safety effects
  • Some other measure
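
As a minimal sketch of how the two 1 to 5 ratings might be combined, the example below multiplies probability by impact to give a score from 1 to 25. The multiplication is one common convention and is an assumption here, not something the standards prescribe; the asset names and ratings are hypothetical.

```python
# Illustrative only: the product rule and the example ratings are assumptions.

def risk_score(probability: int, impact: int) -> int:
    """Combine a 1-5 probability rating and a 1-5 impact rating into a 1-25 score."""
    for name, value in (("probability", probability), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return probability * impact

# Hypothetical assets: (probability rating, impact rating)
fleet = {
    "T1": (1, 5),  # unlikely to fail, severe consequence
    "T2": (5, 1),  # almost certain to fail, minor consequence
    "T3": (3, 4),  # moderate likelihood, high consequence
}

# Rank assets from highest to lowest risk score
ranked = sorted(fleet, key=lambda name: risk_score(*fleet[name]), reverse=True)
```

Under this convention T1 and T2 receive the same score even though they would probably be perceived very differently, which is the point made above about perception governing the response.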

Referring back to PAS 55 and ISO 55 000, risks must be determined through individual asset performance within an organisation, and in comparison with other organisations. If, for example, your organisation has a power transformer failure rate of greater than 2% per year, is that considered high within your organisation, or within the industry? Is there information available to identify where the risks lie and so improve that failure rate statistic?

Fig. 1: Car in a substation.

Fig. 2: Failed circuit breaker.

Fig. 3: Probability of failure/year.

What are the causes of failure for transformers? Are they related to location, application, manufacturer, age? It is becoming increasingly difficult to justify asset replacement on anything other than condition – replacement just prior to failure being the “optimum”.

It has been assumed over the years that as a transformer ages, its health and condition deteriorate, which increases the likelihood of failure. Recently, the authors performed an analysis of one utility’s transformers, based on the age of the asset at time of failure. The results were surprising, showing that the transformers actually had a lower likelihood of failure as they aged. This runs contrary to popular belief. It is based on only one company’s assets, and therefore on one company’s operating philosophy and maintenance strategy, but it raises the question: how much influence does age have on risk of failure?
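
The authors' method is not described here, so the following is only a sketch of one way such an age analysis could be set up: failures in each age band are divided by the unit-years of service observed in that band, so that bands containing few surviving units are not misread. The band width, the record layout and the sample data are all assumptions.

```python
from collections import defaultdict

BAND_WIDTH = 10  # years per age band (assumed)

def failure_rate_by_age_band(records, band_width=BAND_WIDTH):
    """records: iterable of (age_at_start, age_at_end, failed) per transformer,
    where the unit was observed between those ages and, if failed is True,
    failed at age_at_end. Returns {band_start: failures per unit-year}."""
    exposure = defaultdict(float)
    failures = defaultdict(int)
    for age_start, age_end, failed in records:
        age = age_start
        last_band = None
        while age < age_end:
            band = int(age // band_width) * band_width
            # Service time spent in this band before leaving it or the record ending
            step = min(age_end, band + band_width) - age
            exposure[band] += step
            last_band = band
            age += step
        if failed and last_band is not None:
            failures[last_band] += 1
    return {band: failures[band] / exposure[band] for band in sorted(exposure)}

# Hypothetical observation records for five units
records = [
    (0, 10, False),
    (5, 15, True),
    (20, 30, False),
    (25, 30, False),
    (12, 20, True),
]
rates = failure_rate_by_age_band(records)
```

With real fleet data, comparing the rates across bands would show whether the older units are in fact failing more often per year of service.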

Do you know which transformers are likely to fail? Which breakers fail? How do you know? What is the consequence of failure – are some transformers more important than others? How does your organisation record and manage this information? Spreadsheets? Tribal knowledge? If it is important, it needs to be recorded, managed and addressed.

“Heuristics” refers to experience-based techniques for problem solving, learning, and discovery such as using a “rule of thumb”, an educated guess, an intuitive judgment, or common sense [3]. Heuristic approaches have developed over time and provide adequate solutions to real problems.

It is well understood, for example, that power factor testing originated in analysis and electrical engineering theory, and that the present terminology (CH, CHL, etc.) relates to techniques developed decades ago [4]. The interpretation of the test result, however, is still a matter for consideration and debate: what may be a good result one day, in a given context, may be bad the next, in a very similar context. The test has an analytic theory but is often heuristic in interpretation. How do we capture this knowledge – failure rates, analyses, predicted condition? PAS 55 and ISO 55 000 both require that such information is recorded and available [5].

Fig. 4: Transformer on fire.

Fig. 5: Rising trend in bushing power factor.

Condition monitoring


Traditionally, condition monitoring has been performed through routine off-line testing. This involves taking an asset out of service, disconnecting it from the system, and performing electrical and/or mechanical tests, such as power factor, sweep frequency response analysis, or circuit breaker time and travel testing. The test results are then reviewed and analysed, and a decision is made on whether the equipment is fit for continued service and when further maintenance or testing should be scheduled. This process is very valuable in gathering information about the health and state of the asset, but it can also be time consuming and costly, depending on the asset.


Off-line tests may indicate that a transformer is in suspect condition but has not failed. Units in known suspect condition and/or serving the most critical loads are the high-risk units which need monitoring. Such units should already be regularly sampled for dissolved gas analysis (DGA), and it may also be relevant to recommend on-line DGA monitors. An on-line monitor provides confidence, between regular samples, that no incipient faults are developing which would otherwise go undetected. Further monitoring provided by bushing sensors and partial discharge (PD) detection gives a comprehensive view of the transformer condition which supports continued use of the unit. This paper looks at the role of monitoring, when to recommend the addition of monitoring, and what type of monitoring to apply.

Fig. 6: Example of monitoring cost justification.


The ongoing challenge for all on-line diagnostic systems is to differentiate peripheral influences, such as fluctuations in system voltage and temperature, from actual changes in the medium (in this case the bushing insulation). The relative measurement method is susceptible to power system variations and requires filtering. The influence of external factors can be minimised either by increasing the number of bushings monitored simultaneously or by increasing the time interval over which the data is averaged. Relying on a single measurement, or on a few points over a short period of time, can lead to changes caused by external factors being misinterpreted [6].
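
The effect of averaging over a longer time interval can be illustrated with a simple trailing moving average. This is a sketch only: the readings, the window length and the alarm threshold are hypothetical, and a real monitoring system may use a different filter.

```python
from statistics import mean

def rolling_mean(values, window):
    """Trailing moving average; returns one value per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]

# Hypothetical on-line bushing power factor readings (%), one transient spike
readings = [0.30, 0.31, 0.29, 0.30, 0.60, 0.30, 0.31, 0.30]
ALARM_THRESHOLD = 0.50  # assumed alarm level, for illustration

smoothed = rolling_mean(readings, window=4)

raw_alarm = any(r > ALARM_THRESHOLD for r in readings)       # True: spike trips it
smoothed_alarm = any(s > ALARM_THRESHOLD for s in smoothed)  # False: spike averaged out
```

The trade-off is response time: a longer window suppresses more transients but also delays detection of a genuine change, which matters for the rapid onset failures discussed later.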


Widespread application of condition monitoring has to pay for itself, either in losses avoided or in risks reduced. (Condition monitoring applications for research or evaluation purposes are not included here.)

Step 1: Understand failure modes – what are the likely causes of transformer failure, how can they be detected, and which systems will provide the information?

Step 2: Run the numbers – how much does the system cost, and what are the chances of detecting deterioration before failure?

Over the years we have seen two distinct types of failure mode develop: graceful and rapid onset. Graceful failures relate to slower deterioration, with a clear indication from a monitored parameter allowing several weeks to months of planning and preparation for replacement. Rapid onset failures relate to failure modes which occur over a very short time period, giving minutes to hours of warning. Appropriate monitoring has to be chosen to cover either situation [7].
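
A rough sketch of what “running the numbers” could look like is given below. It treats the benefit of a monitor as the expected loss avoided over its service life, less its cost. The formula is a deliberate simplification (no discounting, and a detected failure is assumed to be avoided entirely); the cost figures are hypothetical, while the 2% failure rate and 50% detection rate echo figures used elsewhere in this paper as examples.

```python
def monitoring_net_benefit(annual_failure_prob, detection_prob,
                           failure_cost, monitor_cost, years):
    """Expected loss avoided over the period, minus the cost of monitoring.

    Simplifying assumptions: constant annual failure probability, no
    discounting, and a failure detected in time costs nothing.
    """
    expected_avoided_loss = (
        annual_failure_prob * detection_prob * failure_cost * years
    )
    return expected_avoided_loss - monitor_cost

# Hypothetical case: 2% failure rate per year, 50% chance of early detection,
# failure consequence of 2 000 000, monitor cost of 50 000, 10-year horizon
benefit = monitoring_net_benefit(
    annual_failure_prob=0.02,
    detection_prob=0.5,
    failure_cost=2_000_000,
    monitor_cost=50_000,
    years=10,
)
worth_installing = benefit > 0
```

For a rapid onset failure mode the detection probability for a sampled (rather than continuous) measurement would be much lower, which is where the choice of monitoring type enters the calculation.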

Case studies

Example 1: Dissolved gas analysis testing

Dissolved gas analysis (DGA) is a well understood test for determination of transformer condition. There are many standards available for interpretation of the results [8]. Regular samples, particularly for larger or more critical units, yield a regular view of transformer condition, with a good DGA programme giving early indication of failure in up to 50% of incipient failures. Data from annual sampling, though relatively sparse, is thus effective as an asset management tool. Fig. 7 gives an indication of key DGA levels for a transformer which subsequently failed.

Although most gas levels had been stable for some time, the hydrogen had been showing an increasing trend and the final failure brought a dramatic increase in most dissolved gas parameters.
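
One simple way to pick out a rising gas from stable ones is to fit a straight line to each gas and compare the slopes. The sketch below uses an ordinary least-squares slope; the sample values are hypothetical and are not the data of Fig. 7, and published DGA interpretation standards should be used for actual limits.

```python
def trend_slope(days, ppm):
    """Ordinary least-squares slope of gas concentration (ppm) against time (days)."""
    n = len(days)
    if n < 2 or n != len(ppm):
        raise ValueError("need two or more paired samples")
    mean_x = sum(days) / n
    mean_y = sum(ppm) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("sample dates must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ppm))
    return sxy / sxx

# Hypothetical annual samples: day offset and concentrations in ppm
sample_days = [0, 365, 730, 1095]
hydrogen_ppm = [20, 35, 60, 110]
methane_ppm = [12, 11, 13, 12]

h2_rate = trend_slope(sample_days, hydrogen_ppm) * 365   # ppm per year
ch4_rate = trend_slope(sample_days, methane_ppm) * 365   # ppm per year
```

In this made-up data set the hydrogen rate is far larger than the methane rate, which is the kind of pattern that would prompt closer sampling or an on-line monitor.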

By itself, regular DGA is a useful asset management tool in assisting with the identification of suspect units. An on-line DGA monitor gives further information, bridging the ‘silence between samples’ which can mask rapid deterioration. It can be seen from the data in Fig. 7 that an on-line monitor, such as a Doble Delphi device, may have been able to give early warning of the failure, had it been applied and had there been a “graceful” element to the deterioration. Of course, if the failure was sudden and catastrophic, there may have been no opportunity to act.

It is interesting to note that DGA for transformers covers the possibilities of either regular or occasional ad hoc sampling and continuous online monitoring. Both require their own individual asset management approaches.

Example 2: Partial discharge detection

On-line partial discharge (PD) monitoring of in-service power transformers is one of the most promising technologies for detecting and localising defects in the coil insulation. With the results of these diagnostic methods, and with consideration of network operation and management, condition-based predictive maintenance and replacement planning become feasible [9].

Fig. 7: DGA key gas evolution over time.

Fig. 8a shows phase-resolved PD in a transformer which was a critical component of a transmission system. This “baseline” provides a clear visual indication of the state of PD in the system. A subsequent measurement set gave the results shown in Fig. 8b. The characteristics of the phase-resolved PD are very different: there is much more PD across the whole phase range, indicating a change in the nature of the PD itself.

Fig. 8a: Initial phase-resolved PD signatures.

Fig. 8b: Final phase-resolved PD signatures.

In this case, the change in PD gave early indication of a change in the transformer which could have led to failure. The data in Figs. 8a and 8b was collected in the four hours before noon on a single day, and then in the eight hours after noon on the same day. What caused the change in PD character? Subsequent investigation showed significantly deteriorated insulation within the transformer windings, which would have progressed to failure of the unit had it been left in service [10].

Example 3: Bushing deterioration detected; failure averted

In this case a number of Trench COT type bushings were monitored. Over time, the monitoring system had shown variation in on-line power factor and capacitance for a number of units. In the early hours of a morning in January 2012, the monitoring system gave an alarm notification. It was considered prudent to switch the transformer out of service as a precaution. Initially it was thought this might be a false positive, relating to system voltage or monitor performance. The data showed that a large increase in leakage current, over a period of just a few hours, had been detected for one bushing in a set of three.

Off-line testing was used to confirm the on-line results, and the bushing was removed from service. A subsequent forensic teardown showed puncture marks and burning close to the edge of many of the foils.

Fig. 9: Sudden rise in leakage current.

Given the status of the insulation, the time taken for the leakage current to increase and industry background on these bushings, it was considered that this bushing had hours before catastrophic failure [7].


Making decisions can often be the most difficult part of an asset manager’s day. Knowledge and understanding of the asset, and of the risk it carries, are necessary to make decisions in a timely manner. Intervention needs to be planned in advance, so that the risks to an asset are managed and the correct intervention can occur in a suitable timescale. Managing risk is important, but understanding the need for risk aversion can be critical.


The authors wish to thank their many colleagues, both in Doble and within the industry, who have contributed to the discussion of risk and asset monitoring at Doble Client Conferences over more than twenty years and thus to the discussions within this paper.

This paper was published in Transmission and Distribution, June/July 2014, and is republished here with permission.


[1]    WH Bartley: “Analysis of Transformer Failures”, Proceedings of the sixty-ninth annual international conference of Doble clients, April 2000.
[2]    P Bernstein: “Against the Gods: The Remarkable Story of Risk”, John Wiley & Sons, 1996.
[3]    http://en.wikipedia.org/wiki/Heuristic
[4]    www.tufts.edu/home/feature/?p=doble
[5]    K Elkinson and T McGrail: “Development of Asset Management Standards”, TechCon Canada, 2012.
[6]    R Brusetti, K Elkinson, and T McGrail: “Role of On-Line Condition Monitoring for Power Transformer Operation and Maintenance”, Life of a transformer seminar, San Diego, 2013.
[7]    K Wyper, G MacKay, and T McGrail: “Condition Monitoring in the Real World”, Doble conference, 2013.
[8]    J A Lapworth: “A scoring system for integrating dissolved gas analysis results into a life management process for power transformers”, National Grid (UK), 71st international conference of Doble clients, Boston, USA, 2002.
[9]    S Coenen and S Tenbohlen, Universität Stuttgart, Germany; R Heywood, Doble PowerTest, England; and M Boltze, Doble Lemke, Germany: “Prospects and Limits of on-site PD Measurement Technique”, Boston conference, 2011.
[10]    G Topjian, K Elkinson, M Lawrence, and T McGrail: “Aspects of Power Transformer Fleet Asset Risk Management”, IEEE transmission and distribution conference, Orlando, 2011.

Contact T McGrail, Doble Engineering, tmcgrail@doble.com
