
Michael Assante explains how NERC’s CIP standards help ensure our electricity is there when we need it.
“Age doesn't matter as long as it's being maintained and operated well, and the best way to measure that is to look at the amount of events and understand why they are occurring”
-Michael Assante, NERC
There was a sense even before 2003 that mandatory standards were needed.
Looking back, it probably started in some of the western outages where there was an understanding that the voluntary model only got you so far and that there were still some gains to be achieved by standardizing and making the standards mandatory. And clearly, when the August 14, 2003 blackout occurred in the US northeast and south/central Canada, it captured the attention of folks all across North America and solidified the understanding that we needed to move toward a mandatory system.
It’s important to keep in mind what it would take in North America to achieve the most reliable system. That’s a very big land area with many different constituents and stakeholders involved. Before 2003, there was work being done from industry proponents as well as organizations like NERC towards a mandatory system.
The blackout of 2003 was the final event that was necessary to solidify the idea that standards should be mandatory. The industry openly accepted that and worked with federal authorities and across borders and embraced moving forward with this concept of having an international electric reliability organization. NERC was put forward to be the ERO in the United States, and was accepted and approved by FERC and the Canadian authorities.
Even though the new approach is specific in that the standards are mandatory, it’s interesting in that the model itself has been self-developed by industry. We consider ourselves a self-regulatory organization, where our registered entities or stakeholders participate in the process for deliberately putting together reliability standards and then agreeing to be held to those same standards, and that makes it quite unique. The importance of this model is that you’re able to leverage the expertise that exists within the industry and open up a process so that we can all agree on what a true standard needs to reflect in terms of how to enhance reliability, and from there we, the ERO, enforce those standards.
It’s still a young model. The reality is that it takes two focuses as an organization to accomplish this mission. One is to work with all your stakeholders in a very open and deliberative process to develop these reliability standards, and the other part of the process is that once there is agreement and the standards have been accepted by the federal authorities, to enforce those standards.
The most important thing is that reliability is being improved.
The idea was that the standards would be mandatory and enforceable, and you would need an organization that would take on that goal of enforcement. The industry understands that it will take a dedicated organization to go out there and conduct audits and make proactive efforts to make sure the standards are being enforced.
As the ERO, we’re responsible for looking at information and analyzing trends and being able to talk about the assessment for reliability going forward. By bringing information to the table and turning it into knowledge, we can have a positive impact on reliability in terms of enforcement. One of the best measures that shows the industry is committed to improving and ensuring reliability is that not only do we go out and conduct audits, but industry also self-reports potential violations of a standard.
This shows how much the industry is involved. They take an honest look at their own program; they can run their own internal audits or assessments. If they find they might not be in compliance with a standard, they self-report that non-compliance. As a self-regulatory organization, we look very favorably upon self-reporting. It factors into the decision-making; it means you don’t have to go out and look for violations. If companies have their own culture of compliance and they’re looking for where they’re having issues and reporting those, that’s a very good sign that industry is on board with this. In 2007, more than half of the violations came from self-reporting.
It will take some time, as we begin to conduct an increasing number of audits in the field, to augment self-reporting with going in and examining across the set of reliability standards. We’re seeing a lot of progress toward meeting the mandatory standards. Not only does this progress show that companies are meeting the standards, but in a lot of cases, some of these standards are the minimum requirements necessary to ensure reliability, and many organizations are designing creative ways to even exceed what those standards call for.
I’m the chief security officer, so my focus is on the critical infrastructure protection, or CIP, standards, and we look at those as a set of minimum standard things that we do across the industry to ensure the protection of our critical assets and critical cyber-assets. A lot of organizations are very diligent around the issue of security-related risk, and they’ve been making big investments and putting a lot of effort into improving their ability to address risk, especially to cyber-assets, based on the realistic perception that this is an emerging challenge for the nation – for any inter-connected and technology-reliant society.
There are two things that lay the groundwork for the concern that exists in the system today. From an industry perspective, and in understanding how this infrastructure works, we know how technology-reliant we are in operating the system and managing it effectively. Technology in a lot of cases becomes a helpful solution to ensure reliability, and there is a heavy reliance on that technology to help to manage the system.
When you get to the point of enlightenment, you understand how important technology is in the operation of the system. For example, if you look back at the 2003 blackout investigation, there’s a big focus on how important technology was relative to the information available in order to have situational awareness and manage the system. That’s a key part of why it’s so important to have a critical infrastructure protection regime in place.
There are people out there willing to exploit our weaknesses for their own gain.
If you look at the amount of cyber security events that are occurring in other industries – in government as well as the areas of ecommerce and financial services – there’s a strong understanding that what comes with technology is complexity and vulnerabilities. We need to be in control of the technology we’re relying on.
A lot of it has been speculation. You’ve seen it in media coverage about which electric power systems could be attacked. There is some evidence at a very low level that some events have occurred around the world, and we as an industry are committed to getting ahead of this problem.
Any standard is intended to set an assured way of doing things across the system. We’re probably one of the only infrastructures to develop real cyber security standards in a self-regulatory manner. Other regulated industries, such as the nuclear industry, have security requirements driven by a federal regulator. In the electric utility sector, we have self-organized as part of the self-regulatory process to develop CIP standards. We’re pioneering critical infrastructure protection, and there’s a recognition that you don’t always get that right the first time. We’re committed to enhancing the standards, and we have a process in place to continue to develop them.
The electric grid is an interconnection of systems – one utility is interconnected through its neighbors as well as being able to feed information to a control area or reliability coordinator that sits above it – and we need to understand the importance of this interconnection. It’s the reason that the standards need to apply to all of North America.
In cyber security, you’re only as strong as your weakest link.
It’s important to make sure that no one entity puts the rest of us at risk because its practices allow for easy access to technology or the exploitation of vulnerabilities within its system. It’s one of the key reasons that these standards are so important.
Organizations need to have processes put in place so that they can have a flexible and dynamic protection strategy against whatever evolving cyber threat might occur in the future. One approach to building a compliance strategy is to meet these standard requirements in terms of what you need to do to protect your system. That starts with identifying your critical assets within the constellation of assets that you might manage, and then identifying your critical cyber-assets. Only by having a true understanding and knowledge of what needs to be protected will you be able to properly develop a protection strategy.
You also need good security requirements around things like personnel risk assessment, meaning you’re looking at the folks to whom you will provide unescorted access or electronic access to critical cyber-assets, and you’re doing the proper amount of diligence to ensure they’re not a risk to the system. You’re putting into place processes that allow you to manage risk, such as vulnerability management, testing of systems, developing business continuity and restoration plans, and exercising those plans.
It’s a family of requirements designed to do two things. The first is to set a minimum requirement, making sure that we raise the lowest level so that there is no weak link out there and that everyone’s doing at least something to the same standard. The second piece is to have processes in place so that you can react to any emerging cyber threats as they occur. That’s what the CIP standards are trying to achieve.
Newer technology can represent more of a risk than aging infrastructure.
You have a shorter operating history with new technology, and it’s typically more complex than the technology it’s replacing, and you have to watch both sides of that spectrum. You have to watch infrastructure as it gets older and try to understand maintenance trends and what’s occurring with the equipment, and then you have to watch the introduction of new technology just as closely.
With older technology, you have quite a bit of history to understand and build from; you have a certain comfort factor because you have a lot of quantitative data relative to the equipment. With new equipment you don’t have that, so you have to do a lot of measurement, and you have to develop these regimes to understand how to best manage and monitor that technology as you deploy it.
A lot of the infrastructure is getting older; in some cases it has exceeded its estimated lifespan. But many utilities have made the proper investments in monitoring and management programs and are watching that problem very closely. Clearly, as the ERO, this is something that we watch very closely, and we don’t feel it’s a major concern. If you’re looking to bring new technology into the power system and to embrace renewable energy, investments do need to be made, but that’s typically to expand the system and enable it to align with some of these newer possibilities.
We conduct reliability assessments and bring information and knowledge to the table for all participants to understand where the reliability risks might be, whether that be internal dynamics within the industry or external factors like economic growth. Our feeling about the infrastructure is that age doesn’t matter as long as it’s being maintained and operated well, and the best way to measure against that is to take a look at the amount of events and conduct events analysis and understand why these events are occurring.
At this point in time we’re not seeing a negative effect to reliability as it relates to the economic conditions. It’s probably too early to see how they will be impacting investments in the system; we haven’t yet seen any major findings relative to investments in the systems on a macro scale. Neither have we seen an impact to reliability; in fact, the ERO is still growing and making investments in our mission, and I’d suspect our registered entities are also continuing to follow their compliance strategies in making the appropriate investments to ensure they are in compliance with the standards.
I’m not going to speculate, because duration is a very important factor when it comes to economic issues, and as time goes on we will obviously watch the economic conditions and how those potentially could impact reliability.
It’s our goal to achieve 100 percent compliance.
We’re seeing some real progress. As we work through the self-reporting of violations and we work with organizations, one of the key strategies of an enforcement program is to define a mitigation plan for when someone is not in compliance. This mitigation plan allows an organization to devise what investments or what process changes they need to put into place in order to be compliant.
We need to put in place mitigation strategies to resolve any reliability issues as quickly as possible. We manage that directly with the entities, and when they are reporting these things, they say, “We weren’t compliant, and now we’re ready for what comes with not being compliant.” The real focus from an ERO perspective is that we devise the mitigation strategies in order for us to immediately ensure reliability. They’re very aggressively managed to make sure there is no risk to reliability based on their self-reporting of a lack of compliance.
The same thing is true for an audit. When we audit against the standards and somebody isn’t in compliance, the important thing is each time these mitigation strategies are put together, that this is a learning opportunity. We take these mitigation plans and we make sure the industry learns from them.
We know we’re building corporate knowledge into the system of how to be compliant with the standards, and that’s an important part of the process that often gets lost. A lot of folks focus on the idea that they either report or we audit, and then somehow magically a penalty is decided, and that resolves the issue. The important point is not the penalty. Penalties are necessary, but more important is the development of these mitigation plans and their approval and the management of risk so entities become compliant. The sharing of that knowledge is the key to success for an electric reliability organization.
No enforcement program can prevent all system disturbances from occurring.
There are always issues that are beyond our control – there could be issues with weather, or the failure of equipment in the field introduced by human error. There are a number of factors that can cause a system disturbance.
The good news is that there’s a big effort in place to minimize their effects. It may be impossible to rule them out entirely, but the new regime of the ERO and industry’s commitment to adhering to mandatory standards is about minimizing these occurrences, and more importantly, minimizing their effects. The idea is to make these things smaller and smaller as we all get better at focusing on reliability.
We look at our mission in a broader sense as ensuring the reliability of the system, and standards and compliance is only a part of that mission – an important one, but still only a part.
Michael Assante is Vice President and Chief Security Officer of the North American Electric Reliability Corporation.