This is an excerpt from something I'm working on this Labor Day holiday:
–
Large scale power outages act a lot like software failures. It starts with a small event, like a power line grounding out on a tree. Ordinarily that would be no big deal but under high-stress conditions it can turn into a cascading failure that affects millions of people. We can also learn from how power gets restored after an outage. Operators must perform a tricky balancing act between generation, transmission, and demand.
There used to be a common situation where power would be restored and then cut off again in a matter of seconds. It was especially common in the American South, where millions of air conditioners and refrigerators would all start at the same time. When a motor starts up, it draws a lot of current. You can see this in the way that lights dim when you start a circular saw. As the motor starts to spin, though, it creates "back EMF"–a kind of backpressure on the electrical current. (That's when the lights return to full brightness.) If you add up the effects of millions of electric motors starting all at once, you see a huge upward blip in current draw, followed by a quick drop due to back current. Power transmission systems would see the spike and drop and propagate that to the generation systems. First they would increase their draw then drop it dramatically. That would make the generation systems think they should shut off some of the turbines. Right about the time they started reducing supply, the initial surge of back EMF would decline and current load would come back up to baseline levels. The increased current load hit just when supply was declining, causing excess demand to trip circuit breakers. Lights out, again.
Smarter appliances and more modern control systems have mitigated that particular failure mode now, but there are still useful lessons for us.