Microsoft has updated its explanation of what brought down Windows Azure in Europe for nearly two and a half hours last week.
In a blog post, Windows Azure General Manager Mike Neil said that when Microsoft added compute capacity to meet increased demand in its West Europe sub-region, it did not add enough network devices to handle the additional connections. Because of that imbalance between compute capacity and network devices, the “connection threshold was exceeded and that increased management traffic, [which] in turn, triggered bugs in some of the cluster’s hardware devices, causing them to reach 100 percent CPU utilization impacting data traffic,” Neil wrote.
Microsoft posted a first, limited explanation the day after the outage and promised another update in the coming week. This post, published six days later, fills in more of the details.
Microsoft, Amazon, HP and other companies…