[Resolved @ 6:40pm PST, February 27, 2009]
The issue described below has been resolved. Further rolling restarts have been delayed until we can ensure the issue does not recur.
[Updated @ 4:12pm PST, February 27, 2009]
Please read below for the reasons behind the cancellation of the LLNet transition rolling restarts.
Last night, in our continuing effort to improve both the performance and maintainability of our network, we moved a bunch of hosts in our Phoenix datacenter onto a new network. This morning we noticed that many simulators were showing elevated asset download queues and that teleports were failing more often than usual. The problem took some time to diagnose, as we initially thought that the issue was network-related.
As many of you already know, for each avatar in Second Life, we store what are called “baked” textures, representing what the avatar looks like when it is wearing its current outfit. These are stored on the simulator host where the avatar’s appearance was last edited. We found simulators were getting “stuck” downloading baked textures from the *old* IP addresses of the simulators that were renumbered. As those IP addresses no longer existed, it was taking three minutes for each of these downloads to time out, causing gray avatars and objects, delays in rezzing objects, more failed teleport attempts than usual, and other issues related to asset downloads by the simulator or the viewer.
Once the root cause was discovered, the fix was a matter of deleting the database entries that pointed to the old IP addresses, though this took about an hour and a half because we wanted to avoid impacting other services in the process.
Once the entries were deleted, which completed at about 3:50 PM PST, things started returning to normal right away.
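For readers curious about the shape of the cleanup described above, here is a minimal sketch in Python. It is an illustration only: the record layout, the names `BakedTextureEntry` and `prune_stale_entries`, and the IP addresses are all hypothetical, and the actual databases and tooling are not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BakedTextureEntry:
    """Hypothetical record: which host holds an avatar's baked textures."""
    avatar_id: str
    host_ip: str


def prune_stale_entries(entries, retired_ips):
    """Split entries into (kept, removed) based on retired host IPs.

    Entries pointing at an IP that no longer exists are removed so that
    nothing tries to download from it and waits for a timeout.
    """
    retired = set(retired_ips)
    kept = [e for e in entries if e.host_ip not in retired]
    removed = [e for e in entries if e.host_ip in retired]
    return kept, removed


# Example with made-up data (addresses from documentation-only ranges).
entries = [
    BakedTextureEntry("avatar-a", "192.0.2.10"),     # old, renumbered host
    BakedTextureEntry("avatar-b", "198.51.100.20"),  # host on the new network
    BakedTextureEntry("avatar-c", "192.0.2.11"),     # old, renumbered host
]
kept, removed = prune_stale_entries(entries, ["192.0.2.10", "192.0.2.11"])
```

In this sketch only the `avatar-b` entry is kept; the other two are removed because they point at retired addresses.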
[Posted @ 1:21pm PST, February 27, 2009]
The Rolling Restarts for the LLNet transition that were previously scheduled for Saturday, February 28th and Sunday, March 1st have been canceled for the foreseeable future. Please check back for any further updates that may be posted on this issue.