Archive for April, 2008

Brief Power Outage Planned - 2700 regions will be affected

Wednesday, April 30th, 2008

We have received information from our network providers of planned maintenance which will make approximately 2700 regions unreachable for approximately 3 minutes this evening, Wednesday 30th April at 11pm PDT.

There will be two such events; the next outage will occur on Monday 5th May at 11pm PDT for the same length of time and will affect a similar number of regions.

We cannot list all of the regions that may be affected, but if yours is one of them, please accept our apologies for what we anticipate will be a few minutes’ lack of connectivity.

Rolling Restart planned for Wed April 30/Thu May 1

Tuesday, April 29th, 2008

[Update 2008-05-01 08:02am] The rolling restart to deploy 1.21.1 to the rest of the grid began at about 5:00am this morning. It is now complete.

[Update 2008-04-30 09:35am] The rolling restart to deploy 1.21.1 to the first half of the grid began at about 6:15am, and is now complete. The rest of the grid will receive 1.21.1 tomorrow morning.

[Update 2008-04-29 5:30pm] We pushed another pilot roll to the same 3 racks as yesterday, at 5pm today. The roll is complete. The schedule below has been updated to reflect this.

[Update 2008-04-29 9:15am] Just to confirm the earlier update - we’re officially rescheduling the rolling restart to Wednesday/Thursday. The schedule below has been updated to reflect this.

[Update 2008-04-29 6:00am] Because of the ongoing network problems that we are struggling to resolve, the rolling restart has not begun yet this morning. It will almost certainly be postponed; the rolling restart is likely to happen Wednesday and Thursday mornings instead of today and tomorrow. More information will be posted here as it becomes available.

One of the changes that went out in the 1.21 Server codebase enables us to alleviate database load caused by “spare” simulators - processes waiting to pick up regions after a restart. Unfortunately, a bug was found that prevents us from enabling the service. The bug did not hold up the 1.21 Server deploy significantly since it affected hosts in only one of our co-location facilities, and the new service was disabled within a few minutes of this being noticed for those hosts.

To send out a fix and reap the benefits of lower database load we need to do a follow-up rolling restart to 1.21.1 Server. (We’re as thrilled as you are.) There are no behavior changes. No new viewer is required. Each region will be given a 5 minute warning and then restarted.

Schedule:

  • Tuesday 4/29, 5-6pm: Pilot roll to 3 racks
  • Wednesday 4/30, 5-11am: Roll to half of the grid
  • Thursday 5/1, 5-11am: Roll to rest of the grid
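The phasing in this schedule can be sketched in a few lines of Python. This is only an illustration, not the deploy tooling actually used: the function names, the rack labels, and the choice to split whatever remains after the pilot into two halves are all assumptions made for the example.

```python
from typing import Dict, List


def plan_rolling_restart(racks: List[str], pilot_racks: int = 3) -> Dict[str, List[str]]:
    """Split racks into a pilot phase, then half of the remainder, then the rest.

    Hypothetical helper: the real rollout tooling is not described in the post.
    """
    pilot = racks[:pilot_racks]
    remaining = racks[pilot_racks:]
    # If the remainder is odd, the first half takes the extra rack.
    midpoint = (len(remaining) + 1) // 2
    return {
        "pilot": pilot,
        "first_half": remaining[:midpoint],
        "rest": remaining[midpoint:],
    }


def restart_steps(region: str, warning_minutes: int = 5) -> List[str]:
    """Describe the per-region sequence: warn residents, then restart."""
    return [
        f"warn {region}: restart in {warning_minutes} minutes",
        f"restart {region}",
    ]


if __name__ == "__main__":
    phases = plan_rolling_restart([f"rack{i}" for i in range(10)])
    for name, members in phases.items():
        print(name, members)
    print(restart_steps("Example Region"))
```

Running a small pilot first limits how many regions are exposed if a new bug turns up, which is what happened with the 1.21 deploy.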

[RESOLVED] Problematic Logins (and forced logoffs)

Tuesday, April 29th, 2008

[05.51 AM RESOLVED] Logins should be restored for all affected accounts. It may take some time for things to return to normal as the system catches up and everyone starts logging back in, but they are back online. -Chiyo

[05.16 AM UPDATE] The good news is we have found the problem with logins. The bad news is that fixing it means logging off all accounts associated with one of our inventory databases while the fix is implemented. The affected accounts will not be able to log back in until the fix is complete.

There isn’t a way for you as a resident to know if you are one of those affected until the process begins. We apologize if this is abrupt or disconcerting for those involved, but ask for your patience and understanding. Logins should be restored for everyone soon. We will update you on the progress of the work and inform you when it is complete. -Chiyo

[04.55 AM] We are seeing problems with many residents unable to log in. This may be related to the previous problems with packet loss and teleports, but we are not sure at this point. We are working to restore logins for everyone and will report our findings when we have a clear picture of the situation. -Chiyo

[RESOLVED] Teleports and Region Crossings affected in-world

Tuesday, April 29th, 2008

[Resolved 7:06 AM PDT Kate] Teleport and region crossing functionality has been restored.

[05:48 AM - Update] Our engineering team is still working on the network problems, and we will keep updating you about their progress here.

[04.34 AM - Update] The network problems are proving to be more complex. Our engineering team is continuing to work hard to resolve this.

[03.34 AM - Update] The situation is continuing and we are working to remedy this as soon as we can.

[02.34 AM - Update] Our team are continuing to work on the network problems.

[01:34 AM - Update] Our engineering team is still looking into the issue at this time.

[12:26 AM] We have identified a small amount of packet loss across some of our network connections at present. This may well manifest itself in failing teleports or region crossings.

Our team are looking into this issue and we will update as soon as we can.

[RESOLVED] - In-world search affected

Tuesday, April 29th, 2008

[03:13 AM - RESOLVED] The in-world search server has been restored.

One of the servers responsible for in-world searching has encountered a problem and has been restarted. In-world searching will be restored to full capacity as soon as possible.

Second Life Grid Status Reports

Monday, April 28th, 2008

Are you following the Second Life Grid Status Reports? If your business, your event, or your time with friends may be impacted by the Grid Status, we want you to have all the information we can get you, as quickly and easily as possible.

We’ve been posting Status Updates mixed in with a variety of posts on other topics. We realize that many of you don’t want to have to hunt through the blog for needed Status information, so in an effort to make it easier for you, we’re following the tech industry standard and moving that information to a Status page.

Starting today, we’ll be collecting all the Second Life Grid Status Reports onto that one page. Now, you can check for an update without having to scan through all the other blog posts.

You can also subscribe to updates on your mobile phone using Twitter, or with an RSS reader.

Yep, get your Second Life Grid Status Reports, wherever you go.

Subscribe!

Now you can subscribe to the reports, and make sure they’re sent directly to you.

  • Get updates sent to your RSS reader.
  • Subscribe with your mobile phone and you can get Twitter updates while you’re out and about. Go ahead, take that meeting, go back to class, do your shopping, take a walk, get some other work done, and when the next Second Life Grid Status Report comes out, you’ll hear about it.
  • https://twitter.com/SLGridStatus

RSS: What is it? Status Reports sent Directly to You

  • RSS? The BBC offers a good, simple article that goes right to “How do I start…?”
  • If you’d like to get more technical, try Wikipedia

What is Twitter? Get Status Reports on your Mobile Phone or Computer

Quick Links: Second Life Grid Status Reports

Subscribe to the Second Life Grid Status Reports via Twitter

Subscribe to the Second Life Grid Status Reports via RSS


[RESOLVED] Database, asset server issues.

Saturday, April 26th, 2008

[RESOLVED 5:00PM PDT] Everything should have returned to normal by now as the repairs have finished. We apologize for any inconveniences you may have encountered during this time.

[UPDATE 4:00PM PDT] We still have not gotten word on the status of the repair. We apologize that this is taking so long. We hope to receive word soon.

[UPDATE 3:25PM PDT] Our vendor has dispatched an engineer to the facility. We are waiting for them to complete the repairs so we can bring the system back up to full speed. You may still notice intermittent problems until those repairs are completed. Further updates to follow.

[UPDATE 2:40PM PDT] The problem has been traced to a faulty piece of equipment, and we are attempting to have it repaired or replaced. Once that has been done, things should return to normal. We will keep you updated on our progress.

[2:14PM PDT] We are investigating intermittent database and asset server issues. These may affect logins, teleporting, viewing scripts/notecards, and rezzing items in world. We are aware of the problem and working to isolate the source. We will have updates as we have more information.

[RESOLVED] Reminder: Support Portal Maintenance Tonight: Now 9pm-Midnight PDT

Saturday, April 26th, 2008

[Resolved at 10:50pm Pacific] Our Support Portal is back online!

[Updated at 10:10pm Pacific] Maintenance is in progress. — Frontier

As reported earlier this week, our support portal will be offline for system maintenance tonight, Saturday, 26th April.

Our software supplier has reduced the length of the downtime from six hours to three; it will now run from 9:00pm to midnight PDT.

During that time, the support portal will be unavailable for chat or ticketing services.

Apologies for any inconvenience this may cause to you.

[RESOLVED] - Group Payouts Delayed

Saturday, April 26th, 2008

[8:51 AM - RESOLVED] - The manual process has been completed and all group payouts have been run.

[8:38 AM - Update] - Our engineering colleagues have identified the issue and are running the group payout process manually at the moment. It should be completed very soon and all groups will have paid out.

[8:17 AM - Update] - No further information on this at the moment.

We’ve been told by residents their nightly group distributions have not yet occurred. Upon further inspection we have found this may be true for other groups as well. We are investigating and will update when group distributions have completed or when we know when we can expect them to do so.

Rolling Restart for 1.21 Server Deploy Wed/Thu/Fri

Saturday, April 26th, 2008

[Updated Saturday @ 09:10am] The rolling restart of the rest of the grid is now complete.

[Updated Saturday @ 8:40am] The rolling restart of the rest of the grid is now in progress. It began at 5:10am, and is now 93% complete. As usual, each region will be down for ~5 minutes. If your region is down for more than 20 minutes, please contact support.

[Updated Saturday @ 7:06am] The rolling restart of the rest of the grid is now in progress. It began at 5:10am, and is now 46% complete. As usual, each region will be down for ~5 minutes. If your region is down for more than 20 minutes, please contact support.

[Updated Saturday @ 6:05am] The rolling restart of the rest of the grid is now in progress. It began at 5:10am, and is now 16% complete. As usual, each region will be down for ~5 minutes. If your region is down for more than 20 minutes, please contact support.

[Updated Saturday @ 5:10am] The rolling restart of the rest of the grid is now in progress. It began at 5:10am; we will post hourly updates with a percentage completed. As usual, each region will be down for ~5 minutes. If your region is down for more than 20 minutes, please contact support.

[Updated Friday @ 8:39am] The rolling restart to half of the grid is now complete but for 7 hosts that needed to be manually updated; those will be completed within a few minutes. The rest of the grid will be updated tomorrow morning.

[Updated Thursday @ 7:10pm] We have completed the deploy of 1.21 to 3 racks (632 regions). Here is a list of regions that as of now are on version 1.21.0.85745.

[Updated Thursday at 12:47pm] We have deployed 1.21 to 1 rack (about 170 regions) again. If all goes well, we will continue with the tentative timeline listed in the Wednesday at 8:10pm update below.

[Update Wednesday @ 9:15pm] A slight and subtle wrinkle during the deploy left some object-to-object emails non-functional. The responsible systems have gotten a stern talking to, and this service should be operational again.

[Update Wednesday @ 8:10pm] Another bug was found after we rolled out to one rack. That bug has been found and fixed. We will evaluate exactly what we’re going to do with this deploy after testing tomorrow, but it will likely shift the timeline forward by one day. Meanwhile, we are rolling back the 170 regions that had previously received a 1.21 deploy so that all simulators are once again running on version 1.20.1 of the server code.

The central updates to 1.21 are complete and things seem “nominal” at the moment, but of course we’ll be watching closely.

  • Wednesday 4/23 @ 11am - deploy to 1 rack [DONE] [REVERTED]
  • Wednesday 4/23 - update central systems throughout the day [COMPLETE]
  • Thursday 4/24 @ 6pm - deploy to 3 racks [COMPLETE]
  • Friday 4/25 @ 5am-11am - deploy to half of remaining servers
  • Saturday 4/26 @ 5am-11am - deploy to remaining servers

[Update Wednesday @ 10:25am]

The bug in the 1.21 Server code identified last night during an initial rollout to 1 rack has been found, fixed, and verified. We’re planning to proceed with the rollout to avoid delaying the code update another week. On the table for today are the central services updates and limited rolling restarts.

What’s Changed in 1.21 Server

The most notable fixes will be physics-related, and have been in testing in the Beta Preview for several days. No new viewer is required.

Read on for more information…
