Today’s ramble is about the new BlackBerry 10 Infrastructure and how I see it interacting with a dual data center environment. I need to emphasize that this discussion is based solely on the RTM version of the BlackBerry 10 Enterprise Server.
For those who have read the blog before, you know that I’m passionate about two things: 1) PowerShell, and 2) High Availability. Seeing as there aren’t yet PowerShell scripts for use with the BlackBerry 10 Infrastructure (I’ll start working on them immediately), I’ll have to write about the latter.
Background – Exchange Infrastructure
For the purposes of this discussion, consider a highly available Exchange Infrastructure across two data centers in an Active/Active configuration. The Exchange version is immaterial; suffice it to say that this applies to Exchange 2010 or later. To ease the discussion, we’ll call the two data centers “East” and “West.”
The data centers each have a bunch of Mailbox Servers and Client Access Servers in them. The Client Access Servers are configured in an RPC Client Access Array behind some type of load balancer. The configuration of the load balancer falls outside the scope of this discussion.
The mailbox servers are configured in two Database Availability Groups to provide high availability. In a production environment, these would most likely have two database copies within the data center and one in the remote data center to enable site resilience and still offer a local failover in case of a single server failure.
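To make that copy layout concrete, here is a minimal Python sketch that checks whether a database has two copies in its home data center and at least one in the remote one. The server names (MBX-E1 and so on) and the function itself are hypothetical illustrations, not anything Exchange provides.

```python
from collections import Counter


def check_copy_layout(copies, home_site, local_required=2, remote_required=1):
    """Return True if a database has enough local and remote copies.

    copies maps a (hypothetical) mailbox server name to the data center
    that hosts it, one entry per database copy.
    """
    per_site = Counter(copies.values())
    local = per_site.get(home_site, 0)
    remote = sum(n for site, n in per_site.items() if site != home_site)
    return local >= local_required and remote >= remote_required


# Hypothetical layouts: DB01 follows the recommendation, DB02 does not.
db01 = {"MBX-E1": "East", "MBX-E2": "East", "MBX-W1": "West"}
db02 = {"MBX-W1": "West", "MBX-E1": "East"}
```

Here `check_copy_layout(db01, "East")` passes, while `check_copy_layout(db02, "West")` fails because DB02 has only one copy in its home data center.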
Placement of BlackBerry Servers
From my reading of all the BlackBerry documentation for the RTM version of BlackBerry 10 Enterprise Server (which I will refer to as “BESX”), the high availability that we had in BlackBerry 5.X is no longer available. I’m speaking specifically about a BES Instance Pair, where a pair of servers runs the same BlackBerry Instance so that if a service on one server fails, the other takes over automatically.
Being “Highly Available” in BlackBerry 10
As I just stated, there really isn’t a true HA configuration yet for BlackBerry 10. That said, you can (and should) set up additional instances of the BlackBerry 10 Core Services. BlackBerry has moved from licensing both servers and clients to licensing clients only (someone please correct me if I’m misunderstanding this), which encourages me to install as many servers in a single BlackBerry domain as I can. Remember that the simple definition of a BlackBerry Domain is the set of all servers and devices that use the same configuration database.
So what does this lack of native HA for BlackBerry 10 mean for us? That we get to install more servers! Since most people these days are probably running servers virtually (VMware or Hyper-V), adding servers doesn’t carry the sting of hardware costs that it did when we used physical hardware.
Sketched out, a design like this ends up looking something like this:
There are several layers to making this highly available, so let me speak to each in turn.
SQL Server
BESX supports SQL Server 2012. SQL Server 2012 introduces a new high availability model called “AlwaysOn Availability Groups” (sound familiar to those Exchange Admins out there?) where a database is active for reads and writes on one server (the primary replica) and can be opened for reads only on another (a secondary replica). If the read/write host becomes unavailable, a secondary is promoted to read/write. This makes the database highly available between data centers.
BlackBerry Administration Services (BAS)
The BlackBerry Administration Services (BAS) chew up a metric ton of memory when deployed. To help alleviate that, you can install them in a “Remote Configuration” and run them separately from the BlackBerry “Core” Services (Controller, Dispatcher, MDS, Enterprise Management Web). My recommendation would be to set up one (or more) of the BAS in each data center. Then you can use whatever intelligent load balancer you like to poll each of them and direct traffic to one that responds.
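The polling idea can be sketched in a few lines of Python. This is only an illustration of what the load balancer does for you, under some loud assumptions: the host names are made up, port 443 is a guess rather than a documented BAS port, and a plain TCP connect stands in for whatever health check your load balancer actually performs.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_responding(endpoints, probe=tcp_probe):
    """Return the first (host, port) that answers the probe, or None."""
    for host, port in endpoints:
        if probe(host, port):
            return (host, port)
    return None


# Hypothetical BAS endpoints, one per data center; the port is an assumption.
BAS_ENDPOINTS = [
    ("bas-east.example.com", 443),
    ("bas-west.example.com", 443),
]
```

Calling `pick_responding(BAS_ENDPOINTS)` returns the first endpoint that accepts a connection, so listing the local data center first gives it preference. The `probe` argument is there so you can swap in a stricter check, such as one that requests the login page.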
BlackBerry “Core” Services
The four “Core” BlackBerry Services (those not associated with the Administration Services) are bound to a single BlackBerry “Instance.” For the purposes of BESX, the Instance can be thought of as being the same as the server. Remember that in a BlackBerry 5.X environment, an Instance can be bound to two different servers (one in each data center) so that the services are “shared” between them to allow for high availability. As I stated earlier, this is not yet available for BESX.
Since the preferred transport has moved from MAPI to ActiveSync, the load on the servers has been greatly reduced. I’ve seen processor utilization of less than 10% and memory use of less than 3 GB for the Core Instances for BESX. This doesn’t mean that the server isn’t doing anything, because it is: it is essentially making all the ActiveSync requests on behalf of the handheld device. All of this is done via the Mobile Data Service, so that service needs to be monitored religiously.
Data Center Down!!!
In the event that you have a data center down situation, start by performing the data center switchover of the Exchange Environment as outlined by the Exchange Team on their blog (includes an awesome PowerPoint decision tree).
If you have access to the failing data center, shut down all the BlackBerry Servers there. Then log into the BlackBerry Administration Service in your surviving data center and move all the users bound to the failed instance to the surviving instance. Because the surviving instance has to absorb those users, I’d recommend that you don’t overload your BlackBerry “Core” servers.
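To see why headroom matters, here is a rough Python sketch of the planning arithmetic. Everything in it is hypothetical: the instance names, the user counts, and especially the per-instance capacity figures, which you would have to determine for your own environment. It only does the math; the actual user move still happens in the BlackBerry Administration Service.

```python
def plan_failover(instances, failed_site):
    """Spread users from instances in failed_site across the survivors.

    instances maps an instance name to a dict with "site", "users" and
    "capacity" (capacity is a planning number you choose, not a BESX limit).
    Returns {surviving instance: user count after the move}.
    Raises ValueError if the survivors lack the headroom.
    """
    survivors = {n: i for n, i in instances.items() if i["site"] != failed_site}
    displaced = sum(i["users"] for i in instances.values() if i["site"] == failed_site)
    headroom = sum(i["capacity"] - i["users"] for i in survivors.values())
    if displaced > headroom:
        raise ValueError(
            f"{displaced} users to move but only {headroom} seats of headroom"
        )

    plan = {name: info["users"] for name, info in survivors.items()}
    remaining = displaced
    for name, info in survivors.items():
        # Fill each survivor up to its capacity before moving to the next.
        take = min(remaining, info["capacity"] - info["users"])
        plan[name] += take
        remaining -= take
    return plan


# Hypothetical environment: one Core instance per data center.
instances = {
    "BESX-E1": {"site": "East", "users": 400, "capacity": 1000},
    "BESX-W1": {"site": "West", "users": 450, "capacity": 1000},
}
```

With these numbers, losing East leaves BESX-W1 carrying 850 users, which fits. Run both instances at 700 users each and the same call raises an error, which is exactly the situation you want to discover before the data center goes down.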
When the data center is back online, bring your Exchange Infrastructure back online and test it thoroughly. Once that gets the green-light, spin up the BESX Servers and shift your users back to their primary data center.