Discussion:
FSMO Role Seizures for DR Testing?
David
2006-10-18 17:36:45 UTC
Good afternoon all,

I have a question regarding FSMO role transfers/seizures for the
purposes of a disaster recovery exercise. Currently, our AD forest
consists of 3 domains (designated A, B, and C). Each domain has 3
domain controllers, with the FSMO roles distributed accordingly. I
have recently built an additional 4th domain controller for each domain
in preparation for an upcoming disaster recovery exercise at a remote
location. These 3 new domain controllers were shipped to the remote
location and brought online using a VPN connection to our home office.

During our DR exercise, I will be traveling to our remote location
and we will be severing our VPN connection to our home office to
simulate it "being wiped out", so to speak. With these DR domain
controllers at the remote location (as well as various file and app
servers), we plan to conduct testing for about 48 hours. After the test
is over, the VPN connection will be re-established and these "remote"
DC's will remain online as off-site domain controllers for the network.

My question is two-fold. First, once we sever the VPN connection
during our disaster simulation/testing, will I need to seize all the
FSMO roles on these 3 "remote" DC's in order for AD to function
correctly? Second, if seizure is the appropriate way to do this, once
the test is over and the link is re-established, won't my "home office"
DC's panic/freak out when they see that another DC in their respective
domain holds an identical FSMO role? I have read several articles
suggesting that in a scenario like this, the DC's will sort out who the
original holder was and the duplicate holder will relinquish control of
that role back to the original DC without any intervention on my part.

Is there anyone that might be able to provide me some insight into what
I should be doing here? I am sure that I am not the only admin to go
through a scenario/test much like this one! All input/assistance is
greatly appreciated!

-David
Danny Sanders
2006-10-18 17:52:59 UTC
Actually, MS suggests that the server a FSMO role has been "seized" from NOT be
brought back into the domain.
See "Notes" under Seize FSMO roles here:
http://support.microsoft.com/kb/255504/en-us

hth
DDS
Paul Williams [MVP]
2006-10-18 17:53:58 UTC
Good afternoon to you.

Post by David
Once we sever the VPN connection during our disaster simulation/testing,
will I need to seize all the FSMO roles on these 3 "remote" DC's in order
for AD to function correctly?

No. Only the PDCe for each domain. The others aren't necessary for a
short-term outage and won't provide any functionality for a DR test.

Post by David
And second, if the seizure is the appropriate way to do this; once the
test is over and the link re-established won't my "home office" DC's
panic/freak-out when they see that another DC in their
respective domain holds an identical FSMO role?

If you only seize the PDCe then this isn't an issue. If you were to seize
the RID master, you could not bring the original role holder back online
like this.

Note: those are the guidelines. In actuality, in the current versions of the
OS, this wouldn't be an issue due to the initial sync that occurs when a DC
starts. However, it is best not to do it with anything other than the PDCe.

Post by David
I have read several articles suggesting that in a scenario like this, the
DC's will sort out who the original holder was and the duplicate holder
will relinquish control of that role back to the original DC without any
intervention on my part. Is there anyone that might be able to provide me
some insight into what I should be doing here?

If you take my advice and only seize the PDCe, then when you bring them back
online, reboot one of them and all should be well. If it isn't, simply
transfer the role to the correct server again.
--
Paul Williams
Microsoft MVP - Windows Server - Directory Services
http://www.msresource.net | http://forums.msresource.net
David
2006-10-18 18:10:46 UTC
Post by Paul Williams [MVP]
If you take my advice and only seize the PDCe, then when you bring them
back online, reboot one of them and all should be well. If it isn't,
simply transfer the role to the correct server again.

Paul - First, thank you for the incredibly fast reply! So to summarize:
I should seize *only* the PDC Emulator role on the disaster recovery DC
in each of my 3 domains (after breaking the link and starting the
exercise), correct? When the test is over and I re-establish the VPN
back to the home office, I reboot the off-site DC that I seized the role
with and everything should be back to normal. If it still appears that
2 servers hold the PDCe role when I check with NTDSUTIL.exe, then I just
transfer the role back to the original home office DC, right?
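
A minimal sketch of that last check, assuming you have noted by hand which
server each DC reports as the holder of each role once the VPN is back up
(for example from NTDSUTIL output). The DC names and the find_role_conflicts
helper are hypothetical and only illustrate the comparison:

```python
from collections import defaultdict

def find_role_conflicts(views):
    """views maps each DC name to {role: holder that DC believes owns it}.

    Returns {role: sorted list of claimed holders} for every role on which
    the DCs disagree. An empty dict means all DCs agree.
    """
    claimed = defaultdict(set)
    for dc, roles in views.items():
        for role, holder in roles.items():
            claimed[role].add(holder.lower())
    return {role: sorted(holders)
            for role, holders in claimed.items()
            if len(holders) > 1}

# Hypothetical data: what each DC in one domain reports after reconnecting.
views = {
    "hq-dc1": {"PDC": "hq-dc1", "RID": "hq-dc2"},
    "hq-dc2": {"PDC": "hq-dc1", "RID": "hq-dc2"},
    "dr-dc1": {"PDC": "dr-dc1", "RID": "hq-dc2"},  # still thinks it is the PDCe
}

conflicts = find_role_conflicts(views)
print(conflicts)  # {'PDC': ['dr-dc1', 'hq-dc1']}
```

If the result is still non-empty after the reboot and a replication cycle,
that would be the point to transfer the role back to the home office DC.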
Paul Williams [MVP]
2006-10-18 18:24:42 UTC
Yes, that is correct.

No worries re. the help. The speed was a timezone/ luck thing. I've got a
couple of minutes to kill before the Metallica documentary "Some kind of
monster" starts... ;-)
David
2006-10-18 19:40:20 UTC
The DR test is to validate that we can bring up an alternate site in
the event of a catastrophe at our main office. We will be restoring
close to 2 dozen file and application servers from backup during the
test. The additional DC's that were built will remain online after the
test at our co-lo facility.
David
2006-10-18 19:49:26 UTC
But, they will need to function autonomously as if the home office site
did not exist for the purposes of the test. No DC's will be permanently
removed on either side here, and there will be no restores of any DC's
involved. In fact, the 3 DC's we will be using at the remote site
arrived there today and we currently have them online via our VPN to
that remote site. So I guess my next question is: is seizing
FSMO roles even necessary? My initial assumption was that since the
link between the 2 sites would only be severed for 2 days, the DC's in
the remote site should be able to function normally in the absence of
the home office DC's.
Paul Williams [MVP]
2006-10-19 19:57:09 UTC
Your initial assumption is correct. You should be able to happily work
with few problems with the PDCe being offline for 48 hours.

See the other posts from Jorge and me re. performing a seizure in production.
While I agree it is not a great idea, it shouldn't cause you any problems.
However, thinking about it, it might be an unsupported thing to do from a
MSFT PSS standpoint.
David
2006-10-24 15:14:41 UTC
I appreciate you both providing me input. Unfortunately, since we are
also in the middle of a Novell to MS migration, we do not have a
lab/test environment set up at this time in which I could test all of
these items before the actual DR exercise occurs next month. I do not
believe we will be provisioning any new users or doing password changes
during the simulated outage; the primary concentration will be on network
login and access to files/applications. The only servers that will be
restored from backup for the exercise will be file and application
servers. None of the AD DC's will be restored; they have all been built
from scratch.
Jorge de Almeida Pinto [MVP - DS]
2006-10-24 19:17:34 UTC
To be honest, I would not run a DR test in the middle of a migration... either one
might impact the other.
--
Cheers,
(HOPEFULLY THIS INFORMATION HELPS YOU!)

# Jorge de Almeida Pinto # MVP Windows Server - Directory Services

BLOG (WEB-BASED)--> http://blogs.dirteam.com/blogs/jorge/default.aspx
BLOG (RSS-FEEDS)--> http://blogs.dirteam.com/blogs/jorge/rss.aspx
------------------------------------------------------------------------------------------
* This posting is provided "AS IS" with no warranties and confers no rights!
* Always test before implementing!
------------------------------------------------------------------------------------------
David
2006-11-13 23:54:43 UTC
Gents,

I thank you both for your valuable input, and I have 1 last
question for you both. Since it now seems that seizing FSMO roles will
not be required (or recommended) for this upcoming test, I have a
question about computer accounts and the servers that we will be restoring.

As it stands now, we have our DC's at the remote site online and
replicating with the DC's at our home office. When the time for the
test comes, we will be flying to our remote site and disconnecting the
VPN that connects us back to our main headquarters. At that point, we
will conduct the restores of several dozen servers from backup and
bring them online in the domain.

My question is: how do we handle this, since the "real" servers already
exist in AD and we just won't be able to communicate with them during
the test? When we bring the restored servers online and attempt to join
them to the domain, won't the remote site DC's throw a fit because those
servers already exist in AD? Also, when we re-establish the link to our
main headquarters, how do we handle the introduction of these recovered
servers back to our production network where the "real" servers already
exist?

This is all likely obvious and I'm simply overlooking it - I could've
sworn I already had most of this documented and researched but I'm not
able to find it now :( Can I ask you both for further assistance?

-David
Paul Williams [MVP]
2006-11-14 07:21:07 UTC
If you restore them, they will be members of the domain with the same
machine account password as the real machines that are unreachable. So
that should work OK. An issue can arise when you power those restored
machines off and get the line back up. The machine account passwords
might have been changed in the meantime. If so, you'll have a broken
secure channel between the server and the DC and will need to reset it.

Again, the usual recommendation is to do this in an isolated lab,
but you can do this in production. You just need to be aware that you might
need to reset the secure channel on the original boxes when the test is
finished (and you might have a bunch of errors in the event logs).
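
As a rough sketch of that verify-then-reset step, the check could be wrapped
like this. It assumes nltest is installed on the member server and that it
signals failure through a non-zero exit code, which you should confirm
before relying on it. The function name and the example domain are
hypothetical:

```python
import subprocess
from types import SimpleNamespace

def check_secure_channel(domain, run=subprocess.run):
    """Verify the secure channel to `domain`; try a reset if that fails.

    Assumption: nltest reports failure through a non-zero exit code.
    Returns "ok", "reset", or "failed".
    """
    verify = run(["nltest", f"/sc_verify:{domain}"],
                 capture_output=True, text=True)
    if verify.returncode == 0:
        return "ok"
    reset = run(["nltest", f"/sc_reset:{domain}"],
                capture_output=True, text=True)
    return "reset" if reset.returncode == 0 else "failed"

# Dry run with a stand-in for subprocess.run, so the logic can be tried
# on any machine without touching a real domain.
def fake_run(args, **kwargs):
    # Pretend verification fails and the reset then succeeds.
    return SimpleNamespace(returncode=1 if "/sc_verify" in args[1] else 0)

print(check_secure_channel("a.example.com", run=fake_run))  # reset
```

On a real server you would call check_secure_channel with the actual domain
name and leave the run argument at its default.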
David
2006-11-14 16:10:01 UTC
Well, from what I have heard this morning, it looks like we will be
building the base OS on the servers we plan to restore and then
restoring the system states and data on top of that. I would assume
that restoring the system state would retain the original SID, so AD
would still think it was the original server back at our HQ, correct? If
we find that the SID changes, once we sever the link we plan to delete
the server objects for the machines we will be restoring out of AD,
restore each server, rejoin it to the domain, and conduct our testing.
Since the DC's we have at the remote site are VM's, we're going to
snapshot them right before the start of the test, before we make any
changes. Once the test is over, we will roll those DC's back to that
snapshot and bring the link back up. The production AD back
at our HQ will simply think those DCs went offline for a while, correct?

On a side note, I have seen instances in the past where you cannot
log in to a DC with anything other than the original Administrator
account for the domain. For example, when I created these new DC's and
severed their connection from the production network, I couldn't log in
to them with anything but the original Administrator account for the
domain, despite the fact I had my own personal domain admin account. Why
is this, and will the same thing happen at our remote site when we sever
the link from our main office during the DR test?

Jorge de Almeida Pinto [MVP - DS]
2006-10-18 18:22:55 UTC
Post by Paul Williams [MVP]
No. Only the PDCe for each domain. The others aren't necessary for
short-term outage and won't provide any functionality for a DR test.

I don't agree here....

during DR the FSMO really needed is the RID master FSMO.

why....
(1) as soon as a DC is restored it invalidates its current RID pool and it
wants a new RID pool from the RID master. Until it has received one, it will
not advertise itself
(2) the same applies to newly promoted DCs if needed

Post by David
DC's will remain online as off-site domain controllers for the network.
My question is kind of two-fold: Once we sever the VPN connection
during our disaster simulation/testing, will I need to seize all the
FSMO roles on these 3 "remote" DC's in order for AD to function
correctly? And second, if the seizure is the appropriate way to do
this; once the test is over and the link re-established won't my "home

I would like to know first what the DR test is about... is it just to
disconnect the remote office from the main office?

if yes, there is no need to seize FSMO roles

besides that, I would not just seize the roles in a production env while the
current FSMO role owner is online
Paul Williams [MVP]
2006-10-18 18:29:23 UTC
Post by Jorge de Almeida Pinto [MVP - DS]
during DR the FSMO really needed is the RID master FSMO.
why....
(1) as soon as a DC is restored it invalidates its current RID pool and it
wants a new RID pool from the RID master. Until it has received one, it
will not advertise itself
(2) the same applies to new promoted DCs if needed

Ah ha. No mention of restoration was made. The line is going to be
severed, therefore they want to see if the DR site can function
autonomously.

Post by Jorge de Almeida Pinto [MVP - DS]
besides that, I would not just seize the roles in a production env while
the current FSMO is online

Valid point. But to perform the kind of test that the OP is asking about...

And...there's no issue with the PDCe anyway.

But yes, you are _possibly_ more sensible than I... ;-)
Jorge de Almeida Pinto [MVP - DS]
2006-10-18 18:49:38 UTC
Post by Paul Williams [MVP]
Ah ha. No mention of restoration was made. The line is going to be
severed, therefore they want to see if the DR site can function
autonomously.

so why would you need to seize the role(s)?

IMHO, you ONLY seize a FSMO role when the current FSMO role owner is dead (or
disconnected and will not come back to communicate with the NEW FSMO role
owner). In case a WAN goes bad... that is not an immediate reason to go and
seize roles.
Paul Williams [MVP]
2006-10-18 19:07:13 UTC
Valid points.
Tomasz Onyszko
2006-10-18 19:53:47 UTC
Post by Paul Williams [MVP]
Valid points.

Just as an addition to this conversation, here are
Brian Puhl's comments on FSMO placement and seizure in case of emergency:
http://blogs.technet.com/bpuhl/archive/2005/12/07/415761.aspx
--
Tomasz Onyszko
http://www.w2k.pl/ - (PL)
http://blogs.dirteam.com/blogs/tomek/ - (EN)
Paul Williams [MVP]
2006-10-19 17:23:43 UTC
I didn't read all of the blog but I agree with the sentiment. I have
implemented the design I'm currently working on the same way - enterprise
roles on one DC; domain roles on another. Plus a standby in the other
datacentre.

In this case, all that is needed is a good process. I'm writing a tool that
will allow us to implement a process or processes for recovering from any
eventuality via the systems management tool (a number of
different packages that call my tool with different arguments).
David
2006-10-19 18:19:38 UTC
Were we able to decide if Paul or Jorge had the correct steps, or are
they both correct to an extent? At this point, I think that seizing the
PDCe role for the remote site DC's during the outage is likely to be
the best course to ensure all is well once we sever that link... Paul?
Any other thoughts?
Jorge de Almeida Pinto [MVP - DS]
2006-10-19 18:56:28 UTC
Post by David
Were we able to decide if Paul or Jorge had the correct steps, or are
they both correct to an extent?
DR cannot be decided/designed in a sec! It is difficult enough when you
are doing it yourself, so think how difficult it is when you are not
involved and do not know all the details..

until now I have heard two DR scenarios
(1) you only want to test what to do if the WAN dies.--> In that case: DO
NOT TOUCH ANYTHING from AD. Fix the WAN
(2) your main site dies where the FSMO role owners are located and you have
to relocate to another site --> seize all FSMO roles

I give this info with the info I know at this moment
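
The two scenarios above can be written down as a small decision sketch.
This is only an illustration: the Scenario names and the roles_to_seize
helper are hypothetical, and the only AD facts it relies on are that two
FSMO roles exist once per forest and three exist once per domain.

```python
from enum import Enum

class Scenario(Enum):
    WAN_OUTAGE = "WAN link down, main site still alive"
    MAIN_SITE_LOST = "main site with the FSMO role owners is gone"

# Two roles exist once per forest, three exist once per domain.
FOREST_ROLES = ["Schema master", "Domain naming master"]
DOMAIN_ROLES = ["PDC emulator", "RID master", "Infrastructure master"]

def roles_to_seize(scenario, domains):
    """Return a list of (scope, role) pairs to seize for the scenario."""
    if scenario is Scenario.WAN_OUTAGE:
        # Scenario (1): do not touch anything in AD, fix the WAN.
        return []
    # Scenario (2): the old role owners are not coming back, seize everything.
    plan = [("forest", role) for role in FOREST_ROLES]
    for domain in domains:
        plan += [(domain, role) for role in DOMAIN_ROLES]
    return plan

domains = ["A", "B", "C"]
print(roles_to_seize(Scenario.WAN_OUTAGE, domains))           # []
print(len(roles_to_seize(Scenario.MAIN_SITE_LOST, domains)))  # 11
```

For a forest with 3 domains, scenario (2) means 2 forest-wide roles plus
3 roles in each domain.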

besides all this, I'm interested to hear if such a scenario was tested in a
test environment before even thinking about trying in production...
I do not want to be rude, but if this has not been tested in a test
environment and you still have questions, please, please, please GO BACK TO
A TEST ENVIRONMENT! and try it there FIRST!
Paul Williams [MVP]
2006-10-19 19:54:53 UTC
Jorge is right. Test environment makes the most sense first.

Perhaps I was a little hasty in answering. Although what I said is 100%
true and there will be no issues, Jorge is correct in saying that it's not
the greatest of ideas in production.

You can simulate a WAN outage without seizing the FSMO roles. You might hit
some issues with password changes, etc. due to the PDCe being offline. You
shouldn't hit any issues with any of the other roles unless you are
provisioning users or you restore some DCs.
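
To make that concrete, here is a small sketch that checks a planned list of
test activities against the roles that will be unreachable. The activity
names and the mapping are illustrative assumptions drawn from this thread,
not an exhaustive list; user creation in particular normally only fails
once the local DC has used up its RID pool.

```python
# Which role holder each activity may need to reach (None = no FSMO needed).
# Illustrative mapping based on the discussion in this thread.
DEPENDS_ON = {
    "password change": "PDC emulator",
    "create users": "RID master",
    "restore a DC": "RID master",
    "promote a new DC": "RID master",
    "log on": None,
    "access files/applications": None,
}

def at_risk(planned, unreachable_roles):
    """Return the planned activities that depend on an unreachable role."""
    return sorted(activity for activity in planned
                  if DEPENDS_ON.get(activity) in unreachable_roles)

# Hypothetical 48-hour test plan with no roles seized at the DR site.
planned = ["log on", "access files/applications", "password change"]
unreachable = {"PDC emulator", "RID master"}

print(at_risk(planned, unreachable))  # ['password change']
```

An empty result would suggest the test can run without seizing anything.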

You could also do the test without doing the seizing, then evaluate the
results and look at performing a similar test with a seizure of the
PDCe if you felt the lack of a PDCe was causing you issues.

Real-life DR tests are tough. Unless you have a real nice pre-prod
environment, you are always going to be limited.