Has this ever happened to you? Your NetInfo network has been working beautifully for months. Suddenly mars, your trusty master NetInfo server, decides to head south. Annoying, to say the least.
If you're lucky, this hasn't happened, and if you stay lucky, it never will. However, just as you hope you never need to use that first aid kit, it's a good idea to have one and to know how to use it. In this article, we take a look at what you can do to recover should disaster strike.
nothing's the same without you
What's life like without the master NetInfo server? The clone servers will take over most responsibilities (you do have clone servers, don't you?). However, clone servers provide only read access to domains. Without a master server, you can't do anything that requires write access, such as adding hosts, deleting user accounts, modifying NFS (Network File System) mounts, or even changing passwords. This might be okay for a while, but eventually you're going to need full access to the domain.
what to do now
So, here you are without a master NetInfo server. What's the next step? There are four things you can do to get back to normal. We'll say more about their varied complexity and desirability in a minute. Basically, your choices are:
* Restore the NetInfo database from a clone server
* Restore the NetInfo database from a backup
* Make one of the clone servers the master
* Start over from scratch
decisions, decisions
Which of these approaches should you take? As is frequently the case, there isn't an absolute answer. Your approach depends on what happened to the server, how long it will be until the master is available again, how long you can wait until you have full NetInfo access, and whether you have a clone server or backups available. Let's take a look at the possible scenarios:
* The NetInfo database is corrupted. If the database has been corrupted, you might have problems with binding during system boot, lookups might not be successful (people can't log in, for example), or NetInfoManager might show bizarre entries. In this case, restoration from a clone isn't feasible, because the clone database will probably be corrupted as well. Your choices here are to restore from a backup (you do maintain good backups, don't you?) or, if one isn't available, to start from scratch, which is not a pleasant thought.
* The master server is temporarily unavailable, and you can wait for full NetInfo access until it's back. If the server failure doesn't involve its disk, there's no problem. Wait for the repair, turn on the server, and life continues as usual. If the disk is part of the problem, such as a disk crash or operator error (perhaps someone mistakenly erased /private/etc), once the server is functioning again you need to restore the NetInfo database. In this case, you can restore from a clone (best choice), restore from a backup (okay choice), or start over from scratch (yecccch).
* The master server is permanently unavailable, or you can't wait until it's repaired. If the master server is permanently out of commission (due to fire, explosion, or theft, for example), your best bet is to make one of the clones the new master. If you have no clones, you need to set up a new master and restore the database from a backup. If you have no backups, you need to start over.
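The three scenarios above amount to a small decision table. As a rough sketch only (written in Python purely for illustration; the function name, parameter names, and returned strings are made up for this example and are not part of NetInfo), the reasoning could be captured like this:

```python
def choose_recovery(db_corrupted, master_gone_for_good, disk_affected,
                    have_clone, have_backup):
    """Return the recovery approach suggested by the scenarios above.

    All names here are hypothetical; this only encodes the article's advice.
    """
    if db_corrupted:
        # A clone has probably copied the corruption, so clones are no help.
        return "restore from a backup" if have_backup else "start from scratch"

    if master_gone_for_good:
        # Also covers the case where you can't wait for the repair.
        if have_clone:
            return "make a clone the master"
        if have_backup:
            return "set up a new master and restore from a backup"
        return "start from scratch"

    # From here on, the master is only temporarily unavailable.
    if not disk_affected:
        return "wait for the repair"
    if have_clone:
        return "restore from a clone"      # best choice
    if have_backup:
        return "restore from a backup"     # okay choice
    return "start from scratch"


# Example: the disk crashed, the repair is under way, and a clone exists.
print(choose_recovery(db_corrupted=False, master_gone_for_good=False,
                      disk_affected=True, have_clone=True, have_backup=True))
```

The order of the checks matters: corruption is tested first because it rules out clones no matter what else is true.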
how to recover from a crash
Having decided which approach is most appropriate for your situation, you're ready to dive in and fix things.
restoring from a clone
If you've decided to wait for your server to be repaired, you'll probably be restoring files from backups. In this case, a clone server will probably have a version of the NetInfo database that's more recent than the one on your backups. In this situation, restore all the files you need from backups and then copy the NetInfo database from a clone server (more on how to do that in a minute).
If your problem was an accident with part of the directory structure on the server, you'll probably recover the files from your operating system CD-ROM, your backups, or another computer, depending on what you've lost. Regardless of where you get the other files, copy the NetInfo database from a clone.
Here's what to do to copy the NetInfo database from a clone server to your master server:
1. Copy the appropriate directory from the clone onto a removable disk. Make sure you copy the entire directory (for example, /etc/netinfo/network.nidb), not just the files under it (/etc/netinfo/network.nidb/*).
2. Turn off the master server, disconnect the server from the network, and then start it up in single-user mode.
3. Delete the corresponding directory from the master server if it exists (/etc/netinfo/network.nidb). Remove the entire directory, because any extension files left in the directory will interfere with the new database once it's been installed.
4. Copy the directory from the removable disk onto the server's hard disk.
5. Turn off the master server, reconnect it to the network, and start it up.
That's it. Your NetInfo network should be back to where it was before the disaster.
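The key point in steps 3 and 4 is that the old directory is removed completely before the new one is copied in, so nothing stale is left behind. A minimal sketch of that delete-then-copy pattern follows. It is written in Python purely for illustration, and the function name and example paths are assumptions; on the server itself you would do the same thing with ordinary shell commands in single-user mode.

```python
import shutil
from pathlib import Path


def replace_database(source_dir, target_dir):
    """Replace target_dir with a complete copy of source_dir.

    Hypothetical helper: the whole target directory is deleted first so
    that no leftover files from the old database survive the copy.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"no database directory at {source}")
    if target.exists():
        shutil.rmtree(target)          # step 3: remove the entire directory
    shutil.copytree(source, target)    # step 4: copy the directory as a whole
    return sorted(p.name for p in target.iterdir())


# Example call (paths are illustrative only):
# replace_database("/removable/network.nidb", "/etc/netinfo/network.nidb")
```

Copying the directory itself, rather than the files under it, is what keeps the database internally consistent.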
restoring from a backup
If you're able to wait while your server is repaired, but you don't have a clone server available to copy from, you'll need to restore the NetInfo database from a backup. Because whatever happened to the database probably happened to the rest of the data on the disk, restoring your most recent backups will take care of everything, including the NetInfo database.
One word of caution if you're restoring from incremental backups: The database is made up of multiple files, and it's critical that you not end up with some files from one backup and other files from another. Make sure that the directory containing the database (such as /etc/netinfo/network.nidb) is restored as a whole, with no files left from some other backup. Also remember that any changes made between the time of your most recent backup and the time your server crashed will be lost. For example, let users know that if they changed their passwords during this time, they will have to go back to the passwords in effect when the last backup was made.
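One way to check that a restored database directory really came from a single backup is to compare its contents against that backup's file listing. The sketch below assumes you can get such a listing from your backup tool as a list of relative file names; the function name and that input format are assumptions made for this example, and the code is Python purely for illustration.

```python
from pathlib import Path


def compare_with_backup(restored_dir, backup_listing):
    """Compare a restored directory with one backup's file listing.

    Hypothetical helper. Returns (stray, missing):
      stray   - files on disk that the backup does not list
      missing - files the backup lists that are not on disk
    """
    root = Path(restored_dir)
    present = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    expected = set(backup_listing)
    return sorted(present - expected), sorted(expected - present)
```

If either list comes back non-empty, the directory is a mix of backups (or an incomplete restore), and the safest fix is to remove it and restore it again as a whole.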
making a clone the master
If your master server will be gone too long (like forever), you need to make some other computer the master NetInfo server. Remember that the master server of a NetInfo database is defined in the value of the master property of the root ("/") NetInfo directory. A logical assumption would be that you need to change this property to identify a different computer as the master.
However, since the server identified in the master property isn't available, you don't have write access to the domain. Because you can't change the database to identify a new master, you're going to have to make another computer look like the old master. In other words, give some other computer the host name and Internet address of the broken master server. Here's how to go about it:
1. Choose an appropriate clone server, one with plenty of memory and disk space, and log in as root.
2. Edit the file /etc/hostconfig to change the values of the HOSTNAME and INETADDR variables to be the same as on the old master server. If you don't know the correct values, use NetInfoManager or HostManager to check the database.
3. Disconnect the computer from the network, connect it to an isolated network section (a T connector and two terminators will do), and reboot it. If the master server was also an NFS server, it's possible that the new server will have problems booting. Make a note of any problems with NFS mounts and use Control-c to continue booting.
4. Because your old master NetInfo server is gone forever, you need to take care of any other services it was providing. For example, the master server was probably also the mail server. Check for the following types of service:
Mail service: Modify the local domain on the new master server so it will use the correct sendmail configuration file and export /usr/spool/mail and /LocalLibrary/Images/People.
Any mail that was waiting in the spool directory on the old server is lost forever. If you have a recent backup, you might want to restore the spool directory. Warn your users, though, because some will get duplicates of mail they've already received. Also let them know that they will need to resend any mail they sent between the backup and the disaster.
File service: If the old master server was a file server, either home directory or general purpose, you'll need to create the appropriate directory on the new server, restore the contents from backup, and export it using NFSManager. Let users know that any changes made to files between the backup and the crash will be lost.
Peripheral service: If the old server was a printer or fax modem server, use PrintManager to set up the new server to provide the same service.
5. Connect the new master to the network and reboot it.
Your new server now looks like the old one, and everything should work just fine.
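Step 2 above is a plain text edit of /etc/hostconfig. As an illustration only, the sketch below assumes the file consists of simple KEY=VALUE lines and rewrites just the HOSTNAME and INETADDR entries. The function name, that file layout, and the sample address are assumptions for this example (Python is used purely for illustration); any text editor does the same job.

```python
def set_host_identity(hostconfig_text, hostname, inetaddr):
    """Return hostconfig text with HOSTNAME and INETADDR replaced.

    Hypothetical helper; assumes one KEY=VALUE setting per line.
    All other lines are passed through unchanged.
    """
    wanted = {"HOSTNAME": hostname, "INETADDR": inetaddr}
    seen = set()
    out = []
    for line in hostconfig_text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key in wanted:
            out.append(f"{key}={wanted[key]}")
            seen.add(key)
        else:
            out.append(line)
    # Add any variable that was not present in the original text.
    for key, value in wanted.items():
        if key not in seen:
            out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Example: give a clone the identity of the old master, mars.
# The address below is a placeholder, not a value from the article.
sample = "HOSTNAME=venus\nINETADDR=192.0.2.20\nROUTER=-NO-\n"
print(set_host_identity(sample, "mars", "192.0.2.10"))
```

Only the two identity variables change; everything else in the file is left exactly as it was.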
starting from scratch
Starting from scratch should be necessary only if you have neither a clone server nor a backup available. Before you begin, use nidump to try to retrieve any usable data from NetInfo. There's not a lot else to say about starting over, except that it will probably take quite a bit of time to re-create what you had. Depending on what happened to the master server, you may have lost files in addition to the NetInfo database. To prevent this from ever happening to you, make sure you maintain at least one clone server for each network NetInfo domain and keep careful backups. Unless, of course, you prefer to learn one of life's lessons the hard way.
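The nidump salvage pass mentioned above could be scripted roughly as follows. This is a sketch under stated assumptions: it assumes nidump is invoked as "nidump format domain", that the formats listed are available on your system, and that "/" names the domain you want. Check the nidump manual page on your system before relying on any of that. The helper names are made up for this example, and Python is used purely for illustration.

```python
import subprocess

# Formats to try; assumed to be accepted by nidump on your system.
FORMATS = ("passwd", "group", "hosts", "fstab", "printcap", "aliases")


def nidump_commands(domain="/", formats=FORMATS):
    """Build one nidump command line per format (hypothetical helper)."""
    return [["nidump", fmt, domain] for fmt in formats]


def salvage(domain="/", runner=subprocess.run):
    """Run each command and keep the output of those that succeed."""
    recovered = {}
    for cmd in nidump_commands(domain):
        proc = runner(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            recovered[cmd[1]] = proc.stdout
    return recovered


# Example call; each recovered value is flat-file text you could save:
# for fmt, text in salvage("/").items():
#     open(fmt + ".salvaged", "w").write(text)
```

Failures are skipped rather than treated as fatal, since the point is to rescue whatever data is still readable.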
a little planning
Losing your master NetInfo server, even briefly, isn't much fun. Now, at least, you're a little better prepared to recover should it ever happen to you. You might want to take a few minutes to plan which computers to use for recovery, just in case. You also might want to tuck these instructions away with your contingency plan. You do have a . . . well, you get the idea.