This sounds like a dumb question, but I have looked high and low for a solution here in the forums as well as on various vendor sites. Nothing. The basic problem is that our master and media servers lose connectivity to the robot and tape drives within a library if the library goes offline (say, for a reboot or even a hot-swap replacement of a controller board).
Our NBU 7.6.0.4 environment consists of a master server and 22 media servers, all running RHEL 6.6. A number of the media servers, and the master server, are zoned to a Quantum i6000 tape library with 26 LTO5 drives. We're using Emulex 8Gb HBAs throughout.
When we lose connectivity to our drives and robot, the only effective resolution we've found is to reboot all 23 of those hosts, which is obviously disruptive to our users (particularly DBAs, whose frequent backup jobs take a hit when there's no server to back up to) and time-consuming for our server team as well.
Upon reboot, all the drives return to normal operational status and are visible from the hosts we expect to see them ... device paths remain the same, etc.
Our belief is that the servers should automatically see the devices when the library (and its drives) return to normal operational status, but this isn't happening. Is this expected behavior? Seems to us that we ought to be able to replace a hot-swap controller board (or even reboot the library) without bringing the entire environment down.
Thoughts?