Tuesday, 6 May 2008

SAN SP Problem

Going to start blogging about work. Might help someone Googling.

Had a weird problem today with one of the Dell-badged EMC Clariion CX-500 SANs I look after. I have two identical SANs, one at the main office and another 5 kilometres away. They are linked via our own single-mode fibre, and we replicate some LUNs to the remote one. One of the storage processors (SPs) in the main SAN became "unmanaged": it dropped out of the Navisphere management console and its management IP address was no longer pingable. This broke a number of mirrors using that SP. With several terabytes of mission-critical data, SAN errors are something I don't like to see! The problem started at 5am, so I didn't think it was a user-initiated screwup or change - not even I am working at that hour!

I checked the obvious things first: the activity lights on the SP NIC were flashing, I replaced the cabling, used a different switch port, and checked that nobody had changed the VLAN or port speed of the switch port the SP was plugged into. I also tried pinging from the same subnet in case it was a default gateway problem on the SP. All OK. Then I tried plugging a laptop, on the same subnet, directly into the SP NIC with a crossover cable, and the management IP was still not reachable.
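For what it's worth, the "is the management address alive?" check above could be automated so you hear about it before the mirrors break. This is only a rough sketch, not anything EMC or Dell ships: the function names are my own, the SP names and addresses in the usage comment are made-up placeholders, and it assumes the SP management interface answers on TCP port 80. It does a plain TCP connect rather than an ICMP ping, since that needs no special privileges.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_storage_processors(targets, port=80, timeout=3.0):
    """Map each SP name to whether its management address accepts a connection.

    `targets` is a dict of {name: address}; port 80 is an assumption.
    """
    return {name: tcp_reachable(addr, port, timeout)
            for name, addr in targets.items()}


# Example usage (hypothetical names and addresses, substitute your own):
#   status = check_storage_processors({"SPA": "192.0.2.10", "SPB": "192.0.2.11"})
#   for name, ok in status.items():
#       print(name, "reachable" if ok else "NOT reachable")
```

Run from cron or a monitoring box on the same subnet as the SPs, something like this would at least separate "SP management is down" from "routing to the SP is down".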

Looking at the SP, there is an RJ-45 console port which, helpfully, ISN'T actually a normal console port: you have to establish a PPP dial-up networking connection to it. After establishing a PPP connection and opening the HTTP setup page (http://192.168.1.1/setup), I found that the page displayed the correct IP address for the SP, so it seemed to have the correct setup info. After I committed these settings, the SP came up. So basically the SP had somehow lost its management IP settings, or needed them to be reset. Nice to have it fixed, but I'm still not sure why it happened.
