Wednesday, February 11, 2015

Troubleshooting - Failed to write file data on cluster disk 1 partition 1, failure reason: The disk structure is corrupted and unreadable

One of the first tasks in setting up a Windows cluster for a SQL Server failover cluster is to validate that the servers meet all the network, storage, hardware and software requirements. It helps you work through the setup and straighten things out before the real deal.

Sometimes it is a step full of surprises, especially when you don't have control over how the servers were set up. To be fair, it is quite a long list of items, and it is understandable if my network/system team missed one or two things. Validation is like a QC check of the built servers against the standard. On the other hand, some issues are just pure jerks that show up to make your hard day harder.

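I had the following issue returned in the storage section, which made me and my network/SAN dude scratch our heads a bit. The "Validate a Configuration" check shows: "Failed to write file data on cluster disk 1 partition 1, failure reason: The disk structure is corrupted and unreadable".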

Interestingly, a new shared drive created on the SAN shows up as corrupted. I tested the drive and could read and write to it; it just would not pass the validation. The first thing that came to mind was to reformat disk 1 from Disk Management (diskmgmt.msc). That didn't help. Indeed, I realised that the approach was wrong.

I searched a few forums for a solution. First, everyone pointed out that (potential) Cluster Disk 1 doesn't necessarily correspond to the disk numbered 1 in Disk Management. To determine which disk is the troublemaker, check List Potential Cluster Disks in the storage section of the validation report and take note of the disk ID. For example, I had an issue with Cluster Disk 1, and looking it up in the list of potential cluster disks gave me the ID c188f5ac:

Then, in List All Disks, which lists every disk attached to the servers, we can identify the disk in trouble by the disk ID noted above.

Now that we know which disk needs our attention, let's get to the solution. There are a few potential reasons that may contribute to the symptom "The disk structure is corrupted and unreadable". I have a disk drive issue, what should I do? It sounds like one of the job interview questions from my old days.

The answer is the old-school "chkdsk /f". In fact, if one drive shows an issue, I would run chkdsk against all the drives, just in case. In my case, 3 drives out of 8 were reported with issues, and "chkdsk /f" fixed them all.
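To run the check against every drive rather than only the reported one, it can help to generate the command lines up front. The following is only a minimal sketch: the drive letters E, F and G and the helper name chkdsk_commands are made up for illustration, and it only prints the commands instead of running them. Each printed line would still need to be run from an elevated command prompt on the node that owns the disk.

```python
def chkdsk_commands(drive_letters, fix=True):
    """Build one chkdsk command (as an argument list) per drive letter.

    drive_letters: hypothetical letters such as ["E", "F"]; "e:" is accepted too.
    fix: when True, append /f so chkdsk fixes the errors it finds.
    """
    commands = []
    for letter in drive_letters:
        # Normalise "e", "E" or "E:" to the volume form chkdsk expects, "E:"
        volume = f"{letter.strip().rstrip(':').upper()}:"
        command = ["chkdsk", volume]
        if fix:
            command.append("/f")
        commands.append(command)
    return commands


# Print the commands for three example drives (letters are assumptions)
for command in chkdsk_commands(["E", "F", "G"]):
    print(" ".join(command))
```

Printing first and running by hand keeps a typo in a drive letter from turning into a check on the wrong volume.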

Monday, February 02, 2015

SQL Trick KB: Algorithm - First Date of Financial Year

I was trying to implement a stored procedure with some functions that work from the first date of the financial year for any given date. In Australia, the financial year starts on the 1st of July. So, for example, if today is '2015-06-30', it should return '2014-07-01' as the first date of FY15, while '2015-07-02' falls in the next financial year, FY16, with the start date '2015-07-01'.
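To pin the rule down before touching any SQL, here is a minimal sketch of it in Python. It is only an illustration of the requirement as stated above, not the stored procedure itself, and the function names are my own.

```python
from datetime import date

FY_START_MONTH = 7  # July, the start of the Australian financial year


def first_day_of_financial_year(d: date) -> date:
    """Return 1 July of the financial year that contains d."""
    # Dates from July onwards belong to the year starting this July;
    # dates from January to June belong to the year that started last July.
    start_year = d.year if d.month >= FY_START_MONTH else d.year - 1
    return date(start_year, FY_START_MONTH, 1)


def financial_year_label(d: date) -> str:
    """Label the financial year by the calendar year it ends in, e.g. FY15."""
    end_year = first_day_of_financial_year(d).year + 1
    return f"FY{end_year % 100:02d}"


print(first_day_of_financial_year(date(2015, 6, 30)), financial_year_label(date(2015, 6, 30)))
print(first_day_of_financial_year(date(2015, 7, 2)), financial_year_label(date(2015, 7, 2)))
```

The two printed lines correspond to the two examples in the paragraph above: the 30th of June still belongs to the year that started the previous July, and the 2nd of July belongs to the new one.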

Sounds easy, huh? I was working on some CASE...WHEN statements until I found this brilliant, clean-cut idea and thought, this is just so simple. It is from damien-the-unbeliever's answer to "Most efficient way to calculate the first day of the current Financial Year?" on stackoverflow.com.