Tuesday, February 15, 2011

TrobuleShooting



Problem:
In our cluster environment I was testing SQL failover, when I failed over to passive node I can no longer launch SQL Server configuration manager (SSCM). When I tried to open SSCM, MMC has waited for 30 seconds and gave me the error below.
Connection to target machine could not be made in a timely fashion
sscm-error
Resolution:
Upon researching on this I found that WMI Service (Windows Management Instrumentation Service) is in hanged state. After restarting the service on passive node, it worked for me. Steps to restart the service is below.
  • Go to Run –> type services.msc and press enter
  • Right click on the service “Windows Management Instrumentation” and click on restart option in the menu
  • Now launch SQL Server configuration manager, it should work fine now.
SQL BACKUP ON NTFS Compressed Formate
I’m starting this topic since last week I had faced a backup issue with compressed drive. In one of our server we have used compressed folder to take backup of our SQL databases. This is working fine and the db is used for OLTP application and it had grown about 73GB. Now we have started facing backup failures only for that database alone all other databases (less than 10GB) are running fine in the maintenance plan. I’ve checked the machine performance and SQL Server is the only process using 50% of CPU and 3GB of memory, other than that the server is not under stress and in fact that box is dedicated to SQL Server.
Our backup has been failed with the below error message and the below event is written to errorlog too.
BackupMedium::ReportIoError: write failure on backup device 'D:\SQLBackup\MyDatabase.bak'. Operating system error 33(The process cannot access the file because another process has locked a portion of the file.).
The operating system returned error 1450(Insufficient system resources exist to complete the requested service.) to SQL Server during a write at offset 0x0000xxxxxxxx in file with handle 0x000000000000xxxx. This is usually a temporary condition and the SQL Server will keep retrying the operation. If the condition persists then immediate action must be taken to correct it.
Upon running the tests I found IO is so busy with that backup folder. While checking it I found that the system is not able to handle backup and folder compression at the same time when the file size is large, this creates IO timeout for SQL Server and SQL Server in turn returns the general error mentioned above.
As we already know that taking SQL backup to a compressed drive \ folder is not recommended by Microsoft, I’ve removed the compression option for that folder and now the backup ran without errors and it completed within 15 min. So taking a large online SQL backup to a compressed drive is mere waste, where you will spend CPU and more IO for it. Some of my recommendations are below to compress your backup.
  • Take a native SQL Backup to an uncompressed folder and then zip the backup file using any of the zipping (gzip, 7-zip, winzip, etc) utility and delete the SQL Backup file.
  • Buy third party utilities to compress the SQL backup on the fly, i.e taking direct compressed backup, no need to create a separate SQL temporary backup file. You can try LiteSpeedSQL BackupEmbarcadero etc.
  • Upgrade to SQL Server 2008 Enterprise Edition which supports backup compression.

    Suppress Backup Information In Error Log

     This article is all about suppressing successful backup information into errorlog. Let me explain my scenario, In my working environment we have a production SQL Server with 60+ databases, all the databases are set to full recovery and data loss is accepted only for 15 min. Yeah you got me, we are taking Full backup daily, differential every 4 hours and transaction log backup  every 15 min, there is no problem with the backup. We have one thing that is messing our error log, whenever a backup is taken by default SQL Server will write the detail to the errorlog as shown below.
    suppress_backup_1
    In our case SQL Server is writing the successful backup information for 60+ databases every 15 min to error log. So what happens, error log grows 30~50 MB daily which makes us hard to scan the error log. We have a maintenance job to scan the errorlog for errors, if it grows 30 MB our errorlog monitoring jobs runs for long time. To overcome this we can suppress this backup information writing in error log because we have this information in backupset table on MSDB database. We can achieve this by using the trace flag 3226. Follow the steps below to add this trace flag to sql server startup.
    • Open SQL Server Configuration Manager
    • Go to SQL Server properties and click on Advanced Tab
    • Add Trace flag 3226 as shown below
    suppress_backup_2
    • Restart SQL Server service
    That’s it you are good to go. From now no backup information will be written to errorlog however you can get the backup details from backupset table on MSDB database. We have done this and saved lot of space and time scanning errorlog.
    Applies to :
    • SQL Server 2000
    • SQL Server 2005
    • SQL Server 2008 &
    • SQL Server 2008 R2

2 comments: