First Edition (December 1999)
Part Number 148942-001
Compaq Computer Corporation

Compaq Confidential Need to Know Required
Writer: Susana Weinstein Salazar
Project: Compaq ProLiant ML530 Troubleshooting Guide
Comments:
Part Number: 148942-001
File Name: a-frnt.doc
Last Saved On: 10/26/99 4:59 PM

Notice

The information in this publication is subject to change without notice.

COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED "AS IS" AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT.

This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.

© 1999 Compaq Computer Corporation. All rights reserved. Printed in the U.S.A.

The software described in this guide is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement.

Compaq, Compaq Insight Manager, ProLiant, ROMPaq, SmartStart, QuickFind, and PaqFax are registered in the United States Patent and Trademark Office. Netelligent is a trademark or service mark of Compaq Computer Corporation.

Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation.

Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.
Compaq ProLiant ML530 Troubleshooting Guide

First Edition (December 1999)
Part Number 148942-001

Contents

About This Guide
    Text Conventions
    Symbols in Text
    Symbols on Equipment
        Rack Stability
    Warning Information
    Additional Resources
        Telephone Numbers
        Other Information Resources

Chapter 1 Server Startup and Operation Errors
    When the Server Does Not Start
        Normal Power-Up Sequence
    Diagnosis Steps
    Problems after Initial Boot

Chapter 2 Status Indicators
    System Status LED Indicators
    Hard Drive LED Indicators
        Hot-Plug Drive Replacement Guidelines
        Unsafe Hot-Plug Drive Replacement Precautions
        Predictive Failure Alert
    Network Status LED Indicators
    Power Supply Diagnostic LED Indicators
    Diagnostic LED Indicators
        Memory Failure Conditions
        Processor Failure Conditions
        Fan Failure Conditions

Chapter 3 System Maintenance Switch Settings

Appendix A System Specifications
    Minimum Hardware Configuration
    Operating System Support

Appendix B POST Error Messages

Appendix C Array Diagnostic Utility
    Array Diagnostic Utility (ADU)
        Running Array Diagnostic Utility (ADU)
    Array Diagnostic Utility (ADU) Error Messages

Index

List of Tables
    Organization
    ProLiant ML530 Resources
    Table 1-1 Diagnosis Steps
    Table 1-2 No LED Indicators On The Front Panel Are On
    Table 1-3 Power On/Standby Status LED is Amber
    Table 1-4 Server Does Not Have Video
    Table 1-5 Memory/Processor Status LED is Amber
    Table 1-6 Installation Problems
    Table 2-1 System Status LED Indicators
    Table 2-2 Hot-Plug Hard Drive LED Indicator Status Combinations
    Table 2-3 Network Status LED Indicators
    Table 2-4 Power Supply Diagnostic LED Indicators
    Table 2-5 Diagnostic LED Indicators
    Table 3-1 System Maintenance Switch (SW1)
    Table A-1 Minimum Hardware Configuration
    Table A-2 Supported Operating Systems
    Table B-1 POST Error Messages
    Table C-1 Array Diagnostic Utility (ADU) Error Messages

About This Guide

This troubleshooting guide provides specific information to quickly troubleshoot your ProLiant ML530. Use this guide to find details about server startup problems, switch settings, status indicators, Power-On Self-Test messages and more. For information about general troubleshooting techniques, status messages, and preventative maintenance, refer to the Compaq Servers Troubleshooting Guide, also included in your user documentation.

WARNING: There is a risk of personal injury from hazardous energy levels. The installation of options and routine maintenance and service of this product shall be performed by individuals who are knowledgeable about the procedures, precautions, and hazards associated with equipment containing hazardous energy circuits.

Organization

What will I find?
    You are provided with step-by-step instructions on what to try and where to go for help. These instructions are provided for the most common problems that you may encounter when your server will not complete the initial Power-On Self-Test. This test must complete each time you power up your server, before the server can load the operating system and start running software applications.
Where do I find it?
    Chapter 1

What will I find?
    Once your server has completed the Power-On Self-Test you may still encounter errors, such as an inability to load your operating system. You are provided with instructions on what to try and where to go for help when you encounter errors after completion of the Power-On Self-Test.
Where do I find it?
    Chapter 1

What will I find?
    There are several status LEDs located on the front and back of your server. These LEDs can communicate the current status of varying aspects of your server's components and operations, thus aiding you in diagnosing the problem. Your server also has a set of internal diagnostic LEDs. This is a powerful tool used to diagnose failure conditions. You are provided with an illustration of the location of each LED on your server, as well as an explanation of uses and possible statuses.
Where do I find it?
    Chapter 2

What will I find?
    A switch bank on your system board contains switches that will need to be changed from time to time to reflect changes made to your server. Problems can result when they are not set correctly. You are provided with a list of all switches, a description of what each setting means, and an illustration of where the switches may be found inside your server.
Where do I find it?
    Chapter 3

What will I find?
    Compliance with the minimum hardware configuration for your ProLiant ML530 server is required to power up the server and run an operating system. This is often a good place to start diagnosis after you have changed your hardware configuration.
Where do I find it?
    Appendix A

What will I find?
    It is possible to encounter errors when using an operating system that is not compatible with your server. Check this list containing the name and version of each operating system supported for your server.
Where do I find it?
    Appendix A

What will I find?
    The Power-On Self-Test error messages are generated each time an error is encountered when the server is powered up. Each message is listed numerically, with a corresponding explanation and list of corrective measures to take.
Where do I find it?
    Appendix B

What will I find?
    The Array Diagnostic Utility collects information on all the array controllers in the system, and generates a list of detected problems in the form of error messages. Each message is listed alphabetically, with a corresponding explanation, and list of corrective measures to take.
Where do I find it?
    Appendix C

For other places to obtain troubleshooting information, and information specific to ProLiant ML530 servers, also see "Additional Resources," later in this section.

WARNING: To reduce the risk of personal injury or damage to the equipment, refer to the documentation supplied with the server and observe the appropriate safety precautions.

Text Conventions

This document uses the following conventions to distinguish elements of text:

Keys
    Keys appear in boldface. A plus sign (+) between two keys indicates that they should be pressed simultaneously.

USER INPUT
    User input appears in a different typeface and in uppercase.

FILENAMES
    File names appear in uppercase italics.

Menu Options, Command Names, Dialog Box Names
    These elements appear in initial capital letters.

COMMANDS, DIRECTORY NAMES, and DRIVE NAMES
    These elements appear in uppercase.

Type
    When you are instructed to type information, type the information without pressing the Enter key.

Enter
    When you are instructed to enter information, type the information and then press the Enter key.

Symbols in Text

These symbols may be found in the text of this guide. They have the following meanings.

WARNING: Text set off in this manner indicates that failure to follow directions in the warning could result in bodily harm or loss of life.

CAUTION: Text set off in this manner indicates that failure to follow directions could result in damage to equipment or loss of information.

IMPORTANT: Text set off in this manner presents clarifying information or specific instructions.

NOTE: Text set off in this manner presents commentary, sidelights, or interesting points of information.

Symbols on Equipment

These icons may be located on equipment in areas where hazardous conditions may exist.

Any surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards. Enclosed area contains no operator serviceable parts.
WARNING: To reduce the risk of injury from electrical shock hazards, do not open this enclosure.

Any RJ-45 receptacle marked with these symbols indicates a Network Interface Connection.
WARNING: To reduce the risk of electrical shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.

Any surface or area of the equipment marked with these symbols indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists.
WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.

Power Supplies or Systems marked with these symbols indicate the equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electrical shock, remove all power cords to completely disconnect power from the system.

WARNING: Any product or assembly marked with these symbols (XX kg, XX lb) indicates that the component exceeds the recommended weight for one individual to handle safely.
WARNING: To reduce the risk of personal injury or damage to the equipment, observe local occupational health and safety requirements and guidelines for manual material handling.

Rack Stability

WARNING: To reduce the risk of personal injury or damage to the equipment, be sure that:
• The leveling jacks are extended to the floor.
• The full weight of the rack rests on the leveling jacks.
• The stabilizing feet are attached to the rack if it is a single rack installation.
• The racks are coupled together in multiple rack installations.
• Only one component is extended at a time. A rack may become unstable if more than one component is extended for any reason. Extend only one component at a time.

Warning Information

For a complete list of warnings associated with your server, refer to the user documentation provided with your server.

WARNING: To reduce the risk of personal injury from hot surfaces, allow the internal system components to cool before touching.

WARNING: This product is very heavy (42-62 kg, 93-137 lb). To reduce the risk of personal injury or damage to the equipment:
• Remove all pluggable power supplies and modules to reduce the weight of the product before lifting it.
• Observe local occupational health and safety requirements and guidelines for manual material handling.
• Get help to lift and stabilize the product during installation or removal, especially when the product is not fastened to the rails.
• Be cautious when installing the product in or removing the product from the rack; the product will be unstable when not fastened to the rails.

WARNING: To reduce the risk of electric shock or damage to the equipment:
• Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
• Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times.
• Install the power supply before connecting the power cord to the power supply.
• Unplug the power cord before removing the power supply from the server.
• If the system has multiple power supplies, disconnect power from the system by unplugging all power cords from the power supplies.

Additional Resources

If you have a problem and have exhausted the information in this guide, you can get further information and other help in the following locations.

Telephone Numbers

In the United States and Canada, call the Compaq Technical Support Center at 1-800-OK-COMPAQ (1-800-652-6672), where a technical support specialist will help you diagnose the problem. For continuous quality improvement, calls may be recorded or monitored.

For the latest worldwide telephone numbers, refer to the Compaq website: http://www.compaq.com and select Worldwide under Other Sites.

For the name of your nearest Compaq authorized reseller:
• In the United States, call 1-800-345-1518
• In Canada, call 1-800-263-5868

Other Information Resources

Refer to the following additional information for help.

NOTE: For additional resources outside the United States and Canada, contact your authorized Compaq service provider.

ProLiant ML530 Resources

Resource: ProLiant ML530 Setup and Installation Guide
What it is: This book provides step-by-step instructions on the setup and installation of your server.
Where it is: ProLiant ML530 Documentation CD

Resource: Compaq Servers Troubleshooting Guide
What it is: This book provides troubleshooting information beyond the scope of this document, including general hardware and software troubleshooting information for all Compaq ProLiant servers.
Where it is: ProLiant ML530 Documentation CD

Resource: ProLiant ML530 Maintenance and Service Guide
What it is: This book provides a complete list of all replacement parts available, along with step-by-step instructions on installation and replacement.
Where it is: Compaq website: http://www.compaq.com/support/servers/

For other information resources referred to in this book, see the "resources" entries in the index.

Chapter 1
Server Startup and Operation Errors

This chapter provides step-by-step instructions on what to try and where to go for help when diagnosing the most common problems you may encounter if your server will not complete the initial Power-On Self-Test.
It also outlines the process of troubleshooting your server when problems occur after the initial power-on sequence and during server operation.

When the Server Does Not Start

Complete these steps if the server does not start.

1. Verify that the computer and monitor are plugged into a working outlet.
2. Make sure your power source is working properly. Check status using the Power On/Standby LED. See Figure 1-1 for the location of the system status LEDs. Was the Power On/Standby switch pressed firmly? Refer to the "Power Source" section in Chapter 2 of the Compaq Servers Troubleshooting Guide for details on what else to check.
3. Make sure the power supplies are working properly. Check status using the power supply diagnostic LEDs. See the "Power Supply Diagnostic LED Indicators" section in Chapter 2 for the location of these LEDs, and an explanation of statuses. Also refer to the "Power Source" section in Chapter 2 of the Compaq Servers Troubleshooting Guide.
4. If the system does not complete the Power-On Self-Test (POST) or start loading an operating system, refer to "General Loose Connections" in Chapter 2 of the Compaq Servers Troubleshooting Guide.
5. If the server is power cycling, verify that the system is not rebooting due to an Automatic Server Recovery-2 (ASR-2) reboot caused by another problem. In the Compaq Servers Troubleshooting Guide, refer to:
   • Chapter 6 for a complete description of Automatic Server Recovery-2
   • "System Short" in Chapter 2 for other reasons for power cycling

NOTE: ASR-2 can be enabled to restart your server, automatically loading the operating system.
Should a critical error occur, ASR-2 will log the error in the Integrated Management Log (IML) and restart the server. The system ROM will then page the designated administrator and execute the normal restart process.

6. Restart the server, and see "Normal Power-Up Sequence," immediately following, to verify that your system starts correctly.
7. If this does not solve the problem, continue with the following section.

Figure 1-1. System status LED indicators. The four callouts in the figure identify the Power On/Standby status, fan status, memory/processor status, and power supply status LEDs.

NOTE: For detailed information on the system status LEDs, see "System Status LED Indicators" in Chapter 2.

Normal Power-Up Sequence

The following events represent the power-up sequence for a system with minimal hardware under normal operations.

1. Front panel Power On/Standby LED turns from amber (standby) to green (on).
2. Fans start up.
3. The server will initialize in the following sequence:
   a. System initialization
   b. PCI auto configuration
   c. Video
   d. Memory test
   e. Memory initialization
   f. Processor initialization
   g. Power supply checking
   h. System event checking
   i. Diskette drive
   j. CD-ROM drive
   k. SCSI devices
   l. IDE devices
4. Operating system will boot.

Diagnosis Steps

If your server does not power up, or powers up but does not complete the Power-On Self-Test (POST), answer the questions in the following table to determine appropriate actions based on the symptoms observed. Based on the answers you give, you will be directed to the appropriate table in the section immediately following. This table outlines possible reasons for the problem, options available to assist in diagnosis, possible solutions available, and references to other sources of information.

Table 1-1
Diagnosis Steps

Question 1: Are any front panel system status LEDs on? See Figure 1-1 for the location of each LED.
Action: If NO, go to Table 1-2. If YES, continue to Question 2.

Question 2: Is the Power On/Standby status LED Green or Amber? See Figure 1-1 for the location of each LED.
Action: If AMBER, go to Table 1-3. If GREEN, continue to Question 3.

Question 3: Does the server have video?
Action: If NO, refer to Table 1-4. If YES, video is available for diagnosis. Determine next action by observing Power-On Self-Test (POST) progress and error messages. Refer to Appendix B in this guide for a complete description of each POST error message.

WARNING: To reduce the risk of electric shock or damage to the equipment:
• Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
• Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times.
• Install the power supply before connecting the power cord to the power supply.
• Unplug the power cord before removing the power supply from the server.
• If the system has multiple power supplies, disconnect power from the system by unplugging all power cords from the power supplies.

CAUTION: Opening the access panel for an extended period of time when the server is on interferes with proper airflow and will cause a temperature violation. It is appropriate to open the server access panel briefly during server operation to view the internal diagnostic LEDs during problem diagnosis or to replace a redundant hot-plug power supply or hot-plug fan. If you feel that you will need to keep the access panel open for an extended period of time, prior to opening the unit:
1. Close all applications.
2. Bring down the operating system.
3. Press the Power On/Standby switch to place the server in standby mode.
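
The three questions in Table 1-1 amount to a short decision procedure. The following is a minimal sketch of that procedure for illustration only; the names `PowerLed` and `next_diagnosis_step` are hypothetical and are not part of any Compaq utility. The sketch assumes the Power On/Standby LED is either amber or green whenever any front panel LED is on.

```python
from enum import Enum


class PowerLed(Enum):
    """Observed color of the Power On/Standby status LED (hypothetical type)."""
    AMBER = "amber"
    GREEN = "green"


def next_diagnosis_step(any_front_led_on: bool,
                        power_led: PowerLed,
                        has_video: bool) -> str:
    """Return the reference that Table 1-1 points to for the observed symptoms."""
    # Question 1: are any front panel system status LEDs on?
    # (power_led is ignored when no LED is lit.)
    if not any_front_led_on:
        return "Table 1-2"
    # Question 2: is the Power On/Standby status LED green or amber?
    if power_led is PowerLed.AMBER:
        return "Table 1-3"
    # Question 3: does the server have video?
    if not has_video:
        return "Table 1-4"
    # Video is available: observe POST progress and error messages.
    return "Appendix B"


# Example: LEDs are on, the power LED is green, but there is no video.
print(next_diagnosis_step(True, PowerLed.GREEN, False))  # prints "Table 1-4"
```

Returning the name of the next table keeps the sketch aligned with the way the guide itself directs you from one table to the next.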

Table 1-2
No LED Indicators On The Front Panel Are On

Possible Reasons
• No AC power.
• Power supply problem exists. Check power supply diagnostic LEDs on the back of the power supply. The power supply may not be inserted correctly, have bent connector pins, or it may have failed.
• There is a broken connection between the system and backplane via the power switch cable.

The Next Step
1. Check power supply installation and look for missing or bent pins on a power supply.
2. Verify that the power switch connector is properly plugged in to the backplane.
3. Check the power switch cable for bent pins. Try reconnecting the power switch cable.
4. Refer to "Power Problems" in Chapter 2 of the Compaq Servers Troubleshooting Guide for further options.

Table 1-3
Power On/Standby Status LED is Amber

See Figure 1-1 to identify the Power On/Standby status LED. Also see Chapter 2 for a complete description of these LEDs.

Possible Reasons
• Power switch was not fully depressed on initial power-up.
• Power switch cable may not be connected properly to the backplane.
• Power supply problem exists. The power supply may have bent connector pins, or it may have failed.
• There is a broken connection between the system and backplane via the power cables.

The Next Step
1. Try pressing the power switch until it is fully depressed. Press firmly.
2. Try reconnecting the power switch cable to the backplane.
3. Check power supply insertion and look for missing or bent pins on a power supply.
4. Reconnect any loose cables.
5. Refer to "Power Problems" in Chapter 2 of the Compaq Servers Troubleshooting Guide for further options.

Table 1-4
Server Does Not Have Video

See "System Status LED Indicators" in Chapter 2.

Possible Reasons
• Video may not be connected properly.
• Memory is not present, not correctly seated, or has failed.
• Processor is not present, has failed, or is not seated correctly.

The Next Step
1. Verify the video connections. Refer to "Video Problems" in Chapter 2 of the Compaq Servers Troubleshooting Guide.
2. Is the memory/processor status LED amber?
   If the memory/processor status LED is amber, there is a problem with the memory or the processor. See Figure 1-1 to identify the location of the memory/processor status LED. Continue to Table 1-5 for further troubleshooting steps.
   -Or-
   If the memory/processor status LED is not amber, continue with step 3.
3. Are there audible indicators, such as a series of beeps? A series of beeps will be the audible signal indicating the presence of a Power-On Self-Test (POST) error message. See Appendix B in this guide for a complete description of each beep sequence and the corresponding error message.

IMPORTANT: Use only the 133MHz front side bus Pentium III Xeon processors with gold color heat sinks. Processor heat sinks are color-coded to aid in differentiation of the correct Pentium III. If the wrong processor is installed, the server may not boot.
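
The steps in Table 1-4 can also be read as a short ordered procedure. The sketch below is illustrative only: the function name `no_video_actions` and the wording of the returned action strings are hypothetical, and the sketch assumes that an amber memory/processor status LED sends you to Table 1-5 before any beep sequence is considered, as step 2 of the table describes.

```python
from typing import List


def no_video_actions(mem_proc_led_amber: bool, beeps_heard: bool) -> List[str]:
    """Return, in order, the actions Table 1-4 suggests for a server without video."""
    # Step 1 applies in every case.
    actions = ["Verify the video connections"]
    # Step 2: an amber memory/processor status LED points to Table 1-5.
    if mem_proc_led_amber:
        actions.append("Continue to Table 1-5")
        return actions
    # Step 3: a series of beeps signals a POST error message (see Appendix B).
    if beeps_heard:
        actions.append("Look up the beep sequence in Appendix B")
    return actions


# Example: the LED is not amber and a series of beeps was heard.
for action in no_video_actions(mem_proc_led_amber=False, beeps_heard=True):
    print(action)
```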
Table 1-5
Memory/Processor Status LED is Amber

Possible Reasons
• Processor is not present, has failed, or is not seated properly.
  -Or-
• Memory is not present, has failed, or is not seated properly.

The Next Step
1. Close all software applications and bring down the operating system.
2. Place the server in standby mode by pressing the Power On/Standby switch.
3. Remove the access panel.
4. Verify that your server adheres to minimum hardware configuration standards. Are at least one memory module and one processor installed? See "Minimum Hardware Configuration" in Appendix A to verify that your server configuration meets the specification minimums.
5. Look for amber diagnostic LEDs on the system board. These LEDs are visible when the unit is powered on or in standby mode, and they indicate a problem with the corresponding component(s). Are these components seated properly? See "Diagnostic LED Indicators" in Chapter 2 for the locations and meanings of these LEDs, and for step-by-step instructions on using them for problem diagnosis and resolution. Also refer to "Properly Seating Devices" in Chapter 2 of the Compaq Servers Troubleshooting Guide.

Problems after Initial Boot

Once your server has passed the Power-On Self-Test, you may still encounter errors, such as an inability to load your operating system. Use Table 1-6 to troubleshoot server installation problems that occur after the initial boot. Also see "Operating Systems" in Appendix A to verify the compatibility of your operating system.

Table 1-6
Installation Problems

Problem: System cannot load SmartStart.

  Possible cause: SmartStart requirement not performed.
  Possible solution: Check the SmartStart Release Notes provided on the SmartStart Online Reference Information.

  Possible cause: IDE cable not connected to CD-ROM.
  Possible solution: Check the cable between the system board and the CD-ROM to ensure proper connection.

  Possible cause: Insufficient memory available.
  Possible solution: A rare "Insufficient Memory" message may display the FIRST time SmartStart is booted on certain unconfigured systems. Simply cold-boot the machine with the SmartStart and Support Software CD inserted in the CD-ROM drive to correct the problem.

  Possible cause: Existing software is causing conflict.
  Possible solution: *Run the Compaq System Erase Utility. Please read the caution below. Refer to the instructions in Chapter 4 of the Compaq Servers Troubleshooting Guide.

Problem: SmartStart fails during installation.

  Possible cause: Error occurs during installation.
  Possible solution: *Follow the error information provided. If it is necessary to reinstall, run the Compaq System Erase Utility. Refer to the instructions in Chapter 4 of the Compaq Servers Troubleshooting Guide.

  Possible cause: CMOS not cleared.
  Possible solution: *Run the Compaq System Erase Utility. Read the caution below. Refer to the instructions in Chapter 4 of the Compaq Servers Troubleshooting Guide.

*CAUTION: The Compaq System Erase Utility will cause loss of all configuration information, as well as loss of existing data on all connected hard drives. Please read "Running the Compaq System Erase Utility" and the associated warning in Chapter 4 of the Compaq Servers Troubleshooting Guide, prior to performing this operation.

Problem: Server cannot load operating system.

  Possible cause: Required operating system step missed.
  Possible solution: Follow these steps:
    1. Note at which phase the operating system failed.
    2. Remove any loaded operating system.
    3. Refer to your operating system documentation.
    4. Install again.

  Possible cause: Installation problem occurred.
  Possible solution: Refer to your operating system documentation and to the SmartStart Release Notes. Use the System Configuration Utility to troubleshoot where the installation failed.

  Possible cause: Problem encountered with the hardware you have added to the system.
  Possible solution: Refer to the documentation provided with the hardware. Run the System Configuration Utility to determine if drives are properly attached to the primary boot controller. Refer to the Compaq ProLiant ML530 Setup and Installation Guide for instructions on correct SCSI bus configuration for your unit.

Refer to Chapter 1 and Chapter 3 of the Compaq Servers Troubleshooting Guide.
• Chapter 1 provides you with the information you will need to collect when diagnosing software problems and to provide when contacting support.
• Chapter 3 provides you with instructions on how to upgrade your operating system and its drivers. It also tells you what recovery options are available, and offers advice on minimizing downtime.

Chapter 2
Status Indicators

There are a variety of status LEDs located on the front and back of your server. These LEDs communicate the current status of various aspects of your server's components and operations, which aids you in diagnosing a problem. The following ProLiant ML530 LEDs are explained in this chapter.
• System status LEDs (on the front panel of the server)
  - Power On/Standby status
  - Memory/processor status
  - Fan status
  - Power supply status
• Hard drive LEDs (located on the front of the physical drive)
  - Activity status
  - Online status
  - Fault status
• Network controller status LEDs (on the back of the server)
  - Network link status
  - Network activity status
• Power supply diagnostic LEDs (on the back of the server)
  - AC power status
  - Error status
• Diagnostic LEDs (on the system board)
  - 8 memory diagnostic LEDs, 1 for each memory module slot
  - 2 processor diagnostic LEDs, 1 for each processor slot
  - 3 drive fan diagnostic LEDs, 1 for each drive fan
  - 4 system fan diagnostic LEDs, 1 for each system fan

Compaq Confidential Need to Know Required Writer: Susana Weinstein Salazar Project: Compaq ProLiant ML530 Troubleshooting Guide Comments: Part Number: 148942-001 File Name: c-ch2 Status Indicators.doc Last Saved On: 10/27/99 9:38 AM

System Status LED Indicators

The system status LEDs are located on the front panel of each server. These LEDs show:
• Power On/Standby status
• Memory/processor status
• Fan status
• Power supply status

Figure 2-1. System status LED indicators

Table 2-1
System Status LED Indicators

Indicator: Power On/Standby status LED
  Green (On): System is fully on with adequate AC power provided.
  Amber (Standby): System is in standby mode. No +5V, +12V or +3.3V power is available. A portion of the system logic may still be active, and auxiliary power is provided to the system.
  Flashing Amber: Temporary system shutdown due to a thermal event.
  Off: No AC power is provided to the system.

Indicator: Memory/processor status LED
  Flashing Amber: This indicates a memory or processor failure. To further aid in diagnosis, one or more internal memory or processor diagnostic LEDs on the system board will also be illuminated, indicating which component (memory module or processor) is experiencing the failure. See the "Diagnostic LED Indicators" section in this chapter for more information.
  Green: Memory and processor(s) OK.

Indicator: Fan status LED
  Flashing Amber: This indicates a fan failure. The failed fan must be replaced. To further aid in diagnosis, one or more internal drive or system fan diagnostic LEDs on the system board will also be illuminated (amber), indicating which fan is experiencing the failure. See the "Diagnostic LED Indicators" section in this chapter for more information.
  Green: Fans OK.

Indicator: Power Supply status LED
  Flashing Amber: This indicates a power supply failure. The redundant power supply must be replaced. See the "Power Supply Diagnostic LED Indicators" section for more information.
  Green: Power supply(s) OK.

Hard Drive LED Indicators

The hard drive LEDs, located on each physical drive, are visible on the front of the server or external storage unit. They provide activity, online and fault status for each corresponding drive when configured as a part of an array, and attached to a powered-on controller. Their behavior may vary, depending on the status of other drives in the array.
This section provides the following information about hard drive LEDs:
• An illustration detailing the location of each LED
• A matrix of the possible LED configurations and what each combination means
• Details about hot-plug drive rapid error recovery and guidelines for utilizing Compaq Insight Manager's predictive failure alert
• Guidelines for hot-plug drive replacement

For additional information on troubleshooting hard drive problems, refer to "Hard Drive Problems" and "SCSI Device Problems" in Chapter 2 of the Compaq Servers Troubleshooting Guide.

Use the following illustration in conjunction with Table 2-2 to analyze the current status of hot-plug hard drives that are attached to a Compaq Smart Array controller.

Figure 2-2. Hot-plug hard drive LED indicators

IMPORTANT: It is recommended that you familiarize yourself with the guidelines following Table 2-2 prior to performing a drive replacement.

Table 2-2
Hot-Plug Hard Drive LED Indicator Status Combinations

Activity: On    Online: Off    Fault: Off
  Do not remove the drive. Removing a drive during this process will cause data loss.
  The drive is being accessed and is not configured as part of an array.

Activity: On    Online: Flashing    Fault: Off
  Do not remove the drive. Removing a drive during this process will cause data loss.
  The drive is rebuilding or undergoing capacity expansion.

Activity: Flashing    Online: Flashing    Fault: Flashing
  Do not remove the drive. Removing a drive during this process will cause data loss.
  The drive is part of an array being selected by the Array Configuration Utility.
  -Or-
  The Options ROMPaq is upgrading the drive.

Activity: Off    Online: Off    Fault: Off
  OK to replace the drive online if a predictive failure alert is received (see the following section for details) and the drive is attached to an array controller.
  The drive is not configured as part of an array.
  -Or-
  If this drive is part of an array, then a powered-on controller is not accessing the drive.
  -Or-
  The drive is configured as an online spare.

Activity: Off    Online: Off    Fault: On
  OK to replace the drive online.
  The drive has failed, and has been placed off-line.

Activity: Off    Online: On    Fault: Off
  OK to replace the drive online if a predictive failure alert is received (see the following section for details), provided that the array is configured for fault tolerance and all other drives in the array are online.
  The drive is online and configured as part of an array.

Activity: On or Flashing    Online: On    Fault: Off
  OK to replace the drive online if a predictive failure alert is received (see the following section for details), provided that the array is configured for fault tolerance and all other drives in the array are online.
  The drive is online and being accessed.

Hot-Plug Drive Replacement Guidelines

You should be able to hot-plug a drive during normal activity. Be aware, however, that hot-plugging a disk drive will affect system performance and fault tolerance.

NOTE: Depending upon your configuration, both a drive failure and the subsequent rebuild process will cause storage subsystem performance degradation. For example, the replacement of a single drive on an array with 50 logical drives will have less of an impact than if the array has 3 logical drives.

When a disk drive is hot-plugged, although the system is functionally operational, the disk subsystem may no longer be fault tolerant. Fault tolerance will be lost until the removed drive is subsequently replaced and the rebuild operation is completed (this will take several hours even if the system is not busy while the rebuild is in progress). If another drive in the array should incur an error during the period when fault tolerance is unavailable, it is possible to cause a fatal system error due to a data error. If another drive fails during this period, the entire contents of the array will be lost.

IMPORTANT: It is therefore recommended that disk drive replacement be performed during low activity periods whenever possible.
In addition, a current valid backup should be available of the logical drives in the array of the drive being replaced, even if drive replacement is being made during server downtime.

Unsafe Hot-Plug Drive Replacement Precautions

Be aware of the following Compaq guidelines cautioning against unsafe hot-plug replacement.

• Do not remove a degraded drive if any other member of the array is off-line (the online LED is off). No other drive in the array can be hot-plugged without data loss.
  The possible exception to this would be the utilization of RAID 0+1 as a fault tolerant form. In this case, drives are essentially mirrored in pairs; more than one drive can fail and be replaced as long as the drive(s) they are mirrored to are online. Refer to your Smart Array Controller's user guide for information on fault tolerance options.
• Do not remove a degraded drive if any member of an array is missing (previously removed and not yet replaced).
• Do not remove a degraded drive if any member of an array is being rebuilt unless the drive being rebuilt has been configured as an online spare. The drive's online LED will be flashing, indicating that a replaced drive is being rebuilt from data stored on the other drives.
  NOTE: An online spare will not activate and start rebuilding after a predictive failure alert, as the degraded drive is still online. The online spare only activates after a drive in the array has failed.
• Do not replace multiple degraded drives at the same time (for example, when the system is off), since fault tolerance may be compromised. When a drive is replaced, the controller uses data from the other drives in the array to reconstruct data on the replacement drive.
  If more than one drive is removed, a complete data set is not available to reconstruct data on the replacement drive(s) and permanent data loss could occur.

CAUTION: Do not turn off an attached disk drive enclosure when the server containing the Smart Array controller is powered on. Also, do not turn on the server before turning on the disk enclosure. If these ordering rules are not followed, the Smart Array Controller may mark the drives in this enclosure as "failed," which could result in permanent data loss.

Predictive Failure Alert

The predictive failure alert is a powerful problem-prevention tool that warns you when the system has determined that a drive failure is imminent. This alert allows you to proactively schedule downtime for maintenance and not interrupt critical business operations that rely on your servers. In addition, with hot-pluggable drives attached to Compaq Smart Array controllers, you can remove and replace one or several drives within a server while the system is online, which minimizes the interruption of the network, server downtime and data loss.

Refer to your Compaq Insight Manager and Compaq Management Agents documentation (found on the Compaq Management CD) for instructions on implementing this function.

CAUTION: Not following these guidelines could result in data loss.

CAUTION: It is recommended that some level of fault tolerance be utilized in your RAID configuration. Refer to your Smart Array controller user's guide for information on fault tolerance options.

IMPORTANT: You must use Compaq Insight Manager and a Compaq Smart Array controller to manage the drive array on your server if you wish to implement Predictive Failure Alert.
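The checks in "Unsafe Hot-Plug Drive Replacement Precautions" can be expressed as a short pre-removal checklist. The Python sketch below is a hypothetical illustration only: the `Drive` record, its field names, and `blocking_reasons` are invented here and do not correspond to any Compaq Insight Manager or Smart Array interface. It assumes the LED states have been read by an operator, treats drives configured as online spares as non-blocking, and does not model the RAID 0+1 exception.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Drive:
    bay: int                       # hypothetical bay number
    present: bool = True           # False if removed and not yet replaced
    online_led: str = "on"         # "on", "flashing" (rebuilding), or "off"
    is_online_spare: bool = False  # drive is configured as an online spare


def blocking_reasons(degraded_bay: int, drives: List[Drive],
                     fault_tolerant: bool = True) -> List[str]:
    """Return reasons NOT to hot-plug the degraded drive (empty if none found)."""
    reasons = []
    if not fault_tolerant:
        reasons.append("array is not configured for fault tolerance")
    for d in drives:
        if d.bay == degraded_bay:
            continue  # only the other array members are checked
        if not d.present:
            reasons.append(f"bay {d.bay}: drive is missing")
        elif d.is_online_spare:
            continue  # assumption: an idle or rebuilding spare does not block
        elif d.online_led == "off":
            reasons.append(f"bay {d.bay}: drive is off-line")
        elif d.online_led == "flashing":
            reasons.append(f"bay {d.bay}: drive is rebuilding")
    return reasons


if __name__ == "__main__":
    array = [Drive(0), Drive(1, online_led="flashing"), Drive(2)]
    print(blocking_reasons(0, array))
```

An empty list means only that none of the listed precautions was triggered; it does not replace the error-counter review and backup recommendations given in this chapter.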
Predictive Failure Replacement Guidelines

To minimize server downtime and data loss, use these guidelines when Compaq Insight Manager issues a predictive failure alert. The alert indicates that a drive is degraded and should be replaced:

• Make sure all physical drives in the affected array are present and have their online LEDs illuminated before removing the degraded hot-plug drive. If any online LEDs are flashing (indicating a rebuild) or not illuminated, the degraded drive should not be removed. For step-by-step instructions on hot-plugging your hard drive, refer to the ProLiant ML530 Setup and Installation Guide.
• If you are upgrading to larger drives in the array, follow the previously stated rules and ensure that each drive has completed its rebuild before adding the next new drive to the array.
• You must follow the Compaq cabling guidelines when configuring your array to implement the best possible cabling solution for your server. Refer to the ProLiant ML530 Setup and Installation Guide for step-by-step instructions.
• Check for cabling configurations that are not supported. Signal integrity errors may be injected into the SCSI bus when an active drive is hot-plugged.
• Make sure fault tolerance is not currently being used to recover from errors to other drives in the array, such as media errors or signal integrity errors. Loss of fault tolerance following a drive replacement may result in problems.

CAUTION: In extreme cases, when the number of errors is greater than the firmware error recovery is able to sustain, hot-plugging an online drive may cause some unrecoverable errors to be reported to the operating system or may cause a complete failure of the array.
Refer to your operating system documentation for more information on implications, as well as possible recovery options.

IMPORTANT: Before replacing a degraded drive, use Compaq Insight Manager to examine the error counters recorded for each physical drive in the array to verify that such errors are not presently occurring. Refer to the Compaq Insight Manager documentation on the Compaq Management CD.

Network Status LED Indicators

The network controller status LEDs are located on the back of the server. They provide information on:
• Whether or not the server is linked to the network
• Whether or not there is current network activity

Figure 2-3. Network status LED indicators

Table 2-3
Network Status LED Indicators

Indicator: Link status
  Off: No network link
  On: Linked to network

Indicator: Activity status
  Off: No network activity
  On or Flashing: Network activity

Refer to "Network Controller Problems" in Chapter 2 of the Compaq Servers Troubleshooting Guide for more information on troubleshooting network controller problems.

Power Supply Diagnostic LED Indicators

The power supply diagnostic LEDs are located on the back of the server. They provide information on the status of AC power to the server, as well as the current error status.

Figure 2-4. Power supply diagnostic LED indicators

Table 2-4
Power Supply Diagnostic LED Indicators

Indicator: AC Power Status (green)
  On: System is on and AC power is present.
  Flashing: System is in standby mode.
  Off: No AC power present.

Indicator: Error Status (amber)
  Off: Normal power supply operation, no errors.
  On: For single power supply configurations, this indicates a power supply failure. With multiple power supply configurations, this indicates that there is no AC power present.
  Flashing: Current limit exceeded.

Refer to "Power Problems" in Chapter 2 of the Compaq Servers Troubleshooting Guide for more information on troubleshooting power and power supply problems.

Diagnostic LED Indicators

When a system fault occurs (such as a memory, processor or fan fault), the memory/processor or fan status LED on the front panel will begin flashing amber. In addition, one or more internal diagnostic LEDs will show a fault status. Each of these diagnostic LEDs corresponds to a specific component (a memory module, a processor, or a fan), which makes them a powerful tool for diagnosing a server problem at the component level. For an understanding of the diagnostic LEDs, refer to the table of status meanings below. They include:
• 8 memory diagnostic LEDs, 1 for each memory module slot
• 2 processor diagnostic LEDs, 1 for each processor slot
• 3 drive fan diagnostic LEDs, 1 for each drive fan
• 4 system fan diagnostic LEDs, 1 for each system fan

Directly following Table 2-5 is an explanation of each type of system fault, with steps to take when one occurs. Also see the "System Status LED Indicators" section, earlier in this chapter.

Figure 2-5. Diagnostic LED indicators

Table 2-5
Diagnostic LED Indicators

Indicators: Memory diagnostic LEDs (8 LEDs total)
  Off: All installed memory modules OK.
  Amber: Indicated memory module has failed or is not seated properly. (One or more memory modules may have corresponding LEDs lit.)
  All 8 Amber: No valid memory is present.

Indicators: Processor diagnostic LEDs (2 LEDs total)
  Off: All installed processors OK.
  Amber: Indicated processor has failed or is not seated properly.

Indicators: Drive Fan diagnostic LEDs (3 LEDs total)
  Green: Indicated fan is operational.
  Amber: Indicated fan has failed or is not seated properly. Check cable connections.
  Off: There is no power being accessed.

Indicators: System Fan diagnostic LEDs (4 LEDs total)
  Green: Indicated fan is operational.
  Amber: Indicated fan has failed or is not seated properly. Check cable connections.
  Off: There is no power being accessed.

CAUTION: Opening the access panel for an extended period of time when the server is on interferes with proper airflow and will cause a temperature violation. It is appropriate to open the server access panel briefly during server operation to view the internal diagnostic LEDs during problem diagnosis or to replace a redundant hot-plug power supply or hot-plug fan. If you feel that you will need to keep the access panel open for an extended period of time, prior to opening the unit:
1. Close all applications.
2. Bring down the operating system.
3. Press the Power On/Standby switch to place the server in standby mode.
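The status meanings in Table 2-5 amount to a small lookup. The following Python sketch shows one way to encode the memory and fan rows for note-taking during diagnosis. The function names and the 1-to-8 slot numbering are assumptions made for illustration; the LED states are whatever an operator observes on the system board, not values read from the server.

```python
from typing import List


def interpret_memory_leds(states: List[str]) -> str:
    """Interpret the 8 memory diagnostic LEDs; each state is "off" or "amber"."""
    if len(states) != 8:
        raise ValueError("expected one state per memory slot (8 total)")
    if any(s not in ("off", "amber") for s in states):
        raise ValueError('memory LED states must be "off" or "amber"')
    # Slot numbers 1..8 are an illustrative assumption.
    amber = [str(slot) for slot, s in enumerate(states, start=1) if s == "amber"]
    if len(amber) == 8:
        return "No valid memory is present."
    if amber:
        return "Failed or not seated properly: memory slot(s) " + ", ".join(amber)
    return "All installed memory modules OK."


def interpret_fan_led(state: str) -> str:
    """Interpret one drive fan or system fan diagnostic LED."""
    meanings = {
        "green": "Indicated fan is operational.",
        "amber": "Indicated fan has failed or is not seated properly. "
                 "Check cable connections.",
        "off": "There is no power being accessed.",
    }
    return meanings[state]


if __name__ == "__main__":
    print(interpret_memory_leds(["off", "amber"] + ["off"] * 6))
    print(interpret_fan_led("amber"))
```

The all-amber case is tested before the per-slot case because Table 2-5 gives it a distinct meaning (no valid memory) rather than eight individual module failures.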