Product Details

Compaq StorageWorks
HSG60 Array Controller
ACS Version 8.6
Troubleshooting Reference Guide
First Edition (June 2001)
Part Number: EK-G60TR-SA. A01
Compaq Computer Corporation
© 2001 Compaq Computer Corporation.
Compaq, the Compaq logo, and StorageWorks Registered in U.S. Patent and Trademark Office.
OpenVMS is a trademark of Compaq Information Technologies Group, L.P. in the United States and other
countries.
Intel is a trademark of Intel Corporation in the United States and other countries.
UNIX is a trademark of The Open Group in the United States and other countries.
All other product names mentioned herein may be trademarks of their respective companies.
Confidential computer software. Valid license from Compaq required for possession, use or copying.
Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software
Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under
vendor's standard commercial license.
Compaq shall not be liable for technical or editorial errors or omissions contained herein. The
information in this document is provided "as is" without warranty of any kind and is subject to change
without notice. The warranties for Compaq products are set forth in the express limited warranty
statements accompanying such products. Nothing herein should be construed as constituting an additional
warranty.
Compaq service tool software, including associated documentation, is the property of and contains
confidential technology of Compaq Computer Corporation. Service customer is hereby licensed to use
the software only for activities directly relating to the delivery of, and only during the term of, the
applicable services delivered by Compaq or its authorized service provider. Customer may not modify or
reverse engineer, remove, or transfer the software or make the software or any resultant diagnosis or
system management data available to other parties without Compaq's or its service provider's consent.
Upon termination of the services, customer will, at Compaq's or its service provider's option, destroy or
return the software and associated documentation in its possession.
Printed in the U.S.A.
Contents
About This Guide
  Text Conventions
  Symbols in Text
  Symbols on Equipment
  Rack Stability
  Getting Help
    Compaq Technical Support
    Compaq Website
    Compaq Authorized Reseller
Chapter 1
Troubleshooting Information
  Typical Installation Troubleshooting Checklist
  Troubleshooting Table
  Significant Event Reporting
    Reporting Events That Cause Controller Operation to Halt
      Flashing OCP Pattern Display Reporting
      Solid OCP Pattern Display Reporting
      Last Failure Reporting
    Reporting Events That Allow Controller Operation to Continue
      Spontaneous Event Log
      CLI Event Reporting
  Running the Controller Diagnostic Test
    ECB Charging Diagnostics
    Battery Hysteresis
  Caching Techniques
    Read Caching
    Read-Ahead Caching
    Write-Through Caching
    Write-Back Caching
    Fault-Tolerance for Write-Back Caching
      Nonvolatile Memory
      Cache Policies Resulting from Cache Module Failures
    Enabling Mirrored Write-Back Cache
Chapter 2
Utilities and Exercisers
  Fault Management Utility (FMU)
    Displaying Failure Entries
    Translating Event Codes
    Controlling the Display of Significant Events and Failures
  Video Terminal Display (VTDPY) Utility
    Restrictions with VTDPY
    Running VTDPY
    VTDPY Help
    VTDPY Display Screens
      Default Screen
      Controller Status Screen
      Cache Performance Screen
      Device Performance Screen
      Host Ports Statistics Screen
      Resource Statistics Screen
    Interpreting VTDPY Screen Information
      Screen Header
      Common Data Fields
      Unit Performance Data Fields
      Device Performance Data Fields
      Device Port Performance Data Fields
      Host Port Configuration
      TACHYON Chip Status
      Device Port Configuration
      Controller/Processor Utilization
      Resource Performance Statistics
  Disk Inline Exerciser (DILX)
    Checking for Unit Problems
      Finding a Unit in the Subsystem
      Testing the Read Capability of a Unit
      Testing the Read and Write Capabilities of a Unit
    DILX Error Codes
  Format and Device Code Load Utility (HSUTIL)
  Configuration (CONFIG) Utility
  Code Load and Code Patch (CLCP) Utility
  Clone (CLONE) Utility
  Field Replacement Utility (FRUTIL)
  Change Volume Serial Number (CHVSN) Utility
Chapter 3
Event Reporting Templates
  Passthrough Device Reset Event Sense Data Response
  Last Failure Event Sense Data Response (Template 01)
  Multiple-Bus Failover Event Sense Data Response (Template 04)
  Failover Event Sense Data Response (Template 05)
  Nonvolatile Parameter Memory Component Event Sense Data Response (Template 11)
  Backup Battery Failure Event Sense Data Response (Template 12)
  Subsystem Built-In Self Test Failure Event Sense Data Response (Template 13)
  Memory System Failure Event Sense Data Response (Template 14)
  Device Services Nontransfer Error Event Sense Data Response (Template 41)
  Disk Transfer Error Event Sense Data Response (Template 51)
Chapter 4
ASC/ASCQ, Repair Action, and Component Identifier Codes
  Vendor Specific SCSI ASC/ASCQ Codes
  Recommended Repair Action Codes
  Component ID Codes
Chapter 5
Instance Codes
  Instance Code Structure
  Instance Codes and FMU
    Notification/Recovery Threshold
    Repair Action
    Event Number
    Component ID
Chapter 6
Last Failure Codes
  Last Failure Code Structure
  Last Failure Codes and FMU
    Parameter Count
    Restart Code
    Hardware/Software Flag
    Repair Action
    Error Number
    Component ID Code
Glossary
Index
Figures
Figure 2-1. VTDPY commands and shortcuts generated from the Help command
Figure 2-2. Sample of the VTDPY default screen
Figure 2-3. Sample of the VTDPY status screen
Figure 2-4. Sample of the VTDPY cache screen
Figure 2-5. Sample of regions on the VTDPY device screen
Figure 2-6. Sample of the VTDPY host screen
Figure 2-7. Sample of the VTDPY resource screen
Figure 5-1. Structure of an Instance Code
Figure 6-1. Structure of a Last Failure Code
Tables
Table 1-1 Troubleshooting Guidelines
Table 1-2 Flashing OCP Pattern Displays and Repair Actions
Table 1-3 Solid OCP Pattern Displays and Repair Actions
Table 1-4 ECB Capacity Based On Memory Size
Table 1-5 Cache Policies--Cache Module Status
Table 1-6 Resulting Cache Policies--ECB Status
Table 2-1 Event Code Types
Table 2-2 FMU SET Commands
Table 2-3 VTDPY Key Sequences and Commands
Table 2-4 VTDPY--Common Data Fields Column Definitions: Part 1
Table 2-5 VTDPY--Common Data Fields Column Definitions: Part 2
Table 2-6 VTDPY--Unit Performance Data Fields Column Definitions
Table 2-7 VTDPY--Device Performance Data Fields Column Definitions
Table 2-8 VTDPY--Device Port Performance Data Fields Column Definitions
Table 2-9 Fibre Channel Host Status Display--Known Host Connections
Table 2-10 Fibre Channel Host Status Display--Port Status
Table 2-11 Fibre Channel Host Status Display--Link Error Counters
Table 2-12 First Digit on the TACHYON Chip
Table 2-13 Second Digit on the TACHYON Chip
Table 2-14 Device Map Column Definitions
Table 2-15 Controller/Processor Utilization Definitions
Table 2-16 VTDPY Thread Descriptions
Table 2-17 Resource Performance Statistics Definitions
Table 2-18 DILX Control Sequences
Table 2-19 Data Patterns for Phase 1: Write Test
Table 2-20 DILX Error Codes
Table 2-21 HSUTIL Messages and Inquiries
Table 3-1 Passthrough Device Reset Event Sense Data Response Format
Table 3-2 Template 01--Last Failure Event Sense Data Response Format
Table 3-3 Template 04--Multiple-Bus Failover Event Sense Data Response Format
Table 3-4 Template 05--Failover Event Sense Data Response Format
Table 3-5 Template 11--Nonvolatile Parameter Memory Component Event Sense Data Response Format
Table 3-6 Template 12--Backup Battery Failure Event Sense Data Response Format
Table 3-7 Template 13--Subsystem Built-In Self Test Failure Event Sense Data Response Format
Table 3-8 Template 14--Memory System Failure Event Sense Data Response Format
Table 3-9 Template 41--Device Services Non-Transfer Error Event Sense Data Response Format
Table 3-10 Template 51--Disk Transfer Error Event Sense Data Response Format
Table 4-1 ASC and ASCQ Code Descriptions
Table 4-2 Recommended Repair Action Codes
Table 4-3 Component ID Codes
Table 5-1 Instance Code Format
Table 5-2 Event Notification/Recovery (NR) Threshold Classifications
Table 5-3 Instance Codes and Repair Action Codes
Table 6-1 Last Failure Code Format
Table 6-2 Controller Restart Codes
Table 6-3 Last Failure Codes and Repair Action Codes
About This Guide
This guide is a troubleshooting resource for HSG60 array controllers running array
controller software (ACS) version 8.6L. It contains information on various utilities,
software templates, and event reporting codes.
Text Conventions
This document uses the following conventions to distinguish elements of text:
Keys                  Keys appear in boldface. A plus sign (+) between two keys
                      indicates that they should be pressed simultaneously.
USER INPUT            User input appears in a different typeface and in uppercase.
FILENAMES             File names appear in uppercase italics.
Menu Options,         These elements appear in initial capital letters.
Command Names,
Dialog Box Names
COMMANDS,             These elements appear in uppercase.
DIRECTORY NAMES,      NOTE: UNIX commands are case sensitive and will not appear
and DRIVE NAMES       in uppercase.
Type                  When you are instructed to type information, type the
                      information without pressing the Enter key.
Enter                 When you are instructed to enter information, type the
                      information and then press the Enter key.
"this controller"     The controller serving the current CLI session through a
                      local or remote terminal.
"other controller"    The controller in a dual-redundant pair that is connected
                      to the controller serving the current CLI session.
Symbols in Text
These symbols may be found in the text of this guide. They have the following meanings.
WARNING: Text set off in this manner indicates that failure to follow directions in the
warning could result in bodily harm or loss of life.
CAUTION: Text set off in this manner indicates that failure to follow directions could
result in damage to equipment or loss of information.
IMPORTANT: Text set off in this manner presents clarifying information or specific instructions.
NOTE: Text set off in this manner presents commentary, sidelights, or interesting points of
information.
Symbols on Equipment
These icons may be located on equipment in areas where hazardous conditions may exist.
Any surface or area of the equipment marked with these symbols indicates
the presence of electrical shock hazards. The enclosed area contains no
operator serviceable parts.
WARNING: To reduce the risk of injury from electrical shock hazards, do not
open this enclosure.
Any RJ-45 receptacle marked with these symbols indicates a Network
Interface Connection.
WARNING: To reduce the risk of electrical shock, fire, or damage to the
equipment, do not plug telephone or telecommunications connectors into
this receptacle.
Any surface or area of the equipment marked with these symbols indicates
the presence of a hot surface or hot component. If this surface is contacted,
the potential for injury exists.
WARNING: To reduce the risk of injury from a hot component, allow the
surface to cool before touching.
Power Supplies or Systems marked with these symbols indicate the
equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electrical shock,
remove all power cords to completely disconnect power from the
system.
Any product or assembly marked with these symbols indicates that the
component exceeds the recommended weight for one individual to handle
safely.
WARNING: To reduce the risk of personal injury or damage to the
equipment, observe local occupational health and safety requirements and
guidelines for manual material handling.
Rack Stability
WARNING: To reduce the risk of personal injury or damage to the equipment, be sure
that:
- The leveling jacks are extended to the floor.
- The full weight of the rack rests on the leveling jacks.
- The stabilizing feet are attached to the rack if it is a single rack installation.
- The racks are coupled together in multiple rack installations.
- A rack may become unstable if more than one component is extended for any
  reason. Extend only one component at a time.
Getting Help
If you have a problem and have exhausted the information in this guide, you can get
further information and other help in the locations listed in this section.
Compaq Technical Support
You are entitled to free hardware technical telephone support for your product for as long
as you own the product. A technical support specialist will help diagnose the problem or
guide you to the next step in the warranty process.
In North America, call the Compaq Technical Phone Support Center at
1-800-OK-COMPAQ. This service is available 24 hours a day, 7 days a week.
NOTE: For continuous quality improvement, calls may be recorded or monitored.
Outside North America, call the nearest Compaq Technical Support Phone Center.
Telephone numbers for worldwide Technical Support Centers are listed on the Compaq
website. Access the Compaq website by logging on to the Internet at
http://www.compaq.com.
Be sure to have the following information available before you call Compaq:
- Technical support registration number (if applicable)
- Product serial numbers
- Product model names and numbers
- Applicable error messages
- Add-on boards or hardware
- Third-party hardware or software
- Operating system type and revision level
- Detailed, specific questions
Compaq Website
The Compaq website has the latest information on this product as well as the latest drivers.
You can access the Compaq website by logging on to the Internet at
http://www.compaq.com/storage.
Compaq Authorized Reseller
For the name of your nearest Compaq Authorized Reseller:
- In the United States, call 1-800-345-1518.
- In Canada, call 1-800-263-5868.
- Elsewhere, see the Compaq website for locations and telephone numbers.
Chapter 1
Troubleshooting Information
This chapter provides guidelines for troubleshooting the controller, cache module, and
external cache battery (ECB). See enclosure documentation for information on
troubleshooting enclosure hardware, such as the power supplies, cooling fans, and
environmental monitoring unit (EMU).
Typical Installation Troubleshooting Checklist
The following checklist identifies many of the problems that occur in a typical installation.
After identifying a problem, use Table 1-1 to confirm the diagnosis and fix the problem.
If an initial diagnosis points to several possible causes, use the tools described in this
chapter and then those in Chapter 2 to further refine the diagnosis. If a problem cannot be
diagnosed using the checklist and tools, contact a Compaq authorized service provider for
additional support.
To troubleshoot the controller and supporting modules:
1. Check the power to the enclosure and enclosure components.
Are power cords connected properly?
Is power within specifications?
2. Check the component cables.
Are bus cables to the controllers connected properly?
For enclosures, are ECB cables connected properly?
3. Check each program card to make sure the card is fully seated.
4. Check the operator control panel (OCP) and devices for LED codes.
See "Flashing OCP Pattern Display Reporting" and "Solid OCP Pattern Display
Reporting" later in this chapter to interpret the LED codes.
5. Connect a local terminal to the controller and check the controller configuration with
the following command:
SHOW THIS_CONTROLLER FULL
Make sure that the ACS version loaded is correct and that pertinent patches are
installed. Also, check the status of the cache module and the supporting ECB.
In a dual redundant configuration, check the "other controller" with the following
command:
SHOW OTHER_CONTROLLER FULL
6. Use the fault management utility (FMU) to check for Last Failure or "memory system
failure" entries.
Display these entries and translate the Last Failure Codes they contain. See the
"Displaying Failure Entries" and "Translating Event Codes" sections in Chapter 2.
If the controller failed to the extent that the controller cannot support a local terminal
for FMU, check the host error log for the Instance or Last Failure Codes. See
Chapter 5 and Chapter 6 to interpret the event codes.
7. Check device status with the following command:
SHOW DEVICES FULL
Look for errors such as "misconfigured device" or "No device at this PTL." If a device
reports misconfigured or missing, check the device status with the following
command:
SHOW device-name
8. Check storageset status with the following command:
SHOW STORAGESETS FULL
Make sure that all storagesets are normal (or normalizing if the storageset is a RAIDset
or mirrorset). Check again for misconfigured or missing devices using step 7.
9. Check unit status with the following command:
SHOW UNITS FULL
Make sure that all units are available or online. If the controller reports a unit as
unavailable or offline, recheck the storageset the unit belongs to with the following
command:
SHOW storageset-name
If the controller reports that a unit has lost data or is unwriteable, recheck the status of
the devices that make up the storageset. If the devices are operating normally, recheck
the status of the cache module. If the unit reports a media format error, recheck the
status of the storageset and storageset devices.
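
The follow-up logic in steps 7 through 9 can be sketched in a short script. This is a minimal illustration only, assuming the CLI output has already been captured as plain text from a terminal session. The function names, the condition labels, and the sample text are hypothetical; only the error phrases and the SHOW commands come from the checklist above. The format of real SHOW DEVICES FULL output is not reproduced here.

```python
# Sketch of the follow-up logic in checklist steps 7 through 9.
# The condition labels and the sample text are assumptions for illustration;
# they are not taken from real controller output.

# Phrases that step 7 says to look for in SHOW DEVICES FULL output.
DEVICE_ERROR_PHRASES = ("misconfigured device", "no device at this ptl")

# Follow-up checks from step 9, keyed by a hypothetical condition label.
UNIT_FOLLOW_UPS = {
    "unavailable": ["SHOW storageset-name"],
    "offline": ["SHOW storageset-name"],
    "lost data": ["recheck storageset devices",
                  "recheck cache module (if devices are normal)"],
    "unwriteable": ["recheck storageset devices",
                    "recheck cache module (if devices are normal)"],
    "media format error": ["recheck storageset",
                           "recheck storageset devices"],
}


def find_device_errors(captured_text):
    """Return (line_number, line) pairs containing a step 7 error phrase."""
    hits = []
    for number, line in enumerate(captured_text.splitlines(), start=1):
        lowered = line.lower()
        if any(phrase in lowered for phrase in DEVICE_ERROR_PHRASES):
            hits.append((number, line.strip()))
    return hits


def unit_follow_ups(condition):
    """Return the step 9 follow-up checks for a unit condition label."""
    return list(UNIT_FOLLOW_UPS.get(condition.strip().lower(), []))


if __name__ == "__main__":
    # Made-up capture; real output will look different.
    sample = (
        "DISK10000 (made-up line with no error)\n"
        "DISK20000 misconfigured device\n"
        "No device at this PTL\n"
    )
    for number, line in find_device_errors(sample):
        print(number, line)
    print(unit_follow_ups("offline"))
```

Matching is case-insensitive so that the phrases are found regardless of how the terminal capture renders them. A condition that is not in the mapping returns an empty list, which here simply means that the checklist names no follow-up for it.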
Troubleshooting Table
After diagnosing a problem, use Table 1-1 to resolve the problem.
Table 1-1 Troubleshooting Guidelines (Sheet 1 of 6)

Symptom: Reset button not lit.
  Possible Cause: No power to subsystem.
  Investigation: Check power to subsystem and power supplies on controller enclosure.
  Remedy: Replace cord or (enclosure only) AC input box.

  Possible Cause: Failed controller.
  Investigation: If the previous remedies fail to resolve the problem, check OCP LED codes.
  Remedy: Replace controller.

Symptom: Reset button lit steadily; other LEDs also lit.
  Possible Cause: Various.
  Investigation: See OCP LED Codes.
  Remedy: Follow repair action using Table 1-2.

Symptom: Reset button FLASHING; other LEDs also lit.
  Possible Cause: Device in error or failedset on corresponding device port with other LEDs lit.
  Investigation: SHOW device FULL.
  Remedy: Follow repair action using Table 1-3.
Table 1-1 Troubleshooting Guidelines (Sheet 2 of 6)

Symptom: Cannot set failover to create dual-redundant configuration.
  Possible Cause: Incorrect command syntax.
  Investigation: See the controller CLI reference guide for the SET FAILOVER command.
  Remedy: Use the correct command syntax.

  Possible Cause: Different software versions on controllers.
  Investigation: Check software versions on both controllers.
  Remedy: Update one or both controllers so that both are using the same software version.

  Possible Cause: Incompatible hardware.
  Investigation: Check hardware versions.
  Remedy: Upgrade controllers so that they are using compatible hardware.

  Possible Cause: Controller previously set for failover.
  Investigation: Make sure that neither controller is configured for failover.
  Remedy: Use the SET NOFAILOVER command on both controllers, then reset "this
  controller" for failover.

  Possible Cause: Failed controller.
  Investigation: If the previous remedies fail to resolve the problem, check for OCP LED codes.
  Remedy: Follow repair action using Table 1-2 or Table 1-3.

  Possible Cause: Node ID is all zeros.
  Investigation: Enter SHOW THIS_CONTROLLER to see if node ID is all zeros.
  Remedy: Set node ID using the node ID (bar code) that is located on the frame in
  which the controller sits. See SET THIS_CONTROLLER NODE_ID in the controller CLI
  reference guide. Also, be sure to copy in the right direction. If cabled to the
  new controller, use SET FAILOVER COPY=OTHER_CONTROLLER. If cabled to the old
  controller, use SET FAILOVER COPY=THIS_CONTROLLER.
Table 11 Troubleshooting Guidelines (Sheet 3 of 6)
Symptom | Possible Cause | Investigation | Remedy
Nonmirrored cache: controller reports failed DIMM in Cache A or B. | Improperly installed DIMM. | Remove cache module and make sure that the DIMM is fully seated in the slot. | Reseat DIMM.
 | Failed DIMM. | If the previous remedy fails to resolve the problem, check for OCP LED codes. | Replace DIMM.
Mirrored cache: "this controller" reports DIMM 1 or 2 failed in Cache A or B. | Improperly installed DIMM in "this controller" cache module. | Remove cache module and make sure that DIMMs are installed properly. | Reseat DIMM.
 | Failed DIMM in "this controller" cache module. | If the previous remedy fails to resolve the problem, check for OCP LED codes. | Replace DIMM in "this controller" cache module.
Mirrored cache: "this controller" reports DIMM 3 or 4 failed in Cache A or B. | Improperly installed DIMM in "other controller" cache module. | Remove cache module and make sure that the DIMMs are installed properly. | Reseat DIMM.
 | Failed DIMM in "other controller" cache module. | If the previous remedy fails to resolve the problem, check for OCP LED codes. | Replace DIMM in "other controller" cache module.
Mirrored cache: controller reports battery not present. | Memory module was installed before the cache module was connected to an ECB. | Model 2100 enclosure: ECB not installed or seated properly in backplane. | Model 2100 enclosure: install or reseat ECB.
Mirrored cache: controller reports cache or mirrored cache has failed. | Primary data and the mirrored copy data are not identical. | SHOW THIS_CONTROLLER indicates that the cache or mirrored cache has failed. Spontaneous FMU message displays: "Primary cache declared failed - data inconsistent with mirror," or "Mirrored cache declared failed - data inconsistent with primary." | Enter the SHUTDOWN command on controllers that report the problem. (This command flushes the cache contents to synchronize the primary and mirrored data.) Restart the controllers that were shut down.
Table 11 Troubleshooting Guidelines (Sheet 4 of 6)
Symptom | Possible Cause | Investigation | Remedy
Invalid cache. | Mirrored-cache mode discrepancy. This discrepancy might occur after installing a new controller. The existing cache module is set for mirrored caching, but the new controller is set for unmirrored caching. This discrepancy might also occur if the new controller is set for mirrored caching, but the existing cache module is not. | SHOW THIS_CONTROLLER indicates "invalid cache." Spontaneous FMU message displays: "Cache modules inconsistent with mirror mode." | Connect a terminal to the maintenance port on the controller reporting the error and clear the error with the following command (all on one line): CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE NODESTROY_UNFLUSHED_DATA. See the controller CLI reference guide for more information.
 | Cache module might erroneously contain unflushed write-back data. This might occur after installing a new controller. The existing cache module might indicate that the cache module contains unflushed write-back data, but the new controller expects to find no data in the existing cache module. This error might also occur if installing a new cache module for a controller that expects write-back data in the cache. | SHOW THIS_CONTROLLER indicates "invalid cache." No spontaneous FMU message. | Connect a terminal to the maintenance port on the controller reporting the error, and clear the error with the following command (all on one line): CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE DESTROY_UNFLUSHED_DATA. See the controller CLI reference guide for more information.
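The choice between the two CLEAR_ERRORS variants in the Invalid cache rows can be sketched as a small helper. This is only an illustration in Python, not part of the controller software: the function name `clear_invalid_cache_command` and its boolean argument are assumptions made for the example, and the command strings are copied from the Remedy column.

```python
def clear_invalid_cache_command(fmu_mirror_message_seen: bool) -> str:
    """Pick the CLEAR_ERRORS variant from the two Invalid cache rows of Table 11.

    fmu_mirror_message_seen is a hypothetical flag: True when the spontaneous
    FMU message "Cache modules inconsistent with mirror mode." was displayed.
    """
    if fmu_mirror_message_seen:
        # Row 1: mirrored-cache mode discrepancy; the remedy keeps unflushed data.
        option = "NODESTROY_UNFLUSHED_DATA"
    else:
        # Row 2: no spontaneous FMU message; the remedy discards unflushed data.
        option = "DESTROY_UNFLUSHED_DATA"
    return f"CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE {option}"


if __name__ == "__main__":
    print(clear_invalid_cache_command(True))
```

Because the second variant discards data, the helper only builds the command text; a person should confirm the symptom against the table before entering it on the maintenance terminal.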
Table 11 Troubleshooting Guidelines (Sheet 5 of 6)
Symptom | Possible Cause | Investigation | Remedy
Cannot add device. | Illegal device. | See product-specific release notes that accompanied the software release for the most recent list of supported devices. | Replace device.
 | Device not properly installed in enclosure. | Check that the device is fully seated. | Firmly press the device into the bay.
 | Failed device. | Check for presence of device LEDs. | Follow repair action in the documentation provided with the enclosure or device.
 | Failed power supplies. | Check for presence of power supply LEDs. | Follow repair action in the documentation provided with the enclosure or power supply.
 | Failed bus to device. | If the previous remedies fail to resolve the problem, check for OCP LED codes. | Replace enclosure.
Cannot configure storagesets. | Incorrect command syntax. | See the controller CLI reference guide for the ADD storageset command. | Reconfigure storageset with correct command syntax.
 | Exceeded maximum number of storagesets. | Use the SHOW command to count the number of storagesets configured on the controller. | Delete unused storagesets.
 | Failed battery on ECB. An ECB or uninterruptible power supply (UPS) is required for RAIDsets and mirrorsets. | Use the SHOW command to check the ECB battery status. | Replace the ECB if required.
Cannot assign unit number to storageset. | Incorrect command syntax. | See the controller CLI reference guide for correct syntax. | Reassign the unit number with the correct syntax.
Table 11 Troubleshooting Guidelines (Sheet 6 of 6)
Symptom | Possible Cause | Investigation | Remedy
Unit is available but not online. | This is normal. Units are "available" until the host accesses them, at which point their status is changed to "online." | None | None
Host cannot see device. | Broken cables. | Check for broken cables. | Replace broken cables.
Host cannot access unit. | Host files or device drivers not properly installed or configured. | Check for the required device special files. | Configure device special files as described in the installation and configuration guide that accompanied the software release.
 | Invalid cache. | See the description for the invalid cache symptom (Sheet 4 of this table). | See the description for the invalid cache symptom.
 | Units have lost data. | Issue the SHOW UNITS FULL command. | Clear these units with: CLEAR_ERRORS unit-number LOST_DATA.
Host log file or maintenance terminal indicates that a forced error occurred when the controller was reconstructing a RAIDset or mirrorset. | Unrecoverable read errors might have occurred when the controller was reconstructing the storageset. Errors occur if another member fails while the controller is reconstructing the storageset. | Conduct a read scan of the storageset using the appropriate utility from the host operating system, such as the "dd" utility for a Tru64 UNIX host. | Rebuild the storageset, then restore storageset data from a backup source. While the controller is reconstructing the storageset, monitor the host error log activity or spontaneous event reports on the maintenance terminal for any unrecoverable errors. If unrecoverable errors persist, note the device on which they occurred, and replace the device before proceeding.
 | Host requested data from a normalizing storageset that did not contain the data. | Use the SHOW storageset-name command to see if all storageset members are "normal." | Wait for normalizing members to become normal, then resume I/O to them.
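The read scan named in the forced-error row amounts to reading every block of the unit once and noting where reads fail. The following Python sketch illustrates that idea on a host that lacks a suitable utility; it is not a replacement for "dd" or for the controller's own reporting. The function name `read_scan`, the 64 KB block size, and the `max_errors` cutoff are assumptions chosen for the example.

```python
def read_scan(path, block_size=64 * 1024, max_errors=16):
    """Read `path` sequentially once; return (bytes_read, failed_offsets).

    path is whatever file or device special file the host exposes for the
    unit (hypothetical here). Reading stops at end of data or after
    max_errors failed blocks.
    """
    failed_offsets = []
    bytes_read = 0
    offset = 0
    # buffering=0 requests raw, unbuffered reads of at most block_size bytes.
    with open(path, "rb", buffering=0) as dev:
        while len(failed_offsets) < max_errors:
            try:
                dev.seek(offset)
                chunk = dev.read(block_size)
            except OSError:
                # Record the failing offset and skip past that block.
                failed_offsets.append(offset)
                offset += block_size
                continue
            if not chunk:
                break  # end of file or device
            bytes_read += len(chunk)
            offset += len(chunk)
    return bytes_read, failed_offsets
```

Any offsets returned in the second element would then be compared with the host error log to identify the device to replace.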
Significant Event Reporting
Controller fault management software reports information about significant events that
occur. These events are reported by:
- Maintenance terminal displays
- Host error logs
- OCP LEDs
Some events cause controller operation to halt; others allow the controller to remain
operable. Both types of events are detailed in the following sections.
Reporting Events That Cause Controller Operation to Halt
Events that cause the controller to halt operations are reported in one of three ways:
- A FLASHING OCP pattern display
- A SOLID OCP pattern display
- Last Failure reporting
Use Table 12 to interpret FLASHING OCP patterns and Table 13 to interpret SOLID (ON) OCP
patterns. In the Error column of the solid OCP patterns, there are two separate
descriptions. The first denotes the actual error message that appears on the terminal, and
the second provides a more detailed explanation of the designated error.
Use the following legend to interpret both tables as indicated:
s = reset button FLASHING (in Table 12) or ON (in Table 13)
q = LED FLASHING (in Table 12) or ON (in Table 13)
no symbol = reset button OFF or LED OFF
NOTE: If the reset button is FLASHING and an LED is ON, either the devices on the bus that
corresponds to the LED do not match the controller configuration, or an error occurred in one of
the devices on that bus.
Also, a single LED that is turned ON indicates a failure of the drive on that bus.
Flashing OCP Pattern Display Reporting
Certain events can cause a FLASHING display of the OCP LEDs. Each event and the resulting
pattern are described in Table 12.
IMPORTANT: Remember that a solid black pattern represents a FLASHING display. A white
pattern indicates OFF.
All LEDs FLASH at the same time and at the same rate.
Table 12 FLASHING OCP Pattern Displays and Repair Actions
Pattern | OCP Code | Error | Repair Action
sq | 1 | Program card EDC error. | Replace program card.
sq | 4 | Timer zero on the processor is bad. | Replace controller.
sq q | 5 | Timer one on the processor is bad. | Replace controller.
sq q | 6 | Processor Guarded Memory Unit (GMU) is bad. | Replace controller.
sq q q | B | Nonvolatile Journal Memory (JSRAM) structure is bad because of a memory error or an incorrect upgrade procedure. | Verify the correct upgrade (see the controller release notes and cover letters, if available). If error continues, replace controller.
sq q q | D | One or more bits in the diagnostic registers did not match the expected reset value. | Press the reset button to restart the controller. If this does not correct the error, replace the controller.
sq q q | E | Memory error in the JSRAM. | Replace controller.
sq q q q | F | Wrong image found on program card. | Replace program card or replace controller if needed.
sq | 10 | Controller Module memory is bad. | Replace controller.
sq q | 12 | Controller Module memory addressing is malfunctioning. | Replace controller.
sq q q | 13 | Controller Module memory parity is not working. | Replace controller.
sq q | 14 | Controller Module memory controller timer has failed. | Replace controller.
Legend:
s = reset button FLASHING; q = LED FLASHING; no symbol = reset button or LED OFF
Table 12 FLASHING OCP Pattern Displays and Repair Actions (Continued)
Pattern | OCP Code | Error | Repair Action
sq q q | 15 | The Controller Module memory controller interrupt handler has failed. | Replace controller.
sq q q q | 1E | During the diagnostic memory test, the Controller Module memory controller caused an unexpected Non-Maskable Interrupt (NMI). | Replace controller.
sq q | 24 | The card code image changed when the contents were copied to memory. | Replace controller.
sq q | 30 | The JSRAM battery is bad. | Replace controller.
sq q q | 32 | First-half diagnostics of the Time of Year Clock failed. | Replace controller.
sq q q q | 33 | Second-half diagnostics of the Time of Year Clock failed. | Replace controller.
sq q q q | 35 | The processor bus-to-device bus bridge chip is bad. | Replace controller.
sq q q q q | 3B | An unnecessary interrupt pending. | Replace controller.
sq q q q | 3C | An unexpected fault during initialization. | Replace controller.
sq q q q q | 3D | An unexpected maskable interrupt during initialization. | Replace controller.
sq q q q q | 3E | An unexpected NMI during initialization. | Replace controller.
sq q q q q q | 3F | An invalid process ran during initialization. | Replace controller.
Legend:
s = reset button FLASHING; q = LED FLASHING; no symbol = reset button or LED OFF
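Looking up a repair action from a flashing OCP code is a plain table lookup. The Python sketch below shows one way to hold a few rows of Table 12 in a dictionary keyed by the hexadecimal code. The names `FLASHING_OCP_ACTIONS` and `flashing_repair_action` are hypothetical, and only a subset of the table is included.

```python
# A few rows copied from Table 12, keyed by OCP code (hexadecimal).
FLASHING_OCP_ACTIONS = {
    0x01: ("Program card EDC error.", "Replace program card."),
    0x04: ("Timer zero on the processor is bad.", "Replace controller."),
    0x0F: ("Wrong image found on program card.",
           "Replace program card or replace controller if needed."),
    0x30: ("The JSRAM battery is bad.", "Replace controller."),
    0x3F: ("An invalid process ran during initialization.", "Replace controller."),
}


def flashing_repair_action(code_text: str) -> str:
    """Return the repair action for a flashing OCP code such as '0F'."""
    code = int(code_text, 16)  # codes in the table are written in hexadecimal
    entry = FLASHING_OCP_ACTIONS.get(code)
    if entry is None:
        return "Code not in this subset; consult Table 12."
    return entry[1]


if __name__ == "__main__":
    print(flashing_repair_action("0F"))
```

A complete version would carry every row of the table; the subset here keeps the example short.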
Solid OCP Pattern Display Reporting
Certain events cause the OCP LEDs to display ON or SOLID. Each event and the resulting
pattern are described in Table 13.
Information related to the solid OCP patterns is automatically displayed on the
maintenance terminal (unless disabled with the FMU) using %FLL formatting, as detailed
in the following examples:
%FLL--HSG> --13-MAY-2001 04:39:45 (time not set)-- OCP Code: 38
Controller operation terminated.
%FLL--HSG> --13-MAY-2001 04:32:26 (time not set)-- OCP Code: 26
Memory module is missing.
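When terminal output is captured to a file, the %FLL header line can be picked apart with a regular expression. The sketch below is one possible approach in Python. The layout it assumes (prompt, timestamp, then "OCP Code:") is inferred only from the two examples above, and the names `FLL_HEADER` and `parse_fll` are hypothetical.

```python
import re

# Pattern inferred from the sample lines; other firmware output may differ.
FLL_HEADER = re.compile(
    r"^%FLL--.*?--(?P<stamp>\d{1,2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2})"
    r".*?--\s*OCP Code:\s*(?P<code>[0-9A-Fa-f]+)\s*$"
)


def parse_fll(header_line: str, message_line: str = ""):
    """Return timestamp text, OCP code, and message, or None if no match."""
    match = FLL_HEADER.match(header_line.strip())
    if match is None:
        return None
    return {
        "timestamp": match.group("stamp"),         # kept as text, not parsed
        "ocp_code": int(match.group("code"), 16),  # OCP codes are hexadecimal
        "message": message_line.strip(),
    }


if __name__ == "__main__":
    print(parse_fll(
        "%FLL--HSG> --13-MAY-2001 04:39:45 (time not set)-- OCP Code: 38",
        "Controller operation terminated.",
    ))
```

The timestamp is left as text because the examples show it flagged "(time not set)", so it may not reflect wall-clock time.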
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 1 of 4)
Pattern | OCP Code | Error | Repair Action
(none) | 0 | Catastrophic controller or power failure. | Check power. If good, reset controller. If problem persists, reseat controller module and reset controller. If problem is still evident, replace controller module.
s | 0 | No program card detected or kill asserted by other controller. Controller unable to read program card. | Make sure that the program card is properly seated while resetting the controller. If the error persists, try the card with another controller; or replace the card. Otherwise, replace the controller that reported the error.
sq q q | 25 | Recursive Bugcheck detected. The same bugcheck has occurred three times within 10 minutes, and controller operation has halted. | Reset the controller. If this fault pattern is displayed repeatedly, follow the repair actions associated with the Last Failure code that is repeatedly terminating controller execution.
sq q q | 26 | Indicated memory module is missing. Controller is unable to detect a particular memory module. | Insert memory module (cache board).
Legend:
s = reset button ON; q = LED ON; no symbol = reset button or LED OFF
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 2 of 4)
Pattern | OCP Code | Error | Repair Action
sq q q q | 27 | Memory module has insufficient usable memory. | Replace indicated DIMMs. This indication is only provided when Fault LED logging is enabled.
sq q | 28 | An unexpected Machine Fault/NMI occurred during Last Failure processing. A machine fault was detected while a Non-Maskable Interrupt was processing. | Reset the controller.
sq q q | 29 | EMU protocol version incompatible. The microcode in the EMU and the software in the controller are not compatible. | Upgrade either the EMU microcode or the software (refer to the release notes that accompanied the controller software).
sq q q | 2A | All enclosure I/O modules are not of the same type. Enclosure I/O modules are a combination of single-ended and differential. | Make sure that the I/O modules in an extended subsystem are either all single-ended or all differential, not both.
sq q q q | 2B | Jumpers, not terminators, found on backplane. One or more SCSI bus terminators are either missing from the backplane or broken. | Make sure that enclosure SCSI bus terminators are installed and that no jumpers are installed. Replace the failed terminator if the problem continues.
sq q q | 2C | Enclosure I/O termination power out of range. Faulty or missing I/O module causes enclosure I/O termination power to be out of range. | Make sure that all of the enclosure device SCSI buses have an I/O module. If problem persists, replace the failed I/O module.
sq q q q q | 2F | Memory module has illegal DIMM configuration. | Verify that DIMMs are installed correctly.
Legend:
s = reset button ON; q = LED ON; no symbol = reset button or LED OFF
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 3 of 4)
Pattern | OCP Code | Error | Repair Action
sq q | 30 | An unexpected bugcheck occurred before subsystem initialization completed. An unexpected Last Failure occurred during initialization. | Reinsert controller. If that does not correct the problem, reset the controller. If the error persists, try resetting the controller again, and replace the controller if no change occurs.
sq q q | 31 | ILF$INIT unable to allocate memory. Attempt to allocate memory by ILF$INIT failed. | Replace controller.
sq q q | 32 | Code load program card write failure. Attempt to update program card failed. | Replace program card.
sq q q q | 33 | Nonvolatile program memory (NVPM) structure revision too low. NVPM structure revision number is lower than can be handled by the software version attempting to be executed. | Verify that the program card contains the latest software version. If the error persists, replace controller.
sq q q q | 35 | An unexpected bugcheck occurred during Last Failure processing. Last Failure processing interrupted by another Last Failure event. | Reset controller.
sq q q q | 36 | Hardware-induced controller reset expected and failed. | Replace controller.
sq q q q q | 37 | Software-induced controller reset expected and failed. | Replace controller.
sq q q | 38 | Controller operation halted. Last Failure event required termination of controller operation, for example: SHUTDOWN via the command line interpreter (CLI). | Reset controller.
sq q q q | 39 | NVPM configuration inconsistent. Device configuration within the NVPM is inconsistent. | Replace controller.
Legend:
s = reset button ON; q = LED ON; no symbol = reset button or LED OFF
Table 13 Solid OCP Pattern Displays and Repair Actions (Sheet 4 of 4)
Pattern | OCP Code | Error | Repair Action
sq q q q | 3A | An unexpected NMI occurred during Last Failure processing. Last Failure processing interrupted by a Non-Maskable Interrupt (NMI). | Replace controller.
sq q q q q | 3B | NVPM read loop hang. Attempt to read data from NVPM failed. | Replace controller.
sq q q q | 3C | NVPM write loop hang. Attempt to write data to NVPM failed. | Replace controller.
sq q q q q | 3D | NVPM structure revision higher than image. NVPM structure revision number is higher than the one that can be handled by the software version attempting to execute. | Replace program card with one that contains the latest software version.
sq q q q q q | 3F | DAEMON diagnostic failed hard in non-fault tolerant mode. DAEMON diagnostic detected critical hardware component failure; controller can no longer operate. | Verify that cache module is present. If the error persists, replace controller.
Legend:
s = reset button ON; q = LED ON; no symbol = reset button or LED OFF
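In the rows of Tables 12 and 13, the number of lit-LED symbols in each pattern equals the number of 1 bits in the hexadecimal OCP code (for example, code 25 is binary 100101 and its pattern shows three LEDs). The Python sketch below uses that observation as a quick sanity check when reading a code off the panel. It assumes each code is a six-bit value with one bit per LED; which bit corresponds to which physical LED is not stated here and is not modeled. The function name `lit_led_count` is hypothetical.

```python
def lit_led_count(ocp_code: str) -> int:
    """Count the 1 bits in a hexadecimal OCP code such as '25' or '3F'."""
    value = int(ocp_code, 16)
    if not 0 <= value <= 0x3F:
        # Assumption: the codes listed in Tables 12 and 13 all fit in six bits.
        raise ValueError("expected an OCP code between 0 and 3F")
    return bin(value).count("1")


if __name__ == "__main__":
    for code in ("25", "2F", "3F"):
        print(code, lit_led_count(code))
```

If the count of lit LEDs on the panel does not match the count for the code being looked up, the code was probably misread.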
Last Failure Reporting
Last failures are automatically displayed on the maintenance terminal (unless disabled via
the FMU) using %LFL formatting. The example below shows a Last Failure report:
%LFL--HSG> --13-MAY-2001 04:39:45 (time not set)-- Last Failure Code: 20090010
Power On Time: 0. Years, 14. Days, 19. Hours, 58. Minutes, 42. Seconds
Controller Model: HSG60
Serial Number: AA12345678 Hardware Version: 0000(00)
Software Version: V086L(FF)
Informational Report
Instance Code: 0102030A
Last Failure Code: 20090010 (No Last Failure Parameters)
Additional information is available in Last Failure Entry: 1.
In addition, Last Failures are reported to the host error log using Template 01, following a
restart of the controller. See Chapter 4 for a more detailed explanation of this template.
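A captured Last Failure report such as the one shown above can be reduced to the fields most often needed when logging a fault. The following Python sketch extracts the Last Failure code and converts the Power On Time line to seconds. The field layout is assumed from the single example report, the names `POWER_ON`, `LAST_FAILURE`, and `summarize_last_failure` are hypothetical, and a year is treated as 365 days purely for illustration.

```python
import re

POWER_ON = re.compile(
    r"Power On Time:\s*(\d+)\.\s*Years,\s*(\d+)\.\s*Days,\s*(\d+)\.\s*Hours,"
    r"\s*(\d+)\.\s*Minutes,\s*(\d+)\.\s*Seconds"
)
LAST_FAILURE = re.compile(r"Last Failure Code:\s*([0-9A-Fa-f]{8})")


def summarize_last_failure(report: str) -> dict:
    """Pull the first Last Failure code and the power-on time out of a report."""
    codes = LAST_FAILURE.findall(report)
    power = POWER_ON.search(report)
    summary = {
        "last_failure_code": codes[0].upper() if codes else None,
        "power_on_seconds": None,
    }
    if power:
        years, days, hours, minutes, seconds = (int(g) for g in power.groups())
        # Assumption: one year counts as 365 days.
        total_hours = (years * 365 + days) * 24 + hours
        summary["power_on_seconds"] = (total_hours * 60 + minutes) * 60 + seconds
    return summary
```

For the example report, the power-on time of 14 days, 19 hours, 58 minutes, and 42 seconds works out to 1281522 seconds.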
Reporting Events That Allow Controller Operation to
Continue
Events that do not cause controller operation to halt are displayed in one of two ways:
- Spontaneous event log
- CLI event reporting
Spontaneous Event Log
Spontaneous event logs are automatically displayed on the maintenance terminal (unless
disabled with the FMU) using %EVL formatting, as illustrated in the following examples:
%EVL--HSG> --13-OCT-2000 04:32:47 (time not set)-- Instance Code: 0102030A (not yet
reported to host)
Template: 1.(01)
Power On Time: 0. Years, 14. Days, 19. Hours, 58. Minutes, 43. Seconds
Controller Model: HSG60
Serial Number: AA12345678 Hardware Version: 0000(00)
Software Version: V086L(FF)
Informational Report
Instance Code: 0102030A
Last Failure Code: 011C0011
Last Failure Parameter[0.] 0000003F
%EVL--HSG> --13-OCT-2000 04:32:47 (time not set)-- Instance Code: 82042002 (not yet
reported to host)
Template: 13.(13)
Power On Time: 0. Years, 14. Days, 19. Hours, 58. Minutes, 43. Seconds
Controller Model: HSG60
Serial Number: AA12345678 Hardware Version: 0000(00)
Software Version: V086L(FF)
Header type: 00 Header flags: 00
Test entity number: 0F Test number Demand/Failure: F8 Command: 01
Error Code: 0008 Return Code: 0005 Address of Error: A0000000
Expected Error Data: 44FCFCFC Actual Error Data: FFFF01BB
Extra Status(1): 00000000 Extra Status(2): 00000000 Extra Status(3): 00000000
Instance Code: 82042002
HSG>
Spontaneous event logs are reported to the host error log using SCSI Sense Data
Templates 01, 04, 05, 11, 12, 13, 14, 41, 51, and 90. See Chapter 3 for a more detailed
explanation of templates.
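Because several %EVL reports can arrive back to back on the maintenance terminal, a capture usually has to be split into individual reports before it is reviewed. The Python sketch below splits on lines that begin with %EVL and records each report's Instance Code and template identifier. The layout is assumed from the two examples above, and the names `EVL_HEADER`, `TEMPLATE`, and `split_event_reports` are hypothetical.

```python
import re

# A report is assumed to start at a line beginning with %EVL that also
# carries the Instance Code, as in the examples.
EVL_HEADER = re.compile(
    r"^%EVL--.*?Instance Code:\s*([0-9A-Fa-f]{8})", re.MULTILINE
)
# "Template: 13.(13)" -> keep the parenthesized identifier as text.
TEMPLATE = re.compile(r"Template:\s*\d+\.\(([0-9A-Fa-f]{2})\)")


def split_event_reports(capture: str) -> list:
    """Return one dict per %EVL report found in the captured text."""
    starts = [m.start() for m in EVL_HEADER.finditer(capture)]
    reports = []
    for begin, end in zip(starts, starts[1:] + [len(capture)]):
        body = capture[begin:end]
        header = EVL_HEADER.match(body)
        template = TEMPLATE.search(body)
        reports.append({
            "instance_code": header.group(1).upper(),
            "template": template.group(1).upper() if template else None,
        })
    return reports
```

The template identifier can then be used to find the matching sense data template description.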
CLI Event Reporting
CLI event reports are automatically displayed on the maintenance terminal (unless
disabled with the FMU) using %CER formatting, as shown in the following example:
%CER--HSG> --13-OCT-2000 04:32:20 (time not set)-- Previous controller operation stopped with display of solid fault code, OCP Code: 3F
HSG>
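The four report prefixes used in this chapter (%FLL, %LFL, %EVL, and %CER) make it straightforward to sort captured terminal lines by report type. The Python sketch below is a minimal illustration; the names `REPORT_KINDS` and `classify_line` are hypothetical, and the descriptions are shorthand for the section titles above.

```python
# Prefix-to-description map based on the formatting names given in this chapter.
REPORT_KINDS = {
    "%FLL": "solid OCP fault report",
    "%LFL": "Last Failure report",
    "%EVL": "spontaneous event log",
    "%CER": "CLI event report",
}


def classify_line(line: str):
    """Return the report type for a terminal line, or None for other output."""
    prefix = line.lstrip()[:4]
    return REPORT_KINDS.get(prefix)


if __name__ == "__main__":
    print(classify_line("%CER--HSG> --13-OCT-2000 04:32:20 (time not set)--"))
```

Lines that return None, such as a bare HSG> prompt, belong to the body of the preceding report or to ordinary CLI output.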
Running the Controller Diagnostic Test
During startup, the controller automatically tests the device ports, host ports, cache
module, and value-added functions. If intermittent problems occur with one of these
components, run the controller diagnostic test in a continuous loop rather than restarting
the controller repeatedly.
Use the following steps to run the controller diagnostic test:
1. Connect a terminal to the controller maintenance port.
2. Start the self-test with one of the following commands:
SELFTEST THIS_CONTROLLER
SELFTEST OTHER_CONTROLLER
NOTE: The self-test runs until an error is detected or until the controller reset button is pressed.
If the self-test detects an error, the self-test saves information about the error and produces
an OCP LED code for a "daemon hard error." Restart the controller to write the error