HP 9000 V-Class Server: Architecture > Chapter 9 System utilities

EMUC and Power-on
The EMUC performs all environmental monitoring on the ECUB. It attaches to the core logic bus so that processors can monitor the system by accessing the EMUC CSRs. The EMUC works in conjunction with a hardware section on the ECUB known as the power-on circuit, which controls powering up the entire system. The power-on circuit operates even when the rest of the system is powered off or in an indeterminate state. It drives the environmental LED display, a basic indication (minimal hardware, no software) of the environmental error that caused the ECUB to power down the system. The teststation can also read the environmental LED display.

The EMUC and the power-on circuit monitor the following environmental conditions:
The power-on function detects some environmental errors (such as ASIC Install or FPGA Not OK) immediately and does not turn on power to the system until the conditions are corrected. It also detects environmental errors such as 48V Fail while the system is powering up and Midplane Power Fail after the system has powered up. If a failure is detected in either of these two cases, the power-on circuit turns off power to the system. The power-on circuit also detects environmental warnings such as 48V maintenance and applies them to the EMUC, which then sends an environmental warning interrupt to the system processors. In all cases, the power-on circuit lights an environmental LED display code. The code is prioritized so that only the highest priority error or warning is displayed.

The EMUC detects most of the environmental conditions. It samples error conditions during a time period derived from a local 10-Hz clock that drives the power-on circuit. It registers all the environmental error conditions twice and then ORs them together. If the conditions persist for 200 milliseconds, the environmental error bit is set, and an environmental error interrupt is sent to the EPUC, which passes it on to the processors. The EMUC then waits 1.2 seconds and commands the power-on circuit to power down the system. The same procedure applies to an environmental warning, except that an environmental warning interrupt is sent and the circuit does not power down the system.

The environmental error interrupt and the 1.2-second delay give the system adequate time to read CSRs to determine the cause of the error, log the condition in NVRAM, and display the condition on the LCD. After the system is powered down, the ECUB remains powered up, but all of its outputs are disconnected from the system.

Second-level registers in the EMUC drive the 6-bit display. The EMUC prioritizes the environmental errors and warnings and passes the information to the power-on circuit.
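The error and warning timing described above can be illustrated with a minimal Python sketch. It assumes one sample per tick of the 10-Hz clock (100 ms), treats "persists for 200 milliseconds" as two consecutive asserted samples, and counts the 1.2-second delay as 12 ticks. The class, method, and event names are hypothetical, and the model does not reproduce the actual register-twice-and-OR hardware logic.

```python
TICK_MS = 100                  # one period of the 10-Hz clock
PERSIST_TICKS = 2              # 200 ms / 100 ms (assumption: two consecutive samples)
POWER_DOWN_DELAY_TICKS = 12    # 1.2 s / 100 ms


class EnvMonitorModel:
    """Hypothetical model of the error/warning timing described in the text."""

    def __init__(self, is_error: bool):
        self.is_error = is_error           # False models an environmental warning
        self.asserted_ticks = 0            # consecutive ticks the condition was seen
        self.status_bit = False            # environmental error (or warning) bit
        self.ticks_since_interrupt = None  # countdown state for the power-down
        self.events = []                   # (time_ms, event) log for inspection

    def tick(self, now_ms: int, condition: bool) -> None:
        if self.status_bit:
            # Bit already set: only the power-down countdown remains (errors only).
            if self.is_error and self.ticks_since_interrupt is not None:
                self.ticks_since_interrupt += 1
                if self.ticks_since_interrupt == POWER_DOWN_DELAY_TICKS:
                    self.events.append((now_ms, "power_down"))
                    self.ticks_since_interrupt = None
            return
        # A single deasserted sample restarts the persistence count.
        self.asserted_ticks = self.asserted_ticks + 1 if condition else 0
        if self.asserted_ticks >= PERSIST_TICKS:
            self.status_bit = True
            self.ticks_since_interrupt = 0
            kind = "error" if self.is_error else "warning"
            self.events.append((now_ms, f"{kind}_interrupt"))


def simulate(samples, is_error=True):
    """Feed one boolean sample per 100-ms tick and return the event log."""
    model = EnvMonitorModel(is_error)
    for i, condition in enumerate(samples):
        model.tick(i * TICK_MS, condition)
    return model.events
```

In this model a condition that glitches on for a single sample never raises an interrupt, while a persistent error produces an interrupt followed by a power-down command 1.2 seconds later. A warning produces only the interrupt.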
The power-on circuit prioritizes the 6-bit field together with its own environmental conditions and produces a 7-bit field plus an attention bit (ATTN) that drives the display. ATTN is on if there is an environmental warning. In general, power-on-detected errors have a higher priority than EMUC-detected errors, and the lower the error code number, the higher its priority. Environmental warnings have a lower priority than environmental errors. Table 9-2 “Environmental LED display” shows the LED display error codes.

Table 9-2 Environmental LED display
The top of the table is the highest priority, the bottom the lowest. If a higher-priority condition occurs, that one is displayed.

This section describes each environmental condition that is monitored by the power-on circuit and the EMUC.

This error indicates that the ECUB 3.3V power supply has failed, but the 5V supply has not.

Each ASIC has install lines to prevent power-up if an ASIC is installed incorrectly (such as an EPAC installed in an ERAC's position). If an ASIC is improperly installed, the ECUB does not power up the system. This condition is not monitored after power-up.

When this error is displayed, the power-on circuit did not power up the system because one or more 48V power supplies reported an error. In systems with redundant 48V power supplies, this error means that two or more 48V supplies reported an error.

If the 48V supply has dropped below 42 volts for any reason other than normally turning off the system or an ac failure, the power-on circuit displays this error. The 48V supply that reported the error and the power-up state of the system at the time of the error are also displayed.

This error indicates that a 48V error occurred and the ECUB lost and then later regained power without the machine being turned off. The power-on circuit displays this error and does not power on the system, because the 48V supply is likely at fault.

If the system clock fails, the EMUC is unable to monitor environmental errors that could damage the system. If the power-on circuit receives no response from the EMUC, it powers down the system and displays this error.

The EMUC is programmed by a serial data transfer from EEPROM upon utility board power-up. If the transfer does not complete properly, the EMUC cannot configure itself and many environmental conditions cannot be monitored. The power-on circuit monitors both the EMUC and EPUC and does not power up the system if they are not configured correctly.
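The display priority rule described with Table 9-2 (only the highest priority active condition is shown, warnings rank below errors, and ATTN is on if there is an environmental warning) can be sketched as a simple priority selection. The condition names below echo conditions mentioned in the text, but their relative order here is a placeholder and not a copy of Table 9-2, and no real LED codes are used. The function name and the reading that ATTN follows any active warning are assumptions.

```python
from typing import Iterable, Optional, Tuple

# Placeholder priority order, highest first. This ordering is an ASSUMPTION
# made for illustration; it is not taken from Table 9-2.
PRIORITY_ORDER = [
    "ASIC_INSTALL",
    "FPGA_NOT_OK",
    "48V_FAIL",
    "MIDPLANE_POWER_FAIL",
    "48V_MAINTENANCE",   # environmental warning, ranked below the errors
]
WARNINGS = {"48V_MAINTENANCE"}


def select_display(active: Iterable[str]) -> Tuple[Optional[str], bool]:
    """Return (condition shown on the display, ATTN state).

    Only the highest priority active condition is shown. ATTN is modeled
    as on whenever any environmental warning is active.
    """
    active_set = set(active)
    unknown = active_set - set(PRIORITY_ORDER)
    if unknown:
        raise ValueError(f"unknown conditions: {sorted(unknown)}")
    shown = next((name for name in PRIORITY_ORDER if name in active_set), None)
    attn = bool(active_set & WARNINGS)
    return shown, attn
```

For example, with both a warning and an error active, `select_display(["48V_MAINTENANCE", "FPGA_NOT_OK"])` returns the error as the displayed condition while ATTN stays on.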
There is one temperature sensor per board that detects board overheating. The sensors are bussed together into four system quadrants plus the ENRB and applied to the EMUC.

Sensors in the six fans determine whether the fans are running properly. The EMUC waits 12.8 seconds for the fans to spin up after power-up before monitoring them.

Because a power failure on a board could cause damage to other boards, a mechanism is in place to detect 3.3V failures on each board. Power failures are considered environmental errors, and the system is powered down after they are detected.

If the ENRB power fails, the power-on circuit powers down the entire system. The ECUB is still active, but the power-on circuit displays the power failure condition and disables all ECUB outputs that drive the system. This condition persists until power is cycled on the ECUB.

There are up to four 48V power supplies. Each sends a signal to the power-on circuit. If any supply fails at any time, the circuit asserts the 48V maintenance line to the EMUC, which reports the environmental warning to the processors. The power-on circuit displays the highest priority 48V supply that failed.

The ambient air sensors detect a too-warm or too-hot condition in the input air stream. Ambient air too warm is an environmental warning; ambient air too hot is an environmental error that powers down the system. The temperature set points are set by the teststation. The digital temperature sensor has nonvolatile storage for the temperature set points. Power-on reset starts the digital temperature sensor without the core logic microprocessor intervening.

The following sections describe the functions the ECUB performs to control the system environment.

When the power switch is turned on, the outputs of the 48V power supplies become active. Several hundred milliseconds after the ECUB 5V supply reaches an acceptable level, the power-on circuit starts powering up the other dc-to-dc converters of the system in succession.
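The ambient air handling described above (too warm is a warning, too hot is an error that powers down the system) amounts to a comparison against two set points. The sketch below is illustrative only: the function and enum names are hypothetical, the set points are passed in as parameters because the text gives no values, and the use of a greater-than-or-equal comparison is an assumption.

```python
from enum import Enum


class AmbientStatus(Enum):
    OK = "ok"
    WARNING = "environmental warning"   # ambient air too warm
    ERROR = "environmental error"       # ambient air too hot: system powers down


def classify_ambient(temp_c: float, warm_setpoint_c: float,
                     hot_setpoint_c: float) -> AmbientStatus:
    """Classify an ambient air reading against two set points.

    The set points stand in for the values the teststation stores in the
    digital temperature sensor; their actual values are not given in the text.
    """
    if warm_setpoint_c >= hot_setpoint_c:
        raise ValueError("warm set point must be below hot set point")
    if temp_c >= hot_setpoint_c:
        return AmbientStatus.ERROR
    if temp_c >= warm_setpoint_c:
        return AmbientStatus.WARNING
    return AmbientStatus.OK
```

With arbitrary example set points of 30 and 40 degrees, a reading of 35 would be classified as a warning and a reading of 45 as an error.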
The power-on circuit does not power up the system if an ASIC is installed incorrectly (see the section “ASIC installation error”) or if an FPGA is not configured (see the section “FPGA configuration and status”). It keeps the system powered up unless an environmental condition occurs that warrants a power-down.

This section describes some of the EMUC CSRs.

The Processor Report register indicates which processors are working in the system. Each processor reports by writing to this register and setting the bit corresponding to its processor number. P0-P15 comprise a fully readable and writable field. The bits are cleared on reset. Once a bit is written to a one value, it remains set until cleared by reset. Writes of a zero value do nothing. Bit Px, when set to one, indicates that processor x has reported in as working.

The Processor Semaphore register provides a signaling function for processor synchronization. This is an atomic read-and-increment register. Count is cleared on reset. Writes load any value. Reads return the value of Count and then increment Count atomically.

The ERAC Data register holds the data to be written to the destination ERAC CSR or the data that has been read from the ERAC CSR. The ERAC Data bits comprise a fully readable and writable field. After an ERAC read operation, the ERAC Data register holds the data. After an ERAC write operation, the data is stored in the ERAC register, and ERAC Data is undefined.

The ERAC Configuration Control register selects the target ERAC, the address of the CSR within that ERAC, and the type of CSR access (read or write). It controls the ERAC CSR operation and then returns the status of the operation. The fields and bits of the ERAC Configuration Control register are defined as follows:
The EMUC Reset register initiates a reset or displays the type of the last reset. This CSR also contains the revision status. The bits and field of the Reset register are defined as follows:
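The Processor Report and Processor Semaphore registers described earlier in this section lend themselves to a small behavioral model. The Python sketch below is an illustration under stated assumptions, not the hardware interface: the class and method names are hypothetical, the width of Count is not given in the text so no wraparound is modeled, and a lock stands in for the atomicity that the hardware provides.

```python
import threading


class ProcessorReport:
    """Sticky bits P0-P15: writing one sets a bit, writing zero does nothing."""

    WIDTH = 16

    def __init__(self):
        self.bits = 0                      # cleared on reset

    def reset(self) -> None:
        self.bits = 0

    def write(self, value: int) -> None:
        # Ones are ORed in; zeros leave already-set bits untouched.
        self.bits |= value & ((1 << self.WIDTH) - 1)

    def read(self) -> int:
        return self.bits

    def report(self, processor: int) -> None:
        """Convenience helper: processor x sets bit Px."""
        if not 0 <= processor < self.WIDTH:
            raise ValueError("processor number out of range")
        self.write(1 << processor)


class ProcessorSemaphore:
    """Read-and-increment counter; the lock models the hardware atomicity."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 0                     # cleared on reset

    def reset(self) -> None:
        with self._lock:
            self.count = 0

    def write(self, value: int) -> None:
        with self._lock:
            self.count = value             # writes load any value

    def read(self) -> int:
        with self._lock:
            value = self.count             # return the current Count ...
            self.count += 1                # ... then increment it
            return value
```

One use the read-and-increment behavior suggests is handing out unique sequence numbers: each processor that reads the semaphore receives a different value, with no separate locking step.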