TR Troubleshooting

Token Ring Soft Error Types

The ten soft error types are as follows:
Internal error
Burst error
Line error
Abort delimiter transmitted error
AC error
Lost frame error
Receiver congestion error
Frame copied error
Frequency error
Token error

Internal error. Troubleshoot the addresses in the assigned fault domain by going to page 2. Keep in mind that the reporting station NIC may be the failure cause. This error signifies that the reporting ring station (RS) has encountered a recoverable internal error. If this error is recorded frequently, the reporting station NIC may be operating in marginal condition. Run any available diagnostics on that RS; to be conclusive, remove the station from the ring and rerun a protocol analysis session. Even a single internal error can mean that the NIC has failed, so the NIC should be troubleshot.

Burst error. Burst errors can occur when ring stations leave or enter the ring, causing ring reconfigurations, which are not of a serious nature. Check the protocol analysis session trace for ring reconfigurations occurring directly before the burst error. If no ring reconfigurations are occurring, troubleshoot the addresses in the assigned fault domain by going to page 2. These errors signify that the reporting RS detected a signal transition error on the Token Ring cabling medium. To confirm that a ring reconfiguration is the cause, check the trace for the Neighbor Notification process occurring directly before the burst error; for example, a Report NAUN Change MAC frame should be present before the error. If the burst error count exceeds 20 in one error packet and no stations have entered or left the ring, this may indicate a bad lobe cable, which should be checked with a Time Domain Reflectometer. In general, if this error is recorded frequently and no ring reconfigurations are occurring, the cabling medium may have a problem. In this case, troubleshoot the fault domain involved.
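The burst error check described above can be sketched in Python. This is only an illustration under stated assumptions: the TraceEvent record, the event names, and the 2-second window used to interpret "directly before" are hypothetical and do not come from any particular protocol analyzer. Only the threshold of 20 is taken from the text.

```python
from dataclasses import dataclass

@dataclass
class TraceEvent:
    """Hypothetical trace record; real analyzers export different fields."""
    time: float      # seconds since the start of the capture
    station: str     # address of the reporting ring station
    kind: str        # "REPORT_NAUN_CHANGE" or "BURST_ERROR" in this sketch
    count: int = 0   # error count carried in the error packet, if any

RECONFIG_WINDOW_S = 2.0  # assumed reading of "directly before"
BURST_THRESHOLD = 20     # from the text: more than 20 in one error packet

def classify_burst_errors(trace):
    """Label each burst error in a time-ordered list of TraceEvent records."""
    results = []
    last_reconfig = None
    for ev in trace:
        if ev.kind == "REPORT_NAUN_CHANGE":
            last_reconfig = ev.time
        elif ev.kind == "BURST_ERROR":
            recent = (last_reconfig is not None
                      and ev.time - last_reconfig <= RECONFIG_WINDOW_S)
            if recent:
                label = "likely ring reconfiguration"
            elif ev.count > BURST_THRESHOLD:
                label = "check lobe cable with a TDR"
            else:
                label = "watch for recurrence"
            results.append((ev.station, ev.time, label))
    return results
```

A burst error labeled "watch for recurrence" still calls for fault domain troubleshooting if it keeps appearing without ring reconfigurations.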

Line error. These errors can occur when ring stations leave or enter the ring, which causes ring reconfigurations, or because of simple ring recoveries. Check the protocol analysis session trace for ring reconfigurations occurring directly before the line error. If no ring reconfigurations or ring recoveries are occurring, troubleshoot the addresses in the assigned fault domain by going to page 2. Keep in mind that a line error can be due to a failure cause located in the reporting station's NAUN (nearest active upstream neighbor). These errors signify that the reporting RS's checksum process has detected a checksum error in a received data frame or token, after transmission of the respective token or data frame. If the line error count exceeds 20 in one error packet and no stations have entered or left the ring, this may indicate a bad NIC, which should be troubleshot. If line errors arise often, test the reporting station's NAUN, either with diagnostics or by removing the station from the ring and rerunning a protocol analysis session.

Abort delimiter transmitted error. Troubleshoot the addresses in the assigned fault domain by going to page 2. Keep in mind that the reporting station NIC may be the failure cause. This error signifies that the reporting RS has encountered a recoverable internal error that forced it to transmit an Abort Delimiter frame. If this error recurs at rates of more than 10 in one error packet, or is generated continuously by one ring station, the reporting station NIC may be operating in marginal condition. Again, diagnostics should be run on the particular RS. To be conclusive, the suspected station can also be removed from the ring and the protocol analysis session rerun.

AC error. Troubleshoot the addresses in the assigned fault domain by going to page 2. Keep in mind that an AC error can be due to a failure cause located in the reporting station's NAUN. This error indicates that the reporting RS's NAUN could not successfully set the address recognized or frame copied bits in the newly transmitted frame, even though it has actually completed the copy of the bits on its last frame received. If this error happens often, it is possible that the reporting station's NAUN has a failure. It is also possible that a common destination station is returning the frame improperly to the source. If this error is captured at counts of even one in an error packet, the NAUN and destination devices of the reporting station should be tested with diagnostics or by removing the stations from the ring and rerunning a protocol analysis session. On router- and bridge-based networks, if a large group of ring stations is experiencing this problem on one common ring, check the router and bridge NICs through diagnostics.
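The per-packet thresholds given so far (internal, burst, line, abort delimiter transmitted, and AC errors) can be collected into one lookup. The following Python sketch is an illustration only: the dictionary keys, the function name, and the ring_reconfigured flag are hypothetical, and the input is assumed to be a plain mapping of error name to the count read from one error packet.

```python
# Minimum per-packet count that calls for action, with the suggested action.
# "More than 20" is stored as 21, "more than 10" as 11, "even one" as 1.
THRESHOLDS = {
    "internal": (1, "troubleshoot the reporting station NIC"),
    "burst": (21, "check the lobe cable with a TDR"),
    "line": (21, "troubleshoot the NIC and test the reporting station's NAUN"),
    "abort_delimiter": (11, "run diagnostics on the reporting station NIC"),
    "ac": (1, "test the NAUN and destination devices"),
}

def review_error_packet(counts, ring_reconfigured=False):
    """Return {error name: suggested action} for one error packet.

    counts: mapping of error name to count (hypothetical input format).
    ring_reconfigured: True if stations entered or left the ring just
    before the packet, in which case burst and line errors are expected.
    """
    actions = {}
    for name, count in counts.items():
        if name not in THRESHOLDS:
            continue  # error types without a stated threshold are skipped
        if ring_reconfigured and name in ("burst", "line"):
            continue
        minimum, action = THRESHOLDS[name]
        if count >= minimum:
            actions[name] = action
    return actions
```

For example, a packet with 25 burst errors and 1 AC error on a stable ring would flag both, while the same packet captured right after a station insertion would flag only the AC error.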

Lost frame error. Troubleshoot the addresses in the assigned fault domain by going to page 2. This error indicates that an originating RS generated a frame onto the ring to a specific address and did not receive the frame back. This error may be detected by either the active monitor or the originating RS. Because the RS did not receive the frame back, it cannot release the token, which causes the active monitor to initiate ring recovery and issue a new token. Also, because the station did not receive the frame back, the source that may have caused the frame to become lost is not directly identifiable. If this type of error occurs frequently, attempt to troubleshoot the fault domain surrounding the reporting RS, and continue to rerun the protocol analysis session to identify any repetitive patterns of this error.

Receiver congestion error. Troubleshoot the addresses in the assigned fault domain by going to page 2. This error indicates that the reporting RS could not receive a frame addressed to it, usually because of a lack of buffer space within the NIC on the destination RS. The destination station NIC may be at fault, but the error normally occurs because a specific network software application is flooding the destination RS with data too frequently. Receiver congestion errors occur often on certain network file servers. If the flooding is frequent and nothing is done to alleviate it, the NOS may have operational failures. The problem usually can be remedied in either of two ways: the network software application access can be redesigned, or a NIC with larger buffer space can be installed in the destination RS.
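To see which stations are being flooded, receiver congestion reports from a capture can be tallied per reporting station. This Python sketch assumes the reports have already been exported as (station address, error kind) pairs; that format, the "receiver_congestion" label, and the min_reports cutoff are hypothetical choices for illustration.

```python
from collections import Counter

def congestion_hotspots(reports, min_reports=5):
    """Return [(station, report count)] for congested stations.

    reports: iterable of (station_address, error_kind) pairs.
    Only stations with at least min_reports receiver congestion
    reports are returned, most frequently reported first.
    """
    tally = Counter(station for station, kind in reports
                    if kind == "receiver_congestion")
    return [(station, n) for station, n in tally.most_common()
            if n >= min_reports]
```

A file server that dominates this list is a candidate for the two remedies described above.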

Frame copied error. Check the ring for a duplicate of the reporting ring station's adapter address. If no duplicate address is found, troubleshoot the addresses in the assigned fault domain by going to page 2. This error signifies that the reporting RS has copied a frame that may have the same address as its own (a duplicate address). It also is possible that the frame was corrupted on the ring. If this error occurs frequently, and there is no duplicate Token Ring address assigned on the ring, check the reporting RS's adapter.

Frequency error. Troubleshoot the ring station assigned the active monitor role and the reporting ring station's NAUN for a possible failure by going to page 6. This error signifies that the reporting RS is attempting to receive a frame that does not contain the proper ring-clock frequency. Because the active monitor is responsible for maintaining the Ring Master Clock, it is possible that the active monitor has encountered an error. If this error occurs frequently, check the active monitor and the reporting RS's NAUN. It is also possible that a power problem or MRP with bad grounding can induce intermittent frequency errors. Again, run any available diagnostics, remove any suspected RS from the ring, and rerun a protocol analysis session.

Token error. Troubleshoot the addresses in the assigned fault domain by going to page 2. A token error is generated only by the active monitor, in the event that it does not detect a token on the ring. Because the active monitor cannot pinpoint the reason for the token being lost, the cause cannot be associated directly with any particular network component. The active monitor initiates ring recovery and issues a new token. Because of the ring recovery process that follows a token error, it is highly possible that other RSes will detect and report burst, line, and lost frame errors. If token errors occur frequently, continue to run a protocol analysis session to identify any repetitive patterns of this error.
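When reviewing a trace, it can help to set aside the burst, line, and lost frame errors that closely follow a token error, since they are probably side effects of ring recovery. The Python sketch below shows one way to do that. The (time, station, kind) tuple format, the kind labels, and the 1-second recovery window are assumptions made for illustration, not values from the text.

```python
SIDE_EFFECT_KINDS = {"burst", "line", "lost_frame"}
RECOVERY_WINDOW_S = 1.0  # assumed duration of ring recovery side effects

def split_side_effects(events):
    """Split a time-ordered list of (time, station, kind) tuples.

    Returns (primary, secondary): secondary holds burst, line, and lost
    frame errors seen within RECOVERY_WINDOW_S after a token error;
    everything else, including the token errors, stays in primary.
    """
    primary, secondary = [], []
    last_token_error = None
    for time, station, kind in events:
        if kind == "token":
            last_token_error = time
            primary.append((time, station, kind))
        elif (kind in SIDE_EFFECT_KINDS
              and last_token_error is not None
              and time - last_token_error <= RECOVERY_WINDOW_S):
            secondary.append((time, station, kind))
        else:
            primary.append((time, station, kind))
    return primary, secondary
```

Repetitive patterns are then easier to spot in the primary list, which is no longer cluttered by recovery noise.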

November 15, 1996
