Network Troubleshooting: How to Fix Network Outages Step-by-Step
A network that suddenly becomes unavailable during office hours is a problem most low-voltage engineers and network engineers have run into. Without a systematic approach, tracking down the cause can take far longer than it should. This article walks through, step by step, how one such outage was diagnosed and resolved.
1. Symptom of the Fault
One day, department B reported that nobody in the department could access the Internet. I first tested from my own computer: pinging my own department's gateway returned normal replies, and Internet access worked. No other department reported problems, so I ruled out a network-wide fault. Because every computer in department B was affected rather than just one, I suspected a problem inside department B's LAN.

2. Troubleshooting Process
2.1 Remote Login to the Aggregation Switch:
Logged into the aggregation switch remotely to begin troubleshooting. Pinging department B's gateway from the aggregation switch failed. The interface status showed that the port serving the department was in the "err-disabled" state, meaning the switch had shut the port down automatically after detecting a fault condition. The most likely cause was a Layer 2 switching loop, but this could not be confirmed remotely, so on-site testing was needed.
2.2 On-site Testing of the Aggregation Switch:
Disconnected the department's uplink cable from the aggregation switch and connected a laptop directly to that port, configuring the laptop with an IP address from the department's subnet. Logged into the switch through the console port and found the interface still in the "err-disabled" state. Issued "shutdown" followed by "no shutdown" on the interface to reset it, after which its status changed to "connected".

The laptop could then ping the gateway and access the Internet normally, which showed that the aggregation switch port itself was working and that the fault lay within the department's LAN.
2.3 Testing the Department's LAN:
The department's office computers connected to a Layer 2 switch, which in turn uplinked to the aggregation switch. Located the department end of the uplink cable, connected the laptop to it, and tested: access was normal, so the uplink cable to the aggregation switch was fine.

Plugged the uplink cable back into the Layer 2 switch and connected the laptop to that switch for testing. This time the laptop could neither ping the gateway nor access the Internet. The switch's indicator lights showed no obvious abnormalities.
2.4 Step-by-step Troubleshooting to Find the Issue:
Decided to eliminate the possible causes one at a time. Disconnected all the office network cables from the Layer 2 switch and asked a colleague to log into the aggregation switch remotely and reset the department's interface. Then connected the laptop and ran "ping 192.168.20.254 -t" as a continuous connectivity test. The pings succeeded, which showed that the Layer 2 switch itself was not at fault. Reconnected the office cables one at a time while watching the ping output. Each cable was labeled with its room number, and when the cable from one particular room was plugged in, the pings began to time out, pointing to a problem on that room's line. Marked that cable, set it aside, had the port reset again, and carried on connecting the remaining cables. With all the other cables connected, no timeouts occurred, confirming that this one cable was the source of the fault.
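The one-cable-at-a-time elimination above can be sketched in Python. This is only an illustration of the logic, not a tool used in the original incident: the function name find_faulty_cables and the link_ok callback are hypothetical, and link_ok stands in for the manual step of watching whether the continuous ping keeps getting replies.

```python
from typing import Callable, Iterable, List, Set


def find_faulty_cables(
    cables: Iterable[str],
    link_ok: Callable[[Set[str]], bool],
) -> List[str]:
    """Plug cables in one at a time and return those that break connectivity.

    link_ok(connected) is a stand-in for the manual check: it should return
    True if the ping still gets replies with `connected` cables plugged in.
    """
    connected: Set[str] = set()
    faulty: List[str] = []
    for cable in cables:
        if link_ok(connected | {cable}):
            connected.add(cable)   # ping still fine: leave it plugged in
        else:
            faulty.append(cable)   # ping timed out: mark it, keep it unplugged
            # On site, the uplink port has to be reset here before continuing.
    return faulty


# Hypothetical room labels; the line to room 305 is assumed to contain the loop.
labels = ["301", "302", "305", "306"]
result = find_faulty_cables(labels, lambda connected: "305" not in connected)
print(result)
```

Keeping the suspect cable unplugged while testing the rest is what makes the final "no timeouts" result meaningful: it shows that no second cable is also at fault.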
2.5 Tracing the Cause of the Fault:
Went to the room marked on the cable to find the cause of the fault. The wall outlet in the room fed a 4-port switch with all four ports occupied. The office had only three computers, connected to the small switch by exposed cables. On inspection, one cable had both ends plugged into the small switch, creating a Layer 2 switching loop. Removed that cable and retested: the fault was cleared and the department could access the network normally.
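To see why a cable with both ends in the same switch counts as a loop, the cabling can be modeled as an undirected graph and checked for links that close a cycle. The sketch below uses a union-find structure. The device names are hypothetical, and this is a simplified model of the topology, not how a switch detects loops in practice.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def find_loop_links(links: List[Link]) -> List[Link]:
    """Return the links that close a loop in an undirected Layer 2 topology."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    loops: List[Link] = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            loops.append((a, b))     # both ends already connected: redundant path
        else:
            parent[root_a] = root_b  # merge the two segments
    return loops


# Hypothetical names for the devices in the case described above.
topology = [
    ("agg-switch", "dept-switch"),
    ("dept-switch", "room-switch"),
    ("room-switch", "pc1"),
    ("room-switch", "pc2"),
    ("room-switch", "pc3"),
    ("room-switch", "room-switch"),  # the cable with both ends in one switch
]
print(find_loop_links(topology))
```

Only the last link is reported, because it is the only one whose two ends were already connected before it was added.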
3. Summary
3.1 Standardize and Detail Cable Labeling:
Label both ends of every cable clearly with the office and port, e.g., "303A" for port A in office 303.
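A labeling scheme is easier to enforce if labels can be checked automatically. The sketch below validates and splits a label, assuming the format is an office number followed by a single port letter, as in the "303A" example. The CableLabel type and parse_label function are hypothetical names.

```python
import re
from typing import NamedTuple


class CableLabel(NamedTuple):
    office: str
    port: str


# Assumed format: office number (digits) followed by one port letter.
_LABEL_RE = re.compile(r"^(\d+)([A-Z])$")


def parse_label(label: str) -> CableLabel:
    """Split a label such as '303A' into office '303' and port 'A'."""
    match = _LABEL_RE.match(label.strip().upper())
    if match is None:
        raise ValueError(f"label does not match <office><port>: {label!r}")
    return CableLabel(office=match.group(1), port=match.group(2))


print(parse_label("303A"))
```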
3.2 IP/MAC Address Registration:
Implement IP/MAC address registration, recording each computer's IP address, MAC address, user, computer model, serial number, etc. Ideally, bind each IP address to its MAC address to enhance network security, prevent unauthorized access, and make post-incident auditing easier.
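A registry like this can start as a simple list of records that is checked for conflicts. The following is a minimal sketch: the Host record keeps only three of the fields mentioned above, and all names and addresses are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Host:
    ip: str
    mac: str
    user: str


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC and use colons so the same address always compares equal."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_conflicts(hosts: List[Host]) -> List[str]:
    """Report every IP address that is registered to more than one MAC."""
    seen: Dict[str, str] = {}
    conflicts: List[str] = []
    for host in hosts:
        mac = normalize_mac(host.mac)
        if host.ip in seen and seen[host.ip] != mac:
            conflicts.append(f"{host.ip}: {seen[host.ip]} vs {mac}")
        seen.setdefault(host.ip, mac)
    return conflicts


# Hypothetical registry entries.
registry = [
    Host("192.168.20.11", "00-1A-2B-3C-4D-5E", "alice"),
    Host("192.168.20.12", "001a.2b3c.4d60", "bob"),
    Host("192.168.20.11", "00:1A:2B:3C:4D:5F", "carol"),
]
print(find_conflicts(registry))
```

Normalizing the MAC format first matters because the same address may be written with dashes, colons, or dots depending on where it was copied from.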
3.3 Strengthen Management:
The switching loop was caused by poor management and insufficient control over what individual departments connect to the network. Establish the corresponding management policies, tighten network access rules, and unify network planning, prohibiting the unauthorized purchase of small switches, wireless routers, and other devices.






