Recognize and Isolate Issues
Recognizing and isolating the issues is the essence of
troubleshooting. You should start troubleshooting wide – not ignoring the
obvious – and move in closer to the problem. For example, let’s look at a
situation where you are troubleshooting a user who cannot connect to the
network:

In this example, we start at the general problem, “No
Network Connectivity”, and then get increasingly detailed as troubleshooting
proceeds. We may go in a different direction at some point, but
eventually we troubleshoot enough potential problems to find the core
problem.
Real World Example
There was once a situation where we were troubleshooting a
network connectivity issue. A user couldn’t connect to the network. We went
through a troubleshooting process of identifying the problem from a “large
circle” perspective and determined the problem could be in any of these areas:
- Network Issue (e.g. bad switch, bad wall port, DHCP server down)
- Network Cabling Issue (bad patch cable, bad cable on the back end)
- PC Network Interface Card (bad card, bad port, wrong card settings)
- Software Issue (bad driver, incorrect configuration settings, Windows issue)
We worked on this issue for a while, decided the user had
a bad cable, and replaced it. The problem still occurred, so we searched for
another solution. Several troubleshooting steps later, we came back around
to it being a cable issue. Again, we replaced the new
cable, with no resolution to the problem. After much more troubleshooting, we
replaced the cable a third time and the problem was
resolved!
We had a bad original cable and two bad replacement cables.
No matter how obvious you think the solution is, and even if you think it
can’t be that simple, try it anyway! You never know when the solution is
sitting right in front of you. I often find in troubleshooting that the
simplest explanation is usually the right one.
Setting Priorities When Troubleshooting
When you are troubleshooting a problem, you must learn the
art of setting priorities. You often don’t have time to try every
troubleshooting path you think of, so you must learn how to decide on the most
likely cause, go down that troubleshooting route, and return if you need to.
Much of this ability comes with time and experience: the more you experience,
the easier it becomes to zero in on the most likely problem.
One method you can use to set priorities is to make a list of the
likely causes and then rank them from 1 to 4, with 1 being “Most Likely” and 4
being “Least Likely.”
For example, let’s assume a user is having a problem with a
“slow computer.” It is hard to imagine a more vague complaint! There are a
lot of causes for a “slow computer,” so begin by making a list and marking
each item with a 1 to 4 rating:
1. Most Likely Cause
2. Likely Cause
3. Somewhat Likely Cause
4. Least Likely Cause
My list might be something like this:
Adware/spyware software (1)
Bad motherboard (4)
Degraded hard drive (2)
Too much software installed (3)
Service Pack installation failure (4)
Needs additional memory (1)
You could probably come up with another 20 items, but
that’s a good start. Now that I have a few theories on what the problem
could be, I will begin troubleshooting the computer by checking for spyware,
checking memory usage, examining the hard drive for fragmentation, checking to
see what software is installed, and so on. The list gives me ideas as to
what the problem might be and suggests the order in which to attack
those possibilities.
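The ranking step can be sketched in a few lines of Python. This is only an
illustration of the idea, using the example list and ratings above; the names
likely_causes and rank_causes are made up for this sketch and do not come from
any tool.

```python
# Candidate causes for a "slow computer", rated 1 (most likely) to 4 (least likely).
# The ratings are the ones from the example list; the variable and function
# names are hypothetical and exist only for this illustration.
likely_causes = [
    ("Adware/spyware software", 1),
    ("Bad motherboard", 4),
    ("Degraded hard drive", 2),
    ("Too much software installed", 3),
    ("Service Pack installation failure", 4),
    ("Needs additional memory", 1),
]


def rank_causes(causes):
    """Return the causes ordered from most likely (1) to least likely (4).

    sorted() is stable, so causes that share a rating keep their original order.
    """
    return sorted(causes, key=lambda item: item[1])


if __name__ == "__main__":
    for cause, rating in rank_causes(likely_causes):
        print(f"({rating}) {cause}")
```

Working down the sorted list puts the spyware and memory checks first and leaves
the motherboard and Service Pack theories for last.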
When You Hit a Wall
What do you do when you hit a wall in troubleshooting?
One of my favorite scenario questions when interviewing job
candidates is a troubleshooting question with no clear answer. The question is
designed to gauge the technical depth of the candidate – push them to their
technical limit – then find out what they do when they hit the proverbial
wall. So what do you do when you hit a wall in troubleshooting and need to
find the solution?
Find Someone Senior
The first thing you might try is to ask someone senior to
you if they have seen this problem before. This step is designed for a “quick
hit” – if they have not seen the problem, you should not just hand it off to
them to solve. It’s a learning opportunity for you to troubleshoot and
research a problem to completion.
Web Research
The next step is to research the issue using the Internet.
There are hundreds of good knowledge bases and forums out there for you to use
in research, but quite frankly, it comes down to Google.com and Microsoft.com.
These two resources will help you find the answer to almost any problem.
In fact, in my many years in IT, I have had only
one problem for which we were unable to find a solution on the
Internet. We found one user in Germany who had posted a similar
problem on a forum, with no answer. Unfortunately, he didn’t reply to
our emails, so we ended up troubleshooting the problem for
weeks until we found a solution.
Otherwise, Google has always been the best resource for
finding a solution.
How you phrase your search term often determines whether you find
the answer to your problem. Try several different versions of your search
term to see what results come up.
I have found that Microsoft’s internal search engine is
lacking. To find something in Microsoft’s knowledge base, try this search
term:
site:microsoft.com inurl:kb my search phrase
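If you find yourself typing that pattern often, a small helper can build the
search string for several phrasings at once. This is a minimal sketch: the
function name build_kb_query and its parameters are hypothetical, and it only
assembles the text you would paste into the search box.

```python
def build_kb_query(phrase, site="microsoft.com", url_part="kb"):
    """Build a search string limited to one site and to URLs containing url_part.

    The site: and inurl: operators are the ones used in the search term above;
    the function and parameter names are assumptions made for this example.
    """
    phrase = " ".join(phrase.split())  # collapse stray whitespace
    return f"site:{site} inurl:{url_part} {phrase}"


if __name__ == "__main__":
    # Try several versions of the same problem description.
    for wording in ["printer spooler crash", "print spooler stops unexpectedly"]:
        print(build_kb_query(wording))
```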
You should also be aware that many hardware vendors have
extensive knowledge bases and forums on their sites. Some are better than
others, but most have useful information to help track down your problem.
Hardware Troubleshooting
If the problem is related to hardware, there may be some
obvious clues as to what the issue is. For example, if the computer is beeping
during POST, you are dealing with a hardware issue. For the A+ certification
exam, you should be aware of how to troubleshoot hardware issues.
BIOS POST Diagnostic Beeps
During Power-On Self Test (POST), the BIOS checks different
hardware to ensure it is operating correctly before starting the bootup
process. This POST check will produce several beeps if something is not
operating correctly. Here are several common beep codes and what they mean:
0 Beeps: Power issue or problem with the power supply.
1 Beep: If at the end of POST and the computer boots up, no problems.
2, 3 Beeps: RAM issue – reseat memory or replace with known good memory chips.
4, 5, 7, 10 Beeps: The motherboard has a serious problem and should be repaired/replaced.
6 Beeps: Keyboard error.
8 Beeps: Video card error. Reseat and check connections.
9 Beeps: Faulty BIOS. Replace the motherboard.
These are guidelines to typical beep code errors. You
should always check your specific computer manufacturer’s documentation for
the exact error message. POST beep error codes are often a series of long and
short beeps with specific meaning to help in troubleshooting.
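For quick reference, the guideline codes above can be kept in a simple lookup
table. The following Python sketch only restates that list; the names
BEEP_GROUPS, BEEP_CODES, and describe_beeps are hypothetical, and the meanings
are the general guidelines given here, not any one manufacturer’s codes.

```python
# Guideline beep counts grouped as in the list above. Real codes vary by BIOS
# vendor, so treat this table as a study aid rather than a diagnostic tool.
BEEP_GROUPS = [
    ((0,), "Power issue or problem with the power supply."),
    ((1,), "No problems, if heard at the end of POST and the computer boots up."),
    ((2, 3), "RAM issue: reseat memory or replace with known good memory chips."),
    ((4, 5, 7, 10), "Serious motherboard problem: repair or replace the motherboard."),
    ((6,), "Keyboard error."),
    ((8,), "Video card error: reseat and check connections."),
    ((9,), "Faulty BIOS: replace the motherboard."),
]

# Flatten the groups into one entry per beep count.
BEEP_CODES = {count: meaning for counts, meaning in BEEP_GROUPS for count in counts}


def describe_beeps(count):
    """Return the guideline meaning for a beep count, or a fallback message."""
    return BEEP_CODES.get(
        count, "Unknown code: check the manufacturer's documentation."
    )


if __name__ == "__main__":
    for beeps in (0, 3, 8, 11):
        print(beeps, "->", describe_beeps(beeps))
```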
Exam Moment
One of the frequently asked questions on the exam involves
a computer system continually losing its time setting. This is often caused
by a CMOS battery which has lost its charge. Replace the CMOS battery and the
computer will start retaining its time settings.
Document Your Findings
As you troubleshoot, you should keep a written record of your
progress. This will help other technicians assigned to your case while you are
off work, and also help you narrow down what the problem could be so you do not
repeat troubleshooting you have already performed.
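A running record does not need to be elaborate. As a rough sketch, assuming
you simply want to note each step and check whether it has already been tried,
something like the following Python would do; the class TroubleshootingLog and
its methods are invented for this illustration and are not part of any ticket
system.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TroubleshootingLog:
    """A running record of the steps tried on one problem (hypothetical structure)."""

    problem: str
    steps: list = field(default_factory=list)

    def already_tried(self, action):
        """Return True if this action was recorded earlier (case-insensitive)."""
        return any(step["action"].lower() == action.lower() for step in self.steps)

    def record(self, action, result):
        """Append one step with a timestamp, the action taken, and what happened."""
        self.steps.append(
            {"time": datetime.now(), "action": action, "result": result}
        )


if __name__ == "__main__":
    log = TroubleshootingLog("No network connectivity")
    log.record("Replaced patch cable", "No change")
    if not log.already_tried("Reinstalled network card driver"):
        log.record("Reinstalled network card driver", "No change")
    for step in log.steps:
        print(step["time"].isoformat(timespec="minutes"), step["action"], "->", step["result"])
```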
At the end of your troubleshooting and once you have solved
the problem, you need to document your findings in whatever trouble ticket or
knowledge base system your company uses. This ensures that the next technician
who picks up a problem for this user or finds a computer with a similar
problem will have access to what you did to resolve the issue. This will save
valuable time and money in the future.