Tips, tricks, techniques, tools, and code for those who administer Windows desktops.

Troubleshooting Software Problems

Frequently in my work as a Windows system administrator, I am asked to troubleshoot unusual application problems that our first- and second-level support staff have been unable to fix.

Although I troubleshoot these kinds of problems on a fairly regular basis, I find that I don’t always do so consistently. I might overlook something that I shouldn’t have, or I might forget something I’ve seen before that helped me solve a similar problem.

To help my co-workers and me jog our memories when an application problem doesn’t respond to the troubleshooting steps we’ve tried so far, I developed the following (lengthy) series of questions to work through.

Since this list might be of value to others who are trying to solve problems with Windows applications (or even Mac or Linux applications, though this guide is aimed specifically at Windows), I thought I would publish it here.

  • Has the PC been rebooted to ensure the problem isn’t temporary? If rebooting isn’t practical, try having the user log off/on, as this will refresh the applications that load when the user logs on and terminate anything that might be hung.
  • Have we checked to see if the manufacturer’s support site has seen this problem before?
  • Have we done a Google search on any symptoms or error messages to see if others have seen and fixed this before?
  • If this is a new application install, does the problem occur for a normal user but not for an administrator? If so, we probably need to adjust permissions on some of the files/folders in the application’s C:\Program Files directory. The Sysinternals Filemon tool can help you identify which files are causing trouble, and Regmon can do the same for registry entries. (Both tools have since been superseded by Sysinternals Process Monitor.)
  • Does the problem occur when other users log on to the same PC and use the same application? If not, we’re probably looking at a user profile issue. Try renaming the user’s profile and having them log in to create a new one, then see if the app works.
  • Has the application in question been repaired using Add/Remove Programs, or removed and reinstalled? If the application interacts with other applications (e.g., Flash Player and Internet Explorer), have all the relevant applications been repaired and/or reinstalled?
  • If the problem involves a browser add-on or extension, have we disabled all other browser extensions and add-ons to see if there is a conflict of some sort (for Internet Explorer, see Tools -> Manage Add-ons -> Enable or Disable Add-ons)? Has a recent Microsoft "kill bits" or ActiveX patch disabled it?
  • Is the application in question a Java application, or does it make some use of Java? If so, check to make sure Java is working by entering "java -version" at a command prompt. If Java isn’t found, that could be the problem.
  • If this is a problem with an application that creates and opens documents (like Excel), does the problem happen with all documents or just certain ones? If the document is copied to another machine with the same application does that machine exhibit the same problem? If so, it may just be a corrupted document.
  • Does the application use any temp files or configuration files (e.g., INIs) that might be corrupted? If so, have we tried renaming those and letting the application make new copies? For Internet Explorer, this includes the Temporary Internet Files. For Office, it includes opa11.dat, excel11.xlb, excel11.pip, mso1033.acl, powerp11.pip, ppt11.pip, and extend.dat. (Note that an uninstall/reinstall doesn’t usually fix this.)
  • Has CHKDSK been run to ensure there is no disk corruption? (Note: Multiple runs may be needed if corruption is extensive.) If there was corruption, repairing the application after fixing the corruption is a good idea. If there is still a problem, the OS itself might be corrupted and a full rebuild or reimage may be the best answer, especially if you can’t replicate the issue on another PC. If corruption doesn’t seem to get fixed after 3 CHKDSK runs, you’re probably looking at a bad hard disk or such severe corruption that rebuild is a better idea than repair.
  • Have we checked the vendor’s web site to see if there are any updates, hotfixes, or patches available and applied them?
  • If the application uses plug-ins, have we tried repairing and/or removing those plug-ins to see if the problem goes away?
  • Are there multiple versions of the application installed (e.g., Office 2003 and Office XP)? Can the user live without one of them? Has the newer version been repaired before (and/or after) the older one?
  • Is there anything in the Event Logs which might point to the cause of the problem? Does the application produce any logs of its own that we can look at?
  • If this is a network-related application (like Outlook, Cygwin, etc.) have we confirmed that networking is working? Is the firewall causing a problem?
  • If this is a database-related application, is the database up? Is there an ODBC database provider configured in the control panel? Is any database middleware the application needs (e.g., Oracle software) actually installed?
  • Was anything installed on the computer just prior to the onset of the problem?
  • Were any patches applied recently that affect this particular application? Have you tried removing the most-recently installed patches to see if this helps (see Add/Remove Programs)?
  • Have we tried renaming the branch of the registry related to the application and then repairing the application (e.g., HKLM\Software\Vendor to HKLM\Software\Vendor.old)?
  • If this is an application which prints (like the Office apps), try changing the default printer and launching the application again. If the problem disappears, delete the original default printer, re-add it to get new drivers, and make it the default again. (Some apps grab printer information at startup and can crash if there is a driver issue.)
  • Is there a chance that this application needs a firewall exception? Check its manual, vendor web site, etc., to verify this and if necessary add one. If it needs a firewall exception and this wasn’t automatically done at install, set one up and add it to the package for the future.
  • Does the machine have the latest BIOS?
  • Some applications interface with, or hook into, hardware drivers. For example, remote control software does this to simulate keyboard/mouse input and capture video changes. If there’s a chance this application does that, have we tried updating the drivers (e.g., video, network, keyboard/mouse)? Note that you may need to repair the app after updating the drivers so the app can restore its "hook" into them.
  • If this is an application that processes sound, like a sound recorder, are the Control Panel settings correct for that? For example, are the input and output devices set correctly? (You may want to experiment with various options in case the control panel thinks, for example, that the line-in jack is the microphone jack.)
  • If this is a problem getting an application to launch, the likely culprits are disk corruption, corrupted temporary files, corrupted settings files, corrupted application files, or bad registry entries. CHKDSK can fix disk errors. Repairing the app should fix corrupted application files. Deleting temp and settings files should be tried. Renaming the Registry branch used by the app can help restore corrupted Registry entries.
  • Does the application rely on any Windows Services in order to function? Are those services installed and started? Have you tried stopping and restarting them?
  • Is there enough free space on the user’s hard disk (1-2GB)? The application may need to create temporary files, or the operating system may need page file room.
  • Does this application interact with a CD-ROM or other peripheral? If so, is that device attached? Is it working? If it’s a disk drive, does it contain a disk? Is that disk corrupted or unreadable?
  • Does the application generate any logs itself? (These may appear in the application’s own directory or in the user profile.) Any indication of a problem there? Does searching the error messages on the Internet help any?
  • If the application interfaces with something on the network, like a web server or application server, can we determine if that server is online? Are other users with this same software able to get to that server? Is there anything wrong with the user’s account on that server?
  • If this is an issue with a peripheral, like a mouse, have we tried using a generic Microsoft driver for the device (if there is one)? If we’re already using a generic driver for the device, have we tried a manufacturer-specific one?
  • If the problem in question is display oriented, like a window not refreshing properly or graphics appearing corrupted, etc., have we tried updating the video drivers to the latest available from the card’s manufacturer?
  • If this is a problem working with a media file, does the PC have the correct "codec" (compression/decompression) software installed? For example, AVI files may need codecs like DivX, Xvid, and so forth installed.
  • If this is a web-browser-oriented application, does it work when an administrator is logged in and running the browser (be careful about this if you’re going to an untrusted site as you could introduce malware!)? If so, we’re probably missing a plug-in or permission that allows the user to run the app.
  • Some applications embed an Internet Explorer control into them to read/view content from the Internet. Is that a possibility with this application? If so, have we tried repairing and troubleshooting IE?
  • If this is an issue with Internet Explorer, have we tried using Tools -> Internet Options -> Advanced -> Reset... to restore the browser to its default configuration? Have we tried deleting temporary files?
  • Have we considered possible hardware causes for this problem? For example, could a failing hard disk cause this? Could faulty RAM be making this machine unstable? Could a bad motherboard or video card do this? An easy way to test this would be to configure a similar machine with the same software and see if you get the same result.
  • Have we tried calling, emailing, etc., the application manufacturer if possible?
  • If you’ve already invested a lot of time and are no closer to fixing it, and you can’t replicate the issue for others with a similar hardware/software build, have you considered that a rebuild may be a better use of time? If this is a one-off issue that isn’t recurring for the user (or that you’re not seeing for lots of users), rebuilding the machine may be cheaper for the company than spending more hours fixing the issue.
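
A few of the checks above are easy to script. As a rough sketch of the Java check (running "java -version"), something like the following Python could be used. The function name java_status is my own invention for illustration, and the sketch assumes Python is available on the machine being examined.

```python
import shutil
import subprocess


def java_status(executable="java"):
    """Return (found, version_text) for a Java launcher name.

    java_status is a hypothetical helper name, not part of any tool.
    """
    path = shutil.which(executable)
    if path is None:
        # Nothing by that name on the PATH: the "Java isn't found" case.
        return False, ""
    # "java -version" writes its banner to stderr, so capture both streams
    # and prefer stderr when it has content.
    result = subprocess.run(
        [path, "-version"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return True, (result.stderr or result.stdout).strip()
```

Calling java_status() returns a pair such as (False, "") when no Java launcher is on the PATH. When one is found, the second element holds whatever version banner that launcher printed.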
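
The free disk space question can also be answered from a script rather than by eye. This is a minimal sketch only: the helper name has_enough_free_space is hypothetical, and the 2 GB default simply mirrors the upper end of the 1-2GB rule of thumb in the list.

```python
import shutil

GIB = 1024 ** 3  # bytes per gibibyte


def has_enough_free_space(path, minimum_gib=2.0):
    """Return (ok, free_gib) for the volume that contains `path`.

    Hypothetical helper; the default threshold is an assumption taken
    from the 1-2GB guideline, not a hard requirement.
    """
    free_gib = shutil.disk_usage(path).free / GIB
    return free_gib >= minimum_gib, free_gib
```

On a Windows PC you would typically pass the system drive, for example has_enough_free_space("C:\\"), and raise the threshold for applications known to create large temporary files.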
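
Several items in the list come down to renaming a settings or temp file so the application builds a fresh copy. The sketch below shows one cautious way to do that without overwriting an earlier backup. The name set_aside and the ".old" suffix are assumptions for illustration (the suffix echoes the Vendor.old registry example), and the application should be closed before its files are renamed.

```python
from pathlib import Path


def set_aside(path, suffix=".old"):
    """Rename `path` to `path + suffix` and return the new Path.

    Hypothetical helper. Returns None if the file does not exist.
    If a previous backup already holds that name, a number is
    appended (".old1", ".old2", ...) so nothing is overwritten.
    """
    src = Path(path)
    if not src.exists():
        return None
    dest = src.with_name(src.name + suffix)
    counter = 1
    while dest.exists():
        dest = src.with_name(f"{src.name}{suffix}{counter}")
        counter += 1
    src.rename(dest)
    return dest
```

Because the original is renamed rather than deleted, it can be renamed back if the fresh copy doesn’t solve the problem.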
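
For the questions about whether a server the application depends on is online, a quick TCP connection test from the affected PC can help separate network problems from application problems. This is only a sketch: can_connect is a hypothetical name, and the host and port to test depend entirely on the application in question.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time.

    Hypothetical helper. A False result covers refused connections,
    timeouts, and name-resolution failures alike.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A True result only shows that something is listening on that port. It says nothing about the user’s account on the server, so the account questions in the list still apply.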
