Looking for PC Advice and Troubleshooting


Back_Blast

 

Posted

These forums have a pretty knowledgeable and helpful group of people, so I'm going to ask for a little advice. My current rig was built for me less than two years ago by a friend who owed me some favors.

It's been a real workhorse, but I've started having occasional problems with it, usually (but not always) related to playing CoH. In about one out of four sessions the machine will cut straight to a black screen and reboot. I've checked pretty much everything I can think of, and can't determine the culprit. Any ideas?

Processor: Intel Core2 Extreme QX6850 @3.0GHz
Motherboard: Gigabyte X38T-DQ6
Graphics Slot: PCI-Express x16
RAM: 4GB DDR3 (Patriot PDC32G 1600LLK)
Main Disc: 320GB Western Digital 7200rpm
Graphics Card: NVIDIA GeForce GTX 295 (ASUS ENGTX295/2DI/1792MD3)
Power Supply: Turbo-Cool 1 Kilowatt (1100 W Peak)

For a monitor I'm using a Samsung 40" TV running at 1920x1600, if that makes a difference, although I can't imagine how it would.

Temperature-wise it runs a little hot, but the monitors indicate that the GPU peaks at about 70-72C, which should be okay. (That's fighting a room full of Carnies with all my shaders and particles and what-not cranked to max, too.)

Any ideas as to where I should start looking to track this down? Thanks in advance.


@Glass Goblin - Writer, brainstormer, storyteller, hero

Though nothing will drive them away
We can beat them, just for one day
We can be heroes, just for one day

 

Posted

Cutting to a black screen and rebooting would normally suggest an overheating error, which is certainly possible given the hardware in the computer.

One of the big problems with software temperature monitors is that they are not accurate. There's a reason most sites that do "serious" thermal testing run a lead right onto a processor's heat sink. Even though your temp sensors in the computer may say everything's fine... it probably isn't. 70-72C is also really LOW for a GTX 295 that isn't being water-cooled. Okay, in all fairness, I've only ever actually worked on two GTX 295s in the entire time they've been out... but on those two I had temperature readings in the range of 170-185 F... and that's a bit above 72C.
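For anyone comparing the two scales, here is a quick sketch of the conversion. It is plain Python and not tied to any monitoring tool mentioned in this thread; the function name is just illustrative.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# The two ends of the range quoted above.
for deg_f in (170, 185):
    print(f"{deg_f} F = {fahrenheit_to_celsius(deg_f):.1f} C")
# prints:
# 170 F = 76.7 C
# 185 F = 85.0 C
```

So those readings work out to roughly 77-85C.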

Tracking down a thermal fail-over can be a bit hard to do though.

The easiest way is to force a downclock on all components. You should be able to set lower clock speeds for the 295 through the NVIDIA settings in the Windows Control Panel, and you should be able to downclock your processor in the BIOS.

If your system remains stable with lowered speeds, then there's your answer: you are reaching a thermal cut-out.

If the system still crashes with forced downclocks, we have other problems to think about.


 

Posted

Thanks, I'll try downclocking. After making that post I noticed an oddity about the way the builder set it up: the PSU's exhaust fan is right beside the GPU's intake fan. Not perhaps the way I would have structured it, myself.

I'll let you know if downclocking solves it. Thanks again!


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

Hmm, I'm less confident that it's a heat issue now. I logged into CoH from a cold boot, after the machine was powered down overnight. It blew up (i.e., instant power-interruption type reboot) before it even finished logging me into the game. After the reboot I was able to get in, fought a few mobs (on-line for maybe 15 minutes), and then it blew up on zoning via the train.

Could there be a power issue? I would think a 1000W PSU would be sufficient.


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

Well, 1000 watts doesn't mean anything if one of your capacitors has blown. You may want to physically check the motherboard, and look for things that look like little batteries. Make sure none of them have a domed top (they should be flat). If one or more of them DOES have a pushed-up cap, or worse, it's leaking electrolyte, you'll want a new mobo ASAP. Try not to use your computer at ALL; bad power going through your system can damage all your other components.


-STEELE =)


Allied to all sides so that no matter what, I'll come out on top!
Oh, and Crimson demands you play this arc-> Twisted Knives (MA Arc #397769)

 

Posted

Thanks for the input, Steele. I don't see anything looking funky on the mobo, but I'm going to tear the box down completely and look for anything else weird. During the most recent reset it was sitting idle without any running applications at all.

This is getting frustrating. I've been out of work for six months. I can barely afford to keep my account current. Building a new box will be beyond my means, and going back to playing on a laptop will su- will be less than optimal.


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

Quote:
Originally Posted by GlassGoblin
It's been a real workhorse, but I've started having occasional problems with it, usually (but not always) related to playing CoH. In about one out of four sessions the machine will cut straight to a black screen and reboot. I've checked pretty much everything I can think of, and can't determine the culprit. Any ideas?
Frankly, this sounds like a driver issue to me. Did you update your drivers at some point prior to this problem showing up? You're probably getting the reboot due to the default setting in Windows to auto-restart on a crash. Since you've had the computer for about two years, I assume you're running Windows XP, or possibly Vista. You'll want to go into the Control Panel, open System, go to the Advanced tab and find where it says "Startup and Recovery". Click the Settings button and uncheck the box under System Failure > Automatically restart. The next time it happens you will most likely get a blue screen with an error code and a listing of the drivers/files that caused the crash, but the computer won't automatically restart. From there you should be able to determine what drivers are causing the problem and update/roll back as needed.

Quote:
Originally Posted by GlassGoblin
For a monitor I'm using a Samsung 40" TV running at 1920x1600, if that makes a difference, although I can't imagine how it would.

Temperature-wise it runs a little hot, but the monitors indicate that the GPU peaks at about 70-72C, which should be okay. (That's fighting a room full of Carnies with all my shaders and particles and what-not cranked to max, too.)

Any ideas as to where I should start looking to track this down? Thanks in advance.
Couple other quick things: as said above, this kind of error/restart can sometimes be caused by operating temperatures, blown motherboard capacitors or lack of power. Aside from what I said above, you'll probably also want to look over the system and ensure all the system, processor, video card and power supply fans are working properly, and look over the motherboard capacitors to make sure they aren't popped (leaking, cracked, puffed up). If everything seems OK, then you're probably not overheating, since it's not a new system and you've been running for a while without problems. Your video card temps look to be in line with a normal system. Temps on those GTXs have been known to reach 100C under 100% load during stress tests, but normal operating temps should be around 70-80C, so you should be fine. The system sensors aren't always that accurate, but they're usually not far enough off to cause concern. At times the motherboard capacitors can be blown and not show noticeable signs, but if you don't see anything out of the ordinary it will probably be safe to assume that there's nothing wrong with them.


 

Posted

I updated the system from XP Pro to Win7 Home Premium, but it worked fine for a few months before the trouble started. I upgraded the video drivers at that time (but not to the card-killer driver they pulled back). The Windows crash analysis tool pointed out that there had been ~some~ driver failures, but upgrading to the latest-n-greatest doesn't seem to have changed anything.

I'm going to tear the box apart and look over all the components for signs of failure. I've already checked the fans and they all appear to be hunky-dory.

Thanks for the information!


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

I hope I can catch you before you take that step...not sure why no one has asked this yet.

Do you have it set to automatically restart on error? You should be seeing a blue screen of some type. Here's how you check:

Control Panel->System->Advanced System Settings->Advanced Tab->Settings button under 'Startup and Recovery'->Remove the checkmark beside "Automatically restart"

If you can see a blue screen, it might shed some light as to what's actually happening. If it just powers itself off or restarts without a blue screen and you have it set to not automatically restart, then we're looking for a hardware failure of some type in most cases.
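If you would rather confirm the current setting than click back through the dialogs, the checkbox is backed by the AutoReboot value under the CrashControl registry key. Below is a read-only sketch, assuming Python is installed on the Windows box. The function name is made up for illustration, and it simply returns None on a non-Windows system or if the value cannot be read.

```python
import sys

# Registry location behind the "Automatically restart" checkbox.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def auto_restart_enabled():
    """Return True/False for the AutoReboot setting, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # not Windows, nothing to check
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "AutoReboot")
    except OSError:
        return None  # key or value missing, or access denied
    return bool(value)


print("Automatically restart:", auto_restart_enabled())
```

This only reads the value and changes nothing, so it is safe to run before and after unchecking the box.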

Hope this helps.




We'll see....

 

Posted

You can also check the Event Viewer (Control Panel --> Administrative Tools --> Event Viewer). Error messages from past crashes ought to be recorded there as well which might point a bit to what is going on.


It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds. Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the Universe can be said to be zero. From this it follows that the population of the whole Universe is also zero, and that any people you may meet from time to time are merely the products of a deranged imagination.

 

Posted

I had previously looked at the logs. The last 58(!) critical events are all the same, varying only by timestamp:

Code:
Log Name:      System
Source:        Microsoft-Windows-Kernel-Power
Date:          3/16/2010 1:20:28 AM
Event ID:      41
Task Category: (63)
Level:         Critical
Keywords:      (2)
User:          SYSTEM
Computer:      Nyarlathotep
Description:
The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

I've turned off the auto-reboot. The fastest way to get it to blow up is to do something significant, like getting the Madame of Mystery down to a sliver of health when I'm out of inspirations. ~sigh~ I guess I'll be waking up in the hospital when I log back in.
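With 58 near-identical records, tallying them by source and ID is quicker than scrolling. Here is a rough sketch, assuming the events were copied out of Event Viewer as text in the same "Field: value" layout as the record in this post. The helper names parse_events and count_by_source_and_id are hypothetical, and the sample text is a trimmed copy of that record repeated three times.

```python
import re
from collections import Counter

# Trimmed copy of the record from the post, used only as sample input.
SAMPLE = """\
Log Name:      System
Source:        Microsoft-Windows-Kernel-Power
Date:          3/16/2010 1:20:28 AM
Event ID:      41
Level:         Critical
"""

FIELD = re.compile(r"^([A-Za-z ]+):\s+(.*)$")


def parse_events(text):
    """Split a text export into one dict per record; a 'Log Name' line starts a record."""
    events = []
    current = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # description text or blank line
        key, value = match.group(1).strip(), match.group(2).strip()
        if key == "Log Name":
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value
    return events


def count_by_source_and_id(events):
    """Count records per (Source, Event ID) pair."""
    return Counter((e.get("Source"), e.get("Event ID")) for e in events)


events = parse_events(SAMPLE * 3)
for (source, event_id), count in count_by_source_and_id(events).items():
    print(f"{source} / Event ID {event_id}: {count}")
```

If every record lands in the same bucket, as it does here, the log is only saying that the machine went down uncleanly, not why.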


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

A 1000W power supply is more than sufficient for that rig. That doesn't mean the PS isn't going bad on you, though.

Couple of things to check before you disassemble the whole machine.

1) Have you cleaned all your heatsinks/dusted the interior lately? Fuzz-bunnies and computers don't mix well.

2) You listed your GPU temps, how about your CPU though? If your CPU cooler has worked its way loose or the fans are having troubles, that could cause an intermittent heat-fail.

3) Make sure all your capacitors look OK (on both MB and Video card if you can see them).

4) Double check all of your connections between the power supply and the components. Be sure to check the MB main connector and all the fan pins to make sure none have come partially off.

5) Try running with the case open so you can easily check and make sure all fans are spinning and aren't clogged with gunk.

6) Check your CPU cooler and Power Supply as well to make sure they are clean and venting properly.

7) You can look at getting one of the various software monitors to track your utilization, heat levels, and voltages in real time.

8) If you're the electrical type, check the actual output of your PS to make sure it's operating within specs. If you're not, DON'T mess with it unless/until you learn how to do so correctly (without ruining your PS or yourself via electrocution).
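
For items 7 and 8, checking "within specs" amounts to comparing each rail against its nominal voltage. Here is a minimal sketch, assuming the commonly quoted ±5% tolerance for the +12V, +5V and +3.3V rails (check the power supply's own documentation for its actual limits). The function name and the readings are made up for illustration and are not measurements from this machine.

```python
# Nominal voltages for the main positive rails.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05  # assumed +/-5% band, see the note above


def out_of_spec(readings, tolerance=TOLERANCE):
    """Return the rails whose reading falls outside nominal +/- tolerance."""
    bad = {}
    for rail, nominal in NOMINAL_RAILS.items():
        if rail not in readings:
            continue  # no reading for this rail
        measured = readings[rail]
        if abs(measured - nominal) > nominal * tolerance:
            bad[rail] = measured
    return bad


# Hypothetical readings taken from a monitoring tool or a multimeter.
readings = {"+12V": 11.2, "+5V": 5.08, "+3.3V": 3.31}
print(out_of_spec(readings))  # prints {'+12V': 11.2}
```

Software readings come from the same motherboard sensors discussed earlier, so treat a single odd value as a reason to look closer and not as proof.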


6000+ levels gained and 8 level 50's
Hello, my name is Soulwind and I have Alt-Itis.

 

Posted

Did some searching on your error and while my search was hardly exhaustive, a common issue I did see was with the audio drivers causing this. In some cases, updating them fixed the issue, in others, they had multiple apparent audio drivers that were conflicting with each other. Disabling the extras seemed to fix the crashes.

In the first case, the common denominator seemed to be with Realtek audio and a search of your mobo indicates it has a Realtek audio chip. This may be your problem.

In the second, it seems at least some ATI cards have some sort of audio driver with them (not quite sure why) and disabling it fixes the conflict. As you have an Nvidia card, this may not be relevant. Not sure if Nvidias have such as well. In any case, you might look for additional audio type devices and try either updating or disabling them as well.

Keep in mind the above only *may* be your issue. It could easily be a HW problem of some sort but those seem to be common solutions that I found in short searching.

Interestingly, I run Win 7 and would technically fall into both categories (Realtek + ATI) but fortunately I'm not having your problem. (And I hope I never do.) Something to check out and see what you find.


It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds. Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the Universe can be said to be zero. From this it follows that the population of the whole Universe is also zero, and that any people you may meet from time to time are merely the products of a deranged imagination.

 

Posted

Quote:
Originally Posted by Soulwind
8) If you're the electrical type, check the actual output of your PS to make sure it's operating within specs. If you're not, DON'T mess with it unless/until you learn how to do so correctly (without ruining your PS or yourself via electrocution).
Yeah, I'm not going to be doing that. I don't need my own personal electric aura. ;-)

I've cleaned it out a couple of times, and will probably check the mobo for acne in a couple of days. I'm using cpu-z and gpu-z to keep tabs on the status, too.

While it may be a coincidence, when running for a bit today I noticed something odd. The Particle Physics Quality was set to "Very High (Not recommended without PhysX card)". I certainly don't recall making that change, and since I don't have a PhysX card I can't imagine why I would. At any rate, after changing it to High I played for a couple of hours without a problem, even during a fight with Babbage.

Any thoughts?


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

Quote:
Originally Posted by Back_Blast
In the second, it seems at least some ATI cards have some sort of audio driver with them (not quite sure why) and disabling it fixes the conflict. As you have an Nvidia card, this may not be relevant. Not sure if Nvidias have such as well. In any case, you might look for additional audio type devices and try either updating or disabling them as well.
I wonder if the embedded audio chips are for handling the HDMI output? Anyway, thanks for the suggestions. I'll look for more drivers for that.


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

Quote:
Originally Posted by GlassGoblin
Yeah, I'm not going to be doing that. I don't need my own personal electric aura. ;-)

I've cleaned it out a couple of times, and will probably check the mobo for acne in a couple of days. I'm using cpu-z and gpu-z to keep tabs on the status, too.

While it may be a coincidence, when running for a bit today I noticed something odd. The Particle Physics Quality was set to "Very High (Not recommended without PhysX card)". I certainly don't recall making that change, and since I don't have a PhysX card I can't imagine why I would. At any rate, after changing it to High I played for a couple of hours without a problem, even during a fight with Babbage.

Any thoughts?
Well, since you mentioned above that it has failed on you at least once while sitting idle, I'm not sure I'd call it fixed just by that. Besides, the PhysX (I think) only really comes into play when there's lots of misc. stuff about from things like exploding objects, Gravity folks using Propel, etc. So while it could certainly affect your gameplay, I tend to question it being the cause of your problem. I'd expect too-high PhysX to mainly drop your FPS, not crash your system.

Quote:
Originally Posted by GlassGoblin
I wonder if the embedded audio chips are for handling the HDMI output? Anyway, thanks for the suggestions. I'll look for more drivers for that.
Hmm. Yeah I'd tend to think you're right about that. Hadn't occurred to me previously but that does make sense. That likely is it.


It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds. Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the Universe can be said to be zero. From this it follows that the population of the whole Universe is also zero, and that any people you may meet from time to time are merely the products of a deranged imagination.

 

Posted

Well, last night I tore the rig down to loose components and rebuilt it from scratch. (It needed to be done, anyway. As I mentioned, a friend built it for me so I needed to get a better feel for how it was assembled.)

In checking around and moving components (like putting the graphics card at the bottom of the case for better air flow, and taking out the never-used BD-Data drive), I found that the SATA port running the C drive may have gone bad. No visible issue on the mobo, but I had noticed an intermittent failure on init for SATAII01. I moved the drive to a new port and it ran for hours (including a couple of hours playing a grav controller) without a reboot.

Was that the problem? I don't really know. I ended up changing so many things it would be hard to be sure. But I wouldn't have even started looking in that direction if not for the assistance you people provided. Thank you all very much!

Now let's just hope it keeps working...


@Glass Goblin - Writer, brainstormer, storyteller, hero


 

Posted

Good to see you seem to have fixed it. Here's hoping you truly have.


It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds. Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the Universe can be said to be zero. From this it follows that the population of the whole Universe is also zero, and that any people you may meet from time to time are merely the products of a deranged imagination.