Laptop shuts down randomly when temperature is low

1

Basic info about my laptop: 3.5-year-old Samsung RF411, Win7, no hardware changes since purchase. The Windows disk and memory checks are all OK, and it passed the memtest86+ test.
Recently, it has started to randomly shut down, show a blue screen, or show a blurry screen and die, all without any warning. Sometimes it shuts down an unpredictable time after Win7 has loaded (a few minutes to a few hours).

(1) Not an overheating problem.
I have cleaned the fan and replaced the thermal grease, but that did not help, and the temperature is quite low just before each shutdown.
For example, the CPU temperature is <60C (the official T_junction is 100C) and the graphics card (GeForce 525M) is <50C. Moreover, when playing games, the CPU easily heats up to around 90C without a crash.
So I conclude it's not a temperature problem.

(2) Seems to be a hardware problem
Cleaning my two 4GB RAM modules does not seem to solve the problem, but it SEEMS my laptop can run longer before a crash.
After cleaning, it shut down or got a blurry screen a few times even in the BIOS (I've never updated the BIOS).

Has anyone encountered similar problems? What else can I check?

Edit 1 Another possible issue: the output of my power adapter is 19V 3.16A (about 60W), which in my opinion is a bit lower than the peak power consumption of my laptop (CPU ~40W, GPU ~20W?, plus 2x RAM, 1 HDD, 1 fan, Wi-Fi module, etc.). However, I've been using this adapter without weird shutdowns since purchase, though the laptop frequently draws on the battery when the task is CPU/GPU intensive.
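
To make the adapter arithmetic explicit, here is a rough sketch of the power budget. The adapter rating and the CPU/GPU figures are the estimates from the edit above; the "other" figure for RAM, HDD, fan and Wi-Fi is a hypothetical placeholder, not a measured value.

```python
# Adapter rating quoted in the question: 19 V at 3.16 A.
ADAPTER_VOLTS = 19.0
ADAPTER_AMPS = 3.16

# Rough peak draws in watts. "cpu" and "gpu" are the question's own guesses;
# "other" is a hypothetical placeholder for RAM, HDD, fan, Wi-Fi, etc.
estimated_peak_watts = {"cpu": 40.0, "gpu": 20.0, "other": 10.0}


def adapter_watts(volts: float, amps: float) -> float:
    """Power the adapter can deliver: P = V * I."""
    return volts * amps


def headroom_watts(supply: float, loads: dict) -> float:
    """Supply minus the sum of all estimated loads (negative means a deficit)."""
    return supply - sum(loads.values())


supply = adapter_watts(ADAPTER_VOLTS, ADAPTER_AMPS)
print(f"adapter output: {supply:.2f} W")
print(f"headroom at estimated peak: {headroom_watts(supply, estimated_peak_watts):.2f} W")
```

With these guessed numbers the headroom comes out negative, which would be consistent with the laptop dipping into the battery under heavy load. It says nothing by itself about whether the adapter is faulty.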

Edit 2 One of the BSODs says:
A problem has been detected and windows has been shut down to prevent damage to your computer.
SYSTEM_SERVICE_EXCEPTION.
If this is the first time you've seen this stop error screen, restart your computer. If this screen appears again, follow these steps:
Check to make sure any new hardware or software is properly installed. If this is a new installation, ask your software or hardware manufacturer for any windows update you might need.
If problems continue, disable or remove any newly installed hardware or software. Disable BIOS memory options such as caching or shadowing. If you need to use safe mode to remove or disable components, restart your computer. Press F8 to start advanced startup options, and then select safe mode.
Technical information:
** stop: 0x0000003B (0x00000000C0000005, 0xFFFFF96000122ED1, 0xFFFFF880099EC6F0, 0x0000000000000000)
** win32k.sys. - Address FFFFF96000122ED1 base at FFFFF96000050000 DateStamp 54ee9222.
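
For reference, the numbers in the stop line can be unpacked with a few lines of Python. This is only a sketch: the values are copied from the screen above, the two lookup tables hold just the codes seen here, and reading the DateStamp as seconds since the Unix epoch is my assumption based on how PE image timestamps are usually stored.

```python
from datetime import datetime, timezone

# Values copied from the stop screen above.
STOP_CODE = 0x0000003B
PARAMS = (0x00000000C0000005, 0xFFFFF96000122ED1, 0xFFFFF880099EC6F0, 0x0)
MODULE_BASE = 0xFFFFF96000050000   # "base at" value for win32k.sys
DATE_STAMP = 0x54EE9222            # "DateStamp" value for win32k.sys

# Tiny lookup tables containing only the codes that appear on this screen.
STOP_NAMES = {0x3B: "SYSTEM_SERVICE_EXCEPTION"}
NTSTATUS_NAMES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def module_offset(address: int, base: int) -> int:
    """Offset of the faulting address inside the loaded module."""
    return address - base


def build_time(stamp: int) -> datetime:
    """Interpret the stamp as seconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


print("stop code:     ", STOP_NAMES.get(STOP_CODE, hex(STOP_CODE)))
print("exception:     ", NTSTATUS_NAMES.get(PARAMS[0], hex(PARAMS[0])))
print("fault offset:  ", "win32k.sys+" + hex(module_offset(PARAMS[1], MODULE_BASE)))
print("driver built:  ", build_time(DATE_STAMP).date())
```

The first parameter is an access violation raised inside win32k.sys. That only shows where the fault surfaced; bad RAM, a bad driver or a failing GPU could all end up there.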

Edit 3 If the blurry screen in the BIOS means a graphics card failure, I wonder which graphics card it could be. My laptop is equipped with a discrete NVIDIA GT 525M card and the Intel HD3000 integrated in the CPU. I suspect only the integrated GPU is used when booted into the BIOS. If so, could it be a problem with my RAM?

The Solution:
It seems the problem is due to one of my RAM modules, although both passed the memtest86 check. Probably the module causing the shutdowns becomes unstable as it heats up after working for a certain time. The problem can be solved by removing the defective module. Well, resolving the problem of the RAM seems not so urgent and should be too challenging for me though.
I really appreciate all the suggestions in the replies and comments, which all helped me to find the solution. Sorry that I can only choose the closest one as the best answer. Again, thanks for all your help!

laptop
cpu
shutdown
temperature
asked on Super User Apr 14, 2015 by D-K • edited Apr 23, 2015 by D-K

2 Answers

0

From experience, I'd recommend you get or make a live Linux USB stick. I recommend Fedora XFCE since it's lightweight, stable and easy to use, and I've been using it for quite a few years. Back to the point: you can install a few useful utilities that help you diagnose many hardware or software issues and tell whether the problem lies with your OS or your hardware. These are the ones I most commonly use:

  • smartctl from the smartmontools package: a utility that lets you run SMART tests on your hard drives and collect SMART data from them. Very useful for telling whether your hard drive is faulty or healthy.
  • lm_sensors: this utility lets you monitor the temperature of various hardware in your computer, from ACPI sensors to individual CPU cores.
  • stress: this one lets you stress several components of your machine, individually or many at a time, including CPU, RAM I/O and HDD I/O. It comes in handy when run together with sensors: by stressing the CPU you can keep all cores at 100% load to find out whether they are indeed overheating.
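
As a starting point, the three tools above are typically invoked along these lines. The sketch below only builds and prints the command lines; the device path `/dev/sda`, the worker count and the duration are assumptions to adapt to your machine, and smartctl usually needs root.

```python
import shlex

# Assumptions: first disk is /dev/sda, four CPU workers, 60-second run.
DEVICE = "/dev/sda"
CPU_WORKERS = 4
SECONDS = 60


def diagnostic_commands(device: str, cpu_workers: int, seconds: int) -> list:
    """Return the command lines as argument lists, in the order to run them."""
    return [
        ["smartctl", "-H", device],            # overall SMART health verdict
        ["smartctl", "-t", "short", device],   # start a short SMART self-test
        ["sensors"],                           # print current temperature readings
        ["stress", "--cpu", str(cpu_workers), "--timeout", str(seconds)],
    ]


for cmd in diagnostic_commands(DEVICE, CPU_WORKERS, SECONDS):
    print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run` as-is, or the printed lines can be typed into a terminal. Running `sensors` in a second terminal while `stress` is active shows how the temperatures react to full load.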

Best of all, if you can keep using your computer normally from the live media (e.g. browsing the web) without those issues and without it shutting down (of course, on Linux you would get a kernel panic rather than a BSOD), you can be reasonably sure the problem is your current Windows installation and not your hardware, so you would simply need to format and do a clean install. In that case you can also:

  • Clone the entire hard drive to another media.
  • Mount the individual partitions to backup specific data.

You can do both while on the live media, so you can then safely wipe the hard drive and let Windows do a true clean install without losing any data in the process. As a technician, I can only say that a Linux live media is a marvelous tool worth keeping around on a USB stick just in case; it has saved me many times and let me diagnose and repair many other machines. Good luck.

Edit:

Another good strategy to diagnose your computer is to take it apart completely and add components back until you hit the problem, i.e.:

  1. Start off with only the CPU and a single RAM module and see if you can get to the BIOS and stay there for a while. If it fails at this point, check whether the other modules work; it could be a faulty RAM module or even a bad motherboard.
  2. If it works fine, keep adding components in the following order until you find the one that's causing you trouble:

    • RAM
    • PCI cards
    • USB peripherals (keyboard, mouse, WiFi dongles, etc.)
    • Additional power-consuming accessories like fans or LED lights

    Each one at a time.
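
    The procedure above boils down to a simple loop. Here is a minimal sketch of it; `is_stable` is a hypothetical stand-in for "boot with these parts and watch for a while", and the simulated fault below is made up purely to show the flow.

```python
from typing import Callable, List, Optional


def first_failing_step(
    base: List[str],
    additions: List[str],
    is_stable: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add parts one at a time and return the first one that breaks stability.

    Returns "base configuration" if even the minimal setup fails,
    or None if everything stays stable with all parts installed.
    """
    installed = list(base)
    if not is_stable(installed):
        return "base configuration"
    for part in additions:
        installed.append(part)
        if not is_stable(installed):
            return part
    return None


# Hypothetical example: pretend the second RAM module is the culprit.
base_parts = ["CPU", "RAM module 1"]
extra_parts = ["RAM module 2", "PCI cards", "USB peripherals", "accessories"]


def simulated_check(parts: List[str]) -> bool:
    """Stand-in for a manual test; a real check means booting and waiting."""
    return "RAM module 2" not in parts


print(first_failing_step(base_parts, extra_parts, simulated_check))
```

    With an intermittent fault like the one in the question, each check needs to run long enough for the crash to have a fair chance to appear.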

answered on Super User Apr 15, 2015 by arielnmz • edited Apr 15, 2015 by arielnmz
0
  1. Your tests and conclusion on overheating seem sound to me, i.e. it's probably not an overheating issue.

  2. Issues in the BIOS are indeed OS-independent. You may try to update your BIOS just in case, but be careful: you don't want a crash in the middle of the BIOS update process.

  3. I would not worry about your AC adapter, especially if it's an original one. Issues with adapters usually start with troubles charging your battery or running with battery unplugged.

I suggest you keep only one RAM module (or better yet, borrow or buy a module that certainly works) and run your system with that to see if crashes still happen. It may also be a RAM slot issue, like pins losing contact at a certain temperature, so try plugging your RAM module into each slot and retest. The best would be to run a memory test when the blurring appears, if a system crash is not imminent.
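
To keep track of the module/slot tests, something like the following sketch may help. It lists every single-module configuration and then flags a module that crashes in every slot, or a slot in which every module crashes. The module and slot names and the recorded results are hypothetical.

```python
from itertools import product


def single_module_configs(modules, slots):
    """Every way to run the laptop with exactly one module in one slot."""
    return list(product(modules, slots))


def suspects(results):
    """results maps (module, slot) -> True if that configuration crashed.

    A module is suspect if it crashed in every slot it was tried in;
    a slot is suspect if every module tried in it crashed.
    """
    modules = sorted({m for m, _ in results})
    slots = sorted({s for _, s in results})
    bad_modules = [m for m in modules
                   if all(c for (mm, _), c in results.items() if mm == m)]
    bad_slots = [s for s in slots
                 if all(c for (_, ss), c in results.items() if ss == s)]
    return bad_modules, bad_slots


configs = single_module_configs(["module A", "module B"], ["slot 1", "slot 2"])
print(configs)

# Hypothetical outcome: module A crashes wherever it is installed.
observed = {
    ("module A", "slot 1"): True,
    ("module A", "slot 2"): True,
    ("module B", "slot 1"): False,
    ("module B", "slot 2"): False,
}
print(suspects(observed))
```

If every configuration crashes, both lists fill up and the test cannot separate modules from slots; that would point toward something else, such as the motherboard.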

To clean a RAM slot, wrap a small piece of paper over the tip of a flat screwdriver, and gently push the screwdriver in and out of the slot, cleaning 3-5 pins at a time. Repeat for all pins. Don't use a screwdriver which is too thick: you don't want to deform the slot. Don't swipe over all pins with it either. That way you may clean your slot much faster, or you may deform or dislodge a couple of pins and render that slot unusable.

answered on Super User Apr 15, 2015 by Dmitry Grigoryev • edited Apr 23, 2015 by Dmitry Grigoryev

User contributions licensed under CC BY-SA 3.0