Geekzone: technology news, blogs, forums
DravidDavid
1907 posts

Uber Geek
+1 received by user: 305


  #1273400 30-Mar-2015 13:29

timmmay:
Did you mean to say it's a driver issue or a RAM issue?

I did test this when I built the machine a few years ago, and recently. Apparently memtest x86 sucks and can't detect memory errors even when run for 24 hours +.

I will likely build a new machine either in the next couple of weeks if it can't be sorted, or later this year if it works ok on 8GB for now.


What I found weird was that it worked fine with other operating systems. That suggests a driver compatibility issue with Windows 7/10, and perhaps also that HCI is telling you porkies. I can't otherwise account for it, as I've never heard of Memtest 86 (there is a newer Memtest 86+ you might want to try if you aren't already using it) failing to pick up errors serious enough to interrupt day-to-day use so dramatically. I'm not sure of the reputation of the diagnostic tool you used; I've certainly never heard of it. Could it have been a false report or similar?

If I was testing the memory, I'd be doing it one unit at a time in each channel to make sure.  It's the only way to reliably test memory if an error is found across a whole bank.  It takes days though.

I recently suffered a power event which destroyed my ethernet port, every PCI-E expansion slot except the 16X, and channel two on my motherboard. That forced me to drop down to 8GB of RAM, with the other 8GB going into a new machine which is now my partner's. I tested all the memory in one slot and, when no errors occurred, tested it all in the next slot, continuing until I heard the dreaded beep code.

Not saying you have to do that in this case, but it could still be a driver/RAM/board issue at this stage if the memory has only been tested together, or separately in a single slot on the same channel.



timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1273444 30-Mar-2015 14:31

Yes, that it works with other OSes is strange. It could be a driver issue, or it could be that it uses memory in a different way. True, HCI could be wrong as well. I did use Memtest x86+, which is not maintained any more; Memtest x86 is maintained by Passmark.

I did test each memory chip individually, it took me all day on Saturday. Each worked perfectly. It works perfectly in pairs too - any two RAM chips in slots 1 & 3 never give me an error. It could be that memory slots 2 & 4 are faulty, but given this is a brand new motherboard and the last motherboard had the same problem that seems exceedingly unlikely. Because it works with any two chips I think the chances are very strong that it's a RAM problem.

I don't have time to test each DIMM in each slot - that would take around 4 days of active swapping, testing, etc. If that was required I'd just toss the lot out and buy a new one. I'll do a little more targeted testing on slots 2 & 4, and so long as it's stable with two DIMMs I may accept that as good enough for now.
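To put a rough number on that estimate, here is a minimal sketch that enumerates every single-DIMM, single-slot test on a four-slot board. The DIMM labels and the 6 hours per test are assumptions for illustration only (the thread gives no per-test figure); 6 hours was picked because it lands on the "around 4 days" mentioned above.

```python
from itertools import product

# Hypothetical labels for the four DIMMs and the four motherboard slots.
DIMMS = ["A", "B", "C", "D"]
SLOTS = [1, 2, 3, 4]

# Assumption: one memory test pass of a single DIMM takes about 6 hours.
HOURS_PER_TEST = 6


def single_dimm_plan(dimms, slots):
    """Return every (dimm, slot) pair, i.e. each DIMM tested alone in each slot."""
    return list(product(dimms, slots))


def estimate_days(n_tests, hours_per_test):
    """Total back-to-back testing time in days, ignoring swap time."""
    return n_tests * hours_per_test / 24


plan = single_dimm_plan(DIMMS, SLOTS)
print(len(plan), "tests, about", estimate_days(len(plan), HOURS_PER_TEST), "days")
```

Swapping sticks between runs adds to this, so the figure is a lower bound under the stated assumption.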

When I replace this machine it'll be with two 8GB DIMMS, not 4x4GB.

DravidDavid
1907 posts

Uber Geek
+1 received by user: 305


  #1273464 30-Mar-2015 15:20

timmmay: Yes, that it works with other OSes is strange. It could be a driver issue, or it could be that it uses memory in a different way. True, HCI could be wrong as well. I did use Memtest x86+, which is not maintained any more; Memtest x86 is maintained by Passmark.

Interesting. I didn't know it was the other way around. I always thought it was x86 that was dropped and 86+ was the continued version. It is likely Linux is using the memory in a different way, but I can't comment on how different, as my knowledge fades away when Linux gets involved. I'll have to look into it as it may be useful in future.

timmmay: I did test each memory chip individually, it took me all day on Saturday. Each worked perfectly. It works perfectly in pairs too - any two RAM chips in slots 1 & 3 never give me an error. It could be that memory slots 2 & 4 are faulty, but given this is a brand new motherboard and the last motherboard had the same problem that seems exceedingly unlikely. Because it works with any two chips I think the chances are very strong that it's a RAM problem.

Strange.  I've never had a case where all 4 chips exhibited the same fault.  It's always been a problem DIMM or in my case, problem channel.  HCI is the only reason I'm hesitant to pin it specifically on drivers.  But if the memory was at fault you would see the same results using Windows XP as you did using Windows 7 and 10, would you not?  Or is the art of transferring files different in Windows 7+?

timmmay: I don't have time to test each DIMM in each slot - that would take around 4 days of active swapping, testing, etc. If that was required I'd just toss the lot out and buy a new one. I'll do a little more targeted testing on slots 2 & 4, and so long as it's stable with two DIMMs I may accept that as good enough for now.

I'm not flush enough to bin a build and start over, haha.  Testing for me was my only option, but yes it is a lot of work.  It sounds as if you've pretty much done what I'd do anyway.

timmmay: When I replace this machine it'll be with two 8GB DIMMS, not 4x4GB.

That's the best idea.  You leave room for expansion that way too.



timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1273472 30-Mar-2015 15:30

It's not that all 4 DIMMs exhibit the same fault, it's that when tested individually or in pairs they all work perfectly, but when all four are used together I get a fault. I will try a pair in the second channel, see if it makes any difference.

Windows XP could use memory quite differently from W7/W10, in terms of copying. Could be exactly the same too, I have no idea.

I can sell my two motherboards each with 8GB RAM and one with an i7 processor if I make a new machine. I'd be comfortable using either as a primary machine with only 8GB RAM. That will reduce the cost, but it'd be around NZ$1200 for Xeon E3, 16GB ECC RAM, and a SuperMicro motherboard shipped from newegg. In NZ it'd cost a lot more.

timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1273504 30-Mar-2015 16:05

Interesting... I'm testing all four sticks of RAM at 1.58V today and I've had no errors in 10 hours of testing. That's the first time it's run stable with all four RAM sticks in it; even 1.56V didn't do it, and at 1.5V it was even less stable. The inner RAM is reaching 60 degrees, up from 50 degrees at 1.5V. The spec says around 80 degrees so it's no huge concern, though I may add RAM cooling if the temps stay this high. Testing will continue.

Curious.

networkn
Networkn
32871 posts

Uber Geek
+1 received by user: 15469

ID Verified
Trusted
Lifetime subscriber

  #1273524 30-Mar-2015 16:32

I've been building computers and involved with computers for more than 20 years, and I've never had to stabilize a non-overclocked machine with more voltage for memory. If you are in that zone, I'd just start replacing hardware, because further issues are just a matter of time. Also, it's been at least 10 years since drivers on a modern machine have been responsible for any form of data corruption, so again I'd be well into hardware replacement. (The last time I saw driver issues like that was when AMD were a force to be reckoned with and motherboards with non-Intel chipsets were popular.)


 
 
 

timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1273543 30-Mar-2015 17:08

networkn: I've been building computers and involved with computers for more than 20 years, and I've never had to stabilize a non-overclocked machine with more voltage for memory. If you are in that zone, I'd just start replacing hardware, because further issues are just a matter of time. Also, it's been at least 10 years since drivers on a modern machine have been responsible for any form of data corruption, so again I'd be well into hardware replacement. (The last time I saw driver issues like that was when AMD were a force to be reckoned with and motherboards with non-Intel chipsets were popular.)


You're probably right. Frustrating problem though right? Especially with it working fine with any two sticks of RAM. Having to increase the voltage to the RAM was suggested by the makers of the RAM.

I'll probably never 100% trust the machine again, which means replacement is probably best for my peace of mind and stress levels, which are both more important to me than a few dollars.

freitasm
BDFL - Memuneh
80658 posts

Uber Geek
+1 received by user: 41072

Administrator
ID Verified
Trusted
Geekzone
Lifetime subscriber

  #1273558 30-Mar-2015 17:30

Have you looked at PSU replacement to test?





 


freitasm
BDFL - Memuneh
80658 posts

Uber Geek
+1 received by user: 41072

Administrator
ID Verified
Trusted
Geekzone
Lifetime subscriber

  #1273559 30-Mar-2015 17:31

Or just a different filter/circuit breaker? Tried on another power socket or different house?





 


timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1273616 30-Mar-2015 18:14

I have considered it, but I don't have a spare, and I'm not spending any money on this PC. Given that it runs 100% fine with any two DIMMs in any two slots (as far as I can tell) and only fails with all four DIMMs in, I doubt it's the PSU. I could borrow one; someone offered me an old dunger earlier, but I don't think it's likely enough to be worth the time spent driving to Lower Hutt. I could get one from a friend's place locally but again, I'm not sure it's worth the bother.

timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1279904 9-Apr-2015 18:23

Corsair replaced the RAM, which is a bit over two years old, with new RAM that has the same model number. Great service from them, I have to say. A couple of hours of the testing that easily reproduced the problem with the old RAM has found zero errors with the new RAM. So my conclusion is it's the RAM.

I find it very strange that the old RAM works in pairs but not as two pairs, and that memtest x86 doesn't find any problems whether testing individually, in pairs, or all four together. HCI Memtest did find errors, but only at 300%, i.e. after testing the memory three times. The way I found the errors was to use H2testw and TeraCopy in verify mode.
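For anyone wanting to script a similar check, below is a minimal sketch of the general write, copy, and verify idea. It is not how H2testw or TeraCopy are implemented; the function names, file names, sizes, and round count are all hypothetical choices for illustration. It writes random data, copies it repeatedly, and compares SHA-256 hashes of source and copy.

```python
import hashlib
import os
import shutil
import tempfile

CHUNK = 1024 * 1024  # read and write in 1 MiB blocks


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then return True if both files hash identically."""
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)


def stress(workdir, size_mib=16, rounds=3):
    """Write size_mib of random data, copy it `rounds` times, count mismatches."""
    src = os.path.join(workdir, "source.bin")
    with open(src, "wb") as f:
        for _ in range(size_mib):
            f.write(os.urandom(CHUNK))
    failures = 0
    for i in range(rounds):
        dst = os.path.join(workdir, f"copy_{i}.bin")
        if not copy_and_verify(src, dst):
            failures += 1
        os.remove(dst)  # free the space before the next round
    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print("mismatched copies:", stress(workdir))
```

A real run would presumably want far larger files and many more rounds, since the fault described here only showed up after prolonged testing. A mismatch also does not say which component is at fault, only that data changed somewhere between write and read.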

So the moral of the story is: if you get data corruption, replace the RAM, even if it tests fine.

 
 
 
 

DravidDavid
1907 posts

Uber Geek
+1 received by user: 305


  #1279983 9-Apr-2015 20:12

Great outcome all things considered.

Too bad it took hours of testing and most likely, a few grey hairs.

networkn
Networkn
32871 posts

Uber Geek
+1 received by user: 15469

ID Verified
Trusted
Lifetime subscriber

  #1280000 9-Apr-2015 20:42

Well I'd have replaced the PSU then the memory, or possibly the other way around, but good outcome for you. I knew it wasn't a software problem though.


timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1280012 9-Apr-2015 20:51

DravidDavid: Great outcome all things considered.

Too bad it took hours of testing and most likely, a few grey hairs.


Hours? More like weeks, over the space of 3 months intensively plus a bit of wondering beforehand.

Super strange problem though. And as far as I know there's no software to burn in a PC to find things like this.
