Discussion:
Computer issues with Linux
Brian May
2014-08-23 00:13:12 UTC
Permalink
Hello,

Am having two strange problems that are starting to irk me with this
(relatively new) computer, running Debian wheezy:


1. If I boot any kernel later than 3.12, I don't get any display: the
monitors stay black, with no cursor of any sort. Changing to a
virtual console doesn't help. Booting in rescue mode doesn't help (I think
this rules out X-Windows being a problem). If I go back to the 3.12 kernel,
everything works perfectly. I also tried plugging monitors into alternative
ports in case the output was going to the wrong place, but got nothing. In any
case, under 3.12 the computer seems pretty good at automatically working
out which ports are active under X-Windows.

01:00.0 VGA compatible controller: NVIDIA Corporation G96 [GeForce 9500 GT] (rev a1)

Am currently using the non-free nvidia drivers. Had exactly the same
symptoms when I installed the latest kernel without the non-free nvidia
kernel modules. I think the problem is occurring before X starts.

Computer seems to be up and running, and responsive to Ctrl+Alt+Del despite
not having a display.


2. There seems to be some weird performance problem. E.g. saving a 2 kilobyte
file in vim can completely freeze the computer (all other windows,
including xterms, stop responding to user input) for, say, 30 seconds while
it is writing that file. Chromium takes ages to load with several tabs, and
pages can fail to start properly while it is doing so.

Computer has 16GB RAM and is not using any swap. It has 11GB of
buffer/cache space:

             total       used       free     shared    buffers     cached
Mem:           15G        13G       2.1G         0B       1.1G       7.9G
-/+ buffers/cache:       4.6G        11G
Swap:         3.8G         0B       3.8G
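
For reference, the "-/+ buffers/cache" row above is the used figure with buffers and page cache subtracted. A minimal Python sketch of the same arithmetic from /proc/meminfo follows; the function names are this note's own, and it assumes the usual "Name: value kB" layout of that file.

```python
def parse_meminfo(text):
    """Return the fields of a /proc/meminfo dump as a dict of integers (KiB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def used_without_cache(info):
    """Memory held by programs, excluding buffers and page cache (KiB)."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]
```

To try it on a live system, read /proc/meminfo into a string, pass it to parse_meminfo(), and divide the result of used_without_cache() by 1024**2 to get GiB.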

Problems occurred before starting chromium; previously I wondered if it was
chromium's fault.

This is moving disk + RAID1 + LVM + ext4. Bonnie++ results seem to be
pretty good, better in fact than my work computer, which doesn't suffer
from similar problems.

Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
falidae      31904M  1228  94 107313  4 47471   2 +++++ +++ 137886  3 441.8   2
Latency             19141us   12435ms     251ms   19913us   88073us     283ms
Version  1.96       ------Sequential Create------ --------Random Create--------
falidae             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency                36us     223us     226us      36us      10us      24us
1.96,1.96,falidae,1,1404290654,31904M,,1228,94,107313,4,47471,2,+++++,+++,137886,3,441.8,2,16,,,,,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,19141us,12435ms,251ms,19913us,88073us,283ms,36us,223us,226us,36us,10us,24us
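
The throughput figures look healthy, but the latency rows are easy to overlook. A minimal Python sketch that pulls the latencies out of the CSV line above and converts them to seconds follows. It assumes the last twelve CSV fields are the latencies in the same order as the table (which matches the values posted here); the labels are this note's own names, not bonnie++ identifiers.

```python
import csv

# The CSV line posted above, copied verbatim.
BONNIE_CSV = (
    "1.96,1.96,falidae,1,1404290654,31904M,,1228,94,107313,4,47471,2,"
    "+++++,+++,137886,3,441.8,2,16,,,,,+++++,+++,+++++,+++,+++++,+++,"
    "+++++,+++,+++++,+++,+++++,+++,19141us,12435ms,251ms,19913us,"
    "88073us,283ms,36us,223us,226us,36us,10us,24us"
)

# Illustrative labels, matched by position against the human-readable table.
LATENCY_LABELS = [
    "seq_out_per_chr", "seq_out_block", "seq_out_rewrite",
    "seq_in_per_chr", "seq_in_block", "random_seeks",
    "seq_create", "seq_read", "seq_delete",
    "rand_create", "rand_read", "rand_delete",
]

DIVISORS = {"us": 1000000.0, "ms": 1000.0, "s": 1.0}


def to_seconds(text):
    """Convert a latency such as '12435ms' or '36us' to seconds."""
    for unit in ("us", "ms", "s"):  # test 'us' and 'ms' before plain 's'
        if text.endswith(unit):
            return float(text[:-len(unit)]) / DIVISORS[unit]
    raise ValueError("unrecognised latency: %r" % text)


def latencies(csv_line):
    """Return {label: seconds} for the twelve trailing latency fields."""
    fields = next(csv.reader([csv_line]))
    return dict(zip(LATENCY_LABELS, (to_seconds(f) for f in fields[-12:])))


if __name__ == "__main__":
    ranked = sorted(latencies(BONNIE_CSV).items(), key=lambda item: -item[1])
    for label, seconds in ranked:
        print("%-16s %10.6f s" % (label, seconds))
```

On the posted line the largest value is the sequential block write latency, 12435ms, i.e. roughly 12.4 seconds for a single operation.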


I am currently working on a new theory that the performance problems only
occur when the computer is cold and first turned on. I think I have seen
evidence to disprove this, but guess I should run bonnie++ as soon as I
turn the computer on, just to be sure.


Any other ideas?

Thanks
--
Brian May <***@microcomaustralia.com.au>
Chris Samuel
2014-08-23 00:51:09 UTC
Permalink
Post by Brian May
1. If I boot any kernel later than 3.12, I don't get any display.
What happens if you boot an Ubuntu or Fedora live CD on it?
Does it do the same?

cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Brian May
2014-08-24 01:10:01 UTC
Permalink
Post by Chris Samuel
Post by Brian May
1. If I boot any kernel later than 3.12, I don't get any display.
What happens if you boot an Ubuntu or Fedora live CD on it?
Does it do the same?
ubuntu-14.04.1-desktop-amd64.iso, just downloaded today, works fine.
--
Brian May <***@microcomaustralia.com.au>
Robert Moonen
2014-08-23 07:52:24 UTC
Permalink
Post by Brian May
Hello,
Am having two strange problems that are starting to irk me with this
1. If I boot any kernel later than 3.12, I don't get any display.
As in the monitors display black. No cursors of any sort. Changing to
a virtual console doesn't help. Booting in rescue mode doesn't help (I
think this rules out X-Windows being a problem). If I go back to the
3.12 kernel, everything works perfectly. I also tried plugging
monitors into alternative ports just in case it is going to the wrong
place, but get nothing - in any case, under 3.12 the computer seems
pretty good at automatically working out what ports are active under
X-Windows.
01:00.0 VGA compatible controller: NVIDIA Corporation G96 [GeForce
9500 GT] (rev a1)
Am currently using the non-free nvidia drivers. Had exactly the same
symptoms when I installed the latest kernel without the non-free
nvidia kernel modules. I think the problem is occurring before X starts.
Best to solve one problem at a time, so video first. By the non-free
drivers, do you mean this one:
http://www.nvidia.com/object/linux-display-amd64-319.49-driver ? That
seems to be the latest one. I am presently running Debian Wheezy on an
i7 3820 with 8GB on a Gigabyte mobo with a GT 520 video card, and it is
running just fine with kernel 3.2 and the actual nvidia driver. The
driver I linked to covers both cards, so it would be a good place to
start.
As for the other problems you experience, this seems to be a
contention/timeout problem, as no relatively new machine will ever take
30s to write a 2 kilobyte file.

cheers

Robert

_______________________________________________
luv-main mailing list
http://lists.luv.asn.au/listinfo/luv-main
Daniel J Jitnah
2014-08-23 08:46:37 UTC
Permalink
Post by Robert Moonen
Post by Brian May
Hello,
Am having two strange problems that are starting to irk me with this
1. If I boot any kernel later than 3.12, I don't get any
display. As in the monitors display black. No cursors of any sort.
Changing to a virtual console doesn't help. Booting in rescue mode
doesn't help (I think this rules out X-Windows being a problem). If I
go back to the 3.12 kernel, everything works perfectly. I also tried
plugging monitors into alternative ports just in case it is going to
the wrong place, but get nothing - in any case, under 3.12 the
computer seems pretty good at automatically working out what ports
are active under X-Windows.
01:00.0 VGA compatible controller: NVIDIA Corporation G96 [GeForce
9500 GT] (rev a1)
Am currently using the non-free nvidia drivers. Had exactly the same
symptoms when I installed the latest kernel without the non-free
nvidia kernel modules. I think the problem is occurring before X starts.
I had a similar problem with a rather old and *different* Nvidia card
to yours and a late kernel (I think it's 3.13 onwards?). As I understand
it, the changes in the latest kernel break the Nvidia driver, and for my
card at least, Nvidia will not update their driver, as my card is quite
old. Not sure, but could it be that yours is similarly affected? There are
patches you supposedly can apply to the kernel source, but that's ugly,
and I tried that without success. :( I did not struggle a lot with
it and gave up. (That is the reason I have not upgraded to the latest
kernel.) Wheezy here as well. In fact it appeared for me in Jessie and
Ubuntu 14.04.

Cheers
Daniel
Roger
2014-08-23 11:08:48 UTC
Permalink
Post by Daniel J Jitnah
Post by Brian May
Hello,
Am having two strange problems that are starting to irk me with
1. If I boot any kernel later than 3.12, I don't get any
display. As in the monitors display black. No cursors of any sort.
Changing to a virtual console doesn't help. Booting in rescue mode
doesn't help (I think this rules out X-Windows being a problem). If
I go back to the 3.12 kernel, everything works perfectly. I also
tried plugging monitors into alternative ports just in case it is
going to the wrong place, but get nothing - in any case, under 3.12
the computer seems pretty good at automatically working out what
ports are active under X-Windows.
01:00.0 VGA compatible controller: NVIDIA Corporation G96 [GeForce
9500 GT] (rev a1)
Am currently using the non-free nvidia drivers. Had exactly the same
symptoms when I installed the latest kernel without the non-free
nvidia kernel modules. I think the problem is occurring before X starts.
I had a similar problem with a rather old and *different* Nvidia card
to yours and a late kernel (I think it's 3.13 onwards?). As I
understand it, the changes in the latest kernel break the Nvidia driver,
and for my card at least, Nvidia will not update their driver, as my
card is quite old. Not sure, but could it be that yours is similarly
affected? There are patches you supposedly can apply to the kernel
source, but that's ugly, and I tried that without success. :( I did
not struggle a lot with it and gave up. (Reason I have not upgraded to
latest kernel.) Wheezy here as well. - In fact it appeared for me
in Jessie and Ubuntu 14.04.
Cheers
Daniel
My daughter and I have both had problems, especially with Ubuntu: me using
the standard install video drivers and she the Nvidia driver. I'm thinking
the way the Linux 3.nn kernel handles video of late is worse than I can
remember.
There were a number of video problems, then Ubuntu would not display
menus, and eventually there was literally no easy way to even log in.
In desperation I used the trial live USB system and saved some of my
files to an external USB HD, but on the next boot the whole thing failed.
I did a fresh install of Ubuntu to test whether the computer, MB, memory
or SSD was faulty and am still testing. So far it's all OK, but after 2
failures in a week I am most uncomfortable.
Oh! And before I get told off for not discussing Fedora: my ADSL has been
running at 0.1 to at best 30 Kb/sec for weeks now, so downloading the
Fedora 20 live image was going to take 2 days, not counting the times when
it shuts off for hours at a time. I used what I had.
I'm thinking there is a fault somewhere in core Linux and this may also
affect Nvidia drivers, because every program or app I used had faults and
display errors, and eventually the top bar with X, - and [] for close,
minimise and enlarge disappeared. Thunderbird and LibreOffice had lines
of text missing, cut in half or doubled, and the list goes on.
Roger
Brian May
2014-08-24 01:14:55 UTC
Permalink
Post by Robert Moonen
Best to solve one problem at a time, so video first, by the non-free
drivers do you mean this one
http://www.nvidia.com/object/linux-display-amd64-319.49-driver that seems
to be the latest one. I am presently running Debian Wheezy on an i7 3820
with 8GB on a gigabyte mobo with a GT 520 video card and it is running just
fine with kernel 3.2 and also using the actual nvidia driver, the driver I
linked to covers both cards so would be a good place to start.
As for the other problems you experience, this seems to be a
contention/timeout problem, as no relatively new machine will ever take 30s
to write a 2 kilobyte file.
Am using the drivers in Debian backports - I was using the drivers in
stable, but upgraded, just in case. These work fine with the 3.12 kernel:

$ dpkg -l \*nvidia\* | grep -v none
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
||/ Name                             Version                                      Architecture Description
+++-================================-============================================-============-============================================================
ii  glx-alternative-nvidia           0.4.1~bpo70+1                                amd64        allows the selection of NVIDIA as GLX provider
ii  libgl1-nvidia-glx:amd64          319.82-1~bpo70+2                             amd64        NVIDIA binary OpenGL libraries
ii  libgl1-nvidia-glx:i386           319.82-1~bpo70+2                             i386         NVIDIA binary OpenGL libraries
ii  libgl1-nvidia-glx-i386           319.82-1~bpo70+2                             i386         NVIDIA binary OpenGL 32-bit libraries
ii  libnvidia-ml1:amd64              319.82-1~bpo70+2                             amd64        NVIDIA Management Library (NVML) runtime library
ii  nvidia-alternative               319.82-1~bpo70+2                             amd64        allows the selection of NVIDIA as GLX provider
ii  nvidia-driver                    319.82-1~bpo70+2                             amd64        NVIDIA metapackage
ii  nvidia-glx                       319.82-1~bpo70+2                             amd64        transition to nvidia-driver
ii  nvidia-installer-cleanup         20131102+1~bpo70+1                           amd64        cleanup after driver installation with the nvidia-installer
ii  nvidia-kernel-3.12-0.bpo.1-amd64 319.82+1~bpo70+1+1~bpo70+1+3.12.6-2~bpo70+1  amd64        NVIDIA binary kernel module for Linux 3.12-0.bpo.1-amd64
ii  nvidia-kernel-3.13-0.bpo.1-amd64 319.82+1~bpo70+1+1~bpo70+1+3.13.10-1~bpo70+1 amd64        NVIDIA binary kernel module for Linux 3.13-0.bpo.1-amd64
ii  nvidia-kernel-3.14-0.bpo.1-amd64 319.82+1~bpo70+1+1~bpo70+2+3.14.4-1~bpo70+1  amd64        NVIDIA binary kernel module for Linux 3.14-0.bpo.1-amd64
ii  nvidia-kernel-3.2.0-4-amd64      304.117+1+1+3.2.54-2                         amd64        NVIDIA binary kernel module for Linux 3.2.0-4-amd64
ii  nvidia-kernel-common             20131102+1~bpo70+1                           amd64        NVIDIA binary kernel module support files
ii  nvidia-settings                  319.72-1~bpo70+1                             amd64        tool for configuring the NVIDIA graphics driver
ii  nvidia-support                   20120630+3                                   amd64        NVIDIA binary graphics driver support files
ii  nvidia-vdpau-driver:amd64        319.82-1~bpo70+2                             amd64        NVIDIA vdpau driver
ii  xserver-xorg-video-nvidia        319.82-1~bpo70+2                             amd64        NVIDIA binary Xorg driver

Am going to try and disable gdm at boot, then I will be able to
prove if X-Windows is a factor or not.
--
Brian May <***@microcomaustralia.com.au>
Brian May
2014-08-24 01:31:17 UTC
Permalink
Post by Brian May
Am going to try and disable gdm at boot, then I will be able to
prove if X-Windows is a factor or not.

Seems that X-Windows is the only reason video works with 3.12.

With gdm disabled, I get identical results with 3.12 - no display. So my
guess is that X is not starting with 3.13 or 3.14, so there is nothing to
"fix" whatever Linux did.

Linux 3.2.0 works fine...
Robert Moonen
2014-08-24 05:11:58 UTC
Permalink
Post by Brian May
Am going to try and disable gdm at boot, then I will be able
to prove if X-Windows is a factor or not.
Seems that X-Windows is the only reason video works with 3.12
With gdm disabled, I get identical results with 3.12 - no display. So
my guess is that X is not starting with 3.13 or 3.14, so there is
nothing to "fix" whatever Linux did.
Linux 3.2.0 works fine...
Yes, sorry, I just noticed that the latest kernel is 3.15, so you must be
having problems with 3pointtwelve and above. I am still running the
stock 3.2 kernel installed with Wheezy, so I haven't noticed any problems.
I will keep this problem in mind if I think about upgrading my kernel to
a more bleeding edge one.
I wonder what has changed and whether it will affect my GT 520 nvidia?


cheers

Robert
Robert Moonen
2014-08-24 05:21:36 UTC
Permalink
Post by Brian May
Am going to try and disable gdm at boot, then I will be able
to prove if X-Windows is a factor or not.
Seems that X-Windows is the only reason video works with 3.12
With gdm disabled, I get identical results with 3.12 - no display. So
my guess is that X is not starting with 3.13 or 3.14, so there is
nothing to "fix" whatever Linux did.
Linux 3.2.0 works fine...
So with GDM disabled you don't even get a console?
Or is it after startx that there are no graphics?
Have you looked in /var/log/Xorg.0.log?
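
A minimal Python sketch of that check, which pulls error and warning lines out of an Xorg log. It assumes the standard "(EE)" and "(WW)" markers; the function name is this note's own.

```python
def xorg_problems(log_text):
    """Return (errors, warnings) found in the text of an Xorg log."""
    errors, warnings = [], []
    for line in log_text.splitlines():
        # The header legend lists every marker on one line; skip it.
        if "(EE)" in line and "(WW)" in line:
            continue
        if "(EE)" in line:
            errors.append(line.strip())
        elif "(WW)" in line:
            warnings.append(line.strip())
    return errors, warnings
```

Read /var/log/Xorg.0.log into a string and pass it in; an empty error list from a boot with no display would suggest X never got far enough to write a log for that session.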

By the way, sorry, I just noticed that the latest kernel is 3.15, so you
must be having problems with 3pointtwelve and above. I am still running
the stock 3.2 kernel installed with Wheezy, so I haven't noticed any
problems. I will keep this problem in mind if I think about upgrading my
kernel to a more bleeding edge one. I misinterpreted 3.12 as 3.1.2 :-(.
I wonder what has changed and whether it will affect my GT 520 nvidia?
Robert Moonen
2014-08-24 06:01:21 UTC
Permalink
Post by Robert Moonen
Post by Brian May
Am going to try and disable gdm at boot, then I will be able to
prove if X-Windows is a factor or not.
Seems that X-Windows is the only reason video works with 3.12
With gdm disabled, I get identical results with 3.12 - no display. So
my guess is that X is not starting with 3.13 or 3.14, so there is
nothing to "fix" whatever Linux did.
Linux 3.2.0 works fine...
So with GDM disabled you don't even get a console?
Or is it after startx that there are no graphics?
Have you looked in /var/log/Xorg.0.log?
By the way, sorry, I just noticed that the latest kernel is 3.15, so
you must be having problems with 3pointtwelve and above. I am still
running the stock 3.2 kernel installed with Wheezy, so I haven't noticed
any problems. I will keep this problem in mind if I think about
upgrading my kernel to a more bleeding edge one. I misinterpreted 3.12
as 3.1.2 :-(.
I wonder what has changed and whether it will affect my GT 520 nvidia?
Just in on the debian buglist "Bug#720528: nvidia: driver fails kernel
3.10+ (GeForce 600M and 700M series)".
attached.

Robert
Brian May
2014-08-30 00:13:04 UTC
Permalink
Post by Robert Moonen
Just in on the debian buglist "Bug#720528: nvidia: driver fails kernel
3.10+ (GeForce 600M and 700M series)".
attached.
That could be my problem. Apart from it reportedly working with 3.12.
However this could be that somebody is getting confused.

On my system, X works perfectly under 3.12
because nvidia-kernel-3.12-0.bpo.1-amd64 is good, and X starting fixes the
framebuffer issues.

However, under 3.14, nvidia-kernel-3.14-0.bpo.1-amd64 is broken (undefined
symbol), so X doesn't start, leaving a broken framebuffer.

I think it is safe to assume 3.13 is similar to 3.14.

Also, if I boot 3.14 with "vga=normal nomodeset" I get a good text console.

Might double check this and comment on the bug report.
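
A minimal Python sketch for confirming the undefined symbol from the kernel log. It assumes the failure shows up as the kernel's usual "<module>: Unknown symbol <name>" line; the function name is this note's own and the symbol in the usage comment is made up for illustration.

```python
import re

UNKNOWN_SYMBOL = re.compile(r"(\S+): Unknown symbol (\S+)")


def unknown_symbols(kernel_log_text, module="nvidia"):
    """Return the symbols that the given module failed to resolve."""
    found = []
    for line in kernel_log_text.splitlines():
        match = UNKNOWN_SYMBOL.search(line)
        if match and match.group(1) == module:
            found.append(match.group(2))
    return found


# Hypothetical usage with a made-up log line:
# unknown_symbols("[ 10.0] nvidia: Unknown symbol example_symbol (err 0)")
```

Feeding it the saved output of dmesg after booting the failing kernel would give the exact symbol names to quote in the bug report.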
--
Brian May <***@microcomaustralia.com.au>
Brian May
2014-08-30 00:45:43 UTC
Permalink
Post by Brian May
That could be my problem. Apart from it reportedly working with 3.12.
However this could be that somebody is getting confused.
On second thoughts, this existing bug appears to be purely an X-Windows
issue. The only unexplained part of my problem is the framebuffer failing.
--
Brian May <***@microcomaustralia.com.au>
Trent W. Buck
2014-08-25 00:26:10 UTC
Permalink
Post by Brian May
1. If I boot any kernel later than 3.12, I don't get any
display. As in the monitors display black. [...] Computer seems to be
up and running, and responsive to Ctrl+Alt+Del despite not having a
display.
So SSH into it and investigate.

You say you're running wheezy with 3.12.
Are you using the bpo kernels?
Those are currently at 3.14.
Post by Brian May
2. There seems to be some weird performance problem. e.g. save a 2 kilobyte
file in vim, and the computer can completely freeze (all other windows,
including xterms, stop responding to user input) for, say 30 seconds, while
it is writing that file. Chromium takes ages to load with several tabs, and
pages can fail to start properly while it is doing so.
I can't account for this unless your storage is on eMMC or similar.
Have you looked for processes in D state & run iostat?
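
A minimal Python sketch of the D-state check, reading /proc directly. It assumes the usual "pid (comm) state ..." layout of /proc/<pid>/stat; the function names are this note's own.

```python
import os


def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, command, state)."""
    open_paren = line.index("(")
    close_paren = line.rindex(")")  # the command itself may contain ')'
    pid = int(line[:open_paren])
    command = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, command, state


def blocked_processes(proc_root="/proc"):
    """Return (pid, command) for every process in uninterruptible sleep (D)."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were looking
        if state == "D":
            blocked.append((pid, command))
    return blocked
```

Calling blocked_processes() in a loop during one of the freezes would show which processes are stuck waiting and whether it is always the same ones.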
Post by Brian May
This is moving disk + RAID1 + LVM + ext4.
OK so not eMMC.
Are the blocks aligned?
Shouldn't really matter for RAID1 or spinning rust, tho.

Ted Ts'o had a blog post a while back explaining how to laboriously align
everything, if you care enough to try that (I wouldn't bother unless you
were reinstalling anyway).
Post by Brian May
I am currently working on a new theory that the performance problems only
occur when the computer is cold and first turned on. I think I have seen
evidence to disprove this, but guess I should run bonnie++ as soon as I
turn the computer on, just to be sure.
And in break or single mode, so no other shit is running.
Brian May
2014-08-30 00:22:55 UTC
Permalink
Post by Trent W. Buck
I can't account for this unless your storage is on eMMC or similar.
Have you looked for processes in D state & run iostat?
falidae# iostat
Linux 3.12-0.bpo.1-amd64 (falidae)  30/08/14  _x86_64_  (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.29    0.04    0.26   19.77    0.00   78.65

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              34.12       637.80       218.87     640881     219924
sdc              21.07        87.49       403.81      87910     405759
sdb              90.41       591.97       403.81     594830     405759
md0             132.01       677.53       403.67     680806     405620
dm-0              0.09         0.37         0.00        372          0
scd0              0.02         0.10         0.00         96          0
dm-1            130.91       674.30       403.66     677557     405608
dm-2              0.33         1.31         0.01       1317         12
dm-3              0.09         0.37         0.00        376          0
dm-4              0.09         0.37         0.00        372          0
dm-5              0.09         0.37         0.00        372          0

19.77% iowait seems very high for what appeared to be an idle system running
X-Windows and two idle web browsers. Any ideas on how to find what is
causing it to be so high?

(in the above, sda is the solid state disk used by /, and sdb+sdc are the
underlying disks used by md0, which is RAID1)
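
One caveat on these numbers: iostat run without an interval reports averages since boot, so the figure above is not the current iowait. A minimal Python sketch that computes iowait between two snapshots of /proc/stat follows. The function names are this note's own, and it assumes the standard field order on the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def cpu_counters(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two snapshots."""
    before = cpu_counters(before_text)
    after = cpu_counters(after_text)
    deltas = [new - old for old, new in zip(before, after)]
    # Sum user..steal only; guest time is already counted inside user.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total
```

Reading /proc/stat twice a few seconds apart during a freeze, and passing both strings in, gives the iowait for just that window.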


Is dropping a bit now, but still high:

falidae# iostat
Linux 3.12-0.bpo.1-amd64 (falidae)  30/08/14  _x86_64_  (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.13    0.03    0.22   14.42    0.00   84.21

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              23.96       445.45       153.82     641145     221396
sdc              15.98        61.28       291.36      88194     419357
sdb              64.40       414.07       291.36     595970     419357
md0              94.16       474.00       291.17     682230     419088
dm-0              0.06         0.26         0.00        372          0
scd0              0.02         0.07         0.00         96          0
dm-1             93.31       471.74       291.16     678981     419076
dm-2              0.23         0.92         0.01       1317         12
dm-3              0.07         0.26         0.00        376          0
dm-4              0.06         0.26         0.00        372          0
dm-5              0.06         0.26         0.00        372          0


Also considered the possibility that md0 is resyncing - that would explain
everything. Unfortunately /proc/mdstat says it isn't.

falidae# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[0] sdc1[1]
1953382208 blocks super 1.2 [2/2] [UU]

unused devices: <none>
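
A minimal Python sketch for checking /proc/mdstat automatically, e.g. from a script run during a freeze. It assumes the usual layout shown above, where a degraded array shows "_" in the member string and a running resync, recovery, check or reshape is named on a progress line. The function name is this note's own.

```python
import re


def md_arrays(mdstat_text):
    """Map each md device to its member-status string and any sync activity."""
    arrays = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            arrays[current] = {"members": None, "activity": None}
            continue
        if current is None:
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if status and arrays[current]["members"] is None:
            arrays[current]["members"] = status.group(1)
        activity = re.search(r"\b(resync|recovery|check|reshape)\b", line)
        if activity:
            arrays[current]["activity"] = activity.group(1)
    return arrays
```

For the output above this gives members "UU" and no activity for md0, which agrees with the conclusion that the array is not resyncing.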
--
Brian May <***@microcomaustralia.com.au>
Chris Samuel
2014-08-30 05:08:06 UTC
Permalink
Post by Brian May
19.77% seems very high for what appeared to be an idle system running
X-Windows and two idle web browsers. Any ideas on how to find what is
causing it to be so high?
I'd suggest iotop (should be packaged) and iosnoop which is a shell script
that uses the Linux ftrace functionality to track what's going on, see here:

http://www.brendangregg.com/blog/2014-07-16/iosnoop-for-linux.html

Oh, and latencytop and perf top might give insights too.

Good luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Brian May
2014-08-30 07:07:16 UTC
Permalink
Post by Chris Samuel
I'd suggest iotop (should be packaged) and iosnoop which is a shell script
http://www.brendangregg.com/blog/2014-07-16/iosnoop-for-linux.html
Thanks for this. Both seem like good tools.
Only problem is, my computer seems fine right now.

Hmmm.

Another thought: really need to make sure cron isn't doing anything at the
time. It doesn't show up on top, but need to double check this. Hmm. Looks
like I do have anacron installed, so cron jobs do run when I turn the
computer on. According to my log files, all cron jobs should have
terminated when I saw the high iowait time this morning, and didn't run for
long in any case. Unless something is forking and continues to run after
cron reports it finished.

Will need to check this again tomorrow.


Post by Chris Samuel
Oh, and latencytop and perf top might give insights too.
Apparently latencytop doesn't like the default Debian kernel:

falidae# latencytop
mount: none already mounted or /sys/kernel/debug/ busy
mount: according to mtab, debugfs is already mounted on /sys/kernel/debug
Please enable the CONFIG_LATENCYTOP configuration in your kernel.
Exiting...
falidae# mount -t
falidae# ls -l /sys/kernel/debug
total 0
drwxr-xr-x 2 root root 0 Aug 31 2014 acpi
drwxr-xr-x 23 root root 0 Aug 31 2014 bdi
drwxr-xr-x 2 root root 0 Aug 31 2014 dma_buf
drwxr-xr-x 4 root root 0 Aug 30 16:46 dri
drwxr-xr-x 2 root root 0 Aug 30 16:46 eeepc-wmi
drwxr-xr-x 2 root root 0 Aug 31 2014 extfrag
-r--r--r-- 1 root root 0 Aug 31 2014 gpio
drwxr-xr-x 5 root root 0 Aug 30 16:46 hid
drwxr-xr-x 2 root root 0 Aug 31 2014 kprobes
drwxr-xr-x 2 root root 0 Aug 30 16:46 kvm
drwxr-xr-x 2 root root 0 Aug 31 2014 mce
drwxr-xr-x 2 root root 0 Aug 30 16:46 mei
drwxr-xr-x 2 root root 0 Aug 30 16:46 pkg_temp_thermal
drwxr-xr-x 2 root root 0 Aug 31 2014 pstate_snb
drwxr-xr-x 2 root root 0 Aug 31 2014 regmap
-rw-r--r-- 1 root root 0 Aug 31 2014 sched_features
-r--r--r-- 1 root root 0 Aug 31 2014 sleep_time
-r--r--r-- 1 root root 0 Aug 31 2014 suspend_stats
drwxr-xr-x 6 root root 0 Aug 31 2014 tracing
drwxr-xr-x 2 root root 0 Aug 30 16:46 usb
-r--r--r-- 1 root root 0 Aug 31 2014 wakeup_sources
drwxr-xr-x 2 root root 0 Aug 31 2014 x86
#
--
Brian May <***@microcomaustralia.com.au>
Brian May
2014-09-06 00:40:04 UTC
Permalink
Post by Brian May
Another thought: really need to make sure cron isn't doing anything at the
time. It doesn't show up on top, but need to double check this. Hmm. Looks
like I do have anacron installed, so cron jobs do run when I turn the
computer on. According to my log files, all cron jobs should have
terminated when I saw the high iowait time this morning, and didn't run for
long in any case. Unless something is forking and continues to run after
cron reports it finished.
I was running a unison sync to a remote ssh server. iotop was saying no (or
low) disk I/O, top said up to 60% iowait, and basic tasks (e.g. clicking on
xterm or selecting text in xterm) took ages (system appeared to be frozen).
Oh, and no swap was being used; I have 16GB RAM with relatively few processes
running. This seems to be typical.

iotop and iosnoop don't seem to report anything abnormal that I can see
using the disk. There are cron jobs that run at times, however these
problems occur with no cron jobs running (and basic cron jobs shouldn't
kill a modern system in any case).

Not absolutely convinced this is a disk I/O problem, however something
weird is happening here.
--
Brian May <***@microcomaustralia.com.au>
Trent W. Buck
2014-09-08 01:49:10 UTC
Post by Brian May
I was running a unison sync to a remote ssh server, iotop was saying no (or
low) disk I/O, top says up to 60% iowait, and basic tasks (e.g. clicking on
xterm or selecting text in xterm) take ages (system appears to be frozen).
Oh, and no swap being used, have 16GB ram with relatively few processes
running. This seems to be typical.
iotop and iosnoop don't seem to report anything abnormal that I can see
using the disk. There are cron jobs that run at times, however these
problems occur with no cron jobs running (and basic cron jobs shouldn't
kill a modern system in any case).
Not absolutely convinced this is a disk I/O problem, however something
weird is happening here.
ionice -c3 unison on both hosts?
Does the problem occur with rsync?
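
The `ionice -c3` suggestion above runs a command in the idle I/O scheduling class. A minimal sketch of wrapping an arbitrary command that way follows; the helper names are hypothetical, and the idle class only has an effect with I/O schedulers that honour priorities (CFQ, for example).

```python
import subprocess


def idle_io_argv(command):
    """Prefix a command (a list of strings) with 'ionice -c3' (idle I/O class)."""
    return ["ionice", "-c3"] + list(command)


def run_idle(command):
    """Run the command in the idle I/O class and return its exit status."""
    return subprocess.run(idle_io_argv(command)).returncode
```

For example, `run_idle(["unison", "myprofile"])` would start unison with a placeholder profile name; the same wrapper would be needed on the remote host to cover both ends.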
Brian May
2014-09-08 02:08:53 UTC
Post by Trent W. Buck
ionice -c3 unison on both hosts?
Does the problem occur with rsync?
Any disk IO appears to make the wait go up excessively.

I want to try this with another kernel. Shame I can't get X working
properly with anything other than 3.12, though. Still, I expect to be able
to reproduce this without X. If I can't, that might indicate that I
should investigate X more.
--
Brian May <***@microcomaustralia.com.au>
Brian May
2014-11-17 01:04:23 UTC
Post by Brian May
Any disk IO appears to make the wait go up excessively.
Ok, after some false leads, still having problems. e.g. after a BIOS update,
the BIOS gets upset that the CPU fan is spinning slower than expected (around
400RPM); it is a large fan with a large heat sink with no signs of dust or
damage, it spins freely and continues spinning for a while, so I think the
BIOS threshold is too low and the fan is ok.

The problem only occurs on cold days, and disappears after the computer has
been left on for 30 minutes or more.

The CPU fan speed does not increase significantly in this time, and it is
still considered too slow for the BIOS (had considered the possibility that
the BIOS was throttling the CPU because it was upset with the fan, but I
doubt it).

I can't actually see any evidence of CPU throttling (not that I really know
how to check; values in /proc/cpuinfo seem to be fine).
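
One possible way to look for throttling beyond /proc/cpuinfo is to read the per-CPU frequency and throttle counters that the kernel exposes under /sys/devices/system/cpu. This is only a sketch: it assumes the cpufreq files `scaling_cur_freq` and `cpuinfo_max_freq` (in kHz) and the x86 `thermal_throttle/core_throttle_count` counter are present, which depends on the kernel and driver in use, and the function names are illustrative. Missing files are reported as None.

```python
import glob
import os


def read_int(path):
    """Return the integer stored in a sysfs file, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def cpu_throttle_report(sysfs_root="/sys/devices/system/cpu"):
    """Collect current/max frequency (kHz) and throttle count for each CPU."""
    report = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    for cpu_dir in sorted(glob.glob(os.path.join(sysfs_root, "cpu[0-9]*"))):
        report[os.path.basename(cpu_dir)] = {
            "cur_khz": read_int(
                os.path.join(cpu_dir, "cpufreq", "scaling_cur_freq")),
            "max_khz": read_int(
                os.path.join(cpu_dir, "cpufreq", "cpuinfo_max_freq")),
            "throttle_count": read_int(
                os.path.join(cpu_dir, "thermal_throttle", "core_throttle_count")),
        }
    return report
```

Comparing `cpu_throttle_report()` taken under load during a slow period with one taken after the machine has warmed up would show whether the current frequency stays well below the maximum only when it is cold. The throttle counter tracks over-temperature events, so it may well stay at zero in this scenario.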

What is most likely to cause the computer to run slowly when it is cold?
--
Brian May <***@microcomaustralia.com.au>
Morrie Wyatt
2014-11-19 02:34:59 UTC
Hi Brian.

Only a thought, but if the drive is very cold, the air density under the
heads would be higher, which could degrade its performance by increasing
retries and therefore latency. As the spinning platters heat the enclosed
air through friction, head height normalises and the drive settles down.

I may be totally off base here, but the scenario would seem to fit the
symptoms.

Also, how "Cold" are we talking about here?

Wiser heads than mine may come up with better ideas.

Best regards,
Morrie.
Brian May
2014-11-19 03:02:03 UTC
Post by Morrie Wyatt
Also, how "Cold" are we talking about here?
Not really that cold. Say, at a rough guess, anything less than 15 degrees.
Will try to take a look at the temperature next time I turn it on.
--
Brian May <***@microcomaustralia.com.au>
Erik Christiansen
2014-11-19 07:03:57 UTC
Post by Brian May
Post by Morrie Wyatt
Also, how "Cold" are we talking about here?
Not really that cold. Say, at a rough guess, anything less than 15 degrees.
Will try to take a look at the temperature next time I turn it on.
Can't say what is causing it, but my machine also malfunctions at
slightly lower temperatures than that, around 8°-12°C. Often booting
hangs, but hitting the reboot button after it has internally warmed for
a minute always brings it up - so far. Even during that second boot,
video can tear horizontally in dramatic fashion - something which never
happens when it's warm.

In my case, a common cause could be that the ESR of the supply bypass
electrolytics is marginal (low ESR caps cost more), and is sufficiently
worse at low temperatures (OK, that's an assumption, but aluminium
electrolytics have a wet electrolyte) to allow disruptive power supply
noise. A higher ESR will lead to more rapid warming of those exposed to
high frequency power supply switching ripple, and would be symptomatic
of the rapid recovery I observe.
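
The link between ESR and warming in that speculation can be written down with the standard resistive-loss relation. Here the ESR is treated as a plain series resistance carrying the RMS ripple current; the symbol names are chosen for illustration only.

```latex
P_{\mathrm{cap}} = I_{\mathrm{ripple,rms}}^{2} \cdot \mathrm{ESR}
```

For a given ripple current, doubling the ESR doubles the heat dissipated inside the capacitor, which is consistent with the idea that a cold, high-ESR part warms itself quickly.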

That's 5% observation, and 95% speculation, so any other theory is as
good.

Erik
--
Melbourne Water Use:
"More water is lost to stormwater each year than we use. On average we use
about 40 billion litres of water each year, and each year about 500 billion
litres runs into our drains." Leonie Duncan, Environment Victoria healthy river
campaigner, quoted on p7 of Journal 21.10.08.
But what the heck, with the N-S Pipeline, we can take water from the foodbowl.
Allan Duncan
2014-11-19 04:59:21 UTC
Post by Brian May
Any disk IO appears to make the wait go up excessively.
Ok, after some false leads, still having problems. e.g. after a BIOS
update, the BIOS gets upset that the CPU fan is spinning slower than
expected (around 400RPM); it is a large fan with a large heat sink with no
signs of dust or damage, it spins freely and continues spinning for a
while, so I think the BIOS threshold is too low and the fan is ok.
Cheat a bit and plug another (faster) fan into the CPU fan socket and
feed the original fan from elsewhere.
That will sort out if it is the bios invoking a throttle.
Cold days are disappearing fast!
Brian May
2014-08-30 00:36:53 UTC
Post by Brian May
01:00.0 VGA compatible controller: NVIDIA Corporation G96 [GeForce 9500
GT] (rev a1)
WTF? My computer at work has that video card, it is fine. My one at home
has:

NVIDIA GPU GeForce GTX 660 (GK106) at PCI:1:0:0 (GPU-0)

Could it really be that I typed lspci on the wrong ssh session?

Ooops.
--
Brian May <***@microcomaustralia.com.au>