Bug 6206 - (int-159241) BME exit or SIGABRT causing reboot when charging sufficiently aged battery from less than half-full
Status: REOPENED
Product: System software
Component: dsme
Version: 4.1.3 (5.2008.43-7)
Platform: N810 Maemo
Importance: Low critical with 4 votes
Target Milestone: ---
Assigned To: unassigned
QA Contact: dsme-bugs
 
Reported: 2009-11-16 15:52 UTC by Lucas Maneos
Modified: 2010-05-04 22:39 UTC (History)

Attachments
Complete syslog from SIGPIPE case (7.89 KB, text/plain)
2009-11-24 12:11 UTC, Lucas Maneos
BME core dump (14.51 KB, application/x-gzip)
2009-11-27 12:31 UTC, Lucas Maneos


Description Lucas Maneos (reporter) 2009-11-16 15:52:54 UTC
SOFTWARE VERSION:
5.2008.43-7

STEPS TO REPRODUCE THE PROBLEM:
1. N810 switched on with hildon desktop etc running.
2. Plug in charger.
3. Wait a few minutes.

EXPECTED OUTCOME:
Device charges over a period of time until battery full.

ACTUAL OUTCOME:
Device starts charging, then a few minutes later spontaneously reboots.

REPRODUCIBILITY:
Always (well, dozens of times over the past month or so).

EXTRA SOFTWARE INSTALLED:
Lots of stuff when problem originally appeared, but nothing new for several
months before that.  Significantly fewer packages currently, but I will also
try another reflash without restoring data from backup (although I don't see
how that would affect bme).

OTHER COMMENTS:
Similar symptoms to some comments in bug 3415, but nothing there mentions BME
so I'm assuming a different problem.  May actually be a hardware fault but I'd
like some feedback before sending it off for warranty repair (although I may
have to do that soon anyway as the warranty period expires in a couple of
weeks).

This started happening about a month or so ago. I reflashed the N810 a few days
ago (and restored user data from backup and /some/ packages like ssh, syslog
etc.) but the problem persists.

If charger is plugged in while the device is switched off the device charges
and I get a few days' use before needing to switch off to charge again, but
obviously this is not optimal.

The following is with no card in the miniSD slot, swap disabled and internal
emmc checked with dosfsck and no errors found:

Nokia-N810-43-7:/var/log# grep . /proc/bootreason /var/lib/dsme/stats/*
/proc/bootreason:sw_rst
/var/lib/dsme/stats/lifeguard_resets:/usr/bin/bme_RX-44 :  5 *
/var/lib/dsme/stats/lifeguard_restarts:/usr/bin/hildon-desktop  : 1 
/var/lib/dsme/stats/lifeguard_restarts:/usr/sbin/ke-recv : 1 *
/var/lib/dsme/stats/sw_rst:7

syslog output:
Nov 16 12:34:34 Nokia-N810-43-7 DSME: process '/usr/bin/bme_RX-44' with pid 355
exited with return value: 0
Nov 16 12:34:34 Nokia-N810-43-7 DSME: '/usr/bin/bme_RX-44' exited with RESET
policy -> reset
Nov 16 12:34:34 Nokia-N810-43-7 DSME: SIGPIPE received, some client exited
before noticed?
Nov 16 12:34:34 Nokia-N810-43-7 DSME: Here we will request for sw reset
Nov 16 12:34:34 Nokia-N810-43-7 DSME: SIGPIPE received, some client exited
before noticed?
Nov 16 12:34:34 Nokia-N810-43-7 DSME: SIGPIPE received, some client exited
before noticed?
Nov 16 12:34:34 Nokia-N810-43-7 DSME: Here we could do some bookkeeping..
Nov 16 12:34:34 Nokia-N810-43-7 DSME: SIGPIPE received, some client exited
before noticed?

No idea why BME would exit with status 0...  Any suggestions for other things
to check welcome.
Comment 1 Lucas Maneos (reporter) 2009-11-24 12:06:04 UTC
It seems the battery needs to be < 50% full to trigger this.  Anyway, I tried
to retest with a nearly clean slate as follows:

- Start with battery fully charged, charger unplugged.
- Make sure R&D mode and all R&D flags are off.
- Re-flash RX-44_DIABLO_5.2008.43-7_PR_COMBINED_MR0_ARM.bin, do not restore
backup.
- Set up WLAN connection.
- Enable extras and add tools repository, install openssh-server, sysklogd and
strace.
- Open a couple of ssh sessions, watch syslog in one and strace of bme in the
other.
- Set display brightness to max, start an internet radio stream and wait for
battery to drain a bit.

After a few minutes (perhaps coincidentally as soon as I touched the screen and
it switched on) BME segfaulted.  Syslog:

Nov 24 09:10:52 Nokia-N810-43-7 hulda[1358]: hulda.c:183: i|m|p:
com.nokia.mce.signal|display_status_ind|/com/nokia/mce/signal
Nov 24 09:10:52 Nokia-N810-43-7 hulda[1358]: hulda.c:183: i|m|p:
com.nokia.mce.signal|tklock_mode_ind|/com/nokia/mce/signal
Nov 24 09:10:52 Nokia-N810-43-7 kernel: [ 1297.171875] EAC mode: play enabled,
rec enabled
Nov 24 09:10:52 Nokia-N810-43-7 DSME: SIGPIPE received, some client exited
before noticed?
Nov 24 09:10:52 Nokia-N810-43-7 DSME: Closed a client connection
Nov 24 09:10:52 Nokia-N810-43-7 DSME: process '/usr/bin/bme_RX-44' with pid 362
exited with signal: 13
Nov 24 09:10:52 Nokia-N810-43-7 DSME: '/usr/bin/bme_RX-44' exited with RESET
policy -> reset
Nov 24 09:10:52 Nokia-N810-43-7 DSME: Here we will request for sw reset
Nov 24 09:10:52 Nokia-N810-43-7 DSME: Here we could do some bookkeeping..
Nov 24 09:10:52 Nokia-N810-43-7 hulda[1358]: hulda.c:183: i|m|p:
com.nokia.mce.signal|system_inactivity_ind|/com/nokia/mce/signal

strace:
clock_gettime(CLOCK_MONOTONIC, {1295, 52093505}) = 0
clock_gettime(CLOCK_MONOTONIC, {1295, 52520751}) = 0
sendmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(2)=[{"0\0\0\0\1\301\0+\342\0\0\0\254\246\2\0"..., 16},
{"\1\375\0\0\0\32\375\0\0\5\0\2\0\4\371\341\1\0\0\20\0\34\0\n\0\4\0\0\17l\371\341"...,
32}], msg_controllen=0, msg_flags=0}, 0) = 48
poll([{fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN},
{fd=-1}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}], 6, -1) = 1 ([{fd=6,
revents=POLLIN}])
mq_timedreceive(6,
"\2\0\0\0\1\0\25\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 256,
0, NULL) = 12
ioctl(4, 0x6002, 0xbecf0cf8)            = 0
ioctl(4, 0x6002, 0xbecf0cf0)            = 0
timer_settime(0x13, 0, {it_interval={0, 0}, it_value={0, 0}}, {it_interval={0,
0}, it_value={0, 0}}) = 0
ioctl(4, 0x6003, 0x8)                   = 509
ioctl(4, 0x6003, 0x8)                   = 508
ioctl(4, 0x6003, 0x8)                   = 508
ioctl(4, 0x6003, 0x8)                   = 508
ioctl(4, 0x6003, 0x8)                   = 508
poll([{fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN},
{fd=-1}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}], 6, -1) = 1 ([{fd=8,
revents=POLLIN}])
accept(8, {sa_family=AF_FILE, path="r\17r\17r\17u\17"...}, [2]) = 12
read(12, "BMentity"..., 8)              = 8
write(12, "\n"..., 1)                   = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
Process 362 detached

One occurrence so far.
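
For reference, the strace above ends with a write() to a client socket failing with EPIPE, and the default SIGPIPE action then terminating the process. A minimal sketch of the usual defensive pattern follows, in Python for brevity. It assumes a Unix stream socket and a platform that provides MSG_NOSIGNAL (Linux does); the helper name reply_to_client is hypothetical and nothing here is taken from the BME sources. In a C daemon the equivalent would be send() with MSG_NOSIGNAL, or setting SIGPIPE to SIG_IGN at startup.

```python
import errno
import socket

# MSG_NOSIGNAL asks the kernel not to raise SIGPIPE for this send; it falls
# back to 0 on platforms that do not define the flag.
NOSIGNAL = getattr(socket, "MSG_NOSIGNAL", 0)


def reply_to_client(conn: socket.socket, payload: bytes = b"\n") -> bool:
    """Send a reply; return False if the client has already gone away."""
    try:
        conn.sendall(payload, NOSIGNAL)
        return True
    except OSError as exc:
        if exc.errno in (errno.EPIPE, errno.ECONNRESET):
            return False  # client exited first: drop it and carry on
        raise


if __name__ == "__main__":
    server_side, client_side = socket.socketpair()
    client_side.close()  # simulate a client that disconnects before the reply
    print("reply delivered:", reply_to_client(server_side))
    server_side.close()
```

With this pattern a client that disconnects early costs one failed send rather than the whole process (and, under a RESET lifeguard policy, a reboot).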
Comment 2 Lucas Maneos (reporter) 2009-11-24 12:11:49 UTC
Created an attachment (id=1621) [details]
Complete syslog from SIGPIPE case

(In reply to comment #1)
> BME segfaulted.

Er, sorry... that's a SIGPIPE, not a SIGSEGV.  Attaching the complete syslog
from sysklogd installation & start until reboot.
Comment 3 Lucas Maneos (reporter) 2009-11-24 14:57:59 UTC
After another couple of hours of music playing with the screen mostly on, I
managed to get the battery down to this:

Nokia-N810-43-7:~# lshal -u /org/freedesktop/Hal/devices/bme
udi = '/org/freedesktop/Hal/devices/bme'
  battery.charge_level.capacity_state = 'ok'  (string)
  battery.charge_level.current = 2  (0x2)  (int)
  battery.charge_level.design = 4  (0x4)  (int)
  battery.charge_level.last_full = 0  (0x0)  (int)
  battery.charge_level.percentage = 50  (0x32)  (int)
  battery.charge_level.unit = 'bars'  (string)
  battery.is_rechargeable = true  (bool)
  battery.present = true  (bool)
  battery.rechargeable.is_charging = false  (bool)
  battery.rechargeable.is_discharging = true  (bool)
  battery.remaining_time = 3600  (0xe10)  (int)
  battery.remaining_time.calculate_per_time = false  (bool)
  battery.reporting.current = 463  (0x1cf)  (int)
  battery.reporting.design = 1532  (0x5fc)  (int)
  battery.reporting.last_full = 0  (0x0)  (int)
  battery.reporting.unit = 'mAh'  (string)
  battery.type = 'pda'  (string)
  battery.voltage.current = 3542  (0xdd6)  (int)
  battery.voltage.design = 4200  (0x1068)  (int)
  battery.voltage.unit = 'mV'  (string)
  info.addons = {'hald-addon-bme'} (string list)
  info.bus = 'unknown'  (string)
  info.capabilities = {'battery'} (string list)
  info.category = 'battery'  (string)
  info.parent = '/org/freedesktop/Hal/devices/computer'  (string)
  info.product = 'Battery (BME-HAL)'  (string)
  info.subsystem = 'unknown'  (string)
  info.udi = '/org/freedesktop/Hal/devices/bme'  (string)

Plug charger, wait for screen to sleep, touch screen (repeated last two steps a
few times, not always reproducible) results in bme_RX-44 dying with exactly the
same syslog output as in comment 1.  The strace output is similar, but one
thing I didn't report previously is:

clock_gettime(CLOCK_MONOTONIC, {12102, 397094726}) = 0
clock_gettime(CLOCK_MONOTONIC, {12102, 398101806}) = 0
--- SIGRT_1 (Unknown signal 33) @ 0 (0) ---
mq_timedsend(6, "\2\0\0\0\1\0\24\0\0\0\0\0"..., 12, 15, NULL) = 0
rt_sigreturn(0)                         = 0

Don't know if it's significant as it seems to be getting these signals from
time to time even before it dies.  Otherwise it ends the same as before:

mq_timedreceive(6,
"\2\0\0\0\1\0\24\0\0\0\0\0\f\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 256,
0, NULL) = 12
accept(8, {sa_family=AF_FILE, path=@""}, [2]) = 19
read(19, "BMentity"..., 8)              = 8
write(19, "\n"..., 1)                   = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---

Leaving the device untouched, it booted, started charging and rebooted again a
little while later.  After that, /var/log/syslog.old shows bme_RX-44 exited
with status 0 (same as in comment 0) and

Nokia-N810-43-7:~# grep . /proc/bootreason /var/lib/dsme/stats/*
/proc/bootreason:sw_rst
/var/lib/dsme/stats/lifeguard_resets:/usr/bin/bme_RX-44 :  3 *
/var/lib/dsme/stats/sw_rst:4

I'm assuming faulty hardware, but it looks like there may be a bme bug or two
in there as well.  If I can provide any other information before sending it off
for repair let me know.
Comment 4 Lucas Maneos (reporter) 2009-11-27 11:55:57 UTC
Device posted to Nokia Care yesterday to avoid missing the warranty window. 
Some additional info:

Swapping the battery with another N810 that works fine makes no difference (the
problem stays with the same N810 and the other one still works fine).

The reboots occur much more frequently when plugged in to an AC-8X ("Nokia High
Efficiency") charger, although they also occur with the bundled one as well as
with no charger plugged in.  The other N810 has no problem with either charger.
Comment 5 Lucas Maneos (reporter) 2009-11-27 12:31:04 UTC
Created an attachment (id=1635) [details]
BME core dump

Just found this in the /core-dumps dir of the SD card.  It's from just before I
removed the card and well before the second reflash so some extra packages were
installed at the time.  I have no way to examine it at the moment (the other
N810 is no longer with me) and the filename indicates a SIGABRT exit (well,
SIGPIPE doesn't dump core anyway) but it may still be relevant.
Comment 6 Jan Knutar 2009-12-01 19:32:03 UTC
Vaguely related observations on a N810, but most likely *not* the same bug:

symptoms: Charge from low battery resulted in bme dying, SIGABRT.
workaround: used a low power (300mA) charger or a new battery

details:
In ftd, I observed that the battery voltage before charging was at about
3580mV.
When the charger was connected, the closed-switch battery voltage measurement grew
steadily, and bme died, if I remember correctly, at around 4500-4600mV. Open
Switch voltage, if my memory serves me right, was below 3700mV.
This suggested to me that the battery's internal resistance had become very
high (it's about 1.5 years old now I think). A new battery did not exhibit this
behaviour.
Charge from near-full did not exhibit this behaviour.
Charging with a weaker charger (5.6V, 300-ish mA) did not push the
closed-switch voltage as high, and charging was successful.

The charge control doesn't seem able to deal with aged batteries that have
developed high internal resistance, and (on purpose?) raises SIGABRT.
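
The reasoning above can be put into rough numbers with a simple series-resistance model: closed-switch voltage = open-switch voltage + charge current x internal resistance. This is only a sketch. The voltages are the approximate ones recalled above, the 850 mA charge current is a purely illustrative assumption (the actual current was not measured), and both function names are hypothetical.

```python
def internal_resistance_ohm(closed_mv: float, open_mv: float, charge_ma: float) -> float:
    """Estimate series resistance from the voltage rise under charge current."""
    if charge_ma <= 0:
        raise ValueError("charge current must be positive")
    return (closed_mv - open_mv) / charge_ma  # mV / mA = ohm


def closed_switch_mv(open_mv: float, resistance_ohm: float, charge_ma: float) -> float:
    """Predict the closed-switch voltage for a given charge current."""
    return open_mv + resistance_ohm * charge_ma  # ohm * mA = mV


if __name__ == "__main__":
    # ~4550 mV closed-switch, ~3700 mV open-switch, ASSUMED 850 mA charge current
    r = internal_resistance_ohm(4550, 3700, 850)
    print(f"estimated internal resistance: {r:.2f} ohm")
    # The same battery on a weaker, 300 mA charger
    print(f"closed-switch voltage at 300 mA: {closed_switch_mv(3700, r, 300):.0f} mV")
```

Under these assumed figures the weaker charger keeps the closed-switch voltage several hundred mV lower, which is consistent with the observation that the 300-ish mA charger completed the charge.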
Comment 7 Lucas Maneos (reporter) 2009-12-08 14:47:16 UTC
Nokia Care sent back a replacement device which looks brand-new.  It came with
firmware 1.2007.42-18 and I have added/removed/updated/re-configured nothing
(not even set up a WLAN connection) so far.  BME-caused reboots still happen
when the battery is relatively low and the device is plugged into the charger. 
The charger itself seems to make no difference (tested with AC-4X, AC-8X and
ACP-12X + CA-44 adaptor):

~ $ grep . /proc/bootreason /var/lib/dsme/stats/*
/proc/bootreason:sw_rst
/var/lib/dsme/stats/lifeguard_resets:/usr/bin/bme_RX-44 :  5 *
/var/lib/dsme/stats/sw_rst:5

The only unchanged variable seems to be the battery (perhaps I didn't test
adequately in comment 4?) - I'll try to get a fresh BP-4L this week and see if
it makes a difference (though the current one still gives me several days of
normal use per full charge).
Comment 8 Lucas Maneos (reporter) 2009-12-13 17:01:17 UTC
No bme exits or reboots with new battery after three cycles from full to <25%
charge and charging back to full.

I can post the old battery (feel free to email me a snailmail address) if it
can help debug this.
Comment 9 Andre Klapper maemo.org 2010-02-04 11:02:01 UTC
This is a WONTFIX for Maemo4 (no surprise probably for you) as Nokia will only
provide bugfixes for blocker issues for Maemo4, if at all (and I have my
personal doubts even about that).
Comment 10 Lucas Maneos (reporter) 2010-02-04 11:49:02 UTC
(In reply to comment #9)
> This is a WONTFIX for Maemo4 (no surprise probably for you)

Of course, but unless this is known to be FIXED in Fremantle(TM) I think
someone should still take a look at it.  It'll probably be a year or two before
N900 batteries age enough to trigger it, but it's better to fix this before it
starts happening, no?

Note that there are at least 3 different exit reasons: SIGPIPE, SIGABRT and
clean exit.  The first two certainly look like bugs, and IMHO bme shouldn't
spontaneously exit even with status 0 (and thus trigger a reboot) in any case.
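
To tell the exit reasons apart in a saved syslog, something like the following sketch may help. It assumes the DSME message format quoted in comment 0 and comment 1 ("process '...' with pid N exited with return value: X" or "... exited with signal: X", possibly wrapped across two lines), and that a SIGABRT exit is logged in the same "signal:" form, which has not been confirmed here. The function name classify_exits is hypothetical.

```python
import re
import signal

# \s+ between the pid and "exited" tolerates the line wrap seen in pasted logs.
EXIT_RE = re.compile(
    r"process '(?P<path>[^']+)' with pid (?P<pid>\d+)\s+"
    r"exited with (?P<kind>return value|signal): (?P<code>\d+)"
)


def classify_exits(syslog_text: str) -> list:
    """Return (path, pid, reason) for each DSME process-exit message found."""
    results = []
    for m in EXIT_RE.finditer(syslog_text):
        code = int(m.group("code"))
        if m.group("kind") == "signal":
            try:
                reason = signal.Signals(code).name  # e.g. 13 -> SIGPIPE on Linux
            except ValueError:
                reason = f"signal {code}"
        else:
            reason = "clean exit" if code == 0 else f"exit status {code}"
        results.append((m.group("path"), int(m.group("pid")), reason))
    return results


if __name__ == "__main__":
    with open("/var/log/syslog.old") as log:  # path as used in comment 3
        for path, pid, reason in classify_exits(log.read()):
            print(path, pid, reason)
```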
Comment 11 Andre Klapper maemo.org 2010-03-01 19:42:25 UTC
So this might/will happen in the future with N900 you assume? Hmm... a fresh,
clean summary of this is highly welcome that I can copy&paste to the internal
tracker...
Comment 12 Lucas Maneos (reporter) 2010-03-02 08:52:00 UTC
(In reply to comment #11)
> So this might/will happen in the future with N900 you assume?

Maybe (it's possible that it's been fixed in the meantime or that li-ion
batteries won't trigger it).  Worth having a look I think.

> Hmm... a fresh,
> clean summary of this is highly welcome that I can copy&paste to the internal
> tracker...

Summary updated.
Comment 13 Lucas Maneos (reporter) 2010-04-26 20:38:58 UTC
Hm, I just noticed bug 3144 of which this may be a duplicate.  Can you compare
int-82382 & int-159241 to confirm?

If so, this was fixed (but not released) for Diablo, so Fremantle probably
contains the fix.
Comment 14 Andre Klapper maemo.org 2010-05-03 19:45:34 UTC
(In reply to comment #13)
> int-82382 .... If so, this was fixed (but not released) for Diablo

It was fixed for Diablo in initfs-diablo 0.95.31.1-200848maemo1 after the
5.2008.43-7 release.
Comment 15 Lucas Maneos (reporter) 2010-05-04 13:14:53 UTC
(In reply to comment #14)
> It was fixed for Diablo in initfs-diablo 0.95.31.1-200848maemo1 after the
> 5.2008.43-7 release.

Yeah, I got that from the comments :-)  Just wondering if it is actually the
same bug.  I still have the old battery and am happy to post for testing, or
even better test myself if (hint hint nudge nudge) Nokia can be persuaded to
release the fixed binary as non-free for the community SSU.
Comment 16 Andre Klapper maemo.org 2010-05-04 22:39:23 UTC
I've linked internally to bug 3144. Let's see.