Bug 8787 - random but frequent HW watchdog reboots ("32wd_to"), in "specific" network environment
Status: RESOLVED WORKSFORME
Product: Core
Component: general
Version: 5.0/(2.2009.51-1)
Platform: All Maemo
Importance: Unspecified normal with 1 vote
Target Milestone: ---
Assigned To: unassigned
QA Contact: core-general-bugs
Keywords: crash

Reported: 2010-02-02 17:04 UTC by Frederic Crozat
Modified: 2010-10-11 15:16 UTC (History)

Attachments
mtd2 file (289 bytes, application/octet-stream)
2010-02-02 19:27 UTC, Frederic Crozat
cat from /dev/mtd2 (256.00 KB, text/plain)
2010-04-13 23:33 UTC, Brent McNabb



Description Frederic Crozat (reporter) 2010-02-02 17:04:41 UTC
This is a follow-up to bug #6334 for people who still see 32wd_to reboots even
after a full reflash (system + eMMC) to firmware 2.2009.51-1.

If I disable WLAN autoconnection / switch to wlan when detected and keep only
3G enabled, I don't see any 32wd_to when I'm at my office (it is the only place
where I see those reboots).

I'll continue testing and try enabling only some of the WLANs in my
configuration.
Comment 1 Eero Tamminen nokia 2010-02-02 19:20:35 UTC
If you have something in /dev/mtd2 oops partition, could you attach it here?
Thanks!
Comment 2 Frederic Crozat (reporter) 2010-02-02 19:27:52 UTC
Created an attachment (id=2198) [details]
mtd2 file
Comment 3 Eero Tamminen nokia 2010-02-02 19:37:30 UTC
(In reply to comment #2)
> Created an attachment (id=2198) [details]
> mtd2 file

Ok, you don't have any oopses i.e. the device or kernel just freezes and then
watchdog reboots it.  Do you notice these freezes before the reboot?
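
For anyone who wants to check their own dump before attaching it, here is a minimal sketch in Python. It assumes the dump was saved to a file (for example with cat from /dev/mtd2), that erased flash reads back as 0xFF bytes, and that an oops would contain one of a few common kernel phrases. The marker list and the function name are illustrative assumptions, not part of any Maemo tool.

```python
# Phrases that commonly appear in kernel oops/panic output.
# This list is an assumption for illustration; extend it as needed.
OOPS_MARKERS = (b"Oops", b"Unable to handle kernel", b"Kernel panic")


def summarize_dump(data: bytes) -> str:
    """Classify a raw dump of the oops partition as empty / oops / no-oops."""
    # Erased flash reads back as 0xFF; also ignore NUL padding at the ends.
    payload = data.strip(b"\xff\x00")
    if not payload:
        return "empty"
    if any(marker in payload for marker in OOPS_MARKERS):
        return "oops"
    # Something was written, but none of the known oops phrases are present.
    return "no-oops"
```

Reading the saved file in binary mode and passing its contents to summarize_dump() gives a quick answer; a "no-oops" or "empty" result together with a 32wd_to bootreason matches the freeze-then-watchdog case described above.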
Comment 4 Frederic Crozat (reporter) 2010-02-02 19:52:28 UTC
Unfortunately, the device usually reboots while it is sitting in my pocket,
doing nothing (except maybe MfE syncing Google contacts / agenda).

So far, no reboot today, with WLAN auto-switch disabled. 

Tomorrow, I'll try to enable WLAN scanning but no auto-switch when detected.
Comment 5 Frederic Crozat (reporter) 2010-02-04 19:33:34 UTC
Results from yesterday's tests:
- WLAN scanning enabled, but "switch from 3G to WLAN if available" disabled
- device stayed connected to 3G the entire day
- no 32wd_to reset

Results from today's tests:
- WLAN scanning enabled AND allowed to switch from 3G to WLAN if available
- all but one of the office WLAN APs configured on the device disabled, to
be sure it would not try to connect to another WLAN
- no 32wd_to reset

So it seems that either one specific WLAN AP or the combination of several
WLAN APs at the office is causing the reboots. I'll continue my tests.
Comment 6 Frederic Crozat (reporter) 2010-02-16 12:01:30 UTC
Results from new tests:
- I wasn't at my office last week: no crash
- yesterday, I enabled both WLAN APs available in my office: no crash
- today, same configuration (two WLAN APs available) and I just got a 32wd_to
crash while the device was doing "nothing" (maybe Exchange was syncing calendar
/ contacts, I don't know). No oops detected.

It really looks like the device doesn't like switching between two APs (one has
a weaker signal than the other).
Comment 7 Frederic Crozat (reporter) 2010-02-16 12:15:56 UTC
I've just updated my device to 3.2010.02-8 using SSU. We'll see if it is more
stable..
Comment 8 Mikko O 2010-02-18 10:41:47 UTC
I get possibly the same bug. 32wd_to bootreason. Nothing in /dev/mtd2. I have
updated to 3.2010.02-8. Before that I re-flashed both system and eMMC.

I have only one wireless AP here, so no switching is going on, but it still
reboots.

I haven't witnessed any of the reboots as they've happened, unfortunately.

(My AP seems to be "flaky". It will sometimes disconnect a client for no
apparent reason. Will put new AP in its place.)
Comment 9 Frederic Crozat (reporter) 2010-03-18 08:32:18 UTC
I hit this bug about once a week on average with PR 1.1.1, without doing
anything on the system, and only when two configured APs are available (at work).

Do you need AP models ?
Comment 10 Eero Tamminen nokia 2010-03-18 12:59:22 UTC
(In reply to comment #9)
> Do you need AP models ?

Yes, please.  If others have similar issues, they can check whether they're
using the same AP.

If reproducibility is once a week, it happens only with specific AP HW/SW and
there isn't even any kernel oops, I don't see how this could be fixed (and be
easily verified to be fixed).  Without any kind of backtraces, bug needs to be
reliably reproducible (within few hours) for it to be debuggable. :-/
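
To put rough numbers on that, here is a small sketch in Python. It assumes the reboots behave like a Poisson process with an average rate of one per week of continuous exposure; that model and the 95% threshold are simplifying assumptions, not something measured in this bug.

```python
import math

HOURS_PER_WEEK = 7 * 24


def p_crash_within(hours: float, rate_per_hour: float) -> float:
    """Probability of seeing at least one crash within `hours` (Poisson model)."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def hours_for_confidence(confidence: float, rate_per_hour: float) -> float:
    """Crash-free hours needed before 'it is fixed' is believable at `confidence`."""
    return -math.log(1.0 - confidence) / rate_per_hour


rate = 1.0 / HOURS_PER_WEEK  # assumed: one watchdog reboot per week
print(f"chance of a crash in a 3 hour session: {p_crash_within(3, rate):.1%}")
print(f"crash-free days needed for 95%: {hours_for_confidence(0.95, rate) / 24:.0f}")
```

Under these assumptions a three-hour debugging session has roughly a 2% chance of hitting the bug, and it takes about three weeks without a reboot before a candidate fix can be trusted.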
Comment 11 Brent McNabb 2010-04-13 23:33:59 UTC
Created an attachment (id=2594) [details]
cat from /dev/mtd2
Comment 12 Brent McNabb 2010-04-13 23:34:28 UTC
I have a similar problem with 32wd_to reboots.  Only it seems to be a different
network environment that causes mine.  WLAN connections are fine, but when I am
using an EDGE data connection I get frequent reboots.  Signal strength has no
effect, I even get the reboots with strong signal.  I have the cellular radio
set to GSM-only, as I have AT&T as a carrier.

I do have something in /dev/mtd2, which I am attaching.  From looking at the
file, it may be due to memory errors perhaps?  It just seems strange that WLAN
connections don't trigger it, but cell connections do.  Once the reboots start,
they come about every five minutes or so.
Comment 13 Eero Tamminen nokia 2010-04-15 10:46:39 UTC
(In reply to comment #12)
> I have a similar problem with 32wd_to reboots.  Only it seems to be
> a different network environment that causes mine.  WLAN connections are
> fine, but when I am using an EDGE data connection I get frequent reboots.
>  Signal strength has no effect, I even get the reboots with strong signal.
> I have the cellular radio set to GSM-only, as I have AT&T as a carrier.
>
> I do have something in /dev/mtd2, which I am attaching.  From looking at the
> file, it may be due to memory errors perhaps? It just seems strange that WLAN
> connections don't trigger it, but cell connections do.

Your kernel oops partition has two types of reboots.  The first ones are in
rate_control_get_rate(), which I think is the same as bug 7029 and should
hopefully be fixed in PR1.2 (the bug states it's related to the WLAN
connection, but I guess it could also happen with the phone network).

The latter ones are memory related, but originate from file system operations.
It's possible that the earlier oopses have corrupted your file system.
When the PR1.2 release comes (which also has lots of other fixes), I would
suggest that you reflash the device instead of using SSU.


> Once the reboots start, they come about every five minutes or so.

This sounds like reboot wouldn't clear the device state completely...
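
Sorting the oopses in a dump by the function they occurred in can be sketched in Python as below. It assumes the dump contains ARM-style oops text, where each report has a line of the form "PC is at function+0xoffset/0xsize"; the function name faulting_functions is a hypothetical helper, and any dump that uses a different layout would need a different pattern.

```python
import re
from collections import Counter

# ARM oops reports include a line such as:
#   PC is at some_function+0x1c/0x90 [module]
# The pattern captures only the symbol name before the offset.
PC_LINE = re.compile(rb"PC is at ([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def faulting_functions(dump: bytes) -> Counter:
    """Count how many oops reports in a raw dump point at each function."""
    return Counter(m.group(1).decode("ascii") for m in PC_LINE.finditer(dump))
```

Calling faulting_functions() on the bytes of a saved dump and printing most_common() shows at a glance whether one function, such as rate_control_get_rate, dominates or whether several unrelated faults are mixed together.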
Comment 14 Frederic Crozat (reporter) 2010-04-15 12:01:14 UTC
I'm still getting the 32wd_to reboot several times a week at the office.

Both access points are the same model DLink DAP-1160, configured in WPA, but
with different passwords. Moreover, one is near my desk and the other at the
other end of the building so its signal is weak from my desk (but it is
probably the first one picked by the phone when entering the building).
Comment 15 Frederic Crozat (reporter) 2010-04-28 17:36:41 UTC
I just got the reboot with only one of the access points configured on the
N900, so there is no need for two similar DLink DAP-1160 units.
Comment 16 Andre Klapper maemo.org 2010-08-26 20:10:16 UTC
Does this still happen in 10.2010.19-1?
Comment 17 Frederic Crozat (reporter) 2010-08-26 20:19:37 UTC
I can't confirm, because I'm not working in the same company so I don't have
the same wifi setup.

But I know I haven't got this watchdog reboot for a long time..
Comment 18 Andre Klapper maemo.org 2010-10-11 15:07:19 UTC
(In reply to comment #17)
> I can't confirm, because I'm not working in the same company so I don't have
> the same wifi setup.

Fred, has this ever happened again since then?
Comment 19 Frederic Crozat (reporter) 2010-10-11 15:10:00 UTC
no, never
Comment 20 Andre Klapper maemo.org 2010-10-11 15:16:24 UTC
Yay, good for you, bad for reproducing. :-P