Bug 3281

Summary: Sporadic lockup/reboot when USB host software releases peripheral
Product: [Maemo Official Platform] Core Reporter: Till Harbaum <till>
Component: Kernel Assignee: unassigned <nobody>
Status: RESOLVED WONTFIX QA Contact: linux-kernel-bugs
Severity: critical    
Priority: Medium CC: andre_klapper, eero.tamminen, Florent.de.Dinechin, jdneal, johannes, Kalle.Valo, olli.kattelus, quim.gil, till
Version: 4.0 Keywords: crash, moreinfo
Target Milestone: ---   
Hardware: ARM   
OS: Maemo   

Description Till Harbaum (reporter) 2008-06-23 13:57:38 UTC
SOFTWARE VERSION:
2.2007.51-3, also seen on earlier versions on n800 as well as n810

STEPS TO REPRODUCE THE PROBLEM:
Use a USB peripheral with the IT in USB host mode (it doesn't make a difference
if an OTG master capable cable is being used or if host mode is enabled by
software like the USB control applet). If there's no matching kernel driver
installed, other software accessing the device (e.g. lsusb or a libusb based
client software) sometimes makes the IT lock up on exit/release, which finally
results in a reboot. Bug #1375 seems to indicate that unloading a kernel driver
also triggers this.

This does not happen if a matching driver is installed in the kernel. E.g. lsusb
with mass storage and HID devices doesn't seem to trigger this problem. I have
even tested this with different firmware versions of my own tiltstick hardware
(see http://www.harbaum.org/till/tiltstick). If the device reports itself as a
HID device everything is good (as there is a HID driver in the IT-OS2008
kernel); once I make it into an unknown device, a libusb based demo app as well
as lsusb sometimes trigger a reboot on application exit/disconnect.

This malfunction has been seen with several of my own devices (incl. the
tiltstick in non-HID mode, the i2c-tiny-usb) as well as an RS232-to-USB cable
and the USB ethernet dongle.

EXPECTED OUTCOME:
Software should be able to detach from the USB system without the IT rebooting.

ACTUAL OUTCOME:
In about 1 of 10 cases the IT freezes and reboots (after the watchdog
triggers).

REPRODUCIBILITY:
sometimes
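
Because the hang only shows up in roughly 1 of 10 attempts, repeating the triggering command in a loop makes it easier to catch. The following is a minimal sketch, not part of the original report: the helper name `repeat_until_hang`, the log file argument and the `REPRO_DELAY` variable are all hypothetical. It writes a progress marker to disk before and after each run, so that after a watchdog reboot the log still shows which iteration was in flight.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): run a command repeatedly and
# record progress on disk so the log survives a watchdog reboot.
# Usage: repeat_until_hang COUNT LOGFILE COMMAND [ARGS...]
repeat_until_hang() {
    count=$1
    logfile=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "start $i" >> "$logfile"
        sync                          # flush the marker before the risky call
        "$@" > /dev/null 2>&1         # exit status is ignored; only hangs matter
        sleep "${REPRO_DELAY:-1}"     # give a delayed freeze time to happen
        echo "done $i" >> "$logfile"
        sync
        i=$((i + 1))
    done
}
```

For example, `repeat_until_hang 50 /home/user/usb-repro.log lsusb` (the log path is only an illustration). A trailing `start N` line without a matching `done N` after the reboot would suggest that iteration N was the one that froze the device.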

EXTRA SOFTWARE INSTALLED:
libusb incl. home-made libusb clients, lsusb

OTHER COMMENTS:
Bug #1375 (https://bugs.maemo.org/show_bug.cgi?id=1375) looks like it's a
subset of this one.

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; de; rv:1.9)
Gecko/2008052906 Firefox/3.0
Comment 1 Andre Klapper maemo.org 2008-06-24 02:24:07 UTC
Does the workaround described in bug 1375 work for you?
Modify /usr/sbin/osso-usb-mass-storage-*.sh to check whether g_ether is present
like /etc/init.d/ke-recv does.
Comment 2 Andre Klapper maemo.org 2008-07-01 16:57:03 UTC
(In reply to comment #1)
> Does the workaround described in bug 1375 work for you?
> Modify /usr/sbin/osso-usb-mass-storage-*.sh to check whether g_ether is present
> like /etc/init.d/ke-recv does.

...and this is still valid on Diablo, I guess?
Comment 3 Johannes Berg 2008-07-29 11:13:50 UTC
I found this bug using gphoto2 and my digital camera.

Additionally, I have connected a serial console to the serial pads under the
battery and found that when the system locks up, no output is generated, i.e.
there is a deadlock condition rather than a kernel oops.

Unfortunately, the free flasher 0xFFFF cannot flash Diablo yet, so I have not
yet been able to test either that or the workaround. The workaround would seem
to be ineffective though, because I have neither module present anyway.
Comment 4 Till Harbaum (reporter) 2008-07-29 21:54:37 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Does the workaround described in bug 1375 work for you?
> > Modify /usr/sbin/osso-usb-mass-storage-*.sh to check whether g_ether is present
> > like /etc/init.d/ke-recv does.
> 
> ...and this is still valid on Diablo, I guess?
> 
That workaround addresses specifically the g_ether driver. I am not using that
driver.

And yes, this still happens with diablo as well.
Comment 5 Eero Tamminen nokia 2008-10-27 18:38:32 UTC
(In reply to comment #4)
> > > Does the workaround described in bug 1375 work for you?
> > > Modify /usr/sbin/osso-usb-mass-storage-*.sh to check whether g_ether is
> > > present like /etc/init.d/ke-recv does.
> > 
> > ...and this is still valid on Diablo, I guess?
> > 
> That workaround addresses specifically the g_ether driver.
> I am not using that driver.

The workaround addressed the automatic g_file_storage insmod by
/usr/sbin/osso-usb-mass-storage-*.sh when USB cable was connected.

g_ether is just part of the use-case in that particular bug.
Comment 6 Johannes Berg 2008-10-27 18:45:31 UTC
But since we aren't using g_ether, adding a workaround that checks for it will
be ineffective. We'd have to completely disable the loading? I don't have the
adapter with me right now to try though, I'll try to remember to get it and try
next weekend.
Comment 7 Eero Tamminen nokia 2008-10-28 10:00:57 UTC
(In reply to comment #6)
> But since we aren't using g_ether, adding a workaround that checks for it will
> be ineffective. We'd have to completely disable the loading? I don't have the
> adapter with me right now to try though, I'll try to remember to get it and
> try next weekend.

Please check whether your version of the /usr/sbin/*.sh scripts checks whether
the module should be insmodded before it insmods it.  I don't currently have a
device with Chinook in it.
Comment 8 Johannes Berg 2008-10-28 10:18:22 UTC
mass-storage-enable.sh:

/sbin/lsmod | grep g_file_storage >/dev/null
if [ $? != 0 ]; then
    DIR=...
    if [ -f $DIR/g_file_storage.ko ]; then
        /sbin/insmod ...
        RC=$?
    fi
fi
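
For reference, the workaround discussed for bug 1375 amounts to also checking for g_ether before loading g_file_storage. The following is only a sketch of that idea under stated assumptions, not the actual Maemo script: the function names and the `MODULES_FILE` variable are hypothetical, and the module list is assumed to have the `/proc/modules` layout (module name first on each line). As noted earlier in this thread, this does not address hangs where neither gadget module is involved.

```shell
#!/bin/sh
# Sketch only. MODULES_FILE is an assumption made so the check can be tried
# against a saved module list; on a real system the list is /proc/modules.
MODULES_FILE=${MODULES_FILE:-/proc/modules}

# Return 0 if the named module appears in the module list.
module_loaded() {
    grep -q "^$1 " "$MODULES_FILE"
}

# Return 0 only when loading g_file_storage looks safe: it is not loaded yet
# and g_ether is not already loaded as a gadget driver.
should_load_file_storage() {
    module_loaded g_file_storage && return 1
    module_loaded g_ether && return 1
    return 0
}
```

A script could then wrap its insmod call in `if should_load_file_storage; then ... fi`, leaving the elided DIR and insmod arguments exactly as they are in the shipped script.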
Comment 9 Eero Tamminen nokia 2008-10-28 12:16:06 UTC
I remembered bug 1375 a bit wrong.  It was triggered by this:
2. rmmod g_file_storage
3. insmod /mnt/initfs/lib/modules/2.6.21-omap1/g_ether.ko
4. connect and disconnect USB cable to the host
5. rmmod g_ether.ko
6. ifconfig -a

- When connecting the USB cable in step 4), the insmod run by the ke-recv
script outputs this:
------------
[ 5272.250000] usb0: high speed config #1: 100 mA, Ethernet Gadget, using CDC
Ethernet
insmod: cannot insert '/mnt/initfs/lib/modules/2.6.21-omap1/g_file_storage.ko':
Device or resource busy (-1): Device or resource busy
/usr/sbin/osso-usb-mass-storage-enable.sh: failed to install g_file_storage
------------
  - without this step, there are no problems
- When doing "ifconfig" in step 6), device freezes.
  Shortly after that, HW watchdog rebooted the device (always)


> Bug #1375 seems to indicate that unloading a kernel driver also triggers this.

It was triggered by failed kernel USB module loading (bug in USB modules
refcounting) and is fixed in the next public release.


If you check the "cat /dev/xconsole" output from your use-case, does your case
seem to be the same?   And can it be fixed by checking for the module that in
your case makes the g_file_storage insmod fail (i.e. is this a duplicate of
1375)?

If not, could you use the xconsole information to find a use-case that's
reliably reproducible?
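
One way to perform the check asked for above is to save the console output to a file and search it for the insmod failure message quoted earlier in this comment. This is a minimal sketch; the function name `find_failed_insmod` is hypothetical, and the search string is taken from the log excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: print every "insmod: cannot insert" line found in the
# given log file(s), or on stdin when no file is given.
# Returns non-zero when no such line is found.
find_failed_insmod() {
    grep "insmod: cannot insert" "$@"
}
```

For example, `find_failed_insmod saved-xconsole.txt`, where the file name stands for wherever the console output was saved. No output would suggest that a failed insmod is not involved in that run.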
Comment 10 Johannes Berg 2008-10-29 01:19:43 UTC
I don't think I understand what you're asking, sorry.

I've even hooked up serial console, but except boot messages you get nothing
when the device hangs, it just deadlocks, no debug info, no oops, nothing.

And as far as I know there's no event when userspace deregisters, so I don't
see the modules being loaded? I can totally disable that though and recheck,
next weekend. I don't think it's a duplicate though since insmod shouldn't be
involved.
Comment 11 Eero Tamminen nokia 2008-10-29 11:23:09 UTC
(In reply to comment #10)
> I don't think I understand what you're asking, sorry.
> 
> I've even hooked up serial console, but except boot messages you get nothing
> when the device hangs, it just deadlocks, no debug info, no oops, nothing.

And you have serial output even after kernel has booted?

If not, enable the serial console RD flag with the flasher AND *remember to
disable it when you don't have the serial cable connected* (otherwise you get
random data on the serial line, such as SysRq keyboard shortcuts for reboot etc).


> And as far as I know there's no event when userspace deregisters, so I don't
> see the modules being loaded?

You could see whether there's an insmod that fails.


> I can totally disable that though and recheck,
> next weekend. I don't think it's a duplicate though since insmod shouldn't be
> involved.

>> Use a USB peripheral with the IT in USB host mode
[...]
>> This malfunction has been seen with various of my own devices (incl. the
>> tiltstick in non-hid mode, the i2c-tiny-usb) as well as a rs232-to-usb
>> cable and the usb ethernet dongle.

Please give *explicit* steps on how to trigger the bug.

Plugging in USB cable normally causes the ke-recv scripts to do insmod on
g_file_storage.
Comment 12 Johannes Berg 2008-10-29 14:58:36 UTC
(In reply to comment #11)

> > I've even hooked up serial console, but except boot messages you get nothing
> > when the device hangs, it just deadlocks, no debug info, no oops, nothing.
> 
> And you have serial output even after kernel has booted?

Yes, of course. I haven't tried actually logging in on it though.

> If not, enabled serial console RD flag with the flasher AND *remember to
> disable it when you don't have the serial cable connected* (otherwise you get
> random data in the serial, such as SysRq keyboard shortcuts for reboot etc).

It also uses a lot of power :)

> You could see whether there's an insmod that fails.

Are insmods logged? I see nothing when it crashes.

> Please give *explicit* steps on how to trigger the bug.
> 
> Plugging in USB cable normally causes the ke-recv scripts to do insmod on
> g_file_storage.

Ok, well, I used one of those maemo packages that puts the device into
host mode first, obviously.

Then, I hooked up my Nikon D40 camera and used gphoto2 (compiled myself, can
give you packages if you want) to make the camera take a picture. As gphoto2
quits/disconnects from the camera (which has taken a picture just fine!) the
system hangs.
Comment 13 Johannes Berg 2008-11-24 15:32:34 UTC
(In reply to comment #11)

> Please give *explicit* steps on how to trigger the bug.

Alright:

Steps to reproduce:
 0) install gphoto2
 1) use sysfs to put n810 into host mode
 2) connect Nikon D40 camera in PTP mode with a USB AF-to-AF adapter
 3) open (root) terminal
 4) gphoto2 --capture-image --camera="Nikon DSC D40 (PTP mode)"
 5) watch it hang as soon as gphoto2 quits

However, right now I'm unable to reproduce as I'm unable to get the N810 to
recognise the camera at all, lsusb lists nothing, dmesg shows nothing, etc.
I've managed to get it working before, but not today, for some reason. Are you
aware of anything I could be doing wrong?
Comment 14 Eero Tamminen nokia 2008-11-24 16:17:46 UTC
> 1) use sysfs to put n810 into host mode

I've never used this, what exact commands do you use?


> However, right now I'm unable to reproduce as I'm unable to get the N810 to
> recognise the camera at all, lsusb lists nothing, dmesg shows nothing, etc.
> I've managed to get it working before, but not today, for some reason. Are you
> aware of anything I could be doing wrong?

Maybe there were some extra steps that need to be done (after boot) before you
do the steps described above?

Does the gphoto verbose mode tell anything more?
Comment 15 Johannes Berg 2008-11-24 16:26:48 UTC
(In reply to comment #14)
> > 1) use sysfs to put n810 into host mode
> 
> I've never used this, what exact commands you use?

echo -n host > /sys/devices/platform/musb*/mode

(not sure about the exact path but musb* should expand to the right thing)
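
Since the exact path is uncertain, a slightly more defensive version of the command above can report when the glob matches nothing instead of failing silently. This is only a sketch: the function name and the `SYSFS_PLATFORM` variable are hypothetical, and the `musb*/mode` location is the unconfirmed one given above.

```shell
#!/bin/sh
# SYSFS_PLATFORM is an assumption so the function can be tried against a
# scratch directory; the path quoted above is /sys/devices/platform/musb*/mode.
SYSFS_PLATFORM=${SYSFS_PLATFORM:-/sys/devices/platform}

# Write "host" (no trailing newline, like echo -n) to every matching mode
# file. Returns non-zero if no mode file exists or a write fails.
set_usb_host_mode() {
    found=1
    for f in "$SYSFS_PLATFORM"/musb*/mode; do
        [ -e "$f" ] || continue       # glob did not match anything
        printf 'host' > "$f" || return 1
        found=0
    done
    return "$found"
}
```

Run as root, `set_usb_host_mode || echo "no musb mode file found"` would make a missing sysfs entry visible, which may help when the device is not recognised at all.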

> Maybe there were some extra steps that need to be done (after boot) before you
> do the steps describe above?

Maybe, I don't remember, sorry. Maybe Till can help us out here?

> Does the gphoto verbose mode tell anything more?

Well, it simply doesn't find the device, which is not surprising since lsusb
doesn't list it either.
Comment 16 Olli Kattelus 2008-12-10 13:36:22 UTC
Hi folks!

I think I have the very same problem and it is very easy to reproduce. I have
two different MP3 players, Creative Zen V Plus and Sandisk Sansa Clip. Both of
these cause the same problem with these steps:

1. Turn "host" mode on (hacked cable, sw... it doesn't matter)
2. Attach device
3. Wait a few seconds for recognition of the device (OS says that the device is
not supported and gives another warning that the file system doesn't exist)
4. Use the console and type "lsmod" (the device should be correctly in the
listing). After a few seconds the whole system hangs and eventually reboots.

I have been developing an app using the MTP protocol to manage this kind of
device. First I thought that libmtp caused this behaviour, but now I know that
it also appears with this lsmod command, which is why it must be related to the
system.

Is there ANY workaround for this? I understood that this problem is strongly
related to unsupported devices.
Comment 17 Olli Kattelus 2008-12-10 13:58:28 UTC
A correction to my previous message:

By "lsmod" I mean "lsusb" of course. I also forgot to mention that I'm
using a Nokia N810 with Maemo Diablo.
Comment 18 Johannes Berg 2008-12-10 14:27:33 UTC
Yeah, that definitely looks like the same problem and is consistent with what
I've been seeing, except that I don't think I was able to trigger it with
lsusb, but only with applications actually using the device from userland, but
maybe the problem is just generically with userland (usbdevfs) accesses to
devices when in host mode.

Too bad it isn't fixed on Diablo, I was hoping when I can finally upgrade it
would work...

Is there lockdep for ARM btw? Might be worth trying, this really looks like a
spin deadlock somewhere and the watchdog then reboots.
Comment 19 Eero Tamminen nokia 2008-12-10 17:04:34 UTC
> Is there lockdep for ARM btw?

I don't know, but I would assume so, that feature's pretty old.


> Might be worth trying, this really looks like a spin deadlock somewhere and the watchdog then reboots.

I'm hoping the partial fix for bug 1375 coming in next release improves also
this.
Comment 20 Quim Gil nokia 2009-01-17 00:50:00 UTC
It's late and it's Friday so bear with me if I'm saying something stupid
here...

Is USB Host Mode supported at all?

It's good that with hacks you can get the functionality, but if there is a bug
report... do we deserve it? Non supported functionality is not tested, if it's
ever officially supported then it goes through testing prior to release.
Comment 21 Ryan Abel maemo.org 2009-01-17 06:47:04 UTC
(In reply to comment #20)
> Is USB Host Mode supported at all?
> 

Well, you do ship a USB port with OTG support enabled. So, arguably, yes.

> It's good that with hacks you can get the functionality, but if there is a bug
> report... do we deserve it? Non supported functionality is not tested, if it's
> ever officially supported then it goes through testing prior to release.
> 

No "hacks" (Quim, perhaps your definition and mine are different, but we
frequently seem to disagree on what constitutes a "hack". . . .) involved.
Plug in a USB device with the proper cable and it'll be detected and mounted.
Comment 22 Quim Gil nokia 2009-01-31 23:50:56 UTC
I think we should resolve this as WONTFIX. USB OTG is technically available but
not productized (features, QA, testing...) nor advertised to end users. There
is no work in our team to fix this in Diablo. It is fair to say that if Maemo
ever supports USB OTG someone will test plugging and unplugging systematically
several types of devices.
Comment 23 Andre Klapper maemo.org 2009-02-02 12:51:13 UTC
OK, so if it's not "officially supported", then WONTFIX is an honest answer
here.
Comment 24 Till Harbaum (reporter) 2009-02-02 21:33:29 UTC
I don't think it's that simple. I'd agree with the n800 on 2007 which really
needed hardware hacks (due to the mini-b-only receptacle) as well as software
hacks. But an n810 under Diablo? It has a micro-ab receptacle clearly telling
the end user that this device was meant to work as a host and it comes with
matching kernel support and even with a set of basic host mode drivers. This
device was clearly meant to be used as an OTG host.

This is like claiming that your phones were never meant to be used to call my
mom because you don't specifically advertise this special use case nor do you
test that one can actually call her.
Comment 25 Joshua Neal 2009-10-20 10:35:43 UTC
I came up with the idea of doing a nice pocket I2C/SPI/etc. analyzer tool for
the N900, as part of the PUSH N900 contest.  Unfortunately the boards I want
to use as a bus interface require USB host functionality, which the N900 lacks.
 Oh well, still inspired, I decided to try to implement something on the
n810.  I finally managed to secure the practically unobtainable OTG cable for
the n810, and encountered this (or at least a very similar issue) almost
immediately.  (Using release RX-34+RX-44+RX-48_DIABLO_5.2008.43-7_PR_MR0.)

Was there ever a satisfactory workaround for this?

FWIW, my take is that this was an advertised feature of the N810, and was very
much mentioned in developer-facing materials.  For example the following, where
it is listed under connectivity options:

http://developer.nokia.com/devices/N810
Comment 26 Florent.de.Dinechin 2010-01-14 22:15:46 UTC
*** This bug has been confirmed by popular vote. ***
Comment 27 Florent.de.Dinechin 2010-01-14 22:54:45 UTC
Sorry if it is closed already, I'd like to get a synthetic workaround, even
a dirty one. I couldn't figure one out from this page or anywhere else.

Same bug with n800, diablo, and a Velleman k8055 card.
Plugging in the card is OK, but an alert opens after a while saying "can't
connect, file system inaccessible" (more or less, I get this message in
Russian). Then the N800 works OK as long as I don't try to access the card, I
can even unplug the usb, etc.

Then when I do a lsusb, the N800 prints the proper output, then (always and
immediately) freezes and then reboots. It actually lives a short while (half a
second?) after the lsusb itself. For example (see below), for lsusb; lsmod it
has the time to do the lsmod. In another experiment I wrote a program that
(through libk8055, thus libusb) lights the LEDs of the K8055 board in sequence,
and I got half a second of animation. So the usb works, but something that
tries to come on top of it doesn't, and kills the whole system.

Some outputs (after OTG mode has been switched on and the usb inserted):

Nokia-N800-43-7:~# dmesg
(...)
[  291.218750] cx3110x: PSM dynamic with 100 ms CAM timeout.
[  721.796875] EAC mode: play enabled, rec enabled
[  722.609375] cx3110x: PSM dynamic with 200 ms CAM timeout.
[  724.304687] EAC mode: play disabled, rec disabled
[  735.046875] EAC mode: play enabled, rec enabled
[  736.554687] tusb_source_power 633: VBUS a_wait_vrise, devctl 81 otg 184 conf
c0010001 prcm 00a80500
[  738.609375] EAC mode: play disabled, rec disabled
[  738.890625] musb_stage0_irq 645: CONNECT (a_host) devctl 3d
[  738.890625] hub 1-0:1.0: state 8 ports 1 chg 0000 evt 0000
[  738.890625] usb usb1: usb auto-resume
[  738.890625] usb usb1: finish resume
[  738.890625] hub 1-0:1.0: hub_resume
[  738.914062] hub 1-0:1.0: port 1, status 0301, change 0001, 1.5 Mb/s
[  739.070312] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status
0x301
[  739.195312] usb 1-1: new low speed USB device using musb_hdrc and address 2
[  739.328125] usb 1-1: skipped 1 descriptor after interface
[  739.328125] usb 1-1: default language 0x0409
[  739.328125] usb 1-1: new device strings: Mfr=1, Product=2, SerialNumber=0
[  739.328125] usb 1-1: Product: USB K8055
[  739.328125] usb 1-1: Manufacturer: Velleman 
[  739.328125] usb 1-1: device v10cf p5500 is not supported
[  739.328125] usb 1-1: uevent
[  739.328125] usb 1-1: usb_probe_device
[  739.328125] usb 1-1: configuration #1 chosen from 1 choice
[  739.328125] usb 1-1: adding 1-1:1.0 (config #1, interface 0)
[  739.328125] usb 1-1:1.0: uevent
[  739.328125] usbhid 1-1:1.0: usb_probe_interface
[  739.328125] usbhid 1-1:1.0: usb_probe_interface - got id
[  739.328125]
/home/bifh4/diablo-uarm-prereleased.gcc34qemu/work/kernel-diablo-2.6.21/kernel-source-diablo/drivers/usb/input/hid-core.c:
HID probe called for ifnum 0
[  739.335937]
/home/bifh4/diablo-uarm-prereleased.gcc34qemu/work/kernel-diablo-2.6.21/kernel-source-diablo/drivers/usb/input/hid-core.c:
submitting ctrl urb: Get_Report wValue=0x0100 wIndex=0x0000 wLength=8
[  739.335937] HID device not claimed by input or hiddev
[  739.335937] usbtest 1-1:1.0: usb_probe_interface
[  739.335937] usbtest 1-1:1.0: usb_probe_interface - got id
[  739.335937]
/home/bifh4/diablo-uarm-prereleased.gcc34qemu/work/kernel-diablo-2.6.21/kernel-source-diablo/drivers/usb/core/inode.c:
creating file '002'
[  739.335937] hub 1-0:1.0: 100mA power budget left
[  739.335937] hub 1-0:1.0: state 7 ports 1 chg 0000 evt 0002
[  739.335937] hub 1-0:1.0: port 1 enable change, status 00000303
[  741.335937] usb 1-1: usb auto-suspend
[  743.359375] hub 1-0:1.0: hub_suspend
[  743.359375] usb usb1: usb auto-suspend
[  754.531250] EAC mode: play enabled, rec enabled
[  758.140625] EAC mode: play disabled, rec disabled
[  763.937500] EAC mode: play enabled, rec enabled
[  766.437500] EAC mode: play disabled, rec disabled
[  824.000000] cx3110x: PSM dynamic with 100 ms CAM timeout.
Nokia-N800-43-7:~# lsusb; lsmod
Bus 001 Device 002: ID 10cf:5500 Velleman Components, Inc. 8055 Experiment
Interface Board (address=0)
Bus 001 Device 001: ID 0000:0000  
Module                  Size  Used by
g_file_storage 27656 0 - Live 0xbf055000
cx3110x 56200 0 - Live 0xbf046000
umac 258788 1 cx3110x, Live 0xbf005000 (P)
omap_rng 2956 0 - Live 0xbf003000
rng_core 4292 1 omap_rng, Live 0xbf000000
Nokia-N800-43-7:~# Read from remote host 192.168.0.17: Connection reset by peer

(this was an ssh connection to the n800)
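
Several comments in this thread tie the hang to devices that have no matching kernel driver, and the dmesg output above flags such a device with an "is not supported" line. As a small sketch (the function name is hypothetical), the vendor and product IDs can be pulled out of a saved dmesg log in the same form lsusb prints them:

```shell
#!/bin/sh
# Hypothetical helper: extract "vendor:product" IDs from dmesg lines of the
# form "usb 1-1: device v10cf p5500 is not supported".
# Reads the given file(s), or stdin when no file is given.
unsupported_usb_ids() {
    sed -n 's/.*device v\([0-9a-f]\{4\}\) p\([0-9a-f]\{4\}\) is not supported.*/\1:\2/p' "$@"
}
```

Piping `dmesg` into `unsupported_usb_ids` on the log above would be expected to print the same ID that lsusb reports for the Velleman board.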

Cheers, and thanks to all for the good work anyway.