maemo.org Bugzilla – Bug 2747
inconsistent mmc device naming at boot time when one card is missing
Last modified: 2010-10-26 00:08:57 UTC
STEPS TO REPRODUCE THE PROBLEM:
1. Install boot menu http://fanoush.wz.cz/maemo/#initfs
2. Copy system to mmc card
3a. Try to boot N800 with card only in external slot with internal slot empty
3b. Try to boot N810 with (builtin) card only in internal slot with external slot empty
4. System fails to boot because the correct device name /dev/mmcblk1 does not work, but the card can be accessed with the wrong device name /dev/mmcblk0
5. After the system fully boots this is corrected (because of udev rules?)
6. df output shows mismatched partition names. When booted from the second partition on such a card, with a VFAT partition there too, the output looks like
/dev/mmcblk0p2 /
/dev/mmcblk1p1 /media/mmc1
but both partitions are actually on the same card!

EXPECTED OUTCOME: slot naming and device names at boot time are consistent no matter what cards are inserted

ACTUAL OUTCOME: see 4. and 6. above

REPRODUCIBILITY: always with one slot empty

OTHER COMMENTS: Looks like the kernel always assigns /dev/mmcblk0 first, no matter which slot the card is in. This is an old bug that also occurred in OS2007/N800, but with N800 it was not so critical as almost nobody wants to boot from the external card with the internal slot empty. With N810 the slot naming is switched, so this happens quite often = always when one wants to boot from the builtin internal card with no card in the external slot. See
http://www.internettablettalk.com/forums/showthread.php?p=109055#post109055
http://www.internettablettalk.com/forums/showthread.php?p=124657#post124657
cat /proc/partitions is very confusing because of this bug - it returns information which differs from df when run on the same machine. For example, on my N810 with 8GB card in the miniSD slot (7.5GB VFAT partition and a second 0.5GB ext2 partition) I get the following output from df:

/home/user/bin # df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/mtdblock4            2048      1912       136  93% /mnt/initfs
none                       512        88       424  17% /mnt/initfs/tmp
/dev/mtdblock4          257536    169360     88176  66% /
none                       512        88       424  17% /tmp
none                      1024        24      1000   2% /dev
tmpfs                     1024         0      1024   0% /dev/shm
/dev/mmcblk0p1         1999206    380992   1618214  19% /media/mmc2
/dev/mmcblk1p1         7460608         4   7460604   0% /media/mmc1

The df command correctly identifies the 2GB internal flash memory (swap, maps etc.) mounted from mmcblk0p1 and my 7.5GB flash memory mounted from mmcblk1p1. However "cat /proc/partitions" has got it all wrong:

/home/user/bin # cat /proc/partitions
major minor  #blocks  name
  31     0      128 mtdblock0
  31     1      384 mtdblock1
  31     2     2048 mtdblock2
  31     3     2048 mtdblock3
  31     4   257536 mtdblock4
 254     0  7977472 mmcblk0
 254     1  7475199 mmcblk0p1
 254     2   502272 mmcblk0p2
 254     8  1966080 mmcblk1
 254     9  2007032 mmcblk1p1

Here it thinks device mmcblk0 is an 8GB device with two partitions and mmcblk1 is a 2GB device with a single partition, yet df correctly states that mmcblk0 is my internal 2GB memory, and mmcblk1 is the memory card with my 7.5GB and 0.5GB partitions.
Looks like this is by design, see the comment at http://lxr.linux.no/linux+v2.6.21/drivers/mmc/mmc_block.c#L470
... be unset for removable block devices ... Since MMC block devices clearly fall under the second case, we do not set GENHD_FL_REMOVABLE .....

So mmc block devices are 'removable/volatile' and come and go as cards are inserted and removed, and the first one is always mmcblk0 no matter what slot it is inserted into. If Nokia has the (IMO quite sane) requirement that slot device names should be consistent, it should patch the kernel device name allocation code http://lxr.linux.no/linux+v2.6.21/drivers/mmc/mmc_block.c#L414 to comply with this. I'll try to hack the code to do it (not sure when and not sure how).

Maybe this should be discussed with the mmc kernel maintainers (Pierre Ossman?). The dynamic nature of block devices is not a nice feature in a multislot configuration; perhaps there is no real reason to have it like this (except having an easy implementation via find_first_zero_bit)? It is not possible to prevent having only mmcblk1 without mmcblk0 in the real world anyway, since this may clearly happen with two cards by inserting and removing them in a different order. So it should not be a problem to allocate mmcblk1 first with mmcblk0 still unallocated.
Just to let you know that the bootmenu code http://fanoush.wz.cz/maemo/#initfs mentioned in comment #1 now has a workaround for this issue. You can still see the boot device name changing in the "Booting from ...." message and the confusion mentioned in other comments is still there, but at least the system boots. The workaround code is here (for curious people):

#figure out mmc slot device names, http://bugs.maemo.org/show_bug.cgi?id=2747
INT_CARD=""
EXT_CARD=""
case `grep product /proc/component_version` in
*SU-18)
    [ -d /sys/block/mmcblk0 ] && EXT_CARD="mmcblk0"
    ;;
*RX-*)
    for i in mmcblk0 mmcblk1 ; do
        if [ -d /sys/block/$i ] ; then
            if [ -f /sys/block/$i/device/../slot_name ] ; then
                # 2.6.21/OS2008
                slot_name=/sys/block/$i/device/../slot_name
            else
                # 2.6.18/OS2007
                slot_name=$(expr substr $(basename $(readlink /sys/block/$i/device) ) 1 4)
                slot_name=/sys/block/$i/device/../mmc_host:${slot_name}/slot_name
            fi
            case `cat $slot_name` in
            internal) INT_CARD=$i ;;
            external) EXT_CARD=$i ;;
            esac
        fi
    done
    ;;
esac

This code should still work when this issue is fixed in the kernel.
Created an attachment (id=1026) [details] make mmcblkX devices consistent with slot names, 'external' slot is mmcblk1
Attached is a patch which corrects this issue. After all it is really easy and straightforward. The mmc_host structure is extended with a preferred starting mmc device index (mmcblkX). Checking for 'e' in the omap driver is a bit hackish; the correct way would be to extend the platform data with the index. Since mmc device nodes are dynamic and may come and go, this patch does not add any additional irregularities caused by having mmcblk1 with no mmcblk0. This may already happen without this patch by inserting and removing cards in a specific order.
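To make the allocation idea concrete, here is a rough model in shell. It is not the attached patch (which changes C code in the kernel and is not reproduced here); it only assumes that "preferred starting index" means "take the first free index at or above the preferred one". The helper names preferred_index and alloc_index are hypothetical.

```shell
#!/bin/sh
# Illustrative model only: the real patch is C code in the kernel mmc driver.
# preferred_index and alloc_index are hypothetical helper names.

# Slot names starting with 'e' ("external") prefer index 1, all others 0.
preferred_index() {
    case "$1" in
        e*) echo 1 ;;
        *)  echo 0 ;;
    esac
}

# Print the first free index at or above the preferred one.
# $1 = slot name, remaining arguments = indices already in use.
alloc_index() {
    idx=$(preferred_index "$1")
    shift
    while :; do
        in_use=no
        for used in "$@"; do
            [ "$used" = "$idx" ] && in_use=yes
        done
        [ "$in_use" = no ] && break
        idx=$((idx + 1))
    done
    echo "$idx"
}

# External card detected first, nothing allocated yet: it still gets mmcblk1.
echo "external -> mmcblk$(alloc_index external)"
# Internal card detected second, index 1 already taken: it gets mmcblk0.
echo "internal -> mmcblk$(alloc_index internal 1)"
```

With the unpatched "first free index" behaviour both calls would start at 0, so whichever card is detected first would become mmcblk0.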
I wonder whether this has improved in Fremantle pre alpha, but probably too early to test...
Just FYI - another symptom of this bug. When booting from internal mmc on N810 while having card in external slot, system reboots when external card is removed, see http://www.internettablettalk.com/forums/showthread.php?t=26797 When card is inserted after system boots, everything works fine. There is also link to pre-built kernel there for those bitten by this bug. Also from the forum post: "this has fixed the issue" .... "What are the chances of this making into a standard distribution, so if there are any system updates in the future the kernel does not have to be re-built?". Good question :-)
(In reply to comment #7)
> There is also link to pre-built kernel there for those bitten by this bug.
Posting direct link also here just in case. Also the kernel is now updated with fix for bug #3243 too.
http://fanoush.wz.cz/maemo/kernel-diablo-2.6.21-200842maemo1.tar.gz
http://fanoush.wz.cz/maemo/modules-diablo-2.6.21-200842maemo1.tar.gz (optional)
(In reply to comment #6)
> I wonder whether this has improved in Fremantle pre alpha, but probably too
> early to test...
I wonder this too. I volunteer to test if someone provides the pieces and steps needed.
(In reply to comment #9)
> I wonder this too. I volunteer to test if someone provides the pieces and steps
> needed.
>
1. boot device with external card not present, once it boots completely start xterm and
2. save output of 'cat /proc/partitions'
3. save output of 'df'
4. boot device again, now with external card inserted before device is powered on/rebooted
5. after system boots do 2 again
6. do 3 again
7. make sure internal card is /dev/mmcblk0 on both /proc/partitions outputs 2 and 5 (check card size - big number in front of device name)
8. make sure /proc/partitions has same idea of device names as df output for both cases 2,3 and then also for 5,6. df prints device size in second column right after device name, it should be same/similar to size of partition with same name (mmcblk0p1 or mmcblk1p1) in /proc/partitions
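The size comparison in step 8 can be partly automated. The following is a minimal sketch, assuming both outputs were saved to files first. The helper names and the 5% tolerance are made up for illustration and are not part of the steps above.

```shell
#!/bin/sh
# Sketch: compare the size of an mmcblk partition as listed in a saved
# /proc/partitions file with the size df reports for the same name.
# The 5% tolerance is an arbitrary assumption.

# Print the #blocks column for device name $1 from partitions file $2.
partition_blocks() {
    awk -v name="$1" '$4 == name { print $3 }' "$2"
}

# Print the 1k-blocks column for /dev/$1 from saved df output $2.
df_blocks() {
    awk -v dev="/dev/$1" '$1 == dev { print $2; exit }' "$2"
}

# $1 = partition name, $2 = saved /proc/partitions, $3 = saved df output.
check_partition() {
    p=$(partition_blocks "$1" "$2")
    d=$(df_blocks "$1" "$3")
    if [ -z "$p" ] || [ -z "$d" ]; then
        echo "$1: missing"
        return
    fi
    # A filesystem is expected to be slightly smaller than its partition.
    if [ "$d" -le "$p" ] && [ $((p - d)) -le $((p / 20)) ]; then
        echo "$1: consistent"
    else
        echo "$1: MISMATCH"
    fi
}
```

Possible usage on the device: run "cat /proc/partitions > parts.txt" and "df > df.txt", then "check_partition mmcblk0p1 parts.txt df.txt" and the same for mmcblk1p1.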
Finally I got the time to go through this.

(In reply to comment #10)
> 1. boot device with external card not present, once it boots completely start
> xterm and
> 2. save output of 'cat /proc/partitions'
Internal mmc = mmcblk0
> 3. save output of 'df'
Internal mmc = mmcblk0
> 4. boot device again, now with external card inserted before device is powered
> on/rebooted
> 5. after system boots do 2 again
External memory card = mmcblk0
Internal memory card = mmcblk1 !!!
> 6. do 3 again
External memory card = mmcblk1
Internal memory card = mmcblk0 !!!
> 7. make sure internal card is /dev/mmcblk0 on both /proc/partitions outputs 2
> and 5 (check card size - big number in front of device name)
Nope
> 8. make sure /proc/partitions has same idea of device names as df output for
> both cases 2,3 and then also for 5,6. df prints device size in second column
> right after device name, it should be same/similar to size of partition with
> same name (mmcblk0p1 or mmcblk1p1) in /proc/partitions
Nope

Thanks for guiding the steps!
Internal comment is "check devnodes in /dev, not /proc/partitions, hence INVALID". I assume that documenting this would be welcome? Hmm, if so, where? :-/
(In reply to comment #12)
> Internal comment is "check devnodes in /dev, not /proc/partitions, hence
> INVALID".
My translation: "Yes, we know we didn't fix it properly. Instead of fixing it in kernel (in few lines) so both kernel and userspace have the same view of mmcblkx devices, we made a half working fix in userspace (udev rules?) so device names in /dev do not change because we hacked the naming. Sadly this does not help with data in kernel files like /proc/partitions or /proc/mounts because kernel POV is now different"

Yes, the names in /dev/ look correct but the real device minor numbers change. If you ls -l /dev/mmcblk* for both cases you notice that sometimes /dev/mmcblk0 has the correct minor number 0 and sometimes it has 8, which according to kernel naming belongs to the mmcblk1 device. In fact, as far as the kernel sees it, it _is_ the mmcblk1 device node, just renamed to mmcblk0 in /dev so the name looks OK.

So far this half working fix has caused a lot of confusion and also unexpected behaviour like:
- output of shell utilities that use /proc files like df, mount is wrong
- wrong card is used in some 3rd party utilities
- one card is not mounted after system boot
- system immediately reboots when booted from one card (internal one on N810) and you try to remove the other one (removable one on N810)

It may be possible that this indeed does not cause any immediate problem when booting clean firmware, but (apart from the issues it causes for 3rd party tools) it may be a hidden source of hard to track bugs even for Nokia's own code that handles mmc devices (mounting usb storage, formatting card, some HAL stuff, ...), so it may be wise to fix it properly (= in linux kernel).
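The minor number check can be scripted. This is only a sketch: it assumes 8 minors per card (mmcblk0 starts at minor 0, mmcblk1 at minor 8), which matches the listings in this bug, and it assumes the "ls -l" column layout shown here. The helper names kernel_name and report_renamed are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes 8 minors per card, as in the listings in this bug
# (mmcblk0 = minor 0, mmcblk1 = minor 8). Helper names are hypothetical.

# Print the name the kernel itself uses for a given mmc minor number.
kernel_name() {
    disk=$(( $1 / 8 ))
    part=$(( $1 % 8 ))
    if [ "$part" -eq 0 ]; then
        echo "mmcblk$disk"
    else
        echo "mmcblk${disk}p$part"
    fi
}

# Read `ls -l /dev/mmcblk*` style lines on stdin and report every node
# whose file name differs from the kernel name implied by its minor.
report_renamed() {
    while read -r _perm _links _owner _group _major minor rest; do
        case "$minor" in
            ''|*[!0-9]*) continue ;;   # skip lines that are not device nodes
        esac
        node=${rest##* }               # last field is the path
        kname=$(kernel_name "$minor")
        if [ "${node##*/}" != "$kname" ]; then
            echo "$node is kernel device $kname"
        fi
    done
}
```

Possible usage: "ls -l /dev/mmcblk* | report_renamed". No output would mean that the names in /dev agree with the kernel's own numbering.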
@Andre & Quim: Here's a thought - why not have the developer who made that (possibly incorrect) comment post directly into this bug (see: bug 630) to continue the technical discussion in order to achieve a rapid resolution of one kind or another?
(In reply to comment #14)
> why not have the developer who made that
> (possibly incorrect) comment
Well, the comment "check devnodes in /dev" is correct if it was meant like "check file names of devnodes in /dev once system boots". They really have some workaround (similar to the one in comment #3) so the names themselves are consistent. This is not true for the initfs environment with static device nodes (no longer in Fremantle) or early on boot, but once the system boots and the whole /dev is handled dynamically you really could e.g. run "mount /dev/mmcblk0p1 /media/mmc2" and be sure you have mounted the correct (=internal) card.

But the inconsistency is still there on a deeper level. Anything more advanced that needs to go deeper and uses kernel naming or expects specific device minor numbers (search http://www.lanana.org/docs/device-list/devices-2.6+.txt for mmc) can be confused and is a potential source of bugs. It is cleaner to fix the kernel to assign our preferred device name directly than trying to fight it and hack the naming later in userspace.

BTW, the mmc stack sources in the linux kernel were restructured after 2.6.22 (?) so the patch will definitely not apply to the current Fremantle kernel. Would it help if I spent some time updating it for a recent kernel? Once my Beagleboard or Pandora finally arrives I can even test it ;-)
This is still discussed internally. "The more correct udev rule is a symlink. The current name change is a hack, agree. I do not see a reason to change the order on the kernel side."
(In reply to comment #16)
> "The more correct udev rule is a symlink. The current name change is a hack,
> agree.
Yes, this is fine too: let /dev/mmcblkX be what the kernel decides and make something like /dev/mmc_internal and /dev/mmc_external symlinks pointing to the right /dev/mmcblkX. This way is only inconvenient (when compared to setting a deterministic order in the kernel) but correct and transparent.
> I do not see a reason to change the order on the kernel side.
Well, it simplifies things a bit when device names are deterministic, but we can surely live with more complexity just like we live with it with USB devices. It is just that with a fixed number of mmc slots which are always present this dynamic nature is not needed and is a bit unexpected.

IMO this dynamic nature of mmc card device nodes is a design fault and is inconsistent with the behaviour of floppy, cdrom, tape or other drives. Such devices still have their device names assigned even if there is no medium inserted. An sd/mmc reader with no card inserted should work the same. Anyway, I agree this discussion belongs to the kernel list and I am fine with a symlink pointing to the correct /dev/mmcblkX.
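For illustration, the symlink idea could look roughly like this in shell. It is a sketch, not the udev rule discussed internally. It assumes the 2.6.21 slot_name location used by the boot menu workaround in this bug, and SYSFS_ROOT and DEV_DIR are hypothetical overrides that exist only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch of the symlink idea: leave /dev/mmcblkX as the kernel named it and
# add stable aliases on top. SYSFS_ROOT and DEV_DIR are hypothetical
# overrides for testing; the slot_name path is the 2.6.21 one used by the
# boot menu workaround and may differ on other kernels.

link_mmc_slots() {
    sysfs=${SYSFS_ROOT:-/sys}
    dev=${DEV_DIR:-/dev}
    for blk in "$sysfs"/block/mmcblk[0-9]*; do
        slot_file=$blk/device/../slot_name
        [ -f "$slot_file" ] || continue      # also skips an unmatched glob
        name=${blk##*/}
        case $(cat "$slot_file") in
            internal) ln -sf "$name" "$dev/mmc_internal" ;;
            external) ln -sf "$name" "$dev/mmc_external" ;;
        esac
    done
}
```

The links are relative (mmc_internal -> mmcblk1, for example), so /proc/partitions, df and /dev keep using the same kernel names and only the aliases carry the slot meaning.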
(In reply to comment #17)
> (In reply to comment #16)
> I am fine with symlink pointing to correct /dev/mmcblkX.
...which will basically be the solution for Fremantle. There will definitely be consistent naming, hence closing as FIXED.
Marking patches of interest to Diablo (Maemo4) community updates, please excuse the noise.
(In reply to comment #18)
> ...which will basically be the solution for Fremantle. There will definitely be
> consistent naming, hence closing as FIXED.
>
With RX-51_2009SE_1.2009.42-11_PR_MR0 this is still an issue. I still see the same hack in place.

Nokia-N900-42-11:~# cat /proc/partitions
major minor  #blocks  name
 179     8 31264768 mmcblk1
 179     9 28315648 mmcblk1p1
 179    10  2097152 mmcblk1p2
 179    11   786432 mmcblk1p3
 179     0  7974912 mmcblk0
 179     1  6139904 mmcblk0p1
 179     2   917504 mmcblk0p2
 179     3   913408 mmcblk0p3
Nokia-N900-42-11:~# ls -l /dev/mmcblk*
brw-rw---- 1 root floppy 179,  8 Jan  1  1970 /dev/mmcblk0
brw-rw---- 1 root floppy 179,  9 Jan  1  1970 /dev/mmcblk0p1
brw-rw---- 1 root floppy 179, 10 Jan  1  1970 /dev/mmcblk0p2
brw-rw---- 1 root floppy 179, 11 Jan  1  1970 /dev/mmcblk0p3
brw-rw---- 1 root floppy 179,  0 Dec 28 22:39 /dev/mmcblk1
brw-rw---- 1 root floppy 179,  1 Dec 28 22:39 /dev/mmcblk1p1
brw-rw---- 1 root floppy 179,  2 Dec 28 22:39 /dev/mmcblk1p2
brw-rw---- 1 root floppy 179,  3 Dec 28 22:39 /dev/mmcblk1p3
Nokia-N900-42-11:~# df
Filesystem           1k-blocks      Used Available Use% Mounted on
rootfs                  233344    177516     51548  77% /
ubi0:rootfs             233344    177516     51548  77% /
tmpfs                     1024       100       924  10% /tmp
tmpfs                      256        68       188  27% /var/run
none                     10240        88     10152   1% /dev
tmpfs                    65536         4     65532   0% /dev/shm
/dev/mmcblk0p2         2064208     95988   1863364   5% /home
/dev/mmcblk0p1        28312128   5366272  22945856  19% /home/user/MyDocs
/dev/mmcblk1p1         6138336   2946432   3191904  48% /media/mmc1
Nokia-N900-42-11:~#

kernel thinks internal eMMC is mmcblk1 (minor 8) and microsd card is mmcblk0.
Device names in /dev/ are renamed so eMMC has mmcblk0 name (but having minor 8, mmcblk0 should start at minor 0).
df output and /proc/partitions is again inconsistent.

Once I successfully clone current system to microsd card I will reflash and retest with 2.2009.51-1 (PR1.1) and reopen this bug if the issue still remains.
(In reply to comment #20)
> Once I successfully clone current system to microsd card I will reflash and
> retest with 2.2009.51-1 (PR1.1) and reopen this bug if the issue still remains.
Yes, it is still the same, kernel thinks eMMC is mmcblk1, device node names are renamed so that eMMC is mmcblk0, reopening.

dmesg
[ 6.594726] mmc0: new high speed SDHC card at address 0001
[ 6.612762] mmcblk0: mmc0:0001 00000 7.60 GiB
[ 6.613067] mmcblk0: p1 p2 p3
[ 6.805694] mmc1: new high speed MMC card at address 0001
[ 6.811004] mmcblk1: mmc1:0001 MMC32G 29.8 GiB
[ 6.811340] mmcblk1: p1 p2 p3

Nokia-N900-51-1:~# cat /proc/partitions
major minor  #blocks  name
 179     0  7974912 mmcblk0
 179     1  6139904 mmcblk0p1
 179     2   917504 mmcblk0p2
 179     3   913408 mmcblk0p3
 179     8 31264768 mmcblk1
 179     9 28315648 mmcblk1p1
 179    10  2097152 mmcblk1p2
 179    11   786432 mmcblk1p3
Nokia-N900-51-1:~# df
Filesystem           1k-blocks      Used Available Use% Mounted on
rootfs                  233344    152432     76632  67% /
ubi0:rootfs             233344    152432     76632  67% /
tmpfs                     1024        80       944   8% /tmp
tmpfs                      256        76       180  30% /var/run
none                     10240        88     10152   1% /dev
tmpfs                    65536         4     65532   0% /dev/shm
/dev/mmcblk0p2         2064208    124204   1835148   6% /home
/dev/mmcblk0p1        28312128   5366336  22945792  19% /home/user/MyDocs
/dev/mmcblk1p1         6138336   2946432   3191904  48% /media/mmc1
Nokia-N900-51-1:~# ls -l /dev/mmcblk*
brw-rw---- 1 root floppy 179,  8 Jan  1  1970 /dev/mmcblk0
brw-rw---- 1 root floppy 179,  9 Jan  1  1970 /dev/mmcblk0p1
brw-rw---- 1 root floppy 179, 10 Jan  1  1970 /dev/mmcblk0p2
brw-rw---- 1 root floppy 179, 11 Jan  1  1970 /dev/mmcblk0p3
brw-rw---- 1 root floppy 179,  0 Jan  1  1970 /dev/mmcblk1
brw-rw---- 1 root floppy 179,  1 Jan  1  1970 /dev/mmcblk1p1
brw-rw---- 1 root floppy 179,  2 Jan  1  1970 /dev/mmcblk1p2
brw-rw---- 1 root floppy 179,  3 Jan  1  1970 /dev/mmcblk1p3
Nokia-N900-51-1:~#
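The kernel's own host to block device assignment can be pulled out of such a dmesg log with a short filter. This is a sketch that assumes the "mmcblkN: mmcM:ADDR ..." line format shown in this bug; the function name mmc_host_map is hypothetical.

```shell
#!/bin/sh
# Sketch: read dmesg text on stdin and print "host blockdev" pairs, based on
# lines of the form "mmcblk0: mmc0:0001 00000 7.60 GiB" as seen in this bug.
# Other kernel versions may print these lines differently.
mmc_host_map() {
    sed -n 's/.*\(mmcblk[0-9][0-9]*\): \(mmc[0-9][0-9]*\):[0-9a-fA-F]* .*/\2 \1/p'
}
```

Possible usage: "dmesg | mmc_host_map". Each printed pair names a host controller (mmc0, mmc1) and the block device the kernel gave to the card found there, which can then be compared with the node names in /dev.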
Last internal comment was "In Fremantle, we have already consistent naming because the internal card cannot be removed. So, internal always gets /dev/mmcblk0." So this is not true?
(In reply to comment #22)
> Last internal comment was "In Fremantle, we have already consistent naming
> because the internal card cannot be removed. So, internal always gets
> /dev/mmcblk0."
> So this is not true?
I'm not sure. The fact is that the kernel assigns the numbers in order of recognition and that is how the upstream kernel works, I was told. So, if you want to fix this (without the current udev rule hack), please send a patch against the Linux kernel... This has been like this since Nokia 770, so no news flash here...
------- Comment #24 from Frantisek Dufka 2010-01-08 16:19 GMT+3 -------
(In reply to comment #22)
> Last internal comment was "In Fremantle, we have already consistent naming
> because the internal card cannot be removed. So, internal always gets
> /dev/mmcblk0."
> So this is not true?
>
No, this is not true. At least on my N900 when microSD is inserted at boot time it gets mmcblk0 and eMMC gets mmcblk1. Also the reasoning is wrong since N810 also has eMMC and the bug is reported for N810 too.

(In reply to comment #23)
> So, if you
> want to fix this (without the current udev rule hack), please send a patch
> against the Linux kernel...
Ok, we're back at comment #16 :-)

(In reply to comment #16)
> This is still discussed internally.
> "The more correct udev rule is a symlink. The current name change is a hack,
> agree. I do not see a reason to change the order on the kernel side."
>
Anyway, I want it fixed on my own device so I'll do the patch and attach it here just in case :-)
(In reply to comment #24)
> Anyway, I want it fixed on my own device so I'll do the patch and attach it
> here just in case :-)
It's welcome. :)
Created an attachment (id=2125) [details]
minimal patch for fremantle kernel (PR1.1 source used)
same patch as for Diablo, minimal but effective, slot name starting with 'e' gets mmcblk1 (slot names are 'internal' and 'external')
Created an attachment (id=2126) [details]
binary modules for PR1.1 kernel with the fix
MMC drivers in fremantle are currently compiled as modules. Minimal patch changes only mmc driver sources ==> no kernel flashing is needed, just copy all three modules to /lib/modules/current/. These modules were tested with unmodified PR1.1 kernel. When the fix is working kernel assigns mmcblk1 to external card even if it was detected before internal card

[ 3.131500] Freeing init memory: 144K
[ 4.720550] mmci-omap-hs mmci-omap-hs.0: Failed to get debounce clock
[ 4.829956] mmci-omap-hs mmci-omap-hs.1: Failed to get debounce clock
[ 4.985412] mmc0: host does not support reading read-only switch. assuming write-enable.
[ 4.985534] mmc0: new high speed SDHC card at address 0001
[ 4.986083] mmcblk1: mmc0:0001 00000 7.60 GiB
[ 4.986358] mmcblk1: p1 p2 p3
[ 5.235626] mmc1: new high speed MMC card at address 0001
[ 5.236083] mmcblk0: mmc1:0001 MMC32G 29.8 GiB
[ 5.236358] mmcblk0: p1 p2 p3
[ 17.096649] Registered led device: twl4030:vibrator
Created an attachment (id=2127) [details]
bigger patch for fremantle kernel which extends also platform data structures
Alternative patch. Platform data structures describing MMC slots are extended with a preferred mmcblkX index. This extends a couple of in-kernel structures in addition to the mmc modules, so it also needs a new matching kernel. IMO the minimal patch is more elegant but this one may feel less hackish (at the cost of more data shuffling between various structures for little or no gain in clarity). I haven't actually tested this on the device but the code compiles fine and looks straightforward :-)
Internal status of this is WONTFIX, hence reflecting it here. :-/