maemo.org Bugzilla – Bug 9662
DHCP with redundant Gateway does not configure any routes
Last modified: 2010-04-12 13:29:43 UTC
SOFTWARE VERSION:
3.2010.02-8.002

EXACT STEPS LEADING TO PROBLEM:
(Explain in detail what you do (e.g. tap on OK) and what you see (e.g. message Connection Failed appears))

1. Set up a DHCP server with multiple/redundant default gateways/routers for fault tolerance. In my case I used the "Windows 2003 R2" DHCP service and, for the relevant scope, defined two values in the "003 Router" list (the option that sets the default gateway for DHCP clients).
2. Either set your DHCP server to update DNS records when it serves a DHCP request, or add a reservation for your device's MAC address, so that you know which IP address the device gets from DHCP.
3. Turn Wi-Fi on, or reconnect to the network over Wi-Fi.
4. If you added a reservation in DHCP, use that IP address; otherwise look up the address assigned to the device in DNS or on the DHCP server. Ping that address (it should respond).
5. Try to browse the Internet or connect any application to the Internet (this will not work). If you have the "Personal IP Address" application installed, it will not show any IP address and will report the device as not connected to the network. Don't be confused by this: the device is actually connected to the local network, it just has no default route to the Internet.
6. SSH into the device and type "sudo ifconfig" to see the network status. You will see that the device has an IP address assigned: look for the "wlan0" section and "inet addr:". This is only a double check, since you already know the address if you are connected over SSH :-) but it is useful in case you are using the terminal on the device itself.
7. In the SSH session or terminal window, run "sudo /sbin/route" (or just "route" if you are already root) to list the routes available to the device. You will see only one entry, for the local network, and no default route. Below I show what the route command prints compared with what it should print for this to work. The metric values could be different; I just made these up.
EXPECTED OUTCOME:
=================
/ $ sudo /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.15    *               255.255.255.0   U     0      0        0 wlan0
default         192.168.2.1     0.0.0.0         UG    0      0        0 wlan0
default         192.168.2.2     0.0.0.0         UG    10     0        0 wlan0
/ $

ACTUAL OUTCOME:
===============
/ $ sudo /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.15    *               255.255.255.0   U     0      0        0 wlan0
/ $

REPRODUCIBILITY:
10/10

OTHER COMMENTS:
===============
When the DHCP server replies with multiple/redundant gateways, the device takes the IP address and communicates within the local network, but ignores the list of gateways (it was probably expecting a single IP address here rather than a list) and cannot reach the Internet/outside network. When I set up my DHCP server to serve only one gateway for this MAC address, the N900 connects to the Internet.

Our office uses multiple gateways for redundancy/fault tolerance. We have other Debian- and CentOS-based systems running here, as well as Ubuntu workstations, and none of them has a problem with multiple/redundant gateways. Only the N900 has this problem, which suggests that it is not inherited from Linux or Debian but was introduced in Maemo itself.
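The check in step 7 can also be scripted. The following is only a minimal sketch: has_default_route is a hypothetical helper name, and it assumes that awk is available on the device and that the first column of the route output reads "default" (or "0.0.0.0" when route is run with -n). The sample input is the ACTUAL OUTCOME table from this report.

```shell
#!/bin/sh
# Sketch: succeed if a routing table (as printed by /sbin/route, read
# from stdin) contains a default route. has_default_route is a
# hypothetical helper name, not part of Maemo.
has_default_route() {
    # The destination column reads "default", or "0.0.0.0" with "route -n".
    awk '$1 == "default" || $1 == "0.0.0.0" { found = 1 } END { exit !found }'
}

# Sample input copied from the ACTUAL OUTCOME in this report.
if has_default_route <<'EOF'
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.2.15 * 255.255.255.0 U 0 0 0 wlan0
EOF
then
    echo "default route present"
else
    echo "no default route"
fi
```

On a device, the same helper could be fed live data with "/sbin/route -n | has_default_route".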
Thanks for the report. A tcpdump capture and/or syslog output would be useful here, see http://wiki.maemo.org/Documentation/devtools/maemo5 for details and installation instructions. Note that you should probably uninstall sysklogd and remove /var/log/syslog* once it is no longer needed to avoid problems with it filling up the root filesystem. Could you also post the contents of /var/run/dhcp-params.conf while a WLAN connection to this network has been established?
Never mind, I can reproduce with ISC dhcpd and an "option routers" entry containing two IP addresses.

The problem is in /etc/maemo-dhcp.d/50_ipv4_network_setup, which handles $router as a single value, causing /sbin/route to fail with a syntax error.

> + /sbin/route add default gw XXX.XXX.XXX.1 XXX.XXX.XXX.2 dev wlan0
> BusyBox v1.10.2 (Debian 3:1.10.2.legal-1osso26+0m5) multi-call binary
>
> Usage: route [{add|del|delete}]

This is a regression from Diablo, which handles this fine, iterating over the tokens in $router and setting individual routes.
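For illustration, iterating over the tokens in a space-separated router list could look roughly like the sketch below. This is neither the Diablo code nor the fix attached to this bug. The function name add_default_routes, the ROUTE_CMD dry-run variable and the metric step of 10 (which mirrors the reporter's made-up values) are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: add one default route per gateway in a
# space-separated list. All names here are illustrative only.

# ROUTE_CMD defaults to a dry run that only prints the commands;
# set ROUTE_CMD=/sbin/route to actually apply them (as root).
ROUTE_CMD="${ROUTE_CMD:-echo /sbin/route}"

add_default_routes() {
    iface="$1"
    routers="$2"
    metric=0
    # $routers is unquoted on purpose: word splitting yields one
    # token per gateway address.
    for gw in $routers; do
        $ROUTE_CMD add default gw "$gw" metric "$metric" dev "$iface"
        # Assumption: later gateways get a higher (less preferred) metric.
        metric=$((metric + 10))
    done
}

# Dry run with the gateways from the EXPECTED OUTCOME in the report.
add_default_routes wlan0 "192.168.2.1 192.168.2.2"
```

The point of the loop is that each /sbin/route invocation receives exactly one gateway address, so the syntax error shown above cannot occur regardless of how many routers the DHCP server offers.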
(In reply to comment #2)
> Never mind, I can reproduce with ISC dhcpd and an "option routers" entry
> containing two IP addresses.
> The problem is in /etc/maemo-dhcp.d/50_ipv4_network_setup which handles $router
> as a single value causing /sbin/route to fail with a syntax error.
> > + /sbin/route add default gw XXX.XXX.XXX.1 XXX.XXX.XXX.2 dev wlan0
> > BusyBox v1.10.2 (Debian 3:1.10.2.legal-1osso26+0m5) multi-call binary
> >
> > Usage: route [{add|del|delete}]
> This is a regression from Diablo which handles this fine, iterating over the
> tokens in $router and setting individual routes.

I suppose syslog captures are no longer required, since you were able to reproduce the problem. However, I have still checked the contents of /var/run/dhcp-params.conf and am pasting them here.

With only one IP address in the Routers list, the file reads like this:

/ $ cat /var/run/dhcp-params.conf
wlan0 10.1.1.111 10.1.1.254 255.255.0.0 MYDOMAIN.DOM 10.1.1.11 10.1.1.12

With two IP addresses in the Routers list, it reads like this:

/ $ cat /var/run/dhcp-params.conf
wlan0 10.1.1.111 10.1.1.254 10.1.1.253 255.255.0.0 MYDOMAIN.DOM 10.1.1.11 10.1.1.12

Here 10.1.1.254 is our primary route/gateway and 10.1.1.253 is the secondary one.

Another interesting thing that is happening to me is that restarting networking does nothing, or at least it does not reset the wlan0 interface and its parameters or pick up changes made on the DHCP server. Every time I make a change in DHCP, I have to restart the device itself. Please let me know if I am doing something wrong here, or if there is some other way to reset the wlan0 interface instead of rebooting my N900.

The following is the command I used to restart networking and the output it shows me. I don't know why it fails every time.

/ $ sudo /etc/init.d/networking restart
/etc/network/options is deprecated.
Setting up IP spoofing protection...done (rp_filter).
Reconfiguring network interfaces...sh: missing ]
ifdown: interface eth0 not configured
ifdown: interface usb0 not configured
ifup: don't seem to have all the variables for eth0/inet
ifconfig: SIOCGIFFLAGS: No such device
ifconfig: SIOCSIFADDR: No such device
route: SIOCADDRT: No such device
failed.
(In reply to comment #3)
> I supposed syslog captures are not required anymore and you were able to
> duplicate the problem.

That's correct, sorry if I wasn't clear enough.

> Another interesting thing that is happening me is that if I restart networking
> it does nothing or at-least it does not reset wlan0 interface and its
> parameters or reflect changes in DHCP server.

You shouldn't need to do that; just disconnecting/reconnecting from the GUI is enough. If you restart networking from the command line you may also have to restart wlancond (wlan0 isn't managed via /etc/network/interfaces anyway).
Yeah, my mistake about that; I should have restarted the wireless network from the GUI anyway.

So how does it work? If a bug is identified and acknowledged, how does it actually get fixed, or get queued to be fixed in the next firmware?
(In reply to comment #5)
> So how does it work? if a bug is identified and acknowledged, how does it
> actually gets fixed or goes on queue to get fixed for the next firmware?

The way it usually works is that once a bug is confirmed, Andre (the maemo.org bugmaster) will clone it to Nokia's internal bug tracker (http://wiki.maemo.org/Bugs:Cloning) and then sync its progress here.
Created an attachment (id=2520)
patch

Here's my fix for this bug, in case this is useful.
As a workaround for myself, since I have access to the DHCP server, I have added a custom reservation for my device there. The workaround works, but that might not be an option for someone else in a similar environment.

Malcolm, I am sorry, I am new to this stuff; can you please also describe how to apply/use that patch? I don't have a file /root/50_ipv4_network_setup.orig on my N900.

I still wish that the bug is officially taken care of. How does the process work? Who confirms a bug as a bug, and how does a bug report get the attention of whoever confirms it? Is the confirmation from "Lucas Maneos" sufficient? How does Andre know that the bug is confirmed and that it is time to assign it somewhere?
Ehsan, if you have a local workaround, there's probably no need to bother with my patch. But in case you're interested in testing it, all you should need to do is run something like:

patch /etc/maemo-dhcp.d/50_ipv4_network_setup < 50_ipv4_network_setup.diff

(as root, assuming you have 'patch' installed and have downloaded my patch to 50_ipv4_network_setup.diff in the current directory). You don't need /root/50_ipv4_network_setup.orig; that's just an artefact of how I made the patch and will be ignored by the above command.

Note that I provide no guarantee that this will not destroy your N900; I merely state that it works for me.

I can't comment about Nokia's bug-resolution procedures as I'm new here too! Hopefully someone in the know will notice my patch, review it and commit it upstream (hint hint). :-)
(In reply to comment #6)
> The way it usually works is that once a bug is confirmed, Andre (the maemo.org
> bugmaster) will clone it to Nokia's internal bug tracker

If Andre is not on holidays... :-/
Sorry for my late response here. Done.
This has been fixed in internal build version
10.2010.04-4
(Note: 2009/2010 is the year, and the number after is the week.)

The next public update released with a year/week later than this internal build version will include the fix. (This is not always the very next public update.)
Please verify that this new version fixes the bug by marking this bug report as VERIFIED after the public update has been released and if you have some time.

To answer popular followup questions:
 * Nokia does not announce release dates of public updates in advance.
 * There is currently no access to these internal, non-public build versions.
   A Brainstorm proposal to change this exists at
   http://maemo.org/community/brainstorm/view/undelayed_bugfix_releases_for_nokia_open_source_packages-002/
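The year/week rule above can be checked mechanically. The sketch below only illustrates that rule: year_week and released_after are hypothetical helper names, and it assumes the year and the zero-padded week are always the second and third fields of the version string. The other fields are ignored because their meaning is not explained here.

```shell
#!/bin/sh
# Sketch: compare the year/week part of two build versions.
# year_week and released_after are hypothetical helper names.
year_week() {
    # "10.2010.04-4" -> "201004" (fields 2 and 3 when split on "." and "-")
    echo "$1" | awk -F'[.-]' '{ print $2 $3 }'
}

released_after() {
    # True if version $1 has a strictly later year/week than version $2.
    [ "$(year_week "$1")" -gt "$(year_week "$2")" ]
}

fixed_in="10.2010.04-4"       # internal build named in this comment
installed="3.2010.02-8.002"   # version from the original report

if released_after "$installed" "$fixed_in"; then
    echo "$installed is later than $fixed_in"
else
    echo "$installed is not later than $fixed_in"
fi
```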
(In reply to comment #11)
> This has been fixed in internal build version
> 10.2010.04-4
> (Note: 2009/2010 is the year, and the number after is the week.)

Thank you very much, Andre and everyone else who contributed.

> Please verify that this new version fixes the bug by marking this bug report as
> VERIFIED after the public update has been released and if you have some time.

I will try to keep track of it and mark it as verified once we get that release. I do not expect it to be in this overdue release, but I hope it will be in the next one.
(In reply to comment #12)
> I do not expect it to be in this over-due expected release

Note the target milestone field, which indicates it will be :-)
(In reply to comment #13)
> (In reply to comment #12)
> > I do not expect it to be in this over-due expected release
>
> Note the target milestone field, which indicates it will be :-)
>

Wow, that would be great. But does that mean that we are not going to see "5.0/PR 1.2" any time soon?

By the way, I am not sure whether Andre updated this field: I think I set it this way myself when creating this bug, as I hoped to get it resolved sooner (well, it was fast compared to the other bug I reported).
(In reply to comment #14)
> By the way I am not sure if Andre updated this field

It was updated by Andre today, along with the internal bug number in the alias field (int-163466 -> int-153291).

> But does that mean that we are not going to see
> "5.0/PR 1.2" sometime sooner?

I can only speculate (and see comment 11), but the alias change indicates that this was a known bug internally before it was reported here, so it may not have much impact on release dates.
(In reply to comment #15)
> I can only speculate (and see comment 11) but the alias change indicates that
> this was a known bug internally before it was reported here so may not have
> much impact on release dates.

Exactly.
(In reply to comment #9)
> But in case you're interested in testing it, all you should need to
> do is run something like:
>
> patch /etc/maemo-dhcp.d/50_ipv4_network_setup < 50_ipv4_network_setup.diff
>
> Note that I provide no guarantee that this will not destroy your N900;
> I merely state that it works for me.

Note that if you change a configuration file belonging to a system package, the SSU to PR1.2 might fail for that package. I have not really tested what happens in that case, but to avoid the mess described in http://talk.maemo.org/showthread.php?t=40567&page=13 (a VPN package changing a system config file), restore the original 50_ipv4_network_setup file before doing the SSU (when it is released).
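One way to make that restore step easy is to keep a pristine copy before patching. The following is only a sketch: backup_original and restore_original are hypothetical helper names. The copy is stored in a separate directory (for example /root) rather than next to the script, as a precaution in case every file in /etc/maemo-dhcp.d gets executed. Whether that directory is actually processed that way is an assumption.

```shell
#!/bin/sh
# Sketch: keep a pristine copy of a file before patching it, and put it
# back later. backup_original and restore_original are hypothetical
# helper names. $1 = file to protect, $2 = directory holding the copy.
backup_original() {
    dest="$2/$(basename "$1").orig"
    # Never overwrite an existing backup: it may be the only pristine copy.
    [ -e "$dest" ] || cp -p "$1" "$dest"
}

restore_original() {
    src="$2/$(basename "$1").orig"
    # Returns non-zero if no backup was ever taken.
    [ -e "$src" ] && cp -p "$src" "$1"
}

# Usage on the device (as root), around the patch command quoted above:
#   f=/etc/maemo-dhcp.d/50_ipv4_network_setup
#   backup_original "$f" /root
#   patch "$f" < 50_ipv4_network_setup.diff
#   ... later, before the SSU ...
#   restore_original "$f" /root
```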