Bug 9662 - (int-153291) DHCP with redundant Gateway does not configure any routes
Status: RESOLVED FIXED
Product: Connectivity
Component: Networking
Version: 5.0/(3.2010.02-8)
Platform: N900 Maemo
Importance: Unspecified major
Target Milestone: 5.0/(10.2010.19-1)
Assigned To: unassigned
QA Contact: networking-bugs
Keywords: patch

Reported: 2010-03-22 18:56 UTC by Ehsan
Modified: 2010-04-12 13:29 UTC (History)
Attachments
patch (1.03 KB, patch)
2010-03-24 14:15 UTC, Malcolm Scott


Description Ehsan (reporter) 2010-03-22 18:56:17 UTC
SOFTWARE VERSION:  3.2010.02-8.002

EXACT STEPS LEADING TO PROBLEM: 

1. Set up a DHCP server with multiple/redundant default gateways (routers) for
fault tolerance. In my case I used the "Windows 2003 R2" DHCP service; for the
relevant scope I defined two values in the "003 Router" list (the option that
sets the default gateway for DHCP clients).

2. Set your DHCP server to update DNS records when it serves a DHCP request, or
add a reservation for your device's MAC address, so that you know which IP your
device gets from DHCP.

3. Turn your Wi-Fi ON or reconnect to the network using Wi-Fi.

4. If you added a reservation in DHCP, use that IP; otherwise get the IP
assigned to the device from the DNS or DHCP server. Ping that IP (it should be
pingable).

5. Try to browse the Internet or connect any application to the Internet (this
will not work). If you have installed the application "Personal IP Address",
it won't show any IP and will report the device as not connected to the
network. Don't be confused by this: the device is actually connected to the
local network, it just does not know the default route to the Internet.

6. SSH into the device and type "sudo ifconfig" to see the network status. You
will see that your device has an IP assigned to it: look for the "wlan0"
section and "inet addr:". Over SSH this is only a double check, since you
already know the IP :-) but it is useful in case you are using the terminal on
the device itself.

7. In the SSH session or terminal window, run "sudo /sbin/route" (or just
"route" if you are already root) to list the routes available to your device.
You will see only one entry, for the local network, and no default route.

Below is what the route command shows, compared with what it should show in
order for this to work. The metric values are arbitrary; I made them up for
illustration.

EXPECTED OUTCOME:
=================

/ $ sudo /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.15    *               255.255.255.0   U     0      0        0 wlan0
default         192.168.2.1     0.0.0.0         UG    0      0        0 wlan0
default         192.168.2.2     0.0.0.0         UG    10     0        0 wlan0
/ $


ACTUAL OUTCOME:
===============

/ $ sudo /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.15    *               255.255.255.0   U     0      0        0 wlan0
/ $
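
The difference between the two tables can also be checked from a script. The
following is a minimal sketch, assuming the route output format shown above;
the helper name has_default_route is hypothetical and not part of Maemo. It
reads route output on standard input and succeeds only if a default entry is
present.

```shell
# Hypothetical helper: succeed if the routing table on stdin lists a default
# route. Accepts both the symbolic form ("default") and the numeric form
# ("0.0.0.0") that route prints in the Destination column.
has_default_route() {
    awk '$1 == "default" || $1 == "0.0.0.0" { found = 1 }
         END { exit !found }'
}
```

For example, `/sbin/route | has_default_route || echo "no default route"`
would print the message for the ACTUAL OUTCOME table and stay silent for the
EXPECTED OUTCOME table.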

REPRODUCIBILITY: 10/10



OTHER COMMENTS:
===============

When the DHCP server replies with multiple/redundant gateways, the device takes
the IP and communicates within the local network but ignores the list of
gateways (it was probably expecting a single IP here instead of a list) and
cannot reach the Internet/outside network.

When I set up my DHCP server to serve only one gateway for this MAC, the N900
connects to the Internet.

Our office uses multiple gateways for redundancy/fault tolerance. We have other
Debian and CentOS based systems running here, as well as Ubuntu workstations,
and none of them has a problem with multiple/redundant gateways. Only the N900
has this problem, which suggests that it is not inherited from Linux or Debian
but was introduced in Maemo itself.


User-Agent:       Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64;
Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729;
Media Center PC 6.0; OfficeLiveConnector.1.4; OfficeLivePatch.1.3)
Comment 1 Lucas Maneos 2010-03-23 15:07:52 UTC
Thanks for the report.

A tcpdump capture and/or syslog output would be useful here, see
http://wiki.maemo.org/Documentation/devtools/maemo5 for details and
installation instructions.  Note that you should probably uninstall sysklogd
and remove /var/log/syslog* once it is no longer needed to avoid problems with
it filling up the root filesystem.

Could you also post the contents of /var/run/dhcp-params.conf while a WLAN
connection to this network has been established?
Comment 2 Lucas Maneos 2010-03-23 15:40:52 UTC
Never mind, I can reproduce with ISC dhcpd and an "option routers" entry
containing two IP addresses.

The problem is in /etc/maemo-dhcp.d/50_ipv4_network_setup which handles $router
as a single value causing /sbin/route to fail with a syntax error.

> + /sbin/route add default gw XXX.XXX.XXX.1 XXX.XXX.XXX.2 dev wlan0
> BusyBox v1.10.2 (Debian 3:1.10.2.legal-1osso26+0m5) multi-call binary
> 
> Usage: route [{add|del|delete}]

This is a regression from Diablo which handles this fine, iterating over the
tokens in $router and setting individual routes.
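
The per-token iteration described above can be sketched as follows. This is
only an illustration of the idea, not the contents of 50_ipv4_network_setup,
of the Diablo script, or of any patch; the function name default_route_cmds is
hypothetical. It prints one route command per gateway instead of running
anything, so the result can be inspected first.

```shell
# Hypothetical helper: print one "route add default" command for each gateway
# in a space-separated list, rather than passing the whole list to one call.
default_route_cmds() {
    iface=$1
    routers=$2
    # $routers is left unquoted on purpose so the shell splits it on whitespace.
    for gw in $routers; do
        echo "/sbin/route add default gw $gw dev $iface"
    done
}
```

Called as `default_route_cmds wlan0 "$router"`, this yields two separate
commands for a two-address list, each of which has the single-gateway form
that /sbin/route accepts.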
Comment 3 Ehsan (reporter) 2010-03-23 16:57:17 UTC
(In reply to comment #2)
> Never mind, I can reproduce with ISC dhcpd and an "option routers" entry
> containing two IP addresses.
> The problem is in /etc/maemo-dhcp.d/50_ipv4_network_setup which handles $router
> as a single value causing /sbin/route to fail with a syntax error.
> > + /sbin/route add default gw XXX.XXX.XXX.1 XXX.XXX.XXX.2 dev wlan0
> > BusyBox v1.10.2 (Debian 3:1.10.2.legal-1osso26+0m5) multi-call binary
> > 
> > Usage: route [{add|del|delete}]
> This is a regression from Diablo which handles this fine, iterating over the
> tokens in $router and setting individual routes.

I suppose syslog captures are no longer required, since you were able to
reproduce the problem. Still, I have checked the contents of
/var/run/dhcp-params.conf and am pasting them here.

The file reads like this with only one IP in the Routers list:
/ $ cat /var/run/dhcp-params.conf
wlan0 10.1.1.111 10.1.1.254 255.255.0.0 MYDOMAIN.DOM 10.1.1.11 10.1.1.12

When we have two IPs in the Routers list, it reads like this:
/ $ cat /var/run/dhcp-params.conf
wlan0 10.1.1.111 10.1.1.254 10.1.1.253 255.255.0.0 MYDOMAIN.DOM 10.1.1.11 10.1.1.12

Here 10.1.1.254 is our primary route/gateway and 10.1.1.253 is the secondary
one.
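
The two samples show why reading this file by fixed position breaks: with a
second router, the netmask and everything after it shift one field to the
right. A rough sketch of extracting the routers regardless of their count
follows. The field layout (interface, address, routers, netmask, domain, DNS
servers) is inferred only from the two samples above, the rule that the
netmask is the first later field starting with "255." is an assumption, and
the function name routers_from_params is hypothetical.

```shell
# Hypothetical helper: print the router addresses from one line of
# /var/run/dhcp-params.conf.
# Assumed layout: iface address router... netmask domain dns...
# Assumption: the netmask is the first field after the address that begins
# with "255."; everything between the address and that field is a router.
routers_from_params() {
    echo "$1" | awk '{
        out = ""
        for (i = 3; i <= NF; i++) {
            if ($i ~ /^255\./) break
            out = (out == "" ? $i : out " " $i)
        }
        print out
    }'
}
```

With the second sample line as its argument, this prints both 10.1.1.254 and
10.1.1.253; with the first, only 10.1.1.254.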


Another interesting thing that is happening to me: if I restart networking, it
does nothing, or at least it does not reset the wlan0 interface and its
parameters or reflect changes made on the DHCP server. Every time I make a
change in DHCP, I have to restart the device itself. Please let me know if I am
doing something wrong here, or if there is some other way to reset the wlan0
interface instead of rebooting my N900. Below is the command I used to restart
networking and the output it gives me. I don't know why it fails every time.

/ $ sudo /etc/init.d/networking restart
/etc/network/options is deprecated.
Setting up IP spoofing protection...done (rp_filter).
Reconfiguring network interfaces...sh: missing ]
ifdown: interface eth0 not configured
ifdown: interface usb0 not configured
ifup: don't seem to have all the variables for eth0/inet
ifconfig: SIOCGIFFLAGS: No such device
ifconfig: SIOCSIFADDR: No such device
route: SIOCADDRT: No such device
failed.
Comment 4 Lucas Maneos 2010-03-23 17:03:49 UTC
(In reply to comment #3)
> I suppose syslog captures are no longer required, since you were able to
> reproduce the problem.

That's correct, sorry if I wasn't clear enough.

> Another interesting thing that is happening to me is that if I restart
> networking it does nothing, or at least it does not reset the wlan0 interface
> and its parameters or reflect changes in the DHCP server.

You shouldn't need to do that, just disconnecting/reconnecting from the GUI is
enough.  If you restart networking from the command line you may also have to
restart wlancond (wlan0 isn't managed via /etc/network/interfaces anyway).
Comment 5 Ehsan (reporter) 2010-03-23 17:13:14 UTC
Yeah, my mistake about that; I should have restarted the wireless network from
the GUI anyway.

So how does it work? If a bug is identified and acknowledged, how does it
actually get fixed, or get queued to be fixed in the next firmware?
Comment 6 Lucas Maneos 2010-03-23 17:51:00 UTC
(In reply to comment #5)
> So how does it work? If a bug is identified and acknowledged, how does it
> actually get fixed, or get queued to be fixed in the next firmware?

The way it usually works is that once a bug is confirmed, Andre (the maemo.org
bugmaster) will clone it to Nokia's internal bug tracker
(http://wiki.maemo.org/Bugs:Cloning) and then sync its progress here.
Comment 7 Malcolm Scott 2010-03-24 14:15:05 UTC
Created an attachment (id=2520) [details]
patch

Here's my fix for this bug, in case this is useful.
Comment 8 Ehsan (reporter) 2010-03-24 17:59:38 UTC
Well, as a workaround for myself, since I have access to the DHCP server, I
have added a custom reservation for my device there. The workaround works, but
it might not be an option for someone else in a similar environment.

Malcolm, I am sorry, I am new to this stuff; can you please also describe how
to apply/use that patch? I don't have a file /root/50_ipv4_network_setup.orig
on my N900.

I still wish for the bug to be officially taken care of.
How does the process work? Who confirms a bug as a bug, and how does a bug
report get the attention of whoever confirms it? Is the confirmation from
"Lucas Maneos" sufficient?
How does Andre know that the bug is confirmed and that it is time to assign it
somewhere?
Comment 9 Malcolm Scott 2010-03-24 20:08:39 UTC
Ehsan, if you have a local workaround, there's probably no need to bother with
my patch.  But in case you're interested in testing it, all you should need to
do is run something like:

  patch /etc/maemo-dhcp.d/50_ipv4_network_setup < 50_ipv4_network_setup.diff

(as root, assuming you have 'patch' installed and have downloaded my patch to
50_ipv4_network_setup.diff in the current directory).  You don't need
/root/50_ipv4_network_setup.orig; that's just an artefact of how I made the
patch and will be ignored by the above command.

Note that I provide no guarantee that this will not destroy your N900; I merely
state that it works for me.

I can't comment about Nokia's bug-resolution procedures as I'm new here too! 
Hopefully someone in the know will notice my patch, review it and commit it
upstream (hint hint). :-)
Comment 10 Andre Klapper maemo.org 2010-04-07 20:44:54 UTC
(In reply to comment #6)
> The way it usually works is that once a bug is confirmed, Andre (the maemo.org
> bugmaster) will clone it to Nokia's internal bug tracker

If Andre is not on holidays... :-/ Sorry for my late response here. Done.
Comment 11 Andre Klapper maemo.org 2010-04-09 15:33:05 UTC
This has been fixed in internal build version
10.2010.04-4
(Note: 2009/2010 is the year, and the number after is the week.)

The next public update released with the year/week later than this internal
build version will include the fix. (This is not necessarily the very next
public update.)
Please verify that this new version fixes the bug by marking this bug report as
VERIFIED after the public update has been released and if you have some time.
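
As a rough illustration of the year/week comparison described above: the
sketch below assumes build versions always have the shape
prefix.year.week-build, as in 10.2010.04-4, and the function names are
hypothetical. It only compares year and week, as the note says, and ignores
the prefix and the build number.

```shell
# Hypothetical helper: turn a build version such as 10.2010.04-4 into a
# sortable YYYYWW number (201004 for that example).
build_year_week() {
    echo "$1" | awk -F'[.-]' '{ printf "%04d%02d\n", $2, $3 }'
}

# Hypothetical helper: succeed if public version $1 has a later year/week
# than the internal build version $2 in which the fix landed.
released_after() {
    [ "$(build_year_week "$1")" -gt "$(build_year_week "$2")" ]
}
```

For instance, `released_after 10.2010.19-1 10.2010.04-4` succeeds, because
week 19 of 2010 is later than week 04 of 2010.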


To answer popular followup questions:
 * Nokia does not announce release dates of public updates in advance.
 * There is currently no access to these internal, non-public build versions.
   A Brainstorm proposal to change this exists at
http://maemo.org/community/brainstorm/view/undelayed_bugfix_releases_for_nokia_open_source_packages-002/
Comment 12 Ehsan (reporter) 2010-04-09 17:37:01 UTC
(In reply to comment #11)
> This has been fixed in internal build version
> 10.2010.04-4
> (Note: 2009/2010 is the year, and the number after is the week.)

Thank you very much Andre and everyone else who contributed.

> Please verify that this new version fixes the bug by marking this bug report as
> VERIFIED after the public update has been released and if you have some time.

I will try to keep track of it and mark it as verified once we get that
release. I do not expect it to be in this over-due expected release but I hope
it will be there in the next one.
Comment 13 Lucas Maneos 2010-04-09 17:39:39 UTC
(In reply to comment #12)
> I do not expect it to be in this over-due expected release

Note the target milestone field, which indicates it will be :-)
Comment 14 Ehsan (reporter) 2010-04-09 17:47:34 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > I do not expect it to be in this over-due expected release
> 
> Note the target milestone field, which indicates it will be :-)
> 

Wow, that would be great. But does that mean that we are not going to see
"5.0/PR 1.2" any sooner? By the way, I am not sure if Andre updated this
field, because I think this is how I set it myself when creating this bug, as
I hoped to get it resolved sooner (well, it was fast compared to the other bug
I reported).
Comment 15 Lucas Maneos 2010-04-09 17:53:52 UTC
(In reply to comment #14)
> By the way I am not sure if Andre updated this field

It was updated by Andre today, along with the internal bug number in the alias
field (int-163466 -> int-153291).

> But does that mean that we are not going to see
> "5.0/PR 1.2" any sooner?

I can only speculate (and see comment 11) but the alias change indicates that
this was a known bug internally before it was reported here so may not have
much impact on release dates.
Comment 16 Andre Klapper maemo.org 2010-04-09 18:06:24 UTC
(In reply to comment #15)
> I can only speculate (and see comment 11) but the alias change indicates that
> this was a known bug internally before it was reported here so may not have
> much impact on release dates.

Exactly.
Comment 17 Jukka Rissanen nokia 2010-04-12 13:29:43 UTC
(In reply to comment #9)
> But in case you're interested in testing it, all you should need to
> do is run something like:
> 
>   patch /etc/maemo-dhcp.d/50_ipv4_network_setup < 50_ipv4_network_setup.diff
> 
> Note that I provide no guarantee that this will not destroy your N900;
> I merely state that it works for me.

Note that if you change the configuration file of the system package, the SSU
to PR1.2 might fail for that package. I have not really tested what happens in
that case, but in order to avoid the mess described in
http://talk.maemo.org/showthread.php?t=40567&page=13 (vpn package changing
system config file), restore the original 50_ipv4_network_setup file before
doing the SSU (when it is released).
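
One way to make that restore step easy is to keep a copy of the file before
patching it. The following is a minimal sketch with hypothetical function
names and an arbitrary ".orig" suffix. The copy is stored in a separate
directory rather than next to the script, on the assumption that extra files
in /etc/maemo-dhcp.d could be picked up as hook scripts.

```shell
# Hypothetical helper: copy a config file into a backup directory before it
# is modified. $1 = file to protect, $2 = existing backup directory.
backup_config() {
    cp -p "$1" "$2/$(basename "$1").orig"
}

# Hypothetical helper: put the saved copy back in place.
# $1 = file to restore, $2 = backup directory used by backup_config.
restore_config() {
    cp -p "$2/$(basename "$1").orig" "$1"
}
```

Run as root, `backup_config /etc/maemo-dhcp.d/50_ipv4_network_setup /root`
before applying the patch and
`restore_config /etc/maemo-dhcp.d/50_ipv4_network_setup /root` before the SSU
would return the file to its shipped contents.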