Bug 8133 (int-154163)

Summary: dnsmasq segfaults when connecting to a certain WLAN AP
Product: [Maemo Official Platform] Connectivity
Reporter: Philipp Zabel <philipp.zabel>
Component: Networking
Assignee: unassigned <nobody>
Status: RESOLVED FIXED
QA Contact: networking-bugs
Severity: major
Priority: Unspecified
CC: andre_klapper, darryl-mailinglists, jukka.rissanen, maemo, mikleavakov, philipp.zabel
Version: 5.0/(3.2010.02-8)
Keywords: crash
Target Milestone: 5.0/(10.2010.19-1)
Hardware: All
OS: Maemo
Attachments: output of tcpdump -i wlan0 port 53 -w dnsmasq-crash.tcpdump
output of tcpdump -i wlan0 port 53 -w dnsmasq-crash-2.tcpdump
contents of the wlan0 resolv.conf
dnsmasq_2.45-1+lenny1+maemo1_all.deb
dnsmasq-base_2.45-1+lenny1+maemo1_armel.deb
dnsmasq_2.45-1+lenny1+maemo1.diff.gz

Description Philipp Zabel (reporter) 2010-01-17 00:12:15 UTC
SOFTWARE VERSION:
2.2009.51-1

EXACT STEPS LEADING TO PROBLEM: 
1. Set up a new connection to the WLAN AP (setting SSID and WPA-PSK)
2. Connect to the WLAN AP, either by getting into range when automatic
   WLAN connecting is enabled, or by selecting it from the connection
   dialog when in range.

EXPECTED OUTCOME:
dnsmasq forwards DNS queries to the name server, which is correctly entered in
the wlan0 resolv.conf.

ACTUAL OUTCOME:
dnsmasq segfaults in forward_query (called from reply_query) at line 219 of
forward.c

REPRODUCIBILITY:
So far always. The WLAN AP in question is not mine, so I could only try several
times on three occasions.

EXTRA SOFTWARE INSTALLED:
Conboy (unstable)
Extra Decoders Support
Harmony
Last.fm scrobbler
Ogg Support
OpenSSH Client
OpenSSH Server
pymaemo-optify
rootsh
vagalume
XChat

OTHER COMMENTS:

Here is a partial syslog of the connection attempt:

Jan 16 13:55:02 Nokia-N900-51-1 kernel: [13105.937225] wlan0: associated
Jan 16 13:55:02 Nokia-N900-51-1 icd2 0.87+fremantle4+0m5[1165]: connecting iap
0x39a10 in state ICD_IAP_STATE_LINK_POST_UP: interface is 'wlan0'
Jan 16 13:55:02 Nokia-N900-51-1 icd2 0.87+fremantle4+0m5[1165]: connecting iap
0x39a10 in state ICD_IAP_STATE_IP_UP: interface is 'wlan0'
Jan 16 13:55:03 Nokia-N900-51-1 udhcpc[1631]: udhcpc (v0.9.9-pre) started
Jan 16 13:55:03 Nokia-N900-51-1 udhcpc[1631]: Sending discover...
Jan 16 13:55:07 Nokia-N900-51-1 udhcpc[1631]: Sending discover...
Jan 16 13:55:08 Nokia-N900-51-1 udhcpc[1631]: Sending select for
192.168.2.101...
Jan 16 13:55:08 Nokia-N900-51-1 udhcpc[1631]: Lease of 192.168.2.101 obtained,
lease time 172800
Jan 16 13:55:08 Nokia-N900-51-1 dnsmasq[1132]: failed to access
/var/run/resolv.conf.wlan0: No such file or directory
Jan 16 13:55:08 Nokia-N900-51-1 dnsmasq[1132]: read /etc/hosts - 1 addresses
Jan 16 13:55:08 Nokia-N900-51-1 icd2 0.87+fremantle4+0m5[1165]: connecting iap
0x39a10 in state ICD_IAP_STATE_IP_UP: interface is 'wlan0'
Jan 16 13:55:08 Nokia-N900-51-1 icd2 0.87+fremantle4+0m5[1165]: srv type ''
unknown
Jan 16 13:55:08 Nokia-N900-51-1 [1074]: Unknown state 15 received from ICd2
Jan 16 13:55:09 Nokia-N900-51-1 mce[741]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0x30020, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:09 Nokia-N900-51-1 location-proxy[1117]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0x27c18, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:09 Nokia-N900-51-1 mission-control[1032]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0x3e410, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:09 Nokia-N900-51-1 alarmd[967]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0x33018, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:09 Nokia-N900-51-1 [1411]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0xf9980, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:09 Nokia-N900-51-1 browserd[1457]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0x42d80, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:09 Nokia-N900-51-1 browserd[1457]: GLIB DEBUG default -
connection_cb(0x42d80, 6e668de8-9891-4d0b-ae47-d30d8537eb2e, WLAN_INFRA, 0, 0)
count 0
Jan 16 13:55:09 Nokia-N900-51-1 browserd[1457]: GLIB DEBUG default -
connection_cb(0x42d80, 6e668de8-9891-4d0b-ae47-d30d8537eb2e, WLAN_INFRA, 0, 0)
connected
Jan 16 13:55:09 Nokia-N900-51-1 [1074]: GLIB DEBUG ConIc -
con_ic_connection_send_event(0xa49e0, 6e668de8-9891-4d0b-ae47-d30d8537eb2e,
WLAN_INFRA, 0)
Jan 16 13:55:14 Nokia-N900-51-1 dnsmasq[1132]: reading
/var/run/resolv.conf.wlan0
Jan 16 13:55:14 Nokia-N900-51-1 dnsmasq[1132]: using nameserver 192.168.2.1#53
Jan 16 13:55:15 Nokia-N900-51-1 init: dnsmasq main process (1132) killed by
SEGV signal
Jan 16 13:55:15 Nokia-N900-51-1 init: dnsmasq main process ended, respawning
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: started, version 2.35 cachesize
150
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: compile time options: IPv6
GNU-getopt no-RTC no-ISC-leasefile DBus no-I18N 
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: failed to access
/var/run/resolv.conf.gprs: No such file or directory
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: failed to access
/var/run/resolv.conf.lo: No such file or directory
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: failed to access
/var/run/resolv.conf.ppp0: No such file or directory
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: failed to access
/var/run/resolv.conf: No such file or directory
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: reading
/var/run/resolv.conf.wlan0
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: using nameserver 192.168.2.1#53
Jan 16 13:55:15 Nokia-N900-51-1 dnsmasq[1699]: read /etc/hosts - 1 addresses
Jan 16 13:55:20 Nokia-N900-51-1 init: dnsmasq main process (1699) killed by
SEGV signal
...

User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.6)
Gecko/20091216 Iceweasel/3.5.6 (like Firefox/3.5.6; Debian-3.5.6-1)
Comment 1 Philipp Zabel (reporter) 2010-01-17 00:16:02 UTC
A gdb session log of the segmentation fault in question:

BusyBox v1.10.2 (Debian 3:1.10.2.legal-1osso26+0m5) built-in shell (ash)
Enter 'help' for a list of built-in commands.

/home/user # gdb dnsmasq
GNU gdb (GDB) 6.8.50.20090417-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
(gdb) run -a 127.0.0.1 -i lo -z -d
Starting program: /home/user/dnsmasq -a 127.0.0.1 -i lo -z -d


Program received signal SIGSEGV, Segmentation fault.
forward_query (daemon=0x24008, udpfd=-1, udpaddr=0x0, 
    dst_addr=0x0, dst_iface=0, header=0x24b38, plen=46, 
    now=-9382099, forward=0x28190) at forward.c:219
219    forward.c: No such file or directory.
    in forward.c
(gdb) 
(gdb) bt
#0  forward_query (daemon=0x24008, udpfd=-1, udpaddr=0x0, 
    dst_addr=0x0, dst_iface=0, header=0x24b38, plen=46, 
    now=-9382099, forward=0x28190) at forward.c:219
#1  0x0001112a in reply_query (daemon=0x24008, 
    fd=<value optimized out>, 
    family=<value optimized out>, now=-9382099)
    at forward.c:505
#2  0x00012110 in check_dns_listeners (daemon=0x24008, 
    set=0xbee19a58, now=-9382099) at dnsmasq.c:728
#3  0x00012d00 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at dnsmasq.c:617
(gdb)
Comment 2 Philipp Zabel (reporter) 2010-01-17 00:17:57 UTC
*** Bug 8134 has been marked as a duplicate of this bug. ***
Comment 3 Philipp Zabel (reporter) 2010-01-17 00:18:42 UTC
*** Bug 8135 has been marked as a duplicate of this bug. ***
Comment 4 Lucas Maneos 2010-01-17 04:36:46 UTC
Thanks for the report and log.  Could you attach the contents of
/var/run/resolv.conf.wlan0 when connected to this AP, as well as a tcpdump
capture of DNS traffic?
Comment 5 Darryl L. Miles 2010-01-21 13:38:30 UTC
The current version of dnsmasq is 2.45-1+lenny1, but the N900 ships with
2.35-1osso11+etch4+0m5.

Is it possible to refresh the base package?

That is, Nokia removes their bespoke patches (if any), updates the base
package, and then reapplies or reworks their changes.
Comment 6 Darryl L. Miles 2010-01-21 14:35:52 UTC
FYI (just in case the maintainer didn't know)


$ apt-get source dnsmasq
$ wget
http://ftp.de.debian.org/debian/pool/main/d/dnsmasq/dnsmasq_2.46.orig.tar.gz
$ cd dnsmasq-2.35
$ uupdate -u dnsmasq_2.45.orig.tar.gz

Gets you most of the way there.  The DBUS CFLAGS stuff in the Makefiles looks to
have been fixed already (so the Makefile changes may not be necessary).


Changes to the source might fix the original reporter's problem. The line
of code around 219 is:
      domain = forward->sentto->domain;
Given the lines above, I guess the issue is that forward->sentto is NULL or
invalid.

The source changes need to be audited by the maintainer, as he would know the
goal of the fixes being made.  Some of them might have already been addressed.


As for the original bug, there have been changes to these functions (in
dnsmasq-2.35-1+etch4.orig/src/forward.c around lines ~850-950):

struct frec *allocate_frec(struct daemon *daemon, time_t now);
struct randfd *allocate_rfd(struct daemon *daemon, int family);

These functions use malloc(), and they now have an additional line:

f->sentto = NULL;

Without it, the field would be undefined (as in the 2.35 version). Maybe this
is the cause of the random crashing?



Please rebase dnsmasq to the current lenny version.
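
To illustrate the theory above, here is a minimal sketch of the pattern. It is
not dnsmasq's actual code: the structures are cut down to the fields mentioned
in this report, and allocate_frec_sketch() and frec_domain() are hypothetical
names. It only shows why a record obtained from malloc() needs sentto set
explicitly, and how a NULL check avoids the dereference seen at forward.c:219.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for dnsmasq's structures. */
struct server {
    const char *domain;
};

struct frec {
    struct server *sentto;   /* server the query was last sent to, if any */
    struct frec *next;
};

/* Allocate a forwarding record and initialise every pointer field.
   malloc() does not zero memory, so without the explicit assignments
   sentto would hold an indeterminate value. */
struct frec *allocate_frec_sketch(void)
{
    struct frec *f = malloc(sizeof *f);
    if (f != NULL) {
        f->sentto = NULL;
        f->next = NULL;
    }
    return f;
}

/* Return the domain of the server the record was sent to, or NULL when
   the record has not been sent anywhere yet. */
const char *frec_domain(const struct frec *f)
{
    if (f == NULL || f->sentto == NULL)
        return NULL;
    return f->sentto->domain;
}
```

With this shape, a caller that gets NULL back from frec_domain() can drop or
re-queue the query instead of following an invalid pointer.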
Comment 7 Darryl L. Miles 2010-01-21 14:36:57 UTC
/usr/bin/uupdate comes in package "devscripts" which is not installed in
scratchbox by default.

apt-get install devscripts
Comment 8 Philipp Zabel (reporter) 2010-01-22 11:05:58 UTC
Created an attachment (id=2089)
output of tcpdump -i wlan0 port 53 -w dnsmasq-crash.tcpdump

DNS traffic log during the crash
Comment 9 Philipp Zabel (reporter) 2010-01-22 11:06:25 UTC
Created an attachment (id=2090)
output of tcpdump -i wlan0 port 53 -w dnsmasq-crash-2.tcpdump
Comment 10 Philipp Zabel (reporter) 2010-01-22 11:06:47 UTC
Created an attachment (id=2091)
contents of the wlan0 resolv.conf
Comment 11 Andre Klapper maemo.org 2010-01-22 14:53:15 UTC
Thanks!
Comment 12 Jukka Rissanen nokia 2010-02-08 17:18:21 UTC
Hi,

I am currently maintaining dnsmasq at Nokia, and I ported dnsmasq
(2.45-1+lenny1) to Maemo (only debian directory changes were needed).

I would be grateful if the original reporter could try out this new version as
we do not seem to be able to reproduce the problem internally. I tried the
packages in my N900 and everything seems to work ok but you should probably
take a backup of your device before doing any experimenting.

Please note that after you have installed the dnsmasq and dnsmasq-base
packages, your SSU will no longer work (because of the dnsmasq package version
differences), so to get SSU working again later you would need to
re-install the original dnsmasq version.
Comment 13 Jukka Rissanen nokia 2010-02-08 17:20:27 UTC
Created an attachment (id=2239)
dnsmasq_2.45-1+lenny1+maemo1_all.deb
Comment 14 Jukka Rissanen nokia 2010-02-08 17:21:05 UTC
Created an attachment (id=2240)
dnsmasq-base_2.45-1+lenny1+maemo1_armel.deb
Comment 15 Jukka Rissanen nokia 2010-02-08 17:21:36 UTC
Created an attachment (id=2241)
dnsmasq_2.45-1+lenny1+maemo1.diff.gz
Comment 16 Jukka Rissanen nokia 2010-02-08 17:34:37 UTC
(In reply to comment #12)
> I would be grateful if the original reporter could try out this new version as
> we do not seem to be able to reproduce the problem internally. I tried the
> packages in my N900 and everything seems to work ok but you should probably
> take a backup of your device before doing any experimenting.
> 

Forgot to add the installation instructions (just in case):
- transfer the debs to your device
- dpkg -i dnsmasq_2.45-1+lenny1+maemo1_all.deb dnsmasq-base_2.45-1+lenny1+maemo1_armel.deb
- reboot
Comment 17 Philipp Zabel (reporter) 2010-02-10 13:24:28 UTC
I quickly tried the new dnsmasq package yesterday, and it didn't crash once
during about a dozen connect/disconnect cycles.
I can check in more detail some time (take another tcpdump to see if it still
receives the same ServFail DNS responses), but so far it's looking good.
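
For anyone repeating the check mentioned above on a raw capture, the response
code can be read straight from the DNS header. The sketch below assumes a
buffer holding one complete DNS message (the UDP payload, without link, IP or
UDP headers); the function names are made up for illustration. Per RFC 1035,
the QR bit is the top bit of the third header byte and RCODE is the low four
bits of the fourth, with 2 meaning Server Failure.

```c
#include <stddef.h>
#include <stdint.h>

/* RCODE values from RFC 1035, section 4.1.1. */
enum {
    DNS_RCODE_NOERROR  = 0,
    DNS_RCODE_SERVFAIL = 2,
    DNS_RCODE_NXDOMAIN = 3,
    DNS_RCODE_REFUSED  = 5
};

#define DNS_HEADER_LEN 12

/* Return the RCODE of a raw DNS message, or -1 if the buffer is too
   short to contain the fixed 12-byte header. */
int dns_rcode(const uint8_t *msg, size_t len)
{
    if (msg == NULL || len < DNS_HEADER_LEN)
        return -1;
    return msg[3] & 0x0f;
}

/* Return 1 if the message is a response (QR bit set), 0 if it is a
   query, or -1 if the buffer is too short. */
int dns_is_response(const uint8_t *msg, size_t len)
{
    if (msg == NULL || len < DNS_HEADER_LEN)
        return -1;
    return (msg[2] & 0x80) != 0;
}
```

A reply for which dns_is_response() returns 1 and dns_rcode() returns
DNS_RCODE_SERVFAIL is the kind of response referred to above.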
Comment 18 Mikle 2010-02-17 03:13:43 UTC
With some APs my N900 can't resolve domain names: about half of the queries
return 1.0.0.0.

The instructions in comment 16 didn't help. My current firmware version is
3.2010.02-8.
Comment 19 Lucas Maneos 2010-02-17 10:44:23 UTC
(In reply to comment #18)
> With some APs my N900 can't resolve domain names: about half of the queries
> return 1.0.0.0.

This is unrelated.   Please check with tcpdump that the upstream nameserver is
returning correct replies, and if so file a separate bug report.
Comment 20 Andre Klapper maemo.org 2010-02-17 21:24:04 UTC
This has been fixed in package
dnsmasq 2.45-1+lenny1+maemo3+0m5
which is part of the internal build version
10.2010.06-14
(Note: 2009/2010 is the year, and the number after is the week.)

A future public update released with the year/week later than this internal
build version will include the fix. (This is not always already the next public
update.)
Please verify that this new version fixes the bug by marking this bug report as
VERIFIED after the public update has been released and if you have some time.


To answer popular followup questions:
 * Nokia does not announce release dates of public updates in advance.
 * There is currently no access to these internal, non-public build versions.
   A Brainstorm proposal to change this exists at
http://maemo.org/community/brainstorm/view/undelayed_bugfix_releases_for_nokia_open_source_packages-002/
Comment 21 Andre Klapper maemo.org 2010-03-15 20:52:41 UTC
Setting explicit PR1.2 milestone (so it's clearer in which public release the
fix will be available to users).

Sorry for the bugmail noise (you can filter on this message).
Comment 22 Andre Klapper maemo.org 2010-03-15 20:52:53 UTC
Setting explicit PR1.2 milestone (so it's clearer in which public release the
fix will be available to users).

Sorry for the bugmail noise (you can filter on this message).