Bug 6641 - (int-149761) Sofia-SIP throws 500 Internal Server Error
(int-149761)
: Sofia-SIP throws 500 Internal Server Error
Status: RESOLVED FIXED
Product: Chat & Call & SMS
SIP
: 5.0/(2.2009.51-1)
: N900 Linux
: Medium normal (vote)
: 5.0/(10.2010.19-1)
Assigned To: rtcomm@maemo.org
: sip-bugs
:
:
:
:
  Show dependency tree
 
Reported: 2009-12-06 23:41 UTC by Jonas Arndt
Modified: 2010-03-15 20:52 UTC (History)
4 users (show)

See Also:


Attachments
SIP negotiation (tethereal) (1.17 KB, application/x-gzip)
2009-12-07 07:02 UTC, Jonas Arndt
Details
syslog (2.97 KB, application/x-gzip)
2009-12-07 18:16 UTC, Jonas Arndt
Details
Log from sofia, running on desktop (1.48 KB, application/x-gzip)
2009-12-08 23:34 UTC, Jonas Arndt
Details
Working session (5.16 KB, application/x-compressed-tar)
2009-12-09 04:51 UTC, Jonas Arndt
Details
tethereal sip (1.22 KB, application/x-gzip)
2009-12-09 05:08 UTC, Jonas Arndt
Details
libgstreamer with debug (989.14 KB, application/octet-stream)
2009-12-24 00:32 UTC, Olivier Crête
Details
Farsight2 build with extra debug (145.83 KB, application/octet-stream)
2009-12-24 00:36 UTC, Olivier Crête
Details


Note

You need to log in before you can comment on or make changes to this bug.


Description Jonas Arndt (reporter) 2009-12-06 23:41:58 UTC
SOFTWARE VERSION:
1.2009.42-11.002 on the N900
Asterisk 1.6.1.6 on the server

EXACT STEPS LEADING TO PROBLEM: 
Configured the N900 SIP Client with the following

Username: 3450@192.168.2.92
Password: xxxxxxxxx
Advanced Settings:
User for telephone numbers: Checked
User name: 3450
Transport: Auto
Outbound proxy: 192.168.2.92
Port: 5060
The rest: Default

The phone registers fine with asterisk. However, when trying to make a call
from another SIP phone to the N900 I get:

SIP Phone <---------------------------------->   N900

Invite with media description =>
                                       <== 100 Trying
                                       <== 180 Ringing
                                       <== 500 Internal Server Error


I have played with the settings like crazy, but nothing seems to help. Any
other SIP phone works fine with my Asterisk (Cisco 9640, X-lite ....)


EXPECTED OUTCOME:
The correct SIP response after ringing (when somebody picks up) is to provide a
200 OK with a session description so the other guy know where to send the
sound.

ACTUAL OUTCOME:
N900 sends a 500: Internal Server Error

REPRODUCIBILITY:
Always happens

EXTRA SOFTWARE INSTALLED: Just the ssh client

OTHER COMMENTS:

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.5)
Gecko/20091103 SUSE/3.5.5-1.1.2 Firefox/3.5.5
Comment 1 Jonas Arndt (reporter) 2009-12-07 07:02:29 UTC
Created an attachment (id=1681) [details]
SIP negotiation (tethereal)

This is a sip negotiation caught in tethereal. 
3456 is another SIP phone calling the N900 (3450)
Comment 2 Jonas Arndt (reporter) 2009-12-07 07:03:52 UTC
I just learned that someone else has an N900 working with Asterisk. I still get
the 500 return from the device.
Comment 3 Mikhail Zabaluev nokia 2009-12-07 12:17:54 UTC
I can't see anything abnormal in the request.

Could you provide verbose SIP trace captured in syslog? To do so, start a
terminal window. Now export TPORT_LOG=1 and export TPSIP_DEBUG=all in the
environment (e.g. by editing /etc/osso-af-init/af-defines.sh) for
telepathy-sofiasip. Exporting TPSIP_PERSIST=1 also prevents telepathy-sofiasip
from exiting due to no connections being requested. Now SIP traffic logs get
dumped into syslog. Also see
http://wiki.maemo.org/Documentation/devtools/maemo5/syslog
Comment 4 Jonas Arndt (reporter) 2009-12-07 18:16:54 UTC
Created an attachment (id=1683) [details]
syslog

Hi,

I followed your instructions. I have attached the log. The only thing I could
see that stood out was "nua_set_params(): failed: Error Setting SOA
Parameters". But I don't know the code at all, so you might find something else
there.

Thanks for your help,

// Jonas
Comment 5 Mikhail Zabaluev nokia 2009-12-07 18:53:29 UTC
(In reply to comment #4)
> Created an attachment (id=1683) [details] [details]
> syslog

Thank you very much. We should investigate the problem.
Comment 6 Jonas Arndt (reporter) 2009-12-08 01:25:48 UTC
Just as an FYI. I also tried on an Asterisk 1.4, same result
Comment 7 Mikhail Zabaluev nokia 2009-12-08 13:49:12 UTC
Do you, by chance, have Empathy and telepathy-sofiasip installed on a Linux
desktop? I'd like to confirm if the same problem exists there as well.
Comment 8 Jonas Arndt (reporter) 2009-12-08 17:14:09 UTC
I have now. I am running OpenSuSE 11.2.

I will try to reproduce this and I will come back to you soon

Thanks
Comment 9 Jonas Arndt (reporter) 2009-12-08 19:12:00 UTC
So on the desktop I get a 480 Temporarily Unavailable response after the
RINGING. Unfortunately I can't find the /etc/osso-af-init/af-defines.sh file to
put the thing in debug mode. How can I debug this animal on a desktop?

Thanks,

// Jonas
Comment 10 Mikhail Zabaluev nokia 2009-12-08 19:25:38 UTC
(In reply to comment #9)
> So on the desktop I get a 480 Temporarily Unavailable response after the
> RINGING. Unfortunately I can't find the /etc/osso-af-init/af-defines.sh file to
> put the thing in debug mode. How can I debug this animal on a desktop?

You can run telepathy-sofiasip from a terminal in the user session.
Define the debugging variables and run it like this (you have to go offline in
Empathy first):
TPSIP_PERSIST=1 /usr/lib/telepathy/telepathy-sofiasip
Comment 11 Jonas Arndt (reporter) 2009-12-08 23:30:57 UTC
So I am thinking I am facing a totally different problem on the desktop
environment. I am getting a 480 instead and in the log files I keep seeing
something about the stream engine not being ready. I guess this needs to be
solved prior to getting to a point where we can focus on the problem. Any
pointers here? I am uploading the log soon.

// Jonas
Comment 12 Jonas Arndt (reporter) 2009-12-08 23:34:48 UTC
Created an attachment (id=1701) [details]
Log from sofia, running on desktop

You can probably see that I am facing something completely different here. I
have the following packages installed

libsofia-sip-ua-glib3
libsofia-sip-ua0
telepathy-sofiasip
libtelepathy
libtelepathy-farsight0
libtelepathy-glib0
python-telepathy
telepathy-butterfly
telepathy-gabble
telepathy-haze
telepathy-idle
telepathy-mission-control
telepathy-salut
telepathy-sofiasip
telepathy-stream-engine
Comment 13 Jonas Arndt (reporter) 2009-12-09 04:51:57 UTC
Created an attachment (id=1706) [details]
Working session

So I got it working on the desktop environment. I had to install some gstreamer
plugins. This attachment has two log files. One is from telepathy-sofiasip and
the other from empathy.
Comment 14 Jonas Arndt (reporter) 2009-12-09 05:08:05 UTC
Created an attachment (id=1707) [details]
tethereal sip

tethereal sip conversation

For some reason it takes a really long time from when I push answer in empathy
till I hear sound, which is also visible in the SIP trace
Comment 15 Jonas Arndt (reporter) 2009-12-09 09:05:04 UTC
I started to compare some log files. A clear difference that happens just prior
to the "nua_set_params(): failed: Error Setting SOA Parameters", followed by
the Internal Server error is

"cb_fs_local_candidates_prepared: candidate->ip = ''"

On the working desktop session this has a real IP address
Comment 16 Mikhail Zabaluev nokia 2009-12-09 11:37:04 UTC
Thanks for your time analyzing this.
Looks like we have a problem in Farsight/stream-engine, indeed.
Comment 17 Jonas Arndt (reporter) 2009-12-11 07:01:54 UTC
So what would be the next step here? Is there anything I can do to help out? It
feels a bit like we have concluded where the problem is but not what it is.
Comment 18 Mikhail Zabaluev nokia 2009-12-11 11:48:41 UTC
(In reply to comment #17)
> So what would be the next step here? Is there anything I can do to help out? It
> feels a bit like we have concluded where the problem is but not what it is.

Well, I'm out of things that we could mine out of your device with your kind
help.
The internal bug is filed and will be prioritized.
In the trace I can see that the device is sending out UPnP discovery requests,
but does not receive a response. This should not prevent the stream engine from
picking up the correct local transport address, though. Olivier, any clues on
what could be the cause?
Comment 19 Mikhail Zabaluev nokia 2009-12-11 11:55:15 UTC
I've got a second thought: as a workaround, and to help us localize the
problem, you can install a STUN server in your local network and set its
address in the advanced SIP account settings on the device. This should change
the path of transport address discovery.
Comment 20 Jonas Arndt (reporter) 2009-12-11 12:08:03 UTC
Exactly what I noticed. You get those SSDP requests. I haven't seen where this
happens though. Perhaps this is related to why we don't get an IP
"cb_fs_local_candidates_prepared: candidate->ip = ''" on the N900.

On the working Desktop trace I don't see any SSDP package. The question here is
if there something triggering this discovery
Comment 21 Mikhail Zabaluev nokia 2009-12-11 12:10:07 UTC
(In reply to comment #20)
> On the working Desktop trace I don't see any SSDP package. The question here is
> if there something triggering this discovery

It's done every time unless you have STUN configured or discovered off your SIP
service domain.
Comment 22 Jonas Arndt (reporter) 2009-12-11 12:14:50 UTC
Great idea with the STUN server. Let me work on that.

Thanks!
Comment 23 Jonas Arndt (reporter) 2009-12-11 13:06:27 UTC
Hi Guys,

Great news. So with a running stun server it works. I still get the SSDP
packages there. However with a stun server in the path the N900 comes back with
a 200 OK and a session description. 

This is great and I am really thrilled. Long term I guess one shouldn't really
have to run a stun server on an internal network though.
Comment 24 Mikhail Zabaluev nokia 2009-12-11 13:54:37 UTC
One probably last thing to look at: could you dump the output of ifconfig when
you are connected to the WLAN with the Asterisk server?
Comment 25 Jonas Arndt (reporter) 2009-12-11 14:04:14 UTC
Here you go. You did see my post of that it is working with a stun server,
right? 

Nokia-N900-42-11:~# ifconfig
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:167 errors:0 dropped:0 overruns:0 frame:0
          TX packets:167 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0                             
          RX bytes:10897 (10.6 KiB)  TX bytes:10897 (10.6 KiB)  

phonet0   Link encap:UNSPEC  HWaddr
15-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          UP POINTOPOINT RUNNING NOARP  MTU:4000  Metric:1
          RX packets:2314 errors:0 dropped:0 overruns:0 frame:0
          TX packets:137 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:55892 (54.5 KiB)  TX bytes:2165 (2.1 KiB)

wlan0     Link encap:Ethernet  HWaddr 34:7E:39:42:B5:02
          inet addr:192.168.2.30  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5620 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3778 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1544384 (1.4 MiB)  TX bytes:906870 (885.6 KiB)

wmaster0  Link encap:UNSPEC  HWaddr
00-1F-DF-49-B3-6B-00-00-00-00-00-00-00-00-00-00
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Nokia-N900-42-11:~# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.2.0     *               255.255.255.0   U     0      0        0 wlan0
default         192.168.2.1     0.0.0.0         UG    0      0        0 wlan0
Comment 26 Mikhail Zabaluev nokia 2009-12-11 16:41:11 UTC
(In reply to comment #25)
> Here you go. You did see my post of that it is working with a stun server,
> right?

Yes, and thanks. Nothing seems to be unusual about your network configuration.
Comment 27 Olivier Crête 2009-12-24 00:32:23 UTC
Created an attachment (id=1837) [details]
libgstreamer with debug
Comment 28 Olivier Crête 2009-12-24 00:36:24 UTC
Created an attachment (id=1838) [details]
Farsight2 build with extra debug

I've attached versions of gstreamer and farsight with debug printing enabled.
Can you please get a shell and do
ST_DEBUG=fs*:5 STREAM_ENGINE_PERSIST=1
/usr/lib/telepathy/telepathy-stream-engine > log

And attach the log here (and also the matching syslog). That should hopefully
give me enough information to find the source of this bug.
Comment 30 Olivier Crête 2010-01-12 06:02:16 UTC
Jonas, can you still reproduce the problem ?
Comment 31 Jonas Arndt (reporter) 2010-01-15 05:34:15 UTC
Hi Olivier,

I could have sworn I replied to you earlier. There is no sign about that in
this ticket though so I guess i goofed up some how.

I upgraded the firmware for my WRT54g2 and the problem with the internal server
error went away. The problem is not completely solved though. Without the stun
server it will take a good 15-20 seconds from when I've answered the incoming
call on tne N900 till sound starts flowing in both direction, so there is still
something weird going on here.

At this point I am not sure if your debug compiled library would through light
on this "new" situation. I am willing to try if you think it would. It would be
great if I could get just the library files though so I can temporarily replace
the original ones with the debug compiled one. It is much quicker then fiddle
around with the package manager.

Thanks,

// Jonas
Comment 32 Olivier Crête 2010-01-15 05:45:46 UTC
Well, you can try making a new tcpdump to see if it helps.
Comment 33 Olivier Crête 2010-01-15 05:47:06 UTC
I think I know what your original problem was .. the UPnP implementation in
your router is probably buggy. I will add code validating the result further.
Comment 34 Jonas Arndt (reporter) 2010-01-15 06:05:46 UTC
Yeah something was wrong with the firmware. I also noticed that the AP would
lock up after a couple of times trying to call. I had to push the reset button
to fix it. All that is gone now. Still it is the delay in connecting the media
stream without a stun server. I will try another tethereal again to see what is
going on.
Comment 35 Olivier Crête 2010-01-19 02:20:54 UTC
The original bug has been fixed in gupnp-igd 0.1.4-0maemo2 which should
hopefully be in the next release. It now validates that the returned IP
adresses are actually valid.
Comment 36 Andre Klapper maemo.org 2010-01-28 14:45:15 UTC
This has been fixed in package
gupnp-igd 0.1.4-0maemo2+0m5
which is part of the internal build version
2010.04-8
(Note: 2009/2010 is the year, and the number after is the week.)

A future public update released with the year/week later than this internal
build version will include the fix. (This is not always already the next public
update.)
Please verify that this new version fixes the bug by marking this bug report as
VERIFIED after the public update has been released and if you have some time.


To answer popular followup questions:
 * Nokia does not announce release dates of public updates in advance.
 * There is currently no access to these internal, non-public build versions.
   A Brainstorm proposal to change this exists at
http://maemo.org/community/brainstorm/view/undelayed_bugfix_releases_for_nokia_open_source_packages-002/
Comment 37 Mikhail Zabaluev nokia 2010-02-18 17:47:12 UTC
It would help our internal testing if you can provide the model and the
firmware version of the gateway which was in use when the problem was observed.
We are pretty sure it is fixed, but this information could help us to make it
certain rather than waiting for your feedback after the next release. Thank you
for your help so far.
Comment 38 Jonas Arndt (reporter) 2010-02-18 19:06:55 UTC
Hi Mikhail,

The model is wrt54g2v1. I will have to come back later today with the
firmware...

// Jonas
Comment 39 Jonas Arndt (reporter) 2010-02-19 16:27:29 UTC
Firmware version now: 1.0.04 build 005, Jun. 16, 2009.
This is the working version

What I had be fore that is whatever the AP was shipped with
Comment 40 Andre Klapper maemo.org 2010-03-15 20:52:01 UTC
Setting explicit PR1.2 milestone (so it's clearer in which public release the
fix will be available to users).

Sorry for the bugmail noise (you can filter on this message).