Bug 8615 - (int187721/int-188421) telepathy-sofiasip uses lots of memory and causes the phone to swap
(int187721/int-188421)
: telepathy-sofiasip uses lots of memory and causes the phone to swap
Status: RESOLVED FIXED
Product: Chat & Call & SMS
Telepathy
: 5.0:(10.2010.19-1)
: All Maemo
: Unspecified major (vote)
: Harmattan
Assigned To: rtcomm@maemo.org
: telepathy-bugs
:
: performance, upstream, use-time
:
:
  Show dependency tree
 
Reported: 2010-01-28 17:28 UTC by Damien Mascord
Modified: 2011-02-19 20:33 UTC (History)
5 users (show)

See Also: https://bugs.freedesktop.org/show_bug.cgi?id=32615


Attachments
telepath-smaps as requested (19.90 KB, text/plain)
2010-03-23 19:34 UTC, Damien Mascord
Details
Output of memcpu monitoring (1.40 KB, text/plain)
2010-03-24 16:49 UTC, Damien Mascord
Details
SIP transactions related to memory increase (118.06 KB, text/plain)
2010-07-01 23:09 UTC, Damien Mascord
Details
sofia-sip server siptrace logs (1.86 MB, application/zip)
2010-07-07 21:09 UTC, Damien Mascord
Details


Note

You need to log in before you can comment on or make changes to this bug.


Description Damien Mascord (reporter) 2010-01-28 17:28:32 UTC
SOFTWARE VERSION:
(Settings > General > About product)

EXACT STEPS LEADING TO PROBLEM: 
(Explain in detail what you do (e.g. tap on OK) and what you see (e.g. message
Connection Failed appears))
1. Setup a SIP account
2. Have the SIP account active for 24-48 hours
3. Notice that the telepathy-sofia binary is using over 25% of the available
memory

EXPECTED OUTCOME:

Memory use does not increase over time, and stays stable 

ACTUAL OUTCOME:

Memory use increases over time, and the phone becomes unusable.

REPRODUCIBILITY:
(always, less than 1/10, 5/10, 9/10)

EXTRA SOFTWARE INSTALLED:

OTHER COMMENTS:

User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/532.9
(KHTML, like Gecko) Chrome/5.0.307.0 Safari/532.9
Comment 1 Andre Klapper maemo.org 2010-01-28 18:35:20 UTC
I'm wondering if this is a duplicate of bug 8581...
Comment 2 Damien Mascord (reporter) 2010-01-28 19:15:27 UTC
Hi Andre,

This is extreme memory usage, rather than extreme network usage, so different
bug I think.

Cheers,

Damien
Comment 3 Andre Klapper maemo.org 2010-02-03 18:22:43 UTC
Are any other IM accounts active to?
Can you provide some numbers about memory increase, and how you have created
those numbers?

Also, these fields are missing:

> REPRODUCIBILITY:
> (always, less than 1/10, 5/10, 9/10)
>
> EXTRA SOFTWARE INSTALLED:
Comment 4 Damien Mascord (reporter) 2010-02-03 18:28:43 UTC
Hi Andre,

I will turn back on the telepathy-sofia now, and get you more details regarding
the memory usage.   It is reproducible within 48 hours, so will let you know
once that time comes.

In terms of other software installed, I do not have any related software
installed, and it would be cumbersome to list everything that is installed.  
Are you looking for something in particular ?

Cheers,

Damien
Comment 5 Damien Mascord (reporter) 2010-02-03 19:15:24 UTC
Hi Andre,

OK, it took much shorter than expected:

telepathy-sofia: 77368M RSS, 31.2% memory and 32.4% CPU.

The memory usage just keeps going up... it's now at 79464... with only 10956K
free... it will eventually swap and make the device unusable.

Cheers,

Damien
Comment 6 Andre Klapper maemo.org 2010-02-09 22:10:54 UTC
> EXTRA SOFTWARE INSTALLED:

Well, if you are running some stuff from -devel or -testing there could be some
undetected side effects.

Anything IM/Chat/Call related?
Comment 7 Damien Mascord (reporter) 2010-02-09 22:33:34 UTC
Hi,

I have Pidgin Internet installed, but not used.
I also have the Conversations Desktop Widget installed.

Otherwise nothing call/chat/IM related.

Cheers,

Damien
Comment 8 Eero Tamminen nokia 2010-03-23 16:45:36 UTC
Can you attach /proc/PID/smaps file for the telepathy-sofia process which
memory usage has increased a lot?
Comment 9 Damien Mascord (reporter) 2010-03-23 19:34:40 UTC
Created an attachment (id=2507) [details]
As requested
Comment 10 Eero Tamminen nokia 2010-03-23 20:02:46 UTC
Wow.  The heap (normal allocation) usage is enormous:
00033000-1b106000 rw-p 00033000 00:00 0          [heap]
Size:             443212 kB
Rss:               31836 kB
Pss:               31836 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:       260 kB
Private_Dirty:     31576 kB
Referenced:        21032 kB
Swap:             411368 kB

And there's some pretty huge anonymous alloc too:
43be2000-47be3000 rw-p 43be2000 00:00 0 
Size:              65540 kB
Rss:               43508 kB
Pss:               43508 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:     43508 kB
Referenced:        22588 kB
Swap:              22032 kB

This large memory usage is a performance issue -> adding keyword.


Could you install sp-memusage package from the SDK tools repository, kill the
telepathy-sofiasip process and after it has been restarted, start tracking its
memory usage with something like this:
  mem-cpu-monitor 300 $(pidof telepathy-sofiasip)

It will output the memory usage every 5 minutes.

You can then check what kind of activity in the device seems to grow the
process memory usage (follow the dirty column, change column takes also clean
memory into account which is not so good, that will be fixed after PR1.2).

I'd like to know what kind of activity increases the memory usage:
- "idling"
- doing calls
- receiving calls
- whether the increase is constant or happens only at some specific time
- is it e.g. specific to some account
Comment 11 Damien Mascord (reporter) 2010-03-24 16:49:46 UTC
Created an attachment (id=2522) [details]
Output of memcpu monitoring
Comment 12 Damien Mascord (reporter) 2010-03-24 16:52:55 UTC
The attached file shows the huge amount of memory utilization in such a short
period of time.   I am doing nothing on the phone at all, so it's just idling. 
 I did not make any calls at all during the session.

You can see the memory usage suddenly drop off once I disable the SIP account.

I am using my own SIP server running on FreeSwitch (latest trunk) which uses
the Sofia library too, so if you want a login/password I can provide this
privately if it would help in the debugging ?

Cheers,

Damien
Comment 13 Eero Tamminen nokia 2010-04-15 10:52:34 UTC
(Was on vacation and had some thousand mails after a week, took a bit of time
to go through my backlog...)

(In reply to comment #12)
> The attached file shows the huge amount of memory utilization in such
> a short period of time.   I am doing nothing on the phone at all, so it's
> just idling.  I did not make any calls at all during the session.

It's quite interesting that sofiasip CPU usage is also slowly increasing, like
it were adding things to some list or something and traversing through that
whole the time.

Any idea why mem-cpu-monitor reports such bogus average MHz values for your
device (they should be between 250 - 600, not -70 - 71)?


> You can see the memory usage suddenly drop off once I disable the SIP account.
> 
> I am using my own SIP server running on FreeSwitch (latest trunk) which uses
> the Sofia library too, so if you want a login/password I can provide this
> privately if it would help in the debugging ?

If you disable the SIP account, reboot, wait for a while and then both enabled
SIP account and monitoring of the sofiasip process, does the memory increase
start immediately?

If yes, then the issue could be some interaction with your server.  If it
doesn't start immediately, then I would like you to try to pinpoint the trigger
which starts the memory increase as AFAIK nobody else has had this issue (at
least yet).
Comment 14 Damien Mascord (reporter) 2010-05-06 17:10:41 UTC
Hi,

That was the method of testing which I did to gather the latest attachment, and
the memory increase started within 30 minutes.

I have sent you a private email with an account that you can test with.

Cheers,

Damien
Comment 15 Eero Tamminen nokia 2010-05-06 17:47:14 UTC
(In reply to comment #14)
> That was the method of testing which I did to gather the latest attachment,
> and the memory increase started within 30 minutes.

So it didn't start immediately.  Any possibility of finding the cause better,
e.g. by correlating mem-cpu-monitor timestamps and your SIP server logs?

Or maybe you could install also syslog to N900 and check whether there's
anything interesting around the time the memory increase starts (Glib error
messages for example)?


> I have sent you a private email with an account that you can test with.

Sorry, I don't work with IM/VOIP stuff myself...
Comment 16 Damien Mascord (reporter) 2010-05-06 18:03:42 UTC
Hi,

If you have an n900 handy, it'll make it easy for you to replicate the issue
with the account that I sent through.   Going back and forth trying to debug
the issue using my phone will take quite some time as you are the expert, and
I'm just a user :)

Cheers,

Damien
Comment 17 Andre Klapper maemo.org 2010-05-06 18:09:55 UTC
(In reply to comment #16)
> If you have an n900 handy, it'll make it easy for you to replicate the issue
> with the account that I sent through.   Going back and forth trying to debug
> the issue using my phone will take quite some time as you are the expert, and
> I'm just a user :)

Most issues are not easily reproducible otherwise they would have been fixed
already, plus Eero is around in maemo.org Bugzilla voluntarily and has really
lots of other stuff to do too.
So more info needed from you to track this down...
Comment 18 Damien Mascord (reporter) 2010-05-06 18:13:33 UTC
OK, granted, my apologies!

I'll do some packet sniffing and try to co-relate the events.
Comment 19 Damien Mascord (reporter) 2010-07-01 23:09:13 UTC
Created an attachment (id=2955) [details]
SIP transactions related to memory increase
Comment 20 Damien Mascord (reporter) 2010-07-01 23:12:18 UTC
Hi,

As you can see from the sip trace here, there are tonnes and tonnes of OPTIONS
calls that happen around the same time of the memory increase.

You can also see a few NOTIFY which do not get answered, due to a port change
for the external IP.

Cheers,

Damien
Comment 21 Damien Mascord (reporter) 2010-07-07 21:09:23 UTC
Created an attachment (id=2963) [details]
sofia-sip server siptrace logs

Here is the sofia debug output on the server side, as you can see the Contact:
header is replicated, which is very likely the reason for the excessive memory
usage.

The user involved is tuskern900, if you look in the latest log, you will see
the massive SIP headers.
Comment 22 Andre Klapper maemo.org 2010-08-24 20:47:40 UTC
Damien: I assume that you have updated to 10.2010.19-1 in the meantime, right?
Comment 23 Damien Mascord (reporter) 2010-08-24 21:05:27 UTC
Hi,

I have always updated the software on the day (or day after) the release of the
software has been made, so yes.

Cheers,

Damien
Comment 24 Mikhail Zabaluev nokia 2010-08-25 15:25:14 UTC
You can check if the problem persists when you disable keepalive requests by
setting the keepalive method to Off in the SIP account settings.

Also, try to set the transport to TCP and see if it helps.
Comment 25 Damien Mascord (reporter) 2010-08-25 16:41:53 UTC
After turning off keepalives and switching to TCP only, the CPU and memory
usage seems to be stable over a 30 minute period so far.

Will test now with KeepAlives auto and TCP only, and update the ticket shortly.
Comment 26 Damien Mascord (reporter) 2010-08-25 18:47:02 UTC
Having KeepAlive Auto, and Connection Type to be TCP, the problem doesn't
appear.

Am testing with KeepAlive Auto and UDP now, and will update the ticket.
Comment 27 Mikhail Zabaluev nokia 2010-08-27 13:23:23 UTC
This does not happen with just any SIP proxy when using UDP and OPTIONS
keepalives. I could use a test account on a server where the leak is
reproducible.
Comment 28 Damien Mascord (reporter) 2010-08-27 16:03:04 UTC
Hi,

Test account details have been sent through to your private email.

Thanks in advance,

Damien
Comment 29 Mikhail Zabaluev nokia 2010-08-27 17:45:32 UTC
(In reply to comment #28)
> Test account details have been sent through to your private email.

Thank you. I ran a brief testing with mem-cpu-monitor, and the process image
size does not grow for me. I use more recent software releases than what is
available in release PR1.2, though.

Can you post or email packet dumps of keepalive traffic between the N900 and
the proxy, so that we could check if anything may be different in your case? I
presume this traffic does not expose any private information besides test user
IDs and IP addresses.
Comment 30 Damien Mascord (reporter) 2010-08-27 17:49:29 UTC
Hi,

You will need to leave it for about 20-30 minutes before it shows up.   Also,
it may relate to being behind a NAT, given that with NAT the UDP and TCP port
are liable to change.

KeepAlive Auto, and UDP work fine, and have been running for quite some time.

In any case, I have attached the sip logs for the problematic traffic.

Cheers,

Damien
Comment 31 Mikhail Zabaluev nokia 2010-08-27 18:13:03 UTC
(In reply to comment #30)
> You will need to leave it for about 20-30 minutes before it shows up.   Also,
> it may relate to being behind a NAT, given that with NAT the UDP and TCP port
> are liable to change.

You may want to play with the keepalive period. The default is 2 minutes, and
if it's longer than the UDP binding lifetime in the NAT router, we'll get
frequent remappings and associated re-registrations. It might lead to a list of
stale contact entries accumulating somewhere... I see one binding change in the
log you have posted. How frequently does this happen?

> KeepAlive Auto, and UDP work fine, and have been running for quite some time.

This is odd, because only UDP transport seems to be used for keepalive traffic.

Alas, the logs you have posted do not contain full SIP messages, that would
help to see if there is any problem with growing headers or something else.
Comment 32 Mikhail Zabaluev nokia 2010-08-27 18:15:53 UTC
OK, I got a leak of ~4K per 10 minutes in a network behind a NAT. Will
investigate.
Comment 33 Damien Mascord (reporter) 2010-08-27 18:16:29 UTC
Hi,

The attached "sofia-sip server siptrace logs" does contain the full SIP
headers, ie: 

  REGISTER sip:pbx2.gplhost.com SIP/2.0
   Via: SIP/2.0/TCP 10.153.7.102:64629;branch=z9hG4bKaD9ejSgKtmN1c
   Route: <sip:pbx2.gplhost.com:5060;lr>
   Max-Forwards: 70
   From: <sip:tuskern900@pbx2.gplhost.com>;tag=mUjaSFZ0mQ2aa
   To: <sip:tuskern900@pbx2.gplhost.com>
   Call-ID: aab271ee-0490-122e-01bf-347e39420000
   CSeq: 133144291 REGISTER
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>
   Contact: <sip:tuskern900@216.203.24.14:55661;transport=tcp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:55661;transport=tcp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:55661;transport=tcp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:55661;transport=tcp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:55661;transport=tcp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:55661;transport=tcp>;expires=0
   Contact: <sip:tuskern900@216.203.24.14:43034;transport=udp>;expires=0

Cheers,

Damien
Comment 34 Mikhail Zabaluev nokia 2010-08-27 18:27:16 UTC
Oh my. I didn't realise these are SIP messages: it starts with lines upon lines
of Contact:. Well, the problem is plain to see: we accumulate a huge list of
Contact headers between the UA and the proxy.
Switch to TCP or set the keepalive period to a value below the binding lifetime
in your NAT configuration.

In sofia-sip, there should be a limit of Contact headers it keeps. We got a
happy pair of two sofia-sip stacks building the list together :)
Comment 35 Mikhail Zabaluev nokia 2010-12-29 17:47:06 UTC
This is an upstream bug, and a patch is available.
Comment 36 Mikhail Zabaluev nokia 2011-01-11 20:46:28 UTC
Integrated the fix for Harmattan.
Comment 37 Rajil Saraswat 2011-02-15 22:33:06 UTC
Can this be backported to Maemo please since some of us are trying to use N900
with FreeSwitch?
Comment 38 Mikhail Zabaluev nokia 2011-02-16 11:25:19 UTC
(In reply to comment #37)
> Can this be backported to Maemo please since some of us are trying to use N900
> with FreeSwitch?

Not in official releases, sorry. But I guess the community effort could package
telepathy-sofiasip 0.6.6 or later easily.
Comment 39 Rajil Saraswat 2011-02-16 12:01:10 UTC
(In reply to comment #38)
> (In reply to comment #37)
> > Can this be backported to Maemo please since some of us are trying to use N900
> > with FreeSwitch?
> 
> Not in official releases, sorry. But I guess the community effort could package
> telepathy-sofiasip 0.6.6 or later easily.

The patches are against sofia-sip rather than telepathy-sofiasip. So sofia-sip
needs to recompiled rather telepathy-sip?

Is patching official sofia-sip (sofia-sip_1.12.10-0maemo5+0m5.tar.gz) with the
following be sufficient?

http://gitorious.org/~ppessi/sofia-sip/pessi-sofia-sip/commit/928c1059befc9a20a2c80f077ca71ebb9a111c61
http://gitorious.org/~ppessi/sofia-sip/pessi-sofia-sip/commit/8e65c5259c9ffb08646fdcd4b98d9f7ced862e9f
http://gitorious.org/~ppessi/sofia-sip/pessi-sofia-sip/commit/505dd0051d87afb2a7380be631597081572c0e1b
Comment 40 Mikhail Zabaluev nokia 2011-02-16 13:49:09 UTC
(In reply to comment #39)
> > Not in official releases, sorry. But I guess the community effort could package
> > telepathy-sofiasip 0.6.6 or later easily.
> 
> The patches are against sofia-sip rather than telepathy-sofiasip. So sofia-sip
> needs to recompiled rather telepathy-sip?

Woops, yes indeed. You'd better update telepathy-sofiasip as well to fix a
couple of other bugs, though.

> Is patching official sofia-sip (sofia-sip_1.12.10-0maemo5+0m5.tar.gz) with the
> following be sufficient?

Perhaps it's easier just to package the current master tree, the patches are
there.
Comment 41 Rajil Saraswat 2011-02-18 22:41:00 UTC
(In reply to comment #40)
> (In reply to comment #39)
> > > Not in official releases, sorry. But I guess the community effort could package
> > > telepathy-sofiasip 0.6.6 or later easily.
> > 
> > The patches are against sofia-sip rather than telepathy-sofiasip. So sofia-sip
> > needs to recompiled rather telepathy-sip?
> 
> Woops, yes indeed. You'd better update telepathy-sofiasip as well to fix a
> couple of other bugs, though.
> 
> > Is patching official sofia-sip (sofia-sip_1.12.10-0maemo5+0m5.tar.gz) with the
> > following be sufficient?
> 
> Perhaps it's easier just to package the current master tree, the patches are
> there.

telepathy-sofiasip master tree needs an updated telepathy-glib >= 0.13.9 which
depends on GLib=>2.25.16. Wouldnt updating GLib break everything else in
fremantle?
Comment 42 Mikhail Zabaluev nokia 2011-02-19 20:33:47 UTC
(In reply to comment #41)
> > Perhaps it's easier just to package the current master tree, the patches are
> > there.


> telepathy-sofiasip master tree needs an updated telepathy-glib >= 0.13.9 which
> depends on GLib=>2.25.16.

I meant the master tree of sofia-sip.
For telepathy-sofiasip, version 0.6.6 should fit Maemo 5 better.