Bug 10388 (int-171987)

Summary: Choppy audio in SIP conversation, WLAN power save problem?
Product: [Maemo Official Platform] Multimedia Reporter: Jehan <jens-e.hansen>
Component: gstreamerAssignee: unassigned <nobody>
Status: NEW QA Contact: gstreamer-bugs
Severity: major    
Priority: Unspecified CC: andre_klapper, bojo42, colchaodemola, eero.tamminen, info, kimch, liquid, maemo, maemo, maemo_bugzilla, mikhail.zabaluev, olivier.crete, rajil.s, Tomas.Klimek
Version: 5.0:(10.2010.19-1)Keywords: performance
Target Milestone: ---   
Hardware: All   
OS: Maemo   
Attachments: syslog from iLBC session
syslog from PCMU session
chrt binary for comment#20
tcpdump and ping output

Description Jehan (reporter) 2010-05-28 16:33:06 UTC
SOFTWARE VERSION:
10.2010.19-1 / n900

EXACT STEPS LEADING TO PROBLEM: 
1. Create a SIP account on the phone
2. Connect to the SIP account via WLAN
3. Initiate a call (preferably an echo test '*43' on asterisk pbx)

EXPECTED OUTCOME:
A smooth conversation.

ACTUAL OUTCOME:
The opponents voice will be chopped and it will be impossible to understand a
word. BUT the called party can clearly understand you and you can
tell him to call again.

REPRODUCIBILITY:
always

EXTRA SOFTWARE INSTALLED:
none

OTHER COMMENTS:
This mostly happens if you are using iLBC codec but is also reproduceable with
PCMU.

Please note that putting the call on hold fixes the choppy audio temporarily.
Also I noticed a "Internal GStreamer error: clock problem." message in syslog
output but don't know if this could cause such symptoms. I'll attach two syslog
outputs - one for iLBC (disabled PCMA / PCMU via gstcodecs.conf) and one for
PCMU.

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9)
Gecko/20100501 Firefox/3.5.8 (like Firefox/3.5.9)
Comment 1 Jehan (reporter) 2010-05-28 16:34:29 UTC
Created an attachment (id=2768) [details]
syslog from iLBC session
Comment 2 Jehan (reporter) 2010-05-28 16:35:02 UTC
Created an attachment (id=2769) [details]
syslog from PCMU session
Comment 3 Rajil Saraswat 2010-05-29 00:45:53 UTC
i faced the same issue with g729 codec. remote voice became choppy after
sometime
Comment 4 Mikhail Zabaluev nokia 2010-06-02 11:54:31 UTC
Thanks for moving this out into a separate bug.
Comment 5 r.l.p.verschaeren 2010-06-07 15:39:29 UTC
*** This bug has been confirmed by popular vote. ***
Comment 6 Mikhail Zabaluev nokia 2010-06-11 16:39:43 UTC
Can anybody attach a packet dump with RTP packets of a test conversation where
the problem is observed? You can filter out other traffic from the dump file to
avoid exposing sensitive information (we'll lose the picture of intervening
traffic, but it's unlikely to be the cause of the problems observed).
Comment 7 Mikhail Zabaluev nokia 2010-06-11 16:40:50 UTC
(In reply to comment #6)
> Can anybody attach a packet dump with RTP packets of a test conversation where
> the problem is observed?

And yeah, we need it taken again for PR1.2 (10.2010.19-2).
Comment 8 Mikhail Zabaluev nokia 2010-06-14 21:58:00 UTC
Does the problem persist if power saving is disabled for the WLAN connection on
the N900?
Comment 9 Jehan (reporter) 2010-06-15 18:31:12 UTC
For me the problem does not seem to persist if power saving is disabled for the
WLAN connection on the N900.
Comment 10 Mikhail Zabaluev nokia 2010-06-15 20:06:00 UTC
Changing the summary to reflect new knowledge.
Comment 11 Olivier Crête 2010-06-16 00:01:22 UTC
What happens is that there are some packets that come in with a big timestamp
delta (like 400ms).. but not big enough to trigger the 0.5s threshold over
which the process is reset. So they are dropped. That delay might be introduced
by power saving.. That said, I have no idea how it works. I though by sending
out packets every 40ms (at worst) we were effectively disabling it.
Comment 12 Mikhail Zabaluev nokia 2010-06-16 16:07:58 UTC
(In reply to comment #11)
> What happens is that there are some packets that come in with a big timestamp
> delta (like 400ms).. but not big enough to trigger the 0.5s threshold over
> which the process is reset. So they are dropped.

Is it jitter, i.e. few packets are lost, received time wanders up to 400 ms,
but the rate does not skew?
Comment 13 Rajil Saraswat 2010-07-04 13:54:56 UTC
@Mikhail
I have done quite a bit of testing and can confirm that the problem
persists even with Power Saving Off. I have tried this with N900
talking to a Linksys WRT54GL running DD-WRT firmware : DD-WRT v24-sp2
(10/10/09) voip, (v24 preSP2 Build 13064). I have done a tcpdump trace
and syslog trace which I have sent your through email. This call was made with
Wifi Power Saving Off. The voice was lost after a few seconds into the
call.

I had a similar response using Tomato firmware 1.27 on the same router.
Comment 14 Rajil Saraswat 2010-07-25 04:54:06 UTC
@Mikhail

I have done some testing over usbnet just to take "wifi" out of the picture.
The tests were done on a virgin centos 5.5 box with freeswitch. I made a call
to the voice mail number on the PBX. The call had a lot of pauses and voice
drops. 

Unfortunately, tcpdump doesnt work for me with usbnet so i wasnt able to trace
any calls. I had other people calling the voicemail box as well (luke-jr,
timeless/Josh Soref, thanks to them) and they reported audio drops as well.
Comment 15 colchaodemola 2010-08-02 04:43:32 UTC
I have this same problem but for skype!
The other side can hear me just fine , but i  can not ...
Even with power management off.
Comment 16 Eric Wheeler 2010-08-05 09:39:56 UTC
Hello all, I might have found a fix for this---please test and share your
results.  

The problem seems to be a pulseaudio<==>alsa buffering issue.  This
configuration is working very well (ie, no jitter, no drop, no echo, and < 3ms
pings over wifi) on my N900:

First update /etc/asound.conf:
==== /etc/asound.conf ====
pcm.voice {
type dmix
ipc_key 1024
slave {
pcm "hw:0,0"

period_time 0
period_size 1024
buffer_size 8192
#format “S32_LE”
#periods 128
rate 44100
}
}

pcm.dsp0 {
type plug
slave.pcm "voice"
}

# This following device can fool some applications into using pulseaudio
pcm.dsp1 {
type plug
slave.pcm "pulse"
}

ctl.mixer0 {
type hw
card 0
}

pcm.pulse { type pulse }
ctl.pulse { type pulse }

pcm.!default {
type pulse
}

ctl.!default {
type pulse
}
==========================

Then change the "exec" line in /etc/event.d/pulseaudio:


==== /etc/event.d/pulseaudio ====
[ ... ]
exec nice --20 pulseaudio  --system --no-cpu-limit=1
--resample-method=speex-fixed-0 --disable-shm=0
[ ... ]
=================================



Incidentally, I am using a slightly different /etc/pulse/daemon.conf, though I
do not think it has an effect:

=== /etc/pulse/daemon.conf ===
disable-shm = yes
shm-size-bytes = 16777216
exit-idle-time = -1

; This is an ugly combination and can lead into trouble, but needed for DVFS
no-cpu-limit = 1
; Limit max RT schedule slice to 10ms (= 10000000ns)
rlimit-rttime = 20000000 

high-priority = no
nice-level = -20
realtime-scheduling = yes
realtime-priority = 8
resample-method = speex-fixed-0

default-sample-format = s16le
default-sample-rate = 48000
default-sample-channels = 2
default-fragments = 2
default-fragment-size-msec = 20
==============================


Interestingly, the pings change drastically when the call starts.  Here's the
ping to my phone over wifi as it transitions to the start of a call.  I find
this strange, as the prior config did not act this way.  BTW, I'm using
WPA-AES.

=============================
64 bytes from 192.168.103.111: icmp_seq=1255 ttl=64 time=119 ms
64 bytes from 192.168.103.111: icmp_seq=1256 ttl=64 time=143 ms
64 bytes from 192.168.103.111: icmp_seq=1257 ttl=64 time=167 ms
64 bytes from 192.168.103.111: icmp_seq=1258 ttl=64 time=125 ms
64 bytes from 192.168.103.111: icmp_seq=1259 ttl=64 time=47.0 ms
64 bytes from 192.168.103.111: icmp_seq=1260 ttl=64 time=69.4 ms
64 bytes from 192.168.103.111: icmp_seq=1261 ttl=64 time=36.3 ms
64 bytes from 192.168.103.111: icmp_seq=1262 ttl=64 time=114 ms
64 bytes from 192.168.103.111: icmp_seq=1263 ttl=64 time=138 ms
64 bytes from 192.168.103.111: icmp_seq=1264 ttl=64 time=3.64 ms    <=== call
starts, could be power-saving transition.
64 bytes from 192.168.103.111: icmp_seq=1265 ttl=64 time=16.6 ms
64 bytes from 192.168.103.111: icmp_seq=1266 ttl=64 time=5.44 ms
64 bytes from 192.168.103.111: icmp_seq=1267 ttl=64 time=3.72 ms
64 bytes from 192.168.103.111: icmp_seq=1268 ttl=64 time=5.24 ms
64 bytes from 192.168.103.111: icmp_seq=1269 ttl=64 time=3.31 ms
64 bytes from 192.168.103.111: icmp_seq=1270 ttl=64 time=6.79 ms
64 bytes from 192.168.103.111: icmp_seq=1271 ttl=64 time=5.01 ms
64 bytes from 192.168.103.111: icmp_seq=1272 ttl=64 time=4.75 ms
64 bytes from 192.168.103.111: icmp_seq=1273 ttl=64 time=3.38 ms
=============================
Comment 17 Eric Wheeler 2010-08-05 09:42:11 UTC
Note that this line does NOT wrap, but it did in my original paste.

================================= /etc/event.d/pulseaudio ====
[ ... ]
exec nice --20 pulseaudio  --system --no-cpu-limit=1
--resample-method=speex-fixed-0 --disable-shm=0
[ ... ]
=================================

-Eric
Comment 18 Rajil Saraswat 2010-08-05 09:59:04 UTC
(In reply to comment #17)
> Note that this line does NOT wrap, but it did in my original paste.
> 
> ================================= /etc/event.d/pulseaudio ====
> [ ... ]
> exec nice --20 pulseaudio  --system --no-cpu-limit=1
> --resample-method=speex-fixed-0 --disable-shm=0
> [ ... ]
> =================================
> 
> -Eric
> 

Thanks for this post. Should the codec be set to speex before trying this out?
I mainly use g729 on N900.
Comment 19 Eric Wheeler 2010-08-05 12:13:10 UTC
(In reply to comment #18)

> Thanks for this post. Should the codec be set to speex before trying this out?
> I mainly use g729 on N900.

I'm using G711-ulaw into an asterisk system with default settings, as my system
does not (yet) have a 729 license.  Also, 729 is known to have higher CPU
requirements so your performance may vary.
Comment 20 Eric Wheeler 2010-08-05 12:25:02 UTC
This script sets real-time priority on the irq processes serving the N900 audio
and wifi hardware.  When using SIP, you want the audio and network devices to
get as much scheduling priority as possible---whereas, you want disk IO to be
comparatively low.  The script below orders processes on the machine by
realtime priority.  If you tune this script for better results or understand
the hardware better than I, please post updates.

** Note: you will need the 'chrt' binary which is available in the "Easy
Debian" install.  Qualitatively, the N900 "feels" more responsive since the
touch-screen input device gets a bit more priority than normal (I believe it is
serviced by the 4-wire omap2_mcspi device; correct me if I am wrong).


===== rtprio ====
#!/bin/sh    

# High-prio procs in order of DESCENDING priority
HIGHPRIO="twl4030-irqchip twl4030-irq omap2_mcspi wl12xx"

# Low-prio procs in order of ASCENDING priority
LOWPRIO="kcryptd kmmcd mmcqd"

prio=80; 
for proc in $HIGHPRIO; do 
    pids=`pgrep "^\\[*$proc\\]*$"`
    chrt -r -p $prio $pids
    echo $proc prio=$prio pids:$pids; 1>&2
    prio=$(($prio-5)); 
done

prio=1; 
for proc in $LOWPRIO; do 
    pids=`pgrep "^\\[*$proc\\]*$"`
    chrt -r -p $prio $pids
    echo $proc prio=$prio pids:$pids; 1>&2
    prio=$(($prio+1)); 
done

================
Comment 21 Jehan (reporter) 2010-08-05 15:17:07 UTC
(In reply to comment #20) 
> ** Note: you will need the 'chrt' binary which is available in the "Easy
> Debian" install.

Could you provide me with a chrt binary (maybe via PM), even statically linked
if it's needed? I would like to obviate installing easy debian if it's
possible.

thx
Comment 22 Eric Wheeler 2010-08-05 22:44:28 UTC
Created an attachment (id=3017) [details]
chrt binary for comment#20

This binary enables changing realtime scheduling attributes of processes.  See 
http://www.cyberciti.biz/faq/howto-set-real-time-scheduling-priority-process/
for details.  chrt has minimal library dependencies and is not statically
linked:
    libc.so.6 => /lib/libc.so.6 (0x4002f000)
    /lib/ld-linux.so.3 (0x40000000)
Comment 23 Izbus 2010-08-06 01:23:48 UTC
After updating asound.conf I am unable to make or receive calls. I get "Call
disconnected" message every time
Comment 24 Eric Wheeler 2010-08-06 01:58:29 UTC
(In reply to comment #23)
> After updating asound.conf I am unable to make or receive calls. I get "Call
> disconnected" message every time

Be sure that the quotes in asound.conf are actually quotes... sometimes they
copy-paste wrong out of web browsers.

"Call Disconnected" happens when pulseaudio is not running or accepting
connections.  As root, run 
  nice --20 pulseaudio -vv --system --no-cpu-limit=1
--resample-method=speex-fixed-0 --disable-shm=0

in a console and see if it gives any failure indication.  I have a feeling
pulseaudio died for some reason...

Also, I am running the "Power Kernel" on my phone... I'm not sure if that has
different ALSA support compiled in (though, that doesn't really make sense...)
Comment 25 Izbus 2010-08-06 03:16:39 UTC
After restart I have no sound at all. When i try to play music i get invalid
codec error. When I start pulseaudio in cli, I do get sound in mediaplayer, but
not in sip. Sip connects but there is silence. 

I believe am running PR 1.2 with "enhanced" kernel. I have also double checked
quotes. 

Here is a partial log of what i got: http://pastebin.com/v06NiBnz

===== is where stuff is skipped.
I hope it helps.
Comment 26 Eric Wheeler 2010-08-06 04:22:27 UTC
(In reply to comment #25)
> After restart I have no sound at all. When i try to play music i get invalid
> codec error. When I start pulseaudio in cli, I do get sound in mediaplayer, 
> but not in sip. Sip connects but there is silence. 

What version of pulseaudio? I'm running:
  Nokia-N900-51-1:~# pulseaudio --version
  pulseaudio 0.9.15

Before you launch pulseaudio as above, be sure that there are no other
pulseaudio instances running (remember upstart will respawn it if you kill it,
so you might need to shoot it down several times to stop the respawn, or "stop
pulseaudio" or twiddle with /etc/event.d/pulseaudio).


If your /etc/pulse/daemon.conf and /etc/asound.conf are exactly the same as
mine (above), and you are invoking pulseaudio (without linebreaks if this post
wraps) as :

  nice --20 pulseaudio  --system --no-cpu-limit=1
--resample-method=speex-fixed-0 --disable-shm=0

... then I am out of ideas.
Comment 27 Izbus 2010-08-06 04:47:11 UTC
My pulseaudio version is the same. 
When i use your deamon.conf i get

E: conf-parser.c: [/etc/pulse/daemon.conf:2] Failed to parse numeric value:
16777216

otherwise, with default deamon.conf i get errors mentioned in the post above..

Maybe someone else can replicate your solution?
Comment 28 Eric Wheeler 2010-08-06 05:03:09 UTC
(In reply to comment #27)
> My pulseaudio version is the same. 
> E: conf-parser.c: [/etc/pulse/daemon.conf:2] Failed to parse numeric value:
> 16777216

Do you have the libasound2 and libasound2-plugins packages installed?

Nokia-N900-51-1:~# dpkg -l | grep asound
ii  libasound2           1.0.20-0maemo1+0m5               shared library for
ALSA applications
ii  libasound2-plugins   1.0.20-0maemo1+0m5               ALSA library
additional plugins
Comment 29 Izbus 2010-08-06 05:07:34 UTC
I guess so:

Nokia-N900-51-1:~# dpkg -l | grep asound
ii  libasound2                                          1.0.20-0maemo1+0m5
         shared library for ALSA applications
ii  libasound2-plugins                                  1.0.20-0maemo1+0m5
         ALSA library additional plugins
Comment 30 Andre Klapper maemo.org 2010-08-06 14:43:33 UTC
(I understand and appreciate attempts to work around the issue, but please
avoid using Bugzilla as a forum - there is http://talk.maemo.org for such kind
of conversation. Thanks!)
Comment 31 Olivier Crête 2010-08-06 15:16:17 UTC
@Eric:

The ping changes because the wireless card is only awaken every ~100ms unless
you try to send a packet. And VoIP sends packets more often, so the incoming
ping come back faster. So its entirely unrelated.

You changes to the alsa configuration do nothing, pulseaudio does not use alsa
like that. And stream-engine uses pulseaudio directly.

In the pulse conf:
- You change the RT prio.. don't do that, it will make things less stable. The
priorities of the various elements have been carefully calibrated, changing it
will probably lead to lower call quality
- You disabled high-prio.. this may break cellular calls
- You increased the max slice to 20ms from 10ms.. this is useless, it should
never take that long. If it does, there is something seriously wrong.

The speex-fixed-2 is a resample from libspeexdsp, it has nothing to do with
Speex the codec (apart from the fact that they were written by the same
person). You changed it to speex-fixed-0.. which will give you worse quality..

Stream-engine and pulseaudio already have their audio threads running at
real-time priority, changing the nice value does nothing, it is ignored by the
RT scheduler.
Comment 32 Jehan (reporter) 2010-08-06 17:23:41 UTC
Modifications described in #16, #17 and #20 doesn't solve the problem for me.
Maybe it's of topical interest that pings to the device change from < 100ms to
> 9000 ms as soon as the call ist established with or without the suggested
modifications.
Comment 33 Mikhail Zabaluev nokia 2010-08-06 17:34:20 UTC
(In reply to comment #32)
> Maybe it's of topical interest that pings to the device change from < 100ms to
> > 9000 ms as soon as the call ist established with or without the suggested
> modifications.

I wonder if it differs depending on which WLAN access point model you use.

May it be that unusual WLAN packet flags or IP packet flags trigger wrong (or
intended?) behavior in the AP?
Comment 34 Eric Wheeler 2010-08-06 21:45:55 UTC
I'm linking to Jehan's original thread for discussion on the topic.  Be sure to
subscribe the thread so you get updates on the topic; we can post back to the
bug report if we get a solution that works for everyone:

   http://talk.maemo.org/showthread.php?t=54039
Comment 35 Jehan (reporter) 2010-08-12 14:46:43 UTC
(In reply to comment #33)
> I wonder if it differs depending on which WLAN access point model you use.

Ok I have to say that I only have this latency problems if power saving is
enabled. Maybe my AP does not handle PS correcty although I thought that the
tablets try to optimize performance by remaining in Constant Active Mode if
there is constant traffic. I'll try another AP if I have the chance to do so.

> May it be that unusual WLAN packet flags or IP packet flags trigger wrong (or
> intended?) behavior in the AP?

I don't think so because I have no latency problems with this AP if I use
twinkle an my notebook or an old nokia E60 (both with powersaving enabled) to
place a SIP call.

To be sure I'll attach a tar archive which includes captured tcpdump- and
ping-output during a SIP session initiated from the n900. tcpdump and ping both
where started on the n900.

If I disable power saving I can hear my opponent for about 10 - 15 min. After
that the situation is as described under "ACTUAL OUTCOME".

Do you think I should open another bug regarding the CAM issue or is this
normal behavior if the AP does not handle power saving correctly?
Comment 36 Jehan (reporter) 2010-08-12 14:49:29 UTC
Created an attachment (id=3028) [details]
tcpdump and ping output

tcpdump and ping were both started on the n900 before a sip call was initiated.
power saving enabled
Comment 37 Rajil Saraswat 2010-09-09 19:48:51 UTC
Has there been any progress on resolving this bug?
Comment 38 Eric Wheeler 2010-09-09 21:10:57 UTC
(In reply to comment #37)
> Has there been any progress on resolving this bug?

See this thread; try to keep this bug report to bug-related comments.
   http://talk.maemo.org/showthread.php?t=54039
Comment 39 Rajil Saraswat 2010-12-21 23:28:46 UTC
I have recorded a call to the Voicemail application of FreeSWITCH. N900 skips
all the digit prompts of the voicemail IVR. The wave file was big to attach
here, so i have uploaded it to http://174.132.148.146/ivr.wav
Comment 40 Gary Birkett 2011-01-24 20:30:20 UTC
(In reply to comment #39)
> I have recorded a call to the Voicemail application of FreeSWITCH. N900 skips
> all the digit prompts of the voicemail IVR. The wave file was big to attach
> here, so i have uploaded it to http://174.132.148.146/ivr.wav

is this recording indicative of every call you make using your server?

ie related to the codec eating specific frequencies or is that just a side
effect of something else?
it seems very specifically missing the tones rather than the bug reports
indication of general choppyness.

----------------------------

There was more discussion about this bug raised in #maemo today where user
(crashanddie) indicated similar issues with his SIP server installation too.

http://mg.pov.lt/maemo-irclog/%23maemo.2011-01-24.log.html#t2011-01-24T19:08:43