Bug 3703 - Browserd Hangs at livejournal.com with 3 concurrent flash videos and Has to be Killed Manually
Status: RESOLVED INVALID
Product: Browser
Component: MicroB engine
Version: 4.1.1 (4.2008.30-2)
Platform: ARM Maemo
Importance: Medium normal
Target Milestone: ---
Assigned To: unassigned
QA Contact: microb-bugs
Keywords: moreinfo, performance
Reported: 2008-09-13 12:39 UTC by luarvique
Modified: 2009-03-24 16:22 UTC

Description luarvique (reporter) 2008-09-13 12:39:40 UTC
SOFTWARE VERSION:
Latest version of Diablo.

STEPS TO REPRODUCE THE PROBLEM:
Start up MicroB and go to any LiveJournal page, for example

http://www.livejournal.com/~idu_shagayu

EXPECTED OUTCOME:
The page should load well and reasonably fast.

ACTUAL OUTCOME:
The browser is stuck loading the page, CPU load panel shows 100% CPU being used
by browserd daemon. The only way to resolve the problem is to kill browserd
manually (using 3rd party tools not available on a stock tablet) and never
visit livejournal.com again.

REPRODUCIBILITY:
sometimes, depending on page content, but ~70% of all visits
Comment 1 luarvique (reporter) 2008-09-13 12:55:32 UTC
Software version:
4.2008.30-2
Comment 2 Eero Tamminen nokia 2008-09-15 15:17:48 UTC
If you disable Flashplayer, I think you'll find the site more usable.

With Flash enabled the site seems to be always running something,
with Flash disabled, it eventually stops using CPU.  Without Flash
you've got also more memory left for rest of the content.

I wasn't able to get the Browser to hang; the page was just slow, but
the reason could be that today the site didn't have:
- Bad JavaScript
- Bad Flash ActionScript
- So much data (images, Flash adverts etc) on the page that
  the browser/flashplayer runs out of memory


> EXPECTED OUTCOME:
> The page should load well and reasonably fast.

Hm. Does the site work significantly better on (slowish) Desktop/laptop
machine which doesn't have that much more RAM than the device?
I.e. can you reasonably expect this from a device with a 400MHz ARM CPU
with 128MB of RAM?


> The only way to resolve the problem is to kill browserd manually

If the browserd is stuck, and you try to close the UI, it seems to
get stuck too.  Framework will then ask whether the UI should be killed
and if that's OKed, browserd HTML area gets reparented as application
window (which doesn't show up in TN, this is TN issue).  That needs
indeed to be killed manually.

This is another bug though.


> (using 3rd party tools not available on a stock tablet)

"killall -9 browserd" in xterm should work just fine.
Comment 3 Andre Klapper maemo.org 2008-09-15 15:31:22 UTC
Same as Eero described here.

I wonder whether strace'ing it might help to find the reason, and getting
output from sp-memusage might also be helpful to track this down.

Please see http://maemo.org/development/tools/ ,
http://maemo.org/development/tools/doc/diablo/sp-memusage/ and
http://maemo.org/development/documentation/man_pages/strace.html .
Comment 4 luarvique (reporter) 2008-09-15 16:33:03 UTC
(In reply to comment #2)
> If you disable Flashplayer, I think you'll find the site more usable. 
> With Flash enabled the site seems to be always running something,
> with Flash disabled, it eventually stops using CPU.  Without Flash
> you've got also more memory left for rest of the content.
I do suspect there were flash movies inlined into the page. Will recheck. In
either case, inlined flash objects should not cause 100% CPU usage if they are
not doing anything.

> Hm. Does the site work significantly better on (slowish) Desktop/laptop
> machine which doesn't have that much more RAM than the device?
It works well on a PC, even a slowish PC.

> > The only way to resolve the problem is to kill browserd manually
> If the browserd is stuck, and you try to close the UI, it seems to
> get stuck too.  Framework will then ask whether the UI should be killed
> and if that's OKed, browserd HTML area gets reparented as application
> window (which doesn't show up in TN, this is TN issue).  That needs
> indeed to be killed manually.
In other words, "the only way to resolve the problem is to kill browserd
manually", i.e. my original statement.

> > (using 3rd party tools not available on a stock tablet)
> "killall -9 browserd" in xterm should work just fine.
It is unreasonable to expect an average tablet user to be able to do that. Even
I have some problems bringing up xterm and pecking at the screen with a stylus
trying to type in this command, especially while browserd is using 100% CPU.
Comment 5 Eero Tamminen nokia 2008-09-15 18:32:47 UTC
(In reply to comment #4)
> I do suspect there were flash movies inlined into the page. Will recheck.
> In either case, inlined flash objects should not cause 100% CPU usage if
> they are not doing anything.

Please install "strace" and "xresponse" to the device from the tools
repository:
  http://maemo.org/development/tools/

When you've reproduced the issue and browser is the top application,
log to the device over SSH and use "xresponse -i" to see whether
the page (Flash content or JS) is updating the screen and then
"strace -p <busy Browserd PID>" to see what Browser is doing.


> > Hm. Does the site work significantly better on (slowish) Desktop/laptop
> > machine which doesn't have that much more RAM than the device?
>
> It works well on a PC, even a slowish PC.

I guess the page just has some content that requires certain base
performance (gfx acceleration for layer compositing etc) after which
it starts to be usable and with lower HW perf it just gets strangled
(uses lots of CPU and at the same time tries to do screen updates at fast
FPS without frame limiting i.e. bad Flash code).
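
For illustration, "frame limiting" in the sense used above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not Flash player code: run_frames, render and the clock/sleep parameters are hypothetical names chosen for the example. The loop sleeps until the next frame deadline instead of redrawing as fast as the CPU allows.

```python
import time


def run_frames(render, fps, frame_count, clock=time.monotonic, sleep=time.sleep):
    """Call render() frame_count times, pacing the calls to roughly `fps`.

    `render`, `clock` and `sleep` are injectable so the pacing logic can be
    exercised without a real display (hypothetical interface for this sketch).
    """
    frame_period = 1.0 / fps
    next_deadline = clock()
    for _ in range(frame_count):
        render()
        next_deadline += frame_period
        remaining = next_deadline - clock()
        if remaining > 0:
            # Give the CPU back until the next frame is due.
            sleep(remaining)
        else:
            # Rendering took longer than one period: restart the schedule
            # from "now" instead of trying to catch up in a burst.
            next_deadline = clock()


if __name__ == "__main__":
    start = time.monotonic()
    run_frames(lambda: None, fps=20, frame_count=5)
    print("elapsed seconds:", round(time.monotonic() - start, 2))
```

Without the sleep, the same loop would call render() back to back and keep one CPU core fully busy, which matches the "strangled" behaviour described above on slow hardware.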


> > > The only way to resolve the problem is to kill browserd manually
> >
> > If the browserd is stuck,
> >
> > and you try to close the UI, it seems to
> > get stuck too.  Framework will then ask whether the UI should be killed
> > and if that's OKed, browserd HTML area gets reparented as application
> > window (which doesn't show up in TN, this is TN issue).  That needs
> > indeed to be killed manually.
>
> In other words, "the only way to resolve the problem is to kill browserd
> manually", i.e. my original statement.

I meant that the freeze itself is a separate issue from the issue
of getting rid of the frozen application.  Separate issues should
be filed separately as they might concern different components,
have different priority & reproducibility[1] and be fixed even in
separate releases (freeze might be flashplayer issue and freeze
handling is browser ui/daemon).  Could you report it separately?

[1] I cannot reproduce the freeze nor have any other test-case
    that could cause browserd freeze. So far I've just simulated
    it by using "SIGSTOP" on the browserd daemon.
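
A minimal sketch of that kind of simulation, in Python and against a throwaway child process rather than browserd. The helper names freeze, thaw and demo are hypothetical; only the SIGSTOP/SIGCONT signals themselves come from the comment above.

```python
import os
import signal
import subprocess
import sys


def freeze(pid):
    """Suspend a process with SIGSTOP, roughly simulating a hung daemon."""
    os.kill(pid, signal.SIGSTOP)


def thaw(pid):
    """Resume a process that was suspended with SIGSTOP."""
    os.kill(pid, signal.SIGCONT)


def demo():
    """Stop and resume a short-lived child; return True if it was seen stopped."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        freeze(child.pid)
        # WUNTRACED makes waitpid report a stopped (not only exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        thaw(child.pid)
    finally:
        child.terminate()
        child.wait()
    return stopped


if __name__ == "__main__":
    print("child was stopped:", demo())
```

A stopped process uses no CPU, so this only imitates the unresponsive UI, not the 100% CPU load the reporter describes.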


> > > (using 3rd party tools not available on a stock tablet)
> > "killall -9 browserd" in xterm should work just fine.
>
> It is unreasonable to expect an average tablet user to be able to do that. 

Oh, you meant "*non-power user* needs either to reboot the device or
use a 3rd party tool to get rid of browser".  Yes, that's an important
point: it's a use-time issue, and some way to regain use of the browser is needed.
Comment 6 luarvique (reporter) 2008-09-17 23:53:14 UTC
(In reply to comment #5)
> (In reply to comment #4)
> When you've reproduced the issue and browser is the top application,
> log to the device over SSH and use "xresponse -i" to see whether
> the page (Flash content or JS) is updating the screen
It is not.

> and then
> "strace -p <busy Browserd PID>" to see what Browser is doing.
I have got it into a situation where CPU Load applet shows ~75% overall CPU
usage. The page does contain 3 embedded Flash video players (YouTube).

htop shows:
5918 user      15   0  135M 41524 15472 S 57.7 10.5  3:32.77 /usr/sbin/browserd -s 5918 -i microb

strace -p 5918 shows:
gettimeofday({1221684708, 748413}, NULL) = 0
gettimeofday({1221684708, 752899}, NULL) = 0
gettimeofday({1221684708, 757507}, NULL) = 0
gettimeofday({1221684708, 760314}, NULL) = 0
gettimeofday({1221684708, 764587}, NULL) = 0
gettimeofday({1221684708, 767395}, NULL) = 0
gettimeofday({1221684708, 772735}, NULL) = 0
gettimeofday({1221684708, 776916}, NULL) = 0
gettimeofday({1221684708, 781951}, NULL) = 0
gettimeofday({1221684708, 786437}, NULL) = 0
gettimeofday({1221684708, 791320}, NULL) = 0
gettimeofday({1221684708, 795776}, NULL) = 0
gettimeofday({1221684708, 800170}, NULL) = 0
gettimeofday({1221684708, 804443}, NULL) = 0
gettimeofday({1221684708, 809814}, NULL) = 0
gettimeofday({1221684708, 813781}, NULL) = 0
gettimeofday({1221684708, 820281}, NULL) = 0
ioctl(5, FIONREAD, [0])                 = 0
poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=21, events=POLLIN},
{fd=11, events=POLLIN}, {fd=24, events=POLLIN}, {fd=28, events=POLLIN}, {fd=10,
events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}, {fd=18,
events=POLLIN}, {fd=25, events=POLLIN}, {fd=27, events=POLLIN}, {fd=32,
events=POLLIN}, {fd=19, events=POLLIN}, {fd=22, events=POLLIN}], 15, 0) = 0
(Timeout)
gettimeofday({1221684708, 835754}, NULL) = 0
ioctl(5, FIONREAD, [0])                 = 0
gettimeofday({1221684708, 844848}, NULL) = 0
poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=21, events=POLLIN},
{fd=11, events=POLLIN}, {fd=24, events=POLLIN}, {fd=28, events=POLLIN}, {fd=10,
events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}, {fd=18,
events=POLLIN}, {fd=25, events=POLLIN}, {fd=27, events=POLLIN}, {fd=32,
events=POLLIN}, {fd=19, events=POLLIN}, {fd=22, events=POLLIN}], 15, 0}, NULL)
= 0
[gettimeofday() repeats here]

The gettimeofday() call repeats many, many times. In fact, it gets printed so
fast that it takes a while to get ssh to accept the Ctrl-C and stop printing
it. Looks like a busy loop waiting for some value of gettimeofday() to me.
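
To put numbers on an observation like this, saved strace output can be summarized with a short script. The following is a Python sketch, not part of the original report: summarize and SAMPLE are hypothetical names, and SAMPLE holds only four lines copied from the trace above. It counts calls per syscall and derives the gaps between consecutive gettimeofday() results in microseconds.

```python
import re
from collections import Counter

# A syscall line starts with "name(", continuation lines of a wrapped
# poll() call do not, so they are skipped.
SYSCALL_RE = re.compile(r"^(\w+)\(")
GETTIMEOFDAY_RE = re.compile(r"^gettimeofday\(\{(\d+), (\d+)\}")


def summarize(lines):
    """Return (calls per syscall, gaps between gettimeofday results in us)."""
    counts = Counter()
    stamps = []
    for line in lines:
        match = SYSCALL_RE.match(line)
        if not match:
            continue
        counts[match.group(1)] += 1
        tod = GETTIMEOFDAY_RE.match(line)
        if tod:
            seconds, micros = int(tod.group(1)), int(tod.group(2))
            stamps.append(seconds * 1_000_000 + micros)
    intervals = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return counts, intervals


SAMPLE = """\
gettimeofday({1221684708, 748413}, NULL) = 0
gettimeofday({1221684708, 752899}, NULL) = 0
gettimeofday({1221684708, 757507}, NULL) = 0
ioctl(5, FIONREAD, [0])                 = 0
""".splitlines()

if __name__ == "__main__":
    counts, intervals = summarize(SAMPLE)
    print(counts.most_common())
    print(intervals)
```

Note that strace slows the traced process considerably, so the gaps measured this way overstate the time between calls in an untraced browserd.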

> I guess the page just has some content that requires certain base
> performance (gfx acceleration for layer compositing etc) after which
> it starts to be usable and with lower HW perf it just gets strangled
> (uses lots of CPU and at the same time tries to do screen updates at fast
> FPS without frame limiting i.e. bad Flash code).
Not really.

> I meant that the freeze itself is a separate issue from the issue
> of getting rid of the frozen application.  Separate issues should
> be filed separately as they might concern different components,
The issue here is the freeze. Inability to kill a background process without
resorting to a command line tool (kill) or some third-party applet is a
complication, not an issue in this case.
Comment 7 Eero Tamminen nokia 2008-09-25 19:11:19 UTC
> Looks like a busy loop waiting for some value of gettimeofday() to me.

Gettimeofday is used for timing of video display/decoding/audio sync etc.
There's nothing inherently wrong with it, and it's expected if you're
playing/decoding video.
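
For comparison, the trace in comment 6 shows poll() being called with a timeout of 0, which returns immediately instead of sleeping. The Python sketch below contrasts that pattern with a loop that lets poll() block until the next timer deadline. It is an illustration under stated assumptions, not the Flash player or browserd event loop; wait_spinning, wait_blocking and the 50 ms deadline are hypothetical.

```python
import math
import os
import select
import time


def wait_spinning(poller, deadline):
    """Poll with a zero timeout and re-read the clock until the deadline."""
    iterations = 0
    while time.monotonic() < deadline:
        poller.poll(0)  # returns at once when no fd is ready
        iterations += 1
    return iterations


def wait_blocking(poller, deadline):
    """Let poll() sleep until the deadline (or until an fd becomes ready)."""
    iterations = 0
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # poll() takes milliseconds; round up so the call never spins at 0.
        poller.poll(int(math.ceil(remaining * 1000)))
        iterations += 1
    return iterations


def compare(wait_seconds=0.05):
    """Return loop iteration counts (spinning, blocking) for the same wait."""
    read_fd, write_fd = os.pipe()  # an fd that never becomes readable here
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    try:
        spinning = wait_spinning(poller, time.monotonic() + wait_seconds)
        blocking = wait_blocking(poller, time.monotonic() + wait_seconds)
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return spinning, blocking


if __name__ == "__main__":
    spinning, blocking = compare()
    print("zero-timeout iterations:", spinning)
    print("blocking iterations:", blocking)
```

Both loops wait the same amount of wall-clock time, but the first one consumes CPU for the whole wait. A real event loop would also dispatch any events that poll() returns; that part is left out of the sketch.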


> The page does contain 3 embedded Flash video players (YouTube).

Expecting more than 2 flash videos to work concurrently isn't really reasonable
for our device...   Please complain to Adobe. :-)

Suppose that you accept that this device will never be able to play 3 flash
videos concurrently and you visit a page with 3 or more flash videos, what
would you want the browser to do? (yes, obviously not hanging would be a good
start, but should it play 0 videos? should it randomly pick 1 video? ... be
reasonable and explain your logic)


It's possible that there's also an issue with the connections as browserd/flash
seems to be polling 15 different file descriptors.  When this happens to you
again, what does this give:
  pidof browserd|xargs -n1 lsof -p
?
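
If lsof is not installed, roughly the same information can be read from /proc/<pid>/fd on Linux. The sketch below assumes a Python interpreter is available on the system where it is run (not the case on a stock tablet), and list_open_fds is a hypothetical helper name. Reading another user's process usually requires root.

```python
import os


def list_open_fds(pid):
    """Map each open file descriptor of `pid` to the target it refers to."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(list_open_fds(os.getpid()).items()):
        print(fd, target)
```

Sockets and pipes show up as targets such as "socket:[...]" or "pipe:[...]", which helps to tell network connections apart from regular files among the polled descriptors.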
Comment 8 Eero Tamminen nokia 2008-09-25 19:12:32 UTC
Btw. Just to verify that it's really Flashplayer issue, can you trigger this
with Flash disabled?
Comment 9 Andre Klapper maemo.org 2008-10-06 11:05:39 UTC
luarvique, can you answer Eero's last question?
Comment 10 luarvique (reporter) 2008-10-06 11:36:28 UTC
(In reply to comment #8)
> Btw. Just to verify that it's really Flashplayer issue, can you trigger this
> with Flash disabled?
After installing the flash blocker plugin, the issue seems to have subsided.
Furthermore, it appears that when I manually enable three YouTube players on a
page using the flash blocker (but not starting playback on any of them), the
issue DOES NOT reappear.

NOTE: Since filing this bug, I have upgraded the system with the second Diablo
SSU.
Comment 11 Andre Klapper maemo.org 2008-10-06 13:36:03 UTC
...so it seems to be a Flashplayer issue.

> I do suspect there were flash movies inlined into the page. Will recheck. 
> In either case, inlined flash objects should not cause 100% CPU usage if 
> they are not doing anything.

So this is not the case anymore?

> Suppose that you accept that this device will never be able to play 3 flash
> videos concurrently and you visit a page with 3 or more flash videos, what
> would you want the browser to do?

Still unanswered...
Currently looks like WONTFIX to me. Even my old 1700MHz laptop with Firefox 2
had enough problems on, say, some Myspace pages with lots of multimedia
content.
Comment 12 luarvique (reporter) 2008-10-06 13:53:51 UTC
(In reply to comment #11)
> ...so it seems to be a Flashplayer issue.
I guess so.

> > I do suspect there were flash movies inlined into the page. Will recheck. 
> > In either case, inlined flash objects should not cause 100% CPU usage if 
> > they are not doing anything.
> So this is not the case anymore?
With the flash blocker plugin installed, this does not seem to be the case. In
order to check whether it is still the case without the flash blocker, I will
need to uninstall it.

> > Suppose that you accept that this device will never be able to play 3 flash
> > videos concurrently and you visit a page with 3 or more flash videos, what
> > would you want the browser to do?
> Still unanswered...
Please note that nowhere did I mention PLAYING 3 flash videos at once. The
problem occurs when videos are NOT playing, i.e. those embedded flash objects
are not supposed to be doing anything at all.

> Currently looks like WONTFIX to me. Even my old 1700MHz laptop with Firefox 2
> had enough problems on, say, some Myspace pages with lots of multimedia
> content.
As I said before, this bug is not about running three flash applets at once. It
is about a situation where seemingly dormant flash applets effectively bring
down the system.
Comment 13 Eero Tamminen nokia 2008-10-13 15:07:37 UTC
(In reply to comment #12)
>>> Suppose that you accept that this device will never be able to play
>>> 3 flash videos concurrently and you visit a page with 3 or more flash
>>> videos, what would you want the browser to do?
>
> Please note that nowhere did I mention PLAYING 3 flash videos at once. The
> problem occurs when videos are NOT playing, i.e. those embedded flash objects
> are not supposed to be doing anything at all.
>
>> After installing the flash blocker plugin, the issue seems to have subsided.
>> Furthermore, it appears that when I manually enable three YouTube players on
>> a page using the flash blocker (but not starting playback on any of them),
>> the issue DOES NOT reappear.

Hm.  So maybe it has by default even more flash plugins on the page than the
three youtube ones?

Btw. I have another potential cause for the freeze.  If you can trigger this
issue again[1], could you do: "/etc/init.d/esd restart" as root and see whether
this helps anything?

[1] I've never been able to reproduce this when I've gotten Flash enabled (when
using the UI in English or Finnish which I understand, maybe the issue is
language specific?).


> > Currently looks like WONTFIX to me. Even my old 1700MHz laptop with Firefox2
> > had enough problems on, say, some Myspace pages with lots of multimedia
> > content.
>
> As I said before, this bug is not about running three flash applets at once.
> It is about a situation where seemingly dormant flash applets effectively
> bring down the system.

Flashplayer is still running the ActionScript script etc in them even when the
video is not being played.
Comment 14 luarvique (reporter) 2008-10-13 15:20:23 UTC
(In reply to comment #13)
> (In reply to comment #12)
> Btw. I have another potential cause for the freeze.  If you can trigger this
> issue again[1], could you do: "/etc/init.d/esd restart" as root and see whether
> this helps anything?
I will try to trigger the problem again and restart esd. Can't promise I'll
succeed, though.

> [1] I've never been able to reproduce this when I've gotten Flash enabled (when
> using the UI in English or Finnish which I understand, maybe the issue is
> language specific?).
Most likely not. There is a slight possibility that it is caused by additional
[mis]features enabled for "cyrillic" users by SUP. I explicitly canceled all
their extra "services" though, so the possibility is slim.

> > As I said before, this bug is not about running three flash applets at once.
> > It is about a situation where seemingly dormant flash applets effectively
> > bring down the system.
> Flashplayer is still running the ActionScript script etc in them even when the
> video is not being played.
Yes, the traces I have posted above appear to confirm it. But I do hope that
the flood of gettimeofday() calls can somehow be avoided.
Comment 15 Andre Klapper maemo.org 2009-03-24 16:22:04 UTC
Closing this bug report as no further information has been provided. Please
feel free to reopen this bug if you can provide the information asked for/if
you can still reproduce this. Thanks!