maemo.org Bugzilla – Bug 3939
sapwood loops infinitely at malformed request
Last modified: 2010-08-13 15:00:34 UTC
You need to
before you can comment on or make changes to this bug.
from SVN source
STEPS TO REPRODUCE THE PROBLEM:
send sapwood a pixbufrequest where length of data written to socket <
base->length but > sizeof(PixbufBaseRequest)
To deal with this gracefully.
It pegs CPU to 100%
ofs = 0;
while (ofs < buflen)
n = process_buffer (fd, buf + ofs, buflen - ofs, user_data);
if (n < 0)
ofs += n;
and from process_buffer:
if (buflen < sizeof (PixbufBaseRequest) || buflen < base->length)
if base->length is larger than recieved data, and buflen larger than sizeof
PixbufBaseRequest, it will loop infinitely.
Before closing this bug because "we trust our clients", the reason why this bug
has been posted is because we are looking at some debug output at a user (.. on
a 770) that suggests a corrupted base->length, and this causes sapwood to peg
EXTRA SOFTWARE INSTALLED:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; da; rv:220.127.116.11)
To have this remain critical severity, you should provide a repeatable end-user
> Before closing this bug because "we trust our clients", the reason why this
> bug has been posted is because we are looking at some debug output at a user
> (.. on a 770) that suggests a corrupted base->length, and this causes sapwood
> to peg the cpu.
As if you're not trusting the clients, why you would trust the client to write
data to the socket on which Sapwood is reading?
Fair enough - it was in terms of that we cannot safely assume clients send
properly structured data - which is what the bug is about, a case where a
corrupted stack, compiler bug, or whatever cause sapwood server to peg CPU
completely and lock up.
We're seeing this on armv5t archs (a zaurus and a 770) with hildon-desktop &
demos/gradient in sapwood, a suspicion of a stack corruption/compiler
bug/whatever where the client sends badly formatted data (length moved into
padding, wrong values in borders , etc). Through chasing down this bug we
noticed this bug as well, so that's why this bug was reported as fixing it may
increase stability with unstable applications.
I've moved severity to normal. Fix -looks- like it would be a matter of
changing 'if (n < 0)' to 'if (n <= 0)' without looking much closer to it.
Sapwood trunk has a proper testing framework. So please provide a patch adding
a test for sapwood's behavior (there's already a test you can use as a starting
place for yours).
And with such a testcase you could easily verify your patch-guess and submit a
proper patch if it really fixes this issue.
So I have to agree with Eero here. Unless there is a proper testcase, there's
no trial-and-error fixing of this bug (and also no high severity). We added the
test framework for exactly this purpose to get out of the blue-sky bug fixing
Created an attachment (id=1076) [details]
testcase doing a too short request
Attached is a testcase built on double-free that does the following:
* Connects to sapwood, sends a standard request, but actually only sends one
byte of data. This is one of the cases where the method indicated would return
0, and causing sapwood to spin into a infinite loop.
On my FREMANTLE_X86 target, sapwood-3.1.1, this spins sapwood into an infinite
loop. Due to limited time I've not had a chance to test using the built-in test
framework (which requires Xvfb which doesn't exist in the Fremantle pre-alpha
A more proper way to test (given the existence of a within-sdk working test
framework of sapwood) would be to take double-free, do a single modification to
the pixbuf_proto_request which changes the length of the packet sent to sapwood
Honestly, the problem report is quite simple. Given the situation where
process_buffer returns 0 (which it will in case of malformed header or a too
short buffer), the while (ofs < buflen) loop will not iterate, as we do ofs +=
n, where n in this case is 0, which means it will try process_buffer in next
loop iteration, get same return value again, repeat loop contents over and over
The problem about how to go about the fix is if to simply rely on that we get
all the data at once in a socket callback, and terminate the connection in the
case of short data, like is done later in process_buffer (causing return FALSE
in socket callback).
Or make sane buffering: break from the while loop, and keep input in a buffer
for next portion of data/socket callback and keep connection alive (return
TRUE) until the incoming data makes sense.
> Due to limited time I've not had a chance to test using the built-in test
> framework (which requires Xvfb which doesn't exist in the Fremantle pre-alpha
You can run it outside the Scratchbox, similarly to Xephyr.
(Xvfb is used because some of our internal build machines don't have displays.)
(In reply to comment #5)
> > Due to limited time I've not had a chance to test using the built-in test
> > framework (which requires Xvfb which doesn't exist in the Fremantle pre-alpha
> > SDK).
> You can run it outside the Scratchbox, similarly to Xephyr.
Not without patching the source tree.
> (Xvfb is used because some of our internal build machines don't have displays.)
Xvfb is not used because of the build machines. It's used to not distract the
user and because it has some features that Xephyr doesn't provide (eg. multiple
screens, which we will need for fixing a different issue with sapwood…).
Related bug: #4023 (sapwood engine request buffer corruption due to misaligned
(In reply to comment #7)
> Related bug: #4023 (sapwood engine request buffer corruption due to misaligned
> data structure)
Hm. Misalignment and the infinite loop mentioned in this bug reminds me of
Older ARM kernels had the problem that they don't handle unaligned data
accesses properly (on ARMv6 kernel assume CPU handles the alignment, but
unaligned multiword access still causes alignment exception. Because that's
not handled, process goes into infinite loop with 100% CPU usage instead of
being terminated with SIGBUS). This should be fixed in Diablo kernel, but it
at least wasn't fixed in upstream ARM kernel if you're using that instead.
 See "Misaligned memory accesses handling & ARMv6/ARMv7" thread on ARM
kernel mailing list.
If this is still valid it probably makes more sense to upstream this to
I don't expect any process here.
Yes - please upstream in case it makes sense.