Bug 3479 - SDK does not work on Debian lenny system
Status: RESOLVED FIXED
Product: Development platform
Component: SDK
Version: 5.0-alpha
Platform: All Linux
Importance: Medium major
Target Milestone: 5.0-beta2
Assigned To: Soumya
QA Contact: sdk-bugs
Reported: 2008-07-21 18:05 UTC by Graham Cobb
Modified: 2009-08-01 18:07 UTC (History)
Description Graham Cobb (reporter) maemo.org 2008-07-21 18:05:30 UTC
Like many others, I use Debian lenny for my desktop and development systems. 
After a recent update, my scratchbox environment (which has been working well
for a long time) has stopped working.  The error is the well-known
"_rtld_local._dl_sysinfo_dso' failed" error.

Googling for this error indicates that the problem is a compatibility issue
between the old libc used in the SDK and the Linux kernel, and that it has
been known on some platforms for a long time.  Some workarounds are suggested
which involve either writing 0 to /proc/sys/vm/vdso_enabled or setting options
on the boot command line.  However, neither workaround works with the Debian
kernel in linux-image-2.6.22-3-amd64 version 2.6.22-6.lenny1 (the latest in
lenny).

This is a serious problem as I would expect off-the-shelf Debian systems to be
the single most commonly used development environment, particularly for larger
scale developers.  I believe it is essential for wide Maemo acceptance that the
development environment works on Debian without the need for (very inefficient)
emulators.

Note that a complete build of GPE (approx 30 packages) takes about 45 minutes
on my fastest system while the same build in a QEMU emulated environment
running on the same system takes 6-9 hours. So, emulation is NOT a feasible
solution for serious software development.
Comment 1 Daniel Martin Yerga 2008-07-21 20:20:15 UTC
Could this be a problem related to 64-bit?

I am using Debian sid with linux-image-2.6.24-1-686 and it works for me.
That said, some time ago I had a similar problem: I was using 2.6.18, the
problem appeared when I tried to update to later kernel versions, and I stayed
on 2.6.18 for a while for that reason. I don't remember how, but after a time
it was fixed "magically". It could have been the update to a later kernel
version or the update of scratchbox (to version 1.0.10).
Comment 2 Graham Cobb (reporter) maemo.org 2008-07-25 01:18:42 UTC
I pasted in the wrong kernel version in the original report.  This problem was
introduced with the move to the kernel linux-image-2.6.25-2-amd64 version
2.6.25-6 (which is really the latest in lenny).

The previous kernel (linux-image-2.6.24-1-amd64 version 2.6.24-7) does not have
this problem.  So, it would be useful if Daniel Martin Yerga could try
upgrading to 2.6.25 to see if the problem happens for him.

This **could** be a specifically amd64 problem (note that I run scratchbox in a
32-bit chroot, but with the 64-bit kernel -- this is the way I have always run
it on this system). I have not yet tested on my 32-bit system (because I can't
afford it to break at the moment).
Comment 3 Daniel Martin Yerga 2008-07-25 17:57:45 UTC
Graham, I have installed linux-image-2.6.25-2-686 (version 2.6.25-7) and it
doesn't show the error you described in the original post; instead it shows
the following note:
"""
Host kernel has vdso support (which is uncompatible with SB)
You can fix this with either: 
  echo 0 > /proc/sys/vm/vdso_enabled
or
  add 'vdso=0' to the kernel parameters
"""

After writing 0 to the /proc/sys/vm/vdso_enabled file it works again.
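
Before applying the workaround it can help to see what the kernel currently reports. The following is only a minimal sketch: the function name vdso_status is hypothetical, and the default path is simply the one quoted in the Scratchbox message above, which may not exist on every kernel.

```shell
# Hypothetical helper: report the vdso setting stored in a given file.
# The default path is taken from the Scratchbox message; whether it
# exists depends on the kernel in use.
vdso_status() {
    file="${1:-/proc/sys/vm/vdso_enabled}"
    if [ ! -r "$file" ]; then
        # No such file: the /proc workaround cannot be applied here.
        echo "missing"
        return 0
    fi
    value=$(cat "$file")
    if [ "$value" = "0" ]; then
        echo "disabled"
    else
        echo "enabled ($value)"
    fi
}
```

Calling vdso_status with no argument inspects the live system; if it prints "enabled", the echo command from the Scratchbox note can then be run as root.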

What scratchbox version are you using? I am using 1.0.10.
Comment 4 Graham Cobb (reporter) maemo.org 2008-07-25 19:49:53 UTC
Interesting.  I am also using SB 1.0.10.  But it does not give that useful
error message, and neither of the workarounds works.

It might be a real difference between the amd64 kernel and the 32-bit kernel. 
I will upgrade my 32-bit system over the weekend and try there.
Comment 5 Graham Cobb (reporter) maemo.org 2008-07-26 02:21:09 UTC
On my 32-bit system I see the same effect as Daniel.  In other words, I can
work around the problem.

So, it seems it is only completely broken on 64-bit, which previously worked
fine using a chroot.

So, this leaves me with two very unappealing choices: 1) do all my builds on my
old, slow, memory constrained 32-bit system, or 2) do all my builds in my QEMU
environment emulated very slowly on my nice, fast, 64-bit system.

Either way, builds which previously took 45 minutes on the 64-bit system now
take 6-9 hours.  This will mean that I have to go back to only updating the GPE
daily build repositories once every three days as I can't build everything
every night.
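
To put the figures above into numbers, here is a small arithmetic sketch. The function name slowdown is hypothetical; the inputs are the build times quoted in this comment, converted to minutes.

```shell
# Hypothetical helper: integer slowdown factor between two build times,
# both given in minutes.
slowdown() {
    native_minutes="$1"
    slow_minutes="$2"
    echo $(( slow_minutes / native_minutes ))
}

slowdown 45 360   # 6 hours vs 45 minutes, prints 8
slowdown 45 540   # 9 hours vs 45 minutes, prints 12
```

So the emulated or 32-bit builds are roughly 8 to 12 times slower than the native 64-bit chroot build.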
Comment 6 Quim Gil nokia 2008-07-28 09:27:51 UTC
I'm not sure about the right diagnosis and resolution of this bug, however...

(In reply to comment #0)
> This is a serious problem as I would expect off-the-shelf Debian systems to be
> the single most commonly used development environment, particularly for larger
> scale developers.  I believe it is essential for wide Maemo acceptance that the
> development environment works on Debian without the need for (very inefficient)
> emulators.

Debian Stable is supported by the Maemo 4.1 SDK, which is also a stable
release.



From the maemo 4.1 release notes:

> This release was tested on the following distributions:
> - Ubuntu Feisty
> - Ubuntu Gutsy
> - Ubuntu Hardy 
> - Debian stable
> - Kubuntu Hardy
>
> It should also work on other Scratchbox supported operating systems.
Comment 7 Graham Cobb (reporter) maemo.org 2008-07-28 12:21:51 UTC
It is up to Nokia what you choose to support, of course.  However, I stand by
my view: I believe it is essential for wide Maemo acceptance that the
development environment works on Debian without the need for (very inefficient)
emulators. In case it wasn't clear, I meant up-to-date Debian (latest testing
and even, when possible, unstable).

> Debian Stable is supported by the Maemo 4.1 SDK, which is also a stable
> release.

Do you know of any developers who use Debian stable for development???  I run
Debian stable on my mail/VPN/web server on the internet.  I use Debian testing
on my other machines.

Maybe it would be worthwhile doing a survey of the main developers to see what
development environments they use.  It may be that most use Ubuntu, which does
seem to be better supported.

I will also log a bug with Debian asking that the 64-bit kernel build supports
whatever is required to at least make the vdso workaround work.  But
I have never been successful in persuading the Debian kernel developers to
change anything in the past!

In any case, whatever urgency you give this, surely in the medium term either
scratchbox has to be fixed to not need this VDSO workaround or the Maemo SDK has
to move away from scratchbox.  There is talk of lenny being released in
September (I'll believe it when I see it, but it won't be long).
Comment 8 Graham Cobb (reporter) maemo.org 2008-07-28 13:53:48 UTC
I have created Debian bug 492702 --
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=492702
Comment 9 Tor 2008-07-28 14:47:41 UTC
(In reply to comment #8)
> Do you know of any developers who use Debian stable for development???  I run
> Debian stable on my mail/VPN/web server on the internet.  I use Debian testing
> on my other machines.

FWIW, I agree completely with Graham's comment above - there are a number of
developers using Debian where I work and they all (myself included, and I'm
also a scratchbox user) use Lenny or Sid. We only use stable/Etch for the kind
of services described  above (servers).  On my system I use a kernel compiled
with the necessary support, but if this only implies moving to a slightly newer
libc in the SDK it sounds like it should be considered.
Comment 10 Jussi Hakala 2008-08-05 16:31:15 UTC
Do you have compat vdso support compiled in your kernel?

My guess would be that the linux-image-2.6.25-2-amd64 doesn't have that support
compiled in, thus using exclusively the randomized vdso which is unsupported by
the old glibc inside scratchbox.
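
One way to answer this question is to look at the kernel's build configuration. The sketch below assumes that "compat vdso support" corresponds to the kernel option CONFIG_COMPAT_VDSO and that the config file is installed as /boot/config-<version>, as Debian kernel packages normally do; the function name compat_vdso_config is hypothetical.

```shell
# Hypothetical helper: look up CONFIG_COMPAT_VDSO in a kernel config file.
# Assumption: the running kernel's config lives in /boot/config-$(uname -r).
compat_vdso_config() {
    config="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$config" ]; then
        echo "unknown"
    elif grep -q '^CONFIG_COMPAT_VDSO=y' "$config"; then
        echo "yes"
    else
        echo "no"
    fi
}
```

Run without arguments it checks the running kernel; a path can be passed to inspect the config of another installed kernel.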
Comment 11 Eero Tamminen nokia 2008-08-05 16:41:38 UTC
(In reply to comment #7)
> > Debian Stable is supported by the Maemo 4.1 SDK, which is also a stable
> > release.
> 
> Do you know of any developers who use Debian stable for development???

On some of my own Desktop machines, yes.  None of those runs the latest
stuff (I do fairly low level development and have old machines so faster
and more stable environment is nice).

However, the latest Etch kernel update also produced this message on
Scratchbox startup:
"""
Host kernel has vdso support (which is uncompatible with SB)
You can fix this with either: 
  echo 0 > /proc/sys/vm/vdso_enabled
or
  add 'vdso=0' to the kernel parameters
"""

Basically the problem is that the Scratchbox build tools are incompatible
with the host environment (in Etch it can still be fixed with vdso=0),
the number of host tools is large (half a gig) and they need modifications
for Scratchbox 1.  It's a bit unfortunate that chroot doesn't isolate from
this kind of effect in stable distro security updates.


Sbox2[1] makes it much easier to update/change the tools to be compatible
with the host environment: you just apt-get them from a suitable Debian
distro.  However, if you update/change them, they are no longer
the set we've tested to work for building our software.  And it doesn't
work well enough (at least not yet) to replace Scratchbox 1
officially (for starters, it doesn't support x86 package building &
installation properly yet, just ARMEL).  Some developers are already
happy using it, though, as part of maemo SDK+ beta[2].

[1] http://freedesktop.org/wiki/Software/sbox2
[2] http://maemo-sdk.garage.maemo.org/


> I believe it is essential for wide Maemo acceptance that the development
> environment works on Debian without the need for (very inefficient)
> emulators.

Some kind of a virtual machine is the only really reliable method
if we want to support any OS or even just any Linux distro.


> Note that a complete build of GPE (approx 30 packages) takes about 45
> minutes on my fastest system while the same build in a QEMU emulated 
> environment running on the same system takes 6-9 hours. So, emulation
> is NOT a feasible solution for serious software development.

Using virtualization (VmWare, Xen, UML, VirtualBox...) is much faster
than emulation.  With those, for normal application development, I think
the main issues will be the memory overhead of running multiple OS instances
(ARM Qemu is needed inside the virtual machine) and transferring code
between the host and the build environment. (In scratchbox you can use
just "cp"; with a virtual machine you first need to set up networking between
it and the host.)


All of these solutions (Scratchbox 1, Sbox2, emulation/virtualization,
interpreted languages like Python) have shortcomings.  I think we would
need to support several of them.
Comment 12 Graham Cobb (reporter) maemo.org 2008-08-11 18:12:53 UTC
I have a working workaround: use the user-mode-linux image (and modules) from a
32-bit system.  I have run the 32-bit UML (version 2.6.24-1um-1 from 32-bit
lenny) on my AMD64 system and it works and will build GPE (and other
applications) successfully and quite quickly (less than 90 minutes for full GPE
build, which takes over 10 hours using QEMU).

Of course, the 32-bit UML package cannot be easily installed on a 64-bit
system, and it contains conflicting files if you also need the 64-bit UML
installed.  I believe what Nokia needs to do is to create debian packages (for
both 32-bit and 64-bit Debian architectures) which will install the 32-bit UML,
and its modules, with package and file names which are maemo-specific and do
not conflict with the Debian user-mode-linux package.  Nokia would test that
this works on Debian stable and testing (and, if possible, other systems such
as Ubuntu and RHEL).

The creation of debian packages containing the UML image and testing that it
can run the SDK should be Nokia's responsibility (part of issuing the SDK). 
This could then become the only supported way to install and use the SDK,
potentially reducing support problems.  Note that these do not need to be
updated all the time -- there is no particular problem with the SDK UML kernel
being a little out of date, as long as there are no major bugs or security
problems.

The community can then be left to create convenience packages like pre-built
disk images with the SDKs pre-installed and/or autobuilders.
Comment 13 Eero Tamminen nokia 2008-08-12 09:46:29 UTC
(In reply to comment #12)
> I have a working workaround: use the user-mode-linux image (and modules)
> from a 32-bit system.  I have run the 32-bit UML (version 2.6.24-1um-1
> from 32-bit lenny) on my AMD64 system and it works and will build GPE
> (and other applications) successfully and quite quickly (less than 90
> minutes for full GPE build, which takes over 10 hours using QEMU).

Was there any particular reason why you chose UML over VmWare[1]?
Is it faster, or does it offer easier access from inside the VM to the outside
and vice versa?  (Can UML e.g. mount a directory as a hard disk like
some of the emulators for older machines can do?)

[1] http://lists.maemo.org/pipermail//maemo-developers/2008-June/034089.html
    http://maemovmware.garage.maemo.org/
Comment 14 Graham Cobb (reporter) maemo.org 2008-08-12 11:55:28 UTC
I have been using UML to provide a clean build environment for my maemo package
builds for some time (before this recent problem).  So, using UML is the
easiest option for me.

That original decision was based mainly on speed.  Vmware and QEMU seemed to
have similar speed, much slower than using UML.  While not an expert, I always
assumed that is because VmWare and QEMU were both emulating a software-neutral
hardware environment, whereas UML was only emulating the linux kernel, not
attempting to be software-neutral.  Anyway, for whatever reason, I found that
UML ran at near-native speeds and both VmWare and QEMU ran MUCH slower (and
similar to each other).  Note that these tests were on processors with no
virtualisation support.

By the way, out of VmWare and QEMU I found QEMU easier to set up and control as
a dedicated build engine.  VmWare seemed to be more aimed at interactive use,
which was not my goal.  And, of course, that is also why I am not using the
maemovmware project.

In summary, VmWare is probably a great choice for an interactive environment,
particularly for Windows users.  UML is a much better choice for use on Linux
systems and the only practical choice as a production build engine (actually,
chroot is a better choice still but that no longer works on AMD64 lenny as
reported in this bug report).
Comment 15 Graham Cobb (reporter) maemo.org 2008-08-20 13:37:22 UTC
I eventually sat down and read the kernel VDSO code, where I found that there
is a real solution to my problem!

If you are running a 64-bit kernel, there is no /proc file you can set but you
can disable VDSO for 32-bit applications such as scratchbox by appending the
option "vdso32=0" or "vdso32=2" to the kernel command line.  It may be useful
to feed this information back to the scratchbox team for inclusion in their
documentation.  Note that the vdso32= option should also work on 32-bit kernels
(where it is a synonym for vdso=).

In any case, the SDK INSTALL.txt should be modified as follows:

For example on Ubuntu Hardy:
$ echo 0 | sudo tee /proc/sys/vm/vdso_enabled

If you are using a 64-bit Linux kernel there is no /proc file to disable VDSO. 
In this case it is necessary to add the option "vdso32=0" to the kernel boot
command line.
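
After rebooting, it is worth confirming that the option actually reached the kernel. This is a minimal sketch, not part of the SDK: the function name vdso32_option is hypothetical, it reads /proc/cmdline by default, and it reports only the first vdso32= occurrence it finds.

```shell
# Hypothetical helper: extract the vdso32= value from a kernel command line.
# With no argument, the running kernel's command line is read from
# /proc/cmdline.
vdso32_option() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    # Unquoted on purpose: split the command line into words.
    for word in $cmdline; do
        case "$word" in
            vdso32=*)
                echo "${word#vdso32=}"
                return 0
                ;;
        esac
    done
    echo "unset"
}
```

If this prints "unset" on a 64-bit kernel, the boot loader configuration has not picked up the option yet.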
Comment 16 Andre Klapper maemo.org 2008-11-10 17:33:13 UTC
(In reply to comment #15)
> In any case, the SDK INSTALL.txt should be modified as follows:
> 
> For example on Ubuntu Hardy:
> $ echo 0 | sudo tee /proc/sys/vm/vdso_enabled
> 
> If you are using a 64-bit Linux kernel there is no /proc file to disable VDSO. 
> In this case it is necessary to add the option "vdso32=0" to the kernel boot
> command line.


That sounds easy to handle. Soumya, can you take a look at this, please?
Comment 17 Andre Klapper maemo.org 2009-01-22 15:17:50 UTC
Soumya: ping (comment 16)
Comment 18 Quim Gil nokia 2009-01-22 15:22:17 UTC
(In reply to comment #17)
> Soumya: ping (comment 16)

She is on holidays until the end of the month.
Comment 19 Soumya nokia 2009-02-02 16:32:30 UTC
I'm back! The hint from comment #15 was added to the installation
instructions for the Fremantle pre-alpha releases.
Have a look: http://maemo.org/development/sdks/pre_alpha_installation/#vdso

The bug is not really fixed since this is a workaround. Feel free to change
the status as you see fit.
Comment 20 Quim Gil nokia 2009-02-03 09:51:38 UTC
Either we consider this issue 

FIXED since now the installation instructions document the workaround

or we consider that the main issue is the lack of support for Debian Lenny,
converting this into an unresolved enhancement request.

By the way, the Maemo SDK+ is getting Fremantle rootstraps at the same time
as the new regular SDK releases, and the tool itself seems to be in very
decent shape now. Something to consider is whether developers not covered by
the supported distros could move to the SDK+, which (theoretically) solves most
of these issues.
Comment 21 Graham Cobb (reporter) maemo.org 2009-02-03 14:34:24 UTC
(In reply to comment #20)
I consider the documented workaround to be all that is needed to fix the
technical problem.  However, the other part of the issue is what platforms are
tested and supported.

I believe that not fully supporting and testing the SDK on both the current
Debian stable and testing (note that lenny is planned to be released as stable
in 2 weeks) is a bad move by Nokia and is bad for the Maemo community.  Why
release an SDK unless it is tested and supported on the major platforms used by
the customers for the SDK?

So, unless you can close the bug with a statement that the Fremantle SDK will
be tested and supported on Debian stable and testing then I guess it should
move to the enhancement list.
Comment 23 Quim Gil nokia 2009-04-01 13:20:15 UTC
So this is still a problem in the Maemo 5 alpha SDK, right?
Comment 24 Daniel Elstner 2009-04-17 20:21:42 UTC
(In reply to comment #23)
> So this is still a problem in the Maemo 5 alpha SDK, right?

Yes.
Comment 25 Quim Gil nokia 2009-05-19 01:15:02 UTC
(In reply to comment #21)
> So, unless you can close the bug with a statement that the Fremantle SDK will
> be tested and supported on Debian stable and testing then I guess it should
> move to the enhancement list.

The Fremantle beta release has been tested on 32bit Debian Lenny, Ubuntu
Intrepid and Jaunty. One problem with Debian testing is that it's not a
'stable' release by definition, so there is no way to guarantee a test. It may
work one week, it may not work the next week.

However, we recognize the importance of supporting something more recent than
Debian stable since this is what more Debian users have in their machines
anyway. For this reason we have decided that the final SDK releases will be
tested against a specific snapshot of Debian testing. We might also include the
most remarkable pre-releases (e.g. first alpha, first beta) but there is no
commitment to check against Debian testing in all the pre-releases.

Hopefully this solution makes sense to the Debian users.
Comment 26 Andre Klapper maemo.org 2009-07-14 13:42:43 UTC
Fix should be included in Fremantle SDK beta 2 hence updating Target Milestone.
If you are the reporter of this bug: Feel free to verify the fix if possible.
Comment 27 Javier S. Pedro 2009-08-01 18:07:25 UTC
You may be interested in the patch I sent to scratchbox-devel [1], which fixes
the issue in scratchbox's glibc instead of working around it (needs more
testers though).


If the above patch does not make it into scratchbox (and we have to keep the
workaround), I'd also like to say that the current Fremantle SDK Beta 2 seems
to suggest (automatically on x86_32) setting vdso_enabled to 0
(=VDSO_DISABLED). This is BAD, because on modern processors the performance of
the entire system might get a lot worse if VDSO is completely disabled.

Since scratchbox's problem seems to be related to _randomized_ VDSO but not
to the VDSO fixed at STACK_TOP, I suggest advising users to set vdso_enabled
to 2 (=VDSO_COMPAT) instead, which according to [2] maps the VDSO to both
STACK_TOP _and_ a random location.
This should be enough to make scratchbox's glibc 2.3.2 happy, and [3] seems to
indicate this is the case.

[1]
http://lists.scratchbox.org/pipermail/scratchbox-devel/2009-August/000457.html 
[2] http://lkml.org/lkml/2007/4/5/17
[3] http://maemo.org/development/sdks/maemo_5-0_installation/#vdso (section
"Older kernel versions")
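
For reference, the vdso_enabled values discussed here can be sketched as a small lookup. The names for 0 and 2 come from this comment; the name VDSO_ENABLED for value 1 (randomized placement) is my reading of the 32-bit x86 kernel source of that period and should be treated as an assumption. The function name vdso_mode_name is hypothetical.

```shell
# Hypothetical helper: map a vdso_enabled value to its symbolic name.
# 0 and 2 are named in the comment above; 1 is an assumption based on
# the kernel source.
vdso_mode_name() {
    case "$1" in
        0) echo "VDSO_DISABLED" ;;
        1) echo "VDSO_ENABLED" ;;
        2) echo "VDSO_COMPAT" ;;
        *) echo "unknown" ;;
    esac
}
```

Under the suggestion above, a 32-bit host would use "echo 2 | sudo tee /proc/sys/vm/vdso_enabled" rather than writing 0.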