Bug 5818 - We need servers
Summary: We need servers
Status: RESOLVED FIXED
Product: maemo.org Website
Component: General
Version: unspecified
Platform: All All
Importance: Low blocker with 54 votes
Target Milestone: ---
Assigned To: Niels Breet
QA Contact: general@maemo.org
URL: http://maemo.org
Reported: 2009-10-26 19:14 UTC by Ryan Abel
Modified: 2010-02-23 13:42 UTC (History)

Description Ryan Abel (reporter) maemo.org 2009-10-26 19:14:19 UTC
STEPS TO REPRODUCE THE PROBLEM:

1. Try to apply one heart to one package in Extras-testing.

EXPECTED OUTCOME:
Logging in and rating a package should take all of 30 seconds.

ACTUAL OUTCOME:
Gave up after 15 minutes of trying. It took 10 minutes to get to the
appropriate page then finally login, and I spent another 5 minutes trying to
heart the package and watching it timeout each attempt.

REPRODUCIBILITY:
It's worse some times than others.

OTHER COMMENTS:
We need servers or we're going to have a repeat of November/December 2007 and
the platform _really_ can't afford that this time.

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_1; en-US)
AppleWebKit/531.9+(KHTML, like Gecko, Safari/528.16) OmniWeb/v622.10.0
Comment 1 Quim Gil nokia 2009-10-27 00:50:29 UTC
Packages and Brainstorm are indeed very slow, making normal work difficult.
Comment 2 Tero Kojo nokia 2009-10-27 08:18:13 UTC
Completely agree.
Agreement with the new ISP has been signed.
Now we are waiting for them to get the hardware to the server room.
After that 

Wiki page http://wiki.maemo.org/ISP_Move created to track this bug
Comment 3 Henri Bergius 2009-12-07 18:39:12 UTC
*** Bug 6672 has been marked as a duplicate of this bug. ***
Comment 4 Neil MacLeod maemo.org 2009-12-07 22:03:26 UTC
Is there a timeline for this server move that could be added to the wiki page?
Is the move due to happen in Dec 09, Jan 10, Feb 10 or later?

https access to maemo.org has always been shockingly bad so this isn't really
anything new but with the increased popularity of the platform I suppose the
web services are now becoming more unusable than usual.
Comment 5 Jeff Moe 2009-12-08 03:00:24 UTC
Does this "ISP move" also entail the build boxes? I noticed it took over 1 hour
10 minutes to compile irssi on the build box. I did it on my laptop (.debs of
both ARM and X86 packages) in 2 minutes 38 seconds total. Build servers are
supposed to be *faster* than laptops... W.T.F.

I'm waiting for the package I submitted to get built. Imagine the guy that
submitted erlang haha. That build box is going to be tied up forever. If the
build times were reasonable, it would be easy to submit many packages per
day...

I can't imagine you're still waiting on the ISP as the contract was signed in
October. Hop to it!
Comment 6 Tero Kojo nokia 2009-12-08 08:41:45 UTC
(In reply to comment #5)

> I can't imagine you're still waiting on the ISP as the contract was signed in
> October. Hop to it!

Yes we are, no imagining required.
Comment 7 Tero Kojo nokia 2009-12-08 08:49:04 UTC
(In reply to comment #4)
> Is there a timeline for this server move that could be added to the wiki page?
> Is the move due to happen in Dec 09, Jan 10, Feb 10 or later?

ASAP. Hopefully everything will be there during this month.
Getting hardware online from the new ISP one bit at a time. Painfully slow
experience, can't recommend to anyone.

> https access to maemo.org has always been shockingly bad so this isn't really
> anything new but with the increased popularity of the platform I suppose the
> web services are now becoming more unusable than usual.

Correct. Loads are not what they used to be. It was a more than tenfold
increase in traffic when the N900 got announced. And the growth hasn't stopped,
but keeps going up slowly. So presumably the S-curve is starting to even out.
Comment 8 gidyn 2009-12-10 17:42:30 UTC
Erlang took 2 days, 18 hours
http://talk.maemo.org/showpost.php?p=420857&postcount=17

I always knew that dynamic languages are slow :-)
Comment 9 Niels Breet maemo.org 2009-12-10 20:38:23 UTC
(In reply to comment #8)
> Erlang took 2 days, 18 hours
> http://talk.maemo.org/showpost.php?p=420857&postcount=17
> 
> I always knew that dynamic languages are slow :-)
> 
Well it took a long time, but it was actually stuck trying to access the
internet. I carefully killed some processes so the build could continue.

Hardware for the new builder has been delivered, now we can start setting up
and configuring it. Should make things a lot better soon.
Comment 10 Claus Feichtinger 2009-12-17 15:43:05 UTC
any news?

I wanted to start voting for apps in -devel and -testing, but i'm not gonna
wait 2+ minutes per page.

will the new servers be in place for christmas, so that i (and i imagine others
as well) can use the holidays for something productive?
Comment 11 Tuomas Kulve 2009-12-19 12:47:02 UTC
I tried to upload new Ogg Support with support for Flac tags but I'm getting
just "server timed out" messages.
Comment 12 Jeff Moe 2009-12-20 14:39:39 UTC
The build server appears hung for around 14 hours or so now. The last few days
it has been much faster than a week+ ago.

Is this the best place to report this? Where should outages be sent? Sucks to
lose the weekend. :(
Comment 13 Tero Kojo nokia 2009-12-21 09:45:03 UTC
(In reply to comment #12)
> The build server appears hung for around 14 hours or so now. The last few days
> it has been much faster than a week+ ago.

Ran out of disk space. Extremely large projects like Qt 4.6 take up quite a
chunk of disk. On backup hardware now. Fixing today.

> Is this the best place to report this? Where should outages be sent? Sucks to
> lose the weekend. :(

It truly does.

And yes, right place to report, thank you!
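
A builder that hangs when the disk fills up could be caught earlier with a free-space check before each job. A minimal Python sketch, assuming a hypothetical 10% free-space threshold; this is not the builder's actual code, and the function names are made up for illustration:

```python
import shutil


def free_fraction(path="."):
    """Return the fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_room(path=".", min_free_fraction=0.10):
    """True if at least `min_free_fraction` of the disk is free (threshold is an assumption)."""
    return free_fraction(path) >= min_free_fraction


if __name__ == "__main__":
    # Refuse to start a large build (e.g. Qt) when the disk is nearly full.
    if not has_room("."):
        print("not enough free disk space, skipping build")
```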
Comment 14 Jeff Moe 2009-12-21 17:10:13 UTC
Perhaps another outage?

https://garage.maemo.org/builder/fremantle/burgerspace_1.8.3-18/armel.root.log.FAILED.txt

Err http://osso.stage.dmz fremantle/sdk/free Packages
  Could not connect to osso.stage.dmz:80 (10.5.1.102). - connect (111
Connection refused)
Comment 15 Niels Breet maemo.org 2009-12-21 17:42:51 UTC
(In reply to comment #14)
> Perhaps another outage?
> 
> Err http://osso.stage.dmz fremantle/sdk/free Packages
>   Could not connect to osso.stage.dmz:80 (10.5.1.102). - connect (111
> Connection refused)
> 
Another server which is being replaced soon. It couldn't handle the load and
started to reject. Should be better again now.
Comment 16 Jeff Moe 2009-12-21 22:21:09 UTC
Now the build server appears to be "up", but trying to upload fails. I'm trying via
scp. Sometimes I can connect, but it only sends the first 192k or so then just
stalls there. I notice there is nothing in the build queue, this could be why.
Comment 17 Jeff Moe 2009-12-23 18:09:58 UTC
Still stalling at 192k of a file uploaded. I was able to get some packages
uploaded yesterday, but today all day it's been hanging there. Perhaps
something with a firewall?
Comment 18 Jeff Moe 2009-12-23 19:18:14 UTC
I tried from my laptop via extras assistant:

https://garage.maemo.org/extras-assistant/index.php?step=4
==============================================
Maemo Extras Assistant
Step 4: check uploaded files

Checking your files:
File upload error. (tar file) Please try to upload your packages again!
==============================================

Note, where I am scping from has a 100Mbit connection to the 'net at a data
center in the USA (in other words very fast/good net
connectivity--netdepot.com). I tried the web browser upload from a DSL
connection in Argentina (regular home connection).
Comment 19 Kasper Souren 2009-12-25 14:16:23 UTC
http://maemo.org/community/brainstorm/

"Fatal error: Maximum execution time of 30 seconds exceeded in
/mnt/netapp/pear/midcom/lib/midcom/core/collector.php on line 212"
Comment 20 ossipena 2009-12-30 13:08:02 UTC
Unable to access Brainstorm. I have been waiting almost 30 minutes without any
progress.
Comment 21 Johannes Siipola 2010-01-02 23:43:12 UTC
Unable to access any packages I have uploaded to devel. All the errors similar
to: " Fatal error: Maximum execution time of 30 seconds exceeded * on line * ".
This has been the situation for several days now.
Comment 22 Jeff Moe 2010-01-03 02:31:25 UTC
/me [Waiting for headers]
Comment 23 Jeff Moe 2010-01-03 02:55:32 UTC
Ah, here's a new timeout I haven't seen before:

==========================

Err http://repository.maemo.org fremantle/sdk/free Packages
  504 Gateway Time-out [IP: 96.17.106.136 80]

==========================
that resolves to:
136.106.17.96.in-addr.arpa domain name pointer
a96-17-106-136.deploy.akamaitechnologies.com.
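
For anyone repeating this check: the in-addr.arpa name above can be derived from the IP address alone. A small Python sketch using the standard `ipaddress` module; it only builds the name, and the actual PTR query still needs a resolver such as `host`:

```python
import ipaddress


def reverse_pointer_name(ip):
    """Return the name queried for a PTR (reverse DNS) lookup of `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    print(reverse_pointer_name("96.17.106.136"))
```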
Comment 24 Jeff Moe 2010-01-03 21:57:16 UTC
/me [Waiting for headers] again when running `apt-get update` in SDK. 

Occasionally I can pull it down, but if so, it's like 398B/s (*not* kbytes,
bytes). This is on a box that gets 8000k/sec (eight thousand) or so from
kernel.org (i.e. 20,000 times faster).

I am coming from 63.247.92.155.

$ host repository.maemo.org 
repository.maemo.org    CNAME    repository.maemo.org.edgesuite.net
repository.maemo.org.edgesuite.net    CNAME    a515.g.akamai.net
a515.g.akamai.net       A    207.226.85.80
a515.g.akamai.net       A    207.226.85.74

Reporting this here, per this discussion:
http://lists.maemo.org/pipermail/maemo-developers/2010-January/023380.html

What other info would you like? Would you like separate bug reports for
different outages/services?
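
The "20,000 times" figure can be checked with a quick calculation. A Python sketch, assuming "8000k/sec" means 8000 * 1000 bytes per second (the unit is a guess; with 1024-byte kilobytes the ratio comes out slightly higher):

```python
def speed_ratio(fast_bytes_per_s, slow_bytes_per_s):
    """How many times faster the first transfer rate is than the second."""
    return fast_bytes_per_s / slow_bytes_per_s


if __name__ == "__main__":
    kernel_org = 8000 * 1000   # assumed: decimal kilobytes per second
    maemo_repo = 398           # bytes per second, as reported by apt-get
    print(f"{speed_ratio(kernel_org, maemo_repo):.0f}x")
```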
Comment 25 Andre Klapper maemo.org 2010-01-04 00:20:48 UTC
*** Bug 7610 has been marked as a duplicate of this bug. ***
Comment 26 Andre Klapper maemo.org 2010-01-04 14:43:41 UTC
*** Bug 7585 has been marked as a duplicate of this bug. ***
Comment 27 Jeff Moe 2010-01-04 17:22:55 UTC
Builder seems foo. I've been trying to build this since Dec 23. Builds fine
arm/i386 for me.

===================================
[2010-01-01 16:09:11] Processing package guile-1.8 1.8.7+1-3. Uploader: jebba,
builder: builder1
[2010-01-01 16:09:13] Building guile-1.8 1.8.7+1-3 for target
'maemo-fremantle-armel-extras-devel'
[2010-01-01 16:13:06] OK
[2010-01-01 16:13:07] Building guile-1.8 1.8.7+1-3 for target
'maemo-fremantle-i386-extras-devel'
[2010-01-04 11:25:37] Unexpected error:
   KeyboardInterrupt: 
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/buildlib/app.py", line 81, in run
    mainfunc(argv, options, self._logger, self.conf)
  File "/usr/bin/buildme", line 628, in main
    params = {
  File "/usr/lib/python2.4/site-packages/buildlib/fsm.py", line 72, in run
    code = handler(self)
  File "/usr/bin/buildme", line 444, in do_build
    (env.params['rcode'], _) = destination.run(dsc.fname)
  File "/usr/lib/python2.4/site-packages/buildlib/dest.py", line 81, in run
    return getstatusoutput("ssh %s %s %s" % (self.dest,
self.command['command'],
  File "commands.py", line 54, in getstatusoutput
    text = pipe.read()
KeyboardInterrupt
===================================

Looks like it got CTRL-C'd (?). Though not quite sure how a job submitted via
'net gets CTRL-C'd. Perhaps it was resubmitted manually. The package doesn't go
out to the 'net or anything like that. Note, it looks like it sat in the
builder for 3 days.
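
The traceback shows the build blocked in `pipe.read()` with nothing to interrupt it short of a manual CTRL-C. One way to bound that is a timeout around the external command. A sketch in modern Python 3 (not the builder's Python 2.4 code; `run_with_timeout` is a hypothetical helper, and the real `ssh` invocation is not reproduced here):

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run `cmd`; return (returncode, output), or (None, "") if it exceeds `timeout_s`."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None, ""
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # A stand-in for a hung build: sleeps far longer than the allowed time.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung, 1))
```

A timed-out job could then be marked FAILED in the log instead of occupying the builder over a weekend.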
Comment 28 Niels Breet maemo.org 2010-01-04 17:48:32 UTC
(In reply to comment #27)
> Looks like it got CTRL-C'd (?). Though not quite sure how a job submitted via
> 'net gets CTRL-C'd. Perhaps it was resubmitted manually. The package doesn't go
> out to the 'net or anything like that. Note, it looks like it sat in the
> builder for 3 days.
> 
Correct, this was a package I submitted manually (in debug mode), but after the
weekend still wasn't finished. So I hit ctrl-c.
Comment 29 Jeff Moe 2010-01-05 00:04:16 UTC
wiki.maemo.org

Warning: pg_connect() [function.pg-connect]: Unable to connect to PostgreSQL
server: FATAL: sorry, too many clients already in
/usr/local/lib/mediawiki_1.12.0/extensions/GForgeAuthentication/GForgeAuthenticationPlugin.php
on line 15
Couldn't authenticate against garage. (DB problem)

Do you need any other info?
Comment 30 Pelau Vadim 2010-01-05 16:09:55 UTC
I can't use brainstorming AT ALL!
I have to leave the page open after pressing "post" and then check back in ~5
minutes, reload the page to see if the post went through, and most of the time
post it again... IT'S A NIGHTMARE!
Comment 31 Jeff Moe 2010-01-05 21:08:14 UTC
Builder (or server which feeds it) looks down:
https://garage.maemo.org/builder/fremantle/pidgin-otr_3.2.0-6/armel.root.log.FAILED.txt

Err http://repository.maemo.org fremantle/tools/free Packages
  504 Gateway Time-out [IP: 193.184.164.146 80]
Get:5 http://repository.maemo.org fremantle/tools/non-free Packages [500B]
Get:6 http://repository.maemo.org fremantle/free Packages [2109kB]
Err http://repository.maemo.org fremantle/non-free Packages
  504 Gateway Time-out [IP: 193.184.164.146 80]
Failed to fetch
http://repository.maemo.org/dists/fremantle/tools/free/binary-armel/Packages.gz
 504 Gateway Time-out [IP: 193.184.164.146 80]
Failed to fetch
http://repository.maemo.org/extras-devel/dists/fremantle/non-free/binary-armel/Packages.gz
 504 Gateway Time-out [IP: 193.184.164.146 80]
Comment 32 Jeff Moe 2010-01-05 21:09:09 UTC
Bugzilla itself was down trying to submit the outage about the builder:

"Can't connect to the database. Error: Too many connections "
Comment 33 Micke Nordin 2010-01-05 21:53:27 UTC
I have been trying for days to promote a package from extras-devel to
extras-testing. I am unable to connect though, sometimes I get:

Fatal error: Maximum execution time of 30 seconds exceeded in
/mnt/netapp/pear/midcom/lib/midcom/helper/_dbfactory.php on line 451

And sometimes the page won't load at all.
Comment 34 Brent Chiodo 2010-01-11 03:33:10 UTC
I've been trying for the last few days to upload a package to
fremantle-extras-devel-non-free, and get a "Successfully uploaded packages."
from dput but never get an email telling me if it succeeded or not. No update
appears in the repository for the package, either.

Is this because of the builder not functioning properly at the moment or
another sort of problem?
Comment 35 Jeff Moe 2010-01-11 14:19:57 UTC
Failed to fetch
http://repository.maemo.org/extras/dists/fremantle/free/source/Sources.gz  503
Service Unavailable [IP: 64.210.100.9 80]
Failed to fetch
http://repository.maemo.org/extras-testing/dists/fremantle/free/source/Sources.gz
 504 Gateway Time-out [IP: 64.210.100.9 80]

Coming from 63.247.92.155.


(As a side note, I am using a mirror for that server, but ruskie mentioned a
kernel update that I didn't see, so thought I'd check upstream.)
Comment 36 Jeff Moe 2010-01-12 03:06:59 UTC
Trying to see details on a package I had just built:

Fatal error: Maximum execution time of 30 seconds exceeded in
/mnt/netapp/pear/midcom/lib/midcom/core/privilege.php on line 457

http://maemo.org/packages/package_instance/view/fremantle_extras-devel_free_i386/d-feet/0.1.10-2/
Comment 37 Jeff Moe 2010-01-17 05:24:07 UTC
Well, it hasn't been noted here, but lists.maemo.org, the repositories, etc.
are down for everyone. This has been true for hours. Here's one thread that
talks about it:
http://talk.maemo.org/showthread.php?t=40496
Comment 38 Jeff Moe 2010-01-18 06:53:01 UTC
Downtime continues, basically for the entire weekend. This includes the maemo
repositories, the mail servers (lists web interface and actual mail to/from
lists), and firmware binaries (tablet-dev.nokia.com).
Comment 39 ossipena 2010-01-18 07:33:26 UTC
(In reply to comment #38)
> Downtime continues, basically for the entire weekend. This includes the maemo
> repositories, the mail servers (lists web interface and actual mail to/from
> lists), and firmware binaries (tablet-dev.nokia.com).
> 

Yes, the ISP can fix the issue this morning (GMT+2), as Tero Kojo said on tmo.
Comment 41 Jeff Moe 2010-01-20 00:52:38 UTC
The repositories are still having problems. Right now I'm getting (and
confirmed by another user (xorAxAx) on IRC):

W: GPG error: http://repository.maemo.org fremantle Release: The following
signatures were invalid: BADSIG E40DC434616730BD maemo.org Extras repositories
(Fremantle Extras) <repositories@maemo.org>

I am coming from 190.30.23.191. xorAxAx is getting identical BADSIG:


A third user is reporting this (in a pastebin linked from maemo-devel):
#
Failed to fetch
http://repository.maemo.org/dists/fremantle/sdk/free/source/Sources.gz 
Sub-process gzip returned an error code (1)
#
Reading package lists... Done
#
W: GPG error: http://repository.maemo.org fremantle Release: Couldn't access
keyring: 'No such file or directory'


Note, this is after a recent "flushing" performed by x-fade.
Comment 42 mustali 2010-01-20 10:26:02 UTC
similar error here. coming from 65.216.74.168

W: GPG error: http://repository.maemo.org fremantle Release: The following
signatures were invalid: BADSIG E40DC434616730BD maemo.org Extras repositories
(Fremantle Extras) <repositories@maemo.org>
W: GPG error: http://repository.maemo.org fremantle Release: The following
signatures were invalid: BADSIG E40DC434616730BD maemo.org Extras repositories
(Fremantle Extras) <repositories@maemo.org>
W: GPG error: http://repository.maemo.org fremantle/tools Release: The
following signatures were invalid: NODATA 1 NODATA 2
W: You may want to run apt-get update to correct these problems
Comment 43 mustali 2010-01-21 12:19:36 UTC
another "hiccup"
Jan. 21, 10:13:19 UTC

https://bugs.maemo.org/
Software error:
Can't connect to the database.
Error: Can't connect to MySQL server on 'apps.maemo.org' (111)
  Is your database installed and up and running?
  Do you have the correct username and password selected in localconfig?

http://maemo.org/packages
Service Temporarily Unavailable
The server is temporarily unable to service your request due to maintenance
downtime or capacity problems. Please try again later.
Comment 44 Jeff Moe 2010-01-23 18:59:40 UTC
repository.maemo.org down.

garage.maemo.org down.

lists.maemo.org down.

I'm coming from 207.145.214.214. Others reporting outages as well.
Comment 45 Jeff Moe 2010-01-23 19:12:00 UTC
More info on the outage:
==================================
  504 Gateway Time-out [IP: 96.17.8.16 80]
Err http://repository.maemo.org fremantle/non-free Packages
  504 Gateway Time-out [IP: 96.17.8.16 80]
Err http://repository.maemo.org fremantle/free Packages
  504 Gateway Time-out [IP: 96.17.8.16 80]
Err http://repository.maemo.org fremantle/free Packages
  504 Gateway Time-out [IP: 96.17.8.16 80]
W: Failed to fetch
http://repository.maemo.org/extras/dists/fremantle/free/binary-armel/Packages 
504 Gateway Time-out [IP: 96.17.8.16 80]

W: Failed to fetch
http://repository.maemo.org/extras/dists/fremantle/non-free/binary-armel/Packages
 504 Gateway Time-out [IP: 96.17.8.16 80]

W: Failed to fetch
http://repository.maemo.org/extras-devel/dists/fremantle/free/binary-armel/Packages
 504 Gateway Time-out [IP: 96.17.8.16 80]

W: Failed to fetch
http://repository.maemo.org/extras-testing/dists/fremantle/free/binary-armel/Packages
 504 Gateway Time-out [IP: 96.17.8.16 80]
==================================

Also, Andre Klapper reports on IRC that bugmail was down for 2.5 hours.
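
When pasting long `apt-get update` logs like the one above, a per-server summary can be easier to read. A rough Python sketch that tallies HTTP errors by edge IP; the regular expression is an assumption based only on the lines quoted in this bug, and it ignores errors that wrap across lines:

```python
import re
from collections import Counter

# Matches fragments such as "504 Gateway Time-out [IP: 96.17.8.16 80]".
ERROR_RE = re.compile(r"(\d{3}) ([A-Za-z][A-Za-z -]*?) \[IP: ([0-9.]+) (\d+)\]")


def tally_http_errors(log_text):
    """Count (status code, server IP) pairs found in apt-get output."""
    counts = Counter()
    for status, _reason, ip, _port in ERROR_RE.findall(log_text):
        counts[(int(status), ip)] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "Err http://repository.maemo.org fremantle/non-free Packages\n"
        "  504 Gateway Time-out [IP: 96.17.8.16 80]\n"
        "Err http://repository.maemo.org fremantle/free Packages\n"
        "  504 Gateway Time-out [IP: 96.17.8.16 80]\n"
    )
    for (status, ip), n in tally_http_errors(sample).items():
        print(status, ip, n)
```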
Comment 46 mustali 2010-01-24 17:46:41 UTC
repositories are down: 15:44:51 UTC 
coming from: 87.65.176.133

Err http://repository.maemo.org fremantle Release.gpg                           
  Could not connect to repository.maemo.org:80 (194.78.100.27), connection
timed out
Err http://repository.maemo.org fremantle/free Translation-en_US
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle/non-free Translation-en_US
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle Release.gpg   
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle/free Translation-en_US
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle/non-free Translation-en_US
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle/tools Release.gpg
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle/tools/free Translation-en_US
  Unable to connect to repository.maemo.org http:
Err http://repository.maemo.org fremantle/tools/non-free Translation-en_US
  Unable to connect to repository.maemo.org http:
Fetched 567B in 4min0s (2B/s)                           
Reading package lists... Done
W: Failed to fetch
http://repository.maemo.org/extras/dists/fremantle/Release.gpg  Could not
connect to repository.maemo.org:80 (194.78.100.27), connection timed out

W: Failed to fetch
http://repository.maemo.org/extras/dists/fremantle/free/i18n/Translation-en_US.gz
 Unable to connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/extras/dists/fremantle/non-free/i18n/Translation-en_US.gz
 Unable to connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/extras-testing/dists/fremantle/Release.gpg  Unable
to connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/extras-testing/dists/fremantle/free/i18n/Translation-en_US.gz
 Unable to connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/extras-testing/dists/fremantle/non-free/i18n/Translation-en_US.gz
 Unable to connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/dists/fremantle/tools/Release.gpg  Unable to
connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/dists/fremantle/tools/free/i18n/Translation-en_US.gz
 Unable to connect to repository.maemo.org http:

W: Failed to fetch
http://repository.maemo.org/dists/fremantle/tools/non-free/i18n/Translation-en_US.gz
 Unable to connect to repository.maemo.org http:
Comment 47 Jeff Moe 2010-01-24 23:08:37 UTC
It appears the mailing list mail server is down too:
=====================================================
This is an automatically generated Delivery Status Notification.

THIS IS A WARNING MESSAGE ONLY.

YOU DO NOT NEED TO RESEND YOUR MESSAGE.

Delivery to the following recipients has been delayed.

              <maemo-developers@maemo.org>

The reason for the problem:
4.4.0 - Other network problem '[Errno 61] Connection refused'
=====================================================
Comment 48 arnim sauerbier 2010-01-27 02:26:19 UTC
I see BADSIG E40DC434616730BD again now - Jan 27 
My current ip is 84.128.203.34

Should this symptom have a bugreport of its own?
Comment 49 Tero Kojo nokia 2010-01-27 08:56:30 UTC
(In reply to comment #48)
> I see BADSIG E40DC434616730BD again now - Jan 27 
> My current ip is 84.128.203.34
> 
> Should this symptom have a bugreport of its own?

Yes, this was actually a result of a package being deleted from testing and
devel at the developer's request (turned out that the app for securing your
device worked too well).
Comment 50 Tuomas Kulve 2010-02-03 08:34:08 UTC
So, what's the status now? Is the new ISP in use? What about the build servers?
Comment 51 Tero Kojo nokia 2010-02-08 11:57:01 UTC
(In reply to comment #50)
> So, what's the status now? Is the new ISP in use? What about the build servers?

Has been for a while. Many problems reported in this thread were due to the new
ISP not getting things right the first time.

Build servers? They have been in the new place since December. Do you need more
speed?
Comment 52 Jeff Moe 2010-02-23 13:42:24 UTC
AFAICT all the new servers have been in place for a few weeks and there are no
new pending moves, so this is FIXED.