maemo.org Bugzilla – Bug 935
RSS reader misbehavior on HTTP redirects (RFC 2616 violation)
Last modified: 2009-07-14 18:08:11 UTC
Taken from a thread on the maemo-users mailing list (referenced in URL above) --8<-- An interesting thing I have noticed with the RSS reader is that it both follows HTTP 3xx redirect responses (rock!) AND saves the newly received HTTP URL for the feed (not so rock...). The latter behavior will silently destroy a whole folder of RSS feeds if-- a) you're using free wifi in a cafe, and b) said wifi requires you to login through an HTTP proxy, and c) the dumb user (that's me!) doesn't login prior to refreshing the feeds. I couldn't figure out why none of my feeds would reload, investigated, and found they all point to http://wifi-texas.com/login/78705SH/ after I got some coffee. As a suggestion, I think this update-URL-on-3xx behavior could be improved by having the RSS reader save the new URL if and only if the new URL contains valid feed content. --8<-- To which Andrew Flegg replied --8<-- On 1/6/07, Rhys Ulerich <rhys.ulerich[at]gmail.com> wrote: > > An interesting thing I have noticed with the RSS reader is that it > both follows HTTP 3xx redirect responses (rock!) AND saves the newly > received HTTP URL for the feed (not so rock...). > [snip] > As a suggestion, I think this update-URL-on-3xx behavior could be > improved by having the RSS reader save the new URL if and only if the > new URL contains valid feed content. It's a nasty misfeature that. I'd raise it on http://bugs.maemo.org/ as a serious bug: 302 means "Moved Temporarily", RFC 2616[1] *specifically* says user agents (such as the RSS reader) should not store the resulting URL. 301 means "moved permanently" and so the new URL to the feed could be saved, although it would still be worth the sanity check you suggest. If the wifi provider was using 301 rather than 302 to redirect to the login page, *they're* the ones mis-reading the specs, so the enhancement you suggest would be useful there. 
Cheers, Andrew [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3 --8<-- To which Nicola Larosa replied-- --8<-- Andrew Flegg wrote: > If the wifi provider was using 301 rather than 302 to redirect to the > login page, *they're* the ones mis-reading the specs, so the > enhancement you suggest would be useful there. The practical situation of 3xx HTTP response code is a mess with historical causes: Redirect in response to POST transaction http://ppewww.physics.gla.ac.uk/~flavell/www/post-redirect.html --8<--
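The policy discussed in this thread (keep the user's URL on a 302, consider the new URL on a 301, and in either case only if the final response is really a feed) can be sketched as a small decision function. This is an illustrative sketch, not the RSS reader's actual code: `url_to_store`, its parameters, and the choice to treat only 301 as permanent are assumptions made for the example.

```python
from typing import List, Tuple

# Assumption: only 301 counts as permanent, following the RFC 2616 reading
# given in this thread. 302 (and any other 3xx) is treated as temporary.
PERMANENT_STATUSES = {301}


def url_to_store(original_url: str,
                 hops: List[Tuple[int, str]],
                 final_body_is_feed: bool) -> str:
    """Return the URL a reader could persist after following redirects.

    hops is the redirect chain as (status_code, location) pairs, in order.
    final_body_is_feed says whether the last response parsed as a feed.
    """
    # Sanity check suggested in the report: never overwrite the stored URL
    # when the redirect target did not deliver feed content (login pages).
    if not final_body_is_feed:
        return original_url

    candidate = original_url
    for status, location in hops:
        if status not in PERMANENT_STATUSES:
            # A temporary hop: keep whatever was permanent up to here.
            break
        candidate = location
    return candidate
```

With this policy a captive portal that answers with a 302, or with a 301 to an HTML login page, leaves the stored feed URL untouched, while a 301 to a valid feed updates it.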
https://garage.maemo.org/tracker/index.php?func=detail&aid=1386&group_id=164&atid=681
Rhys, can you please provide an example URL that will redirect?
(In reply to comment #2) > Rhys, can you please provide an example URL that will redirect? > I ran into the problem in a coffee shop that redirected all HTTP traffic to an intro/login page. A similar proxy setup should give you the redirect you need. (Easier) You could set up two tinyurl.com redirects (one to a valid RSS feed, and one to a regular web page) and use that as a test harness. I do not know whether tinyurl.com uses 301 or 302 redirects, but it should not matter for the "sanity check" feature mentioned in the original bug. Hope that helps, Rhys
(In reply to comment #3) > (Easier) You could setup two tinyurl.com redirects (one to a valid RSS feed, > and one to a regular web page) and use that as a test harness. I am unaware if > tinyurl.com uses 301 or 302 redirects, but it should not matter for the "sanity > check" feature mentioned in the original bug. I created a tinyurl for my blog and added the tinyurl to the RSS news reader. The icon shown for my blog in the left pane is the tinyurl icon. The address stored and displayed in the User Interface is the tinyurl. more /home/user/.osso_rss_feed_reader/feedlist.opml : <outline text="andre klapper's blog." title="andre klapper's blog." description="andre klapper's blog." type="rss" htmlUrl="http://blogs.gnome.org/aklapper" xmlUrl="http://tinyurl.com/57so4h" updateInterval="-1" id="xjdjqeq" lastPollTime="1224585228" sortColumn="time"/> So what are your htmlUrl and xmlUrl values?
I don't have my 770 available to check my htmlUrl and xmlUrl values at the moment-- here's the sleuthing I can do with the Firefox Live HTTP Headers add-on when using the URL http://tinyurl.com/57so4h : GET /57so4h HTTP/1.1 Host: tinyurl.com User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.3) Gecko/2008092510 Ubuntu/8.04 (hardy) Firefox/3.0.3 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive HTTP/1.x 301 Moved Permanently Location: http://blogs.gnome.org/aklapper/feed/ Content-Type: text/html Content-Length: 0 Date: Tue, 21 Oct 2008 14:56:57 GMT Server: TinyURL/1.6 Had TinyURL sent a 302 Moved Temporarily response, the xmlUrl value and favicon you indicate would be correct. However, TinyURL is sending a 301 Moved Permanently response. From an RFC purist perspective, your xmlUrl value should be "http://blogs.gnome.org/aklapper/feed/". Having a tinyurl.com xmlUrl after a 301 response seems, to me, a bug. The htmlUrl appears to be from the <link/> tag in the XML response from the final GET to http://blogs.gnome.org/aklapper/feed/; the value "http://blogs.gnome.org/aklapper" you report seems correct to me. I am unsure how the newsreader is constructing the URL to obtain the favicon, but displaying a TinyURL favicon after a 301 also looks like a bug. Others in this thread have indicated 301s and 302s tend to get mixed inappropriately in practice, and the sanity check I originally suggested would be to ensure that, after the 301 Moved Permanently response, the xmlUrl is only updated if a GET to the 301 Location received valid RSS content in the response. After fixing the above two issues (xmlUrl and favicon incorrect after a 301), I would add an additional piece of logic to ensure the xmlUrl and favicon are only updated if the 301 Location URL gives you a valid RSS XML response. 
You could test this by pointing the newsreader to a TinyURL that redirects to some static, non-RSS content. I'm having a flashback to my RFC 3261 SIP specification days. :) Hope that helps, Rhys
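The "valid RSS XML response" check proposed above could be approximated as follows. This is a minimal sketch under stated assumptions: `looks_like_feed` is a hypothetical helper, and it only checks that the body is well-formed XML whose root element is `rss`, `feed` (Atom) or `RDF` (RSS 1.0). A real reader would likely validate more, and would want a hardened XML parser for untrusted input.

```python
import xml.etree.ElementTree as ET

# Root element names accepted as feeds in this sketch (an assumption):
# RSS 0.9x/2.0 uses <rss>, Atom uses <feed>, RSS 1.0 uses <rdf:RDF>.
FEED_ROOTS = {"rss", "feed", "RDF"}


def looks_like_feed(body: bytes) -> bool:
    """Return True if body is well-formed XML with a feed-like root."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Empty bodies and typical HTML login pages end up here.
        return False
    # ElementTree writes namespaced tags as "{namespace}local"; keep "local".
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in FEED_ROOTS
```

A TinyURL pointing at static, non-RSS content (the test suggested above) would fail this check, so the xmlUrl and favicon would stay as they were.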
I don't expect any changes in Diablo to fix this (not a high priority). No idea for Fremantle ...
(In reply to comment #6) > No idea for Fremantle ... If you are so kind to provide a list of steps to reproduce I can give it a try. I'm a bit lost with the current description.
quim: 1. add a feed for http://tinyurl.com/57so4h 2. check the feed icon 3. open xterm 4. cat /home/user/.osso_rss_feed_reader/feedlist.opml ideally for this case the icon should be from andre's blog but more importantly, you should *not* see this: htmlUrl="http://blogs.gnome.org/aklapper" xmlUrl="http://tinyurl.com/57so4h" it should either have tinyurl in both places or blogs.gnome in both places. And whichever it has should determine the favicon.... That's actually only half of the bug [301] The other half is if the server sends [302], in which case it should retain the url you provided instead of the redirect. for 302, use http://timeless.justdave.net/maemo/andre-rss.pl In current testing w/ diablo, both give the same results. The right behavior for the 302 case is to retain the .pl url in both places. As for which favicon to show, dunno. For kicks, I'm trying to supply a favicon for the .pl file, however I have no opinion as to whether it should be shown :).
No special icon to be seen, just the normal orange RSS icon. About the rest, see for yourself: BusyBox v1.10.2 (Debian 3:1.10.2.legal-1osso16) built-in shell (ash) Enter 'help' for a list of built-in commands. ~ $ cat /home/user/.osso_rss_feed_reader/feedlist.opml <?xml version="1.0"?> <opml version="1.0"> <head> <title>Liferea Feed List Export</title> </head> <body> <outline text="BBC News | News Front Page | World Edition" title="BBC News | News Front Page | World Edition" description="BBC News | News Front Page | World Edition" type="rss" htmlUrl="http://news.bbc.co.uk/go/rss/-/2/hi/default.stm" xmlUrl="http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml" updateInterval="15" id="stgtyat" sortColumn="time"/> <outline text="BBC Sport | Sport Homepage | World Edition" title="BBC Sport | Sport Homepage | World Edition" description="BBC Sport | Sport Homepage | World Edition" type="rss" htmlUrl="http://news.bbc.co.uk/go/rss/-/sport2/hi/default.stm" xmlUrl="http://newsrss.bbc.co.uk/rss/sportonline_world_edition/front_page/rss.xml" updateInterval="15" id="puhblxn" sortColumn="time"/> <outline text="Internet Tablet News" title="Internet Tablet News" description="Internet Tablet News" type="rss" htmlUrl="http://nokia.com/n800" xmlUrl="http://tableteer.nokia.com/rss/internettabletnews.xml" updateInterval="-1" id="rgxfucg" sortColumn="time"/> <outline text="andre klapper's blog." title="andre klapper's blog." description="andre klapper's blog." type="rss" htmlUrl="http://blogs.gnome.org/aklapper" xmlUrl="http://tinyurl.com/57so4h" updateInterval="-1" id="jfxsgye" lastPollTime="1236949408" sortColumn="time"/> </body> </opml> ~ $
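Instead of reading the raw `cat` output, the stored URLs can be pulled out of feedlist.opml with a few lines of Python. This is only a convenience sketch: `feed_urls` is a hypothetical helper, and `SAMPLE_OPML` is a trimmed copy of the output above with most attributes dropped for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feedlist.opml content shown in this comment.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Liferea Feed List Export</title></head>
  <body>
    <outline title="andre klapper's blog." type="rss"
             htmlUrl="http://blogs.gnome.org/aklapper"
             xmlUrl="http://tinyurl.com/57so4h"/>
  </body>
</opml>"""


def feed_urls(opml_text: str):
    """Return (title, xmlUrl, htmlUrl) for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    return [
        (o.get("title"), o.get("xmlUrl"), o.get("htmlUrl"))
        for o in root.iter("outline")
        if o.get("xmlUrl")  # skip folders and other non-feed outlines
    ]


for title, xml_url, html_url in feed_urls(SAMPLE_OPML):
    print(f"{title}: xmlUrl={xml_url} htmlUrl={html_url}")
```

On the device the same function could be fed the contents of /home/user/.osso_rss_feed_reader/feedlist.opml, assuming a Python interpreter is installed there.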
Was this output useful?
Hey, was that useful? I think the RSS Feed Reader team is now investing time in bugfixing. Please help me try to reproduce this bug. I'm your robot. Just let me know what I need to do. Thanks!
moreinfo as per last comment
(In reply to comment #8) > but more importantly, you should *not* see this: > > htmlUrl="http://blogs.gnome.org/aklapper" xmlUrl="http://tinyurl.com/57so4h" > > it should either have tinyurl in both places or blogs.gnome in both places. Actually I think those two are independent. The htmlUrl value comes from the link element found in the RSS document (not from the HTTP response Location: header), and in any case the attribute is optional and could be omitted altogether[1]. So it looks correct, and AFAICT osso-rss-feed-reader doesn't use it for anything (at least in Diablo) anyway. In the case of a 301 response the xmlUrl could be rewritten, but this is "only" a SHOULD[2] so not strictly speaking a bug, and leaving it unmodified avoids the original issue. I guess rewriting the xmlUrl for 301 responses after successful validation of the payload would be the most technically correct thing to do, but as it is this could be considered FIXED sometime between Gregale and Diablo (the window between the bug being opened and comment 4). [1] <http://www.opml.org/spec2>: > Optional attributes: description, htmlUrl, language, title, version. These > attributes are useful when presenting a list of subscriptions to a user, > except for version, they are all derived from information in the feed itself. > > description is the top-level description element from the feed. htmlUrl is > the top-level link element. [2] <http://www.ietf.org/rfc/rfc2616.txt>: > 10.3.2 301 Moved Permanently > > The requested resource has been assigned a new permanent URI and any > future references to this resource SHOULD use one of the returned > URIs. Clients with link editing capabilities ought to automatically > re-link references to the Request-URI to one or more of the new > references returned by the server, where possible. This response is > cacheable unless indicated otherwise.
Yay. Bugzilla changes tinyurl URLs automatically. Hence: So this should be in both cases "http tinyurl com mbug935" Please add yourself the missing :// . / [From comment 8] > 1. add a feed for blogs.gnome.org/aklapp... > 2. check the feed icon > 3. open xterm > 4. cat /home/user/.osso_rss_feed_reader/feedlist.opml > > but more importantly, you should *not* see this: > htmlUrl="http://blogs.gnome.org/aklapper" xmlUrl="blogs.gnome.org/aklapp..." This does not happen when using the direct blogs.gnome.org address (NOT using tinyurl) in Fremantle. Both values start with "http://". Trying again in Fremantle by using http://tinyurl.com/mbug935 as feed URL: * tinyurl icon is used in RSS reader * htmlUrl="http://blogs.gnome.org/aklapper" * xmlUrl="http://tinyurl.com/mbug935" [From comment 5] > Had TinyURL sent a 302 Moved Temporarily response, the xmlUrl value and > favicon you indicate would be correct. > TinyURL is sending a 301 Moved Permanently response. > From an RFC purist perspective, your xmlUrl value should be > "http://blogs.gnome.org/aklapper/feed/". > Having a tinyurl.com xmlUrl after a 301 response seems a bug. > for 302, use http://timeless.justdave.net/maemo/andre-rss.pl > In current testing w/ diablo, both give the same results. > The right behavior for the 302 case is to retain the .pl url in both places. cat /home/user/.osso_rss_feed_reader/feedlist.opml here is: * htmlUrl="http://blogs.gnome.org/aklapper" * xmlUrl="http://timeless.justdave.net/maemo/andre-rss.pl". So this is still valid in Fremantle.
I don't think it's even valid in Diablo (see comment 13) - since xmlUrl is not rewritten the user's feeds will not be destroyed by a misbehaving hotspot.