Channel: WebException when loading rss feed - Stack Overflow

WebException when loading rss feed


I am attempting to load a page I've received from an RSS feed and I receive the following WebException:

Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.

with an inner exception:

Invalid URI: The hostname could not be parsed.

I wrote code that attempts to load the URL via an HttpWebRequest. Following some suggestions I received, when the HttpWebRequest fails I set AllowAutoRedirect to false and manually loop through the redirects until I find out what ultimately fails. Here's the code I'm using; please forgive the gratuitous Console.Write/WriteLine calls:

    Uri url = new Uri(val);
    bool result = true;
    System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(url);
    string source = String.Empty;
    Uri responseURI;
    try
    {
        using (System.Net.WebResponse webResponse = req.GetResponse())
        {
            using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
            {
                responseURI = httpWebResponse.ResponseUri;
                StreamReader reader;
                if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
                {
                    reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
                }
                else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
                {
                    reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
                }
                else
                {
                    reader = new StreamReader(httpWebResponse.GetResponseStream());
                }
                source = reader.ReadToEnd();
                reader.Close();
            }
        }
        req.Abort();
        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(source);
        result = true;
    }
    catch (ArgumentException ae)
    {
        Console.WriteLine(url +"\n--\n"+ ae.Message);
        result = false;
    }
    catch (WebException we)
    {
        Console.WriteLine(url +"\n--\n"+ we.Message);
        result = false;

        string urlValue = url.ToString();
        try
        {
            bool cont = true;
            int count = 0;
            do
            {
                req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(urlValue);
                req.Headers.Add("Accept-Language", "en-us,en;q=0.5");
                req.AllowAutoRedirect = false;
                using (System.Net.WebResponse webResponse = req.GetResponse())
                {
                    using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
                    {
                        responseURI = httpWebResponse.ResponseUri;
                        StreamReader reader;
                        if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
                        {
                            reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
                        }
                        else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
                        {
                            reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
                        }
                        else
                        {
                            reader = new StreamReader(httpWebResponse.GetResponseStream());
                        }
                        source = reader.ReadToEnd();
                        if (string.IsNullOrEmpty(source))
                        {
                            urlValue = httpWebResponse.Headers["Location"].ToString();
                            count++;
                            reader.Close();
                        }
                        else
                        {
                            cont = false;
                        }
                    }
                }
            } while (cont);
        }
        catch (UriFormatException uriEx)
        {
            Console.WriteLine(urlValue +"\n--\n"+ uriEx.Message +"\r\n");
            result = false;
        }
        catch (WebException innerWE)
        {
            Console.WriteLine(urlValue +"\n--\n"+ innerWE.Message+"\r\n");
            result = false;
        }
    }
    if (result)
        Console.WriteLine("testing successful");
    else
        Console.WriteLine("testing unsuccessful");

Since this is currently just test code, I hardcode val as http://rss.nytimes.com/c/34625/f/642557/s/3d072012/sc/38/l/0Lartsbeat0Bblogs0Bnytimes0N0C20A140C0A70C30A0Csarah0Ekane0Eplay0Eamong0Eofferings0Eat0Est0Eanns0Ewarehouse0C0Dpartner0Frss0Gemc0Frss/story01.htm

The final URL, the one that gives the UriFormatException, is: http:////www-nc.nytimes.com/2014/07/30/sarah-kane-play-among-offerings-at-st-anns-warehouse/?=_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&partner=rss&emc=rss&_r=6&

Now I'm not sure if I'm missing something or if I'm doing the looping wrong, but if I take val and put it into a browser the page loads fine, and if I take the URL that causes the exception and put it in a browser I am taken to an account login page for nytimes.

I have a number of these RSS feed URLs that result in this problem. I also have a large number of RSS feed URLs that load without any problem. Let me know if any more information is needed to help resolve this. Any help would be greatly appreciated.

Could it be that I need to have some sort of cookie capability enabled?
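
To make the cookie idea concrete, here is a rough sketch of what I have in mind, not something I have confirmed fixes the problem. It follows redirects by hand as above, but shares one CookieContainer across all hops and decides whether to continue from the status code and Location header rather than from an empty body. The class name RedirectProbe, the method Resolve, and the maxHops limit are my own placeholders.

```csharp
using System;
using System.Net;

static class RedirectProbe
{
    // Follows redirects manually while sharing one CookieContainer,
    // so cookies set by one hop are sent with the next request.
    // maxHops is an arbitrary safety limit chosen for this sketch.
    public static Uri Resolve(Uri start, int maxHops = 10)
    {
        var cookies = new CookieContainer();
        Uri current = start;

        for (int hop = 0; hop < maxHops; hop++)
        {
            var req = (HttpWebRequest)WebRequest.Create(current);
            req.AllowAutoRedirect = false;   // inspect each hop ourselves
            req.CookieContainer = cookies;   // response cookies are stored here

            using (var resp = (HttpWebResponse)req.GetResponse())
            {
                int status = (int)resp.StatusCode;
                string location = resp.Headers["Location"];

                // Anything outside 3xx, or a 3xx without a Location header,
                // is treated as the end of the chain.
                if (status < 300 || status > 399 || string.IsNullOrEmpty(location))
                    return current;

                Console.WriteLine(status + " -> " + location);

                // Resolves relative Location values against the current URL.
                // A malformed value still throws UriFormatException here.
                current = new Uri(current, location);
            }
        }

        throw new InvalidOperationException("Gave up after " + maxHops + " redirects.");
    }
}
```

Calling RedirectProbe.Resolve(new Uri(val)) would print each hop. If the chain ends at the article instead of growing the repeated _php=true&_type=blogs query string, that would suggest the missing cookies were the cause.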

