Hashemian Blog
Web Tools, Financial Markets, Technology
Sunday, June 29, 2008
RSS/ATOM SyndicationFeed in .NET 3.5 Framework
Among the new classes introduced in version 3.5 of the .NET Framework Class Library (FCL) were the syndication-related classes. While other features such as LINQ, extension methods, and lambda expressions have been grabbing most of the attention, the new syndication classes also deserve a nod. Part of the System.ServiceModel.Syndication namespace, the classes offer a variety of methods to easily generate or consume syndication feeds in RSS 2.0 or ATOM 1.0 formats.
It's not that reading or writing feeds was exceedingly difficult before. FCL comes with a number of XML classes that facilitate working with XML data, which all syndication feeds are built on. No doubt there is plenty of sample code out there that made the task as easy as copy, paste and tweak. But now FCL comes with its own native classes to handle feeds, with advanced settings, IntelliSense support, and the potential for extension.
To demonstrate ease of use, here's some sample code that pulls in a sample feed from Google, then scrapes and saves the content of each link to a file:

    // Namespaces needed: System.Linq, System.Net, System.ServiceModel.Syndication,
    // System.Text.RegularExpressions, System.Xml
    var wc = new WebClient();
    using (var rss = XmlReader.Create(
        "http://finance.google.com/finance?morenews=10&q=NASDAQ:INTC&output=rss"))
    {
        var feed = SyndicationFeed.Load(rss);
        foreach (var x in feed.Items)
        {
            // Take the last link of each item; the regex keeps only the final path segment as the file name
            var uri = x.Links.Last().Uri;
            wc.DownloadFile(uri, @"c:\rss\" + Regex.Replace(uri.LocalPath, @"^.*/", ""));
        }
    }

That was easy, eh? Just remember to add System.ServiceModel.Web.dll as a reference to your project. Happy syndicating.
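For readers outside .NET, the same idea can be sketched with nothing but Python's standard library. This is only a rough illustration, not an equivalent of SyndicationFeed: it handles plain RSS 2.0, reads the first link of each item (the C# sample takes the last), and works on an inline sample feed so it needs no network. The names SAMPLE_RSS, item_links and local_file_name are made up for this example.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical inline feed, used instead of a live URL
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>First</title><link>http://www.example.com/news/first.html</link></item>
    <item><title>Second</title><link>http://www.example.com/news/second.html</link></item>
  </channel>
</rss>"""


def item_links(rss_text):
    """Return the <link> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("link") for item in root.iter("item")]


def local_file_name(url):
    """Keep only the last path segment, like the regex in the C# sample."""
    return posixpath.basename(urlparse(url).path)


if __name__ == "__main__":
    for link in item_links(SAMPLE_RSS):
        print(link, "->", local_file_name(link))
```

Downloading each link would then be a matter of passing the link and the derived file name to something like urllib.request.urlretrieve.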
rss, atom, microsoft, net, asp.net, xml
Labels: programming
// posted by rh
Friday, January 04, 2008
Pain and Program Exceptions
Happy new year. Not so much for me. The new year was marked with a wicked lower back pain. An irritating bulging disk that gets inflamed every now and then and pinches the nerve, radiating pain everywhere. Guess that's just nature's way of manifesting my age to me. Yes I know, I'm not in my twenties anymore. Haven't been there for quite some time and the longer I live the further away I get.
So what to do? Just add the back pain to the foot pain, the hamstring pain, and the knee pain, and I have a nice variety of aches and pains.
I suppose pain is the body's exception system. If you're a programmer, you know what I'm talking about. An exception in a program is raised when something totally unexpected happens in the running code and needs immediate attention. Good coding practice dictates that programmers anticipate and compensate for all possible errors before their code is hit by an exception. But sometimes there's no avoiding it.
- A number gets divided by zero - an exception is raised.
- A missing file is referenced - an exception is raised.
- A piece of data doesn't fit inside a database table column - an exception is raised.
- An XML stream is missing a tag - an exception is raised.
Graceful code is supposed to catch the exception, alert the user, and halt. After all something catastrophic must have happened and continuing the program could mean entering an invalid and unknown state. I admit, I've broken that rule a few times by catching an exception, logging the issue, and continuing as if nothing had happened. Why should proper programs get all the running privileges?
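The two attitudes, halting versus logging and carrying on, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names and the choice of a divide-by-zero error are invented for the example.

```python
import logging

logger = logging.getLogger("exceptions_demo")


def average(values):
    """Raises ZeroDivisionError when the list is empty."""
    return sum(values) / len(values)


def average_or_halt(values):
    """The 'graceful' rule: catch, report, and stop the program."""
    try:
        return average(values)
    except ZeroDivisionError:
        logger.error("cannot average an empty list; halting")
        raise SystemExit(1)


def average_or_continue(values, default=0.0):
    """The rule-breaker: catch, log, and continue with a fallback value."""
    try:
        return average(values)
    except ZeroDivisionError:
        logger.warning("cannot average an empty list; using %s", default)
        return default


if __name__ == "__main__":
    print(average_or_continue([]))   # falls back to the default
    print(average_or_halt([]))       # logs the error and exits
```

The second version keeps running, but the caller can no longer tell a real average of zero from a swallowed error, which is exactly the unknown state the rule warns about.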
Body pain works the same way. It's a signal that something's gone wrong and needs attention. One must correct the problem before continuing with normal activities. That's exactly what I had intended to do. Give my back a few days of rest before getting back to running again. Except that last night I saw a jogger in the freezing temperature and my jealousy meter went off the scale.
So tonight I resolved to go for a walk, only to naturally speed up to a jogging pace after a few steps. And thus I entered into the invalid state of pain, as in, not knowing how my back will feel tomorrow. Oh well, why should only healthy, pain-free people have the privilege of running?
jogging, running, exceptions, programming, exception handling
Labels: jogging, programming, running
// posted by rh
Sunday, August 19, 2007
IP address and Host name Scripts
As some of you know, this site contains several utilities in the tools section. One of these tools is JavaScript Visitor IP Address and Host Name. It's a simple JavaScript block that can be placed inside any web page and it displays or prints the visitor's IP address and host name.
As with the rest of the tools, this one was born out of necessity and I decided to share it with everyone. But some people are not comfortable putting my scripts on their sites. That's cool and I don’t blame them; they don’t know me. So here I am going to explain how to display or print the user's IP address and host name using a number of server-side technologies.
The keyword here is "server-side". That's right, there is no way you can glean that information from client-side JavaScript. Even my JavaScript utility uses server-side calls to obtain the data and then packages it up in JavaScript format and streams it back. If your site supports server-side scripting, then chances are one of the following will do the job for you.
Perl:
    print "IP: $ENV{'REMOTE_ADDR'}<br>Host: $ENV{'REMOTE_HOST'}";

SSI:
    IP: <!--#echo var="REMOTE_ADDR" --><br>Host: <!--#echo var="REMOTE_HOST" -->

PHP:
    <?= "IP: {$_SERVER['REMOTE_ADDR']}<br>Host: {$_SERVER['REMOTE_HOST']}" ?>

ASP:
    <%= "IP: " + Request.ServerVariables("REMOTE_ADDR") + "<br>Host: " + Request.ServerVariables("REMOTE_HOST") %>

ASP.NET:
    <%= "IP: " + Request.UserHostAddress + "<br>Host: " + Request.UserHostName %>

Python:
    import os
    print("IP: " + os.environ["REMOTE_ADDR"] + "<br>Host: " + os.environ["REMOTE_HOST"])

Ruby:
    print "IP: " + ENV["REMOTE_ADDR"] + "<br>Host: " + ENV["REMOTE_HOST"]

JSP:
    <%= "IP: " + request.getRemoteAddr() + "<br>Host: " + request.getRemoteHost() %>

Java Servlet:
    out.println("IP: " + request.getRemoteAddr() + "<br>Host: " + request.getRemoteHost());

A common issue with the above calls is that in many cases host names may be returned as IP addresses or nothing at all. In some cases that is because no reverse record for a client's IP address is available. But if this issue occurs all the time, it could mean that reverse resolution is turned off. This is generally done for performance reasons, to save on server resources. You can ask your hosting company to turn that service on, or you could configure reverse lookup yourself if you have access to the server configuration files. Here’s how reverse lookup is switched on for Internet Information Server (IIS) and Apache.

IIS (execute at command line):
    adsutil set w3svc/EnableReverseDNS TRUE

Apache (edit httpd.conf file):
    HostnameLookups On

By the way, if you ever wanted to run a simple reverse lookup on an IP address, here's a Reverse Whois tool for that job.

Before I end this post, here's one more piece of information for those who might wonder where parameters like REMOTE_ADDR or REMOTE_HOST come from. Those are part of a collection of parameters known as environment variables that web servers are expected to make available to the scripts. Want the gory details? Read here. There you have it.
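When the server leaves REMOTE_HOST empty or just repeats the IP address, a script can also attempt the reverse lookup itself. Here is a minimal Python sketch of that fallback, assuming the CGI-style variables arrive in a dictionary; the function name visitor_host and the injectable resolver parameter are my own inventions for the example (the resolver defaults to the standard socket.gethostbyaddr).

```python
import socket


def visitor_host(environ, resolver=socket.gethostbyaddr):
    """Return the visitor's host name, falling back to a reverse DNS lookup.

    environ  -- mapping holding REMOTE_ADDR and possibly REMOTE_HOST
    resolver -- function taking an IP and returning (hostname, aliases, addresses)
    """
    ip = environ.get("REMOTE_ADDR", "")
    host = environ.get("REMOTE_HOST", "")

    # The server already resolved the name: use it as is.
    if host and host != ip:
        return host
    if not ip:
        return ""

    try:
        return resolver(ip)[0]
    except OSError:
        # No reverse record (or lookup failure): fall back to the bare IP.
        return ip


if __name__ == "__main__":
    import os
    print("Host: " + visitor_host(os.environ))
```

Keep in mind that each lookup costs a DNS round trip per request, which is the same performance concern that leads hosts to switch the feature off in the first place.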
If you can put any of the above scripts to use instead of using my JavaScript utility, I'd appreciate the bandwidth savings. And to those who continue to use my utilities, your trust and confidence are appreciated.
javascript, server-side scripting, ip address, hostname, perl, ssi, asp, ruby, python, jsp, servlet
Labels: ip address, javascript, programming
// posted by rh
Thursday, June 07, 2007
HTTP Authorization and .NET WebRequest, WebClient Classes
One of the more useful, yet simple, classes of .NET is System.Net.WebClient. It basically simulates a simple browser to interact with web pages within your code. You can use it to GET pages or POST data to pages over http or https (SSL). The page response can be saved into a byte array, a string or a file. WebClient can even be used to pass authentication values to pages that require it, you know, those pages that pop up a user and password window before letting you in. For that, you’d supply the values to the Credentials property of WebClient and fire away.
Recently I was trying to access a protected page using WebClient. The code was pretty straightforward:

    WebClient wc = new WebClient();
    wc.Credentials = new NetworkCredential("user", "pass");
    string a = wc.DownloadString("http://www.example.com");

Pretty simple, eh? I've done this a million times, but in this case (the actual site shall remain nameless) it was throwing an exception. After some investigation, I noticed that the page was returning a 404 code (page not found), which triggered the exception. I also discovered that the call wasn't sending the appropriate Authorization header to the page. The site's documentation was clear about accessing the page using a basic authorization header, but there was no getting past the exception. What was going on here?
After some fruitless troubleshooting, I decided to forgo the Credentials property and manually craft the Authorization header. To do that I wrote the following code:

    WebClient wc = new WebClient();
    wc.Headers.Add("Authorization", "Basic " +
        Convert.ToBase64String(Encoding.ASCII.GetBytes("user:pass")));
    string a = wc.DownloadString("http://www.example.com");

This time, to my delight, the page obliged and the string variable "a" received the page's content. Not being satisfied with merely solving the problem using a different route, I decided to dig in and find out why the original (more proper) code was failing. First I discovered that the page was designed to return a 404 code rather than the customary 401 (Unauthorized) when the credentials were missing.
I'm not sure what the RFC's position on this is, but according to MSDN documentation, when a protected URL receives no authorization header from a client, it should return a 401 code, signaling to the client that authentication is required. The client should then provide the authorization header with each access, satisfying the URL's demand. The WebClient class with its Credentials property is designed to do just that, but not in a straightforward manner.
Under the hood, WebClient constructs an HttpWebRequest object and sends a plain request to the specified page. Upon receiving a 401 code, it crafts the authorization header using the Credentials property and hits the page again. That's two round trips for every request. Worse yet, it does that for every subsequent request. I fail to see why WebClient insists on not sending the authorization header in the first place. After all, if the coder specifies the Credentials property, he must already know that the page requires authorization and WebClient should just obey and send in the header without the fuss.
In this case, the site's response aggravated matters by sending back a 404 (rather than a 401) after the first request, sending the whole process into a tailspin and causing an exception to be raised.
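The same workaround applies outside .NET. As a hedged illustration, here is a Python sketch that builds the Basic Authorization header by hand and attaches it to the very first request, so nothing depends on the server answering with a 401 first. The helper names are mine, and the URL and credentials are the same placeholders used above. (Python's own urllib.request.HTTPBasicAuthHandler, like WebClient, only responds after a 401 challenge.)

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value of a Basic Authorization header for user:password."""
    token = base64.b64encode(f"{user}:{password}".encode("ascii")).decode("ascii")
    return "Basic " + token


def build_preauthenticated_request(url, user, password):
    """Create a request that carries the Authorization header up front."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", basic_auth_header(user, password))
    return req


if __name__ == "__main__":
    req = build_preauthenticated_request("http://www.example.com", "user", "pass")
    print(req.get_header("Authorization"))  # prints "Basic dXNlcjpwYXNz"
    # Actually sending it would be: urllib.request.urlopen(req).read()
```

Sending credentials unprompted does mean they go out with every request to that URL, so this only makes sense when you already know the page is protected, ideally over https.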
If you use the WebClient class in your .NET code to access protected URLs, watch out for this little stumper. It wasted quite a bit of my time. Maybe my loss will be your gain.
.NET, dotnet, webclient, webrequest, httpwebrequest, authorization header, 404 code, 401 code, http codes
Labels: programming, web
// posted by rh
Tuesday, April 10, 2007
DRM (Digital Rights Management) and Amazon Unbox, Part 2
I have to admit, when it comes to DRM (Digital Rights Management) my knowledge is pretty dismal. I always knew that it was meant to prevent piracy, but not being a fanatic of music and movies, DRM was never on my priority list of things to learn. But when I tried to run the Star Trek WMV file I had downloaded from Amazon Unbox on the new PC using Windows Media Player, I was greeted with a dialog box informing me that I needed a license to play the file. When I clicked on a link to obtain the license, the dialog box simply opened the Amazon Unbox home page and then just left me hanging.
Since now I knew that Unbox movies were DRM protected, I decided to learn some more about what DRM exactly is and how it works. A quick check with Wikipedia revealed that DRM is really an umbrella term referring to various technologies to protect copyrights. In this case Unbox was using the Microsoft flavor. That means the movie is encrypted at source only to be unlocked by a separate key that is referred to as the license. The key is basically a file comprising various restrictions such as the number of permitted viewings, duration of validity, and other data. It is stored separately from the movie file under the user's profile and if everything is in order, Windows Media Player applies it to the movie (or music) file to decrypt and play.
Another notable feature of the DRM key is that it can be tied to the machine that it is downloaded to. That makes the movie viewable only on the exact machine that it was configured for, preventing users from simply copying the file to another PC (as I had done) and playing the movie there. I'm not sure what parameters are used to construct this exclusivity restriction, but I assume a number of items such as the BIOS, CPU, and the network card are polled to create a unique identifier.
Now I knew I had to obtain a new license to watch the Star Trek episode on this new PC and that meant installing the Unbox software on the new machine. After downloading and installing the software, I imported the WMV file into the Unbox program and I had the file unlocked. The film was playing smoother now with fewer jitters, but since this PC didn't have any speakers and the monitor quality was poor, I decided to use remote desktop from the original PC and watch the film that way. No luck, remote desktop just doesn't have the repaint power to handle a movie and I was back in the same position as before, with lots of interruptions and jumps.
I knew it was time to replace that original PC with a more modern machine. I had been wanting to do this for some time anyway. I had just the PC and now was as good a time as any to upgrade. So I dismantled the old PC, scavenged as many parts as I could and tossed out its shell. I always dread upgrading PCs. Being somewhat picky about configuring the new machine exactly as I want, it takes me days to get a new box to a level I am comfortable with. But in this case, once I had the new PC running at a tolerable level, I decided to give the film a try.
Having gone through the experience before, I copied the WMV file to the new PC, installed the Unbox program and attempted to import the file to acquire a new license for it. But that's when I hit yet another surprise.
I'll conclude the saga in part 3.
DRM (Digital Rights Management) and Amazon Unbox, Part 1
DRM (Digital Rights Management) and Amazon Unbox, Part 3
amazon, amazon unbox, drm, digital rights, drm license, windows media player, wmv files, encryption
Labels: amazon, drm, programming
// posted by rh
Wednesday, January 10, 2007
HTML Forms, Part 2
Click here to read Part 1
There are really many steps that a browser goes through to display a web page or post some data to it. For the purposes of this discussion, a simplified process consists of a request and a response. The browser makes a request to a server for a particular page, and the server returns the data to the browser in the form of a response.
There are a few types of requests a browser can issue using commands that are generally known as 'verbs'. The most widely used verb is GET. Whenever you visit a web page, your browser issues a GET command to the server along with some other parameters and the server obliges by returning the data for the particular page.
In fact you are reading this very page because your browser made a GET request to www.hashemian.com and specified this page. The server then returned this page to your browser in a raw format. Your browser then rendered it as you see now.
But what happens when you submit a form to a page? In that case the page generally contains a FORM tag like this:

    <form method="POST" action="some_page">
      some form elements such as <INPUT> here
      a submit <BUTTON> here
    </form>

Notice the POST verb specified as an attribute of the form tag. When you click on the Submit button, the data you entered on the form is sent to some_page using the POST verb. The POST verb is similar to the GET verb, but it carries some additional data. The browser contacts some_page and issues the POST command, and along with it specifies two headers, Content-Length and Content-Type, followed by the form data itself.
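To make that concrete, here is a small Python sketch of what a browser puts together when a form is submitted: the encoded body plus the Content-Type and Content-Length headers. The field names and values are invented for the example, and the helper build_form_post is not part of any library.

```python
from urllib.parse import urlencode


def build_form_post(fields):
    """Return (headers, body) for a form submitted with method="POST"."""
    # urlencode joins name=value pairs with '&', turns spaces into '+'
    # and percent-escapes other special characters.
    body = urlencode(fields).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),  # length of the body in bytes
    }
    return headers, body


if __name__ == "__main__":
    headers, body = build_form_post({"name": "John Smith", "city": "New York"})
    for key, value in headers.items():
        print(key + ": " + value)
    print()
    print(body.decode("ascii"))
```

For the two sample fields the body comes out as name=John+Smith&city=New+York, and Content-Length is simply the number of bytes in that string.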
Content-Length is the length, in bytes, of the data that you have submitted to the page, while Content-Type specifies the format of the data. The data format, known as a MIME type, is generally specified as application/x-www-form-urlencoded. That signals to the server how the data was encoded, so the server can figure out how to decode and use it. We'll delve into more details on POST in part 3.
html, forms, get, post, url
Labels: programming, web
// posted by rh
