Tip: HTTP Content Disposition

When writing an HTTP download engine, you will sometimes encounter URLs that do not contain a file name. By 'URLs', I mean both the source URL (the original URL) and the response URL (also known as the redirected URL), which may or may not be the same as the source URL. In other words, the response URL is the URL that actually answers your HTTP request.

If a URL does not contain the file name, you can look for the file name in the Content-Disposition header instead. This header is not part of the HTTP 1.1 standard itself, so not every web site sends it, but it is widely implemented.

Unfortunately, the .NET 2.0 HttpWebResponse class does not expose the file name from this header directly, but you can parse it yourself. Here are the steps, using the .NET 2.0 HttpWebResponse class:

  • Call WebRequest's static Create method with the source URL as its only parameter; for an HTTP URL, it returns an HttpWebRequest object.
  • Assign the web proxy from WebRequest's GetSystemWebProxy method or WebRequest's DefaultWebProxy property to the HttpWebRequest object's Proxy property. This step is optional, but if you skip it, users who can reach the Internet only through a web proxy may not be able to connect.
  • Call the HttpWebRequest object's GetResponse method to get an HttpWebResponse object.
  • Check whether a "content-disposition" entry exists in the HttpWebResponse object's Headers property.
  • If it exists, search its key/value pairs for the "filename" key, because the file name is all we are interested in.
  • The key/value pairs are separated by semicolons, and each key is separated from its value by an equals sign, for example filename="Hello.jpg".
  • Finally, trim away the double quotes around the file name, if there are any.
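
// Requires: using System; using System.Collections.Generic; using System.Net;

// Class members used by the methods below. Their types are inferred from
// how the listing uses them.
private Dictionary<string, string> m_dictHeaders = new Dictionary<string, string>();
private string m_szContentDisposition = String.Empty;
private string m_szCDFilename = String.Empty;

// Sends a request to strSrc, stores the final (possibly redirected) URL in
// strDest, and parses the Content-Disposition header if the server sent one.
public bool GetResponseUrl(
    System.String strSrc,
    ref System.String strDest)
{
    try
    {
        HttpWebRequest request = WebRequest.Create(strSrc) as HttpWebRequest;

        if (request == null)
            return false;

        // Ask the system proxy settings which proxy, if any, serves this URL.
        // When no proxy applies, GetProxy is expected to return the URL itself.
        System.Net.IWebProxy iwpxy = WebRequest.GetSystemWebProxy();
        System.Uri url = request.RequestUri;
        System.Uri urlProxy = iwpxy.GetProxy(url);
        WebProxy wpxy = new WebProxy();
        if (url != urlProxy)
        {
            wpxy.Credentials = iwpxy.Credentials;
            wpxy.Address = urlProxy;
        }

        request.Proxy = wpxy;

        // The using block closes the response even if an exception is thrown.
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response == null)
                return false;

            strDest = response.ResponseUri.ToString();

            for (int i = 0; i < response.Headers.Count; ++i)
            {
                System.Diagnostics.Debug.Print("{0} : {1}", response.Headers.Keys[i], response.Headers[i]);
                m_dictHeaders[response.Headers.Keys[i]] = response.Headers[i];
            }

            // WebHeaderCollection looks up header names case-insensitively, so one
            // query covers both "content-disposition" and "Content-Disposition".
            m_szContentDisposition = response.Headers["Content-Disposition"];

            if (!String.IsNullOrEmpty(m_szContentDisposition))
                ParseContentDisposition(m_szContentDisposition);
        }
    }
    catch (System.Net.WebException e)
    {
        // A failed request may still carry a response with the final URL.
        if (e.Response != null)
            strDest = e.Response.ResponseUri.ToString();
        return false;
    }

    return true;
}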


// Extracts the file name from a Content-Disposition header value into
// m_szCDFilename. Returns false if no file name was found.
public bool ParseContentDisposition(System.String str)
{
    if (String.IsNullOrEmpty(str))
        return false;

    System.Diagnostics.Debug.Print("Content Disposition");
    System.Diagnostics.Debug.Print(str);

    // Look for the "filename=" key, starting at the beginning of the header.
    String strFilenameKey = "filename";

    int pos = 0;
    m_szCDFilename = System.String.Empty;
    return ExtractContentDisposition(pos, str, strFilenameKey + "=", ref m_szCDFilename);
}

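// Finds strKey (for example "filename=") in the header, starting at pos, and
// appends the text up to the next semicolon to strValue.
public bool ExtractContentDisposition(
    int pos,
    System.String strContentDisposition,
    System.String strKey,
    ref System.String strValue)
{
    // Search a lower-cased copy so the key matches regardless of case, but
    // copy the value from the original string to preserve its case.
    System.String strContentDisposition2 = strContentDisposition.ToLowerInvariant();
    pos = strContentDisposition2.IndexOf(strKey, pos, System.StringComparison.Ordinal);

    // IndexOf returns -1 when the key is absent; stop here instead of
    // reading from a meaningless offset.
    if (pos < 0)
        return false;

    for (int i = pos + strKey.Length; i < strContentDisposition.Length; ++i)
    {
        if (strContentDisposition[i] == ';')
            break;
        else
            strValue += strContentDisposition[i];
    }

    // Strip surrounding quotes and white space before testing for an empty
    // value, so that filename="" is reported as "not found".
    String strDelimit = "\"\' \t";
    strValue = strValue.Trim(strDelimit.ToCharArray());

    if (String.IsNullOrEmpty(strValue))
    {
        return false;
    }

    return true;
}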
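
The steps above describe the header as key/value pairs separated by semicolons. As an alternative to scanning character by character, the following is a minimal sketch that splits the header on semicolons and collects the pairs in a dictionary. The class name ContentDispositionSketch and the method ParseParameters are hypothetical names chosen for this illustration, and the sketch assumes that no quoted value contains a semicolon.

```csharp
using System;
using System.Collections.Generic;

public static class ContentDispositionSketch
{
    // Hypothetical helper: splits a Content-Disposition value into its
    // parameters. Keys are compared case-insensitively. The leading
    // disposition type (for example "attachment") has no '=' and is skipped.
    // Assumption: no quoted value contains a ';'.
    public static Dictionary<string, string> ParseParameters(string header)
    {
        Dictionary<string, string> result =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        if (String.IsNullOrEmpty(header))
            return result;

        foreach (string part in header.Split(';'))
        {
            int eq = part.IndexOf('=');
            if (eq < 0)
                continue;

            string key = part.Substring(0, eq).Trim();
            // Remove white space first, then any surrounding quotes.
            string value = part.Substring(eq + 1).Trim().Trim('"', '\'');
            result[key] = value;
        }

        return result;
    }

    public static void Main()
    {
        Dictionary<string, string> parameters =
            ParseParameters("attachment; filename=\"Hello.jpg\"");

        string name;
        if (parameters.TryGetValue("filename", out name))
            Console.WriteLine(name);   // prints Hello.jpg
    }
}
```

Collecting every pair makes it easy to read other parameters later, at the cost of a little more work than searching for "filename=" alone.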

Note: If the file name is in the URL (that is, it is the last segment of the URL), the server most likely will not send a Content-Disposition header for you to read the file name from, so the 'CD Filename' (m_szCDFilename) will be empty.
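
For that case, a download engine can fall back to the file name in the response URL. The following is a minimal sketch of such a fallback, not part of the original listing. The class name FileNameFallback and the method ChooseFileName are hypothetical, and the sketch assumes the file name is the last segment of the URL path.

```csharp
using System;

public static class FileNameFallback
{
    // Hypothetical helper: prefer the Content-Disposition file name and
    // otherwise fall back to the last segment of the response URL's path.
    public static string ChooseFileName(string cdFileName, string responseUrl)
    {
        if (!String.IsNullOrEmpty(cdFileName))
            return cdFileName;

        Uri uri;
        if (!Uri.TryCreate(responseUrl, UriKind.Absolute, out uri))
            return String.Empty;

        // AbsolutePath excludes the query string, so "?id=1" is never
        // mistaken for part of the file name. Decode "%20" and similar.
        string path = Uri.UnescapeDataString(uri.AbsolutePath);
        int slash = path.LastIndexOf('/');
        return slash >= 0 ? path.Substring(slash + 1) : path;
    }

    public static void Main()
    {
        // No Content-Disposition file name: take it from the URL.
        Console.WriteLine(ChooseFileName("", "http://www.example.com/files/Hello.jpg?size=large"));

        // A Content-Disposition file name wins over the URL.
        Console.WriteLine(ChooseFileName("Report.pdf", "http://www.example.com/download.php?id=7"));
    }
}
```

The result can still be empty, for example when the URL path ends in a slash, so the caller should be prepared to supply a default name.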




About the Author

Wong Shao Voon

I guess I'll write here what I do in my free time rather than list the skills I currently possess. I believe the things I do in my free time say more about me.

When I am not working, I like to watch Japanese anime. I am also writing a movie script, hoping to see my own movie on the big screen one day.

I like to jog because it makes me feel good, having done something meaningful in the morning before the day starts.

I also write articles for CodeGuru; I have a few ideas to write about but never get around to writing them because of a hectic schedule.

Comments

  • Problems with URL download

    Posted by jonxvel on 01/15/2013 12:20pm

    As you write, I have this kind of problem downloading a file via VB6. I have this link or a similar one: https://www.inforis.org/nuevo_ris/download_file.php?token=54cf5c32931c08ee9c03cd01ab5ea156&id=7394 I can download it using a simple Inet download function, but I can't get the filename. The server gives me only these headers for the file:

    HTTP/1.1 200 OK
    Date: Tue, 15 Jan 2013 19:14:23 GMT
    Server: Apache/2.2.16 (Debian)
    Last-Modified: Wed, 09 Jan 2013 22:59:15 GMT
    Etag: "400ae-1bd39a-4d2e302e7d6c0"
    Accept-Ranges: bytes
    Content-Length: 1823642
    Keep-Alive: timeout=15, max=99
    Connection: Keep-Alive
    Content-Type: application/pgp-signature

    But not the name. I even used the Firebug Firefox extension to see the full headers, and it appears there, but I can't get it. What am I doing wrong? Thanks,

  • Very Useful

    Posted by exu666 on 02/20/2010 05:56pm

    Hi Wong, I was having a bad time trying to figure out a way to know the filename beforehand and then use the WebClient (I'm just a poor DBA) ;) Thanks a lot, my unacquainted friend! Z

  • download file

    Posted by amigo* on 02/08/2010 07:11am

    Could you explain? Once we've got the file name, how do we download the file via HttpWebResponse?

    • Hi amigo

      Posted by CBasicNet on 02/08/2010 08:39am

      You can find example download code in another article of mine: http://www.codeguru.com/csharp/.net/net_general/internet/article.php/c16073/
