Saturday, September 10, 2011

Scaling images for web–part 2

After reading the last part, looking at Image Transformations page in Open Waves documentation section, we know how to transform an image. All we need is an input stream and an output stream.
Where do we get an input stream from? If we have a virtual path of an image, we can try something along the following lines:

public Stream OpenImageStream(string virtualPath)
{
   return File.OpenRead(HostingEnvironment.MapPath(virtualPath));
}

But this is not how ASP.NET does it. A very useful feature introduced in ASP.NET 2.0 is an abstraction layer on top of a file system – VirtualPathProvider. Ok, so it makes sense to use it to access files. Our code to open image stream can look like this:

public Stream OpenImageStream(string virtualPath)
{
  var file = HostingEnvironment.VirtualPathProvider.GetFile(virtualPath);
  return file.Open();
}

What was wrong with the first approach? If you develop EPiServer sites you know that the file pointed to by a virtual path does not necessary is on the local disk. With projects like VPP for Amazon S3, the file might not even be in our network. One problem with abstractions is that sometimes they hide just a little bit too much. In our case, if we had a file on a local disk, we could for example check its last modified date to see if we need to transform the image or can serve a cached version. Fortunately, ASP.NET developers have also noticed this need and provided a way for implementers of VPPs to notify clients of file changes. That’s why a VirtualPathProvider class has the following methods:

public virtual string GetFileHash(
    string virtualPath, 
    IEnumerable virtualPathDependencies)


public virtual CacheDependency GetCacheDependency(
    string virtualPath, 
    IEnumerable virtualPathDependencies, 
    DateTime utcStart)

The first one should return a stable hash of a file – if the file hash changes, we can assume the file has changed.

The second one should return a CacheDependency that will invalidate cache entry (if a client decides to cache results of the file processing) when the file changes.

VirtualPathDependencies parameter is an important one. When calling the methods, it should contain a list of all dependencies of the given virtual path. If any of the dependencies change, the provider should consider the file indicated by the virtual path as modified, trigger cache dependency, and update file hash. When using the methods (or implementing a VPP) remember that the virtual path must be included in the list of dependencies. Example:

Let’s say we have foo.aspx file that includes a reference to bar.ascx control. ASP.NET BuildManager will ask for file hash using the following method call:

virtualPathProvider.GetFileHash(
   "~/foo.aspx",
   new [] {"~/foo.aspx", "~/bar.ascx"})

In image scaling scenario, where image files don’t have any dependencies, the dependencies list will only include a virtual path to the file itself.

A word of warning. Not every implementation of a VirtualPathProvider will implement the above methods. It is fine not to implement one or both of them. In such case, the base class will return null for both file hash and cache dependency. EPiServer Unified File provider (and derived classes) is an example of the implementation where GetFileHash method is not present (GetCacheDependency is implemented by VirtualPathNativeProvider). For cases like this, if you know the details of the VPP implementation (DotPeek?) you can often find other ways to calculate the hash. In EPiServer case VPP.GetFile method returns instances derived from VirtualFileEx which has Changed property, giving us access to the last modified date of a file.

For scenarios like this, and for better testability, image transformation related code in Open Waves does not depend on a VPP directly. Instead we have IVirtualFileProvider interface.

    public interface IVirtualFileProvider
    {
        IVirtualFile GetFile(Url fileUrl);        
    }

    public interface IVirtualFile
    {
        Url Url { get; }
        string Hash { get; }
        Stream Open();
    }

The interface is implemented by VirtualPathFileProvider class which is a wrapper for a VirtualPathProvider. This gives us a chance to “fix” any issues we may find in the underlying VPP. Another difference is that we are not relying on virtual paths but rather on Urls (in most cases they will be virtual paths). This allows us to implement a virtual file provider that fetches images from external sites (flickr?) – just need to be smart about how to implement file hashes. For more details take a look at this page from Open Waves documentation section Image Transformations for Web.

In the next part I’ll try to describe an approach we chosen to caching transformed images.

No comments: