Showing posts with label Code Snippets. Show all posts

Friday, February 10, 2012

JavaScript for Photoshop - When Macros Aren't Enough

This2That Tile Sample

A few years ago, a fellow named Brian Dorn got in touch with me. He was working on his doctoral dissertation at Georgia Tech and he needed participants. I met him in Atlanta and found out that his study was about scripting with Photoshop. That was the first I'd ever heard of Adobe Photoshop Scripting.

A few weeks ago when I started working on a new word game This2That for Mobile Magic Developers, I needed to generate 38 tiles with letters, numbers, and punctuation. The task was tedious and the macros just couldn't make it any easier. Then I realized I had the embossing all wrong and I had to start over!

Frustrated, I tried to think of a better way. I remembered Brian Dorn and started looking into this Adobe Photoshop JavaScript thing. I was very pleased when 15 minutes of scripting became an easily reusable tile-generating utility. Now, I can take any PSD, open it up, select any text layer, and have the script generate a PNG for each letter I need.

I thought this little-known feature would make for interesting reading for both programmers and designers, so here's my script:

// call the method that does all of the work
main();

// wrap the code in a method to make it easier to debug
function main() {

 // make sure a document is open and a text layer is selected
 // (accessing activeDocument with no documents open throws, so check the count first)
 if (app.documents.length == 0 || activeDocument.activeLayer.kind != LayerKind.TEXT)
 {
   alert("Please select a document and a target text layer.");
   return;
 }
 
 // set up some information about the current file
 var textLayer = activeDocument.activeLayer;
 var path = activeDocument.path;
 var fileName = activeDocument.name;
 
 // remove the extension from the file name
 var extensionPosition = fileName.lastIndexOf('.');
 if (extensionPosition > 0)
  fileName = fileName.substr(0, extensionPosition);

 // get a good place to put the files (bail out if the dialog is canceled)
 var outputFolder = Folder.selectDialog("Select a target folder.", path);
 if (!outputFolder)
  return;
 
 // set up the letters we want images for
 var characterMap = [
  ["question", "?"]
 ];
 
 // and add the letters (rendered uppercase, with lowercase file suffixes) and the digits
 characterMap = characterMap.concat(getAsciiRange(97, 26), getAsciiRange(48, 10));

 // for each character, update the selected text layer and save a file
 for (var i = 0; i < characterMap.length; i++)
 {
   var character = characterMap[i][1];
   var fileSuffix = characterMap[i][0];
   
   textLayer.textItem.contents = character;
   
   var file = new File(outputFolder + "/" + fileName + "_" + fileSuffix + ".png");
   var options = new PNGSaveOptions();
   options.interlaced = false;

   activeDocument.saveAs(file, options, true, Extension.LOWERCASE);
 }
}

// a little helper method to build a range of characters and their
// file name suffixes
function getAsciiRange(from, count) {
 var result = [];

 for (var i = 0; i < count; i++)
 {
  var character = String.fromCharCode(i + from);
  result.push([character, character.toUpperCase()]);
 } 
 
 return result;
}
 

Tuesday, November 15, 2011

Switching on Enums: A Style Observation

JetBrains dotPeek Logo

I was digging through the framework today looking for another good dotPeek of the Week topic. I was perusing the Lazy<T> class and found an interesting snippet.

This post isn't quite like my previous dotPeek of the Week entries insofar as it's more an example of what not to do. This is, of course, just my opinion, but one rule I try to follow when writing code is that more expressive is almost always better than less expressive, and while looking at the Lazy class, I found a great counterexample.

Here's the code (abridged for clarity … and also because the threading in this class will make for better discussion later):

private T LazyInitValue()
{
  switch (this.Mode)
  {
    case LazyThreadSafetyMode.None:
      // set the value
      break;

    case LazyThreadSafetyMode.PublicationOnly:
      // CompareExchange the value
      break;

    default:
      // lock and set values
      break;
  }
}

Is there anything you notice about this code? Perhaps any unanswered questions as you read it and try to figure out what it does? Specifically, what exactly constitutes the default case?

As I read through this code, learning about some of the interesting thread safety techniques, I found myself pondering, "Why would locking be the default behavior? In fact, what is the default case? Do the default values have something in common?"

I looked at the enum LazyThreadSafetyMode and found this:

public enum LazyThreadSafetyMode
{
  None,
  PublicationOnly,
  ExecutionAndPublication,
}

That's when I decided that, in most cases, when you're switching on a reasonably small set of values, it's best to spell those values out explicitly. That way, the people (including you) who have to maintain the code can understand why the default case is the default case … even if there are no comments.

For example, the following code is functionally equivalent:

private T LazyInitValue()
{
  switch (this.Mode)
  {
    case LazyThreadSafetyMode.None:
      // set the value
      break;

    case LazyThreadSafetyMode.PublicationOnly:
      // CompareExchange the value
      break;

    case LazyThreadSafetyMode.ExecutionAndPublication:
      // lock and set values
      break;
  }
}

Personally, I find the latter example much more expressive. It's obvious to me what the three cases are, and I don't have to wonder what possible values fall into the default case. In fact, I might even go so far as to have the default case throw an exception in this class, so that if someone were ever to add values to the LazyThreadSafetyMode enum without implementing them in Lazy&lt;T&gt;.LazyInitValue(), they'd get an exception in testing instead of silently falling back to the locking behavior.
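To sketch that defensive pattern concretely, here's a small Python analogue (my own illustration, not .NET code; the enum members mirror LazyThreadSafetyMode and the return strings just stand in for the real logic):

```python
from enum import Enum

class LazyThreadSafetyMode(Enum):  # stand-in for the .NET enum
    NONE = 0
    PUBLICATION_ONLY = 1
    EXECUTION_AND_PUBLICATION = 2

def lazy_init_value(mode):
    # handle every member explicitly; no silent fall-through
    if mode is LazyThreadSafetyMode.NONE:
        return "set the value"
    if mode is LazyThreadSafetyMode.PUBLICATION_ONLY:
        return "CompareExchange the value"
    if mode is LazyThreadSafetyMode.EXECUTION_AND_PUBLICATION:
        return "lock and set values"
    # a value added to the enum later blows up in testing
    # instead of silently taking the locking path
    raise ValueError(f"unhandled mode: {mode!r}")
```

The point isn't the language; it's that an unhandled value fails loudly instead of quietly inheriting whatever behavior happened to live in the default branch.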

Friday, September 2, 2011

7-bit Encoding with BinaryWriter in .Net

JetBrains dotPeek Logo

At work this week, I needed to serialize objects and encrypt them while keeping the smallest data footprint I could. I figured the easiest thing to do would be to binary serialize them with the BinaryFormatter. It was, indeed, the easiest thing to do; however, the BinaryFormatter seems to come with a fair amount of overhead. By the time the BinaryFormatter was finished listing all of the necessary assembly-qualified type names, the 60 bytes of data I wanted to preserve had become more than a kilobyte!

I needed another way so I extended BinaryWriter (mostly to get it to serialize the types I needed it to) and now my 60 bytes take 68 bytes to serialize. In the process of writing this class, I looked through the disassembled BinaryWriter using JetBrains's dotPeek and found this little gem (Write7BitEncodedInt(int)) and decided it'd make a great dotPeek of the Week:
protected void Write7BitEncodedInt(int value)
{
  uint num = (uint) value;

  while (num >= 128U)
  {
    this.Write((byte) (num | 128U));
    num >>= 7;
  }

  this.Write((byte) num);
}
This is how BinaryWriter.Write(string) encodes the length of the string. Thus, if your string encodes to fewer than 128 bytes, it only takes one byte to store the length. In my implementation, I used this to specify the length of collections too. When writing a 32-bit signed integer, you save space for all positive numbers less than 2 ^ 21 (three bytes or fewer) and break even (require four bytes) for values from 2 ^ 21 up to 2 ^ 28 - 1. Encoding an Int32 that is greater than or equal to 2 ^ 28 requires five bytes. Thus, if your integers tend to be smaller than 268,435,456, you'll probably save space encoding this way. A similar function could be used for a long, where all positive values less than 2 ^ 56 will at least break even.

Here's how it works:
  1. Convert the number to an unsigned int so the arithmetic treats it as a plain bit pattern (after all, you're just writing bits)
  2. While the converted value is greater than or equal to 128 (i.e., it won't fit in 7 bits)
    1. Write the low 7 bits with a 1 in the high bit (to tell the decoder more bytes are coming)
    2. Shift the 7 bits you just wrote off the number
  3. When the loop finishes, there will be 7 or fewer bits left, so write them
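The steps above translate almost directly into other languages; here's a Python sketch of the same encoder (my own translation for illustration, not framework code):

```python
def write_7bit_encoded_int(value):
    """Encode a non-negative int, 7 data bits per byte, high bit = continue."""
    out = bytearray()
    num = value & 0xFFFFFFFF  # treat the input as an unsigned 32-bit pattern
    while num >= 0x80:
        out.append((num & 0x7F) | 0x80)  # low 7 bits plus the "more" flag
        num >>= 7
    out.append(num)
    return bytes(out)

print(write_7bit_encoded_int(127).hex())  # '7f'   (one byte)
print(write_7bit_encoded_int(300).hex())  # 'ac02' (two bytes)
```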
I think this is really clever and very easy. Reading the data back is a smidgen more complicated (Read7BitEncodedInt()):
protected internal int Read7BitEncodedInt()
{
  // some names have been changed to protect the readability
  int returnValue = 0;
  int bitIndex = 0;

  while (bitIndex != 35)
  {
    byte currentByte = this.ReadByte();
    returnValue |= ((int) currentByte & (int) sbyte.MaxValue) << bitIndex;
    bitIndex += 7;

    if (((int) currentByte & 128) == 0)
      return returnValue;
  }

  throw new FormatException(Environment.GetResourceString("Format_Bad7BitInt32"));
}
Here's how this works:
  1. Set up an int to accumulate your data as you read
  2. Set up a place to keep track of which 7-bit block you're reading
  3. While your bit index is less than 35 (i.e., you've read no more than 5 bytes, each comprising 1 "more bytes" indicator bit and 7 data bits)
    1. Read a byte
    2. Take the byte you just read and logical conjunction (bitwise AND) it with sbyte.MaxValue (127, or in binary 0111 1111)
    3. Left shift those 7 bits to the position in which they belong
    4. Use a logical disjunction (bitwise OR) to write those seven bits into your accumulator int
    5. Add 7 to your bit index for the next 7 bits you read
    6. If the current byte does not have a 1 in its most significant bit (i.e., byte & 1000 0000 == 0)
      1. There are no more bytes to read, so just return the current accumulator value
  4. If you get to this point, you've read 5 bytes without finding a terminating byte, and it's time to let the caller know there was a formatting problem
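And the decoder again in Python for comparison (a sketch under the same assumptions, not the framework code):

```python
def read_7bit_encoded_int(data):
    """Decode the 7-bit format; raise ValueError on a malformed stream."""
    result = 0
    shift = 0
    for byte in data:
        if shift == 35:  # a 6th byte could not belong to an Int32
            raise ValueError("Bad 7-bit encoded Int32")
        result |= (byte & 0x7F) << shift  # place these 7 bits
        shift += 7
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            return result
    raise ValueError("Stream ended mid-value")

print(read_7bit_encoded_int(bytes([0xAC, 0x02])))  # 300
```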

Thursday, August 25, 2011

.Net GetHashCode Functions

JetBrains dotPeek Logo

Last week, I wrote a post about a .Net method to combine hashcodes. In the process of designing that method, I looked into dozens of .Net's GetHashCode implementations.

If you'd like to know the principles of hashcodes, take a look at the aforementioned article. This one is part of the dotPeek of the Week series so I'll just be sharing the insight I got from the framework's implementations here.

First, some really basic ones:
// from Int32
public override int GetHashCode()
{
  return this;
}

// from Int16
public override int GetHashCode()
{
  return (int) (ushort) this | (int) this << 16;
}

// from Int64
public override int GetHashCode()
{
  return (int) this ^ (int) (this >> 32);
}

The Int32 implementation trivially meets the hashcode requirements by returning itself. The Int16 implementation combines its unsigned 16-bit value in the low half with its value shifted into the top half. The Int64 takes the bottom half of itself and XORs it with the top half.

This XOR is the first valuable piece of information. Of the bitwise logical operations, XOR produces the best mixing for hashing: given evenly distributed input bits, it produces an even distribution of output bits, whereas OR and AND are biased 3 to 1 toward 1s and 0s respectively.
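That 3-to-1 bias is easy to verify by tallying each operator over the four possible input bit pairs; a quick Python check (my own illustration, not from the framework):

```python
from itertools import product

pairs = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)

xor_ones = sum(a ^ b for a, b in pairs)  # 2 of 4 outputs are 1: unbiased
or_ones = sum(a | b for a, b in pairs)   # 3 of 4 outputs are 1: biased toward 1
and_ones = sum(a & b for a, b in pairs)  # 1 of 4 outputs is 1: biased toward 0

print(xor_ones, or_ones, and_ones)  # 2 3 1
```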

Some more complicated implementations:
// from String
public override unsafe int GetHashCode()
{
  fixed (char* chPtr = this)
  {
    int num1 = 352654597;
    int num2 = num1;
    int* numPtr = (int*) chPtr;
    int length = this.Length;
    while (length > 0)
    {
      num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
      if (length > 2)
      {
        num2 = (num2 << 5) + num2 + (num2 >> 27) ^ numPtr[1];
        numPtr += 2;
        length -= 4;
      }
      else
        break;
    }
    return num1 + num2 * 1566083941;
  }
}

// from Tuple<>
internal static int CombineHashCodes(int h1, int h2)
{
  return (h1 << 5) + h1 ^ h2;
}
These two methods have some really good information in them. The String.GetHashCode implementation is what's called a rolling hash. It loops through the characters, shifting and adding the running value, and XORs in the next character.

While I liked this, I preferred the simplicity of the Bernstein hash. The main component of the Bernstein hash is (i << 5) + i (the parentheses matter: in C#, addition binds more tightly than shifting). i << 5 == i * 32, and bit shifting was traditionally faster than multiplying, though modern compilers make this optimization themselves.

Thus, (i << 5) + i == 32i + i == 33i. The Bernstein hash just takes the current hash value, multiplies it by 33, and XORs in the new value.

I didn't use 33 in my hash function because I feel like using prime numbers is healthy (33 = 3 * 11 isn't prime). Thus, instead of adding I subtract, so my combining step is (hash << 5) - hash ^ value (or 31 * hash ^ value). This, by the way, is the way Java tends to do it.
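For concreteness, here's that combining step as a quick Python sketch (the mask emulates C#'s unchecked 32-bit arithmetic; the function name is mine):

```python
MASK = 0xFFFFFFFF  # emulate unchecked 32-bit arithmetic

def combine(hash_code, value_hash):
    # (hash << 5) - hash is just 31 * hash; XOR folds in the next value
    return (((hash_code << 5) - hash_code) ^ value_hash) & MASK

# the shift-and-subtract really is multiplication by 31
assert (12345 << 5) - 12345 == 31 * 12345
```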

Here are some interesting ones from System.Drawing:
// from Size
public override int GetHashCode()
{
  return this.width ^ this.height;
}

// from Rectangle
public override int GetHashCode()
{
  return this.X ^ (this.Y << 13 | (int) ((uint) this.Y >> 19)) ^ (this.Width << 26 | (int) ((uint) this.Width >> 6)) ^ (this.Height << 7 | (int) ((uint) this.Height >> 25));
}

// from Point
public override int GetHashCode()
{
  return this.x ^ this.y;
}

Thursday, August 18, 2011

.Net Method to Combine Hash Codes

XOR Venn Diagram

A while back I developed a helper method that I've been using to aid me in computing good hash values from sets of properties for an object. I was writing a dotPeek of the Week entry about .Net GetHashCode implementations and I decided to spruce up my implementation and post it on my blog.

I know it's pretty uncommon to need to override the Equals method, but when you do, you have to take particular care to consider overriding the GetHashCode method as well. To make this a little easier, I developed a method I call CombineHashCodes().

Using CombineHashCodes, I can easily compute a well randomized composite hash code based on several parameters from an object. This curtails a lot of the work involved when dealing with objects that have complicated .Equals overrides.

Microsoft's documentation on Object.GetHashCode() lists the following rules:
A hash function must have the following properties:

If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two objects do not have to return different values.

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.

For the best performance, a hash function must generate a random distribution for all input.

Thus, if you override the .Equals method such that it's no longer doing the default reference equality, you probably need to change your GetHashCode implementation. I've found this is usually a very simple task. I whipped up a quick example in LINQPad:
void Main()
{
 var lauren1 = new Person { FirstName = "Lauren" };
 var lauren2 = new Person { FirstName = "Lauren" };
 
 Console.WriteLine(lauren1.Equals(lauren2));
 Console.WriteLine("{0}.GetHashCode() = {1}", "lauren1", lauren1.GetHashCode());
 Console.WriteLine("{0}.GetHashCode() = {1}", "lauren2", lauren2.GetHashCode());
 
 if (lauren1.Equals(lauren2) && lauren1.GetHashCode() != lauren2.GetHashCode())
  Console.WriteLine("This is bad.  This is very bad.");
}

public class Person
{
   public string FirstName { get; set; }
}

/*
False
lauren1.GetHashCode() = 47858386
lauren2.GetHashCode() = 20006478
*/

As one would expect, lauren1 does not equal lauren2, and their hashcodes differ too; however, if we override the .Equals method on the Person object to compare just the first name, we'll find that lauren1 and lauren2 are equal but have different hashcodes! This violates the first rule of GetHashCode Club. Here's an example:
public class Person
{
   public string FirstName { get; set; }
  
   public override bool Equals (object obj)
   {
  var otherPerson = obj as Person;
  
  if (otherPerson == null)
   return false;
   
  return String.Compare(this.FirstName, otherPerson.FirstName, StringComparison.OrdinalIgnoreCase) == 0;
 }
}

/*
True
lauren1.GetHashCode() = 33193253
lauren2.GetHashCode() = 37386806
This is bad.  This is very bad.
*/

That's why we override GetHashCode as well and something like this will work:
public override int GetHashCode()
{
 return FirstName.GetHashCode();
}

/*
True
lauren1.GetHashCode() = 2962686
lauren2.GetHashCode() = 2962686
*/

That's all well and good, except that .Equals() does a case-insensitive compare, so "Lauren" and "lauren" will be equal but will produce different hash codes unless you do something like return FirstName.ToLower().GetHashCode(). Why did I leave that error in the post just to parenthetically correct it? Because I think it's a good demonstration of how easy it is to screw this up :). Consider what happens when you end up with a more complex class that looks like this:
public class Person
{
   public string FirstName { get; set; }
 public string LastName { get; set; }
 public DateTime BirthDate { get; set; }
  
   public override bool Equals (object obj)
   {
  var otherPerson = obj as Person;
  
  if (otherPerson == null)
   return false;
   
  return String.Compare(FirstName, otherPerson.FirstName, StringComparison.OrdinalIgnoreCase) == 0
   && String.Compare(LastName, otherPerson.LastName, StringComparison.OrdinalIgnoreCase) == 0
   && BirthDate.Equals(otherPerson.BirthDate);
 }
}

You could leave the GetHashCode function the way it was previously, but then all Laurens would be lumped into the same bucket, which violates the third rule about a random distribution. I've seen people combat this problem by concatenating the string values of the equality fields, delimited by some supposedly unlikely string, and getting the hash code of that.

I felt that wasn't really the best way to do it, in case someone decides a colon is a good character in a first name, or something worse.
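To see why delimiter concatenation is fragile, here's a toy Python example (my own; it uses string length as a stand-in per-field hash so the arithmetic is easy to follow):

```python
# Two different pairs of fields flatten to the same delimited string...
assert "ab:c" + ":" + "d" == "ab" + ":" + "c:d"  # both are "ab:c:d"

def combine(h, v):
    # the 31 * h ^ v step, masked to 32 bits
    return (((h << 5) - h) ^ v) & 0xFFFFFFFF

def hash_fields(*fields):
    h = 0
    for f in fields:
        h = combine(h, len(f))  # len() stands in for a real per-field hash
    return h

# ...but hashing field-by-field keeps the two objects distinct
print(hash_fields("ab:c", "d"))  # 125
print(hash_fields("ab", "c:d"))  # 61
```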

Thus, I wrote something like this (which has recently been modified to what you see now based on my explorations for my dotPeek of the Week series). This is CombineHashCodes:
public static class HashCodeHelper
{
 public static int CombineHashCodes(params object[] args)
 {
  return CombineHashCodes(EqualityComparer<object>.Default, args);
 }

 public static int CombineHashCodes(IEqualityComparer comparer, params object[] args)
 {
  if (args == null) throw new ArgumentNullException("args");
  if (args.Length == 0) throw new ArgumentException("args");

  int hashcode = 0;

  unchecked
  {
   foreach (var arg in args)
    hashcode = (hashcode << 5) - hashcode ^ comparer.GetHashCode(arg);
  }

  return hashcode;
 }
}

With this method, you can get a nice pseudo-random distribution of hash codes without violating any of the GetHashCode rules as long as you include the values relevant to your Equals method in your call to CombineHashCodes:
public override int GetHashCode()
{
 return HashCodeHelper.CombineHashCodes(FirstName.ToLower(), LastName.ToLower(), BirthDate);
}

/*
True
lauren1.GetHashCode() = -891990792
lauren2.GetHashCode() = -891990792
*/

Monday, August 15, 2011

.Net Optimization for Int32

JetBrains dotPeek Logo

Another entry in the dotPeek of the Week series here. I was digging through some .GetHashCode() implementations (expect that as the next dotPeek of the Week) and noticed some interesting implementations of overridden .Equals() methods.

I found this in Int16:
public override bool Equals(object obj)
{
  if (!(obj is short))
    return false;
  else
    return (int) this == (int) (short) obj;
}

public bool Equals(short obj)
{
  return (int) this == (int) obj;
}

So, I checked Byte and sure enough:
public override bool Equals(object obj)
{
  if (!(obj is byte))
    return false;
  else
    return (int) this == (int) (byte) obj;
}

public bool Equals(byte obj)
{
  return (int) this == (int) obj;
}

Can you guess how Char.Equals() is implemented?
public override bool Equals(object obj)
{
  if (!(obj is char))
    return false;
  else
    return (int) this == (int) (char) obj;
}

public bool Equals(char obj)
{
  return (int) this == (int) obj;
}

Even the unsigned int gets cast to a signed int in UInt32.Equals(). I was pretty curious, so I looked at some other methods:
public int CompareTo(short value)
{
  return (int) this - (int) value;
}

public int CompareTo(byte value)
{
  return (int) this - (int) value;
}

public int CompareTo(char value)
{
  return (int) this - (int) value;
}

I took note and discussed it with a few friends. It seems to make sense that on a 32-bit operating system it'd be optimal to work with 32-bit integers. It turns out that even on a 64-bit architecture, .Net is optimized for the Int32. I found this gem online:
Best Practices: Optimizing performance with built-in types

The runtime optimizes the performance of 32-bit integer types (Int32 and UInt32), so use those types for counters and other frequently accessed integral variables.

For floating-point operations, Double is the most efficient type because those operations are optimized by hardware.

MCTS Self-Paced Training Kit (Exam 70-536): Microsoft® .NET Framework 2.0—Application Development Foundation

So storing integer values, no matter the ceiling of your expected value, is best done with Int32, particularly if they'll be operated on heavily, like an index into a collection or a counter in a loop.

Monday, August 8, 2011

Enumerable.Any vs. Enumerable.Count

JetBrains dotPeek Logo

This is the inaugural entry in the section of my blog I'm calling "dotPeek of the Week." Workload permitting, it will be a weekly thing where I'll use JetBrains' dotPeek - a free .NET decompiler - to learn more about the .NET framework and share any interesting things I come up with.

In this post, I'll be sharing a little tip I picked up from my friend David Govek.

I can't .Count() the number of times I've seen a line of code like this one:
if (someEnumerable.Count() > 0) 
    doSomething();

I know I've done that myself a handful of times. I think it comes from the pre-.net 3.5 days when you were used to dealing with ICollections that had a .Count property that returned the value of a private field:
public virtual int Count { get { return this._size; } }

When 3.5 came out, the System.Linq.Enumerable class brought with it a Count extension method. I believe it was at that point that people started using .Count() everywhere, including on IEnumerables, which previously didn't have a method for getting the length of the enumerable.

Most of the time, there was very little pain because the engineers over at Microsoft were clever enough to help us out. The first thing they try to do in the .Count extension method is check to see if the IEnumerable is an ICollection. If it is, they just use the Count property which we already know is plenty fast; however, if it's not an ICollection, they have to iterate the Enumerable and count the elements.

Here's what that looks like:
public static int Count<TSource>(this IEnumerable<TSource> source)
{
  if (source == null)
    throw Error.ArgumentNull("source");

  ICollection<TSource> collection1 = source as ICollection<TSource>;
  if (collection1 != null)
    return collection1.Count;

  ICollection collection2 = source as ICollection;
  if (collection2 != null)
    return collection2.Count;

  int num = 0;
  using (IEnumerator enumerator = source.GetEnumerator())
  {
    while (enumerator.MoveNext())
      checked { ++num; }
  }
  return num;
}

If you genuinely need the number of items, this is still a fine way to get it. The problem is, in the sample code where we're simply checking that the enumerable isn't empty, using .Count() can prove costly. Checking that the count is greater than 0 enumerates the entire collection and counts each item, despite the fact that we know it's not empty as soon as we spot the first element.

Fortunately, the clever folks at Microsoft thought of this and also gave us .Any(). This extension method simply gets the Enumerator, calls .MoveNext(), and disposes the Enumerator. MoveNext tries to move to the next element in the collection and returns true until it passes the end of the collection.

Here's what the .Any() method looks like:
public static bool Any<TSource>(this IEnumerable<TSource> source)
{
  if (source == null)
    throw Error.ArgumentNull("source");

  using (IEnumerator<TSource> enumerator = source.GetEnumerator())
  {
    if (enumerator.MoveNext())
      return true;
  }

  return false;
}

Thus, there's no need to enumerate the entire enumerable when you can use .Any() as in this refactored code:
if (someEnumerable.Any()) 
    doSomething();
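The same trade-off exists outside of .NET. As a quick illustration (my own, in Python), an instrumented generator shows how many elements each approach actually pulls:

```python
pulled = []  # records every element the consumer actually pulls

def items(n):
    for i in range(n):
        pulled.append(i)
        yield i

pulled.clear()
any(True for _ in items(100_000))
print(len(pulled))  # 1: any() stops at the first element

pulled.clear()
sum(1 for _ in items(100_000))
print(len(pulled))  # 100000: counting has to walk everything
```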

Monday, July 25, 2011

C# Yield Keyword, IEnumerable<T>, and Infinite Enumerations

Matryoshka Dolls

Visual Studio 2005 came with a slew of .net runtime and compiler features. One of those features that I particularly enjoy is the yield keyword. It was Microsoft's way of building IEnumerable (or IEnumerator) classes - generic and non-generic - around your iterator code block. I'm not going to discuss the yield keyword much in this post because the feature has been around a long time now and the internet is replete with discussions on the topic.

Despite the long life of the yield keyword, I still find it conspicuous when I see it in projects I work on. I suppose it's just rare that I find myself writing my own enumerable. As a result, when I see yield return, it tends to stand out. I started looking around to see how the rest of the programming world uses the yield keyword wondering if I was under-utilizing the flexibility provided.

Specifically, I wondered if I was missing out on the lazy nature of the iterator and the numerous linq extension methods optimized to take advantage of that aspect of iterators. What I mean is that an iterator doesn't need to store each sequential value in memory the way a collection would, so you can use each value without necessarily increasing the memory overhead. Further, you can take advantage of calculations which tend to already be sequential in nature (like the Fibonacci sequence, for example).

The second thing I thought about was an infinite (well, sort of infinite) enumerable. I'm not sure whether that's a bad idea or not, but I wrote these examples to be unending. I may eventually find a use for such an iterator and then get burned when someone tries to call Fibbonacci.Min() and the application throws an overflow exception, but I suppose at that point I'll make it a method and take a sanity-check parameter.

In the meantime, here are a few examples of some iterators I thought were interesting and fun challenges:
static IEnumerable<ulong> Fibbonacci
{
    get
    {
        yield return 0;
        yield return 1;

        ulong previous = 0, current = 1;
        while (true)
        {
            ulong swap = checked(previous + current);
            previous = current;
            current = swap;
            yield return current;
        }
    }
}

static IEnumerable<long> EnumerateGeometricSeries(long @base)
{
    yield return 1;

    long accumulator = 1;
    while (true)
        yield return accumulator = checked(accumulator * @base);
}

static IEnumerable<ulong> PrimeNumbers
{
    get
    {
        var prime = 0UL;
        while (true)
            yield return prime = prime.GetNextPrime();
    }
}

static IEnumerable<List<uint>> PascalsTriangle
{
    get
    {
        // note: this yields (and then mutates) the same List instance for
        // every row, so copy a row if you need to keep it after iterating
        var row = new List<uint> { 1 };
        yield return row;

        while (true)
        {
            var last = row[0];
            for (var i = 1; i < row.Count; i++)
            {
                var current = row[i];
                row[i] = current + last;
                last = current;
            }

            row.Add(1);
            yield return row;
        }
    }
}

Some of these methods use a few extension methods I adapted from places in the .net framework:
public static ulong GetNextPrime(this ulong from)
{
    // 2 is the only even prime and the odd-stepping loop below would skip it
    if (from < 2)
        return 2;

    for (var j = from + 1 | 1UL; j < ulong.MaxValue; j += 2)
        if (j.IsPrime())
            return j;

    return from;
}

public static bool IsPrime(this ulong value)
{
    // 0 and 1 are not prime
    if (value < 2)
        return false;

    if ((value & 1) != 0)
    {
        var squareRoot = (ulong)Math.Sqrt((double)value);
        for (ulong i = 3; i <= squareRoot; i += 2)
            if (value % i == 0)
                return false;

        return true;
    }

    return value == 2;
}

To keep the test console app clean, of course, I used my favorite IEnumerable.Each() extension method:
public static void Each<T>(this IEnumerable<T> enumerable, Action<T> action)
{
    foreach (var element in enumerable)
        action(element);
}

Here's the sample code:
static void Main()
{
    Fibbonacci.Skip(10).Take(10).Each(Console.WriteLine);
    Console.WriteLine();

    EnumerateGeometricSeries(2).Take(10).Each(Console.WriteLine);
    Console.WriteLine();

    PrimeNumbers.Where(p => p > 600).Take(10).Each(Console.WriteLine);
    Console.WriteLine();

    foreach (var row in PascalsTriangle.Take(10))
    {
        row.Each(element => Console.Write("{0} ", element)); 
        Console.WriteLine();
    }

    Console.ReadLine();
}

// The second set of 10 elements in the Fibbonacci sequence
// 55 89 144 233 377 610 987 1597 2584 4181

// Base of 2 to the first 10 powers
// 1 2 4 8 16 32 64 128 256 512

// The first 10 prime numbers greater than 600
// 601 607 613 617 619 631 641 643 647 653

// The first 10 rows of Pascal's Triangle
// 1
// 1 1
// 1 2 1
// 1 3 3 1
// 1 4 6 4 1
// 1 5 10 10 5 1
// 1 6 15 20 15 6 1
// 1 7 21 35 35 21 7 1
// 1 8 28 56 70 56 28 8 1
// 1 9 36 84 126 126 84 36 9 1

Wednesday, July 20, 2011

.Net ObjectFormatter - Using Tokens in a Format String

If you've already read this article and you don't feel like scrolling through my sample formats, you can jump directly to ObjectFormatter on github to get the source.

At some point in almost every business application, it seems you eventually run into the ubiquitous email notifications requirement. Suddenly, in the middle of what was once a pleasant and enjoyable project come the dozens of email templates with «guillemets» marking the myriad fields which will need replacing with data values.

You concoct some handy way of storing these templates in a database, on the file system, or in resource files. You compose your many String.Format() statements with the dozens of variables required to format the email templates and you move on to the greener pastures of application development.

Now, you've got a dozen email templates like this one:
Dear {0},

{1} has created a {2} task for your approval. This task must be reviewed between {3} and {4} to be considered for final approval.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

No big deal, everything is going swimmingly, and the application goes into beta. Then, it turns out, the stakeholders don't want an email template that looks like that. That was more of a draft really. Besides, you should've already known what they wanted in the template to begin with. After all, it's like you have ESPN or something.

It's important to add information about the user for whom this action is taking place, so this is your new template:
Dear {0},

{1} has created a {2} task for your approval regarding {5}({6}). This task must be reviewed between {3} and {4} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {7}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

So far so good. You've updated the template, updated your String.Format() parameters, passed QA and gone into production. But, now that users are actually hitting the system, it turns out that you need a few more changes. Specifically, you need to add contact information for the supervisor, remove the originator of the task, and by the way, what kind of sense does it make to put a low end limit on a deadline? Here's your new template:
Dear {0},

A {2} task for {5}({6}) is awaiting your approval. This task must be reviewed by {4} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {7} at {1}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

Now you have an email template format with various numbers all over the place, a String.Format() call with more parameters than there are tokens, and you have to go through the QA - deployment cycle again.

I've gone through this process on almost every application throughout my career as a software engineer. Hence the ObjectFormatter. Now, my email template looks like this:
Dear {Employee.FullName},

A {Task.Description} task for {TargetUser.FullName}({TargetUser.UserId}) is awaiting your approval. This task must be reviewed by {DueDate} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {Supervisor.FullName} at {Supervisor.PhoneNumber}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

I find that the ObjectFormatter makes my templating much easier to maintain and much more flexible. It also usually makes my calling code a lot cleaner. Here's an example of the approaches you could take to populate the sample templates:
// plain string formatting
String.Format(template, Employee.FullName, Supervisor.PhoneNumber, Task.Description, String.Empty, DueDate, TargetUser.FullName, TargetUser.UserId, Supervisor.FullName);

// if you have a dto already built
ObjectFormatter.Format(template, myDto);

// if you don't have a dto built
ObjectFormatter.Format(template, new { Employee, Supervisor, Task, DueDate, TargetUser });

I've found that most of the time when they ask for template changes, they want me to add some value that is already a property on an object in my object graph because of the current email template. That way, when they come tell me they want the target user's name formatted differently, I don't even need to recompile (well, sometimes I do . . . I mean, I can't predict everything). I can implement a lot of changes using objects I already know I'm passing into the ObjectFormatter.Format() method. Here's the new template with the changes, and I didn't have to change a line of code to make it work:
Dear {Employee.FullName},

A {Task.Description} task for {TargetUser.LastName}, {TargetUser.FirstName}({TargetUser.UserId}) is awaiting your approval. This task must be reviewed by {DueDate} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {Supervisor.FullName} at {Supervisor.PhoneNumber}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

If you'd like to check out the source or use the ObjectFormatter in your own projects, look for ObjectFormatter on GitHub. If you make any cool changes, please let me know and I'll try to figure out how to merge them into the repository.
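For the curious, the core trick behind this kind of templating is just walking a property path for each {Object.Property} token. Here's a minimal JavaScript sketch of that idea — my own simplification for illustration, not the actual ObjectFormatter source:

```javascript
// minimal sketch of {Object.Property} token replacement
// (illustration only -- not the ObjectFormatter source)
function formatObject(template, model) {
  return template.replace(/\{([\w.]+)\}/g, function (token, path) {
    // walk the object graph one property at a time
    var value = path.split('.').reduce(function (obj, key) {
      return obj == null ? undefined : obj[key];
    }, model);

    // leave unknown tokens alone so mistakes stay visible
    return value === undefined ? token : String(value);
  });
}
```

Given { Employee: { FullName: 'Jane Doe' } }, the template "Dear {Employee.FullName}," comes out as "Dear Jane Doe,".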

Tuesday, July 19, 2011

Extension Method to Replace foreach With Lambda Expression

It's pretty often I find myself looping through some enumerable and performing an action on the elements. Sometimes it's just displaying results with Console.WriteLine. Other times I need to do something a little more complicated. In any case, every once in a while, I feel like the foreach statement and the for statement aren't really quite expressive enough.

That's why I have this little guy:
public static void Each<T>(this IEnumerable<T> enumerable, Action<T> action)
{
    foreach (var element in enumerable)
        action(element);
}

It's pretty basic but I like the way it looks and feels. I used it in my blog post about a C# UpTo Extension Method a la Ruby's int.upto method.

Here's a simple demonstration I wrote in LinqPad:
void Main()
{
    Enumerable.Range(1, 5).Each(Console.WriteLine);
}

static class Extensions
{
    public static void Each<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
            action(element);
    }
}

In writing this post (and perhaps because I've been spending way too much time with jQuery lately), it occurred to me that I might want to be able to chain my actions with another Each() or with other extension methods from LINQ:
void Main()
{
    Enumerable.Range(1, 5).Each(Console.WriteLine).Each(Console.WriteLine);
    // 1 2 3 4 5 1 2 3 4 5
}

static class Extensions
{
    public static IEnumerable<T> Each<T>
        (this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
            action(element);
   
        return source;
    }
}

The problem, though, is that you generally don't want Each() to enumerate the sequence immediately; you want the action deferred until something downstream actually consumes the results. With yield return, the action executes lazily, during enumeration, so you'd write it like this:
void Main()
{
    Enumerable.Range(1, 10)
        .Each(Console.WriteLine)
        .Where(i => i <= 5)
        .ToList();
    // 1 2 3 4 5 6 7 8 9 10
    
    Console.WriteLine();
    
    Enumerable.Range(1, 10)
        .Each(Console.WriteLine)
        .Take(5)
        .ToList();
    // 1 2 3 4 5

    Console.WriteLine();

    Enumerable.Range(1, 10)
        .Each(Console.WriteLine)
        .Skip(5)
        .Take(5)
        .ToList();
    // 1 2 3 4 5 6 7 8 9 10
}

static class Extensions
{
    public static IEnumerable<T> Each<T>
        (this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }
}
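For what it's worth, the same eager-versus-lazy distinction shows up in JavaScript: a generator version of Each() only runs the action as elements are pulled, just like the yield return version above. This is my own side-by-side illustration, not part of the C# code:

```javascript
// lazy "tap": the action runs only as elements are consumed
function* each(iterable, action) {
  for (const element of iterable) {
    action(element);
    yield element;
  }
}

// consume only two elements; the action fires exactly twice
const seen = [];
for (const value of each([1, 2, 3, 4, 5], e => seen.push(e))) {
  if (seen.length === 2) break;
}
// seen is now [1, 2]
```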

Thursday, October 28, 2010

Column Encryption in SQL Server 2008 with Symmetric Keys

Most of the time when I write blog posts, I do it to share ideas with my fellow developers. Sometimes I do it just so I can have a place to reference when I forget the syntax for something. This is one of those reference posts.

Recently I've been charged with column-level encrypting some personally identifiable information. This post is not intended to discuss the merits of column-level encryption; rather, as I said, it's here to put a few code snippets up so that I can reference them later. If you should find yourself in a column-level encryption predicament in a SQL Server 2008 environment, you may find these useful as well.

First things first: get the database ready for column-level encryption by creating a master key:
-- if there is no master key, create one
IF NOT EXISTS 
(
  SELECT * 
  FROM sys.symmetric_keys 
  WHERE symmetric_key_id = 101
)
CREATE MASTER KEY ENCRYPTION BY 
  PASSWORD = 'This is where you would put a really long key for creating a symmetric key.'
GO

Now, you'll need a certificate or a set of certificates with which you will encrypt your symmetric key or keys:
-- if the certificate doesn't exist, create it now
IF NOT EXISTS
(
  SELECT *
  FROM sys.certificates
  WHERE name = 'PrivateDataCertificate'
)
CREATE CERTIFICATE PrivateDataCertificate
   WITH SUBJECT = 'For encrypting private data';
GO

Once you have your certificates, you can create your key or keys:
-- if the key doesn't exist, create it too
IF NOT EXISTS
(
  SELECT *
  FROM sys.symmetric_keys
  WHERE name = 'PrivateDataKey'
)
CREATE SYMMETRIC KEY PrivateDataKey
  WITH ALGORITHM = AES_256
  ENCRYPTION BY CERTIFICATE PrivateDataCertificate;
GO

Before you can use your symmetric key, you have to open it. I recommend that you get in the habit of closing it when you're finished with it, because an open symmetric key remains open for the life of the session. Say you have a stored procedure that opens the symmetric key to decrypt some private data it uses internally. Anyone with access to that stored procedure can run it and then have the key left open for decrypting private data. My point: close the key before you leave the procedure. Here's how you open and close keys.
-- open the symmetric key with which to encrypt the data.
OPEN SYMMETRIC KEY PrivateDataKey
   DECRYPTION BY CERTIFICATE PrivateDataCertificate;

-- close the symmetric key
CLOSE SYMMETRIC KEY PrivateDataKey;

Here's a little test script I wrote to demonstrate a few points. First, the syntax for encrypting and decrypting. Second, the fact that the cipher text changes each time you perform the encryption; this keeps identical plain text values from producing identical cipher text and thwarts a known-plaintext attack.
-- open the symmetric key with which to encrypt the data.
OPEN SYMMETRIC KEY PrivateDataKey
   DECRYPTION BY CERTIFICATE PrivateDataCertificate;

-- somewhere to put the data
DECLARE @TestEncryption TABLE
(
  PlainText VARCHAR(100),
  Cipher1 VARBINARY(100),
  Cipher2 VARBINARY(100)
);

-- some test data
INSERT INTO @TestEncryption (PlainText)
SELECT 'Boogers'
UNION ALL
SELECT 'Foobar'
UNION ALL
SELECT '457-55-5462'; -- ignoranus

-- encrypt twice
UPDATE @TestEncryption
SET 
  Cipher1 = ENCRYPTBYKEY(KEY_GUID('PrivateDataKey'), PlainText),
  Cipher2 = ENCRYPTBYKEY(KEY_GUID('PrivateDataKey'), PlainText);

-- decrypt and display results  
SELECT
  *,
  CiphersDiffer = CASE WHEN Cipher1 <> Cipher2 THEN 'TRUE' ELSE 'FALSE' END,
  PlainText1 = CONVERT(VARCHAR, DECRYPTBYKEY(Cipher1)),
  PlainText2 = CONVERT(VARCHAR, DECRYPTBYKEY(Cipher2))
FROM @TestEncryption;

-- close the symmetric key
CLOSE SYMMETRIC KEY PrivateDataKey;

Tuesday, September 21, 2010

Array Function to Recode Data in Google Apps Scripts

I went on my honeymoon with my beautiful wife last week and the week before. Having a little time off of work gave me the opportunity to get some work done :). I've been wanting for a while to develop a survey to find out what makes a good programmer.

I've been working with Google Docs and Google Forms to see what they're capable of. This spreadsheet posed a few difficulties. The primary problem was that I had a set of text values which needed to be recoded to numerical values from another range.

I wrote this array function for Google Apps Scripts in Google Spreadsheets to recode values based on an array of values.

Here's an example spreadsheet demonstrating the Array Data Block Recode Function.

Here's what the function looks like with a few tests:
function recode(data, values, valueColumnIndex)
{
  var  valueHash = {};
  
  // if the values are in an array, make a hash table
  if (values.constructor == Array)
    for (var i = 0; i < values.length; i++)
      valueHash[values[i][0]] = values[i][valueColumnIndex];

  else
    valueHash = values;
  
  var ret = [];
  
  // if the data are in an array, recursively recode them
  if (data.constructor == Array)
    for (var i = 0; i < data.length; i++)
      ret.push(recode(data[i], valueHash, valueColumnIndex));
          
  else
    ret = valueHash[data] != undefined ? valueHash[data] : data;

  return ret;
}

var values = [['a', '1', 'I'], ['b', '2', 'II']];

print(recode('a', values, 1));
print(recode(['a', 'b', 'c'], values, 1));
print(recode([['a', 'b'], ['b', 'c']], values, 1));
print(recode(['a', ['a', 'b'], [['a', 'b', 'c']]], values, 2));

/*
Results:
1
[1, 2, 'c']
[[1, 2], [2, 'c']]
['I', ['I', 'II'], [['I', 'II', 'c']]]
*/

/*
Google Apps Syntax:
=Recode(A1:B3, D1:F2, 1)
=Recode(A1:B3, D1:F2, 2)
*/

Regular Expression Search Bookmarklet

I have an old website where I keep most of my bookmarklets. I'm planning on deprecating that site and just putting up some personal stuff (since I don't do what that site says I do anymore).

This bookmarklet is probably my most used bookmarklet. Basically, you enter a regular expression and each match in the page will be highlighted. It cycles through 16 color schemes to change the highlight color.

If you just want to install the bookmarklet, grab this link and drag it onto your bookmarklet toolbar in your browser.
Regex Search

If you want to use it on your mobile device, you can use my Mobile Bookmarklet Installer Bookmarklet (which, by the way, will install itself too).

I'm sure it doesn't work in IE, but I haven't tested it in a really long time. If you'd like to see what it would do in IE if IE didn't suck so badly, just click it.

If you're interested in the code, here it is!
// check to see if the variable searches has been defined.
// if not, create it.  this variable is to cycle through 
// highlight colors.
if (typeof(searches) == 'undefined')
{
  var searches = 0;
};

(
  function()
  {
    // just some variables
    var count = 0, text, regexp;

    // prompt for the regex to search for
    text = prompt('Search regexp:', '');

    // if no text entered, exit bookmarklet
    if (text == null || text.length == 0)
      return;

    // try to create the regex object.  if it fails
    // just exit the bookmarklet and explain why.
    try
    {
      regexp = new RegExp(text, 'i');
    }

    catch (er)
    {
      alert('Unable to create regular expression using text \'' + text + '\'.\n\n' + er);
      return;
    }

    // this is the function that does the searching.
    function searchWithinNode(node, re)
    {
      // more variables
      var pos, skip, acronym, middlebit, endbit, middleclone;
      skip = 0;

      // be sure the target node is a text node
      if (node.nodeType == 3)
      {
        // find the position of the first match
        pos = node.data.search(re);

        // if there's a match . . . 
        if (pos >= 0)
        {
          // create the acronym node.
          acronym = document.createElement('ACRONYM');
          acronym.title = 'Search ' + (searches + 1) + ': ' + re.toString();
          acronym.style.backgroundColor = backColor;
          acronym.style.borderTop = '1px solid ' + borderColor;
          acronym.style.borderBottom = '1px solid ' + borderColor;
          acronym.style.fontWeight = 'bold';
          acronym.style.color = borderColor;
    
          // get the last half of the node and cut the match
          // out.  then, clone the middle part and replace it with
          // the acronym
          middlebit = node.splitText(pos);
          endbit = middlebit.splitText(RegExp.lastMatch.length);
          middleclone = middlebit.cloneNode(true);
          acronym.appendChild(middleclone);
          middlebit.parentNode.replaceChild(acronym, middlebit);
          count++;
          skip = 1;
        }
      }

      // if the node is not a text node and is not
      // a script or a style tag then search the children
      else if (
        node.nodeType == 1
        && node.childNodes
        && node.tagName.toUpperCase() != 'SCRIPT'
        && node.tagName.toUpperCase() != 'STYLE'
      )
        for (var child = 0; child < node.childNodes.length; ++child)
          child = child + searchWithinNode(node.childNodes[child], re);

      return skip;
    }

    // use the search count to get the colors.
    var borderColor = '#' 
      + (searches + 8).toString(2).substr(-3)
      .replace(/0/g, '3')
      .replace(/1/g, '6');
    
    var backColor = borderColor
      .replace(/3/g, 'c')
      .replace(/6/g, 'f');

    // for the last half of every 16 searches, invert the
    // colors.  this just adds more variation between
    // searches.
    if (searches % 16 / 8 >= 1)
    {
      var tempColor = borderColor;
      borderColor = backColor;
      backColor = tempColor;
    }

    searchWithinNode(document.body, regexp);
    window.status = 'Found ' + count + ' match'
      + (count == 1 ? '' : 'es')
      + ' for ' + regexp + '.';

    // if we made any matches, increment the search count
    if (count > 0)
      searches++;
  }
)();
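If the color math in the middle looks opaque, here's the same computation pulled out into a standalone helper — a restatement for readability, not part of the bookmarklet itself:

```javascript
// derive the highlight colors for the nth search (0-based),
// reproducing the bookmarklet's binary-digit trick
function searchColors(searches) {
  // take the low 3 bits of (searches + 8) and map 0 -> '3', 1 -> '6'
  let border = '#' + (searches + 8).toString(2).slice(-3)
    .replace(/0/g, '3')
    .replace(/1/g, '6');

  // the background is a lighter version of the border color
  let back = border.replace(/3/g, 'c').replace(/6/g, 'f');

  // for the last half of every 16 searches, invert the colors
  if (searches % 16 >= 8) {
    const swap = border;
    border = back;
    back = swap;
  }

  return { border: border, back: back };
}
// searchColors(0) -> { border: '#333', back: '#ccc' }
// searchColors(1) -> { border: '#336', back: '#ccf' }
```

Each of the three low bits controls one hex digit, so consecutive searches cycle through 8 distinct color pairs, and the swap doubles that to 16.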

Thursday, March 11, 2010

Custom jQuery Selector for External and Internal Links

I was working on a website for my fiancee and her church group called The Diocese of Atlanta Young Adults. They were hosting an event they call the Young Adult Summit.

One of the requirements I had was to warn users before they left the page after clicking an external link. Using jQuery, I was able to bind to the click event with easy cross-browser compatibility, and I used jQuery UI to open a modal dialog box. Pretty basic stuff.

The one thing I regretted was that I had to use a class name to determine which links were external and which were internal. A few days ago, I discovered that jQuery supports custom selectors and I decided to write a pair of custom selectors for identifying internal and external links.
jQuery.extend(
  jQuery.expr[ ":" ],
  {
    /*
      /:\/\// is simply looking for a protocol definition.
      technically it would be better to check the domain
      name of the link, but i always use relative links
      for internal links.
    */

    external: function(obj, index, meta, stack)
    {
      return /:\/\//.test($(obj).attr("href"));
    },

    internal: function(obj, index, meta, stack)
    {
      return !/:\/\//.test($(obj).attr("href"));
    }
  }
);
I do have a few items of note. First, you'll notice that I'm not actually looking for the current domain name in the links. That's because I use relative links for all of my internal links these days. Thus, I know that if there's not a protocol definition that it's an internal link. If you need to identify internal links by domain as well, you can just pull that out of top.location.href and check for it too.

Second, you could technically use this selector on any DOM object. There are several ways to get around this. One way would be to verify that the object is an anchor tag. Another would be to verify that the href attribute exists. I just plan on using common sense.

Here's a quick usage example:
$("a:external").click(verifyNavigateAway);
$("a:internal").css("font-weight", "bold");

Wednesday, February 25, 2009

File Download Resumer for HTTP

In my last post, I was complaining that my browsers of choice (namely, Firefox and Chrome) don't have good (if any) support for resuming failed or interrupted file downloads.

Now, there are very few things that irk me more than someone who complains but never offers a solution, so this post is proof that there is indeed a solution (and indeed a very simple one) to this problem.

To test it, I downloaded this file, hashed it, and got an MD5 of 0D01ADB7275BB516AED8DC274505D1F5. I downloaded about half the file, paused it in firefox, renamed the .pdf.part file to .pdf, resumed the download, hashed it, and got 0D01ADB7275BB516AED8DC274505D1F5. The file resumed exactly as I expected it to.

I threw together a quick download resumer and I've posted the project if you want the whole thing. Otherwise, this snippet contains the one line where the "magic" happens:
// create the request
HttpWebRequest request = HttpWebRequest.Create(Source) as HttpWebRequest;
request.Method = "GET";

// set the range
request.AddRange(Convert.ToInt32(Downloaded));

// get the response . . . a.k.a., the "magic"
response = request.GetResponse() as HttpWebResponse;
Calling request.AddRange creates the header "Range: bytes=n-" where n is the number of bytes already downloaded. Using this, a browser could append the remaining bytes to the abandoned file, starting at exactly the position the download left off. Browsers could support this natively, without plugins, and allow you to "attempt to resume and hope for the best."
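To make the contract concrete, here's a small JavaScript sketch of the two sides of the exchange — the Range request header and the Content-Range check you'd want before appending bytes. Both helper names are my own; a real resumer should also verify the server replied 206 Partial Content:

```javascript
// build the request header for resuming after `downloaded` bytes
function rangeHeader(downloaded) {
  return 'bytes=' + downloaded + '-';
}

// a 206 response echoes the range, e.g. "bytes 1024-4095/4096";
// only append to the partial file if it starts where we left off
function resumeIsValid(contentRange, downloaded) {
  var match = /^bytes (\d+)-\d+\/(?:\d+|\*)$/.exec(contentRange || '');
  return !!match && Number(match[1]) === downloaded;
}
```

If the server ignores the Range header it replies 200 with the whole file, which is exactly the case the validity check guards against.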

Monday, February 23, 2009

Social Linking on Blogger with Digg, StumbleUpon, Delicious, DotNetKicks, et Cetera

I've been doing more and more blogging lately, and I've noticed my visibility really picking up. When I added Digg (even though very few of my readers seem to Digg me . . . hint hint), I noticed my daily visits took a jump. I added Dot Net Kicks, and again . . . a jump. I've decided to add more social bookmarking links like StumbleUpon and Del.icio.us to see what happens. The only one that was difficult was the Delicious link because I couldn't find an example online. I decided that I'd post all of my bookmarking links here largely as a way to keep track of them myself, but you're all welcome to use them too.

Nota bene, I'm pretty sure I encoded them correctly so that if you copy and paste them into your template, you'll get what you're looking for. If they don't work (and the blogger template editor will usually warn you about an invalid html entity or something like that), then please leave me a comment and let me know and I'll fix it as soon as possible.

So, without further ado, here's how I got my Digg link:
<script type='text/javascript'>
digg_url = &#39;<data:post.url/>&#39;;
digg_title = &#39;<data:post.title/>&#39;;
digg_bgcolor = &#39;transparent&#39;;
</script>
<script src='http://digg.com/tools/diggthis.js' type='text/javascript'/>


Here's how I got my Reddit link:
<script>reddit_url=&#39;<data:post.url/>&#39;</script>
<script>reddit_title=&#39;<data:post.title/>&#39;</script>
<script language='javascript' src='http://reddit.com/button.js?t=2'/>


And my stumble upon link:
<a class='timestamp-link' expr:href='&quot;http://www.stumbleupon.com/submit?url=&quot; + data:post.url + &quot;&amp;title=&quot; + data:post.title' style=''>
<img align='' alt='Stumble Upon Toolbar' border='0' src='http://www.stumbleupon.com/images/su_micro.gif'/>
</a>


Dot Net Kicks:

<a expr:href='&quot;http://www.dotnetkicks.com/kick/?url=&quot; + data:post.url + &quot;&amp;title=&quot; + data:post.title'>
<img border='0' expr:src='&quot;http://www.dotnetkicks.com/Services/Images/KickItImageGenerator.ashx?url=&quot; + data:post.url'/>
</a>


And the difficult and time consuming del.icio.us:
<a 
expr:onclick='&quot;window.open(\&quot;http://delicious.com/save?v=5&amp;noui&amp;jump=close&amp;url=\&quot;+encodeURIComponent(\&quot;&quot; + data:post.url + &quot;\&quot;)+\&quot;&amp;title=\&quot;+encodeURIComponent(\&quot;&quot; + data:post.title + &quot;\&quot;), \&quot;delicious\&quot;,\&quot;toolbar=no,width=550,height=550\&quot;); return false;&quot;'
expr:href='&quot;http://delicious.com/save?v=5&amp;noui&amp;jump=close&amp;url=&quot; + data:post.url + &quot;&amp;title=&quot; + data:post.title'>
<img alt='Delicious' height='10' src='http://static.delicious.com/img/delicious.gif' width='10'/> del.icio.us
</a>


The considerably less difficult RSS feed link:
<a expr:href='data:blog.homepageUrl + &quot;feeds/posts/default?alt=rss&quot;'>
<img alt='rss feed' src='http://www.benjaminobdyke.com/images/rss_button2.gif'/>
</a>


This post will also be a living document, so when I update it I'll post an update notice so it'll come up in your RSS feed.

Wednesday, December 17, 2008

Using the Proxy Pattern to Write to Multiple TextWriters

I was working on a data synchronizing application the other day. I needed to write to a file for the export, write to a string builder for logging and analysis, and write to the console for debugging. I know it's pretty common that I'll need to write to more than one text stream at the same time, so I figured I could write a quick proxy class to write to a collection of TextWriters. Please comment on this post and let me know how this TextWriterProxy article helped you.

Here's what I came up with:
using System.Collections.Generic;
using System.Text;
using System.IO;

namespace ESG.Utilities
{
    public class TextWriterProxy : TextWriter
    {
        // store TextWriters here
        private List<TextWriter> _writers = new List<TextWriter>();

        #region Properties

        /// <summary>
        /// This property returns Encoding.Default.  The TextWriters in the
        /// TextWriterProxy collection can have any encoding.  However, this
        /// property is required.
        /// </summary>
        public override Encoding Encoding { get { return Encoding.Default; } }

        /// <summary>
        /// Gets or sets the line terminator string used by the TextWriters in
        /// the TextWriterProxy collection.
        /// </summary>
        public override string NewLine
        {
            get
            {
                return base.NewLine;
            }

            set
            {
                foreach (TextWriter tw in _writers)
                    tw.NewLine = value;

                base.NewLine = value;
            }
        }

        #endregion

        #region Methods

        /// <summary>
        /// Add a new TextWriter to the TextWriterProxy collection.  Setting properties
        /// or calling methods on the TextWriterProxy will perform the same action on
        /// each TextWriter in the collection.
        /// </summary>
        /// <param name="writer">The TextWriter to add to the collection</param>
        public void Add(TextWriter writer)
        {
            // don't add a TextWriter that's already in the collection
            if (!_writers.Contains(writer))
                _writers.Add(writer);
        }

        /// <summary>
        /// Remove a TextWriter from the TextWriterProxy collection.
        /// </summary>
        /// <param name="writer">The TextWriter to remove from the collection</param>
        /// <returns>True if the TextWriter was found and removed; False if not.</returns>
        public bool Remove(TextWriter writer)
        {
            return _writers.Remove(writer);
        }

        // this is the only Write method that needs to be overridden
        // because all of the Write methods in a TextWriter ultimately
        // end up calling Write(char)

        /// <summary>
        /// Write a character to the text stream of each TextWriter in the
        /// TextWriterProxy collection.
        /// </summary>
        /// <param name="value">The char to write</param>
        public override void Write(char value)
        {
            foreach (TextWriter tw in _writers)
                tw.Write(value);

            base.Write(value);
        }

        /// <summary>
        /// Closes the TextWriters in the TextWriterProxy as well as the
        /// TextWriterProxy instance and releases any system resources
        /// associated with them.
        /// </summary>
        public override void Close()
        {
            foreach (TextWriter tw in _writers)
                tw.Close();

            base.Close();
        }

        /// <summary>
        /// Releases all resources used by the TextWriterProxy and by the
        /// TextWriters in the TextWriterProxy collection.
        /// </summary>
        /// <param name="disposing">Pertains only to the TextWriterProxy instance:
        /// true to release both managed and unmanaged resources; false to release
        /// only unmanaged resources.</param>
        protected override void Dispose(bool disposing)
        {
            foreach (TextWriter tw in _writers)
                tw.Dispose();

            base.Dispose(disposing);
        }

        /// <summary>
        /// Clears all buffers for each TextWriter in the TextWriterProxy
        /// collection and causes all buffered data to be written
        /// to the underlying device.
        /// </summary>
        public override void Flush()
        {
            foreach (TextWriter tw in _writers)
                tw.Flush();

            base.Flush();
        }

        #endregion
    }
}

So far, it works great. It cleans up a lot of my code and gives me the option to write to any number of TextWriters with only one call. Further, if you are calling a method that takes a TextWriter as a parameter, you can pass the TextWriterProxy to it because it extends the TextWriter class. Here's what the usage syntax looks like:
// create a TextWriterProxy instance
TextWriterProxy proxy = new TextWriterProxy();

// add the Console.Out TextWriter
proxy.Add(Console.Out);

// you can still write directly to console
Console.WriteLine(string.Empty.PadRight(80, '='));

// add a StreamWriter for a FileStream
FileStream fs = new FileStream("C:\\TestExportFileAutoGen.abx", FileMode.Create);
StreamWriter resultWriter = new StreamWriter(fs);
proxy.Add(resultWriter);

// add a StringWriter for a StringBuilder
StringBuilder sb = new StringBuilder();
StringWriter resultStringWriter = new StringWriter(sb);
proxy.Add(resultStringWriter);

// call a method that takes a TextWriter
ClientSync.GenerateSessionDataExport("Sync.ServerExport", proxy);

// write directly to the TextWriterProxy
proxy.WriteLine("Export Complete!");

// close all of my writers
proxy.Close();

And there you have it. A TextWriterProxy class to write to multiple TextWriters at once.