Compress/Decompress a String in C#

December 26, 2012

The feature I was working on today involved saving some temporary data into an HTTP session cookie. The data was an object serialized into JSON. Writing and reading cookies is easy enough but, after testing my implementation a bit, I found that my code was intermittently failing to save the cookie in my browser. After some deeper investigation, I found that the cookie (sometimes) wasn’t being saved because the JSON I was attempting to store in it exceeded the 4K size limit of a cookie. The object I was serializing to JSON was a List of other classes, so I could not prevent users from exceeding the limit without imposing arbitrary restrictions on the usage of the feature. I also wanted to keep the data on the client’s machine, as opposed to a DB or other storage medium, because the data really did seem to belong in the user’s session. The only option I could think of to solve my problem was to compress the JSON string.

Unfortunately, when I Googled “C# compress string” (or something like that), the results weren’t very helpful. Most of the code I came across was either for .NET 2.0 or was doing file IO and byte array manipulation with pointers. Many examples didn’t even use using() blocks to close their Streams properly. What I needed was .NET 4.0 code to compress and decompress strings to strings, preferably without needing to do any byte pointer magic. Fortunately, I eventually came across this somewhat helpful Stack Overflow Q&A which was close enough to what I was trying to do. The OP was looking for a .NET 2.0 solution and his compressed medium was a byte[] instead of a string but, with a few simple modifications, I was able to adapt the code to do exactly what I needed.

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

...

public static string Compress(string s)
{
    var bytes = Encoding.Unicode.GetBytes(s);
    using (var msi = new MemoryStream(bytes))
    using (var mso = new MemoryStream())
    {
        using (var gs = new GZipStream(mso, CompressionMode.Compress))
        {
            msi.CopyTo(gs);
        }
        return Convert.ToBase64String(mso.ToArray());
    }
}

public static string Decompress(string s)
{
    var bytes = Convert.FromBase64String(s);
    using (var msi = new MemoryStream(bytes))
    using (var mso = new MemoryStream())
    {
        using (var gs = new GZipStream(msi, CompressionMode.Decompress))
        {
            gs.CopyTo(mso);
        }
        return Encoding.Unicode.GetString(mso.ToArray());
    }
}

It is simple, elegant, and best-of-all… it works! Luckily for me, JSON compresses extremely well. The previous data which was failing to save was about 4.5 K but I was able to reduce that down to only 1.5 K after compression! The only drawback I have noticed so far of storing compressed data in a cookie is that it is not human readable. You could spin that as a good thing because users will be less likely to want to see/modify obfuscated cookie values, but as a web-developer, I usually like knowing exactly what is being stored on my computer by the websites I visit. An (unintentional) positive side-effect of compressing the cookie is that it helps minimize the amount of data being sent back-and-forth from the users and our servers. 4K might not seem like a lot of data, but when it is being sent in every request from the user’s browser, it can add up. Either way, this simple code helped me solve this problem… moving on!
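For readers working on the JVM rather than .NET, the same idea (gzip the text, then Base64-encode the bytes so the result is plain text) can be sketched in Java. This is only an illustration under a few assumptions: the class name StringGzip and the sample JSON are made up, the string is encoded as UTF-8 rather than the UTF-16 that Encoding.Unicode uses above, and readAllBytes() needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a string and return it as Base64 text, and back again.
final class StringGzip {

    static String compress(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // try-with-resources closes the gzip stream, which writes the gzip trailer
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    static String decompress(String s) {
        byte[] bytes = Base64.getDecoder().decode(s);
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Build some repetitive JSON, similar in spirit to a serialized list of objects.
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < 100; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append("{\"id\":").append(i).append(",\"name\":\"item\",\"selected\":false}");
        }
        json.append(']');

        String original = json.toString();
        String packed = compress(original);
        System.out.println(original.length() + " chars -> " + packed.length() + " chars");
        System.out.println("round trip ok: " + decompress(packed).equals(original));
    }
}
```

How much is saved depends entirely on the data. Base64 adds roughly a third on top of the compressed bytes, so very short or non-repetitive strings can come out larger than they went in.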

ASP.NET Data-Bind tag vs. Evaluation tag ‘Gotcha’

August 2, 2012

My team ran into an interesting problem today while trying to fix a NullReferenceException from within an ASPX page. We had the following (simplified) code in the page:

<span>MyString # of characters: <%# MyString.Length %></span>

Of course, when MyString was null, this page threw the NullReferenceException. To solve this, we added a null check around the whole block.

<% if (MyString != null) { %>
    <span>MyString # of characters: <%# MyString.Length %></span>
<% } %>

By making this change, our understanding was that the inner block should not be evaluated by the server because of the conditional check around it. We tested this out with the case where MyString was null. To our surprise, we still got a NullReferenceException, but this time with more details. The stack trace alluded to the problem stemming from a DataBind() invocation.

The problem with this code is that we are attempting to use a DataBind expression in our ASPX page. DataBind constructs (<%# %>) get evaluated by the DataBind() method regardless of what conditional constructs surround them in the page. This mistake was simply caused by our inexperience with ASP.NET.

The solution was to switch the DataBind expression to what I will call an Evaluation expression. These expressions are the ones with the <%= %> syntax. I call them “evaluation” expressions because the documentation doesn’t refer to them as anything other than “the <%= %> construct”. They don’t use the DataBind functionality at all. Instead they are the equivalent of calling Response.Write() from within your page; they output the result of the expression as plain text into your markup. Our working solution looks like this:

<% if (MyString != null) { %>
    <span>MyString # of characters: <%= MyString.Length %></span>
<% } %>

A more seasoned ASP.NET developer may have thought “Duh, don’t do that” but it may not be so obvious to others. Unfortunately, I have seen DataBind and Evaluation expressions used almost interchangeably in many applications. They have different purposes and the differences between them need to be understood to be effective in ASP.NET. To learn these differences and understand when to use one vs. the other, I found the following resources helpful.

What are these special tags: <%# and <%= (Microsoft ASP.NET Forums)

ASP.NET Databinding/Server tags differences, declarative output property? (StackOverflow)

Easy Java toString() methods using Apache commons-lang ToStringBuilder

May 21, 2010

One of the disciplines I try to have while developing Java code is always overriding the default toString, equals, and hashCode methods. I have found that most modern IDEs have a way to generate these methods. My IDE of choice, IntelliJ IDEA, does have this functionality but I find its implementations to be subpar. Here is an example of an IntelliJ generated toString method for a class…

@Override
public String toString() {
    return "Person{" +
        "name=" + name +
        ", age=" + age +
        '}';
}

My biggest problems with this implementation are:

  1. Hard-coded, static strings for class and field names make the amazing refactoring features of IntelliJ less reliable.
  2. Not using StringBuilder makes this code harder to read and slightly more resource intensive.
  3. Only fields of this class can be used; all inherited fields, even protected ones, are not included.

This was just not good enough. I started to investigate whether anyone had written some kind of library which could reflectively look at an Object and produce a complete (private and inherited fields included) and consistent toString representation. Sure enough, I discovered that one of the Apache Commons libraries provides exactly this functionality. The commons-lang project has a class called ToStringBuilder with a static method that uses reflection to generate a string containing all of a class’s fields. The string’s format is consistent and can be controlled by the ToStringStyle class, which has several useful formats built in and also lets you implement your own. You can now write any object’s toString method as follows:

@Override
public String toString() {
    return ToStringBuilder.reflectionToString(this);
}
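To give a feel for what a reflection-based toString does under the hood, here is a minimal sketch using only the JDK. It is not how commons-lang implements reflectionToString, and the output format is invented for illustration. The class names ReflectiveToString, Person and Employee are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of commons-lang.
final class ReflectiveToString {

    /** Builds "ClassName[field=value,...]" from every instance field, inherited ones included. */
    static String describe(Object target) {
        List<String> parts = new ArrayList<>();
        // Walk from the concrete class up the hierarchy, stopping before Object.
        for (Class<?> c = target.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue; // skip static and compiler-generated fields
                }
                f.setAccessible(true); // allow reading private fields
                try {
                    parts.add(f.getName() + "=" + f.get(target));
                } catch (IllegalAccessException e) {
                    parts.add(f.getName() + "=<inaccessible>");
                }
            }
        }
        return target.getClass().getSimpleName() + "[" + String.join(",", parts) + "]";
    }

    // Example classes used only to demonstrate the helper.
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return describe(this);
        }
    }

    static class Employee extends Person {
        private final String employer;

        Employee(String name, int age, String employer) {
            super(name, age);
            this.employer = employer;
        }
    }

    public static void main(String[] args) {
        // Employee inherits toString() from Person, yet its own private field is still included.
        System.out.println(new Employee("Ann", 30, "Point2"));
    }
}
```

Because the field names are read at runtime, renaming a field needs no change to toString. Note that getDeclaredFields() does not guarantee any particular order, so the order of the fields in the resulting string may vary.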

If you are using Maven as your project management tool, you can add the following dependency to your pom.xml file in order to use the commons-lang library in your project.

<dependency>
  <groupId>commons-lang</groupId>
  <artifactId>commons-lang</artifactId>
  <version>2.5</version>
</dependency>

Normally I would try to avoid reflection in production code when possible due to its reportedly slower performance; however, I find the only time I am actually using toString methods is in debugging or testing, where speed is not my number one concern. If you don’t feel reflection is a good solution for you, the ToStringBuilder class also has several instance methods which follow the traditional ‘builder’ pattern and allow you to manually choose the fields which will be included in the string representation of your object. Here is an example of that usage pattern:

public String toString() {
    return new ToStringBuilder(this).
        append("name", name).
        append("age", age).
        append("smoker", smoker).
        toString();
}

This usage does have some of the same problems as the original implementation; for example, it requires the field identifiers to be static strings. It does, however, allow you to ensure a consistent format of the resulting string, and it can also include any superclass’s toString result using the appendSuper() method.

If you are using Eclipse or NetBeans, I understand there are very good plugins which can build very good toString representations, but this functionality is unfortunately not enough for me to give up the refactoring support IntelliJ provides. The above solution would also help if the developers on your team use different IDEs; using a library to generate toString representations is better than IDE-dependent tools since it maintains consistency between environments.

The ToStringBuilder class is just one of many useful utility classes provided by the commons-lang library and the commons-lang library is just one of the useful libraries provided by the Apache Commons project. Make sure to check it out anytime you find yourself writing some code and asking yourself “Hasn’t someone done this already?”

Setup Persistent Aliases & Macros in Windows Command Prompt (cmd.exe) using DOSKey

May 14, 2010

Our development machines here at Point2 are not standardized; we have a mixture of Windows XP, Windows 7, and Mac OS X/Unix computers. I find myself constantly switching back and forth between command prompt interfaces when pair programming. As a result, I catch myself using “ls” to list a directory’s contents regardless of what system I am on. I am currently using a Windows XP machine for my developer box and I wanted to set up an alias so that the “ls” command actually performs a “dir”. Here is how I accomplished it…

There is a command available in a Windows shell that lets you alias a command to whatever you please: DOSKey. It allows you to create “macros” that execute one or more other commands under a custom name. If you open up a command prompt and type in the following, you will be able to use “ls” to list the current directory’s contents.

DOSKEY ls=dir

It also handles command line arguments, either by index using $1 through $9 or as a collection using $*. Multiple commands in one macro are separated with $T. This allows you to do things like perform a “dir” command every time you change directories, which would be done with the following macro definition.

DOSKEY cd=cd $1$Tdir

This all sounds very simple and easy until you close your command prompt, open a new one, and realize that all of the macros you defined earlier did not persist between instances. The DOSKey command does not save these aliases automatically. It does support saving the “currently defined macros” to a file, which allows you to run a simple command in any new shell to load the macros from that saved file. The problem is, I already forget to use “dir” instead of “ls”, so I know for sure I will not remember to run a certain DOSKey command every time I open up a new command prompt. I needed something more automatic.

I investigated what options I had for running commands as part of every run of cmd.exe. I found out there is a command line argument, /K, that I can pass to cmd.exe to tell it to run a ‘.cmd’ or ‘.bat’ file on startup and then stay open. So you can run something like the following command to start a shell instance in which the given command file is run automatically.

cmd.exe /K C:\path\to\file.cmd

This allows you to add all of the commands you want into that file and have them run automatically for the command prompt you are about to open. In order to pass this argument, I created a shortcut to cmd.exe in my Quick Launch toolbar which I could modify and use exclusively for my command prompt instances. This can easily be done by going into your C:\WINDOWS\system32 directory, right-clicking on cmd.exe and selecting “Send to Desktop”. Right-click on the newly created shortcut (on your desktop) and select “Properties”. On the Shortcut tab, you will find the “Target” field, which you will have to modify to include the command line option. Here is what my configuration looks like:

[Screenshot: the shortcut’s Properties dialog, Shortcut tab]

The only other thing from my configuration worth mentioning is the “Start in” setting I have specified. The value “%HOMEDRIVE%%HOMEPATH%” will open the command prompt in your user’s home directory as opposed to the default which opens the new window in the system32 directory which usually isn’t very helpful.

My doskey.cmd file is also worth taking a look at. It currently only has a few aliases for common Unix commands, but it will give you a good idea of what kind of things are possible.

@echo off

DOSKEY ls=dir
DOSKEY cd=cd $1$Tdir
DOSKEY clear=cls

It is probably best to also include the “@echo off” at the top of your script too just so you don’t have to see the noise in the shell running your script. There are also a lot more powerful features to DOSKey that I have yet to experiment with but you can see how easy it is now to add permanent macros into your command prompt.

Another thing worth noting is that you should look at the other tabs in the cmd.exe Shortcut Properties window, because they make it easy to do things like increase the text buffer size for the shell, change font sizes and colors, as well as configure the behavior of command history tracking. After tweaking all these settings, here is my new custom, improved Command Prompt:

[Screenshot: the customized Command Prompt window]

The only “gotcha” here is that you have to open the command prompt window using the shortcut, but there are other tools available which will let you run shortcuts with a simple keystroke, so this should not be an issue. I hope this makes every Windows Command Prompt user’s life a little easier. 🙂

Pair Programming Anti-Patterns

February 19, 2010

I read an interesting blog post by Mark Needham today that talks about some anti-patterns he found himself falling into while pairing. I have also caught myself being guilty of some of these offenses, so I thought that listing some here explicitly would help keep them fresh in everyone’s mind. Of course, what works for some may not work for everyone, but here is a small list of behaviors which I find can negatively impact pair-programming…

Moving around the code too quickly

As the driver, it is normally a bad idea to be changing windows/files/tabs/etc. too quickly while you are in control of the computer. It is easy to forget that your navigator may not know what you are thinking, so you find yourself flying through the UI of your workstation at lightning speed only to have your pair say “Ok… what just happened?” Granted, following a driver’s every keystroke is not the navigator’s concern, but I do find it helps collective code ownership to have the navigator pay attention to what the driver is doing. Therefore, I believe that it is always a good idea to communicate to your navigator what you are planning to do BEFORE doing it, which also gives them the opportunity to offer any input they have before everything has been said and done.

Grabbing the keyboard away

This one is a little more obvious as to why it is an anti-pattern, but let me elaborate anyway. My opinion is that it is NEVER a good idea to take control of the keyboard and mouse without asking first. I have seen it happen for various reasons: lack of time, lack of knowledge, lack of patience, lack of discipline, and lack of respect. Sometimes you just want to finish your task and you know you can get it done quicker, or maybe you are really excited about trying to solve the problem at hand and anxiously snatch control of the computer. Sometimes the guilty party may have all of the above reasons for this behavior, but I don’t think it should ever be acceptable. By taking control, you are giving your pair the impression that you think they cannot accomplish what needs to be done or that you can do it better. It is always a good idea to ask first before taking control, and if the driver was in the middle of something, you’d better have a good reason. This also promotes better communication because you more often have to attempt to explain your ideas to your driver instead of just doing it yourself.

Vacationing as the navigator

Pair programming often comes with negative connotations because some ask “Well, what can/should I really do as a pair?”. The reality is that pair-programming involves two distinct roles, each with its own very engaging responsibilities. The driver is tasked with controlling the workstation to write the test cases, write the code, and refactor. The navigator should be involved with design decisions made while writing the code, but they do not necessarily have to be reading every line of code the driver writes. Instead, they should be thinking about what test cases come next, whether or not the existing tests are passing, whether we can commit, and most importantly: “Are we done yet?”. This is why it is very important not to become detached from the task at hand while your driver tries to figure out where he is missing a semi-colon. Distractions like mobile devices, food, Twitter, etc. can often lead to the navigator not having a plan, which leaves both people in the pair asking “What next?”. This is not to say that a driver cannot become distracted as well but, in my experience, it is harder to fall into the trap when you are the one engaged at the keyboard and much easier when you can literally twiddle your thumbs. Just remember you are there in a supportive role to your driver and they are lost without your “map”. To help avoid this, it is a good idea to make sure you are switching roles often to keep both people engaged in the task at hand.

You can check out the original blog post which inspired this one here.

You can learn more about pair-programming here and here.

I am also interested if anyone else has any other pair-programming anti-patterns they would like others to be aware of. What do you find negatively affects a pair-programming situation? Or for that matter, any tips on how you have become more effective? Feel free to share!

Why NOT to avoid (or forget) “Walking Skeletons”

January 8, 2010

At Point2, we have recently embraced the concept of a “Walking Skeleton” as the first development work we do when starting a new project/module/bundle. This approach allows us to make sure all of the overhead of a new project is accounted for and functioning properly before the project becomes too complex to allow for it. We create the walking skeleton as part of our first sprint, which ensures we have some deliverable by the end of our first iteration. By the time our walking skeleton is complete, we have worked all of the kinks out of our CI, deployment, and testing strategies, which are still very straightforward at this point.

Unfortunately, when my team recently started its current project, we dropped the ball when it came to finishing the walking skeleton end-to-end before starting on more complex tasks. It didn’t occur to us at the time that this was a bad thing because we were still making visible progress. In retrospect though, we acknowledged the difficulties and extra tasks we had created for ourselves by neglecting to help the skeleton take its first steps.

One issue we encountered was not being able to easily demonstrate new functionality to the product owners for sign-off. Because we had no certain way of running data through our application end-to-end (not even “Hello World”), we ended up fudging steps in the process just to see the desired results. This didn’t allow the business to try out the features without first having knowledge of the internal mechanics of the product. It also made it difficult, if not impossible, to properly functionally test the system in a true black-box fashion.

Another speed bump we ran into was the extra refactoring we found ourselves doing because the interfaces between components in our system were still evolving. We had not pushed data through each moving part so this meant that the way the parts fit together had not been clearly considered and defined. As the pieces came together we realized different interfaces were more appropriate and with interface refactoring comes unit test refactoring. Now I normally encourage a healthy dose of refactoring to every piece of code but not when I just finished the code for the task earlier that morning. Had we actually pushed something all the way through the pipe, we might have realized earlier that our initial architecture was not appropriate and, in fact, didn’t even make sense.

So why did this happen in the first place? After recognizing what happened, my team agreed we should perform a root cause analysis. This activity produced the following Ishikawa diagram:

[Figure: Ishikawa diagram, “Walking Skeleton Fail”]

The causes we came up with were interesting but not surprising. We decided the main reason we forgot about finishing the walking skeleton was that we were just too excited to get started on the new project, using new technology.

We agreed on two action items to address these issues:

  1. Simply remember to plan for a walking skeleton next project.
  2. Blog about our experience to help others avoid the same problems. 😉

By: Jesse Webb

MS SqlServer Row Value Concatenation

November 6, 2009

Recently, a colleague and I found ourselves in the situation of needing to concatenate values from multiple rows of a MS SqlServer database table. We were trying to form a comma-delimited list of phone numbers for every customer in our database. One customer may have many phone numbers. Here is a simplified diagram visualizing the relationship between customer and phone numbers:

[Figure: Customer - Phone Number Relationship]

The format of data we were looking for was:

CustomerId  PhoneNumbers
1           “(306) 555-1111”, “555-2222”, “306-555-3333”
2           “3065554444”

We tried various queries to perform the concatenation correctly but none of our solutions seemed to do the job perfectly; either the final ‘PhoneNumbers’ string would have an additional comma at the end or there was some other undesired effect. We finally came across an article explaining in great detail how to use T-SQL (the SQL dialect used by SqlServer) to perform the operations we needed.

http://www.projectdmx.com/tsql/rowconcatenate.aspx

This article has various examples describing AND explaining how to do multiple row value concatenations such as explicit examples for “Concatenating values when the number of items is small and known upfront” and “Concatenating values when the number of items is not known”. The author even walks through a recursive solution or two.

We knew that none of our customers had more than five phone numbers because each number is tied to a value of the PhoneType enumeration, which only has five values. This allowed us to use the article’s first example to get our job done efficiently. Here is our final solution:

-- Pivot up to five phone numbers per customer into one quoted, comma-delimited string.
SELECT CustomerId, REPLACE(
    '"' + MAX( CASE seq WHEN 1 THEN phoneNumber ELSE '' END ) + '",' +
    '"' + MAX( CASE seq WHEN 2 THEN phoneNumber ELSE '' END ) + '",' +
    '"' + MAX( CASE seq WHEN 3 THEN phoneNumber ELSE '' END ) + '",' +
    '"' + MAX( CASE seq WHEN 4 THEN phoneNumber ELSE '' END ) + '",' +
    '"' + MAX( CASE seq WHEN 5 THEN phoneNumber ELSE '' END ) + '"',
    ',""', '' ) AS PhoneNumbers  -- strip the empty ,"" slots left by unused positions
FROM (
    -- seq ranks each number within its customer: the count of that customer's numbers <= this one
    SELECT pn1.CustomerId, pn1.PhoneNumber, (
        SELECT COUNT(*)
        FROM PhoneNumber pn2
        WHERE pn2.CustomerId = pn1.CustomerId
          AND pn2.phoneNumber <= pn1.phoneNumber)
    FROM PhoneNumber pn1) PhoneNumbersPerParty ( CustomerId, phoneNumber, seq )
GROUP BY CustomerId

By: Jesse Webb