Monday, November 2, 2009

Why VB Wastes My Time: Reason 1

I'm not really a huge fan of the Visual Basic language in general, but there are just some days when it ends up really wasting my time. The code for a project I'm working on contains, or rather contained, code similar to this:
Public Enum TestEnum
    Foo = 1
    Bar
    Baz
End Enum

'...

Dim enum_value As TestEnum = TestEnum.Foo
Dim str as String = MethodThatReturnsAString(enum_value)

'...

Private Function MethodThatReturnsAString() As String
    Return "Hello World!"
End Function
What you see above compiles perfectly fine, and it even returns a value. But you're looking at this code and wondering WHY it compiles. I stared at it for a good while before I had to add another brain to the mix. Just to throw a bone to those who are smarter than me before giving the puzzle away: the value of str would equal "e". For some of you, that may have done it, and you now know what's going on. The rest of you may require a little background on the design of the Visual Basic language. I agree that simplicity is the best way to go, but simplicity at the cost of clarity is detrimental.

VB Simplification 1: Parentheses to delineate Function/Subroutine parameters are optional in certain circumstances. So when you assign the value of a parameterless function to a variable, the right-hand side of the assignment may look no different from a variable.

VB Simplification 2: Parentheses have a dual purpose in VB: method parameters (as in most languages) and the index operator (in C-based languages, the square brackets []). During an assignment, an index operation on the right-hand side can easily be mistaken for a method call.

The full-blown solution, if you have yet to come up with it yourself, goes like this:
  1. MethodThatReturnsAString returns "Hello World!"
  2. MethodThatReturnsAString does not take any parameters, so the parentheses are not necessary. The compiler, avoiding a compile error, assumes that enum_value is not a parameter but an index into the string that was just returned.
  3. str is assigned the value of "e".
  4. The compiler is happier than a pig in slop.
  5. I'm not happy as all I have is an "e"...
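
Spelled out in C#, where the call parentheses are mandatory and indexing uses square brackets, the compiler's reading of that line looks roughly like the sketch below. The names IndexerDemo and WhatVbCompiled are made up for illustration; this is my reconstruction of the behavior described in the steps above, not actual compiler output.

```csharp
using System;

public enum TestEnum { Foo = 1, Bar, Baz }

public static class IndexerDemo
{
    public static string MethodThatReturnsAString()
    {
        return "Hello World!";
    }

    // Hypothetical helper: call the method with NO arguments, then use the
    // enum's numeric value as an index into the returned string.
    public static string WhatVbCompiled(TestEnum enumValue)
    {
        char picked = MethodThatReturnsAString()[(int)enumValue];
        return picked.ToString();
    }

    public static void Main()
    {
        // Foo = 1, and character 1 of "Hello World!" is 'e'.
        Console.WriteLine(WhatVbCompiled(TestEnum.Foo)); // prints "e"
    }
}
```

Written this way, the two separate operations are impossible to confuse, which is exactly the clarity the VB syntax gives up.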
Problem solved, end of story, time sufficiently wasted.

"Visual Basic did many other things to waste my time as well. If every one of them were written down, I suppose that even the whole world would not have room for the books that would be written." - 1 Coders 21:25

Tuesday, October 6, 2009

Running Scheduled Jobs

Working in my company's integration department, it's often necessary for us to set up scheduled data import/export jobs to and from services of various origins. Our current solution to this, which has been adequate thus far, is a simple single-threaded Windows service with a somewhat quirky timing mechanism. Integrations are scheduled and can be turned on and off by editing a table in our database. Errors are logged into a database when jobs fail. Job processing logs, however, leave a lot to be desired. I hacked in processing logs with my hands pretty much tied. Now, with the permission of my boss, I am beginning work on a new version of our job scheduling service. Features which I would like to include are:
  • Integration processing for multiple clients should be able to happen concurrently.
  • Jobs need to run every x minutes/hours/days without any variation.
  • Both error logs and processing logs should be kept.
  • Integrations should be able to be stopped and started on a per Integration/per client basis.
  • Service should be able to allow all jobs to complete processing before shutting down.
  • The service should be able to be monitored by an external application and controlled almost entirely from that application. Upcoming jobs, running jobs and completed jobs as well as error and processing logs should be viewable.
This is going to be a fun project I think. I'll talk more about design and implementation ideas at a later time.

Friday, April 10, 2009

Optional Parameter Gotcha!

Optional parameters are fake!! They don't exist. No really, they don't. They're merely a construct allowed by the compiler to reduce the number of needless overloaded methods a developer needs to create. When code that uses optional parameters gets compiled, the compiler takes the liberty of adding in any parameters you felt you didn't need to supply. You heard me! The substitution of optional parameters that are not explicitly supplied happens at compile time, not run time.

For instance, I have the following code:
Module Module1
    Sub Main()
        Say()
    End Sub

    Sub Say(Optional ByVal what As String = "Hello World!")
        Console.WriteLine(what)
    End Sub
End Module


The compiled version is a bit more interesting (converted from IL to VB):
Module Module1
    <STAThread> _
    Public Shared Sub Main()
        Module1.Say("Hello World!")
    End Sub

    Public Shared Sub Say(ByVal Optional what As String = "Hello World!")
        Console.WriteLine(what)
    End Sub
End Module


Notice in the compiled version that the default value ("Hello World!") is supplied as a parameter to [Say]. That's not a problem unless an assembly you haven't redeployed, or can't redeploy, calls a method you just added an optional parameter to. Let's say we had an older version of my program where [Say] didn't take any parameters and just printed "Hello World!" to the screen. I decide later to get a little more modular and move [Say] to its own assembly. I update my program to use the new method and everything is cool. Time goes by and I need a more generic version of [Say], so I add the [Optional what As String] parameter and deploy the modified assembly.

And then, not too long afterwards, I start hearing from every person who happens to be using my assembly and calling [Say] that their applications are now blowing up.

Because the dependent applications weren't compiled with the new optional parameter in mind, they're trying to call a parameterless version of [Say] which, in reality, doesn't exist. Therefore, care should be taken when using optional parameters to maintain compatibility with assemblies that you can't/don't want to have to redeploy. In those situations, an overloaded method would be the best way to go.
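
To make the difference concrete, here is a minimal C# sketch (C# gained optional parameters in version 4.0, and to my knowledge it substitutes defaults at the call site in the same way). The class names OptionalStyle, OverloadStyle and OptionalParameterDemo are invented for this illustration. It uses reflection to ask each class whether a truly parameterless Say exists, which is the method an old, un-recompiled caller would go looking for.

```csharp
using System;

public static class OptionalStyle
{
    // One method with ONE parameter; the default lives only in metadata
    // and is filled in by the compiler at each call site.
    public static string Say(string what = "Hello World!")
    {
        return what;
    }
}

public static class OverloadStyle
{
    // A real parameterless method that old callers can still bind to.
    public static string Say()
    {
        return Say("Hello World!");
    }

    public static string Say(string what)
    {
        return what;
    }
}

public static class OptionalParameterDemo
{
    // True only if the type has a public Say method taking zero parameters.
    public static bool HasParameterlessSay(Type type)
    {
        return type.GetMethod("Say", Type.EmptyTypes) != null;
    }

    public static void Main()
    {
        Console.WriteLine(HasParameterlessSay(typeof(OptionalStyle))); // False
        Console.WriteLine(HasParameterlessSay(typeof(OverloadStyle))); // True
    }
}
```

Both styles look identical to a caller at compile time; only the overload version keeps a parameterless Say around for assemblies that were built before the change.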

I guess the beauty of optional parameters, as everything else, is only skin deep.

Tuesday, February 3, 2009

Breaking Down Strings

There seems to be a misunderstanding among some coders about the use of strings in .NET. I can't say exactly where it comes from, except perhaps from a single misconstrued idea: Because strings are immutable, every string concatenation operation causes the CLR to make another memory allocation. If you're saying, "What gives? That's true!", keep your shirt on, and keep reading.

I have to say that I was one of those who misunderstood the inner workings of the CLR and blindly believed what other, also misled, developers told me. However, thanks to great developers such as Jon Skeet who have published a wealth of knowledge on the internal happenings of the .NET CLR, I have been led to the light. Therefore this article is evangelizing what I know to be true.

The articles, for reference, that prompted this post are http://www.yoda.arachsys.com/csharp/stringbuilder.html (Jon Skeet), and http://www.simple-talk.com/community/blogs/jcrease/archive/2009/01/16/71678.aspx (!Jon Skeet). While much of this article is a summary of other people's hard work, I have personally verified as much of it as possible through debugging and IL disassembling. For IL disassembling, I use Red Gate's .NET Reflector.

Getting back to the misconstrued idea that I pointed out earlier: the statement is VERY misleading if not taken in the proper context. It isn't that strings aren't immutable; they are. Nor is it that memory allocations don't happen when a string is modified; they very much do. The context of the statement depends on knowing exactly when and how the CLR modifies strings. Let's look at the following code:

Example 1:

C# code:
string str_hello = "Hello";
string str_world = "World";
string str_test = str_hello;
str_test += " ";
str_test += str_world;
str_test += "!";

IL code:
L_0001: ldstr "Hello"
L_0006: stloc.0
L_0007: ldstr "World"
L_000c: stloc.1
...
L_0026: ldloc.0
L_0027: stloc.3
L_0028: ldloc.3
L_0029: ldstr " "
L_002e: call string [mscorlib]System.String::Concat(string, string)
L_0033: stloc.3
L_0034: ldloc.3
L_0035: ldloc.1
L_0036: call string [mscorlib]System.String::Concat(string, string)
L_003b: stloc.3
L_003c: ldloc.3
L_003d: ldstr "!"
L_0042: call string [mscorlib]System.String::Concat(string, string)
L_0047: stloc.3

This code made a total of 6 memory allocations. 6! And all we did was say "Hello World!". 2 of the allocations were to hold "Hello" and "World", but the other 4 were performed while concatenating everything together. How might we make this more efficient? I'm glad you asked, actually. The answer to that is simple. Have you ever noticed the number of overloads that String.Concat has? Lots! Including one that takes in 4 strings. And, hey, we just happen to have 4 strings.

Example 2:

C# code:
string str_hello = "Hello";
string str_world = "World";
string str_test = String.Concat(str_hello, " ", str_world, "!");

IL code:
L_0001: ldstr "Hello"
L_0006: stloc.0
L_0007: ldstr "World"
L_000c: stloc.1
L_000d: ldloc.0
L_000e: ldstr " "
L_0013: ldloc.1
L_0014: ldstr "!"
L_0019: call string [mscorlib]System.String::Concat(string, string, string, string)
L_001e: stloc.2

Great! We did it! We've cut our memory allocations in half. Now all we have to do is just use String.Concat all the time, right? Well, not exactly. While that wouldn't hurt anything, there's actually something else that creates IL code that looks just the same.

Example 3:

C# code:
string str_hello = "Hello";
string str_world = "World";
string str_test = str_hello + " " + str_world + "!";

IL code:
L_0001: ldstr "Hello"
L_0006: stloc.0
L_0007: ldstr "World"
L_000c: stloc.1
L_000d: ldloc.0
L_000e: ldstr " "
L_0013: ldloc.1
L_0014: ldstr "!"
L_0019: call string [mscorlib]System.String::Concat(string, string, string, string)
L_001e: stloc.2

Believe it or not, I copied the IL for the previous 2 examples from 2 separate builds, and the compiler produced exactly the same IL for both! Additionally, String.Concat even has an overload that accepts a string[]! That means no matter how many times you use the concatenation operator in a single statement, like those above, there's only 1 memory allocation for the resulting string.

There's a fourth and final concatenation example that I'd like to go over before moving on.

Example 4:

C# code:
string str_test = "Hello" + " " + "World" + "!";

IL code:
L_0001: ldstr "Hello World!"
L_0006: stloc.0

That's it. No really, those 2 lines of IL are all that are generated from the above statement. The C# compiler is smart enough to see that all you're doing is concatenating string literals, and it does it all for you at compile time!
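
If you don't have a disassembler handy, one rough way to observe the same thing from C# itself is to compare object references. The sketch below is my own check rather than anything from the articles above, and it leans on the usual behavior that identical string literals within one assembly resolve to the same object; ConstantFoldingDemo and its method names are invented for the example.

```csharp
using System;

public static class ConstantFoldingDemo
{
    // All operands are literals, so the compiler folds them into the single
    // literal "Hello World!", which is the same object as the literal below.
    public static bool LiteralsAreFolded()
    {
        string folded = "Hello" + " " + "World" + "!";
        return object.ReferenceEquals(folded, "Hello World!");
    }

    // The operands arrive at run time, so String.Concat builds a brand new
    // string object that merely has the same contents as the literal.
    public static bool RuntimeConcatIsSameObject(string hello, string world)
    {
        string built = hello + " " + world + "!";
        return object.ReferenceEquals(built, "Hello World!");
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsAreFolded());                          // True
        Console.WriteLine(RuntimeConcatIsSameObject("Hello", "World"));  // False
    }
}
```

The two strings are equal in both cases if you compare them with ==; only the reference comparison reveals whether a new allocation happened.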

While this example is trivial, in a web application where you might need to build long strings of HTML, or even email address lists, with hundreds to thousands of users hitting it all at the same time, that trivial difference can easily add up to hundreds of MB, which could easily incapacitate your app. As most developers know, this is fixable using StringBuilder, which I will be discussing later.
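
As a small preview, here is a minimal sketch of the email address list case using StringBuilder. The AddressListDemo class, the JoinAddresses helper, the "; " separator and the sample addresses are all assumptions made up for the illustration, not code from a real project.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class AddressListDemo
{
    // Appends each address to one growing buffer instead of creating a new
    // intermediate string on every pass through the loop.
    public static string JoinAddresses(IEnumerable<string> addresses)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string address in addresses)
        {
            if (builder.Length > 0)
            {
                builder.Append("; "); // separator only between entries
            }
            builder.Append(address);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        string[] sample = { "a@example.com", "b@example.com", "c@example.com" };
        Console.WriteLine(JoinAddresses(sample));
        // prints "a@example.com; b@example.com; c@example.com"
    }
}
```

The point of the loop is that the number of pieces isn't known at compile time, which is the situation where the single-statement concatenation trick from the examples above no longer applies.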