17 July 2010

Testing and Debugging

This is the second follow-up to my earlier post about a small project at work; this time, I'm discussing issues that came up in testing and debugging.

First, click back on that older post (here).  That’s okay; I’ll wait.

Okay, done? 

Now, did you catch my problem?  I didn't until I ran against a specific scenario.  What happens when there's a blank line?  In the case of my code, I got caught being too clever.  I tried to edit out blank lines.  My problem is that I added a "break;" when a blank line was found.  That made my "while" loop think it was done, and it dropped everything after the blank line in that file.
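To illustrate (this is a reconstruction of the mistake, not my exact original code), the buggy loop looked something like this:

    // Reconstructed sketch of the bug: break exits the whole loop at
    // the first blank line, so everything after it is silently lost.
    while (reader.Peek() != -1)
    {
        string line = reader.ReadLine();
        if (string.IsNullOrEmpty(line.Trim())) break;  // BUG: stops reading entirely
        writer.WriteLine(line);
    }

The fix is to skip the blank line and keep reading, the way the code in the earlier post now does: only write the line when it has content, instead of breaking out of the loop.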

Oops.

Then, I discovered something else.  A requirement I had interpreted as "sometimes there will be blank lines" was actually, "Sometimes, in an otherwise populated line, you'll receive UNIX nulls."  That'll break a StreamReader.  If someone knows a way to handle those in a StreamReader, I'd appreciate an email.  However, I don't know such a way, and so decided the best thing to do was just copy the bytes over.  Since I'd be working with bytes, it wouldn't matter if one of them was the UNIX byte that says "null"; a null is still a byte, and it would get copied over like any other.

Looking for a way to do this, I turned to the ultimate in developer help: StackOverflow.com.  Once there, I discovered, once again, that Jon Skeet is awesome.  I'm not posting his code, since I mostly just added exception handling and changed some names to fit my solution; it's basically his code.
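For reference, the general shape of a byte-level copy is the standard buffer loop (this is just a sketch of the common pattern, not his answer):

    // Copy raw bytes from one stream to another; null bytes are
    // just bytes here, so nothing chokes on them.
    private static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[8 * 1024];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
        }
    }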

So, how did I discover these problems?  Testing and debugging.  I know it took a while to get to what is supposed to be the point of this post, but I wanted to show how much pain it saved me, since I would have pushed to production knowing my code "functioned as designed."  The problem is that it would have been designed wrong.

Using the VS testing functionality (my client is on VS2005, btw), however, wasn't enough.  You see, my test passed when I ran it.  However, I believe wholeheartedly in having actual output that I can verify.  Again, my code functioned as designed.  It wasn't until I opened up the test output that I found my problems with those blank lines.

So, three quick things about Testing your code:

1) Always unit test your code.  Once I started making changes, my unit test started failing.  If I'd relied on checking my output alone, I might never have known there were additional problems.

2) Set up unit tests which will test your assumptions as well as the code itself.  I could have used only files with complete data in them to test, and I would never have found my issues at all.  (There's a sketch of this kind of test just after this list.)

3) Always provide yourself actual, verifiable output, whenever possible.  If I hadn’t done just that, I would never have known about my problem at all.
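On point 2, here's roughly what testing an assumption looks like in the VS2005 test framework.  This is only a sketch; the file contents and paths are invented for illustration:

    using System.IO;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [TestClass]
    public class FileMergeTests
    {
        [TestMethod]
        public void Merge_KeepsLinesAfterABlankLine()
        {
            // Arrange: a daily file with a blank line in the middle.
            string dir = Path.GetTempPath();
            string staticPath = Path.Combine(dir, "static.txt");
            string dailyPath = Path.Combine(dir, "daily.txt");
            string mergedPath = Path.Combine(dir, "merged.txt");
            string finalPath = Path.Combine(dir, "final.txt");
            File.WriteAllText(staticPath, "STATIC1\r\n");
            File.WriteAllText(dailyPath, "DAILY1\r\n\r\nDAILY2\r\n");
            if (File.Exists(finalPath)) File.Delete(finalPath);

            // Act: run the merge from the earlier post.
            new FileMerge(staticPath, dailyPath, mergedPath, finalPath).MergeAndDrop();

            // Assert: the line after the blank must survive the merge.
            StringAssert.Contains(File.ReadAllText(finalPath), "DAILY2");
        }
    }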

An Explanation

I posted a couple of days ago about a process we're writing as a console application.  Since I want this to be the highest-quality code possible, I believe I should explain (and defend) the reasons I did things the way I did.

After that, there will be a couple of additional posts explaining some issues we encountered and how we corrected them.

So, on with the explanation…

As I see it, there are two basic questions (feel free to send others, though): 1) Why a console app instead of a service?  2) Why chained streams?

Why A Console Application?

Because this was specifically requested as a temporary solution, and we were given an outer limit of 45 days that it might be needed, it didn't make sense to do anything more complex than a Windows Service or a console application.  This already feeds into a BizTalk solution, so we could have gone that route, or we could have created a WCF service that would get called from somewhere, but those seemed overly complex for what is, in essence, an automated version of "CTRL+C", "CTRL+V".

So that left a Windows Service or a console app, each of which has its advantages and disadvantages.

A service hooked to a FileSystemWatcher could merge the files the moment the daily file comes in.  Since the process that uses the merged file runs on a schedule, that could be a good thing: we'd know the merged file was always ready when needed.  On the other hand, if the service fails, we'd have to go look at the file location to verify that, and if we had to re-run the process, we'd either have to re-drop the file or find some other way to get the service to kick off.
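For illustration, the service option would have hinged on something like this (a sketch only, with hypothetical paths; we didn't go this way):

    using System.IO;
    using System.ServiceProcess;

    public class MergeService : ServiceBase
    {
        private FileSystemWatcher watcher;

        protected override void OnStart(string[] args)
        {
            // Watch the drop folder and merge as soon as the daily file lands.
            watcher = new FileSystemWatcher(@"C:\Drops", "daily*.txt");
            watcher.Created += new FileSystemEventHandler(OnDailyFileCreated);
            watcher.EnableRaisingEvents = true;
        }

        private void OnDailyFileCreated(object sender, FileSystemEventArgs e)
        {
            // Kick off the same merge the console app does.
            new FileMerge(@"C:\Drops\static.txt", e.FullPath,
                          @"C:\Work\merged.txt", @"C:\Out\final.txt").MergeAndDrop();
        }

        protected override void OnStop()
        {
            watcher.Dispose();
        }
    }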

A console application, on the other hand, has to be scheduled through the Windows Scheduler if it's going to run as an automated process, and that means balancing the need to run the merge early enough to be ready for the secondary process against running it late enough to be sure we've got the file.  Then again, if it fails, it's easy enough to pull it up and run it manually or even pull up the solution in TFS and run it in debug mode (to catch what the error is).  And, since it's scheduled, we know exactly when to check to verify it ran properly.

In the end, it was decided it was kind of “six of one, half a dozen of the other,” and my Programmer’s Virtue of Laziness said that since I was going to have to write a console app during the debug phase anyway, I may as well just keep it as a console app.

Why Chained Streams?

Really, was there another choice?  I could have converted the streams to byte arrays, but that seemed overly complex for what we were trying to accomplish (turns out that was wrong, but I didn’t know that at the time).  We’re trying to do the equivalent of opening one file, hitting “CTRL+A” then “CTRL+C”, opening a second file, hitting “CTRL+END” then “CTRL+V”, and then saving the merged file to a folder for an automated FTP process to pick up.

Reading the files and writing directly to a new, merged file seemed to fit that need quite nicely (and would have, too, except for something I found out later).

So, there you go; that was our reasoning.  Fairly simple and straightforward.  If you have other questions, please post them in the comments.  Maybe I'll do another follow-up based on those.

14 July 2010

A Hack Solution Doesn't Mean Hacky Code

Today at work we ran into a little situation with a currently running process. The process is fine, but one of the parts of the business feeding data to our BizTalk orchestration needed a change. For whatever reason, the requirements for a file that gets sent through BizTalk had changed such that the file would be much, much bigger. The business unit responsible for that file said they couldn't supply the whole thing new every night (the current required process) because it would kill their other processes. What they proposed was to send one static file (old data that has to be sent every time) and then they would send the rest of the data (which changes) in the current daily process.

This left us with a bit of a quandary. Since the current Orchestration looks for two files (the one from this unit, and one from another), adding a new file would have meant re-writing the Orchestration (which would mean un-deploying it, then re-deploying it). This was a non-starter because this requirement should only be in place for a month or so, and then we'd have to go back to the old process.

So, the decision was made that we would take the static file, append the daily file to it each night, and then send it through the current process (now a single file) as normal.

I was tasked with creating the code that would do that. After a brief discussion with my teammates, it was decided we'd use a Console Application which would be called by the Windows Scheduler. We considered a Windows Service, but opted for the Console App because we can see the possible need to run it manually.

So, even though this is a hack process for a hack requirement, I decided that my code should be of the highest possible standard: partially for personal and professional pride, and partially because no "temporary" requirement ever really goes away. So, with error checking removed (since that's a custom thing for my client), here are the basic guts. I, personally, think it's good, but please feel free to fire away with any problems you see...

    using System;
    using System.IO;

    public class FileMerge
    {
        public string StaticFilePath { get; set; }
        public string DailyFilePath { get; set; }
        public string MergedFilePath { get; private set; }
        public string FinalFilePath { get; private set; }

        public FileMerge() { }

        public FileMerge(string statFile, string dailyFile, string mergedFile, string finalFile)
        {
            StaticFilePath = statFile;
            DailyFilePath = dailyFile;
            MergedFilePath = mergedFile;
            FinalFilePath = finalFile;
        }

        public void MergeAndDrop()
        {
            FileInfo mergedFile = Merge();
            if (mergedFile != null && mergedFile.Exists) Drop(mergedFile);
        }

        private FileInfo Merge()
        {
            // Note: new StreamWriter(path) creates or overwrites the file,
            // so there's no need to delete a leftover merged file first
            // (and deleting it while the writer holds it open would throw).
            using (StreamWriter sw = new StreamWriter(MergedFilePath))
            {
                // The static (unchanging) data goes first...
                using (StreamReader staticReader = new StreamReader(StaticFilePath))
                {
                    while (staticReader.Peek() != -1)
                    {
                        string line = staticReader.ReadLine();
                        // Skip blank lines, but keep reading past them.
                        if (!string.IsNullOrEmpty(line.Trim())) sw.WriteLine(line);
                    }
                }
                // ...then the daily data is appended after it.
                using (StreamReader dailyReader = new StreamReader(DailyFilePath))
                {
                    while (dailyReader.Peek() != -1)
                    {
                        string line = dailyReader.ReadLine();
                        if (!string.IsNullOrEmpty(line.Trim())) sw.WriteLine(line);
                    }
                }
            }
            return new FileInfo(MergedFilePath);
        }

        private void Drop(FileInfo fileToDrop)
        {
            // MoveTo throws if the destination already exists, so clear it out.
            if (File.Exists(FinalFilePath)) File.Delete(FinalFilePath);
            fileToDrop.MoveTo(FinalFilePath);
        }
    }
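For what it's worth, wiring it up from the console app's Main is about as simple as it gets.  The paths below are placeholders, not the real ones:

    class Program
    {
        static void Main(string[] args)
        {
            // Hypothetical paths; the real ones come from our config.
            FileMerge merge = new FileMerge(
                @"C:\Drops\static.txt",   // static file, sent once
                @"C:\Drops\daily.txt",    // daily file, sent every night
                @"C:\Work\merged.txt",    // working copy of the merge
                @"C:\Out\final.txt");     // where the FTP process looks
            merge.MergeAndDrop();
        }
    }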

09 July 2010

File Scrubbing – whether you want to, or not

As an EDI guy, I’m supposed to get to work in industry standard formats.  Since I mostly have done healthcare in the past, this primarily means the ANSI X12 4010A1 standard.  However, in the real world, the standard doesn’t mean much.

There are clients with proprietary formats, vendors with old versions of the standard, and just plain screw-ups that we deal with on a day-to-day basis.

So, with that in mind, I bring you the first in a new occasional series of posts: Public Code Review.  The following code is a scrubber I created to handle known, recurring issues with client and vendor files.  It uses Regular Expressions to find said known issues, and can either just remove them, or replace them.

First, I created two interfaces: IScrubber and IConfigurable.  IScrubber is the interface which will do most of the work; IConfigurable just allows anyone else who wants to use this code to use their own configuration to set it up.

    interface IScrubber
    {
        string Scrub(string original, string match);
        string Replace(string original, string match, string replacement);
    }

    interface IConfigurable
    {
        void Configure();
    }

Next comes the class ScrubRule.  This simply holds two strings: the Regex to match the errors, and the replacement string.

    class ScrubRule
    {
        public string Match { get; private set; }
        public string Replacement { get; private set; }

        public ScrubRule(string m, string r)
        {
            Match = m;
            Replacement = r;
        }
    }

With our base items created, I can now create the actual scrubber.  In this case, I’ve called it “BasicScrubber.”  It implements both IScrubber and IConfigurable.

    using System.Collections.Generic;
    using System.Configuration;
    using System.Text.RegularExpressions;

    public class BasicScrubber : IConfigurable, IScrubber
    {
        private List<ScrubRule> rules;

        public BasicScrubber()
        {
            rules = new List<ScrubRule>();
            Configure();
        }

        public void Configure()
        {
            // Each appSettings key is a Regex pattern; its value is the replacement.
            foreach (string s in ConfigurationManager.AppSettings.AllKeys)
            {
                rules.Add(new ScrubRule(s, ConfigurationManager.AppSettings[s]));
            }
        }

        // Run every configured rule against the input in one pass.
        public string ScrubAll(string original)
        {
            string result = original;
            foreach (ScrubRule rule in rules)
            {
                result = Replace(result, rule.Match, rule.Replacement);
            }
            return result;
        }

        public string Scrub(string original, string match)
        {
            Regex rx = new Regex(match);
            return rx.Replace(original, string.Empty);
        }

        public string Replace(string original, string match, string replacement)
        {
            Regex rx = new Regex(match);
            return rx.Replace(original, replacement);
        }
    }

As you can see, the scrubber gets its configuration (in this case) from the System.Configuration.ConfigurationManager class, pulling from app.config:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key ="RegexHere" value="ReplacementValueHere" />
  </appSettings>
</configuration>

So, if we replace "RegexHere" with the Regex (?<=\.\d*)0+(?=\D|$) and replace "ReplacementValueHere" with an empty string, we get a scrubber rule which will trim trailing zeros after a decimal point; for example, "12.500" becomes "12.5".

Wire this class up to a Windows or console app, point it at your file in error, and let it go.
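A minimal console wrapper might look like this (the paths are hypothetical, and ScrubAll just applies every configured rule in turn):

    using System.IO;

    class Program
    {
        static void Main(string[] args)
        {
            // Hypothetical paths; point these at the file in error.
            string inPath = @"C:\EDI\In\835_bad.txt";
            string outPath = @"C:\EDI\Out\835_clean.txt";

            // Rules come from app.config via Configure().
            BasicScrubber scrubber = new BasicScrubber();
            string original = File.ReadAllText(inPath);
            File.WriteAllText(outPath, scrubber.ScrubAll(original));
        }
    }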

One of the great things about Regex is its speed at just this kind of process.  Before I started using Regex, I tried basic string manipulation with string.Replace().  The problem is that when you start playing with special characters, or when something is off just a little bit, string.Replace() is a little too unreliable for my tastes.  Additionally, it's slow.  Running string comparisons and manipulations against a normal X12 835 file used to take a couple of minutes.  With Regex, it's seconds.  As in, two or three, not thirty or forty.

So, let me know what you think.  This code should be highly portable.  Without much effort, it can be database-driven instead of app.config-driven, or you can even configure it in some custom way.

08 July 2010

How to Cure Insomnia

Read technical documents for 8 hours straight. I've been at the client site for almost a full week now (technically, I guess it's a week's worth of days, but Friday before a three day weekend does not count), and I've been waiting for my system access to come through. Because of that three day weekend thing, it's been a little slower than it would otherwise have been so I haven't had much to do. What I have had to do is read technical specifications. And project plan documents. And more technical specifications. I'm sure it will get much better tomorrow or Monday (I should have all the access I need by then) but for this week, it's been like fighting narcolepsy...

03 July 2010

Changes

This week I left my position at ESI for a new position with Sogeti USA's Irving office. I am now a consultant with Sogeti; my first gig will be providing BizTalk support while the current lead on that project goes on vacation for a month. It should be a good way for me to get some more in-depth knowledge of BizTalk with people who know more than I do about its practical capabilities. I'm looking forward to the new adventure. I leave the folks at ESI without regrets, but certainly with fond memories. ESI was my first "real" development position, and I will always be thankful for the opportunity I was given there. The whole team were "good people." And, if you're from Texas, you know there is little better praise than that simple phrase. So, as I begin the next phase of my journey, I say "thank you" to Jeff, Milton, Kim, and the gang at ESI. Thanks for the opportunity. Thanks for letting me learn. Thanks for the memories.

15 June 2010

When to say "No."

Okay, let's be honest. Sometimes, as developers, we have to say "no." Sometimes the case for this is clear. Something like the following:

Business Process Owner: We need an application that will do your thinking for you! We can call it the AwesomeApp!!
Developer: No.

Other times, it's less clear. Say, when you're developing a new business application, and one of the stakeholders wants to change something fairly trivial because it would be easier for them "right now."

For instance, take a situation at my job. We have a database to store records relating to medical insurance eligibility requests. One table holds the actual subscriber (insured) information. Each subscriber gets two records: one for the request, and one for the response. The way the process is designed, there is some information that goes out on the request which is stored with the request. Some of that information can come back differently on the response; this information is housed in the response record. Some of that data cannot (according to industry standards) be changed, so we decided not to store it on the response record. Since the data structures are sufficiently convoluted to require multiple joins to get any useful data anyway, we decided one additional join wouldn't hurt anybody. Also, storing redundant data is kind of silly.

One of the BPEs wants that duplicate data in the response record so that a basic "What happened to account so-and-so?" query won't require joins. Normally, I'd say "okay" and make the change. In this instance, though (again, due to the convoluted nature of the data we send and receive), a received record without its sent counterpart (and, for that matter, related records in other tables) has absolutely no context and would be worthless even for those "basic" queries.

So, I said "no." I will continue to say "no" until overruled.

What about you, readers? Am I way off base? What are your "Just Say No" stories?

28 April 2010

Lambda Expressions - The Shuttle _Tydirium_ of .NET

Actually, I just said that 'cause I wanted to get my Star Wars geek on. Today I ran across this article while trying to wrap my head around lambda expressions. I was trying to wrap my head around lambda expressions because they're cool (that is, useful and geeky). So, to sum up the article (without using his example), let's look at some code I just modified to use lambdas. Old code:

    User u = (from u1 in Users where u1.UserName == userName select u1).First();
    CurrentTicket.Reassign(u);

New Code:

    CurrentTicket.Reassign(Users.ToList().Where(u => u.UserName == userName).First());

As the article points out, these are really saying the same thing. The Lambda Expression is just more concise.

22 April 2010

Remittance Headaches

For those of you who've worked with X12 835 Remittance documents, you may know my pain. Some time ago at work, I was tasked with finding a more efficient way to handle remittance advice documents for our clients. Since I like BizTalk, I figured this would be a good time to prove its abilities, and I got to work. In short order, I was able to write the remits to a set of tables. With one teensy problem: BizTalk is (necessarily) unaware of database relationships.

Using the WCF LOB Adapters for BizTalk worked for blowing the information into the DB very quickly, but I couldn't find a way to maintain referential integrity among the tables. Eventually I decided on an "organic key" made up of progressively larger strings as one walks down the tables. So, for instance, a Claim Group (LX, TS3, and TS2 segments) would refer back to the check on which it's found by using a string made up of the ST02 element and the check number. A Claim Payment record would refer back to the claim group by adding on the LX identifier, and so forth. This works very well, actually, and if I know exactly the check number I want, I can find it very quickly.

However, this doesn't actually take care of all of our problems. Those pesky end users want - gasp - to be able to search for remits by Patient! Or Provider! What are they thinking?!?! So, I have to fix my database. But how? I mean, I still haven't found a way for BizTalk to keep that referential integrity for me, and having BizTalk pass the 835 to a .NET application kind of defeats the purpose, so what to do?

Barring some divine (or reader) intervention, I've settled on this: I'm writing a .NET application which will bundle up remits at the batch level (that is, I have a batch object which contains my batch data as well as a list of check objects, which, in turn, hold my check data and a list of claim group objects, and so forth). Once that is done, I'll write the records into a new database (probably nightly?) in a fashion which can be aware of record identities and referential integrity. Please, if anyone has a better option, email me or leave it in the comments - it would be a God-send.
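To make the "organic key" idea concrete, here's roughly how those keys compose.  This is only a sketch; the variable names, sample values, and the "|" separator are invented for illustration:

    // Each level's key is the parent's key plus its own identifier,
    // so a child row always embeds its parent's key as a prefix.
    string st02 = "0001";          // ST02 transaction set control number
    string checkNumber = "12345";
    string lx = "1";               // LX assigned number

    string checkKey = st02 + "|" + checkNumber;   // key for the check record
    string claimGroupKey = checkKey + "|" + lx;   // key for the claim group, one level down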

20 April 2010

XML: Nillable="True" vs MinOccurs="0"

This came up at work today, as a coworker was struggling with a WSDL for an outside vendor. I only knew it because BizTalk runs into exactly this kind of error occasionally. The situation is this: a WSDL is defined with several elements. Let's use this for an example:
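Something like this simplified schema fragment (the element names are just placeholders):

    <xs:complexType name="Customer">
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="address" type="xs:string" minOccurs="0" />
      </xs:sequence>
    </xs:complexType>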

What you see here is a bunch of elements in a complex type which my coworker was trying to show as null by simply not including them. That is, when, for instance, "address" is null, it simply will not exist in the XML document. Here's how the same thing looks the way our vendor expects it.
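Again simplified, the vendor's version marks the element nillable instead of optional:

    <xs:complexType name="Customer">
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="address" type="xs:string" nillable="true" />
      </xs:sequence>
    </xs:complexType>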

Now, leaving aside the question of why the only way to pass data to the web method is allowed to be null, we can see that the vendor's WSDL expects the nodes to exist. Therefore, instead of simply leaving out "address", the XML document should contain a node like this: <address xsi:nil="true" />.

It makes sense when you think about it in terms of code as well. How many times in C# do we pass something like "foo = new Foo(new Bar(), null)"? Is there any question that this is different from "foo = new Foo(new Bar())"? Not really. (At least, not until C# 2zillion, when Microsoft decides that any parameter which can accept a null is automatically an optional parameter.)

I did manage to help steer the coworker in the right direction, and provided him (as I am you) a pretty good synopsis on this whole thing from IBM's website. Link here. (http://www.ibm.com/developerworks/xml/library/ws-tip-null.html)

So, in summary, pay attention to your WSDL and schemas. Nillable="true" is not the same as MinOccurs="0".

07 April 2010

MS .Toolbox

I've been working my way (slowly) through Microsoft's new http://www.microsoft.com/design/toolbox/ site. For anyone who wants to know more about UI design (which should be every mid-tier and back-end developer everywhere), it's a really good resource.

But it brings me to something I see too much of in the developer world: the idea that if the "functionality" works, the UI doesn't matter. "Just give 'em a button" seems to be the mantra all too often. My take on it, however, is this: if the user can't make it do what it's supposed to, or if the user decides the old way was easier than your new way, your program doesn't work. The UI is how the users interact with your functionality; if it's not good, then it's bad. There is no middle ground.

With WPF and Silverlight, this consideration has become much easier to handle. Simply implement INotifyPropertyChanged on your business-layer objects, and then use MVVM to set up the UI. However, for people not using WPF or Silverlight, some consideration has to be made at design time for how the UI will look. If you'll be populating a ListBox control, you probably need a collection of some kind, for instance.

This is a place where I've fallen down in the past, and I plan not to do so any more. You should make a similar pledge.
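For the WPF/Silverlight case, the INotifyPropertyChanged half of that is just this (a minimal sketch; the Ticket class and its Status property are made up):

    using System.ComponentModel;

    // Hypothetical business object; raising PropertyChanged is what
    // lets WPF/Silverlight bindings update the UI automatically.
    public class Ticket : INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;

        private string status;
        public string Status
        {
            get { return status; }
            set
            {
                status = value;
                if (PropertyChanged != null)
                    PropertyChanged(this, new PropertyChangedEventArgs("Status"));
            }
        }
    }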

06 April 2010

WCF WTF: Update

I attempted the solution I discovered yesterday when I got into work this morning, and it wasn't working either. I did a little more research and discovered that, for WCF to work with Streams, you really need to pass them via a BasicHttpBinding. As I'm not that great at WCF yet, I've been using the default WSHttpBinding that Visual Studio creates when you add a Service Reference. I didn't really want to spend too much more time working on this; we were already getting to the point of "diminishing returns." So I decided to take stock of where I was and move forward.

The OperationContract was defined as a Stream output with a Stream input. However, since I know that a Stream is really just an array of bytes, I decided on an alternative approach. I created a new OperationContract (called DecryptBytes) which accepts (and returns) an array of bytes. Then, in my .svc.cs file, I created the new method, which really just changes the bytes into a MemoryStream and then hands the memory stream to my original method:

    public byte[] DecryptBytes(byte[] inStream)
    {
        MemoryStream _iBytes = new MemoryStream(inStream);
        MemoryStream _oBytes = this.Decrypt((Stream)_iBytes) as MemoryStream;
        return _oBytes.ToArray();
    }

So, for 5 lines of code (including the attribute tag in the Service Contract and the method declarations), I was able to correct the issue. I feel a little silly that it was that hard to figure out, but I'll chalk it up as "lesson learned" and archive it here in case I need it again...

05 April 2010

WCF WTF

So, I have a WCF Service I created to handle PGP encryption/decryption using the BouncyCastle .dll. We wanted just the one internal service so that everyone who needs to be able to decrypt data can, without having to ask the developers all the time. This service receives and passes streams of data.

Unfortunately, I did not realize that this doesn't actually work all that well in WCF. Unless I can find a way around it, WCF takes my nice FileStream, or even just a base Stream, and hides it in a System.ComponentModel.Dispatcher.StreamFormatter (from memory, so that namespace may be a little off). Which is not seekable. My service requires a seekable stream. I'm sure you see the problem.

I didn't discover this little gem until about time to go home today. So, tomorrow I'll write a nice little wrapper around the stream that gets passed into the service so I can treat it as a normal stream. I'll post the code here when that's done, so I'll have it someplace relatively safe.